TW202309299A - Methods of identifying drug sensitive genes and drug resistant genes in cancer cells

Publication number
TW202309299A
TW202309299A TW111126186A TW111126186A TW202309299A TW 202309299 A TW202309299 A TW 202309299A TW 111126186 A TW111126186 A TW 111126186A TW 111126186 A TW111126186 A TW 111126186A TW 202309299 A TW202309299 A TW 202309299A
Authority
TW
Taiwan
Prior art keywords
sgrna
ibar
cancer cell
library
sequence
Application number
TW111126186A
Other languages
Chinese (zh)
Inventor
鵬飛 袁
鳴 金
永建 張
紅豔 申
玲 楊
娜 劉
美華 蘇
雅茹 鄭
玉蘭 李
Original Assignee
大陸商北京輯因醫療科技有限公司
Application filed by 大陸商北京輯因醫療科技有限公司
Publication of TW202309299A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/5005 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
    • G01N33/5091 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing the pathological state of an organism
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57419 Specifically defined cancers of colon
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2320/00 Applications; Uses
    • C12N2320/10 Applications; Uses in screening processes
    • C12N2320/12 Applications; Uses in screening processes in functional genomics, i.e. for the determination of gene function
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2740/00 Reverse transcribing RNA viruses
    • C12N2740/00011 Details
    • C12N2740/10011 Retroviridae
    • C12N2740/16011 Human Immunodeficiency Virus, HIV
    • C12N2740/16041 Use of virus, viral particle or viral elements as a vector
    • C12N2740/16043 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/106 Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/52 Predicting or monitoring the response to treatment, e.g. for selection of therapy based on assay results in personalised medicine; Prognosis
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/60 Complex ways of combining multiple protein biomarkers for diagnosis

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Immunology (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Urology & Nephrology (AREA)
  • Hematology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Cell Biology (AREA)
  • Medicinal Chemistry (AREA)
  • General Physics & Mathematics (AREA)
  • Food Science & Technology (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Physiology (AREA)
  • Biophysics (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The present application relates to methods of identifying target genes in cancer cells whose mutations make the cancer cells sensitive or resistant to anti-cancer drugs. Also provided are methods of treating cancer and selecting patients based on aberrations (e.g., mutations) in target genes identified herein. Modified cancer cells that are sensitive or resistant to anti-cancer drugs, and methods and kits for generating them, are also provided.

Description

Methods for identifying drug-sensitive genes and drug-resistant genes in cancer cells

This application claims priority to International Patent Application PCT/CN2021/105822, filed on July 12, 2021, and International Patent Application PCT/CN2021/105816, filed on July 12, 2021, the contents of each of which are incorporated herein by reference in their entirety.

The present application relates to methods of identifying target genes in cancer cells whose mutations render the cancer cells sensitive or resistant to anticancer drugs. Also provided are methods of treating cancer and of selecting patients based on aberrations (e.g., mutations) in the target genes identified herein. Also provided are modified cancer cells that are sensitive or resistant to anticancer drugs, and methods and kits for producing them.

When mutations occur, cancer cells can acquire resistance to targeted therapeutic agents. Resistance to anticancer drugs has become a major obstacle to the successful treatment of cancer. Colorectal cancer, for example, is the third most common cancer worldwide, the second leading cause of cancer-related death, and the leading cause of death from gastrointestinal cancer. Conventional pathological staging classifies colorectal cancer as stage 0, I, II, III, or IV according to the depth of tumor invasion into the intestinal wall, metastasis to lymph nodes, or distant metastasis. At present, early-stage colorectal cancer is usually treated with surgery or radiotherapy. In addition to surgery and radiotherapy, patients with intermediate or advanced disease usually receive systemic treatment with chemotherapy and targeted drugs (such as PARP inhibitors). With current treatments, the five-year survival rate for early-stage colorectal cancer exceeds 90%; for advanced metastatic colorectal cancer, however, it is only 14% (H. Sung et al., CA Cancer J Clin. May 2021; 71(3):209-249; R. Dienstmann et al., Nat Rev Cancer. 2017;17(2):79-92; E.J. Kuipers et al., Nat Rev Dis Primers. 2015;1:15065; C. Joachim et al., Medicine (Baltimore). 2019; 98(35):e16941).

After the onset of cancer, specific gene mutations may occur. Mutated genes are often associated with particular pathogenic mechanisms and/or therapeutic pathways. Among these mutated genes, some may be drug-sensitive genes (that is, after mutation the cancer cells are more sensitive to the therapeutic effect of anticancer drugs), and some may be drug-resistant genes (that is, after mutation the cancer cells are more resistant to the therapeutic effect of anticancer drugs). Identifying the drug-sensitive and drug-resistant genes involved in various therapeutic pathways is important for patient selection and treatment design with drugs targeting the corresponding pathways, so as to achieve better therapeutic outcomes.

The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system (CRISPR/Cas9) can edit targeted genomic loci with high efficiency and specificity. One of its many applications is identifying the functions of coding genes, non-coding RNAs, and regulatory elements through high-throughput pooled screens combined with next-generation sequencing (“NGS”) analysis. By introducing a pooled single-guide RNA (“sgRNA”) or paired-guide RNA (“pgRNA”) library into cells expressing Cas9, or catalytically inactive Cas9 (dCas9) fused to an effector domain, researchers can perform a variety of genetic screens by generating diverse mutations, large genomic deletions, transcriptional activation, or transcriptional repression.

To generate a high-quality gRNA cell library for any given pooled CRISPR screen, a low multiplicity of infection (“MOI”) must be used during cell library construction to ensure that each cell contains, on average, fewer than one sgRNA or pgRNA, thereby minimizing the false discovery rate (FDR) of the screen. To further reduce the FDR and improve data reproducibility, deep gRNA coverage and multiple biological replicates are usually needed to obtain hit genes with high statistical significance, which increases the workload. Additional difficulties can arise when performing large numbers of genome-wide screens, when the cellular material available for library construction is limited, or when performing more challenging screens (i.e., in vivo screens) in which experimental replicates are hard to obtain or the MOI is hard to control. The “internal barcode (iBAR)” method previously developed by the applicant (see WO2020125762, the content of which is incorporated herein by reference in its entirety) provides a reliable and efficient screening strategy for large-scale target identification in eukaryotic cells, with much lower false-positive and false-negative rates, and allows cell libraries to be generated at a high MOI. For example, compared with a conventional CRISPR/Cas screen at a low MOI of 0.3, the iBAR method can reduce the number of starting cells by more than 20-fold (e.g., at an MOI of 3) to more than 70-fold (e.g., at an MOI of 10) while maintaining high efficiency and accuracy. The iBAR system is particularly suitable for cell-based screens in which the number of cells is limited, and for in vivo screens in which viral infection of specific cells or tissues is difficult to control at a low MOI.
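As a rough illustration of why the MOI matters, the following sketch assumes that the number of constructs delivered to a cell follows a Poisson distribution whose mean is the MOI. That is a common modelling assumption and is not stated in the application; the function names are hypothetical. The sketch does not attempt to reproduce the 20-fold and 70-fold figures quoted above, which come from the application itself.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k constructs (Poisson assumption)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def infection_summary(moi: float) -> dict:
    """Fractions of cells with zero, exactly one, or more than one construct."""
    uninfected = poisson_pmf(0, moi)
    single = poisson_pmf(1, moi)
    multiple = 1.0 - uninfected - single
    infected = 1.0 - uninfected
    return {
        "uninfected": uninfected,
        "single": single,
        "multiple": multiple,
        # Share of infected cells that carry more than one construct.
        "multiple_among_infected": multiple / infected,
    }


# MOI values taken from the paragraph above.
for moi in (0.3, 3, 10):
    summary = infection_summary(moi)
    print(
        f"MOI {moi}: uninfected={summary['uninfected']:.3f}, "
        f"single={summary['single']:.3f}, multiple={summary['multiple']:.3f}"
    )
```

Under this assumption, most cells stay uninfected at a low MOI, whereas at a high MOI nearly every cell is infected but most carry several constructs. That situation is the one the iBAR barcodes are meant to make tractable.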

The disclosures of all publications, patents, patent applications, and published patent applications mentioned herein are incorporated herein by reference in their entirety.

One aspect of the invention provides a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a mutation in a hit gene (“hit gene mutation”), and wherein the hit genes in at least two of the plurality of cancer cells are different from each other; wherein the cancer cell library is generated by contacting an initial cancer cell population with i) a single-guide RNA (“sgRNA”) library comprising a plurality of sgRNA constructs, wherein each sgRNA construct comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any one of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in the corresponding hit gene; and ii) a Cas component (e.g., Cas9) comprising a Cas protein or a nucleic acid encoding the Cas protein, under conditions that allow the sgRNA constructs and the Cas component to be introduced into the initial cancer cell population and to generate the mutations in the hit genes; b) contacting the cancer cell library with the anticancer drug; c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); and d) identifying the target gene based on the difference between the profiles of sgRNAs or hit gene mutations in the post-treatment cancer cell population and in a control cancer cell population. In some embodiments, the control cancer cell population is obtained from the cancer cell library cultured under the same conditions without contact with the anticancer drug.

In some embodiments according to any one of the methods described above, the target gene is identified based on the difference between the sgRNA profiles in the post-treatment cancer cell population and the control cancer cell population. In some embodiments, the sgRNA profiles in the post-treatment and control cancer cell populations are identified by next-generation sequencing. In some embodiments, the method comprises comparing the sgRNA sequence counts obtained from the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) with the sgRNA sequence counts obtained from the control cancer cell population, wherein: i) a hit gene whose corresponding sgRNA guide sequence is identified as enriched in the post-treatment cancer cell population compared with the control cancer cell population, with an FDR ≤ 0.1, is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA guide sequence is identified as depleted in the post-treatment cancer cell population compared with the control cancer cell population, with an FDR ≤ 0.1, is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug.
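The enrichment/depletion rule with the FDR ≤ 0.1 cutoff can be sketched as follows. The application does not say here how the FDR is computed, so this sketch assumes Benjamini-Hochberg adjustment of per-gene p-values; the gene names, fold changes, and p-values are made up for illustration, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_hits(genes, log2_fold_changes, pvalues, fdr_cutoff=0.1):
    """Label each hit gene from treated-versus-control guide abundance."""
    fdrs = benjamini_hochberg(pvalues)
    calls = {}
    for gene, lfc, fdr in zip(genes, log2_fold_changes, fdrs):
        if fdr <= fdr_cutoff and lfc > 0:
            calls[gene] = "resistance"       # guides enriched after treatment
        elif fdr <= fdr_cutoff and lfc < 0:
            calls[gene] = "sensitivity"      # guides depleted after treatment
        else:
            calls[gene] = "not significant"
    return calls


# Hypothetical example data (not from the application).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [2.1, -1.8, 0.2, -0.1]
pvals = [0.001, 0.004, 0.6, 0.9]
print(classify_hits(genes, log2_fc, pvals))
```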

In some embodiments according to any one of the methods described above, the sgRNA library and the Cas component are introduced sequentially into the initial cancer cell population. In some embodiments, the Cas component is introduced into the initial cancer cell population first, followed by the sgRNA library.

In some embodiments according to any one of the methods described above, the Cas protein is Cas9. In some embodiments, each sgRNA comprises a guide sequence fused to a second sequence, wherein the second sequence comprises a repeat-anti-repeat stem loop that interacts with Cas9. In some embodiments, the second sequence of each sgRNA further comprises stem loop 1, stem loop 2, and/or stem loop 3.

In some embodiments according to any one of the methods described above, each sgRNA further comprises an internal barcode (iBAR) sequence (“sgRNA iBAR”), wherein each sgRNA iBAR operates with the Cas protein (e.g., Cas9) to modify the hit gene (e.g., to cleave the hit gene or to modulate expression of the hit gene). In some embodiments, each sgRNA iBAR comprises, in the 5'-to-3' direction, a first stem-loop sequence and a second stem-loop sequence, wherein the first stem-loop sequence hybridizes with the second stem-loop sequence to form a double-stranded RNA (dsRNA) region that interacts with the Cas protein, and wherein the iBAR sequence is located between the 3' end of the first stem-loop sequence and the 5' end of the second stem-loop sequence. In some embodiments, the Cas protein is Cas9, and the iBAR sequence of each sgRNA iBAR is inserted into the loop region of the repeat-anti-repeat stem loop. In some embodiments, each iBAR sequence comprises about 1 to about 50 nucleotides (e.g., about 6 nucleotides). In some embodiments, the sgRNA library is an sgRNA iBAR library, wherein the sgRNA iBAR library comprises a plurality of sets of sgRNA iBAR constructs, wherein each set comprises four sgRNA iBAR constructs, each of which comprises or encodes an sgRNA iBAR, wherein the guide sequences of the four sgRNA iBAR constructs are identical, wherein the iBAR sequences of the four sgRNA iBAR constructs are different from each other, and wherein the guide sequence of each set of sgRNA iBAR constructs is complementary (e.g., at least about any one of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a different target site of a hit gene (e.g., different target sites in the same hit gene or in different hit genes). In some embodiments, the sgRNA iBAR library comprises at least about 100 sets (e.g., at least about any one of 1,000, 10,000, 50,000, or more sets) of sgRNA iBAR constructs. In some embodiments, the iBAR sequences of at least two sgRNA iBAR constructs in different sets of sgRNA iBAR constructs are identical (e.g., a first set and a second set of sgRNA iBAR constructs have at least 1, 2, 3, 4, or more iBAR sequences in common). In some embodiments, the iBAR sequences of at least two sets of sgRNA iBAR constructs are identical. In some embodiments, the cancer cell library (e.g., a Cas9+ sgRNA iBAR cancer cell library) has an average coverage of at least about 100-fold (e.g., at least about any one of 200-, 400-, 500-, 1,000-, 5,000-fold, or more) for each sgRNA iBAR, such as an average of about 100-fold to about 1,000-fold, or an average of about 1,000-fold, for each sgRNA iBAR. In some embodiments, the cancer cell library (e.g., a Cas9+ sgRNA iBAR cancer cell library) has an average coverage of at least about 400-fold (e.g., at least about any one of 800-, 1,000-, 2,000-, 4,000-, 16,000-fold, or more) for each set of sgRNA iBARs, such as an average of about 400-fold to about 4,000-fold, or an average of about 4,000-fold, for each set of sgRNA iBARs. In some embodiments, the cancer cell library (e.g., a Cas9+ sgRNA iBAR cancer cell library) has an average coverage of at least about 400-fold (e.g., at least about any one of 800-, 1,000-, 1,200-, 2,000-, 3,000-, 4,000-, 10,000-, 12,000-, 16,000-fold, or more) for each hit gene, such as an average of about 1,200-fold to about 12,000-fold, or an average of about 12,000-fold, for each hit gene.
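The set structure and the coverage figures above are related by simple multiplication, which the following sketch spells out. The four iBARs per set come from the text. Three guide sets per hit gene is an assumption inferred from the ratio between the per-set and per-gene coverage examples (400 to 1,200 and 4,000 to 12,000), not an explicit statement. The sequences and function names are hypothetical.

```python
def make_ibar_set(guide: str, ibars: list) -> list:
    """Build one set: the same guide sequence paired with four distinct iBARs."""
    if len(ibars) != 4 or len(set(ibars)) != 4:
        raise ValueError("a set needs exactly four different iBAR sequences")
    return [(guide, ibar) for ibar in ibars]


def coverage_summary(per_construct_coverage, ibars_per_set=4, guides_per_gene=3):
    """Return (per-set, per-gene) coverage implied by per-construct coverage.

    guides_per_gene=3 is an assumption inferred from the example ranges.
    """
    per_set = per_construct_coverage * ibars_per_set
    per_gene = per_set * guides_per_gene
    return per_set, per_gene


def cells_required(n_genes, per_construct_coverage, ibars_per_set=4, guides_per_gene=3):
    """Cells carrying a construct that are needed to reach the target coverage."""
    n_constructs = n_genes * guides_per_gene * ibars_per_set
    return n_constructs * per_construct_coverage


# Hypothetical 20-nt guide and 6-nt iBARs, for illustration only.
example_set = make_ibar_set(
    "GACCTGAAGTTCATCTGCAC", ["ACGTAC", "TGCATG", "GATCCA", "CTAGGT"]
)
print(len(example_set), "constructs share one guide sequence")
print(coverage_summary(100))    # per-set and per-gene coverage at 100-fold
print(coverage_summary(1000))   # per-set and per-gene coverage at 1,000-fold
print(cells_required(n_genes=500, per_construct_coverage=100))
```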

In some embodiments according to any one of the methods described above, at least about 95% (e.g., at least about any one of 96%, 97%, 98%, 99%, or 100%) of the sgRNA constructs (or sgRNA iBAR constructs) in the sgRNA library (or sgRNA iBAR library) are introduced into the initial cancer cell population.

In some embodiments according to any one of the methods described above, the cancer cell library (e.g., a Cas9+ sgRNA cancer cell library or a Cas9+ sgRNA iBAR cancer cell library) has a coverage of at least about 400-fold (e.g., at least about any one of 600-, 800-, 1,000-, 2,000-, 8,000-, 12,000-fold, or more) for each sgRNA (or sgRNA iBAR).

In some embodiments according to any one of the methods described above, the sgRNA (or sgRNA iBAR) library comprises at least about 400 (e.g., at least about any one of 400, 600, 1,000, 5,000, 10,000, 50,000, 100,000, or more) sgRNA (or sgRNA iBAR) constructs, such as about 6,000 to about 18,000 sgRNA (or sgRNA iBAR) constructs.

In some embodiments according to any one of the methods described above, each sgRNA (or sgRNA iBAR) construct in the sgRNA (or sgRNA iBAR) library is an RNA. In some embodiments, each sgRNA (or sgRNA iBAR) construct in the library is a plasmid. In some embodiments, each sgRNA (or sgRNA iBAR) construct in the library is a viral vector, such as a lentiviral vector. In some embodiments, each sgRNA (or sgRNA iBAR) construct in the library is a virus, such as a lentivirus. In some embodiments, the sgRNA (or sgRNA iBAR) library is contacted with the initial cancer cell population at a multiplicity of infection (MOI) of at least about 2 (e.g., 3).

In some embodiments according to any one of the methods described above, each guide sequence comprises about 17 to about 23 nucleotides.

In some embodiments according to any one of the methods described above, step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times. In some embodiments, step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times.
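To make the dosing window concrete, the sketch below converts an IC50 into the concentration giving another inhibition level, and a number of doublings into a culture duration. It assumes that the dose-response curve follows a Hill equation, which the application does not state. The Hill slope, doubling time, and IC50 value are hypothetical inputs, and the function names are illustrative only.

```python
def concentration_for_inhibition(ic50: float, fraction: float, hill_slope: float = 1.0) -> float:
    """Concentration giving the requested fractional inhibition.

    Assumes inhibition = c**h / (ic50**h + c**h), so
    c = ic50 * (fraction / (1 - fraction)) ** (1 / h).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ic50 * (fraction / (1.0 - fraction)) ** (1.0 / hill_slope)


def exposure_days(doublings: float, doubling_time_hours: float) -> float:
    """Length of drug exposure in days for a given number of doublings."""
    return doublings * doubling_time_hours / 24.0


# Hypothetical inputs: IC50 of 1.0 (arbitrary units), 24 h doubling time.
ic50 = 1.0
ic70 = concentration_for_inhibition(ic50, 0.70)
print(f"dosing window: {ic50:.2f} to {ic70:.2f} (same units as IC50)")
print(f"shorter treatment: {exposure_days(9, 24):.1f} to {exposure_days(10, 24):.1f} days")
print(f"longer treatment: {exposure_days(15, 24):.1f} to {exposure_days(16, 24):.1f} days")
```

In practice the IC50 and IC70 would be read from a measured dose-response curve for the cell line in question; the Hill conversion is only a stand-in when a single IC50 is available.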

In some embodiments according to any one of the methods described above, the sgRNA (or sgRNA iBAR) sequence counts are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, the sgRNA library is an sgRNA iBAR library, and the variance of each guide sequence is adjusted based on the data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to that guide sequence. In some embodiments, the data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to each guide sequence is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of the guide sequence is increased if the fold changes of the iBAR sequences are in different directions relative to each other (e.g., increased versus decreased, increased versus unchanged, or decreased versus unchanged).
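A minimal sketch of the two ideas in this paragraph follows: median ratio normalization of raw counts, and a direction-consistency check across the iBARs of one guide. The mean-variance model itself is omitted. The size-factor calculation follows the usual median-of-ratios recipe, and the size of the variance penalty is a hypothetical placeholder, since the application does not give one here. All names and numbers are illustrative.

```python
import math
from statistics import median


def size_factors(counts: dict) -> dict:
    """Median-of-ratios size factor per sample.

    counts maps sample name -> list of raw counts, one per construct,
    with constructs in the same order for every sample.
    """
    samples = list(counts)
    n_constructs = len(counts[samples[0]])
    log_geo_means = []
    for j in range(n_constructs):
        values = [counts[s][j] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)  # skip constructs with a zero count
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][j]) - g
            for j, g in enumerate(log_geo_means)
            if g is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


def ibar_directions_consistent(log2_fold_changes, tolerance=0.0) -> bool:
    """True if every iBAR of a guide moves the same way (up, down, or unchanged)."""
    directions = set()
    for lfc in log2_fold_changes:
        if lfc > tolerance:
            directions.add("up")
        elif lfc < -tolerance:
            directions.add("down")
        else:
            directions.add("unchanged")
    return len(directions) == 1


def adjusted_variance(model_variance, log2_fold_changes, penalty=2.0):
    """Inflate the guide-level variance when its iBARs disagree (penalty is hypothetical)."""
    if ibar_directions_consistent(log2_fold_changes):
        return model_variance
    return model_variance * penalty


# Hypothetical counts for four constructs in two samples.
raw = {"control": [100, 200, 300, 400], "treated": [200, 400, 600, 800]}
factors = size_factors(raw)
normalized = {s: [c / factors[s] for c in raw[s]] for s in raw}
print(factors)
print(adjusted_variance(1.0, [1.2, -0.8, 1.5, 0.9]))
```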

In some embodiments according to any one of the methods described above, the method comprises: subjecting the cancer cell library from step a) to at least two separate, different treatments with the anticancer drug in step b); growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) from each treatment; identifying one or more hit genes in the post-treatment cancer cell population obtained from each treatment; and combining the one or more hit genes identified from all treatments, thereby identifying the target genes in the cancer cell whose mutations render the cancer cell sensitive or resistant to the anticancer drug. In some embodiments, i) a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared with the control cancer cell population, with an FDR ≤ 0.1 in at least one treatment, is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared with the control cancer cell population, with an FDR ≤ 0.1 in at least one treatment, is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug.
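Combining hits across treatments under the "FDR ≤ 0.1 in at least one treatment" rule amounts to taking a union of per-treatment calls. The following sketch assumes each treatment has already produced a direction and an FDR for every hit gene; the data structure, gene names, and values are hypothetical.

```python
def combine_hits(per_treatment_calls: dict, fdr_cutoff: float = 0.1):
    """Union hit genes that pass the FDR cutoff in at least one treatment.

    per_treatment_calls maps treatment name -> {gene: (direction, fdr)},
    where direction is "enriched" or "depleted" relative to the control.
    Returns (resistance_genes, sensitivity_genes) as two sets.
    """
    resistance, sensitivity = set(), set()
    for calls in per_treatment_calls.values():
        for gene, (direction, fdr) in calls.items():
            if fdr > fdr_cutoff:
                continue
            if direction == "enriched":
                resistance.add(gene)
            elif direction == "depleted":
                sensitivity.add(gene)
    return resistance, sensitivity


# Hypothetical results from a shorter (b1) and a longer (b2) treatment.
results = {
    "b1": {"GENE_A": ("enriched", 0.02), "GENE_B": ("depleted", 0.30)},
    "b2": {"GENE_A": ("enriched", 0.20), "GENE_B": ("depleted", 0.05)},
}
resistance_genes, sensitivity_genes = combine_hits(results)
print(sorted(resistance_genes), sorted(sensitivity_genes))
```

A gene that is enriched in one treatment and depleted in another would land in both sets; this sketch does not try to resolve such conflicts.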

In some embodiments according to any one of the methods described above, the method comprises subjecting the cancer cell library from step a) to two separate treatments b1) and b2): b1) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times; b2) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times; c1) growing the cancer cell library from treatment b1) to obtain a post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug); c2) growing the cancer cell library from treatment b2) to obtain a post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug); d1) identifying one or more hit genes in the post-treatment cancer cell population from treatment b1); d2) identifying one or more hit genes in the post-treatment cancer cell population from treatment b2); and d3) combining the one or more hit genes identified from treatment b1) and treatment b2), thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the anticancer drug.
In some embodiments, i) a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as enriched in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to a control cancer cell population, with an FDR ≤ 0.1 in at least one treatment, is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as depleted in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to a control cancer cell population, with an FDR ≤ 0.1 in at least one treatment, is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug.
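The rule for combining hits across treatments (FDR ≤ 0.1 in at least one treatment, with enrichment indicating resistance and depletion indicating sensitivity) can be sketched as below. The data layout, the function name, and the gene labels are hypothetical placeholders, and the FDR values are toy numbers rather than results.

```python
FDR_THRESHOLD = 0.1  # threshold stated in the text


def classify_hits(results, fdr_threshold=FDR_THRESHOLD):
    """Combine per-treatment hits into resistance and sensitivity gene sets.

    results: dict mapping treatment name -> dict mapping gene ->
             (direction, fdr), where direction is "enriched" or "depleted".
    A gene qualifies if it passes the FDR threshold in at least one treatment.
    """
    resistance_genes, sensitive_genes = set(), set()
    for per_gene in results.values():
        for gene, (direction, fdr) in per_gene.items():
            if fdr > fdr_threshold:
                continue  # not significant in this treatment
            if direction == "enriched":
                resistance_genes.add(gene)
            elif direction == "depleted":
                sensitive_genes.add(gene)
    return resistance_genes, sensitive_genes


# Toy input for two treatments b1 and b2 (placeholder genes and values).
toy_results = {
    "b1": {"GENE_A": ("enriched", 0.05), "GENE_B": ("depleted", 0.20)},
    "b2": {"GENE_B": ("depleted", 0.08), "GENE_C": ("enriched", 0.50)},
}
resistance, sensitive = classify_hits(toy_results)
```

In this toy input, GENE_B fails the threshold in b1 but passes in b2, so it is still reported as a sensitivity hit; GENE_C never passes and is dropped.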

In some embodiments according to any one of the methods described above, the method comprises: i) separately identifying, for each of two or more (e.g., 2, 3, 4, 5, or more) different anticancer drugs, a set of one or more target genes whose mutations render the cancer cell sensitive to that anticancer drug when it is used as a single treatment; and ii) obtaining one or more target genes present in every set of target genes identified for each anticancer drug, thereby identifying the target gene whose mutation renders the cancer cell sensitive to combination treatment with the two or more different anticancer drugs; and/or i) separately identifying, for each of two or more (e.g., 2, 3, 4, 5, or more) different anticancer drugs, a set of one or more target genes whose mutations render the cancer cell resistant to that anticancer drug when it is used as a single treatment; and ii) obtaining one or more target genes present in the combination of the target gene sets identified for all of the anticancer drugs, thereby identifying the target gene whose mutation renders the cancer cell resistant to combination treatment with the two or more different anticancer drugs. In some embodiments, the two or more different anticancer drugs target the same cancer target. In some embodiments, the two or more different anticancer drugs target different cancer targets.
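The two set operations in the preceding paragraph can be sketched as follows. The sketch assumes that genes "present in every set" correspond to a set intersection and that genes "present in the combination" of the sets correspond to a set union; the second reading is an interpretation of the text. Function names, drug labels, and gene labels are hypothetical.

```python
def combination_sensitive_genes(sensitive_sets_by_drug):
    """Genes found in the sensitivity set of every single drug (intersection)."""
    sets = [set(genes) for genes in sensitive_sets_by_drug.values()]
    return set.intersection(*sets) if sets else set()


def combination_resistance_genes(resistance_sets_by_drug):
    """Genes found in the resistance set of any single drug (union, assumed)."""
    return set().union(*resistance_sets_by_drug.values())


# Placeholder drugs and genes, for illustration only.
sensitive_by_drug = {
    "drug_1": {"GENE_A", "GENE_B", "GENE_C"},
    "drug_2": {"GENE_B", "GENE_C", "GENE_D"},
}
resistance_by_drug = {
    "drug_1": {"GENE_X"},
    "drug_2": {"GENE_Y"},
}
shared_sensitive = combination_sensitive_genes(sensitive_by_drug)
any_resistance = combination_resistance_genes(resistance_by_drug)
```

Under these assumptions the sensitivity criterion is the stricter one: a gene must sensitize the cell to every drug on its own, whereas resistance to any one drug is enough for a gene to enter the combined resistance set.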

In some embodiments according to any one of the methods described above, the method further comprises ranking the identified target genes, wherein the target genes are ranked based on the degree of enrichment or depletion (e.g., fold enrichment, fold depletion, enrichment FDR, or depletion FDR) of the sgRNA (or sgRNA iBAR) guide sequences in the post-treatment cancer cell population compared to the control cancer cell population. In some embodiments, the sgRNA library is an sgRNA iBAR library, and the ranking of the target genes is further adjusted based on the data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to the guide sequences of the target gene.
In some embodiments, the method further comprises assigning a sensitivity score or a resistance score to each identified target gene, wherein target genes whose mutations render the cancer cell resistant to the anticancer drug are ranked from high to low based on the fold enrichment of the sgRNA (or sgRNA iBAR) guide sequences in the post-treatment cancer cell population compared to the control cancer cell population (or based on the enrichment FDR, where a smaller FDR gives a higher rank; or based on the degree of data consistency, where greater consistency gives a higher rank), and each target gene is assigned a resistance score corresponding to its rank from high to low; and/or wherein target genes whose mutations render the cancer cell sensitive to the anticancer drug are ranked from high to low based on the fold depletion of the sgRNA (or sgRNA iBAR) guide sequences in the post-treatment cancer cell population compared to the control cancer cell population (or based on the depletion FDR, where a smaller FDR gives a higher rank; or based on the degree of data consistency, where greater consistency gives a higher rank), and each target gene is assigned a sensitivity score corresponding to its rank from high to low.
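A rank-based scoring step of this kind might look like the sketch below. The text states only that scores follow the ranking from high to low; the specific mapping used here (the top-ranked gene receives a score equal to the number of genes, the last receives 1) is an assumption made for illustration, as are the function name and the placeholder genes and values.

```python
def rank_and_score(fold_changes):
    """Rank genes by fold enrichment (or fold depletion), highest first.

    fold_changes: dict mapping gene -> fold enrichment or fold depletion.
    Returns a list of (gene, rank, score) tuples, where rank 1 is the highest
    and score = n - rank + 1 (a hypothetical rank-to-score mapping).
    """
    ordered = sorted(fold_changes.items(), key=lambda item: item[1], reverse=True)
    n = len(ordered)
    return [
        (gene, rank, n - rank + 1)
        for rank, (gene, _value) in enumerate(ordered, start=1)
    ]


# Placeholder fold-enrichment values for three hypothetical resistance genes.
resistance_ranking = rank_and_score({"GENE_A": 2.0, "GENE_B": 8.0, "GENE_C": 4.0})
```

To rank by FDR instead, the same function could be called on negated FDR values so that a smaller FDR sorts first.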

In some embodiments according to any one of the methods described above, the anticancer drug is a PARP inhibitor.

In some embodiments according to any one of the methods described above, the cancer cell is a colorectal cancer cell.

In some embodiments according to any one of the methods described above, the method further comprises culturing the same cancer cell library under the same conditions without contact with the anticancer drug, and optionally subjecting it to the same collection method as in step c), to obtain the control cancer cell population.

In some embodiments according to any one of the methods described above, the method further comprises validating the target gene by: a) modifying a cancer cell by generating a mutation (e.g., an inactivating mutation) in the target gene of the cancer cell; and b) determining the sensitivity or resistance of the modified cancer cell to the anticancer drug.

Another aspect of the invention provides a method of identifying a target gene in a cancer cell, wherein a mutation in the target gene renders the cancer cell sensitive to a combination therapy comprising a first anticancer drug and a second anticancer drug, the method comprising: i) identifying, according to any one of the methods described above, a first set of one or more target genes in the cancer cell whose mutations render the cancer cell sensitive to the first anticancer drug; ii) identifying, according to any one of the methods described above, a second set of one or more target genes in the cancer cell whose mutations render the cancer cell sensitive to the second anticancer drug; and iii) obtaining one or more target genes present in both the first set and the second set of target genes, thereby identifying the target gene whose mutation renders the cancer cell sensitive to the combination therapy.

Another aspect of the invention provides a method of treating cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected for treatment on the basis that the individual has, in a target gene ("drug-sensitive gene"), an aberration (e.g., carries a mutation) that renders the cancer cell sensitive to the anticancer drug ("drug-sensitivity aberration"), and wherein the drug-sensitive gene is identified according to any one of the target gene identification methods described above.

Another aspect of the invention provides a method of excluding an individual (e.g., a human) having cancer from a treatment comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is excluded if the individual has, in a target gene ("drug resistance gene"), an aberration (e.g., carries a mutation) that renders the cancer cell resistant to the anticancer drug ("drug resistance aberration"), and wherein the drug resistance gene is identified according to any one of the target gene identification methods described above.

Another aspect of the invention provides a method of treating cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected based on: i) an aberration (e.g., a mutation) in one or more target genes ("drug-sensitive genes") that renders the cancer cell sensitive to the anticancer drug ("drug-sensitivity aberration", such as a "drug-sensitivity mutation"), and ii) an aberration (e.g., a mutation) in one or more target genes ("drug resistance genes") that renders the cancer cell resistant to the anticancer drug ("drug resistance aberration", such as a "drug resistance mutation"), wherein the drug-sensitive genes and the drug resistance genes are identified according to any one of the target gene identification methods described above, and wherein the individual is selected for treatment if the composite score of the drug-sensitivity aberration (e.g., drug-sensitivity mutation) and the drug resistance aberration (e.g., drug resistance mutation) is above a composite score threshold level. In some embodiments, the composite score is obtained by: (i) (the absolute value of the sum of the sensitivity scores of the drug-sensitive genes) minus (the absolute value of the sum of the resistance scores of the drug resistance genes), or (ii) Formula I described herein, wherein the individual is selected for treatment if the composite score is above zero.
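Option (i) for the composite score is simple arithmetic and can be sketched as below. Formula I is not reproduced in this section, so it is not covered; the function names and the example scores are hypothetical.

```python
def composite_score(sensitivity_scores, resistance_scores):
    """|sum of sensitivity scores| minus |sum of resistance scores|."""
    return abs(sum(sensitivity_scores)) - abs(sum(resistance_scores))


def select_for_treatment(sensitivity_scores, resistance_scores, threshold=0.0):
    """Select the individual when the composite score exceeds the threshold.

    The default threshold of zero follows the embodiment described in the text.
    """
    return composite_score(sensitivity_scores, resistance_scores) > threshold


# Placeholder scores for the aberrant genes found in one individual.
score = composite_score([3, 2], [1])              # 5 - 1 = 4
selected = select_for_treatment([3, 2], [1])      # 4 > 0, so selected
excluded = not select_for_treatment([1], [3])     # 1 - 3 = -2, not selected
```

As a worked case: an individual carrying aberrations in two drug-sensitive genes scored 3 and 2 and one drug resistance gene scored 1 has a composite score of |3 + 2| − |1| = 4, which is above zero.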

In another aspect, there is also provided a method of producing a modified cancer cell, comprising inactivating a target gene identified by any one of the target gene identification methods described above.

Also provided is a modified colorectal cancer cell comprising a mutation (e.g., an inactivating mutation) in a target gene, wherein the target gene is: i) selected from the group consisting of ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1; or ii) selected from the group consisting of AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2.

Also provided is an sgRNA (or sgRNA iBAR) library comprising one or more sgRNA (or sgRNA iBAR) constructs, wherein each sgRNA (or sgRNA iBAR) construct comprises or encodes an sgRNA (or sgRNA iBAR), and wherein each sgRNA (or sgRNA iBAR) comprises a guide sequence that is complementary (e.g., at least about any one of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a target gene, wherein the target gene is: i) selected from the group consisting of ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1; or ii) selected from the group consisting of AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2.

Also provided are kits and articles of manufacture for use in the methods described herein, such as kits for producing modified cancer cells that are sensitive or resistant to an anticancer drug.

When mutations occur, cancer cells can acquire resistance to targeted therapeutic drugs. Resistance to anticancer drugs has become a major obstacle to successful cancer treatment. Certain gene mutations may make cancer cells easier for anticancer agents to kill, and treating cancers that carry such drug-sensitivity mutations with those agents may lead to a higher treatment success rate. For patients with drug resistance mutations, alternative treatment plans can be sought. Identifying "drug-sensitive genes" (i.e., genes whose mutation makes cancer cells more sensitive to the therapeutic effect of an anticancer drug) and "drug resistance genes" (i.e., genes whose mutation makes cancer cells more resistant to the therapeutic effect of an anticancer drug) across multiple therapeutic pathways is therefore important for patient selection and treatment design (e.g., choosing a drug or a combination therapy directed at a selected therapeutic pathway) in order to achieve better therapeutic outcomes. Engineered cancer cells carrying these drug-sensitivity or drug resistance mutations can also be used for new drug design and screening, for example by de novo design or by modifying the chemical groups of existing compounds that are resisted by certain drug resistance mutations.

For example, poly(ADP-ribose) polymerase (PARP) inhibitors (PARPi) are cancer drugs that target a specific therapeutic pathway. Once PARP detects a single-strand break (SSB), it binds the DNA and catalyzes the synthesis of polymeric adenosine diphosphate ribose (poly(ADP-ribose), or PAR) chains on protein substrates. Through this catalytic activity, PARP recruits other DNA damage repair (DDR) proteins to the damage site to jointly repair the DNA damage. PARPi bind to the catalytic site of PARP and block poly(ADP-ribosyl)ation (PARylation) and the recruitment of other DDR proteins; more importantly, PARP becomes trapped on the damaged DNA and cannot dissociate. PARP trapped at the DNA damage site causes the DNA replication fork to stall, so that DNA replication cannot proceed, which results in DNA double-strand breaks. When this happens, cells usually trigger homologous recombination repair (HRR), in which BRCA plays an important role. In tumors with HRR deficiency, such as tumors with BRCA mutations, HRR of double-stranded DNA (dsDNA) breaks is impaired, and tumor cells are forced to use other DNA repair mechanisms, such as error-prone non-homologous end joining (NHEJ), which often introduces large-scale genomic rearrangements that lead to genomic instability and cell death. The combination of PARPi and loss of BRCA function may therefore strongly suppress DDR in tumor cells and promote tumor cell apoptosis.

To provide more effective patient selection and treatment design, and to achieve better therapeutic outcomes for cancer, particularly for cancer types and/or stages that are difficult to treat (e.g., advanced metastatic colorectal cancer), the present invention uses high-throughput screening to identify mutations that lead to drug-sensitivity and/or drug resistance phenotypes for certain anticancer drugs, to establish the relationship between gene function and drug response, and to explore the use of these drug-sensitive genes and drug resistance genes as biomarkers for patient selection and treatment design. This will greatly facilitate the accurate selection of patient populations and improve the efficacy of anticancer drugs (such as PARPi) in treating cancer (such as colorectal cancer). Engineered cancer cells carrying mutations in these drug-sensitive genes or drug resistance genes will also serve as promising tools for new drug design and screening.

The present application provides methods of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug. A target gene whose mutation renders a cancer cell sensitive to an anticancer drug is hereinafter referred to as a "drug-sensitive gene", and a mutation in such a gene is hereinafter referred to as a "drug-sensitivity mutation". A target gene whose mutation renders a cancer cell resistant to an anticancer drug is hereinafter referred to as a "drug resistance gene", and a mutation in such a gene is hereinafter referred to as a "drug resistance mutation".
The method comprises: a) providing a cancer cell library comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a mutation (e.g., an inactivating mutation) in a hit gene ("hit gene mutation"), and wherein the hit genes in at least two of the plurality of cancer cells are different from each other; b) contacting the cancer cell library with the anticancer drug; c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug); and d) identifying the target gene based on the difference between the profiles of hit gene mutations in the post-treatment cancer cell population and in a control cancer cell population (e.g., obtained from the same cancer cell library cultured under the same conditions without contact with the anticancer drug). In some embodiments, the one or more mutations (e.g., inactivating mutations) in the one or more hit genes are generated by a CRISPR/Cas guide RNA (e.g., a single guide RNA, "sgRNA") or a construct encoding a CRISPR/Cas guide RNA (e.g., a vector such as a viral vector, or a virus such as a lentivirus), such as an sgRNA comprising an iBAR sequence as described herein ("sgRNA iBAR"). The target gene can thus be identified either directly, based on the difference between the profiles of hit gene mutations in the post-treatment cancer cell population and the control cancer cell population (e.g., by DNA sequencing), or based on the difference between the profiles of the sgRNAs or sgRNA iBARs that generate the hit gene mutations in the two populations (e.g., by identifying the sgRNA guide sequences and thereby the corresponding hit genes). Screening assays that use the sgRNA iBAR molecules, constructs, sets, or libraries described herein provide a reliable and efficient screening strategy for large-scale target identification in eukaryotic cells (e.g., cancer cells), with much lower false positive and false negative rates, and allow cell libraries to be generated at a high MOI.
The target genes identified herein are particularly useful for patient selection or exclusion in cancer treatment. For example, patients who, compared to healthy individuals, carry a mutation (e.g., an inactivating mutation) in a drug-sensitive gene identified herein, and/or have reduced or absent expression (e.g., of mRNA or protein) of the drug-sensitive gene, and/or have reduced or absent activity of an expression product (e.g., mRNA or protein) of the drug-sensitive gene, are particularly suitable for treatment with the corresponding anticancer drug.

Accordingly, one aspect of the invention provides a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA library or an sgRNA iBAR library, together with a Cas component (e.g., Cas9), targeting one or more hit genes; b) contacting the cancer cell library with the anticancer drug; c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug); and d) identifying the target gene based on the difference between the profiles of sgRNAs, sgRNA iBARs, or hit gene mutations in the post-treatment cancer cell population and a control cancer cell population.
In some embodiments, the Cas component comprises a Cas protein or a nucleic acid encoding the Cas protein. In some embodiments, the sgRNA library comprises one or more sgRNA constructs, wherein each sgRNA construct comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any one of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in the corresponding hit gene. In some embodiments, the sgRNA iBAR library comprises a plurality of sets of sgRNA iBAR constructs, wherein each set comprises three or more (e.g., four) sgRNA iBAR constructs, each of which comprises or encodes an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the three or more (e.g., four) sgRNA iBAR constructs in a set are identical and are complementary (e.g., at least about any one of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to the same target site in the corresponding hit gene, wherein the iBAR sequences of the three or more (e.g., four) sgRNA iBAR constructs in a set are different from one another, wherein the guide sequences of different sets of sgRNA iBAR constructs are complementary to different target sites in the hit genes (e.g., in different genes, or at different sites within the same gene), and wherein each sgRNA iBAR is operable with a Cas protein (e.g., Cas9) to modify the target site (e.g., by cleavage or by modulation of expression). In some embodiments, more than one (e.g., 2, 3, 4, or more, such as 3) guide sequence is designed for each hit gene. In some embodiments, the method comprises comparing the sgRNA (or sgRNA iBAR) sequence counts obtained from the post-treatment cancer cell population with the sgRNA (or sgRNA iBAR) sequence counts obtained from the control cancer cell population.
In some embodiments, a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as enriched (e.g., with an FDR ≤ 0.1) in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to the control cancer cell population is identified as a drug resistance gene. In some embodiments, a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as depleted (e.g., with an FDR ≤ 0.1) in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to the control cancer cell population is identified as a drug-sensitive gene.
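The structural constraints on an sgRNA iBAR library stated above (each guide sequence paired with three or more mutually distinct iBAR sequences) can be expressed as a small consistency check. This is an illustrative sketch only: the tuple layout, the function name, and the placeholder sequences, including the six-nucleotide barcode length, are assumptions and are not taken from the source.

```python
from collections import defaultdict


def group_ibars_by_guide(constructs, min_ibars_per_guide=3):
    """Group constructs by guide sequence and check the set-level constraints.

    constructs: iterable of (gene, guide_sequence, ibar_sequence) tuples.
    Returns a dict mapping guide_sequence -> list of iBAR sequences.
    Raises ValueError if a guide has repeated iBARs or too few iBARs.
    """
    ibars_by_guide = defaultdict(list)
    for _gene, guide, ibar in constructs:
        ibars_by_guide[guide].append(ibar)
    for guide, ibars in ibars_by_guide.items():
        if len(set(ibars)) != len(ibars):
            raise ValueError(f"guide {guide} has repeated iBAR sequences")
        if len(ibars) < min_ibars_per_guide:
            raise ValueError(f"guide {guide} has fewer than {min_ibars_per_guide} iBARs")
    return dict(ibars_by_guide)


# One placeholder set: a single guide sequence carried by four constructs
# that differ only in their iBAR (all sequences are made up).
toy_library = [
    ("GENE_A", "ACGTACGTACGTACGTACGT", ibar)
    for ibar in ("AAAAAA", "CCCCCC", "GGGGGG", "TTTTTT")
]
toy_sets = group_ibars_by_guide(toy_library)
```

Because every construct in a set shares its guide sequence, read counts can later be tallied per (guide, iBAR) pair, which gives each guide several internal replicates within a single screen.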

Also provided is a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive to a combination therapy comprising two or more (e.g., 2, 3, 4, 5, or more) anticancer drugs, comprising: separately identifying, using any of the methods described herein, a set of target genes whose mutations render the cancer cell sensitive to each anticancer drug when used as a single treatment; and obtaining one or more target genes present in every set of target genes identified for each anticancer drug, thereby identifying the target gene whose mutation renders the cancer cell sensitive to the combination therapy.

Another aspect of the invention provides a method of treating cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected for treatment on the basis that the individual has, compared to a healthy individual, a drug-sensitivity aberration in a drug-sensitive gene (e.g., carries a drug-sensitivity mutation with respect to the anticancer drug, and/or has aberrant (e.g., reduced or absent) expression (e.g., of mRNA or protein), and/or has aberrant (e.g., reduced or abolished) activity of the drug-sensitive gene (e.g., RNA or protein activity, such as due to epigenetic or post-translational modification)). The invention also provides a method of excluding an individual having cancer from a treatment comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is excluded if the individual has, compared to a healthy individual, a drug resistance aberration in a drug resistance gene (e.g., carries a drug resistance mutation with respect to the anticancer drug, and/or has aberrant (e.g., reduced or absent) expression (e.g., of mRNA or protein), and/or has aberrant (e.g., reduced or abolished) activity of the drug resistance gene (e.g., RNA or protein activity, such as due to epigenetic or post-translational modification)).
The invention also provides a method of treating cancer in an individual, comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected based on a drug-sensitivity aberration (e.g., a drug-sensitivity mutation) and a drug resistance aberration (e.g., a drug resistance mutation), and wherein the individual is selected for treatment if the composite score of the drug-sensitivity aberration and the drug resistance aberration is above a composite score threshold level (e.g., the mutations as a whole render the cancer cell sensitive to the anticancer drug).

Also provided are sgRNA or sgRNA iBAR molecules, constructs, sets, or libraries that can be used to perform the screening methods described herein. Also provided are modified cancer cells comprising the sgRNA or sgRNA iBAR molecules, constructs, sets, or libraries, and methods of producing them. Also provided are target genes whose mutation (e.g., inactivation, such as knockout) renders cancer cells more sensitive or more resistant to killing by one or more anticancer drugs. Also provided are sgRNA or sgRNA iBAR molecules, constructs, sets, or libraries directed against the drug-sensitive genes or drug-resistance genes identified herein, modified cancer cells comprising them, pharmaceutical compositions thereof, and kits.

I. Definitions

The present invention will be described with respect to particular embodiments and with reference to certain drawings, but the invention is not limited thereto. Any reference signs in the claims shall not be construed as limiting the scope. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale for illustrative purposes. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In case of conflict, this document, including definitions, will control. Preferred methods and materials are described below, although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. The materials, methods, and examples disclosed herein are illustrative only and not intended to be limiting.

As used herein, "internal barcode" or "iBAR" refers to an index inserted into or appended to a molecule that can be used to trace the identity and performance of the molecule. An iBAR can be, for example, a short nucleotide sequence inserted into or appended to a guide RNA for a CRISPR/Cas system, as exemplified in the present invention. Multiple iBARs can be used to track the performance of a single guide RNA sequence within one experiment, thereby providing replicate data for statistical analysis without having to repeat the experiment.

"CRISPR system" or "CRISPR/Cas system" refers collectively to transcripts and other elements involved in the expression of, and/or directing the activity of, CRISPR-associated ("Cas") genes. For example, a CRISPR/Cas system may include a sequence encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (e.g., encompassing a "direct repeat" and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a "spacer" in the context of an endogenous CRISPR system), and other sequences and transcripts derived from a CRISPR locus.

In the context of formation of a CRISPR complex, "target sequence" refers to a sequence to which a guide sequence is designed to have complementarity, wherein hybridization between the target sequence and the guide sequence promotes the formation of a CRISPR complex. Full complementarity is not necessarily required, provided there is sufficient complementarity to cause hybridization and promote formation of a CRISPR complex. A target sequence may comprise any polynucleotide, such as a DNA or RNA polynucleotide. A CRISPR complex may comprise a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins.

The term "guide sequence" refers to a contiguous nucleotide sequence in a guide RNA that has partial or complete complementarity to a target sequence in a target polynucleotide and that can hybridize to the target sequence by base pairing facilitated by a Cas protein. In the CRISPR/Cas9 system, the target sequence is adjacent to a PAM site. The PAM sequence and its complementary sequence on the other strand together constitute the PAM site.

The terms "single guide RNA", "synthetic guide RNA", and "sgRNA" are used interchangeably and refer to a polynucleotide sequence comprising a guide sequence and any other sequence required for sgRNA function and/or for interaction of the sgRNA with one or more Cas proteins to form a CRISPR complex. In some embodiments, the sgRNA comprises a guide sequence fused to a second sequence comprising a tracr sequence derived from a tracrRNA and a tracr-mate sequence derived from a crRNA. The tracr sequence may comprise all or part of the sequence of a tracrRNA from a naturally occurring CRISPR/Cas system. The term "guide sequence" refers to the nucleotide sequence in a guide RNA that specifies the target site, and may be used interchangeably with the terms "guide" or "spacer". The term "tracr-mate sequence" may also be used interchangeably with the term "direct repeat". As used herein, "sgRNA iBAR" refers to a single guide RNA having an iBAR sequence.

The term "operable with a Cas protein" means that the guide RNA can interact with the Cas protein to form a CRISPR complex.

As used herein, the term "wild type" is a term of the art understood by skilled persons and refers to the typical form of an organism, strain, gene, or characteristic as it occurs in nature, as distinguished from mutant or variant forms.

As used herein, the term "variant" should be understood to mean the exhibition of qualities that deviate from the pattern that occurs in nature.

"Complementarity" refers to the ability of a nucleic acid to form hydrogen bonds with another nucleic acid sequence by either traditional Watson-Crick base pairing or other non-traditional types of base pairing. Percent complementarity indicates the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, or 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary, respectively). "Perfectly complementary" means that all the contiguous residues of a nucleic acid sequence will form hydrogen bonds with the same number of contiguous residues in a second nucleic acid sequence. As used herein, "substantially complementary" refers to a degree of complementarity of at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
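The percent-complementarity definition above can be illustrated with a minimal Python sketch. It assumes equal-length, ungapped strands aligned antiparallel and counts only Watson-Crick pairs; the function name and these simplifications are illustrative assumptions and are not part of the disclosure.

```python
# Watson-Crick partners for DNA/RNA bases.
# Assumption of this sketch: wobble and other non-traditional pairs are ignored.
WATSON_CRICK = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(strand: str, other: str) -> float:
    """Percentage of residues in `strand` that can pair with `other`.

    Both strands are written 5'-to-3' and aligned antiparallel without gaps.
    """
    if not strand:
        raise ValueError("empty sequence")
    if len(strand) != len(other):
        raise ValueError("this sketch only handles equal-length, ungapped strands")
    # Reverse `other` so that position i of `strand` faces its antiparallel partner.
    pairs = zip(strand.upper(), reversed(other.upper()))
    matched = sum((a, b) in WATSON_CRICK for a, b in pairs)
    return 100.0 * matched / len(strand)


# A sequence against its reverse complement is perfectly complementary.
full = percent_complementarity("ACGTACGTAC", "GTACGTACGT")
# Only 5 of 10 residues can pair here.
half = percent_complementarity("AAAAAAAAAA", "TTTTTCCCCC")
```

Under these assumptions `full` is 100.0 and `half` is 50.0, matching the "10 out of 10" and "5 out of 10" cases in the definition.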

As used herein, "stringent conditions" for hybridization refers to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes with the target sequence and substantially does not hybridize to non-target sequences. Stringent conditions are generally sequence-dependent and vary depending on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence. Non-limiting examples of stringent conditions are described in detail in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology-Hybridization With Nucleic Acid Probes Part I, Second Chapter "Overview of principles of hybridization and the strategy of nucleic acid probe assay", Elsevier, N.Y.

"Hybridization" refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of PCR, or the cleavage of a polynucleotide by an enzyme. A sequence capable of hybridizing with a given sequence is referred to as the "complement" of the given sequence.

"Doubling time" or "population doubling time" (PDT) herein refers to the time it takes for a cell population to double in size. Cell doubling time = ln(2)/(growth rate). Growth rate (gr) refers to the number of doublings per unit time. gr = ln(N(t)/N(0))/t, where N(t) is the number of cells at time t, N(0) is the number of cells at time 0, and t is the time (usually in hours, h). When a cell population is an exponentially growing population, i.e., each individual cell doubles in each cell cycle, the growth rate depends only on the length of the cell cycle, and gr = log2(N(t)/N(0))/t.
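As a worked illustration of the doubling-time formulas above, the following Python sketch computes both forms of the growth rate. The function names and the example cell numbers are assumptions chosen for illustration. Note that the relation doubling time = ln(2)/gr applies to the natural-log form of gr; with the log2 form, gr is already in doublings per unit time, so the doubling time is simply 1/gr.

```python
import math


def growth_rate(n0: float, nt: float, hours: float) -> float:
    """Growth rate per hour in natural-log form: gr = ln(N(t)/N(0)) / t."""
    if n0 <= 0 or nt <= 0 or hours <= 0:
        raise ValueError("cell numbers and elapsed time must be positive")
    return math.log(nt / n0) / hours


def doublings_per_hour(n0: float, nt: float, hours: float) -> float:
    """Growth rate in log2 form: gr = log2(N(t)/N(0)) / t (doublings per hour)."""
    if n0 <= 0 or nt <= 0 or hours <= 0:
        raise ValueError("cell numbers and elapsed time must be positive")
    return math.log2(nt / n0) / hours


def doubling_time(n0: float, nt: float, hours: float) -> float:
    """Population doubling time in hours: ln(2) / gr, with gr in natural-log form."""
    gr = growth_rate(n0, nt, hours)
    if gr <= 0:
        raise ValueError("population did not grow; doubling time is undefined")
    return math.log(2) / gr


# Hypothetical culture: 1e6 cells grow to 8e6 cells in 72 h, i.e. three doublings.
pdt = doubling_time(1e6, 8e6, 72.0)
rate = doublings_per_hour(1e6, 8e6, 72.0)
```

For this hypothetical culture the doubling time is about 24 h and the log2-form rate is about 1/24 doublings per hour, so the two forms agree once the matching conversion is used.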

As used herein, "construct" refers to a nucleic acid molecule (e.g., DNA or RNA), or a vector capable of delivering such a nucleic acid molecule. For example, when used in the context of an sgRNA, a construct refers to an sgRNA molecule, a nucleic acid molecule encoding the sgRNA (e.g., an isolated DNA or a viral vector), or a vector capable of delivering a nucleic acid molecule encoding the sgRNA, such as a lentivirus carrying a nucleic acid molecule encoding the sgRNA. When used in the context of a protein, a construct refers to a nucleic acid molecule comprising a nucleotide sequence that can be transcribed into RNA or expressed as a protein. A construct may contain necessary regulatory elements operably linked to the nucleotide sequence that allow transcription or expression of the nucleotide sequence when the construct is present in a host cell.

As used herein, "operably linked" means that expression of a gene is under the control of a regulatory element (e.g., a promoter) with which it is spatially connected. A regulatory element may be positioned 5' (upstream) or 3' (downstream) of the gene under its control. The distance between the regulatory element (e.g., promoter) and the gene may be approximately the same as the distance between that regulatory element and the gene it naturally controls and from which the regulatory element is derived. As is known in the art, variation in this distance can be accommodated without loss of function of the regulatory element (e.g., promoter).

The term "vector" is used to describe a nucleic acid molecule that can be engineered to contain a cloned polynucleotide or polynucleotides that can be propagated in a host cell. Vectors include, but are not limited to: nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, or no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a "plasmid", which refers to a circular double-stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell and are thereby replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operably linked. Such vectors are referred to herein as "expression vectors". A recombinant expression vector can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vector includes one or more regulatory elements, which may be selected on the basis of the host cell to be used for expression, operably linked to the nucleic acid sequence to be expressed.

"Host cell" refers to a cell that may be or has been a recipient of a vector or an isolated polynucleotide. Host cells may be prokaryotic cells or eukaryotic cells. In some embodiments, the host cell is a eukaryotic cell that can be cultured in vitro and modified using the methods described herein. The term "cell" includes the primary subject cell and its progeny.

"Multiplicity of infection" and "MOI" are used interchangeably herein to refer to the ratio of agents (e.g., phage, virus, or bacteria) to their infection targets (e.g., cells or organisms). For example, when referring to a group of cells inoculated with viral particles, the multiplicity of infection or MOI is the ratio between the number of viral particles (e.g., viral particles comprising an sgRNA library) and the number of target cells present in the mixture during viral transduction.
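The MOI ratio defined above is simple arithmetic; a short Python sketch makes the units explicit. The function names and the example numbers are hypothetical and serve only as an illustration.

```python
def multiplicity_of_infection(viral_particles: float, target_cells: float) -> float:
    """MOI = number of viral particles / number of target cells in the mixture."""
    if target_cells <= 0:
        raise ValueError("target cell number must be positive")
    if viral_particles < 0:
        raise ValueError("viral particle number cannot be negative")
    return viral_particles / target_cells


def particles_needed(target_moi: float, target_cells: float) -> float:
    """Number of viral particles required to reach a desired MOI."""
    if target_moi < 0 or target_cells <= 0:
        raise ValueError("MOI must be non-negative and cell number positive")
    return target_moi * target_cells


# Hypothetical transduction: 3e6 viral particles added to 1e7 cells.
moi = multiplicity_of_infection(3e6, 1e7)
# Particles required to transduce 2e7 cells at the same MOI.
needed = particles_needed(moi, 2e7)
```

In this hypothetical case the MOI is 0.3, and doubling the number of cells doubles the number of particles needed to keep it there.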

As used herein, the "phenotype" of a cell refers to an observable characteristic or trait of the cell, such as its morphology, development (e.g., growth, proliferation, differentiation, or death), biochemical or physiological properties, phenology, or behavior. A phenotype may result from the expression of genes in the cell, the influence of environmental factors, or the interaction between the two. In some embodiments, the phenotype is resistance or sensitivity to killing (e.g., by an anticancer drug). In some embodiments, the phenotype is inhibition of growth or proliferation. In some embodiments, the phenotype is death.

An "isolated" nucleic acid molecule described herein is a nucleic acid molecule that is identified and separated from at least one contaminant nucleic acid molecule with which it is ordinarily associated in the environment in which it was produced. Preferably, the isolated nucleic acid is free of association with all components associated with the production environment. Isolated nucleic acid molecules encoding the polypeptides and antibodies herein are in a form other than the form or setting in which they are found in nature. Isolated nucleic acid molecules are therefore distinguished from nucleic acids encoding the polypeptides and antibodies herein that exist naturally in cells.

Unless otherwise specified, a "nucleotide sequence encoding an amino acid sequence" includes all nucleotide sequences that are degenerate versions of each other and that encode the same amino acid sequence. The phrase "nucleotide sequence that encodes a protein or an RNA" may also include introns, to the extent that the nucleotide sequence encoding the protein may in some versions contain introns.

As used herein, the term "transfected", "transformed", or "transduced" refers to a process by which exogenous nucleic acid is transferred or introduced into a host cell (e.g., a cancer cell). A "transfected", "transformed", or "transduced" cell is one that has been transfected, transformed, or transduced with exogenous nucleic acid. The cell includes the primary subject cell and its progeny.

As used herein, "treatment" or "treating" is an approach for obtaining beneficial or desired results, including clinical results. For purposes of this invention, beneficial or desired clinical results include, but are not limited to, one or more of the following: alleviating one or more symptoms resulting from the disease, diminishing the extent of the disease, stabilizing the disease (e.g., preventing or delaying the worsening of the disease), preventing or delaying the spread (e.g., metastasis) of the disease, preventing or delaying the recurrence of the disease, delaying or slowing the progression of the disease, ameliorating the disease state, providing a remission (partial or total) of the disease, decreasing the dose of one or more other medications required to treat the disease, delaying the progression of the disease, increasing the quality of life, and/or prolonging survival. "Treatment" also encompasses a reduction of the pathological consequences of cancer.

As used herein, an "individual" or a "subject" refers to a mammal, including, but not limited to, a human, bovine, horse, feline, canine, rodent, or primate. In some embodiments, the individual is a human.

As used herein, "patient" includes any human having a disease (e.g., cancer). The terms "subject", "individual", and "patient" are used interchangeably herein.

Where the term "comprising" is used in the present description and claims, it does not exclude other elements or steps.

It is to be understood that embodiments of the present application described herein include "consisting of" and/or "consisting essentially of" embodiments.

Reference herein to "about" a value or parameter includes (and describes) variations that are directed to that value or parameter per se. For example, a description referring to "about X" includes a description of "X".

As used herein, reference to "not" a value or parameter generally means and describes "other than" the value or parameter. For example, the statement that the method is not used to treat cancer of type X means that the method is used to treat cancer of types other than X.

As used herein, the term "about X-Y" has the same meaning as "about X to about Y".

For the recitation of numeric ranges of nucleotides herein, each intervening number in between is explicitly contemplated. For example, for the range of 19-21 nt, the number 20 nt is contemplated in addition to 19 nt and 21 nt; and for ranges of MOI, each intervening number in between, whether an integer or a decimal, is explicitly contemplated.

As used herein and in the appended claims, the singular forms "a/an", "or", and "the" include plural referents unless the context clearly dictates otherwise.

II. Methods of identifying target genes in cancer cells whose mutation renders the cancer cells sensitive or resistant to anticancer drugs

The present application provides methods of identifying target genes in cancer cells that modulate the activity of the cancer cells, such as their response to anticancer drug treatment.

In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a mutation (e.g., an inactivating mutation) in a hit gene ("hit gene mutation"), and wherein the hit genes in at least two of the plurality of cancer cells are different from each other; b) contacting the cancer cell library with the anticancer drug (e.g., at a concentration of about IC50 to about IC70); c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); and d) identifying the target gene based on the difference between the profiles of hit gene mutations in the post-treatment cancer cell population and in a control cancer cell population. In some embodiments, the control cancer cell population is obtained from the cancer cell library cultured under the same conditions without contact with the anticancer drug. In some embodiments, the profiles of hit gene mutations in the post-treatment cancer cell population and the control cancer cell population are identified by next-generation sequencing. In some embodiments, the method comprises comparing sequence counts of sequences comprising the hit gene mutations from the post-treatment cancer cell population with sequence counts of sequences comprising the hit gene mutations from the control cancer cell population, wherein: i) a hit gene whose corresponding hit gene mutation sequence is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment), is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding hit gene mutation sequence is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion), is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, the cancer cell library has at least about 100-fold (e.g., at least about any of 200-, 600-, 1000-, 2000-, 4000-, 8000-, 10000-fold, or more) coverage for each hit gene, such as about 600-fold to about 1200-fold, or about 1200-fold to about 12,000-fold coverage for each hit gene. In some embodiments, each hit gene is targeted by at least 2 (e.g., 2, 3, 4, 5, 6, or more, such as 3, or 6-12) different hit gene mutations (e.g., targeting different target sites of the hit gene) in the cancer cell library. In some embodiments, steps b) and c) comprise contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, steps b) and c) comprise contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, the coverage for each hit gene of the cancer cell library remains the same or similar (e.g., within about 10% difference) after successive passages under anticancer drug treatment. In some embodiments, the sequence counts of sequences comprising hit gene mutations are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, the variance of each sequence comprising a hit gene mutation (e.g., an inactivating mutation) is adjusted based on data consistency within the same gene. In some embodiments, the data consistency among different hit gene mutation (e.g., inactivating mutation) sequences corresponding to the same gene is determined based on the direction of the fold change of each hit gene mutation sequence, wherein the variance of the hit gene mutation sequences is increased if the fold changes of the different hit gene mutation sequences for the same hit gene are in different directions relative to each other (e.g., increased versus decreased, increased versus unchanged, and decreased versus unchanged are all regarded as different directions).
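The count-comparison steps described above (median ratio normalization, the fold-change direction check, and the FDR and fold-change cutoffs) can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the applicants' pipeline: the median-of-ratios procedure shown is the commonly used geometric-mean variant, the `tol` threshold for "unchanged" is an invented parameter, the mean-variance modeling and FDR estimation are not implemented, and `call_hit` requires both cutoffs although the text permits "and/or".

```python
import math
from statistics import median
from typing import Dict, List


def median_ratio_size_factors(counts: Dict[str, List[float]]) -> List[float]:
    """One size factor per sample, from {sequence id: [count per sample]}."""
    if not counts:
        raise ValueError("no sequences given")
    n_samples = len(next(iter(counts.values())))
    # Log geometric mean per sequence, using only sequences seen in every sample.
    log_geo_means = {
        seq: sum(math.log(c) for c in row) / n_samples
        for seq, row in counts.items()
        if all(c > 0 for c in row)
    }
    if not log_geo_means:
        raise ValueError("need at least one sequence with nonzero counts in every sample")
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(counts[seq][j]) - lgm for seq, lgm in log_geo_means.items()]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts: Dict[str, List[float]], factors: List[float]) -> Dict[str, List[float]]:
    """Divide each sample's counts by that sample's size factor."""
    return {seq: [c / f for c, f in zip(row, factors)] for seq, row in counts.items()}


def direction(log2_fold_change: float, tol: float = 0.5) -> int:
    """+1 increased, -1 decreased, 0 unchanged (tol is an assumed threshold)."""
    if log2_fold_change > tol:
        return 1
    if log2_fold_change < -tol:
        return -1
    return 0


def is_consistent(log2_fold_changes: List[float], tol: float = 0.5) -> bool:
    """True when all sequences targeting one gene change in the same direction."""
    return len({direction(x, tol) for x in log2_fold_changes}) == 1


def call_hit(fold_change: float, fdr: float,
             fdr_cutoff: float = 0.1, min_fold: float = 2.0) -> str:
    """Classify a gene from its treated/control fold change and FDR."""
    if fdr <= fdr_cutoff and fold_change >= min_fold:
        return "resistance"   # enriched after treatment
    if fdr <= fdr_cutoff and fold_change <= 1.0 / min_fold:
        return "sensitivity"  # depleted after treatment
    return "none"


# Hypothetical counts: [control, treated], treated sequenced twice as deeply.
raw = {"sg1": [100.0, 200.0], "sg2": [50.0, 100.0], "sg3": [10.0, 20.0]}
size_factors = median_ratio_size_factors(raw)
normalized = normalize(raw, size_factors)
```

In this toy input the treated sample only differs by sequencing depth, so after normalization every sequence has equal counts in both samples and no gene would be called. A gene whose sequences disagree in direction under `is_consistent` would, per the text, have its variance increased before testing.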

In some embodiments, a gene whose DNA mutation frequency in cancer patients (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher) is selected as a hit gene. In some embodiments, a gene whose RNA expression level in cancer patients (e.g., based on literature or databases) is upregulated or downregulated by at least about 1.2-fold (e.g., at least about any of 1.5-, 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, 100-fold, or more, such as at least about 2-fold) is selected as a hit gene. In some embodiments, a gene whose DNA mutation frequency in cancer patients (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher), and whose RNA expression level is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, 100-fold, or more), is selected as a hit gene. In some embodiments, the hit gene is further selected on the basis that, in healthy cells or cancer cells, the encoded mRNA or protein is expressed intracellularly, or the encoded protein is expressed on the cell surface. In some embodiments, the hit gene is selected on the basis that: i) its DNA mutation frequency is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher); ii) its RNA expression level in cancer patients (e.g., based on literature or databases) is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, 100-fold, or more); and iii) its encoded mRNA or protein is expressed intracellularly, or its encoded protein is expressed on the cell surface (in cancer cells or healthy cells).
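The three-part selection criterion above (mutation frequency, expression change, and expression location) can be expressed as a simple filter. The sketch below is illustrative only: the `GeneRecord` fields, the gene names, and the example values are hypothetical, and the default thresholds are the "about 5%" and "more than about 2-fold" figures from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneRecord:
    name: str
    mutation_frequency: float      # fraction of cancer patients with a DNA mutation (0-1)
    expression_fold_change: float  # patient vs. reference; >1 upregulated, <1 downregulated
    expressed: bool                # mRNA/protein expressed in cells, or protein on the surface


def is_candidate_hit(gene: GeneRecord,
                     min_mutation_frequency: float = 0.05,
                     min_fold: float = 2.0) -> bool:
    """Apply criteria i) to iii): mutation frequency, fold change, and expression."""
    if gene.expression_fold_change <= 0:
        raise ValueError("fold change must be positive")
    # Treat up- and down-regulation symmetrically: 0.25 counts as a 4-fold change.
    fold = max(gene.expression_fold_change, 1.0 / gene.expression_fold_change)
    return (
        gene.mutation_frequency >= min_mutation_frequency
        and fold > min_fold
        and gene.expressed
    )


def select_hits(genes: List[GeneRecord]) -> List[str]:
    """Names of the genes that pass all three criteria."""
    return [g.name for g in genes if is_candidate_hit(g)]


# Hypothetical records for illustration only.
candidates = [
    GeneRecord("GENE_A", 0.12, 3.5, True),    # passes all three criteria
    GeneRecord("GENE_B", 0.02, 4.0, True),    # mutation frequency too low
    GeneRecord("GENE_C", 0.30, 0.25, True),   # 4-fold downregulated, passes
    GeneRecord("GENE_D", 0.30, 1.5, True),    # expression change too small
    GeneRecord("GENE_E", 0.30, 5.0, False),   # not expressed
]
selected = select_hits(candidates)
```

With these hypothetical records only GENE_A and GENE_C are retained. The embodiments that use a single criterion (for example, mutation frequency alone) correspond to dropping the other conditions from `is_candidate_hit`.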

In some embodiments, the cancer cell library is generated by contacting an initial cancer cell population with a mutagen.

In some embodiments, the cancer cell library is generated by subjecting an initial cancer cell population to gene editing (e.g., genome-wide, or of a subset of genes). In some embodiments, the cancer cell library is generated by contacting an initial cancer cell population with: i) an sgRNA library comprising a plurality of sgRNA constructs, wherein each sgRNA construct (e.g., a lentiviral vector or lentivirus) comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a corresponding hit gene; and ii) a Cas component comprising a Cas protein or a nucleic acid encoding a Cas protein, under conditions that allow the sgRNA constructs and the Cas component to be introduced into the initial cancer cell population and the mutations to be generated in the hit genes. In some embodiments, the sgRNA library and the Cas component are introduced into the initial cancer cell population simultaneously. In some embodiments, the sgRNA library and the Cas component are introduced into the initial cancer cell population sequentially. In some embodiments, the initial cancer cell population comprises a Cas component (e.g., Cas9). In some embodiments, the cancer cell library is generated by contacting an initial cancer cell population comprising Cas9 with an sgRNA library comprising a plurality of sgRNA constructs, under conditions that allow the sgRNA constructs to be introduced into the initial cancer cell population comprising Cas9 and the mutations to be generated in the hit genes, wherein each sgRNA construct (e.g., a lentiviral vector or lentivirus) comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a corresponding hit gene. In some embodiments, the Cas component is introduced into the initial cancer cell population, followed by the sgRNA library. In some embodiments, the cancer cell library is generated by: i) contacting an initial cancer cell population with a Cas component comprising a Cas protein or a nucleic acid encoding a Cas protein (e.g., a lentiviral vector or lentivirus encoding Cas9), under conditions that allow the Cas component to be introduced into the initial cancer cell population; ii) optionally obtaining a cancer cell population comprising the Cas component (a "Cas+ cancer cell population"; such as by FACS sorting, e.g., using a marker on the Cas-encoding vector); and iii) contacting the Cas+ cancer cell population with an sgRNA library comprising a plurality of sgRNA constructs, under conditions that allow the sgRNA constructs to be introduced into the cancer cells (e.g., Cas+ cancer cells) and the mutations to be generated in the hit genes, wherein each sgRNA construct (e.g., a lentiviral vector or lentivirus) comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a corresponding hit gene. In some embodiments, the Cas protein is Cas9. In some embodiments, each sgRNA comprises a guide sequence fused to a second sequence, wherein the second sequence comprises a repeat-anti-repeat stem-loop that interacts with Cas9. In some embodiments, the second sequence of each sgRNA further comprises stem-loop 1, stem-loop 2, and/or stem-loop 3. In some embodiments, each sgRNA further comprises an iBAR sequence (an "sgRNA iBAR"), wherein each sgRNA iBAR is operable with the Cas protein to modify (e.g., cleave or regulate the expression of) the hit gene. In some embodiments, each sgRNA iBAR comprises, in the 5'-to-3' direction, a first stem-loop sequence and a second stem-loop sequence, wherein the first stem-loop sequence hybridizes with the second stem-loop sequence to form a dsRNA region that interacts with the Cas protein, and wherein the iBAR sequence is located between the 3' end of the first stem-loop sequence and the 5' end of the second stem-loop sequence. In some embodiments, the Cas protein is Cas9, and the iBAR sequence of each sgRNA iBAR is inserted into the loop region of the repeat-anti-repeat stem-loop. In some embodiments, each guide sequence comprises about 17 to about 23 nucleotides. In some embodiments, at least about 95% (e.g., at least about any of 96%, 97%, 98%, 99%, or more), such as at least about 99%, of the sgRNA constructs in the sgRNA library are introduced into the initial cancer cell population. In some embodiments, each hit gene in the cancer cell library or the sgRNA library is targeted by at least about 3 (e.g., about 6 to about 12) different sgRNA constructs directed to at least about 3 (e.g., about 6 to about 12) different target sites of the hit gene. In some embodiments, the cancer cell library has at least about 100-fold (e.g., about 600-fold to about 1200-fold) coverage for each sgRNA. In some embodiments, the cancer cell library has at least about 300-fold coverage for each hit gene, such as about 600-fold to about 1200-fold coverage for each hit gene.
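To make the guide-sequence constraints concrete, the following sketch computes the percent complementarity between an RNA guide and a DNA target site and checks the guide length against the stated range of about 17 to about 23 nucleotides. It rests on assumptions that are not in the patent: the target site is taken to be the strand the guide base-pairs with, both sequences are written 5' to 3', only Watson-Crick pairs are counted, and the target sequence and function names are hypothetical.

```python
# DNA base on the target strand -> RNA base that pairs with it
PAIRING_RNA_BASE = {"A": "U", "C": "G", "G": "C", "T": "A"}


def perfect_guide(target_site):
    """Return the RNA guide (5'->3') fully complementary to a DNA target site (5'->3')."""
    return "".join(PAIRING_RNA_BASE[base] for base in reversed(target_site))


def percent_complementarity(guide, target_site):
    """Percentage of guide positions that base-pair with the antiparallel target site."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    paired = sum(
        1 for g, t in zip(guide, reversed(target_site)) if PAIRING_RNA_BASE[t] == g
    )
    return 100.0 * paired / len(guide)


def guide_length_ok(guide, min_len=17, max_len=23):
    """Check the guide against the stated length range of about 17 to about 23 nt."""
    return min_len <= len(guide) <= max_len


target = "GATTACAGATTACAGATTAC"  # hypothetical 20-nt target site
guide = perfect_guide(target)
# Introduce a single mismatch at the first guide position.
mismatched = ("C" if guide[0] != "C" else "G") + guide[1:]

print(guide, percent_complementarity(guide, target), guide_length_ok(guide))
print(mismatched, percent_complementarity(mismatched, target))
```

A real library design would also account for PAM placement and off-target effects, which this sketch deliberately leaves out.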

In some embodiments, the cancer cell library is generated by contacting an initial cancer cell population with: i) an sgRNA iBAR library comprising a plurality of sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 4) sgRNA iBAR constructs (e.g., lentiviral vectors or lentiviruses), each construct comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs are identical and complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to the same target site of a hit gene, wherein the iBAR sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs differ from one another, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary to a different target site in the hit genes (e.g., in different hit genes, or at different sites within the same hit gene), and wherein each sgRNA iBAR is operable with a Cas9 protein to modify the target site; and ii) a Cas (e.g., Cas9) component comprising a Cas protein or a nucleic acid encoding the Cas protein, under conditions that allow the sgRNA iBAR constructs and the Cas component to be introduced into the initial cancer cell population and the mutations to be generated in the hit genes. In some embodiments, the initial cancer cell population comprises a Cas component (e.g., Cas9). In some embodiments, the cancer cell library is generated by contacting an initial cancer cell population comprising Cas9 with an sgRNA iBAR library comprising a plurality of sets of sgRNA iBAR constructs, under conditions that allow the sgRNA iBAR constructs to be introduced into the initial cancer cell population comprising Cas9 and the mutations to be generated in the hit genes, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 4) sgRNA iBAR constructs (e.g., lentiviral vectors or lentiviruses), each construct comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs are identical and complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to the same target site of a hit gene, wherein the iBAR sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs differ from one another, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary to a different target site in the hit genes (e.g., in different hit genes, or at different sites within the same hit gene), and wherein each sgRNA iBAR is operable with a Cas9 protein to modify the target site. In some embodiments, the cancer cell library is generated by: i) contacting an initial cancer cell population with a Cas component comprising a Cas protein or a nucleic acid encoding a Cas protein (e.g., a lentiviral vector or lentivirus encoding Cas9), under conditions that allow the Cas component to be introduced into the initial cancer cell population; ii) optionally obtaining a cancer cell population comprising the Cas component (a "Cas+ cancer cell population"; such as by FACS sorting, e.g., using a marker on the Cas-encoding vector); and iii) contacting the Cas+ cancer cell population with an sgRNA iBAR library comprising a plurality of sets of sgRNA iBAR constructs, under conditions that allow the sgRNA iBAR constructs to be introduced into the cancer cells (e.g., Cas+ cancer cells) and the mutations to be generated in the hit genes, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 4) sgRNA iBAR constructs (e.g., lentiviral vectors or lentiviruses), each construct comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs are identical and complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to the same target site of a hit gene, wherein the iBAR sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs differ from one another, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary to a different target site in the hit genes (e.g., in different hit genes, or at different sites within the same hit gene), and wherein each sgRNA iBAR is operable with a Cas9 protein to modify the target site. In some embodiments, the cancer cell library is generated by contacting an initial cancer cell population with: i) an sgRNA iBAR library comprising a plurality of sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 4) sgRNA iBAR constructs, each construct comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence, a second sequence, and an iBAR sequence, wherein the guide sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs are identical and complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to the same target site of a hit gene, wherein the iBAR sequences of the 3 or more (e.g., 4) sgRNA iBAR constructs differ from one another, wherein the guide sequence is fused to the second sequence, wherein the second sequence comprises a repeat-anti-repeat stem-loop that interacts with the Cas9 protein, wherein the iBAR sequence is inserted into the loop region of the repeat-anti-repeat stem-loop, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary to a different target site in the hit genes (e.g., in different hit genes, or at different target sites of the same hit gene), and wherein each sgRNA iBAR is operable with the Cas9 protein to modify the target site; and ii) a Cas9 component comprising a Cas9 protein or a nucleic acid encoding a Cas9 protein, under conditions that allow the sgRNA iBAR constructs and the Cas9 component to be introduced into the initial cancer cell population and the mutations to be generated in the hit genes. In some embodiments, the Cas component (e.g., Cas9) is introduced into the cancer cells, followed by the sgRNA iBAR library. In some embodiments, the sgRNA iBAR library is introduced into the cancer cells, followed by the Cas component (e.g., Cas9). In some embodiments, the Cas component (e.g., Cas9) and the sgRNA iBAR library are introduced into the cancer cells simultaneously. In some embodiments, each iBAR sequence comprises about 1 to about 50 (such as 6) nucleotides. In some embodiments, each set of sgRNA iBAR constructs comprises 4 sgRNA iBAR constructs, and the iBAR sequences of the 4 sgRNA iBAR constructs differ from one another. In some embodiments, the sgRNA iBAR library comprises at least about 100 sets of sgRNA iBAR constructs. In some embodiments, the iBAR sequences of at least two sgRNA iBAR constructs in different sets of sgRNA iBAR constructs are identical (e.g., of two sets of sgRNA iBAR constructs, the first set and the second set share at least 1, 2, 3, 4, or more iBAR sequences). In some embodiments, the iBAR sequences for at least two sets of sgRNA iBAR constructs are identical. In some embodiments, the sgRNA iBAR library is contacted with the initial cancer cell population at an MOI of greater than about 2 (e.g., at least about 3, 5, or 10), such as 3. In some embodiments, the sgRNA iBAR library comprising a plurality of sgRNA iBAR constructs comprises or encodes sgRNA iBARs having guide sequences complementary to target sites of cancer-associated genes. In some embodiments, at least about 95% (e.g., at least about any of 96%, 97%, 98%, 99%, or more), such as at least about 99%, of the sgRNA iBAR constructs in the sgRNA iBAR library are introduced into the initial cancer cell population. In some embodiments, each hit gene in the cancer cell library or the sgRNA iBAR library is targeted by 3 different sets of sgRNA iBAR constructs directed to 3 different target sites of the hit gene. In some embodiments, the cancer cell library has at least about 100-fold coverage for each sgRNA iBAR, such as about 100-fold to about 1000-fold coverage for each sgRNA iBAR. In some embodiments, the cancer cell library has at least about 400-fold coverage for each set of sgRNA iBARs, such as about 400-fold to about 4000-fold coverage for each set of sgRNA iBARs. In some embodiments, the cancer cell library has at least about 400-fold coverage for each hit gene, such as about 1200-fold to about 12,000-fold coverage for each hit gene.
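The set structure and the coverage figures above can be checked with a few lines of arithmetic. The sketch below assigns four distinct 6-nucleotide iBARs to each guide and derives per-set and per-gene coverage from per-construct coverage, using 4 iBARs per set and 3 sets per hit gene as in the example embodiments. The random barcode generator, the guide labels, and the function names are hypothetical; the patent does not specify how iBAR sequences are chosen.

```python
import random


def assign_ibars(guides, ibars_per_guide=4, ibar_length=6, seed=0):
    """Give every guide a set of distinct iBARs; iBARs may repeat across sets."""
    if ibars_per_guide > 4 ** ibar_length:
        raise ValueError("not enough distinct iBAR sequences of this length")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    library = {}
    for guide in guides:
        chosen = set()
        while len(chosen) < ibars_per_guide:
            chosen.add("".join(rng.choice("ACGT") for _ in range(ibar_length)))
        library[guide] = sorted(chosen)
    return library


def coverage_summary(per_construct, ibars_per_guide=4, sets_per_gene=3):
    """Return (coverage per set, coverage per hit gene) from per-construct coverage."""
    per_set = per_construct * ibars_per_guide
    per_gene = per_set * sets_per_gene
    return per_set, per_gene


library = assign_ibars(["guide_1", "guide_2", "guide_3"])  # hypothetical guide labels
for guide, ibars in library.items():
    print(guide, ibars)

print(coverage_summary(100))   # 100-fold per construct -> 400-fold per set, 1200-fold per gene
print(coverage_summary(1000))  # 1000-fold per construct -> 4000-fold per set, 12000-fold per gene
```

With these multipliers, the ranges of about 100-fold to about 1000-fold per sgRNA iBAR, about 400-fold to about 4000-fold per set, and about 1200-fold to about 12,000-fold per hit gene are mutually consistent.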

In some embodiments, screening methods using the sgRNA iBAR libraries described herein can, through statistical analysis, improve target identification and data reproducibility and reduce the false discovery rate (FDR). In conventional CRISPR/Cas-based screening methods using pooled sgRNA libraries, a high-quality cell library expressing gRNAs is generated using a low MOI during cell library construction, to ensure that each cell carries on average less than one sgRNA or paired guide RNA ("pgRNA"). Because the sgRNA molecules in the library integrate randomly into the transduced cells, a sufficiently low MOI ensures that each cell expresses a single sgRNA, thereby minimizing the FDR of the screen. To further reduce the FDR and improve data reproducibility, deep gRNA coverage and multiple biological replicates are usually required to obtain hit genes with high statistical significance. Conventional screening methods face difficulties when a large number of genome-wide screens are needed, when the cell material available for library construction is limited, or when performing more challenging screens (i.e., in vivo screens) in which it is difficult to arrange experimental replicates or control the MOI. Screening methods using the sgRNA iBAR libraries described herein overcome these difficulties by including an iBAR sequence in each sgRNA, which allows internal replicates to be collected within each set of sgRNAs that share the same guide sequence but carry different iBAR sequences. This iBAR approach can reduce experimental noise. For example, as shown in WO2020125762, iBARs with four nucleotides for each sgRNA can provide sufficient internal replicates to evaluate data consistency among different sgRNA iBAR constructs targeting the same genomic locus. The high consistency between two independent experiments in WO2020125762 indicates that one experimental replicate is sufficient for CRISPR/Cas screening using the iBAR approach. Because library coverage increases significantly with a high MOI during viral transduction of the host cells, the number of cells in the initial cell population can be reduced by more than 20-fold while achieving the same library coverage, as shown for the genome-wide human library constructed in WO2020125762. Likewise, the workload of each genome-wide screen using sgRNA iBARs can be reduced proportionally. By using sgRNAs with different iBAR sequences and then counting both the guide sequences and the corresponding iBAR nucleotide sequences, the performance of each guide sequence can be tracked multiple times within the same experiment, thereby greatly reducing the FDR and improving efficiency and feasibility. Transduction efficiency and library coverage can be further improved by using a high viral titer in the viral transduction step, e.g., MOI > 1 (e.g., MOI > 1.5, MOI > 2, MOI > 2.5, MOI > 3, MOI > 3.5, MOI > 4, MOI > 4.5, MOI > 5, MOI > 5.5, MOI > 6, MOI > 6.5, MOI > 7, MOI > 7.5, MOI > 8, MOI > 8.5, MOI > 9, MOI > 9.5, or MOI > 10; such as an MOI of about any of 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, or 10).
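The trade-off between a low and a high MOI can be illustrated with a Poisson model of viral integration. The Poisson assumption is a common simplification and is not stated in the patent. The cell-number estimate below assumes that the expected number of integrations per cell equals the MOI, so it is only a rough guide and is not meant to reproduce the more-than-20-fold reduction reported in WO2020125762.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations at the given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def transduction_summary(moi):
    """Fractions of cells with zero, exactly one, or more than one integration."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    return {"untransduced": p0, "single": p1, "multiple": 1.0 - p0 - p1}


def cells_needed(n_constructs, coverage, moi):
    """Rough cell number for the target coverage, assuming MOI integrations per cell."""
    return math.ceil(n_constructs * coverage / moi)


for moi in (0.3, 3.0):
    summary = transduction_summary(moi)
    print(moi, {name: round(value, 3) for name, value in summary.items()})

# Hypothetical library of 1000 constructs at 100-fold coverage.
print(cells_needed(1000, 100, 0.5), cells_needed(1000, 100, 2))
```

At a low MOI most transduced cells carry a single construct, whereas at an MOI of 3 most cells carry several. The internal iBAR replicates are what allow the screen to tolerate such multiple integrations.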

In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA iBAR library described herein that targets one or more hit genes; b) contacting the cancer cell library with the anticancer drug (e.g., for about 9 to about 10 doubling times, or about 15 to about 16 doubling times, with or without cell passaging); c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug); and d) identifying the target gene based on the difference between the profiles of sgRNA iBARs or hit gene mutations in the post-treatment cancer cell population and a control cancer cell population. In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA iBAR library described herein that targets one or more hit genes; b/c) contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (e.g., for about 9 to about 10 doubling times, or about 15 to about 16 doubling times, with or without cell passaging), and harvesting the cancer cells by removing the cell culture medium containing the anticancer drug (and dead floating cells) and collecting the remaining adherent cancer cells (e.g., by trypsinization), thereby obtaining a post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug); and d) identifying the target gene based on the difference between the profiles of sgRNA iBARs or hit gene mutations in the post-treatment cancer cell population and a control cancer cell population. In some embodiments, the sgRNA iBAR library targets cancer-associated genes. In some embodiments, the cancer cell library has about 100-fold to about 1000-fold coverage for each sgRNA iBAR, such as about 1000-fold coverage for each sgRNA iBAR. In some embodiments, the cancer cell library has at least about 400-fold coverage for each hit gene, e.g., about 1200-fold to about 12,000-fold coverage for each hit gene. In some embodiments, the control cancer cell population is obtained from the cancer cell library cultured under the same conditions without contact with the anticancer drug, and is optionally subjected to the same obtaining method as in step c). In some embodiments, the sequence counts obtained from the post-treatment cancer cell population are compared with the corresponding sequence counts obtained from the control cancer cell population to provide a fold change (e.g., the actual fold change, or a derivative of the fold change such as the log2 or log10 fold change). In some embodiments, the target gene is identified based on the difference between the profiles of sgRNA iBARs in the post-treatment cancer cell population and the control cancer cell population. In some embodiments, the profiles of sgRNA iBARs in the post-treatment cancer cell population and the control cancer cell population are identified by next-generation sequencing. In some embodiments, identifying the target gene in step d) comprises comparing the sgRNA iBAR (or guide sequence thereof) sequence counts obtained from the post-treatment cancer cell population with the sgRNA iBAR (or guide sequence thereof) sequence counts obtained from the control cancer cell population, wherein i) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as enriched in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or with at least about 2-fold enrichment), is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as depleted in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or with at least about 2-fold depletion), is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, identifying the target gene in step d) comprises: i) identifying the sgRNA iBAR sequences in the post-treatment cancer cell population; and ii) identifying the hit genes corresponding to the guide sequences of the sgRNA iBARs. In some embodiments, identifying the target gene in step d) comprises: i) obtaining the sgRNA iBAR sequences in the post-treatment cancer cell population; ii) ranking the corresponding guide sequences of the sgRNA iBAR sequences based on sequence counts, wherein the ranking comprises adjusting the rank of each guide sequence based on the data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to the guide sequence; and iii) identifying the hit genes corresponding to guide sequences ranked above a predetermined threshold level. In some embodiments, the method is a positive screen. In some embodiments, the method is a negative screen. In some embodiments, steps b) and c) comprise contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, steps b) and c) comprise contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, after passaging for continued anticancer drug treatment, the coverage of each hit gene (or sgRNA iBAR) in the cancer cell library remains the same or similar (e.g., differing by within about 10%). In some embodiments, the sgRNA iBAR sequence counts are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, the variance of each guide sequence is adjusted based on the data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to the guide sequence. In some embodiments, the data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to each guide sequence is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of the guide sequence is increased if the fold changes of the iBAR sequences are in different directions relative to one another (e.g., increased versus decreased, increased versus unchanged, or decreased versus unchanged).
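The count-processing steps named above (median ratio normalization, fold-change calculation, and a variance adjustment driven by the direction of each iBAR's fold change) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the pseudocount, the tolerance that defines "unchanged", and the penalty factor applied to inconsistent guides are all hypothetical, and the mean-variance modeling step itself is omitted.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-ratio size factors; counts maps sample name -> list of read counts."""
    samples = list(counts)
    n_features = len(counts[samples[0]])
    factors = {}
    # Geometric mean of each feature across samples (features with a zero are skipped).
    geo_means = []
    for i in range(n_features):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            geo_means.append(math.exp(sum(math.log(v) for v in values) / len(values)))
        else:
            geo_means.append(None)
    for s in samples:
        ratios = [counts[s][i] / g for i, g in enumerate(geo_means) if g is not None]
        factors[s] = median(ratios)
    return factors


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of normalized counts; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def direction(lfc, tolerance=0.1):
    """+1 for increased, -1 for decreased, 0 for unchanged (tolerance is hypothetical)."""
    return 1 if lfc > tolerance else -1 if lfc < -tolerance else 0


def adjusted_variance(ibar_lfcs, base_variance, penalty=2.0):
    """Inflate the guide's variance when its iBARs change in different directions."""
    directions = {direction(lfc) for lfc in ibar_lfcs}
    return base_variance if len(directions) == 1 else base_variance * penalty


# Hypothetical read counts for four sgRNA iBAR constructs in two samples.
counts = {"control": [100, 200, 300, 400], "treated": [200, 400, 600, 800]}
print(size_factors(counts))
print(adjusted_variance([1.2, 0.9, 1.5, 1.1], base_variance=1.0))   # consistent iBARs
print(adjusted_variance([1.2, -0.8, 1.5, 0.0], base_variance=1.0))  # inconsistent iBARs
```

A larger variance lowers the statistical weight of a guide whose internal replicates disagree, which is how the consistency check feeds into the ranking described above.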

Accordingly, in some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA iBAR library described herein that targets one or more hit genes; b) contacting the cancer cell library with the anticancer drug (e.g., for about 9 to about 10 doubling times, or for about 15 to about 16 doubling times, with or without cell passaging); c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); and d) identifying the target gene based on the difference between the profiles of sgRNA iBAR in the post-treatment cancer cell population and a control cancer cell population, wherein the control cancer cell population is obtained from the cancer cell library cultured under the same conditions without contact with the anticancer drug, wherein the profiles of sgRNA iBAR in the post-treatment cancer cell population and the control cancer cell population are identified by next-generation sequencing, wherein step d) comprises comparing the sgRNA iBAR sequence counts obtained from the post-treatment cancer cell population with the sgRNA sequence counts obtained from the control cancer cell population, and wherein i) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment), is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion), is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, steps b) and c) comprise contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, steps b) and c) comprise contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, the coverage of each hit gene (or sgRNA iBAR) in the cancer cell library remains the same or similar (e.g., within about 10% difference) after passaging for continued anticancer drug treatment. In some embodiments, the cancer cell library has about 100-fold to about 1000-fold coverage for each sgRNA iBAR, such as about 1000-fold coverage for each sgRNA iBAR. In some embodiments, the cancer cell library has at least about 400-fold (e.g., about 1200-fold to about 12,000-fold) coverage for each hit gene. In some embodiments, the sgRNA iBAR sequence counts are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, the variance of each guide sequence is adjusted based on the data consistency between the iBAR sequences in the sgRNA iBAR sequences corresponding to that guide sequence. In some embodiments, the data consistency between the iBAR sequences in the sgRNA iBAR sequences corresponding to each guide sequence is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of the guide sequence is increased if the fold changes of the iBAR sequences are in different directions relative to each other (e.g., increased versus decreased, increased versus unchanged, or decreased versus unchanged).
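As a non-limiting sketch, the median ratio normalization of sequence counts mentioned above might be carried out as shown below. The text names the normalization but does not give a formula; the example assumes the commonly used median-of-ratios definition (each sample is scaled by the median ratio of its counts to the per-construct geometric mean across samples). The function names and the count values are hypothetical, and the subsequent mean-variance modeling step is not shown.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factor per sample.

    counts maps a sample name to a list of raw read counts, one per
    sgRNA iBAR construct, in the same order for every sample.
    """
    samples = list(counts)
    n_constructs = len(counts[samples[0]])
    # Reference abundance: geometric mean of each construct across samples.
    # Constructs with a zero count in any sample are skipped (None).
    reference = []
    for i in range(n_constructs):
        row = [counts[s][i] for s in samples]
        if all(c > 0 for c in row):
            reference.append(math.exp(sum(math.log(c) for c in row) / len(row)))
        else:
            reference.append(None)
    factors = {}
    for s in samples:
        ratios = [counts[s][i] / ref
                  for i, ref in enumerate(reference) if ref is not None]
        factors[s] = median(ratios)
    return factors


def normalize(counts):
    """Divide every raw count by the size factor of its sample."""
    factors = size_factors(counts)
    return {s: [c / factors[s] for c in counts[s]] for s in counts}


# Illustrative counts: the treated sample was sequenced twice as deeply.
raw = {
    "control": [100, 200, 300, 400],
    "treated": [200, 400, 600, 800],
}
normalized = normalize(raw)
```

In this toy input the two samples differ only in sequencing depth, so after normalization the control and treated counts coincide; any remaining difference in real data would reflect enrichment or depletion rather than depth.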

In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA iBAR library described herein that targets one or more hit genes; b) subjecting the cancer cell library to at least two separate, different anticancer drug treatments (e.g., treatments described herein); c) growing the cancer cell library to obtain a post-treatment cancer cell population from each treatment (e.g., each viable and resistant to the anticancer drug); d1) identifying one or more hit genes in the post-treatment cancer cell population obtained from each treatment, whose mutation confers on the cancer cell sensitivity or resistance to the anticancer drug, based on the difference between the profiles of sgRNA iBAR in the post-treatment cancer cell population from each treatment and the corresponding control cancer cell population; and d2) combining the one or more hit genes identified from all treatments, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the anticancer drug; wherein identifying the one or more hit genes in step d1) comprises, for each treatment, comparing the sgRNA iBAR (or guide sequence thereof) sequence counts obtained from the post-treatment cancer cell population with the sgRNA iBAR (or guide sequence thereof) sequence counts obtained from the control cancer cell population, wherein i) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment) for the corresponding treatment, is identified as a hit gene whose mutation renders the cancer cell resistant to the anticancer drug for the corresponding treatment; and/or ii) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion) for the corresponding treatment, is identified as a hit gene whose mutation renders the cancer cell sensitive to the anticancer drug for the corresponding treatment; and wherein step d2) comprises combining the one or more hit genes from all treatments whose mutation renders the cancer cell resistant to the anticancer drug, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell resistant to the anticancer drug; and/or combining the one or more hit genes from all treatments whose mutation renders the cancer cell sensitive to the anticancer drug, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA iBAR library described herein that targets one or more hit genes; b) subjecting the cancer cell library to at least two separate, different anticancer drug treatments (e.g., treatments described herein); c) growing the cancer cell library to obtain a post-treatment cancer cell population from each treatment (e.g., each viable and resistant to the anticancer drug); d1) identifying one or more hit genes in the post-treatment cancer cell population obtained from each treatment, whose mutation renders the cancer cell sensitive or resistant to the anticancer drug, based on the difference between the profiles of sgRNA iBAR in the post-treatment cancer cell population from each treatment and the corresponding control cancer cell population; and d2) combining the one or more hit genes identified from all treatments, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the anticancer drug; wherein i) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment) in at least one treatment, is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion) in at least one treatment, is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA iBAR library described herein that targets one or more hit genes; b) subjecting the cancer cell library to at least two separate, different anticancer drug treatments (e.g., treatments described herein); c) growing the cancer cell library to obtain a post-treatment cancer cell population from each treatment (e.g., each viable and resistant to the anticancer drug); and d) identifying the target gene, on the basis of the at least two separate, different treatments, based on the difference between the profiles of sgRNA iBAR in the post-treatment cancer cell population and the control cancer cell population, wherein identifying the target gene comprises: i) for each treatment, obtaining the sgRNA iBAR (or guide sequence thereof) sequences in the post-treatment cancer cell population; ii) for each treatment, ranking the corresponding guide sequences of the sgRNA iBAR sequences based on sequence counts, wherein the ranking comprises adjusting the rank of each guide sequence based on the data consistency between the iBAR sequences in the sgRNA iBAR sequences corresponding to the guide sequence; and iii) for each treatment, identifying the target gene corresponding to a guide sequence ranked above a predetermined threshold level; wherein (1) a hit gene identified as depleted in the (viable) post-treatment cancer cell population resistant to the anticancer drug, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion) in at least one treatment, is identified as a target gene whose mutation (e.g., inactivation) renders the cancer cell sensitive to the anticancer drug; and/or (2) a hit gene identified as enriched in the (viable) post-treatment cancer cell population resistant to the anticancer drug, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment) in at least one treatment, is identified as a target gene whose mutation (e.g., inactivation) renders the cancer cell resistant to the anticancer drug. In some embodiments, for each treatment, the sequence counts obtained from the post-treatment cancer cell population are compared with the corresponding sequence counts obtained from the control cancer cell population to provide a fold change (e.g., the actual fold change, or a derivative of the fold change such as the log2 or log10 fold change). In some embodiments, the cancer cell library has about 100-fold to about 1000-fold coverage for each sgRNA iBAR, such as about 1000-fold coverage for each sgRNA iBAR. In some embodiments, the cancer cell library has at least about 400-fold coverage for each hit gene, e.g., about 1200-fold to about 12,000-fold coverage for each hit gene. In some embodiments, the method is a positive screen. In some embodiments, the method is a negative screen. In some embodiments, the control cancer cell population is obtained from the same cancer cell library cultured under the same conditions without contact with the anticancer drug, optionally subjected to the same obtaining method as in step c). In some embodiments, the method further comprises performing next-generation sequencing on the post-treatment cancer cell population from each treatment and on the control cancer cell population. In some embodiments, one treatment comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, another treatment comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times while allowing viable cancer cells to grow, optionally passaging the cancer cells about every 3 doubling times. In some embodiments, the coverage of each hit gene (or sgRNA iBAR) in the cancer cell library remains the same or similar (e.g., within about 10% difference) after passaging for continued anticancer drug treatment. In some embodiments, the sgRNA iBAR sequence counts are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, the variance of each guide sequence is adjusted based on the data consistency between the iBAR sequences in the sgRNA iBAR sequences corresponding to that guide sequence. In some embodiments, the data consistency between the iBAR sequences in the sgRNA iBAR sequences corresponding to each guide sequence is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of the guide sequence is increased if the fold changes of the iBAR sequences are in different directions relative to each other (e.g., increased versus decreased, increased versus unchanged, or decreased versus unchanged).
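The hit-calling and combining logic of the multi-treatment embodiments above can be illustrated with the following minimal sketch. It assumes that a per-gene log2 fold change and FDR have already been computed for each treatment by some upstream statistical step, which is not shown. The text allows "FDR ≤ 0.1 and/or at least about 2-fold"; the sketch applies both criteria together, which is only one of the permitted readings. The class name, function names, gene names, and values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fc: float  # post-treatment population versus control population
    fdr: float


def call_hits(results, fdr_cutoff=0.1, min_fold=2.0):
    """Split genes into resistance (enriched) and sensitivity (depleted) hits.

    Both the FDR cutoff and the fold-change cutoff are applied here; the
    text also permits using either criterion on its own.
    """
    threshold = math.log2(min_fold)
    resistant = {r.gene for r in results
                 if r.fdr <= fdr_cutoff and r.log2_fc >= threshold}
    sensitive = {r.gene for r in results
                 if r.fdr <= fdr_cutoff and r.log2_fc <= -threshold}
    return resistant, sensitive


def combine_treatments(per_treatment):
    """Union of hits: a gene qualifies if it passes in at least one treatment."""
    resistant, sensitive = set(), set()
    for results in per_treatment.values():
        r, s = call_hits(results)
        resistant |= r
        sensitive |= s
    return resistant, sensitive


# Illustrative results for two treatments of different duration.
per_treatment = {
    "shorter": [GeneResult("GENE_A", 1.8, 0.02),
                GeneResult("GENE_B", -1.4, 0.05),
                GeneResult("GENE_C", 2.1, 0.40)],
    "longer": [GeneResult("GENE_A", 0.3, 0.60),
               GeneResult("GENE_B", -2.0, 0.01),
               GeneResult("GENE_D", -1.2, 0.08)],
}
resistant, sensitive = combine_treatments(per_treatment)
print(sorted(resistant))  # ['GENE_A']
print(sorted(sensitive))  # ['GENE_B', 'GENE_D']
```

GENE_C is excluded despite its large fold change because its FDR exceeds 0.1, and GENE_D is retained although it passes in only one of the two treatments.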

In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing a cancer cell library comprising an sgRNA iBAR library described herein that targets one or more hit genes; subjecting the cancer cell library from step a) to two separate treatments b1) and b2): b1) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times; b2) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times; c1) growing the cancer cell library from treatment b1) (e.g., in the presence of the anticancer drug, with passaging about every 3 doubling times) to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); c2) growing the cancer cell library from treatment b2) (e.g., in the presence of the anticancer drug, with passaging about every 3 doubling times) to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); d1) identifying one or more hit genes in the post-treatment cancer cell population from treatment b1) based on the difference between the profiles of sgRNA iBAR in the post-treatment cancer cell population from c1) and the corresponding control cancer cell population; d2) identifying one or more hit genes in the post-treatment cancer cell population from treatment b2) based on the difference between the profiles of sgRNA iBAR in the post-treatment cancer cell population from c2) and the corresponding control cancer cell population; and d3) combining the one or more hit genes (sensitive or resistant) identified from treatments b1) and b2), thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the anticancer drug. In some embodiments, identifying the hit genes in the post-treatment cancer cell population from treatment b1) or b2) in step d1) or d2), respectively, comprises: i) identifying the sgRNA iBAR sequences in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) from each treatment; and ii) identifying the hit genes corresponding to the guide sequences of the sgRNA iBAR. In some embodiments, identifying the one or more hit genes in step d1) and/or d2) comprises: for each treatment, comparing the sgRNA iBAR (or guide sequence thereof) sequence counts obtained from the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) with the sgRNA iBAR (or guide sequence thereof) sequence counts obtained from the control cancer cell population, wherein i) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as enriched in the post-treatment cancer cell population compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment) for the corresponding treatment, is identified as a hit gene whose mutation renders the cancer cell resistant to the anticancer drug for the corresponding treatment; and/or ii) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as depleted in the post-treatment cancer cell population compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion) for the corresponding treatment, is identified as a hit gene whose mutation renders the cancer cell sensitive to the anticancer drug for the corresponding treatment. In some embodiments, step d3) comprises: combining the one or more hit genes from all treatments whose mutation renders the cancer cell resistant to the anticancer drug, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell resistant to the anticancer drug; and/or combining the one or more hit genes from all treatments whose mutation renders the cancer cell sensitive to the anticancer drug, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, identifying the target gene comprises identifying one or more hit genes in the post-treatment cancer cell populations obtained from the two separate treatments b1) and b2), wherein: i) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as enriched in the (viable) post-treatment cancer cell population resistant to the anticancer drug compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment) in treatment b1) or b2), is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA iBAR guide sequence is identified as depleted in the (viable) post-treatment cancer cell population resistant to the anticancer drug compared to the control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion) in treatment b1) or b2), is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, for each treatment, the sequence counts obtained from the post-treatment cancer cell population are compared with the corresponding sequence counts obtained from the control cancer cell population to provide a fold change (e.g., the actual fold change, or a derivative of the fold change such as the log2 or log10 fold change). In some embodiments, the cancer cell library has about 100-fold to about 1000-fold coverage for each sgRNA iBAR, such as about 1000-fold coverage for each sgRNA iBAR. In some embodiments, the cancer cell library has at least about 400-fold coverage for each hit gene, e.g., about 1200-fold to about 12,000-fold coverage for each hit gene. In some embodiments, the method is a positive screen. In some embodiments, the method is a negative screen. In some embodiments, the control cancer cell population is obtained from the same cancer cell library cultured under the same conditions without contact with the anticancer drug, optionally subjected to the same obtaining method as in step c). In some embodiments, the method further comprises performing next-generation sequencing on the post-treatment cancer cell population from each treatment and on the control cancer cell population. In some embodiments, the coverage of each hit gene (or sgRNA iBAR) in the cancer cell library remains the same or similar (e.g., within about 10% difference) after passaging for continued anticancer drug treatment. In some embodiments, the sgRNA iBAR sequence counts are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, the variance of each guide sequence is adjusted based on the data consistency between the iBAR sequences in the sgRNA iBAR sequences corresponding to that guide sequence. In some embodiments, the data consistency between the iBAR sequences in the sgRNA iBAR sequences corresponding to each guide sequence is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of the guide sequence is increased if the fold changes of the iBAR sequences are in different directions relative to each other (e.g., increased versus decreased, increased versus unchanged, or decreased versus unchanged).
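The coverage figures recited above lend themselves to a short worked calculation, given here purely as an illustration. The library layout of 3 sgRNAs per gene with 4 iBARs per sgRNA and the library size of 500 genes are assumptions made for the example; the passage itself does not state them, although this layout would be consistent with the recited range of about 1200-fold to about 12,000-fold per gene at 100-fold to 1000-fold per sgRNA iBAR. All function names are hypothetical.

```python
def gene_coverage(sgrnas_per_gene, ibars_per_sgrna, coverage_per_construct):
    """Per-gene coverage implied by a given per-sgRNA-iBAR coverage."""
    return sgrnas_per_gene * ibars_per_sgrna * coverage_per_construct


def cells_required(n_genes, sgrnas_per_gene, ibars_per_sgrna,
                   coverage_per_construct):
    """Number of cells needed to hold the whole library at that coverage."""
    constructs = n_genes * sgrnas_per_gene * ibars_per_sgrna
    return constructs * coverage_per_construct


def coverage_maintained(before, after, tolerance=0.10):
    """True if coverage after passaging is within the tolerance of before."""
    return abs(after - before) / before <= tolerance


# Assumed layout: 3 sgRNAs per gene, 4 iBARs per sgRNA.
print(gene_coverage(3, 4, 100))         # 1200
print(gene_coverage(3, 4, 1000))        # 12000
print(cells_required(500, 3, 4, 1000))  # 6000000

# Passaging check against the "within about 10%" criterion.
print(coverage_maintained(1000, 920))   # True
print(coverage_maintained(1000, 850))   # False
```

The last two lines show how the "same or similar coverage after passaging" condition could be checked per construct or per gene before continuing drug treatment.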

In some embodiments, the one or more target genes identified using the methods described herein are tested in two or more (e.g., 2, 3, 4, 5, or more) cancer cell lines (e.g., HCT116, SW480) of the same cancer type (e.g., colorectal cancer). In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, the method comprising: a) providing two or more (e.g., 2, 3, 4, 5, or more) cancer cell libraries (e.g., Cas9+ sgRNA iBAR cancer cell libraries) each comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a hit gene mutation, wherein the hit genes in at least two of the plurality of cancer cells within the same cancer cell library differ from each other, and wherein the two or more cancer cell libraries are generated from different initial cancer cell populations (e.g., HCT116 or SW480) of the same cancer type (e.g., colorectal cancer); b) separately contacting each cancer cell library with the anticancer drug (e.g., at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times, or for about 15 to about 16 doubling times); c) separately growing each cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); d1) separately identifying one or more hit genes in the post-treatment cancer cell population obtained from each cancer cell library, based on the difference between the profiles of hit gene mutations (or sgRNA or sgRNA iBAR) in the post-treatment cancer cell population and the corresponding control cancer cell population; and d2) combining the one or more hit genes identified from all cancer cell libraries, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the anticancer drug. In some embodiments, the treatment step b) and the cancer cell obtaining step c) are the same for the different cancer cell libraries. In some embodiments, the treatment step b) and/or the cancer cell obtaining step c) are different for the different cancer cell libraries. In some embodiments, the two or more cancer cell libraries are Cas9+ sgRNA iBAR cancer cell libraries. In some embodiments, the control cancer cell population is obtained from the corresponding same cancer cell library cultured under the same conditions without contact with the anticancer drug. In some embodiments, the profiles of hit gene mutations or sgRNA or sgRNA iBAR in the post-treatment cancer cell population and the control cancer cell population are identified by next-generation sequencing. In some embodiments, identifying the one or more hit genes for each cancer cell library in step d1) comprises: comparing the hit gene mutation (or sgRNA or sgRNA iBAR) sequence counts obtained from the post-treatment cancer cell population with the hit gene mutation (or sgRNA or sgRNA iBAR) sequence counts obtained from the corresponding control cancer cell population, wherein: i) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the corresponding control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold enrichment), is identified as a hit gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug) compared to the corresponding control cancer cell population, with an FDR ≤ 0.1 (and/or at least about 2-fold depletion), is identified as a hit gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, step d2) comprises combining the one or more hit genes from each of the cancer cell libraries whose mutation renders the cancer cell resistant to the anticancer drug, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell resistant to the anticancer drug; and/or combining the one or more hit genes from each of the cancer cell libraries whose mutation renders the cancer cell sensitive to the anticancer drug, thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive to the anticancer drug.

In some embodiments, there is provided a method of identifying target genes whose mutations render cancer cells sensitive or resistant to two or more (e.g., 2, 3, 4, 5 or more) different anticancer drugs, the method comprising: a) providing a cancer cell library (e.g., a Cas9+ sgRNA iBAR cancer cell library) comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a hit gene mutation, and wherein the hit genes in at least two of the plurality of cancer cells are different from each other; b) separately contacting the cancer cell library with each of the two or more different anticancer drugs (e.g., at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times, or for about 15 to about 16 doubling times); c) for each anticancer drug, separately growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); d1) for each of the two or more different anticancer drugs, when applied alone, separately identifying a set of one or more target genes whose mutations render the cancer cells sensitive to that anticancer drug (e.g., using any of the identification methods described herein); and d2) obtaining the one or more target genes that are present in every set of target genes identified for the individual anticancer drugs, thereby identifying target genes whose mutations render the cancer cells sensitive to combined treatment with the two or more different anticancer drugs; and/or d1) for each of the two or more different anticancer drugs, when applied alone, separately identifying a set of one or more target genes whose mutations render the cancer cells resistant to that anticancer drug (e.g., using any of the identification methods described herein); and d2) obtaining the one or more target genes that are present in the combination of the sets of target genes identified for all the anticancer drugs, thereby identifying target genes whose mutations render the cancer cells resistant to combined treatment with the two or more different anticancer drugs. In some embodiments, the two or more different anticancer drugs target the same cancer target. In some embodiments, the two or more different anticancer drugs target different cancer targets.
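The two d2) steps above amount to set operations on the per-drug gene sets: genes present in every sensitivity set (an intersection), and genes present in the combination of the resistance sets (read here as a union, which is an interpretation of the text). The following is a minimal sketch under that reading; the function names, drug labels and gene names are hypothetical placeholders and not data from this disclosure.

```python
def genes_in_every_set(gene_sets):
    """Return genes present in every per-drug set (set intersection)."""
    sets = [set(s) for s in gene_sets]
    return set.intersection(*sets) if sets else set()


def genes_in_combined_sets(gene_sets):
    """Return genes present in the combination of all per-drug sets (set union)."""
    return set().union(*gene_sets)


# Hypothetical per-drug results; gene names are placeholders only.
sensitive_by_drug = {
    "drug_1": {"GENE_A", "GENE_B", "GENE_C"},
    "drug_2": {"GENE_B", "GENE_C", "GENE_D"},
}
resistant_by_drug = {
    "drug_1": {"GENE_X"},
    "drug_2": {"GENE_X", "GENE_Y"},
}

# Candidate genes whose mutation sensitizes cells to the combined treatment.
combo_sensitive = genes_in_every_set(sensitive_by_drug.values())
# Candidate genes whose mutation confers resistance to the combined treatment.
combo_resistant = genes_in_combined_sets(resistant_by_drug.values())
```

Under this sketch a sensitizing gene must be found for every drug, whereas a resistance gene found for any one drug is carried forward.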

Accordingly, in some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive to a combination therapy comprising a first anticancer drug and a second anticancer drug, the method comprising: i) identifying, according to any of the methods described herein, a first set of one or more target genes in the cancer cell whose mutations render the cancer cell sensitive to the first anticancer drug; ii) identifying, according to any of the methods described herein, a second set of one or more target genes in the cancer cell whose mutations render the cancer cell sensitive to the second anticancer drug; and iii) obtaining the one or more target genes present in both the first set and the second set of target genes, thereby identifying the target gene whose mutation renders the cancer cell sensitive to the combination therapy. In some embodiments, the two anticancer drugs target the same cancer target. In some embodiments, the two anticancer drugs target different cancer targets.

In some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to a combination therapy comprising a first anticancer drug and a second anticancer drug, the method comprising: a) providing a cancer cell library (e.g., an sgRNA or sgRNA iBAR cancer cell library) comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a hit gene mutation, and wherein the hit genes in at least two of the plurality of cancer cells are different from each other; b) contacting the cancer cell library with the first anticancer drug and the second anticancer drug; c) culturing the cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drugs); and d) identifying the target gene based on the difference between the profiles of hit gene mutations (or of sgRNAs or sgRNA iBARs) in the post-treatment cancer cell population and in a control cancer cell population. In some embodiments, the first anticancer drug and the second anticancer drug are contacted with the cancer cell library simultaneously. In some embodiments, the first anticancer drug and the second anticancer drug are contacted with the cancer cell library for overlapping periods of time. In some embodiments, the first anticancer drug and the second anticancer drug are contacted with the cancer cell library sequentially. In some embodiments, a cancer cell population treated with one drug is obtained (e.g., viable, with or without enrichment/sorting for viable cells, with or without a recovery growth period) and is then contacted with the other anticancer drug to obtain the final post-treatment cancer cell population. In some embodiments, the control cancer cell population is obtained from the same cancer cell library cultured under the same conditions without contact with any anticancer drug, optionally subjected to the same cancer cell harvesting method as in step c). In some embodiments, the control cancer cell population is obtained from the same cancer cell library cultured under the same conditions and contacted with only one of the anticancer drugs, optionally subjected to the same cancer cell harvesting method as in step c).

Accordingly, in some embodiments, there is provided a method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive to a combination therapy comprising a first anticancer drug and a second anticancer drug, the method comprising: a) providing a cancer cell library (e.g., an sgRNA or sgRNA iBAR cancer cell library) comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a hit gene mutation, and wherein the hit genes in at least two of the plurality of cancer cells are different from each other; b) contacting the cancer cell library with the first anticancer drug and the second anticancer drug; c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drugs); d1) identifying a first set of one or more hit genes based on the difference between the profiles of hit gene mutations (or of sgRNAs or sgRNA iBARs) in the post-treatment cancer cell population and in a first control cancer cell population; d2) identifying a second set of one or more hit genes based on the difference between the profiles of hit gene mutations (or of sgRNAs or sgRNA iBARs) in the post-treatment cancer cell population and in a second control cancer cell population; and d3) combining the first set and the second set of one or more hit genes identified in d1) and d2), thereby identifying the target gene in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the combination therapy comprising the first anticancer drug and the second anticancer drug; wherein the first control cancer cell population is obtained from the cancer cell library cultured under the same conditions and contacted with the first anticancer drug alone, using the same cancer cell harvesting method as in step c); and wherein the second control cancer cell population is obtained from the cancer cell library cultured under the same conditions and contacted with the second anticancer drug alone, using the same cancer cell harvesting method as in step c). In some embodiments, identifying the first set of one or more hit genes in d1), and/or identifying the second set of one or more hit genes in d2), may comprise any of the hit gene/target gene identification methods described herein.

For example, in some embodiments, identifying the first (or second) set of one or more hit genes comprises comparing the sgRNA or sgRNA iBAR sequence counts (or the sequence counts of sequences comprising the hit gene mutations) obtained from the post-treatment cancer cell population with the sgRNA or sgRNA iBAR sequence counts (or the sequence counts of sequences comprising the hit gene mutations) obtained from the first (or second) control cancer cell population, wherein: i) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence (or hit gene mutation) is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drugs) relative to the first (and/or second) control cancer cell population, with FDR ≤ 0.1 (and/or with at least about 2-fold enrichment), is identified as a hit gene whose mutation renders the cancer cell more resistant to the combination therapy than to treatment with the first (and/or second) anticancer drug alone; and/or ii) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence (or hit gene mutation) is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drugs) relative to the first (and/or second) control cancer cell population, with FDR ≤ 0.1 (and/or with at least about 2-fold depletion), is identified as a hit gene whose mutation renders the cancer cell more sensitive to the combination therapy than to treatment with the first (and/or second) anticancer drug alone.

In some embodiments, identifying the first (or second) set of one or more hit genes in step d1) (or d2)) further comprises comparing the sgRNA or sgRNA iBAR sequence counts (or the sequence counts of sequences comprising the hit gene mutations) obtained from the post-treatment cancer cell population with the sgRNA or sgRNA iBAR sequence counts (or the sequence counts of sequences comprising the hit gene mutations) obtained from a control cancer cell population, the control cancer cell population being obtained from the same cancer cell library cultured under the same conditions without contact with any anticancer drug, wherein: i) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence (or hit gene mutation) is identified as enriched in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drugs) relative to both the control cancer cell population and the first (and/or second) control cancer cell population, with FDR ≤ 0.1 (and/or with at least about 2-fold enrichment), is identified as a hit gene whose mutation renders the cancer cell more resistant to the combination therapy than to treatment with the first (and/or second) anticancer drug alone; and/or ii) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence (or hit gene mutation) is identified as depleted in the post-treatment cancer cell population (e.g., viable and resistant to the anticancer drugs) relative to both the control cancer cell population and the first (and/or second) control cancer cell population, with FDR ≤ 0.1 (and/or with at least about 2-fold depletion), is identified as a hit gene whose mutation renders the cancer cell more sensitive to the combination therapy than to treatment with the first (and/or second) anticancer drug alone. In some embodiments, the method further comprises next-generation sequencing to obtain the sgRNA or sgRNA iBAR sequences or the sequences comprising the hit gene mutations. In some embodiments, the two anticancer drugs target the same cancer target. In some embodiments, the two anticancer drugs target different cancer targets.
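The enrichment/depletion rule above can be sketched as a simple classifier. This is an illustrative sketch only: it assumes that a per-gene fold change (post-treatment over control) and an FDR have already been computed by whatever statistical method is used, which the passage does not specify. The `GeneStat` type, the `classify_hit` function and the gene labels are hypothetical. The text allows the FDR and fold-change criteria to be applied together or separately ("and/or"); this sketch requires both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneStat:
    gene: str
    fold_change: float  # post-treatment abundance divided by control abundance
    fdr: float          # false discovery rate for the enrichment/depletion call


def classify_hit(stat, fdr_cutoff=0.1, min_fold=2.0):
    """Label a gene relative to the single-drug control population.

    Enriched guides suggest the mutation makes cells more resistant to the
    combination; depleted guides suggest it makes them more sensitive.
    """
    if stat.fdr > fdr_cutoff:
        return "not a hit"
    if stat.fold_change >= min_fold:
        return "more resistant"
    if stat.fold_change <= 1.0 / min_fold:
        return "more sensitive"
    return "not a hit"


# Hypothetical values for illustration; not results from this disclosure.
stats = [
    GeneStat("GENE_A", fold_change=4.0, fdr=0.01),
    GeneStat("GENE_B", fold_change=0.25, fdr=0.05),
    GeneStat("GENE_C", fold_change=4.0, fdr=0.50),
]
calls = {s.gene: classify_hit(s) for s in stats}
```

The same function can be applied once against the first control population and once against the second to produce the two hit gene sets of d1) and d2).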

In some embodiments, any of the identification methods described herein further comprises validating the target gene by: a) modifying a cancer cell by generating a mutation (e.g., an inactivating mutation) in the target gene of the cancer cell; and b) determining the sensitivity or resistance of the modified cancer cell to the anticancer drug.

Also provided are modified cancer cells obtained by inactivating one or more target genes identified by any of the methods described herein.

Single guide RNA (sgRNA) library and sgRNA iBAR library

In some embodiments, the invention uses CRISPR/Cas guide RNAs (e.g., single guide RNAs) and constructs encoding the CRISPR/Cas guide RNAs to generate mutations (e.g., inactivating mutations) in one or more hit genes. In some embodiments, the mutation is generated by cleaving the hit gene (e.g., using CRISPR/Cas9). In some embodiments, the mutation is generated by modulating (e.g., repressing or reducing) the expression of the hit gene (e.g., using CRISPR/dCas fused to a repression domain).

In some embodiments, there is provided an sgRNA library comprising one or more (e.g., 1, 2, 3, 4, 5, 10, 100, 1,000, 10,000, 20,000, or more) sgRNA constructs, wherein each sgRNA construct (e.g., a lentivirus or lentiviral vector encoding the sgRNA) comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a corresponding hit gene. In some embodiments, the sgRNA library comprises a plurality of (e.g., 2, 3, 4, 5, 10, 100, 1,000, 10,000, 20,000, or more) sgRNA constructs, wherein at least two of the hit genes to which the guide sequences are complementary are different from each other. In some embodiments, the sgRNA construct comprises (or consists of) an sgRNA. In some embodiments, the sgRNA construct encodes an sgRNA. In some embodiments, the sgRNA construct is a plasmid encoding the sgRNA. In some embodiments, the sgRNA construct is a viral vector (e.g., a lentiviral vector) encoding the sgRNA. In some embodiments, the sgRNA construct is a virus (e.g., a lentivirus) encoding the sgRNA. In some embodiments, each sgRNA comprises a guide sequence fused to a second sequence, wherein the second sequence comprises a repeat-anti-repeat stem-loop that interacts with a Cas protein (e.g., Cas9). In some embodiments, the second sequence of each sgRNA further comprises stem-loop 1, stem-loop 2, and/or stem-loop 3. In some embodiments, each guide sequence comprises about 17 to about 23 nucleotides.

In some embodiments, the sgRNA library comprises at least about 100 sgRNA constructs, such as at least about any of 200, 300, 400, 1,000, 1,600, 4,000, 10,000, 15,000, 16,000, 19,000, 20,000, 38,000, 50,000, 100,000, 150,000, 155,000, 200,000, or more sgRNA constructs. In some embodiments, the sgRNA library comprises about 6,000 to about 16,000 sgRNA constructs. In some embodiments, the sgRNA library comprises about 10,000 to about 18,000 sgRNA constructs. In some embodiments, the sgRNA library comprising a plurality of sgRNA constructs comprises or encodes sgRNAs having guide sequences complementary to a target site of every annotated gene in the genome (hereinafter also referred to as a "genome-wide sgRNA library"). In some embodiments, the sgRNA library comprising a plurality of sgRNA constructs comprises or encodes sgRNAs having guide sequences complementary to target sites of hit genes which, in cancer patients (e.g., based on the literature or databases), have a DNA mutation frequency of at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher) and an RNA expression level that is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, 100-fold, or more). In some embodiments, the hit gene encodes a protein expressed inside the cell or on the cell surface, in healthy cells or cancer cells.

In some embodiments, the sgRNA library comprises at least two (e.g., 2, 3, 4, 5 or more, such as 3) sgRNA constructs that comprise or encode sgRNAs having guide sequences complementary to at least two (e.g., 2, 3, 4, 5 or more, such as 3) different target sites of the same hit gene, i.e., the sgRNA library has at least 2-fold coverage for that hit gene. In some embodiments, for each hit gene, the sgRNA library comprises at least 3 (e.g., about 6 to about 12) sgRNA constructs that comprise or encode sgRNAs having guide sequences complementary to at least 3 (e.g., about 6 to about 12) different target sites of the same hit gene. In some embodiments, for each annotated gene in the genome, the sgRNA library comprises at least two (e.g., 2, 3, 4, 5 or more, such as 3) sgRNA constructs that comprise or encode sgRNAs having guide sequences complementary to at least two (e.g., 2, 3, 4, 5 or more, such as 3) different target sites within the same hit gene, i.e., the sgRNA library has at least 2-fold coverage for the whole genome.

In some embodiments, the sgRNA library further comprises one or more (e.g., 1, 2, 3, 4, 5, 10, 100, 1,000, 2,000, 10,000, or more) "negative control sgRNA constructs", wherein each negative control sgRNA construct (e.g., a lentivirus or lentiviral vector encoding the negative control sgRNA) comprises or encodes a negative control sgRNA, and wherein each negative control sgRNA comprises a guide sequence that is: complementary to an unrelated sequence not present in the genome; complementary to a control gene (e.g., a gene known to give the same or a similar response in the test and control groups after inactivation); or complementary to a sequence unrelated to any annotated gene in the genome. In some embodiments, the sgRNA library further comprises negative control sgRNA constructs in an amount of about 3% to about 30% of the number of hit gene sgRNA constructs in the sgRNA library. In some embodiments, the sgRNA library further comprises about 500 to about 4,000 (e.g., about 500) negative control sgRNA constructs.
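The numeric constraints in this passage (guide length of about 17 to 23 nucleotides, at least 2-fold coverage per hit gene, negative controls at about 3% to 30% of the hit gene constructs) lend themselves to a simple design check. The sketch below is illustrative only: the `(gene, guide)` tuple representation, the use of `None` to mark a negative control, the function name and all sequences are assumptions and not part of the disclosure.

```python
from collections import defaultdict


def summarize_library(constructs, min_sites_per_gene=2):
    """Summarize a list of (gene_or_None, guide_sequence) pairs.

    A gene of None marks a negative control construct (an assumed convention).
    """
    guides_by_gene = defaultdict(set)
    n_targeting = 0
    n_controls = 0
    bad_length = []
    for gene, guide in constructs:
        if not 17 <= len(guide) <= 23:
            bad_length.append(guide)
        if gene is None:
            n_controls += 1
        else:
            n_targeting += 1
            guides_by_gene[gene].add(guide)
    ratio = n_controls / n_targeting if n_targeting else 0.0
    return {
        "n_targeting": n_targeting,
        "n_controls": n_controls,
        "control_ratio": ratio,
        "control_ratio_in_range": 0.03 <= ratio <= 0.30,
        # Genes with fewer distinct guides (target sites) than required.
        "under_covered": sorted(
            g for g, s in guides_by_gene.items() if len(s) < min_sites_per_gene
        ),
        "bad_length": bad_length,
    }


# Placeholder 20-nt sequences; not real guide designs.
constructs = [
    ("GENE_A", "ACGT" * 5),
    ("GENE_A", "TGCA" * 5),
    ("GENE_A", "GATC" * 5),
    ("GENE_B", "CTAG" * 5),
    (None, "AACC" * 5),
]
summary = summarize_library(constructs)
```

In this toy input `GENE_B` has only one guide and is therefore reported as under-covered, while one control for four targeting constructs gives a ratio of 0.25.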

在一些實施方案中,所述sgRNA還包含內部條碼(iBAR)序列(該sgRNA在下文稱為“sgRNA iBAR”)。在一些實施方案中,所述iBAR位於所述sgRNA內,使得所得的sgRNA iBAR可與Cas蛋白(例如,Cas9)一起操作以修飾(例如,切割或調控表達)與所述sgRNA iBAR的嚮導序列互補的命中基因。因此,在一些實施方案中,本文所述sgRNA文庫是sgRNA iBAR文庫。在一些實施方案中,所述sgRNA iBAR文庫包含一個或多個(例如,1、2、3、4、5、10、100、1,000、10,000、20,000個或更多個) sgRNA iBAR構建體,其中每個sgRNA iBAR構建體包含或編碼sgRNA iBAR,其中每個sgRNA iBAR包含嚮導序列和iBAR序列,且其中每個嚮導序列與相應命中基因中的靶位點互補(例如,至少約50%、60%、70%、80%、90%、95%、96%、97%、98%、99%或100%互補中的任一個)。在一些實施方案中,所述sgRNA iBAR文庫包含多個(例如,2、3、4、5、10、100、1,000、2,000、10,000個或更多個) sgRNA iBAR構建體,其中至少兩個與嚮導序列互補的命中基因彼此不同。在一些實施方案中,每個sgRNA iBAR包含在5’-至-3’方向的第一莖環和第二莖環,其中第一莖環序列與第二莖環序列雜交以形成與Cas蛋白相互作用的雙鏈RNA (dsRNA)區,且其中所述iBAR序列位於第一莖環序列的3’端和第二莖環序列的5’端之間。在一些實施方案中,每個sgRNA iBAR包含與第二序列融合的嚮導序列,其中所述第二序列包含與Cas9蛋白(例如,Cas9)相互作用的重複-反重複莖環。在一些實施方案中,每個sgRNA iBAR的第二序列還包含莖環1、莖環2和/或莖環3。在一些實施方案中,所述Cas蛋白是Cas9,且每個sgRNA iBAR的iBAR序列被插入至所述重複-反重複莖環的環區中。在一些實施方案中,每個sgRNA iBAR從5’-至-3’包含:嚮導序列、具有插入至所述環區的iBAR序列的重複-反重複莖環、莖環1、莖環2和莖環3。在一些實施方案中,提供了包含多組sgRNA iBAR構建體的sgRNA iBAR文庫,其中每組sgRNA iBAR構建體包含3個或更多個(例如,3、4、5個或更多個,如4個) sgRNA iBAR構建體(例如,編碼所述sgRNA iBAR的慢病毒或慢病毒載體),每個該構建體包含或編碼sgRNA iBAR,其中每個sgRNA iBAR包含嚮導序列和iBAR序列,其中所述3個或更多個sgRNA iBAR構建體的嚮導序列是相同的,其中所述3個或更多個sgRNA iBAR構建體中每一個的iBAR序列彼此不同,且其中每組sgRNA iBAR構建體的嚮導序列與相應命中基因的不同靶位點(例如,不同命中基因,或相同命中基因內的不同位點)互補(例如,至少約50%、60%、70%、80%、90%、95%、96%、97%、98%、99%或100%互補中的任一個)。在一些實施方案中,每組sgRNA iBAR構建體包含4個sgRNA iBAR構建體,且其中4個sgRNA iBAR構建體中每一個的iBAR序列彼此不同。因此,在一些實施方案中,提供了包含多組sgRNA iBAR構建體的sgRNA iBAR文庫,其中每組sgRNA iBAR構建體包含4個sgRNA iBAR構建體,每個該構建體包含或編碼sgRNA iBAR,其中每個sgRNA iBAR包含嚮導序列和iBAR序列,其中所述4個sgRNA iBAR構建體的嚮導序列是相同的,其中所述4個sgRNA iBAR構建體中每一個的iBAR序列彼此不同,且其中每組sgRNA iBAR構建體的嚮導序列與相應命中基因中的不同靶位點(例如,不同命中基因,或相同命中基因內的不同位點)互補(例如,至少約50%、60%、70%、80%、90%、95%、96%、97%、98%、99%或100%互補中的任一個)。在一些實施方案中,所述sgRNA iBAR文庫包含至少約100 (例如,至少約200、400、1,000、1,300、1,600、4,000、10,000、15,000、19,000、20,000、38,000、50,000、100,000、150,000、155,000、200,000或更多中的任一個)組sgRNA iBAR構建體,如約1000至約4000組sgRNA iBAR構建體。在一些實施方案中,不同組的sgRNA iBAR構建體中至少兩個sgRNA iBAR構建體的iBAR序列是相同的(例如,第一組和第二組的sgRNA iBAR構建體在兩組sgRNA 
iBAR構建體中具有至少1、2、3、4或更多個共有的iBAR序列)。在一些實施方案中,至少兩組sgRNA iBAR構建體的iBAR序列是相同的。在一些實施方案中,包含多組sgRNA iBAR構建體的sgRNA iBAR文庫包含或編碼具有與基因組中每個注釋基因的靶位點互補的嚮導序列的sgRNA iBAR(以下也稱為“全基因組sgRNA iBAR文庫”)。在一些實施方案中,包含多組sgRNA iBAR構建體的sgRNA iBAR文庫包含或編碼具有下述嚮導序列的sgRNA iBAR,該嚮導序列與在癌症患者中(例如,基於文獻或資料庫)其DNA突變頻率為至少約5% (例如,至少約10%、20%、30%、40%、50%、60%、70%、80%、90%或更高中的任一個)且其RNA表達水準上調或下調了大於約2-倍(例如,大於約2.5、3、4、5、6、7、8、9、10、50、100倍或更高中的任一個)的命中基因的靶位點互補。在一些實施方案中,在健康細胞或癌細胞中,所述命中基因編碼在細胞內或細胞表面表達的蛋白。在一些實施方案中,所述sgRNA iBAR文庫包含至少兩組(例如,2、3、4、5或更多,如3組) sgRNA iBAR構建體,其包含或編碼具有與相同命中基因的至少兩個(例如,2、3、4、5個或更多個,如3個)不同靶基因位點互補的嚮導序列的sgRNA iBAR,即,針對所述命中基因,所述sgRNA iBAR文庫具有至少兩倍的覆蓋率。在一些實施方案中,針對每個命中基因,所述sgRNA iBAR文庫包含3組sgRNA iBAR構建體,其包含或編碼具有與相同命中基因的3個不同靶基因位點互補的嚮導序列的sgRNA iBAR。在一些實施方案中,針對基因組中每個注釋的基因,所述sgRNA iBAR文庫包含至少兩組(例如,2、3、4、5或更多,如3組) sgRNA iBAR構建體,其包含或編碼具有與相同命中基因中的至少兩個(例如,2、3、4、5個或更多個,如3個)不同的靶基因位點互補的嚮導序列,即,針對整個基因組,所述sgRNA iBAR文庫具有至少兩倍的覆蓋率。在一些實施方案中,每個嚮導序列包含約17至約23個核苷酸。在一些實施方案中,每個iBAR序列包含約1至約50個(例如,約6個)核苷酸。在一些實施方案中,所述sgRNA iBAR構建體包含sgRNA iBAR(或由其組成)。在一些實施方案中,所述sgRNA iBAR構建體編碼sgRNA iBAR。在一些實施方案中,所述sgRNA iBAR構建體是編碼所述sgRNA iBAR的質粒。在一些實施方案中,所述sgRNA iBAR構建體是編碼所述sgRNA iBAR的病毒載體(例如,慢病毒載體)。在一些實施方案中,所述sgRNA iBAR構建體是編碼所述sgRNA iBAR的病毒(例如,慢病毒)。具有不同iBAR序列的不同sgRNA iBAR構建體的組可用于單個基因編輯和篩選實驗以提供重復資料。在一些實施方案中,所述sgRNA iBAR文庫還包含一組或多組“陰性對照sgRNA iBAR構建體”,其中每組陰性對照sgRNA iBAR構建體包含3個或更多個(例如,3、4、5個或更多個,如4個)陰性對照sgRNA iBAR構建體(例如,慢病毒或編碼所述陰性對照sgRNA iBAR的慢病毒載體),每個所述構建體包含或編碼陰性對照sgRNA iBAR,其中每個陰性對照sgRNA iBAR包含嚮導序列和iBAR序列,其中所述3個或更多個陰性對照sgRNA iBAR構建體的嚮導序列是相同的,其中所述3個或更多個陰性對照sgRNA iBAR構建體中每一個的iBAR序列彼此不同,且其中每組陰性對照sgRNA iBAR構建體的嚮導序列:與和所述基因組中的任何注釋基因無關的靶位點互補,與對照基因互補(例如,已知在基因失活後在測試組合對照組之間的反應相同或相似),或與不在所述基因組中的不相關序列互補。在一些實施方案中,所述sgRNA iBAR文庫還包含數量為所述sgRNA iBAR文庫中命中基因sgRNA iBAR構建體數量的約3%至約30%的陰性對照sgRNA iBAR構建體。在一些實施方案中,所述sgRNA iBAR文庫還包含約500至約4000個陰性對照sgRNA iBAR構建體(例如,2000)或多組陰性對照sgRNA iBAR構建體(例如,500組)。 In some embodiments, the sgRNA further comprises an internal barcode (iBAR) sequence (the sgRNA is hereinafter referred to as "sgRNA iBAR "). 
In some embodiments, the iBAR is located within the sgRNA such that the resulting sgRNA iBAR can operate with a Cas protein (e.g., Cas9) to modify (e.g., cleave or regulate expression) the guide sequence complementary to the sgRNA iBAR hit genes. Accordingly, in some embodiments, the sgRNA library described herein is an sgRNA iBAR library. In some embodiments, the sgRNA iBAR library comprises one or more (e.g., 1, 2, 3, 4, 5, 10, 100, 1,000, 10,000, 20,000, or more) sgRNA iBAR constructs, wherein Each sgRNA iBAR construct comprises or encodes a sgRNA iBAR , wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, and wherein each guide sequence is complementary (e.g., at least about 50%, 60% , 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% complementary). In some embodiments, the sgRNA iBAR library comprises a plurality (e.g., 2, 3, 4, 5, 10, 100, 1,000, 2,000, 10,000, or more) of sgRNA iBAR constructs, at least two of which are associated with The hit genes complementary to the guide sequence were different from each other. In some embodiments, each sgRNA iBAR comprises a first stem-loop and a second stem-loop in the 5'-to-3' direction, wherein the first stem-loop sequence hybridizes to the second stem-loop sequence to form an interaction with the Cas protein. An acting double-stranded RNA (dsRNA) region, and wherein the iBAR sequence is located between the 3' end of the first stem-loop sequence and the 5' end of the second stem-loop sequence. In some embodiments, each sgRNA iBAR comprises a guide sequence fused to a second sequence comprising a repeat-inverter stem-loop that interacts with a Cas9 protein (eg, Cas9). In some embodiments, the second sequence of each sgRNA iBAR further comprises stem-loop 1, stem-loop 2, and/or stem-loop 3. In some embodiments, the Cas protein is Cas9, and the iBAR sequence of each sgRNA iBAR is inserted into the loop region of the repeat-invert repeat stem-loop. 
In some embodiments, each sgRNA iBAR comprises from 5'-to-3': a guide sequence, a repeat-inverted repeat stem-loop with the iBAR sequence inserted into the loop region, stem-loop 1, stem-loop 2, and stem-loop Ring 3. In some embodiments, a sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs is provided, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 3, 4, 5 or more, such as 4 a) sgRNA iBAR constructs (for example, lentiviruses or lentiviral vectors encoding said sgRNA iBARs ), each of which comprises or encodes a sgRNA iBAR , wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein said 3 The guide sequences of the three or more sgRNA iBAR constructs are identical, wherein the iBAR sequences of each of the 3 or more sgRNA iBAR constructs are different from each other, and wherein the guide sequences of each set of sgRNA iBAR constructs are identical to Different target sites (e.g., different hit genes, or different sites within the same hit gene) of corresponding hit genes are complementary (e.g., at least about 50%, 60%, 70%, 80%, 90%, 95%, 96% %, 97%, 98%, 99% or 100% complementary). In some embodiments, each set of sgRNA iBAR constructs comprises 4 sgRNA iBAR constructs, and wherein the iBAR sequences of each of the 4 sgRNA iBAR constructs are different from each other. 
Accordingly, in some embodiments, there is provided a sgRNA iBAR library comprising sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises four sgRNA iBAR constructs, each comprising or encoding a sgRNA iBAR , wherein each Each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the four sgRNA iBAR constructs are identical, wherein the iBAR sequences of each of the four sgRNA iBAR constructs are different from each other, and wherein each set of sgRNA iBAR The guide sequence of the construct is complementary (e.g., at least about 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% complementary). In some embodiments, the sgRNA iBAR library comprises at least about 100 (e.g., at least about 200, 400, 1,000, 1,300, 1,600, 4,000, 10,000, 15,000, 19,000, 20,000, 38,000, 50,000, 100,000, 150,000, 155,000, Any of 200,000 or more) sets of sgRNA iBAR constructs, such as about 1000 to about 4000 sets of sgRNA iBAR constructs. In some embodiments, the iBAR sequences of at least two sgRNA iBAR constructs in different sets of sgRNA iBAR constructs are identical (e.g., the first and second set of sgRNA iBAR constructs in the two sets of sgRNA iBAR constructs have at least 1, 2, 3, 4 or more consensus iBAR sequences). In some embodiments, the iBAR sequences of at least two sets of sgRNA iBAR constructs are identical. In some embodiments, a sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs comprises or encodes a sgRNA iBAR with a guide sequence complementary to the target site of each annotated gene in the genome (hereinafter also referred to as a "genome-wide sgRNA iBAR library") "). 
In some embodiments, a sgRNA iBAR library comprising panels of sgRNA iBAR constructs comprises or encodes sgRNA iBARs with guide sequences that correlate with DNA mutation frequencies in cancer patients (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more) and its RNA expression level is upregulated or Target sites for hit genes that are downregulated by greater than about 2-fold (eg, greater than any of about 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold, or more) are complemented. In some embodiments, the hit gene encodes a protein expressed inside or on the cell surface in a healthy cell or a cancer cell. In some embodiments, the sgRNA iBAR library comprises at least two sets (e.g., 2, 3, 4, 5 or more, such as 3 sets) of sgRNA iBAR constructs comprising or encoding at least two genes with the same hit gene. sgRNA iBAR of guide sequences complementary to different target gene loci (for example, 2, 3, 4, 5 or more, such as 3), that is, for the hit gene, the sgRNA iBAR library has at least two double the coverage. In some embodiments, for each hit gene, the sgRNA iBAR library comprises 3 sets of sgRNA iBAR constructs comprising or encoding sgRNA iBARs with guide sequences complementary to 3 different target gene sites of the same hit gene. In some embodiments, for each annotated gene in the genome, the sgRNA iBAR library comprises at least two sets (e.g., 2, 3, 4, 5 or more, such as 3 sets) of sgRNA iBAR constructs comprising or Encoding a guide sequence complementary to at least two (e.g., 2, 3, 4, 5 or more, such as 3) target gene loci that differ from at least two of the same hit genes, i.e., for the entire genome, the The sgRNA iBAR library has at least two-fold coverage. In some embodiments, each guide sequence comprises about 17 to about 23 nucleotides. In some embodiments, each iBAR sequence comprises about 1 to about 50 (eg, about 6) nucleotides. 
In some embodiments, the sgRNA iBAR construct comprises (or consists of) an sgRNA iBAR. In some embodiments, the sgRNA iBAR construct encodes an sgRNA iBAR. In some embodiments, the sgRNA iBAR construct is a plasmid encoding the sgRNA iBAR. In some embodiments, the sgRNA iBAR construct is a viral vector (e.g., a lentiviral vector) encoding the sgRNA iBAR. In some embodiments, the sgRNA iBAR construct is a virus (e.g., a lentivirus) encoding the sgRNA iBAR. A set of sgRNA iBAR constructs with different iBAR sequences can be used in a single gene-editing and screening experiment to provide replicate data. In some embodiments, the sgRNA iBAR library further comprises one or more sets of "negative control sgRNA iBAR constructs", wherein each set of negative control sgRNA iBAR constructs comprises 3 or more (e.g., 3, 4, 5, or more, such as 4) negative control sgRNA iBAR constructs (e.g., lentiviruses or lentiviral vectors encoding the negative control sgRNA iBAR), each comprising or encoding a negative control sgRNA iBAR, wherein each negative control sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the 3 or more negative control sgRNA iBAR constructs are identical, wherein the iBAR sequences of the 3 or more negative control sgRNA iBAR constructs are different from each other, and wherein the guide sequence of each set of negative control sgRNA iBAR constructs is: complementary to a target site unrelated to any annotated gene in the genome; complementary to a control gene (e.g., a gene whose inactivation is known to produce the same or a similar response in the test group and the control group); or complementary to an unrelated sequence not present in the genome. In some embodiments, the sgRNA iBAR library further comprises negative control sgRNA iBAR constructs in an amount of about 3% to about 30% of the number of hit gene sgRNA iBAR constructs in the sgRNA iBAR library.
In some embodiments, the sgRNA iBAR library further comprises about 500 to about 4,000 negative control sgRNA iBAR constructs (e.g., 2,000) or sets of negative control sgRNA iBAR constructs (e.g., 500 sets).
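As a rough arithmetic check on the library sizes described above, the following sketch counts sets and constructs for a hypothetical design. The figure of 1,000 hit genes and the names `LibraryDesign` and `negative_control_range` are illustrative assumptions and do not appear in this disclosure; only the 3 sets per hit gene, the 4 iBARs per set, and the 3% to 30% negative control range are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryDesign:
    """Hypothetical container for the size parameters of an sgRNA iBAR library."""

    hit_genes: int        # number of hit genes targeted (assumed value below)
    guides_per_gene: int  # sets per hit gene, i.e. fold coverage of each gene
    ibars_per_guide: int  # constructs per set, each carrying a distinct iBAR

    @property
    def n_sets(self) -> int:
        # One set per guide sequence.
        return self.hit_genes * self.guides_per_gene

    @property
    def n_constructs(self) -> int:
        # Every set contributes one construct per iBAR.
        return self.n_sets * self.ibars_per_guide


def negative_control_range(n_constructs: int, low_pct: int = 3, high_pct: int = 30) -> tuple[int, int]:
    """Return the (min, max) negative control construct counts for a percent range."""
    return n_constructs * low_pct // 100, n_constructs * high_pct // 100


design = LibraryDesign(hit_genes=1000, guides_per_gene=3, ibars_per_guide=4)
print(design.n_sets)                                # 3000
print(design.n_constructs)                          # 12000
print(negative_control_range(design.n_constructs))  # (360, 3600)
```

Under these assumptions, 500 negative control sets of 4 constructs each would give 2,000 negative control constructs, which falls inside the computed range.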

In some embodiments, there is provided an sgRNA library (e.g., an sgRNA iBAR library) comprising one or more sgRNA constructs (e.g., sgRNA iBAR constructs), wherein each sgRNA construct (e.g., a lentivirus or lentiviral vector encoding the sgRNA) comprises or encodes an sgRNA (e.g., an sgRNA iBAR), and wherein each sgRNA comprises a guide sequence complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a target gene selected from the group consisting of: ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1.
In some embodiments, there is provided an sgRNA library (e.g., an sgRNA iBAR library) comprising one or more sgRNA constructs (e.g., sgRNA iBAR constructs), wherein each sgRNA construct (e.g., a lentivirus or lentiviral vector encoding the sgRNA) comprises or encodes an sgRNA (e.g., an sgRNA iBAR), and wherein each sgRNA comprises a guide sequence complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a target gene selected from the group consisting of: AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2.

In some embodiments, there is provided an sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 3, 4, 5, or more, such as 4) sgRNA iBAR constructs (e.g., lentiviruses or lentiviral vectors encoding the sgRNA iBAR), each comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the 3 or more sgRNA iBAR constructs are identical, wherein the iBAR sequences of the 3 or more sgRNA iBAR constructs are different from each other, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a different target site of a corresponding hit gene (e.g., a different hit gene, or a different site within the same hit gene), and wherein each sgRNA iBAR is operable with a Cas9 protein to modify the target site. In some embodiments, there is provided an sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises four sgRNA iBAR constructs, each comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the four sgRNA iBAR constructs are identical, wherein the iBAR sequences of the four sgRNA iBAR constructs are different from each other, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a different target site of a corresponding hit gene (e.g., a different hit gene, or a different site within the same hit gene), and wherein each sgRNA iBAR is operable with a Cas9 protein to modify the target site. In some embodiments, each sgRNA iBAR sequence comprises a guide sequence fused to a second sequence, wherein the second sequence comprises a repeat:anti-repeat stem-loop that interacts with Cas9.
In some embodiments, the second sequence of each sgRNA iBAR sequence further comprises stem-loop 1, stem-loop 2, and/or stem-loop 3. In some embodiments, the iBAR sequence is inserted into the loop region of the repeat:anti-repeat stem-loop, and/or the loop region of stem-loop 1, stem-loop 2, or stem-loop 3. In some embodiments, each iBAR sequence comprises about 1 to about 50 (e.g., about 6) nucleotides. In some embodiments, each sgRNA iBAR construct is an RNA, a plasmid, a viral vector (e.g., a lentiviral vector), or a virus (e.g., a lentivirus). In some embodiments, the sgRNA iBAR library comprises at least about 100 (e.g., at least about any of 200, 400, 1,000, 1,300, 1,600, 4,000, 10,000, 15,000, 19,000, 20,000, 38,000, 50,000, 100,000, 150,000, 155,000, 200,000, or more) sets of sgRNA iBAR constructs, such as about 1,000 to about 4,000 sets of sgRNA iBAR constructs. In some embodiments, the iBAR sequences of at least two sgRNA iBAR constructs in different sets of sgRNA iBAR constructs are identical (e.g., a first set and a second set of sgRNA iBAR constructs have at least 1, 2, 3, 4, or more iBAR sequences in common). In some embodiments, the iBAR sequences of at least two sets of sgRNA iBAR constructs are identical. In some embodiments, the sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs comprises or encodes sgRNA iBARs with guide sequences complementary to target sites of hit genes whose DNA mutation frequency in cancer patients (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more) and whose RNA expression level is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, or 100-fold, or more).
In some embodiments, for each hit gene whose DNA mutation frequency in cancer patients (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more) and whose RNA expression level is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, or 100-fold, or more), the sgRNA iBAR library comprises at least two sets (e.g., 2, 3, 4, 5, or more, such as 3 sets) of sgRNA iBAR constructs comprising or encoding sgRNA iBARs with guide sequences complementary to at least two (e.g., 2, 3, 4, 5, or more, such as 3) different target sites of the same hit gene. In some embodiments, the hit gene encodes a protein expressed inside the cell or on the cell surface, in a healthy cell or a cancer cell. In some embodiments, each guide sequence comprises about 17 to about 23 nucleotides.

In some embodiments, there is provided an sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 3, 4, 5, or more, such as 4) sgRNA iBAR constructs, each comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence, a second sequence, and an iBAR sequence, wherein the guide sequences of the 3 or more sgRNA iBAR constructs are identical, wherein the iBAR sequences of the 3 or more sgRNA iBAR constructs are different from each other, wherein the guide sequence is fused to the second sequence, wherein the second sequence comprises a repeat:anti-repeat stem-loop that interacts with a Cas9 protein, wherein the iBAR sequence is inserted into the loop region of the repeat:anti-repeat stem-loop, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a different target site of a corresponding hit gene (e.g., a different hit gene, or a different site within the same hit gene), and wherein each sgRNA iBAR is operable with the Cas9 protein to modify the target site.
In some embodiments, there is provided an sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises four sgRNA iBAR constructs, each comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence, a second sequence, and an iBAR sequence, wherein the guide sequences of the four sgRNA iBAR constructs are identical, wherein the iBAR sequences of the four sgRNA iBAR constructs are different from each other, wherein the guide sequence is fused to the second sequence, wherein the second sequence comprises a repeat:anti-repeat stem-loop that interacts with a Cas9 protein, wherein the iBAR sequence is inserted into the loop region of the repeat:anti-repeat stem-loop, wherein the guide sequence of each set of sgRNA iBAR constructs is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a different target site of a corresponding hit gene (e.g., a different hit gene, or a different site within the same hit gene), and wherein each sgRNA iBAR is operable with the Cas9 protein to modify the target site. In some embodiments, the second sequence of each sgRNA iBAR sequence further comprises stem-loop 1, stem-loop 2, and/or stem-loop 3, e.g., fused to the 3' end of the repeat:anti-repeat stem-loop sequence. In some embodiments, each iBAR sequence comprises about 1 to about 50 (e.g., 6) nucleotides. In some embodiments, each sgRNA iBAR construct is an RNA, a plasmid, a viral vector (e.g., a lentiviral vector), or a virus (e.g., a lentivirus). In some embodiments, the sgRNA iBAR library comprises at least about 100 sets (e.g., at least about any of 200, 400, 1,000, 1,300, 1,600, 4,000, 10,000, 15,000, 19,000, 20,000, 38,000, 50,000, 100,000, 150,000, 155,000, 200,000, or more sets) of sgRNA iBAR constructs, such as about 1,000 to about 4,000 sets of sgRNA iBAR constructs.
In some embodiments, the iBAR sequences of at least two sgRNA iBAR constructs in different sets of sgRNA iBAR constructs are identical (e.g., a first set and a second set of sgRNA iBAR constructs have at least 1, 2, 3, 4, or more iBAR sequences in common). In some embodiments, the iBAR sequences of at least two sets of sgRNA iBAR constructs are identical. In some embodiments, the sgRNA iBAR library comprising multiple sets of sgRNA iBAR constructs comprises or encodes sgRNA iBARs with guide sequences complementary to target sites of hit genes, wherein the DNA mutation frequency of the hit gene in cancer patients (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more) and its RNA expression level is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, or 100-fold, or more). In some embodiments, for each such hit gene, the sgRNA iBAR library comprises at least two sets (e.g., 2, 3, 4, 5, or more sets, such as 3 sets) of sgRNA iBAR constructs comprising or encoding sgRNA iBARs with guide sequences complementary to at least two (e.g., 2, 3, 4, 5, or more, such as 3) different target sites within the same hit gene, wherein the DNA mutation frequency of the hit gene in cancer patients (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more) and its RNA expression level is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, or 100-fold, or more). In some embodiments, the hit gene encodes a protein expressed inside the cell or on the cell surface, in a healthy cell or a cancer cell. In some embodiments, each guide sequence comprises about 17 to about 23 nucleotides.

In some embodiments, there is provided an sgRNA iBAR construct comprising a guide sequence and a guide hairpin coding sequence for a repeat:anti-repeat duplex and a tetraloop, wherein the guide sequence is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a corresponding hit gene, and wherein the iBAR is embedded in the tetraloop as an internal replicate. In some embodiments, the iBAR comprises a sequence of 1 nucleotide ("nt") to 50 nt (e.g., 1nt-40nt, 1nt-30nt, 1nt-25nt, 2nt-20nt, 3nt-18nt, 3nt-16nt, 3nt-14nt, 3nt-12nt, 3nt-10nt, 3nt-9nt, 4nt-8nt, 5nt-7nt; preferably 3nt, 4nt, 5nt, 6nt, or 7nt) consisting of A, T, C, and G nucleotides. In some embodiments, the guide sequence is about any of 17-23, 18-22, or 19-21 nucleotides in length, and the hairpin sequence, once transcribed, can bind to a Cas nuclease (e.g., Cas9). In some embodiments, the sgRNA iBAR construct further comprises a sequence encoding stem-loop 1, stem-loop 2, and/or stem-loop 3. In some embodiments, each sgRNA iBAR construct is an RNA, a plasmid, a viral vector (e.g., a lentiviral vector), or a virus (e.g., a lentivirus).

Also provided are sgRNA molecules encoded by any of the sgRNA constructs or libraries described herein. Also provided are sgRNA iBAR molecules encoded by any of the sgRNA iBAR constructs, sets, or libraries described herein. Also provided are compositions and kits comprising any of the sgRNA or sgRNA iBAR constructs, molecules, sets, or libraries described herein.

In some embodiments, there is provided a modified cancer cell comprising any of the sgRNA or sgRNA iBAR constructs, molecules, sets, or libraries described herein. In some embodiments, there is provided a cancer cell library, wherein each cancer cell comprises one or more sgRNA constructs from an sgRNA library described herein, or one or more sgRNA iBAR constructs from an sgRNA iBAR library described herein. In some embodiments, the cancer cell library comprises an sgRNA library or sgRNA iBAR library described herein that targets any of the target genes identified herein, or any hit gene whose DNA mutation frequency in cancer patients compared to healthy individuals (e.g., based on literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more) and whose RNA expression level is upregulated or downregulated by more than about 2-fold (e.g., more than about any of 2.5-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 50-, or 100-fold, or more). In some embodiments, the modified cancer cell or the initial cancer cell population comprises or expresses one or more components of a CRISPR/Cas system, such as a Cas protein (e.g., Cas9) operable with the sgRNA or sgRNA iBAR construct.

iBAR sequence

A set of sgRNA iBAR constructs comprises 3 or more sgRNA iBAR constructs, each comprising a different iBAR sequence. In some embodiments, a set of sgRNA iBAR constructs comprises 3 sgRNA iBAR constructs, each comprising a different iBAR sequence. In some embodiments, a set of sgRNA iBAR constructs comprises 4 sgRNA iBAR constructs, each comprising a different iBAR sequence. In some embodiments, a set of sgRNA iBAR constructs comprises 5 sgRNA iBAR constructs, each comprising a different iBAR sequence. In some embodiments, a set of sgRNA iBAR constructs comprises 6 or more sgRNA iBAR constructs, each comprising a different iBAR sequence.

The iBAR sequence can be of any suitable length. In some embodiments, each iBAR sequence is about 1-50 nucleotides ("nt") in length, such as about any of 1nt-40nt, 1nt-30nt, 1nt-20nt, 2nt-20nt, 3nt-18nt, 3nt-16nt, 3nt-14nt, 3nt-12nt, 3nt-10nt, 3nt-9nt, 3nt-8nt, 4nt-8nt, or 5nt-7nt. In some embodiments, each iBAR sequence is about any of 2nt, 3nt, 4nt, 5nt, 6nt, 7nt, or 8nt in length. In some embodiments, the iBAR sequences of all sgRNA iBAR constructs have the same length. In some embodiments, the iBAR sequences of different sgRNA iBAR constructs have different lengths. In some embodiments, the iBAR sequences within a set of sgRNA iBAR constructs have the same length. In some embodiments, the iBAR sequences within a set of sgRNA iBAR constructs have different lengths. In some embodiments, the iBAR sequences within one set of sgRNA iBAR constructs differ in length from the iBAR sequences within another set of sgRNA iBAR constructs. In some embodiments, the iBAR sequence is about 6 nt in length, hereinafter referred to as "iBAR6". In some embodiments, each iBAR sequence within the sgRNA iBAR library is about 6 nt in length.

The iBAR sequence can have any suitable sequence. In some embodiments, the iBAR sequence is a DNA sequence composed of any of the nucleotides A, T, C, and/or G. In some embodiments, the iBAR sequence is an RNA sequence composed of any of the nucleotides A, U, C, and/or G. In some embodiments, the iBAR sequence contains unconventional or modified nucleotides other than A, T/U, C, and G. In some embodiments, each iBAR sequence is 6 nucleotides in length and consists of A, T, C, and G nucleotides. In some embodiments, the iBAR sequence in the encoded sgRNA iBAR is 6 nucleotides in length and consists of A, U, C, and G nucleotides.
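The length and composition constraints above can be expressed as a small check. This is a minimal sketch: the function names `is_valid_ibar` and `encoded_rna` are hypothetical, and the sketch assumes the DNA iBAR is written as the sense strand, so that the encoded RNA differs only by T being replaced with U. It does not handle the unconventional or modified nucleotides mentioned in the text.

```python
DNA_ALPHABET = frozenset("ACGT")


def is_valid_ibar(seq: str, min_len: int = 1, max_len: int = 50) -> bool:
    """True if seq is a DNA iBAR of 1-50 nt made only of A, C, G and T."""
    return min_len <= len(seq) <= max_len and set(seq) <= DNA_ALPHABET


def encoded_rna(seq: str) -> str:
    """Return the RNA form of a sense-strand DNA iBAR (T replaced by U)."""
    if not is_valid_ibar(seq):
        raise ValueError(f"not a valid DNA iBAR: {seq!r}")
    return seq.replace("T", "U")


print(is_valid_ibar("ACGTTA"))  # True
print(is_valid_ibar("ACGU"))    # False: U is not a DNA nucleotide
print(encoded_rna("ACGTTA"))    # ACGUUA
```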

In some embodiments, the sets of iBAR sequences associated with the respective sets of sgRNA iBAR constructs in the sgRNA iBAR library are different from each other. In some embodiments, the iBAR sequences of at least two sgRNA iBAR constructs in different sets of sgRNA iBAR constructs are identical (e.g., a first set and a second set of sgRNA iBAR constructs have at least 1, 2, 3, 4, or more iBAR sequences in common, while the iBAR sequences of the sgRNA iBAR constructs within the same set are different from each other). In some embodiments, the iBAR sequences of at least two sets (e.g., at least about any of 2, 3, 4, 5, 10, 50, 100, 1000, or more sets) of sgRNA iBAR constructs in the sgRNA iBAR library are identical. In some embodiments, one or more identical iBAR sequences are used for one or more sgRNA iBAR constructs in each set of sgRNA iBAR constructs within the sgRNA iBAR library (while the iBAR sequences of the sgRNA iBAR constructs within the same set are different from each other). In some embodiments, the same set of iBAR sequences is used for each set of sgRNA iBAR constructs in the sgRNA iBAR library.
In some embodiments, it is not necessary to design different iBAR sets for different sets of sgRNA iBAR constructs. In some embodiments, a fixed set of iBARs is used for all sets of sgRNA iBAR constructs in the sgRNA iBAR library. In some embodiments, multiple iBAR sequences are randomly assigned to the different sets of sgRNA iBAR constructs in the sgRNA iBAR library. The iBAR strategy described herein, together with an improved analysis tool (MAGeCKiBAR; Zhu et al., Genome Biol. 2019; 20:20), can facilitate large-scale CRISPR/Cas screens for biomedical discovery in a variety of settings.
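The two assignment strategies described above, a fixed iBAR set reused for every guide and random assignment from a pool, can be sketched as follows. The guide identifiers, the four example iBARs, and the function names are hypothetical placeholders chosen for illustration; the only constraint taken from the text is that iBARs within one set must differ from each other, while different sets may share iBARs.

```python
import random
from itertools import product


def ibar_pool(length: int = 6) -> list[str]:
    """All DNA barcodes of the given length over A, C, G, T."""
    return ["".join(p) for p in product("ACGT", repeat=length)]


def assign_fixed(guides: list[str], ibars: list[str]) -> dict[str, list[str]]:
    """Reuse the same iBAR set for every guide (set of constructs)."""
    if len(set(ibars)) != len(ibars):
        raise ValueError("iBARs within a set must differ from each other")
    return {guide: list(ibars) for guide in guides}


def assign_random(guides: list[str], pool: list[str], per_set: int,
                  rng: random.Random) -> dict[str, list[str]]:
    """Draw per_set distinct iBARs for each guide; sets may overlap."""
    return {guide: rng.sample(pool, per_set) for guide in guides}


guides = ["geneA_site1", "geneA_site2", "geneB_site1"]  # hypothetical names
fixed = assign_fixed(guides, ["ACGTAC", "TGCATG", "GATTCA", "CCAGTG"])
randomized = assign_random(guides, ibar_pool(), per_set=4, rng=random.Random(0))

for guide in guides:
    print(guide, fixed[guide], randomized[guide])
```

`random.Random.sample` draws without replacement, so each randomly assigned set contains distinct iBARs as long as the pool itself has no duplicates.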

The iBAR sequence can be inserted into (including attached to) any suitable region of a guide RNA (e.g., an sgRNA) that does not affect the efficiency with which the gRNA guides a Cas nuclease (e.g., Cas9) to its target site. In some embodiments, the iBAR sequence is located at the 3' end of the sgRNA. In some embodiments, the iBAR sequence is located at the 5' end of the sgRNA. In some embodiments, the iBAR sequence is located at an internal position of the sgRNA. For example, an sgRNA can comprise various stem-loops that interact with the Cas nuclease in the CRISPR complex, and the iBAR sequence can be embedded in the loop region of any of the stem-loops. In some embodiments, each sgRNA iBAR sequence comprises, in the 5'-to-3' direction, a first stem-loop sequence and a second stem-loop sequence, wherein the first stem-loop sequence hybridizes with the second stem-loop sequence to form a double-stranded RNA (dsRNA) region that interacts with the Cas protein, and wherein the iBAR sequence is located between the 3' end of the first stem-loop sequence and the 5' end of the second stem-loop sequence. In some embodiments, the guide RNA (e.g., sgRNA) further comprises stem-loop 1, stem-loop 2, and/or stem-loop 3, and the iBAR sequence is inserted into the loop region of stem-loop 1, stem-loop 2, and/or stem-loop 3.

For example, the guide RNA of a CRISPR/Cas9 system can comprise a guide sequence that targets a genomic locus (e.g., a target site in a hit gene) and a guide hairpin sequence encoding a repeat:anti-repeat duplex and a tetraloop. In some embodiments, the iBAR is inserted into the tetraloop as an internal replicate. In the context of the endogenous CRISPR/Cas9 system, the crRNA hybridizes with the trans-activating crRNA (tracrRNA) to form a crRNA:tracrRNA duplex, which is loaded onto Cas9 to direct cleavage of cognate DNA sequences bearing an appropriate protospacer adjacent motif (PAM). The endogenous crRNA sequence can be divided into a guide region (20 nt) and a repeat region (12 nt), whereas the endogenous tracrRNA sequence can be divided into an anti-repeat region (14 nt) and three tracrRNA stem-loops. In some embodiments, the sgRNA binds the target DNA to form a T-shaped architecture comprising a guide:target heteroduplex, a repeat:anti-repeat duplex, and stem-loops 1–3.
In some embodiments, the repeat and anti-repeat parts are connected by the tetraloop and form the repeat:anti-repeat duplex, which is connected to stem-loop 1 by a single nucleotide (A51), whereas stem-loops 1 and 2 are connected by a 5-nt single-stranded linker (nucleotides 63–67). In some embodiments, the guide sequence (nucleotides 1–20) and the target DNA (nucleotides 1′–20′) form the guide:target heteroduplex via 20 Watson-Crick base pairs, and the repeat (nucleotides 21–32) and anti-repeat (nucleotides 37–50) form the repeat:anti-repeat duplex via 9 Watson-Crick base pairs (U22:A49–A26:U45 and G29:C40–A32:U37). In some embodiments, the tracrRNA tail (nucleotides 68–81 and 82–96) forms stem-loops 2 and 3 via 4 and 6 Watson-Crick base pairs (A69:U80–U72:A77 and G82:C96–G87:C91), respectively. The crystal structure of an exemplary CRISPR/Cas9 system is described by Nishimasu et al. (Nishimasu et al. "Crystal structure of Cas9 in complex with guide RNA and target DNA." Cell. 2014; 156:935–949), the content of which is incorporated herein by reference in its entirety.
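The nucleotide coordinates quoted above can be checked for internal consistency with a short sketch. The guide, repeat, anti-repeat, A51, linker, stem-loop 2, and stem-loop 3 coordinates are taken from the text; the tetraloop (33–36) and stem-loop 1 (52–62) ranges are not stated explicitly and are inferred here from the gaps between the stated ranges, so they should be read as assumptions. The names `SEGMENTS` and `check_tiling` are illustrative.

```python
# (name, first nucleotide, last nucleotide), 1-based and inclusive.
SEGMENTS = [
    ("guide", 1, 20),
    ("repeat", 21, 32),
    ("tetraloop", 33, 36),          # inferred from the gap between 32 and 37
    ("anti-repeat", 37, 50),
    ("A51", 51, 51),
    ("stem-loop 1", 52, 62),        # inferred from the gap between 51 and 63
    ("linker", 63, 67),
    ("stem-loop 2", 68, 81),
    ("stem-loop 3", 82, 96),
]


def check_tiling(segments: list[tuple[str, int, int]]) -> int:
    """Verify the segments are contiguous from position 1; return total length."""
    expected_start = 1
    for name, start, end in segments:
        if start != expected_start or end < start:
            raise ValueError(f"segment {name!r} does not follow the previous one")
        expected_start = end + 1
    return expected_start - 1


lengths = {name: end - start + 1 for name, start, end in SEGMENTS}
total = check_tiling(SEGMENTS)

print(total)                   # 96
print(lengths["repeat"])       # 12, matching the 12-nt repeat region
print(lengths["anti-repeat"])  # 14, matching the 14-nt anti-repeat region
print(lengths["tetraloop"])    # 4
```

An iBAR of length n embedded in the tetraloop would shift every downstream coordinate by n if it is inserted rather than substituted; the exact insertion position is not specified in this passage.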

In some embodiments, the iBAR sequence is inserted into the tetraloop, or the loop region of the repeat:anti-repeat stem-loop of the sgRNA. In some embodiments, the iBAR sequence of each sgRNA iBAR in the library is inserted into the loop region of the repeat-anti-repeat stem-loop. The tetraloop of the Cas9 sgRNA scaffold lies outside the Cas9-sgRNA ribonucleoprotein complex and has been engineered for various purposes without affecting the activity of the upstream guide sequence (Gilbert et al. Cell 159, 647-661 (2014); Zhu et al. Methods Mol Biol 1656, 175-181 (2017)). The applicant previously demonstrated in WO2020125762 that a 6-nt iBAR (iBAR6) can be embedded in the tetraloop of a typical Cas9 sgRNA scaffold without affecting the gene-editing efficiency of the sgRNA or increasing off-target effects, and that iBAR6 shows no sequence bias. The exemplary iBAR6 yields 4,096 (4^6) barcode combinations, which provides sufficient variation for high-throughput screening (see Figure 1A of WO2020125762).
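The figure of 4,096 follows from four possible bases at each of six positions. A minimal sketch, assuming the barcode is written in the DNA alphabet of the encoding construct; the function name `all_ibars` is hypothetical.

```python
from itertools import product


def all_ibars(length=6, alphabet="ACGT"):
    """Enumerate every barcode of the given length over the given alphabet."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


barcodes = all_ibars(6)
print(len(barcodes))  # 4 ** 6 possible iBAR6 sequences
```

In practice a library would use only a handful of these per guide; the enumeration simply shows the size of the space from which they are drawn.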

Guide sequence

The guide sequence hybridizes with a target sequence (e.g., a target site in a hit gene) and directs sequence-specific binding of the CRISPR complex to the target sequence. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about any of 50%, 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more (e.g., 100% complementary). A guide sequence that is "complementary" to a target site or hit gene may be fully or partially complementary to that target site or hit gene (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary). Optimal alignment may be determined using any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, and algorithms based on the Burrows-Wheeler transform. In some embodiments, the guide sequence is about or more than about any of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more nucleotides in length. In some embodiments, the guide sequence comprises about 17 to about 23 nucleotides. The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay. For example, the components of a CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR system, followed by an assessment of preferential cleavage within the target sequence. Similarly, cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence and the components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing the binding and cleavage rates at the target sequence between the test and control guide sequence reactions.
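As a rough illustration of "degree of complementarity", the sketch below scores an RNA guide against an equal-length DNA target strand position by position. It is a deliberate simplification: it allows no gaps and is not one of the alignment algorithms named above, and the guide sequence and the function names `perfect_target` and `percent_complementarity` are hypothetical.

```python
# Partner on the DNA target strand for each RNA base (Watson-Crick pairing).
DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def perfect_target(guide_rna):
    """Return the fully complementary DNA target strand, written 5'->3'."""
    return "".join(DNA_PARTNER[base] for base in reversed(guide_rna.upper()))


def percent_complementarity(guide_rna, target_dna):
    """Percentage of guide positions that pair with an equal-length target strand.

    Both sequences are written 5'->3'. The strands pair antiparallel, so the
    target is read in reverse. No gaps are allowed (a simplification).
    """
    if not guide_rna or len(guide_rna) != len(target_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(guide_rna.upper(), reversed(target_dna.upper()))
    matches = sum(DNA_PARTNER.get(g) == t for g, t in pairs)
    return 100.0 * matches / len(guide_rna)


guide = "GACGUUACCGGAUCAAGCUU"        # hypothetical 20-nt guide
target = perfect_target(guide)
one_mismatch = "C" + target[1:]       # alter the base that pairs with the guide's 3' end
print(percent_complementarity(guide, target))
print(percent_complementarity(guide, one_mismatch))
```

A gapped aligner would be needed for targets of different length or with insertions and deletions.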

In some embodiments, the guide sequence can be as short as about 10 nucleotides and as long as about 30 nucleotides. In some embodiments, the guide sequence is about any of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length. Synthetic guide sequences can be about 20 nucleotides long but may be longer or shorter. For example, a guide sequence for the CRISPR/Cas9 system may consist of 20 nucleotides complementary to the target sequence (e.g., a target site in a hit gene); that is, the guide sequence may be identical to the 20 nucleotides upstream of the PAM sequence, except for the T/U difference between DNA and RNA. In some embodiments, the guide sequence comprises about 17 to about 23 nucleotides. In some embodiments, the guide sequences of all sgRNAs or sgRNA iBARs in the library have the same length. In some embodiments, the guide sequences of at least two sgRNAs or sgRNA iBARs in the library have different lengths. In some embodiments, the guide sequences within a set of sgRNA iBAR constructs have the same length. In some embodiments, the guide sequences within a set of sgRNA iBAR constructs have different lengths. In some embodiments, the guide sequences within one set of sgRNA iBAR constructs have a different length from the guide sequences within another set of sgRNA iBAR constructs.
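The relationship between a guide and the 20 nucleotides upstream of the PAM can be sketched as follows. This is an illustration under stated assumptions: it assumes an NGG PAM (typical of S. pyogenes Cas9, but not specified in this paragraph), scans only the strand as written, and uses a hypothetical locus and the hypothetical function name `find_guides`.

```python
import re


def find_guides(dna, guide_len=20):
    """Return (start, guide_rna, pam) for each NGG PAM on the given strand.

    start is the 0-based index of the first protospacer base in dna.
    Only the strand as written is scanned; the reverse strand is ignored.
    """
    dna = dna.upper()
    guides = []
    # The lookahead lets overlapping PAM candidates all be reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < guide_len:
            continue  # not enough sequence upstream of this PAM
        protospacer = dna[pam_start - guide_len:pam_start]
        # The guide is the protospacer with T written as U.
        guides.append((pam_start - guide_len, protospacer.replace("T", "U"), match.group(1)))
    return guides


locus = "TT" + "GACGTTACCGGATCAAGCTT" + "AGG" + "CA"  # hypothetical locus
for start, guide, pam in find_guides(locus):
    print(start, guide, pam)
```

A real design tool would also scan the reverse complement and score candidates for on-target and off-target activity.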

In some embodiments, the guide sequences within a set of sgRNA iBAR constructs are identical. In some embodiments, the guide sequences within a set of sgRNA iBAR constructs are identical, and the guide sequence of each set of sgRNA iBAR constructs is complementary to a different target (e.g., a different hit gene, or a different target site of the same hit gene). In some embodiments, the guide sequences of at least two sets of sgRNA iBAR constructs are complementary to two different target sites of the same hit gene. In some embodiments, the guide sequences of three sets of sgRNA iBAR constructs are complementary to three different target sites of the same hit gene. In some embodiments, each hit gene is targeted at at least two (e.g., 2, 3, 4, or more, such as 3) different target sites by at least two (e.g., 2, 3, 4, or more, such as 3) guide sequences of at least two (e.g., 2, 3, 4, or more, such as 3) sets of sgRNA iBAR constructs. In some embodiments, the guide sequence in each set of sgRNA iBAR constructs is complementary to a different hit gene of the genome.

The guide sequence of an sgRNA construct or sgRNA iBAR construct can be designed according to any method known in the art. Guide sequences can target coding regions, such as exons or splice sites, or the 5' untranslated region (UTR) or 3' UTR of a gene of interest. For example, the reading frame of a gene may be disrupted by indels mediated by a double-strand break (DSB) at the target site of the guide RNA. Alternatively, guide RNAs targeting the 5' end of the coding sequence can be used to generate gene knockouts with high efficiency. Guide sequences can be designed and optimized according to specific sequence features to achieve high on-target gene-editing activity and low off-target effects. For example, the GC content of a guide sequence can be about 20% to about 70%, and sequences containing homopolymer stretches (e.g., TTTT, GGGG) can be avoided.
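The two sequence-feature rules just mentioned can be expressed as a simple filter. This is a minimal sketch, not a full design tool: the 20% to 70% GC window comes from the text, while treating a run of four identical bases as a homopolymer stretch is an assumption based on the TTTT and GGGG examples. All function names are hypothetical.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_homopolymer(seq, run=4):
    """True if seq contains `run` or more identical bases in a row."""
    return re.search(r"(.)\1{%d,}" % (run - 1), seq.upper()) is not None


def passes_filters(seq, gc_min=0.20, gc_max=0.70, run=4):
    """Apply the GC-content window and the homopolymer rule to one guide."""
    return gc_min <= gc_fraction(seq) <= gc_max and not has_homopolymer(seq, run)


candidates = [
    "GACGTTACCGGATCAAGCTT",  # hypothetical guide, balanced GC
    "GACGTTTTCCGGATCAAGCT",  # contains TTTT
    "ATATATATATATATATATAT",  # no G or C
]
for guide in candidates:
    print(guide, passes_filters(guide))
```

The thresholds are parameters so that a stricter or looser window can be tried without changing the functions.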

Guide sequences can be designed to target any genomic locus of interest (e.g., any target site of any hit gene). In some embodiments, the guide sequence targets a protein-coding gene. In some embodiments, the guide sequence targets a gene encoding an RNA, such as a small RNA (e.g., microRNA, piRNA, siRNA, snoRNA, tRNA, rRNA, and snRNA), a ribosomal RNA, or a long non-coding RNA (lincRNA). In some embodiments, the guide sequence targets a non-coding region of the genome. In some embodiments, the guide sequence targets a chromosomal locus. In some embodiments, the guide sequence targets an extrachromosomal locus. In some embodiments, the guide sequence targets a mitochondrial gene. In some embodiments, the guide sequence is complementary to a target site of any annotated gene in a genome (e.g., the human genome). In some embodiments, the guide sequence targets a gene whose DNA mutation frequency in cancer patients (e.g., based on the literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher). In some embodiments, the guide sequence targets a gene whose RNA expression level in cancer patients (e.g., based on the literature or databases) is up-regulated or down-regulated by more than about 1.2-fold (e.g., more than about any of 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold, or higher). In some embodiments, the guide sequence targets a gene whose DNA mutation frequency in cancer patients (e.g., based on the literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher) and whose RNA expression level is up-regulated or down-regulated by more than about 2-fold (e.g., more than about any of 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold, or higher). In some embodiments, the guide sequence targets a gene encoding a protein that is expressed intracellularly or on the cell surface (in healthy cells or cancer cells). In some embodiments, the guide sequence targets a region of the genome that has no gene annotation (a "non-genic region"). An sgRNA or sgRNA iBAR construct comprising or encoding a guide sequence complementary to a non-genic region can be used as a negative control.
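The combined selection criterion (mutation frequency of at least about 5% together with more than about 2-fold up- or down-regulation) can be sketched as a filter over per-gene statistics. Everything below is hypothetical: the `GeneStats` record, the gene labels, and the numbers are placeholders, and expression change is assumed to be given as a cancer-to-normal ratio, so that 2-fold down-regulation corresponds to a ratio below 0.5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneStats:
    symbol: str
    mutation_frequency: float  # fraction of cancer patients carrying a DNA mutation
    expression_ratio: float    # cancer / normal RNA expression, must be > 0


def is_candidate(gene, min_mutation=0.05, min_fold=2.0):
    """True if the gene meets both the mutation and the expression criterion."""
    up = gene.expression_ratio > min_fold
    down = gene.expression_ratio < 1.0 / min_fold
    return gene.mutation_frequency >= min_mutation and (up or down)


# Placeholder records for illustration only; not real measurements.
genes = [
    GeneStats("GENE_A", 0.12, 3.1),   # mutated and up-regulated
    GeneStats("GENE_B", 0.08, 0.4),   # mutated and down-regulated
    GeneStats("GENE_C", 0.30, 1.3),   # mutated, expression nearly unchanged
    GeneStats("GENE_D", 0.01, 5.0),   # up-regulated, rarely mutated
]
selected = [g.symbol for g in genes if is_candidate(g)]
print(selected)
```

Setting `min_fold=1.2` and dropping the mutation test would give the expression-only criterion described earlier in the paragraph.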

In some embodiments, the guide sequence is designed to repress or inactivate expression of any hit gene or target gene of interest. The hit gene or target gene may be an endogenous gene or a transgene. In some embodiments, the hit gene or target gene may be known to be associated with a particular phenotype. In some embodiments, the hit gene or target gene is a gene that has not been implicated in a particular phenotype, such as a known gene that is not known to be associated with a particular phenotype, or an unknown gene that has not yet been characterized. In some embodiments, the region targeted by the guide sequence is located on a different chromosome from the hit gene or target gene.

Other sgRNA or sgRNA iBAR components

In some embodiments, the sgRNA or sgRNA iBAR comprises additional sequence elements that promote formation of a CRISPR complex with the Cas protein. In some embodiments, the sgRNA or sgRNA iBAR comprises a second sequence comprising a repeat-anti-repeat stem-loop. The repeat-anti-repeat stem-loop comprises a tracr mate sequence fused, via a loop region, to a tracr sequence that is complementary to the tracr mate sequence.

Typically, in the context of an endogenous CRISPR/Cas9 system, formation of a CRISPR complex (comprising a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins) results in cleavage of one or both strands in or near (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from) the target sequence. The tracr sequence, which may comprise or consist of all or a portion of a wild-type tracr sequence (e.g., about or more than about any of 20, 26, 32, 45, 48, 54, 63, 67, 85, or more nucleotides of a wild-type tracr sequence), may also form part of a CRISPR complex, such as by hybridization along at least a portion of the tracr sequence to all or a portion of a tracr mate sequence that is operably linked to the guide sequence. In some embodiments, the tracr sequence has sufficient complementarity to the tracr mate sequence to hybridize and participate in formation of a CRISPR complex. As with the target sequence, it is believed that complete complementarity is not needed, provided that there is enough to be functional. In some embodiments, the tracr sequence has at least about any of 50%, 60%, 70%, 80%, 90%, 95%, or 99% sequence complementarity along the length of the tracr mate sequence when optimally aligned. Determining optimal alignment is within the ability of one of ordinary skill in the art. For example, there are publicly and commercially available alignment algorithms and programs, such as, but not limited to, ClustalW, Smith-Waterman in Matlab, Bowtie, Geneious, Biopython, and SeqMan. In some embodiments, the tracr sequence is about or more than about any of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or more nucleotides in length. Any known tracr mate sequence and tracr sequence derived from a naturally occurring CRISPR system may be used, such as the tracr mate sequences and tracr sequences from the Streptococcus pyogenes (S. pyogenes) CRISPR/Cas9 system described in US8697359 and those described herein.

In some embodiments, the tracr sequence and the tracr mate sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a stem-loop (also known as a hairpin), referred to as the "repeat-anti-repeat stem-loop".

In some embodiments, the loop region of the stem-loop in an sgRNA construct without an iBAR sequence is four nucleotides in length, and such a loop region is also referred to as a "tetraloop". In some embodiments, the loop region has the sequence GAAA. However, longer or shorter loop sequences may be used, as may alternative sequences, such as sequences comprising a nucleotide triplet (e.g., AAA) and an additional nucleotide (e.g., C or G). In some embodiments, the sequence of the loop region is CAAA or AAAG. In some embodiments, the iBAR is inserted into the loop region, such as the tetraloop. For example, the iBAR sequence may be inserted before the first nucleotide, between the first and second nucleotides, between the second and third nucleotides, between the third and fourth nucleotides, or after the fourth nucleotide of the tetraloop. In some embodiments, the iBAR sequence replaces one or more nucleotides in the loop region.
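The five insertion positions and the replacement option can be made concrete with a short sketch. The GAAA tetraloop comes from the text; the 6-nt barcode `CUCGAU` and the function names `insert_ibar` and `replace_with_ibar` are hypothetical.

```python
def insert_ibar(loop, ibar, position):
    """Insert ibar into loop after `position` loop nucleotides (0 = before the first)."""
    if not 0 <= position <= len(loop):
        raise ValueError("position must lie between 0 and the loop length")
    return loop[:position] + ibar + loop[position:]


def replace_with_ibar(loop, ibar, start, count):
    """Replace `count` loop nucleotides, beginning at 0-based index `start`, with ibar."""
    if count < 1 or start < 0 or start + count > len(loop):
        raise ValueError("replaced span must lie inside the loop")
    return loop[:start] + ibar + loop[start + count:]


TETRALOOP = "GAAA"
IBAR = "CUCGAU"  # hypothetical 6-nt barcode

# Positions 0..4 correspond to the five insertion sites listed in the text.
for position in range(len(TETRALOOP) + 1):
    print(position, insert_ibar(TETRALOOP, IBAR, position))

# Replacing the whole tetraloop leaves only the barcode in the loop region.
print(replace_with_ibar(TETRALOOP, IBAR, 0, 4))
```

Whether a given position or replacement preserves sgRNA activity is an experimental question that the sketch does not address.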

In some embodiments, the sgRNA iBAR comprises two or more stem-loops. In some embodiments, the sgRNA iBAR has two, three, four, or five stem-loops. In some embodiments, the sgRNA iBAR has at most five hairpins. In some embodiments, the sgRNA or sgRNA iBAR construct further comprises a transcription termination sequence, such as a polyT sequence, e.g., six T nucleotides.

In some embodiments, wherein the Cas protein is Cas9, each sgRNA or sgRNA iBAR comprises a guide sequence fused to a second sequence comprising a repeat-anti-repeat stem-loop that interacts with Cas9. In some embodiments, the iBAR sequence is inserted into the loop region of the repeat-anti-repeat stem-loop. In some embodiments, the iBAR sequence replaces one or more nucleotides in the loop region of the repeat-anti-repeat stem-loop. In some embodiments, the second sequence of each sgRNA or sgRNA iBAR further comprises stem-loop 1, stem-loop 2, and/or stem-loop 3. In some embodiments, the iBAR sequence is inserted into the loop region of stem-loop 1. In some embodiments, the iBAR sequence replaces one or more nucleotides in the loop region of stem-loop 1. In some embodiments, the iBAR sequence is inserted into the loop region of stem-loop 2. In some embodiments, the iBAR sequence replaces one or more nucleotides in the loop region of stem-loop 2. In some embodiments, the iBAR sequence is inserted into the loop region of stem-loop 3. In some embodiments, the iBAR sequence replaces one or more nucleotides in the loop region of stem-loop 3.

In some embodiments, each sgRNA iBAR comprises, in the 5'-to-3' direction, a first stem-loop sequence and a second stem-loop sequence, wherein the first stem-loop sequence hybridizes with the second stem-loop sequence to form a double-stranded RNA (dsRNA) region that interacts with the Cas protein, and wherein the iBAR sequence is located between the 3' end of the first stem-loop sequence and the 5' end of the second stem-loop sequence.

In the CRISPR/Cas9 system, a guide RNA can be used to direct cleavage of genomic DNA by the Cas9 nuclease. For example, the guide RNA may consist of a nucleotide spacer of variable sequence (the guide sequence), which targets the CRISPR/Cas system nuclease to a genomic location in a sequence-specific manner, and an invariant hairpin sequence, which is constant among different guide RNAs and allows the guide RNA to bind to the Cas nuclease. In some embodiments, the CRISPR/Cas guide RNA comprises a CRISPR/Cas variable guide sequence that is homologous or complementary to a target genomic sequence in a host cell (e.g., a target site of a hit gene) and an invariant hairpin sequence that, when transcribed, is capable of binding a Cas nuclease (e.g., Cas9), wherein the hairpin sequence encodes a repeat:anti-repeat duplex and a tetraloop, and the iBAR is embedded in the tetraloop region.

The guide sequence of a CRISPR/Cas9 guide RNA may be about any of 17-23, 18-22, or 19-21 nucleotides in length. The guide sequence can target the Cas nuclease to a genomic locus in a sequence-specific manner and can be designed following general principles known in the art. The invariant guide RNA hairpin sequence can be provided according to common general knowledge in the art, for example, as disclosed by Nishimasu et al. (Nishimasu H, et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell. 2014; 156:935–949). Any invariant hairpin sequence may be used, provided that it is capable of binding a Cas nuclease once transcribed.

Previous studies have shown that, although an sgRNA with a 48-nt tracrRNA tail (referred to as sgRNA(+48)) is the minimal region for Cas9-catalyzed DNA cleavage in vitro (Jinek et al., 2012), sgRNAs with extended tracrRNA tails, sgRNA(+67) and sgRNA(+85), can improve Cas9 cleavage activity in vivo (Hsu et al., 2013). In some embodiments, the sgRNA or sgRNA iBAR comprises stem-loop 1, stem-loop 2, and/or stem-loop 3. The stem-loop 1, stem-loop 2, and/or stem-loop 3 regions may improve editing efficiency in the CRISPR/Cas9 system.

In some embodiments, the sgRNA comprises, from 5' to 3': a guide sequence, a repeat-anti-repeat stem-loop, stem-loop 1, stem-loop 2, and stem-loop 3. In some embodiments, the sgRNA iBAR comprises, from 5' to 3': a guide sequence, a repeat-anti-repeat stem-loop with the iBAR sequence inserted into its loop region, stem-loop 1, stem-loop 2, and stem-loop 3.
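The 5'-to-3' order described above can be sketched as a simple concatenation. The component sequences below are an assumption (the commonly used S. pyogenes sgRNA scaffold, which this text does not quote), and the guide, the barcode, and the function name `build_sgrna` are hypothetical.

```python
# Assumption: components of the commonly used S. pyogenes sgRNA scaffold (not quoted in the text).
REPEAT = "GUUUUAGAGCUA"
TETRALOOP = "GAAA"
ANTI_REPEAT = "UAGCAAGUUAAAAU"
CONNECTOR = "A"                  # single nucleotide joining the duplex to stem-loop 1
STEM_LOOP_1 = "AGGCUAGUCCG"
LINKER = "UUAUC"                 # single-stranded linker between stem-loops 1 and 2
STEM_LOOP_2 = "AACUUGAAAAAGUG"
STEM_LOOP_3 = "GCACCGAGUCGGUGC"


def build_sgrna(guide, ibar="", loop_position=2):
    """Assemble an sgRNA 5'->3'; a non-empty ibar is inserted into the tetraloop.

    loop_position is the number of tetraloop nucleotides kept 5' of the iBAR.
    """
    if not 0 <= loop_position <= len(TETRALOOP):
        raise ValueError("loop_position must lie between 0 and the loop length")
    loop = TETRALOOP[:loop_position] + ibar + TETRALOOP[loop_position:]
    return "".join([
        guide,
        REPEAT, loop, ANTI_REPEAT,  # repeat-anti-repeat stem-loop
        CONNECTOR, STEM_LOOP_1,
        LINKER, STEM_LOOP_2,
        STEM_LOOP_3,
    ])


guide = "GACGUUACCGGAUCAAGCUU"           # hypothetical 20-nt guide
plain = build_sgrna(guide)               # sgRNA without a barcode
barcoded = build_sgrna(guide, "CUCGAU")  # sgRNA iBAR with a hypothetical iBAR6
print(len(plain), len(barcoded))
```

Keeping each element as a named constant makes it straightforward to move the barcode to another loop or to append a termination sequence to the encoding DNA.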

Vectors (vectors/vehicles)

In some embodiments, the sgRNA construct comprises one or more regulatory elements operably linked to the guide RNA sequence. In some embodiments, the sgRNA iBAR construct comprises one or more regulatory elements operably linked to the guide RNA sequence and the iBAR sequence. Exemplary regulatory elements include, but are not limited to, promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cells and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).

The sgRNA or sgRNA iBAR construct may be present in a vector. In some embodiments, the vector is suitable for replication and integration in eukaryotic cells, such as mammalian cells (e.g., cancer cells). In some embodiments, the sgRNA or sgRNA iBAR construct is an expression vector, such as a viral vector or a plasmid. Examples of viral vectors include, but are not limited to, adenoviral vectors, adeno-associated viral vectors, lentiviral vectors, retroviral vectors, herpes simplex virus vectors, and derivatives thereof. Viral vector technology is well known in the art and is described, for example, in Sambrook et al. 2001, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York, and in other virology and molecular biology manuals. Those skilled in the art will appreciate that the design of the expression vector can depend on factors such as the choice of the host cell to be transformed and the level of expression desired. In some embodiments, the sgRNA or sgRNA iBAR construct is a lentiviral vector. In some embodiments, the sgRNA or sgRNA iBAR construct is a virus. In some embodiments, the sgRNA or sgRNA iBAR construct is an adenovirus or an adeno-associated virus. In some embodiments, the sgRNA or sgRNA iBAR construct is a lentivirus. In some embodiments, the vector further comprises a selectable marker. In some embodiments, the vector further comprises one or more nucleotide sequences encoding one or more elements of the CRISPR/Cas system, such as a nucleotide sequence encoding a Cas nuclease (e.g., Cas9). In some embodiments, there is provided a vector system comprising one or more vectors encoding nucleotide sequences that encode one or more elements of the CRISPR/Cas system, and a vector comprising any one of the sgRNA or sgRNA iBAR constructs described herein. A vector may include one or more of the following elements: an origin of replication, one or more regulatory sequences that regulate expression of the polypeptide of interest (e.g., promoters and/or enhancers), and/or one or more selectable marker genes (e.g., antibiotic resistance genes or genes encoding fluorescent proteins).

A number of virus-based systems have been developed for gene transfer into mammalian cells. For example, retroviruses provide a convenient platform for gene delivery systems. A heterologous nucleic acid can be inserted into a vector and packaged in retroviral particles using techniques known in the art. The recombinant virus can then be isolated and delivered to engineered mammalian cells in vitro or ex vivo. A number of retroviral systems are known in the art. In some embodiments, adenoviral vectors are used. A number of adenoviral vectors are known in the art. In some embodiments, lentiviral vectors are used. In some embodiments, self-inactivating lentiviral vectors are used. Self-inactivating lentiviral vectors can be packaged into lentiviruses using protocols known in the art. The resulting lentiviruses can be used to transduce mammalian cells (e.g., cancer cells) using methods known in the art. Vectors derived from retroviruses such as lentiviruses are suitable tools for achieving long-term gene transfer, because they allow long-term, stable integration of a transgene and its propagation in progeny cells. Lentiviral vectors also have low immunogenicity and can transduce non-proliferating cells.

In some embodiments, the vector is a non-viral vector. In some embodiments, the vector is a transposon, such as the Sleeping Beauty transposon system or the PiggyBac transposon system. In some embodiments, the vector is a polymer-based non-viral vector, including, for example, poly(lactic-co-glycolic acid) (PLGA), polylactic acid (PLA), poly(ethyleneimine) (PEI), and dendrimers. In some embodiments, the vector is a cationic lipid-based non-viral vector, such as cationic liposomes, lipid nanoemulsions, and solid lipid nanoparticles (SLN). In some embodiments, the vector is a peptide-based non-viral gene vector, such as poly-L-lysine. Any known non-viral vector suitable for gene editing can be used to introduce a nucleic acid encoding an sgRNA or sgRNA iBAR into cancer cells. See, for example, Yin H. et al. Nature Rev. Genetics (2014) 15:521-555; Aronovich EL et al. "The Sleeping Beauty transposon system: a non-viral vector for gene therapy." Hum. Mol. Genet. (2011) R1: R14-20; and Zhao S. et al. "PiggyBac transposon vectors: the tools of the human gene editing." Transl. Lung Cancer Res. (2016) 5(1): 120-125, which are incorporated herein by reference. In some embodiments, any one or more of the nucleic acids encoding the sgRNAs or sgRNA iBARs described herein are introduced into cancer cells by a physical method, including, but not limited to, electroporation, sonoporation, photoporation, magnetofection, and hydroporation.

In some embodiments, the nucleic acid encoding the sgRNA or sgRNA iBAR and the one or more nucleic acids encoding one or more elements of the CRISPR/Cas system (e.g., a Cas nuclease, such as Cas9) are located on different vectors (e.g., viral vectors such as lentiviral vectors). In some embodiments, the nucleic acid encoding the sgRNA or sgRNA iBAR and the one or more nucleic acids encoding one or more elements of the CRISPR/Cas system are on the same vector. In some embodiments, the nucleic acid encoding the sgRNA or sgRNA iBAR and the one or more nucleic acids encoding one or more elements of the CRISPR/Cas system are operably controlled by separate promoters. In some embodiments, the nucleic acid encoding the sgRNA or sgRNA iBAR and the one or more nucleic acids encoding one or more elements of the CRISPR/Cas system are operably controlled by the same promoter. In some embodiments, the nucleic acid encoding the sgRNA or sgRNA iBAR and the one or more nucleic acids encoding one or more elements of the CRISPR/Cas system are linked by one or more linker sequences (e.g., an IRES).

Nucleic acids can be cloned into vectors using any molecular cloning method known in the art, including, for example, the use of restriction endonuclease sites and one or more selectable markers. In some embodiments, the nucleic acid is operably linked to a promoter. A variety of promoters have been explored for gene expression in mammalian cells, and any promoter known in the art may be used in the present invention. Promoters can be broadly classified as constitutive promoters or regulated promoters, such as inducible promoters.

In some embodiments, the nucleic acid encoding the sgRNA or sgRNA iBAR and/or the one or more nucleic acids encoding one or more elements of the CRISPR/Cas system (e.g., Cas9) are operably linked to a constitutive promoter. Constitutive promoters allow heterologous genes (also referred to as transgenes) to be expressed constitutively in host cells. Exemplary promoters contemplated herein include, but are not limited to: the cytomegalovirus immediate-early promoter (CMV IE), the human elongation factor-1α promoter (hEF1α), the ubiquitin C promoter (UbiC), the phosphoglycerate kinase promoter (PGK), the simian virus 40 early promoter (SV40), the chicken β-actin promoter coupled with the CMV early enhancer (CAGG), the Rous sarcoma virus (RSV) promoter, the polyoma enhancer/herpes simplex thymidine kinase (MC1) promoter, the β-actin (β-ACT) promoter, and the "myeloproliferative sarcoma virus enhancer, negative control region deleted, dl587rev primer-binding site substituted" (MND) promoter. The efficiencies of such constitutive promoters in driving transgene expression have been widely compared in a large number of studies.

In some embodiments, the nucleic acid encoding the sgRNA or sgRNA iBAR and/or the one or more nucleic acids encoding one or more elements of the CRISPR/Cas system (e.g., Cas9) are operably linked to an inducible promoter. Inducible promoters belong to the category of regulated promoters. An inducible promoter can be induced by one or more conditions, such as a physical condition, the microenvironment of the cancer cell (e.g., an engineered cancer cell), the physiological state of the cancer cell, an inducer (i.e., an inducing agent), or a combination thereof. In some embodiments, the inducing condition does not induce the expression of endogenous genes in the engineered cancer cell and/or in a subject receiving the cancer cell therapy. In some embodiments, the inducing condition is selected from the group consisting of: an inducer, irradiation (such as ionizing radiation or light), temperature (such as heat), redox state, tumor environment, and the activation state of the engineered cancer cell. In some embodiments, the inducible promoter can be an NFAT promoter, a TETON® promoter, or an NFκB promoter.

Library

The sgRNA libraries described herein comprise one or more sgRNA constructs, wherein each sgRNA construct comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a corresponding hit gene. Depending on the needs of the genetic screen, the sgRNA libraries described herein can be designed to target one or more genomic loci (e.g., multiple target sites in one or more hit genes in the genome). In some embodiments, a single sgRNA construct is designed to target each hit gene. In some embodiments, multiple (e.g., at least about any of 2, 3, 4, 5, 10, 20, 100, or more) sgRNA constructs with different guide sequences targeting a single hit gene can be designed. For example, such multiple sgRNA constructs may comprise or encode guide sequences targeting different target sites of a single hit gene, such as 3 (or about 6 to about 12) different target sites of a single hit gene.
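The percent-complementarity thresholds above can be illustrated with a minimal sketch. It assumes a gapless, position-by-position comparison of equal-length sequences written in DNA letters; the function name and the pairing table are illustrative assumptions, not a method defined in this document.

```python
# Watson-Crick pairing for DNA letters (illustrative assumption; an RNA guide
# would use U in place of T).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(guide: str, target: str) -> float:
    """Return the percentage of guide positions that base-pair with the target.

    Both sequences are written 5'->3'. The target is read in reverse so that
    antiparallel positions line up. No gaps or bulges are considered.
    """
    guide, target = guide.upper(), target.upper()
    if not guide or len(guide) != len(target):
        raise ValueError("guide and target must be non-empty and of equal length")
    paired = sum(
        1 for g, t in zip(guide, reversed(target)) if COMPLEMENT.get(g) == t
    )
    return 100.0 * paired / len(guide)


def meets_threshold(guide: str, target: str, minimum: float = 90.0) -> bool:
    """Check a guide against a chosen complementarity cutoff (hypothetical default)."""
    return percent_complementarity(guide, target) >= minimum
```

A guide that pairs at every position scores 100%; each mismatch in a 20-nt guide lowers the score by 5 percentage points.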

An sgRNA library comprising one or more sgRNA iBAR constructs is also referred to herein as an sgRNA iBAR library, wherein each sgRNA construct comprises or encodes an iBAR sequence. The sgRNA iBAR libraries described herein comprise one or more sgRNA iBAR constructs, wherein each sgRNA iBAR construct comprises or encodes an sgRNA iBAR, and wherein each sgRNA iBAR comprises a guide sequence that is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a corresponding hit gene. Depending on the needs of the genetic screen, the sgRNA iBAR libraries described herein can be designed to target one or more genomic loci (e.g., multiple target sites in one or more hit genes in the genome). In some embodiments, a single sgRNA iBAR construct is designed to target each hit gene. In some embodiments, multiple (e.g., at least about any of 2, 3, 4, 5, 10, 20, or more) sgRNA iBAR constructs with different guide sequences targeting a single hit gene can be designed. For example, such multiple sgRNA iBAR constructs may comprise or encode guide sequences targeting different target sites of a single hit gene, such as 3 different target sites of a single hit gene.

In some embodiments, the sgRNA iBAR library described herein comprises one or more sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises 3 or more (e.g., 3, 4, 5, or more, such as 4) sgRNA iBAR constructs, each comprising or encoding an sgRNA iBAR, wherein each sgRNA iBAR comprises a guide sequence and an iBAR sequence, wherein the guide sequences of the 3 or more sgRNA iBAR constructs are identical, wherein the iBAR sequences of the 3 or more sgRNA iBAR constructs are different from each other, and wherein the guide sequence of each set of sgRNA iBAR constructs is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a different target site of a hit gene (e.g., a different hit gene, or a different site within the same hit gene). In some embodiments, each set of sgRNA iBAR constructs comprises 4 sgRNA iBAR constructs, and the iBAR sequences of the 4 sgRNA iBAR constructs are different from each other. In some embodiments, a single set of sgRNA iBAR constructs is designed to target each hit gene. In some embodiments, the sgRNA iBAR library comprises multiple sets (e.g., at least about any of 2, 3, 4, 5, 10, 20, or more sets) of sgRNA iBAR constructs with different guide sequences targeting a single hit gene. In some embodiments, the sgRNA iBAR library comprises at least 3 sets of sgRNA iBAR constructs designed to target 3 different target sites of each hit gene, wherein each set of sgRNA iBAR constructs comprises 4 sgRNA iBAR constructs. In some embodiments, the sgRNA iBAR library comprises at least about 100 sets of sgRNA iBAR constructs, such as at least about any of 200, 300, 400, 800, 1,000, 2,000, 3,000, 5,000, 10,000, 15,000, 19,000, 20,000, 40,000, 50,000, 100,000, 150,000, 200,000, or more sets of sgRNA iBAR constructs. In some embodiments, the sgRNA iBAR library comprises about 100 to about 30,000 sets of sgRNA iBAR constructs, such as about 1000 to about 4000, about 1000 to about 6000, or about 3000 to about 5000 sets of sgRNA iBAR constructs.
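The set structure described above (one guide sequence per set, several mutually distinct iBARs within a set, several sets per hit gene) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, the gene names, and all sequences are hypothetical placeholders, and reusing the same iBAR list across sets is an assumption of this sketch.

```python
def build_ibar_library(guides_by_gene, ibars, min_set_size=3):
    """Group constructs into sets that share a guide and differ only in iBAR.

    guides_by_gene: dict mapping a hit-gene name to a list of guide sequences,
                    one guide per target site.
    ibars:          list of barcode sequences; must be mutually distinct.
    Returns a list of sets; each set is a list of construct dicts.
    """
    if len(set(ibars)) != len(ibars):
        raise ValueError("iBAR sequences within a set must differ from each other")
    if len(ibars) < min_set_size:
        raise ValueError(f"each set needs at least {min_set_size} constructs")
    library = []
    for gene, guides in guides_by_gene.items():
        for guide in guides:
            # One set per guide: identical guide sequence, different iBARs.
            library.append(
                [{"gene": gene, "guide": guide, "ibar": ibar} for ibar in ibars]
            )
    return library


# Hypothetical example: 2 hit genes x 3 target sites each x 4 iBARs per set.
example_guides = {
    "GENE_A": ["ACGTACGTACGTACGTACGT", "TTGCATGCCTAGGATCCAAT", "GGCATTACGGATCCTAGCTA"],
    "GENE_B": ["CATGCATGACTGACTGACTG", "AATTCCGGAATTCCGGAATT", "GACTAGCTAGGCTAACGTTA"],
}
example_ibars = ["ACGTAC", "TGCATG", "GATTCA", "CCAGTT"]
example_library = build_ibar_library(example_guides, example_ibars)
n_sets = len(example_library)                           # 6 sets
n_constructs = sum(len(s) for s in example_library)     # 24 constructs
```

With 3 target sites per gene and 4 iBARs per set, the number of constructs is 12 per hit gene, which is how the set counts and construct counts given in this section relate to one another.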

In some embodiments, the sgRNA library or sgRNA iBAR library comprises at least about any of 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 400, 500, 1,000, 2,000, 3,000, 4,000, 5,000, 10,000, 15,000, 19,000, 20,000, 38,000, 39,000, 40,000, 50,000, 100,000, 150,000, 155,000, 200,000, or more sgRNA constructs or sgRNA iBAR constructs. In some embodiments, the sgRNA library or sgRNA iBAR library comprises at least about 100 (e.g., at least about any of 200, 300, 400, 600, 1000, 1200, 3000, 6000, 10,000, 20,000, or more) sgRNA constructs or sgRNA iBAR constructs, such as at least about 300 or about 400 sgRNA constructs or sgRNA iBAR constructs. In some embodiments, the sgRNA library comprises about 1000 to about 300,000 sgRNA constructs, such as about 6000 to about 14,000, about 1000 to about 20,000, about 1000 to about 5000, about 10,000 to about 200,000, about 15,000 to about 20,000, about 100,000 to about 300,000, or about 150,000 to about 180,000 sgRNA constructs. In some embodiments, the sgRNA iBAR library comprises about 1000 to about 1,200,000 sgRNA iBAR constructs, such as about 1000 to about 20,000, about 10,000 to about 18,000, about 1000 to about 5000, about 10,000 to about 200,000, about 15,000 to about 20,000, about 100,000 to about 300,000, about 300,000 to about 1,200,000, or about 150,000 to about 180,000 sgRNA iBAR constructs. In some embodiments, the sgRNA iBAR library comprises at least about any of 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 400, 500, 1,000, 2,000, 3,000, 5,000, 10,000, 15,000, 19,000, 20,000, 38,000, 50,000, 100,000, 150,000, 200,000, or more sets of sgRNA iBAR constructs, such as about 1000 to about 4000 sets of sgRNA iBAR constructs. In some embodiments, the sgRNA library or the sgRNA iBAR library targets at least about any of 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 500, 1,000, 2,000, 5,000, 10,000, 15,000, 19,000, 20,000, 38,000, 50,000, or more genes in a cell or organism. In some embodiments, the organism is a human. In some embodiments, the sgRNA library or the sgRNA iBAR library is a genome-wide library for protein-coding genes and/or non-coding RNAs. In some embodiments, the sgRNA library or the sgRNA iBAR library is a genome-wide library targeting every annotated gene. In some embodiments, the sgRNA library or the sgRNA iBAR library targets at least about any of 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the genes in a cell or organism. In some embodiments, the sgRNA library or the sgRNA iBAR library is a targeted library that targets selected genes in a signaling pathway or genes associated with a cellular process, such as sensitivity or resistance to anti-cancer drug-mediated killing, cell proliferation, cell cycle, transcriptional regulation, ubiquitination, apoptosis, immune response such as autoimmunity, tumor metastasis, or malignant transformation of tumors. In some embodiments, the sgRNA library or sgRNA iBAR library is used for a genome-wide screen associated with a particular modulated phenotype, such as sensitivity or resistance to anti-cancer drug-mediated killing. In some embodiments, the sgRNA library or sgRNA iBAR library is used for a genome-wide screen to identify at least one target gene associated with a particular modulated phenotype, such as a target gene in a cancer cell that modulates the activity of the cancer cell in response to an anti-cancer drug. In some embodiments, the sgRNA library or the sgRNA iBAR library targets "cancer-associated genes," for example, genes whose DNA mutation frequency in cancer patients is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher), and/or whose RNA expression level in cancer patients is up-regulated or down-regulated by at least about 1.2-fold (e.g., at least about any of 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold, or higher, such as about 2-fold), for example based on the literature or databases. In some embodiments, the sgRNA library or the sgRNA iBAR library targets genes whose encoded mRNA and/or protein is expressed intracellularly (in healthy cells or cancer cells). In some embodiments, the sgRNA library or sgRNA iBAR library targets genes whose encoded protein is expressed on the cell surface (in healthy cells or cancer cells). In some embodiments, the sgRNA library or the sgRNA iBAR library targets genes i) whose DNA mutation frequency in cancer patients (e.g., based on the literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or higher), ii) whose RNA expression level in cancer patients (e.g., based on the literature or databases) is up-regulated or down-regulated by greater than about 2-fold (e.g., greater than about any of 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold, or higher), and iii) whose encoded mRNA or protein is expressed intracellularly, or whose encoded protein is expressed on the cell surface, in healthy cells or cancer cells. Thus, in some embodiments, an sgRNA library containing a plurality of sgRNA constructs comprises or encodes sgRNAs having guide sequences complementary to target sites of cancer-associated genes, such as target sites of about 1323 colorectal cancer-associated genes in the human genome that have a DNA mutation frequency of ≥5%, an RNA expression level up-regulated or down-regulated by more than 2-fold in stage III and stage IV colorectal cancer patients, and gene products expressed intracellularly or on the cell surface. In some embodiments, an sgRNA iBAR library containing a plurality of sgRNA iBAR constructs comprises or encodes sgRNA iBARs having guide sequences complementary to target sites of cancer-associated genes, such as target sites of about 1323 colorectal cancer-associated genes in the human genome that have a DNA mutation frequency of ≥5%, an RNA expression level up-regulated or down-regulated by more than 2-fold in stage III and stage IV colorectal cancer patients, and gene products expressed intracellularly or on the cell surface. In some embodiments, the sgRNA library or sgRNA iBAR library is designed to target a eukaryotic genome, such as a mammalian genome. Exemplary genomes of interest include those of rodents (mice, rats, hamsters, guinea pigs), domesticated animals (e.g., cows, sheep, cats, dogs, horses, or rabbits), non-human primates (e.g., monkeys), fish (e.g., zebrafish), invertebrates (e.g., Drosophila melanogaster and Caenorhabditis elegans), and humans.
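The three selection criteria for cancer-associated genes (mutation frequency, fold change in RNA expression, and expression of the gene product) can be expressed as a simple filter. The sketch below is illustrative only: the record fields, the function name, and the example values are hypothetical, and a real selection would draw these values from literature or database annotations.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    name: str
    mutation_frequency: float  # fraction of cancer patients carrying a DNA mutation
    fold_change: float         # patient/healthy RNA expression ratio (>1 up, <1 down)
    expressed: bool            # product expressed intracellularly or on the cell surface


def is_cancer_associated(gene: GeneRecord,
                         min_mutation_frequency: float = 0.05,
                         min_fold: float = 2.0) -> bool:
    """Apply criteria i) to iii): mutation frequency, fold change, expression."""
    up_regulated = gene.fold_change > min_fold
    down_regulated = 0 < gene.fold_change < 1.0 / min_fold
    return (
        gene.mutation_frequency >= min_mutation_frequency
        and (up_regulated or down_regulated)
        and gene.expressed
    )


# Hypothetical records, for illustration only.
candidates = [
    GeneRecord("GENE_A", 0.12, 3.1, True),   # passes all three criteria
    GeneRecord("GENE_B", 0.02, 4.0, True),   # mutation frequency too low
    GeneRecord("GENE_C", 0.30, 0.25, True),  # down-regulated 4-fold, passes
    GeneRecord("GENE_D", 0.30, 1.3, True),   # fold change too small
    GeneRecord("GENE_E", 0.30, 5.0, False),  # product not expressed
]
selected = [g.name for g in candidates if is_cancer_associated(g)]
```

Treating down-regulation as a ratio below 1/2 is one way to make "up-regulated or down-regulated by greater than 2-fold" symmetric; other conventions (for example, log2 fold change) would work equally well.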

The guide sequences of the sgRNA library or sgRNA iBAR library can be designed using any known algorithm that identifies CRISPR/Cas target sites from a user-defined list with high targeting specificity in the human genome, such as Genomic Target Scan (GT-Scan) (see O’Brien et al., Bioinformatics (2014) 30:2673-2675), DeepCRISPR, CasFinder, CHOPCHOP, CRISPRscan, and the like. In some embodiments, at least about any of 100, 400, 500, 1,000, 3,000, 5,000, 10,000, 15,000, 19,000, 20,000, 50,000, 100,000, 150,000, 155,000, 200,000, or more sgRNA constructs or sgRNA iBAR constructs can be generated on a single array. The method can also be scaled up to enable genome-wide screening by synthesizing multiple sgRNA libraries or sgRNA iBAR libraries in parallel. The exact number of sgRNA constructs in an sgRNA library, or of sgRNA iBAR constructs (or sets of sgRNA iBAR constructs) in an sgRNA iBAR library, can depend on whether the screen 1) targets genes or regulatory elements, and 2) targets the entire genome or a subset of genomic genes.

In some embodiments, the sgRNA library or the sgRNA iBAR library is designed to target every PAM sequence in the genome that overlaps a gene, wherein the PAM sequence corresponds to the Cas protein. In some embodiments, the sgRNA library or sgRNA iBAR library is designed to target a subset of the PAM sequences found in the genome, wherein the PAM sequences correspond to the Cas protein.
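As a minimal sketch of enumerating PAM-adjacent target sites within a gene sequence, the code below assumes an NGG PAM (as used by S. pyogenes Cas9) located immediately 3' of a 20-nt protospacer. Both choices, and the function names, are assumptions of this illustration; the PAM and guide length depend on the Cas protein actually used, and real design tools also score specificity.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(seq: str, guide_len: int = 20):
    """List (strand, protospacer, pam) for every NGG PAM with room for a guide.

    Both strands are scanned. A lookahead pattern is used so that
    overlapping PAMs (e.g., within 'AGGG') are all reported.
    """
    sites = []
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in re.finditer(r"(?=([ACGT]GG))", s):
            pam_start = match.start()
            if pam_start >= guide_len:
                protospacer = s[pam_start - guide_len:pam_start]
                sites.append((strand, protospacer, match.group(1)))
    return sites


# Toy sequence: one 20-nt protospacer followed by a TGG PAM.
toy_sites = find_target_sites("A" * 20 + "TGG")
```

Targeting only a subset of PAM sequences, as in the second embodiment above, would correspond to filtering this list, for example by position within the gene or by a specificity score.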

In some embodiments, the sgRNA library comprises one or more control sgRNA constructs that do not target any genomic locus in the genome. In some embodiments, sgRNA constructs that do not target putative genomic genes can be included in the sgRNA library as negative controls. In some embodiments, the sgRNA iBAR library comprises one or more control sgRNA iBAR constructs that do not target any genomic locus in the genome. In some embodiments, sgRNA iBAR constructs that do not target putative genomic genes can be included in the sgRNA iBAR library as negative controls. In some embodiments, the sgRNA library (or sgRNA iBAR library) comprises one or more control sgRNA constructs (or control sgRNA iBAR constructs) targeting non-cancer-associated genes, for example, genes whose expression (at the RNA or protein level) does not differ by at least 1.2-fold (e.g., at least about any of 1.5, 2, 2.5, 3, 4, 5-fold, or more) between cancer patients and healthy individuals; for example, genes whose expression level differs by less than 2-fold between cancer patients and healthy individuals; and/or genes whose mutation frequency in cancer patients is less than about 5% (e.g., less than about any of 4%, 3%, 2%, or 1%).

The sgRNA constructs and libraries described herein can be prepared using any nucleic acid synthesis and/or molecular cloning method known in the art. In some embodiments, the sgRNA library is synthesized by electrochemical means on arrays (e.g., CustomArray, Twist, Gen9), DNA printing (e.g., Agilent), or solid-phase synthesis of individual oligos (e.g., IDT). The sgRNA constructs can be amplified by PCR and cloned into an expression vector (e.g., a lentiviral vector). In some embodiments, the lentiviral vector also encodes one or more components of a CRISPR/Cas-based genetic editing system, such as a Cas protein, e.g., Cas9.

In some embodiments, the invention provides an isolated nucleic acid encoding any one of the sgRNA constructs, sgRNA iBAR constructs, sets of sgRNA iBAR constructs, sgRNA libraries, or sgRNA iBAR libraries described herein. Also provided are vectors (e.g., non-viral vectors, or viral vectors such as lentiviral vectors) and viruses (e.g., lentiviruses) comprising any nucleic acid encoding any one of the sgRNA constructs, sgRNA iBAR constructs, sets of sgRNA iBAR constructs, sgRNA libraries, or sgRNA iBAR libraries described herein.

Cas protein

The sgRNA constructs or sgRNA iBAR constructs described herein are designed to be operable with any naturally occurring or engineered CRISPR/Cas system known in the art. In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a Type I CRISPR/Cas system. In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a Type II CRISPR/Cas system. In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a Type III CRISPR/Cas system. Exemplary CRISPR/Cas systems can be found in WO2013176772, WO2014065596, WO2014018423, WO2016011080, US8697359, US8932814, and US10113167B2, the disclosures of which are incorporated herein by reference in their entireties for all purposes.

In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a Cas protein from a Type I, Type II, or Type III CRISPR/Cas system, the Cas protein having RNA-guided polynucleotide binding and/or nuclease activity. Examples of such Cas proteins are described in, e.g., WO2014144761, WO2014144592, WO2013176772, US20140273226, and US20140273233, which are incorporated herein by reference in their entirety.

In some embodiments, the Cas protein is derived from a Type II CRISPR/Cas system. In some embodiments, the Cas protein is or is derived from a Cas9 protein. In some embodiments, the Cas protein is or is derived from a bacterial Cas9 protein, including those identified in WO2014144761.

In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with Cas9 (also known as Csn1 and Csx12), a homolog thereof, or a modified version thereof. In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with two or more (e.g., 2, 3, 4, 5, or more) Cas proteins. In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a Cas9 protein from Streptococcus pyogenes (S. pyogenes) or Streptococcus pneumoniae (S. pneumoniae). Cas enzymes are known in the art; for example, the amino acid sequence of the S. pyogenes Cas9 protein can be found in the SwissProt database under accession number Q99ZW2.

The Cas protein (also referred to herein as a "Cas nuclease") provides a desired activity, such as target binding, target nicking, or target cleavage. In certain embodiments, the desired activity is target binding. In certain embodiments, the desired activity is target nicking or target cleavage. In certain embodiments, the desired activity also includes a function provided by a polypeptide that is covalently fused to a Cas protein or to a nuclease-deficient Cas protein. Examples of such desired activities include transcriptional regulation activity (activation or repression), epigenetic modification activity, and target visualization/identification activity.

In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a Cas nuclease that cleaves the target sequence, including double-strand cleavage and single-strand cleavage. In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a catalytically inactive Cas ("dCas"). In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a dCas of a CRISPR activation ("CRISPRa") system, wherein the dCas is fused to a transcriptional activator. In some embodiments, the sgRNA construct or the sgRNA iBAR construct is operable with a dCas of a CRISPR interference ("CRISPRi") system. In some embodiments, the dCas is fused to a repressor domain, such as a KRAB domain. Such CRISPR/Cas systems can be used to modulate (e.g., induce, repress, increase, or decrease) gene expression.

In certain embodiments, the Cas protein is a mutant of a wild-type Cas protein (such as Cas9), or a fragment thereof. A Cas9 protein generally has at least two nuclease (e.g., DNase) domains. For example, a Cas9 protein can have a RuvC-like nuclease domain and an HNH-like nuclease domain. The RuvC and HNH domains work together to cut both strands at the target site, producing a double-strand break in the target polynucleotide (Jinek et al., Science 337: 816-21). In certain embodiments, a mutant Cas9 protein is modified to contain only one functional nuclease domain (either a RuvC-like or an HNH-like nuclease domain). For example, in certain embodiments, the mutant Cas9 protein is modified such that one of the nuclease domains is deleted or mutated so that it is no longer functional (i.e., nuclease activity is absent). In some embodiments in which one of the nuclease domains is inactive, the mutant can introduce a nick into a double-stranded polynucleotide (such a protein is termed a "nickase") but cannot cleave both strands. In certain embodiments, the Cas protein is modified to increase nucleic acid binding affinity and/or specificity, alter an enzymatic activity, and/or change another property of the protein. In certain embodiments, the Cas protein is truncated or modified to optimize the activity of the effector domain. In certain embodiments, both the RuvC-like and the HNH-like nuclease domains are modified or eliminated such that the mutant Cas9 protein can neither nick nor cleave the target polynucleotide. In certain embodiments, however, a Cas9 protein that lacks some or all nuclease activity relative to its wild-type counterpart retains target recognition activity to a greater or lesser extent.
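The relationship between functional nuclease domains and cleavage outcome described above can be summarized in a few lines. The following is only an illustrative sketch; the function name and the returned labels are hypothetical and are not part of the disclosure.

```python
def cleavage_outcome(ruvc_active: bool, hnh_active: bool) -> str:
    """Classify a Cas9 variant by which of its two nuclease domains still function.

    Illustrative only: the labels are descriptive strings, not terms of the disclosure.
    """
    if ruvc_active and hnh_active:
        # Both domains cut, one strand each: double-strand break.
        return "double-strand break"
    if ruvc_active or hnh_active:
        # Only one strand is cut: the protein behaves as a nickase.
        return "single-strand nick (nickase)"
    # Neither domain cuts; target recognition may still be retained.
    return "no cleavage (catalytically inactive)"
```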

In certain embodiments, the Cas protein is a fusion protein comprising a naturally occurring Cas, or a variant thereof, fused to another polypeptide or effector domain. The other polypeptide or effector domain can be, for example, a cleavage domain, a transcriptional activation domain, a transcriptional repressor domain, or an epigenetic modification domain. In certain embodiments, the fusion protein comprises a modified or mutated Cas protein in which all nuclease domains have been inactivated or deleted. In certain embodiments, the RuvC and/or HNH domains of the Cas protein are modified or mutated such that they no longer possess nuclease activity.

In certain embodiments, the effector domain of the fusion protein is a cleavage domain with the desired properties, obtained from any endonuclease or exonuclease.

In certain embodiments, the effector domain of the fusion protein is a transcriptional activation domain. In general, a transcriptional activation domain interacts with transcriptional control elements and/or transcriptional regulatory proteins (i.e., transcription factors, RNA polymerases, etc.) to increase and/or activate transcription of a gene. In certain embodiments, the transcriptional activation domain is a herpes simplex virus VP16 activation domain, VP64 (a tetrameric derivative of VP16), an NFκB p65 activation domain, p53 activation domains 1 and 2, a CREB (cAMP response element binding protein) activation domain, an E2A activation domain, or an NFAT (nuclear factor of activated T-cells) activation domain. In certain embodiments, the transcriptional activation domain is Gal4, Gcn4, MLL, Rtg3, Gln3, Oaf1, Pip2, Pdr1, Pdr3, Pho4, or Leu3. The transcriptional activation domain may be wild type, or a modified or truncated version of the original transcriptional activation domain.

In certain embodiments, the effector domain of the fusion protein is a transcriptional repressor domain, such as an inducible cAMP early repressor (ICER) domain, a Kruppel-associated box A (KRAB-A) repressor domain, a YY1 glycine-rich repressor domain, an Sp1-like repressor, an E(spl) repressor, an IκB repressor, or MeCP2.

In certain embodiments, the effector domain of the fusion protein is an epigenetic modification domain that alters gene expression by modifying histone structure and/or chromosomal structure, such as a histone acetyltransferase domain, a histone deacetylase domain, a histone methyltransferase domain, a histone demethylase domain, a DNA methyltransferase domain, or a DNA demethylase domain.

In certain embodiments, the Cas protein further comprises at least one additional domain, such as a nuclear localization signal (NLS), a cell-penetrating or translocation domain, or a marker domain (e.g., a fluorescent protein marker).

The Cas protein can be introduced into the cancer cell as: (i) a Cas protein, (ii) an mRNA encoding the Cas protein, or (iii) a linear or circular DNA encoding the protein. The Cas protein, or the construct encoding the Cas protein, may be purified or non-purified in the composition. Methods of introducing a protein or a nucleic acid construct into a host cell are well known in the art and are applicable to all methods described herein that require introduction of a Cas protein, or a construct thereof, into a cancer cell. In some embodiments, the Cas protein is delivered to the cancer cell as a protein. In some embodiments, the Cas protein is constitutively expressed from an mRNA or a DNA in the host cancer cell (e.g., an engineered cancer cell). In some embodiments, expression of the Cas protein from the mRNA or DNA is inducible or induced in the host cancer cell. In some embodiments, the Cas protein can be introduced into the host cancer cell as a Cas protein:sgRNA complex using recombinant techniques known in the art. Exemplary methods of introducing a Cas protein or a construct thereof are described, for example, in WO2014144761, WO2014144592, and WO2013176772, which are incorporated herein by reference in their entirety.

In some embodiments, the method uses a CRISPR/Cas9 system. Cas9 is a nuclease from the microbial type II CRISPR (clustered regularly interspaced short palindromic repeats) system, which has been shown to cleave DNA when paired with a single guide RNA (sgRNA). The sgRNA directs Cas9 to a complementary region in the target genomic gene, which may result in a site-specific double-strand break (DSB) that can be repaired in an error-prone fashion by the cellular non-homologous end joining (NHEJ) machinery. Wild-type Cas9 primarily cleaves genomic sites at which the gRNA sequence is followed by a PAM sequence (-NGG). NHEJ-mediated repair of Cas9-induced DSBs induces a wide range of mutations initiated at the cleavage site; these are typically small (<10 bp) insertions/deletions (indels) but can include large (>100 bp) indels.
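The PAM requirement described above can be illustrated by scanning a DNA sequence for candidate target sites. This is a minimal sketch, not the design procedure of the disclosure: the function name `find_cas9_sites` is hypothetical, the 20-nt guide length is an assumption not stated in this passage, and only the forward strand is scanned (a fuller search would also scan the reverse complement).

```python
def find_cas9_sites(seq: str, guide_len: int = 20):
    """Return (start, protospacer, pam) for each forward-strand site followed by NGG.

    guide_len=20 is an assumed, commonly used protospacer length.
    """
    seq = seq.upper()
    sites = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for i in range(len(seq) - guide_len - 2):
        pam = seq[i + guide_len : i + guide_len + 3]
        if pam[1:] == "GG":  # N can be any base; positions 2 and 3 must be G
            sites.append((i, seq[i : i + guide_len], pam))
    return sites
```

For example, calling `find_cas9_sites` on a 20-nt stretch immediately followed by `TGG` returns a single site starting at position 0.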

Cancer cell library

The cancer cell library described herein comprises a plurality of (e.g., at least about any of 2, 3, 4, 5, 10, 100, 1×10³, 1×10⁴, 1×10⁵, 1×10⁶, 1×10⁷, 2×10⁷, 1×10⁸ or more) cancer cells, wherein each of the plurality of cancer cells has a mutation (e.g., an inactivating mutation) in a hit gene in the genome (e.g., the human genome), and wherein the hit genes in at least two of the plurality of cancer cells are different from each other.

In some embodiments, the cancer cell library comprises a plurality of cancer cells having mutations (e.g., inactivating mutations) in at least about any of 2, 3, 4, 5, 10, 20, 50, 100, 200, 500, 1,000, 2,000, 5,000, 10,000, 20,000, 50,000 or more hit genes of a cell or organism. In some embodiments, the organism is a human. In some embodiments, the cancer cell library comprises a plurality of cancer cells having mutations (e.g., inactivating mutations) in about 100 to about 30,000 hit genes, such as about 500 to about 5,000, or about 1,000 to about 1,500 hit genes. In some embodiments, the cancer cell library comprises at least about any of 2, 3, 4, 5, 10, 20, 50, 100, 200, 500, 1,000, 2,000, 5,000, 1×10⁴, 2×10⁴, 5×10⁴, 1×10⁵, 2×10⁵, 1×10⁶, 5×10⁶, 1×10⁷, 1.5×10⁷, 2×10⁷, 1×10⁸, 1×10⁹, 1×10¹⁰ or more cancer cells. In some embodiments, at least two cancer cells in the cancer cell library have mutations (e.g., inactivating mutations) at different target sites (e.g., in different hit genes, or at different sites within the same hit gene). In some embodiments, each cancer cell of the cancer cell library has a mutation (e.g., an inactivating mutation) in a different hit gene. In some embodiments, each cancer cell of the cancer cell library has a mutation (e.g., an inactivating mutation) at a different target site (which can be within the same hit gene or within different hit genes). In some embodiments, the cancer cell library does not comprise cancer cells with mutations (e.g., inactivating mutations) in the same hit gene, such as inactivating mutations at the same target site of the same hit gene, or inactivating mutations at different target sites of the same hit gene. In some embodiments, the cancer cell library does not comprise cancer cells with mutations (e.g., inactivating mutations) at the same target site. In some embodiments, a plurality of (e.g., at least about 2, 3, 4, 5, 10, 100, 500, 1,000, 2,000, 5,000, 10,000, 2×10⁷, or more) cancer cells in the cancer cell library have mutations (e.g., inactivating mutations) in the same hit gene, such as inactivating mutations at the same target site of the same hit gene, or inactivating mutations at different target sites of the same hit gene. In some embodiments, the cancer cell library comprises a plurality of cancer cells comprising mutations (e.g., inactivating mutations) in at least about 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 60%, 70%, 80%, 90%, 95% or more of the hit genes of the genome. In some embodiments, the cancer cell library comprises a plurality of cancer cells comprising mutations (e.g., inactivating mutations) in all genes of the genome (such as all annotated genes of the human genome); such a library is also referred to herein as a "whole-genome cancer cell library". In some embodiments, for each annotated gene of the genome or for each hit gene, the cancer cell library has at least two (e.g., 2, 3, 4, 5 or more, such as 3) cancer cells that each comprise a mutation (e.g., an inactivating mutation) at a different target site of the same hit gene; for example, cancer cell A comprises a mutation at target site A' of gene X, cancer cell B comprises a mutation at target site B' of gene X, and cancer cell C comprises a mutation at target site C' of gene X. In some embodiments, the cancer cell library is a targeted library comprising mutations (e.g., inactivating mutations) in selected genes in a signal transduction pathway or associated with a cellular process, such as sensitivity or resistance to anti-cancer drug-mediated killing, cell proliferation, cell cycle, transcriptional regulation, ubiquitination, apoptosis, immune response (such as autoimmunity), tumor metastasis, or malignant transformation of tumors. In some embodiments, the cancer cell library is used for a genome-wide screen associated with a particular modulated phenotype, such as sensitivity or resistance to anti-cancer drug-mediated killing. In some embodiments, the cancer cell library is used for a genome-wide screen to identify at least one target gene associated with a particular modulated phenotype, such as a target gene in a cancer cell that modulates the response of the cancer cell to treatment with an anti-cancer drug. In some embodiments, the cancer cell library is a mammalian cancer cell library. Exemplary genes of interest covered by the cancer cell library include those from the genomes of rodents (mice, rats, hamsters, guinea pigs), domesticated animals (e.g., cattle, sheep, cats, dogs, horses, or rabbits), non-human primates (e.g., monkeys), fish (e.g., zebrafish), invertebrates (e.g., Drosophila melanogaster and Caenorhabditis elegans), and humans. In some embodiments, the cancer cell library is a human cancer cell library, such as a human colorectal cancer cell library.

In some embodiments, the cancer cell library comprises mutations in "cancer-associated genes", e.g., genes whose DNA mutation frequency in cancer patients is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher), and/or whose RNA expression level in cancer patients is up-regulated or down-regulated by at least about 1.2-fold (e.g., at least about any of 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold or higher, such as about 2-fold), for example based on the literature or databases. In some embodiments, the cancer cell library comprises mutations in genes whose encoded mRNA and/or protein is expressed in the cell (in healthy cells or cancer cells). In some embodiments, the cancer cell library comprises mutations in genes whose encoded protein is expressed on the cell surface (in healthy cells or cancer cells). In some embodiments, the cancer cell library comprises mutations in genes: i) whose DNA mutation frequency in cancer patients (e.g., based on the literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher); ii) whose RNA expression level in cancer patients (e.g., based on the literature or databases) is up-regulated or down-regulated by more than about 2-fold (e.g., more than about any of 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold or higher); and iii) whose encoded mRNA and/or protein is expressed in the cell, or whose encoded protein is expressed on the cell surface, in healthy cells or cancer cells. In some embodiments, the cancer cell library comprises mutations in about 1323 colorectal cancer-associated genes of the human genome, which have a DNA mutation frequency of ≥5% in stage III and stage IV colorectal cancer patients, have RNA expression levels up-regulated or down-regulated by more than 2-fold, and whose gene products are expressed in the cell or on the cell surface.

In some embodiments, a plurality of (e.g., about 2, 3, 4, 5, 10, 100, 500, 1,000, 2,000, 5,000, 10,000 or more) cancer cells in the cancer cell library have mutations (e.g., inactivating mutations) in the same hit gene; such a cancer cell library is also said to have "X-fold coverage for the hit gene", where "X" is the number of cancer cells having a mutation (e.g., an inactivating mutation) in the same hit gene. For example, a cancer cell library that targets about 1,000 hit genes and comprises about 2×10⁷ cancer cells has about 20,000-fold coverage for each hit gene. In some embodiments, the cancer cell library described herein has at least about 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 100-fold, 200-fold, 500-fold, 1,000-fold, 2,000-fold, 5,000-fold, 10,000-fold or greater coverage for each hit gene (e.g., cancer-associated gene), such as an average of about 600-fold to about 12,000-fold, about 600-fold to about 1,200-fold, or about 1,200-fold to about 12,000-fold coverage for each hit gene. In some embodiments, the Cas9+ sgRNA cancer cell library has an average of about 600-fold to about 1,200-fold coverage for each sgRNA. In some embodiments, the Cas9+ sgRNA (or mutagen-induced mutation) cancer cell library described herein has an average of about 600-fold to about 1,200-fold coverage for each hit gene (e.g., cancer-associated gene). In some embodiments, the Cas9+ sgRNA iBAR cancer cell library has an average of about 100-fold to about 1,000-fold coverage, such as about 1,000-fold coverage, for each sgRNA iBAR. In some embodiments, the Cas9+ sgRNA iBAR cancer cell library has an average of about 400-fold to about 4,000-fold coverage, such as about 4,000-fold coverage, for each set of sgRNA iBARs. In some embodiments, the Cas9+ sgRNA iBAR cancer cell library described herein has an average of about 1,200-fold to about 12,000-fold coverage, such as about 12,000-fold coverage, for each hit gene (e.g., cancer-associated gene).
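The coverage figures above follow from simple arithmetic, sketched here. The function names are hypothetical. The multiplicative relationship between per-iBAR, per-set, and per-gene coverage is an inference from the quoted numbers, taking 4 iBAR constructs per set and 3 target sites per hit gene (both given elsewhere in the text as examples), and is not stated as a formula in the disclosure.

```python
def fold_coverage(n_cells: float, n_hit_genes: int) -> float:
    """Average number of cells carrying a mutation in each hit gene."""
    return n_cells / n_hit_genes


def ibar_coverage(per_ibar: float, ibars_per_set: int = 4, sgrnas_per_gene: int = 3):
    """Scale per-construct coverage up to per-set and per-gene coverage.

    ibars_per_set and sgrnas_per_gene are assumed example values.
    """
    per_set = per_ibar * ibars_per_set      # constructs sharing one guide sequence
    per_gene = per_set * sgrnas_per_gene    # sets targeting the same hit gene
    return per_set, per_gene


print(fold_coverage(2e7, 1000))   # 20000.0
print(ibar_coverage(1000))        # (4000, 12000)
```

Under these assumptions, the lower bounds quoted above are reproduced as well: 100-fold per sgRNA iBAR gives 400-fold per set and 1,200-fold per hit gene.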

Mutations in hit genes

In some embodiments, all annotated genes of a genome (e.g., the human genome) are selected as hit genes. In some embodiments, genes whose DNA mutation frequency in cancer patients (e.g., based on the literature or databases) is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher) are selected as hit genes. In some embodiments, genes whose RNA expression level in cancer patients (e.g., based on the literature or databases) is up-regulated or down-regulated by at least about 1.2-fold (e.g., at least about any of 1.5, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold or higher, such as about 2-fold) are selected as hit genes. In some embodiments, genes whose DNA mutation frequency in cancer patients (e.g., based on the literature or databases), such as stage III and/or stage IV colorectal cancer patients, is at least about 5% (e.g., at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher) and whose RNA expression level is up-regulated or down-regulated by more than about 2-fold (e.g., more than about any of 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 50, 100-fold or higher) are selected as hit genes. In some embodiments, hit genes are further selected on the basis that the encoded mRNA or protein is expressed in the cell, or the encoded protein is expressed on the cell surface, in healthy cells or cancer cells.
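The combined selection criteria above can be expressed as a simple filter. The following is a sketch only: the `GeneStats` record, its field names, and the gene labels used in any example are hypothetical, and the thresholds are the example values from the text (mutation frequency of at least 5%, expression change of more than 2-fold in either direction).

```python
from dataclasses import dataclass


@dataclass
class GeneStats:
    symbol: str            # hypothetical gene label
    mutation_freq: float   # fraction of patients carrying a DNA mutation (0 to 1)
    fold_change: float     # tumor/normal RNA ratio; >1 is up, <1 is down
    expressed: bool        # product expressed in the cell or on the cell surface


def is_hit(gene: GeneStats, min_freq: float = 0.05, min_fold: float = 2.0) -> bool:
    """Apply the three example criteria: mutation frequency, expression change, expression."""
    if gene.fold_change <= 0:
        return False  # a non-positive ratio is treated as invalid input here
    # Express down-regulation on the same scale as up-regulation (0.4 -> 2.5-fold).
    magnitude = max(gene.fold_change, 1.0 / gene.fold_change)
    return gene.mutation_freq >= min_freq and magnitude > min_fold and gene.expressed


def select_hits(genes):
    """Return the symbols of genes that pass all criteria."""
    return [g.symbol for g in genes if is_hit(g)]
```

Treating down-regulation as the reciprocal ratio is a design choice of this sketch; a pipeline working with log2 fold changes would compare the absolute value against log2 of the threshold instead.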

In some embodiments, the mutation in the hit gene is a pathogenic or inactivating mutation. An inactivating mutation described herein can be any mutation, such as an insertion, a deletion (indel), a substitution, a frameshift, a chromosomal rearrangement, or a combination thereof, that results in complete abolition or elimination of gene expression (transcription and/or translation) and/or function. In some embodiments, an inactivating mutation can completely abolish the transcription, translation, post-translational modification, binding to other molecules (e.g., other molecules in a protein complex), and/or function (e.g., signal transduction or receptor activation) of the gene. In some embodiments, the mutation in the hit gene is a mutation that reduces (e.g., reduces by at least about any of 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or more) or affects (e.g., disrupts) one or more of: hit gene transcription, hit gene translation, hit gene mRNA processing, hit gene mRNA stability, hit gene mRNA function, hit gene protein function, binding to other molecules (e.g., other molecules in a protein complex), and post-translational modification of the hit gene product. The mutation (e.g., inactivating mutation) of the hit gene can be within one or more regulatory regions of the hit gene, such as an enhancer, a promoter, a 5' untranslated region (UTR), or a 3' UTR, or within a coding region, such as an exon or a splice site. A hit gene described herein can be any genomic sequence, such as a protein-coding gene, an RNA-coding gene such as a small RNA (e.g., microRNA, piRNA, siRNA, snoRNA, tRNA, rRNA, and snRNA), a ribosomal RNA, a long non-coding RNA (lincRNA), or a mitochondrial gene. The hit gene may be known to be associated with a particular phenotype (e.g., a cancer phenotype), or not associated with a particular phenotype, such as a known gene that is not known to be associated with the particular phenotype, or an unknown gene that has not yet been characterized. In some embodiments, the hit gene is a genomic sequence that does not encode anything, or whose encoded product is not yet known.

Pathogenic inactivating (loss-of-function) mutations of certain genes can be determined by reviewing the experimental evidence in the published scientific literature and the critical regions likely to be disrupted, including but not limited to: frameshifts, missense mutations, truncating mutations, deletions, copy number variations, nonsense mutations, and loss or absence of the gene. Pathogenic or inactivating mutations include, but are not limited to: homozygous deletions with a defined impact, biallelic (double-hit) mutations, splice site mutations (e.g., second or other splice site mutations), frameshift mutations, and nonsense or missense mutations in the coding region.

In some embodiments, the cancer cell library is generated by subjecting (e.g., contacting) an initial population of cancer cells to a mutagen. Mutagens can be divided into three classes: physical (e.g., gamma rays, ultraviolet radiation), chemical (e.g., ethyl methanesulfonate, or EMS), and transposable elements (e.g., transposons, retrotransposons, T-DNA, retroviruses).

In some embodiments, the cancer cell library is generated by subjecting an initial population of cancer cells to gene editing. Any known gene editing method can be used to generate the cancer cell library described herein, such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR/Cas-based methods for gene editing or genome engineering. See, e.g., Gaj et al. (Trends Biotechnol. 2013; 31(7): 397-405). In some embodiments, the cancer cell library is generated by subjecting an initial population of cancer cells to gene editing by a CRISPR/Cas-based method.

In some embodiments, the cancer cell library is generated by contacting an initial population of cancer cells with: i) an sgRNA library or sgRNA iBAR library described herein; and ii) a Cas component comprising a Cas protein (e.g., Cas9) or a nucleic acid encoding the Cas protein, under conditions that allow the sgRNA constructs or sgRNA iBAR constructs and the Cas component to be introduced into the initial population of cancer cells and to generate mutations in the hit genes. Thus, in some embodiments, the cancer cell library is generated by contacting an initial population of cancer cells with: i) an sgRNA library comprising a plurality of sgRNA constructs, wherein each sgRNA construct comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence that is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% complementary) to a target site in a corresponding hit gene; and ii) a Cas component comprising a Cas protein (e.g., Cas9) or a nucleic acid encoding the Cas protein, under conditions that allow the sgRNA constructs and the Cas component to be introduced into the initial population of cancer cells and to generate mutations in the hit genes. In some embodiments, the cancer cell library is generated by contacting an initial population of cancer cells with: i) an sgRNA iBAR library comprising a plurality of sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises three or more (e.g., 3, 4, 5 or more, such as 4) sgRNA iBAR constructs each comprising or encoding an sgRNA iBAR, wherein the guide sequences of the three or more sgRNA iBAR constructs are the same, wherein the iBAR sequence of each of the three or more sgRNA iBAR constructs is different from the others, and wherein the guide sequence of each set of sgRNA iBAR constructs is complementary (e.g., at least about any of 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% complementary) to a different target site (e.g., a different hit gene, or a different site within the same hit gene); and ii) a Cas component comprising a Cas protein or a nucleic acid encoding the Cas protein, under conditions that allow the sgRNA constructs and the Cas component to be introduced into the initial population of cancer cells and to generate mutations in the hit genes. In some embodiments, each set of sgRNA iBAR constructs comprises four sgRNA iBAR constructs, wherein the iBAR sequence of each of the four sgRNA iBAR constructs is different from the others. In some embodiments, the sgRNA library or the sgRNA iBAR library and the Cas component are introduced into the initial population of cancer cells via separate vectors (e.g., lentiviral vectors) or separate viruses. In some embodiments, the sgRNA library or the sgRNA iBAR library and the Cas component are introduced into the initial population of cancer cells via the same vector or the same virus. In some embodiments, the sgRNA library or the sgRNA iBAR library is introduced into the initial population of cancer cells via a lentiviral vector or a lentivirus, and the Cas component is introduced into the initial population of cancer cells as an mRNA encoding the Cas component (e.g., Cas9). In some embodiments, each cell of the initial population of cancer cells already carries the Cas component (e.g., transgenic Cas9, or Cas9 introduced as mRNA; hereinafter also referred to as "Cas9+ cancer cells"), and the sgRNA library or the sgRNA iBAR library is then introduced into each cell via a vector (e.g., a lentiviral vector) or a virus (e.g., a lentivirus).
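The structure of an sgRNA iBAR library described above (sets of constructs that share one guide sequence but carry distinct iBAR sequences) can be sketched as a small data structure. This is illustrative only: the function name, the target-site labels, the random choice of barcodes, and the 6-nt barcode length are assumptions of the sketch and are not specified in this passage.

```python
import random
from itertools import product


def build_ibar_library(guides: dict, ibars_per_set: int = 4, ibar_len: int = 6, seed: int = 0):
    """Map each target-site label to a set of (guide, iBAR) constructs.

    guides: {target_site_label: guide_sequence}; labels and sequences are placeholders.
    ibar_len=6 is an assumed barcode length.
    """
    rng = random.Random(seed)
    # Every possible barcode of the chosen length over the DNA alphabet.
    all_ibars = ["".join(p) for p in product("ACGT", repeat=ibar_len)]
    library = {}
    for target, guide in guides.items():
        # sample() draws without replacement, so iBARs within a set are distinct.
        ibars = rng.sample(all_ibars, ibars_per_set)
        library[target] = [(guide, ibar) for ibar in ibars]
    return library
```

Within a set, every construct has the same guide sequence, so the distinct iBARs act as internal replicates of the same perturbation when the constructs are later counted.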

In some embodiments, the cancer cell library comprises only the sgRNA library or the sgRNA iBAR library described herein, without the Cas component (e.g., Cas9); that is, the hit genes targeted by the sgRNA library or the sgRNA iBAR library in the cancer cell library are not inactivated until the Cas component (e.g., Cas9) is further introduced. A cancer cell library comprising only the sgRNA library or the sgRNA iBAR library described herein is hereinafter referred to as an "sgRNA cancer cell library" or an "sgRNA iBAR cancer cell library". In some embodiments, the cancer cell library comprises the sgRNA library or the sgRNA iBAR library together with the Cas component (e.g., Cas9); that is, the cancer cell library comprises inactivated hit genes. In some embodiments, the initial population of cancer cells expresses a Cas protein. In some embodiments, the cancer cell library is generated by contacting an initial population of cancer cells expressing a Cas protein with the sgRNA library or the sgRNA iBAR library described herein, which produces a cancer cell library comprising inactivated hit genes. A cancer cell library comprising the sgRNA library or the sgRNA iBAR library described herein together with a Cas9 component (e.g., a Cas9 protein, or a nucleic acid encoding it) is hereinafter referred to as a "Cas9+ sgRNA cancer cell library" or a "Cas9+ sgRNA iBAR cancer cell library".

In some embodiments, a Cas element (e.g., Cas9) is introduced into the cancer cells before the sgRNA library or the sgRNA iBAR library is introduced. In some embodiments, the cancer cells are sorted to obtain Cas+ cancer cells before the sgRNA library or the sgRNA iBAR library is introduced. In some embodiments, the sgRNA library or the sgRNA iBAR library is introduced into the cancer cells before a Cas element (e.g., Cas9) is introduced. In some embodiments, the cancer cells are sorted to obtain sgRNA+ or sgRNA iBAR+ cancer cells before a Cas element (e.g., Cas9) is introduced. In some embodiments, a Cas element (e.g., Cas9) and the sgRNA library or the sgRNA iBAR library are introduced into the cancer cells simultaneously. In some embodiments, the cancer cells are sorted to obtain Cas+ sgRNA+ cancer cells (a Cas+ sgRNA+ cancer cell library) or Cas+ sgRNA iBAR+ cancer cells (a Cas+ sgRNA iBAR+ cancer cell library) before drug treatment.

In some embodiments, at least about 50% (such as at least about any of 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or more) of the sgRNA constructs of the sgRNA library, or of the sgRNA iBAR constructs of the sgRNA iBAR library, or of the sets of sgRNA iBAR constructs of the sgRNA iBAR library, are introduced into the initial cancer cell population or the Cas9+ cancer cells described herein. In some embodiments, at least about 95% (e.g., at least about any of 96%, 97%, 98%, 99%, or more) of the sgRNA constructs of the sgRNA library, or of the sgRNA iBAR constructs of the sgRNA iBAR library, or of the sets of sgRNA iBAR constructs of the sgRNA iBAR library, are introduced into the initial cancer cell population or the Cas9+ cancer cells. In some embodiments, the hit gene inactivation efficiency of the sgRNA library or the sgRNA iBAR library is at least about 80%, such as at least about any of 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more. In some embodiments, the hit gene inactivation efficiency of the sgRNA library or the sgRNA iBAR library is at least about 90%.

In some embodiments, the cancer cell library comprises one or more (e.g., about 2, 3, 4, 5, 8, 10, 100, 250, 400, 500, 1,000, 2,000, 5,000, 10,000, or more) cancer cells that comprise the same sgRNA construct or the same sgRNA iBAR construct targeting a target site of the same hit gene. Such a cancer cell library is also described as "having X-fold coverage for the sgRNA/sgRNA iBAR" or "having X-fold coverage for each sgRNA/sgRNA iBAR", where "X" is the number of cancer cells expressing the same sgRNA or sgRNA iBAR. In some embodiments, the cancer cell library has about 1-fold to about 12,000-fold coverage for each sgRNA or sgRNA iBAR, or for each set of sgRNA iBARs, such as any of about 1,000 to about 5,000, about 1 to about 1,000, about 10 to about 100, about 50 to about 500, about 80 to about 200, about 100 to about 400, about 100 to about 800, about 100 to about 1,000, or about 300 to about 600-fold coverage for each sgRNA or sgRNA iBAR, or for each set of sgRNA iBARs. In some embodiments, the cancer cell library has at least about 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 100-fold, 400-fold, 500-fold, 1,000-fold, 2,000-fold, 5,000-fold, 10,000-fold, or more coverage for each sgRNA or sgRNA iBAR, or for each set of sgRNA iBARs.

In some embodiments, the cancer cell library has at least about 100-fold (e.g., at least about any of 200-, 400-, 500-, 1,000-, 5,000-fold, or more) coverage for each sgRNA or mutation (e.g., a mutagen-induced mutation). In some embodiments, each hit gene is targeted by about 6 to about 12 different sgRNAs, or carries mutations at at least 2 (e.g., about 6 to about 12) different target gene sites. In some embodiments, the cancer cell library has at least about 100-fold (e.g., at least about any of 200-, 300-, 400-, 500-, 1,000-, 5,000-fold, or more) coverage for each hit gene, such as about 600-fold to about 1,200-fold coverage for each hit gene.

In some embodiments, the cancer cell library has at least about 100-fold (e.g., at least about any of 200-, 400-, 500-, 1,000-, 5,000-fold, or more) coverage for each sgRNA iBAR, such as about 100-fold to about 1,000-fold, or about 1,000-fold, coverage for each sgRNA iBAR. In some embodiments, the cancer cell library has at least about 400-fold (e.g., at least about any of 800-, 1,000-, 2,000-, 4,000-, 16,000-fold, or more) coverage for each set of sgRNA iBARs, such as about 400-fold to about 4,000-fold, or about 4,000-fold, coverage for each set of sgRNA iBARs. In some embodiments, the cancer cell library has at least about 100-fold (e.g., at least about any of 200-, 400-, 500-, 1,000-, 5,000-fold, or more) coverage for the sgRNA iBAR library, such as about 100-fold to about 1,000-fold, or about 1,000-fold, coverage for the sgRNA iBAR library. In some embodiments, the cancer cell library has at least about 400-fold (e.g., at least about any of 800-, 1,000-, 2,000-, 4,000-, 10,000-, 16,000-fold, or more) coverage for each hit gene, such as about 1,200-fold to about 12,000-fold, or about 12,000-fold, coverage for each hit gene. In some embodiments, the sgRNA iBAR library targets every annotated gene in the genome (i.e., the sgRNA iBAR library is a genome-wide sgRNA iBAR library). In some embodiments, the cancer cell library has at least about 100-fold (e.g., at least about any of 400-fold, 800-fold, 1,000-fold, or 1,200-fold) coverage for the genome-wide sgRNA iBAR library.
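By way of illustration only, the coverage arithmetic described above can be sketched in Python. The sketch assumes that coverage is an average (total cells carrying a construct divided by the number of constructs), and that a set contains 4 iBARs and a hit gene is targeted by 3 sgRNAs; these two counts are illustrative assumptions chosen because they reproduce the 1,000-fold, 4,000-fold, and 12,000-fold figures quoted above. The function names are hypothetical.

```python
import math


def cells_required(n_constructs: int, fold_coverage: float,
                   fraction_transduced: float = 1.0) -> int:
    """Cells to start with so that each construct is carried, on average,
    by `fold_coverage` cells, given the fraction of cells that end up
    carrying a construct (an assumed, experiment-specific value)."""
    if not 0.0 < fraction_transduced <= 1.0:
        raise ValueError("fraction_transduced must be in (0, 1]")
    return math.ceil(n_constructs * fold_coverage / fraction_transduced)


def nested_coverage(per_ibar_fold: float, ibars_per_set: int,
                    sgrnas_per_gene: int) -> tuple:
    """Derive per-set and per-gene coverage from per-sgRNA-iBAR coverage."""
    per_set = per_ibar_fold * ibars_per_set      # one set = one sgRNA, several iBARs
    per_gene = per_set * sgrnas_per_gene         # several sgRNA sets per hit gene
    return per_set, per_gene


if __name__ == "__main__":
    print(nested_coverage(1000, ibars_per_set=4, sgrnas_per_gene=3))
    print(cells_required(n_constructs=10_000, fold_coverage=100))
```

Under these assumptions, a library of 10,000 constructs held at 100-fold coverage corresponds to about one million construct-bearing cells.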

Endogenous mutations

In some embodiments, the cancer cells in the initial cancer cell population or in the final cancer cell library may contain endogenous mutations that were not generated by the CRISPR/Cas system or a mutagen (e.g., EMS), such as naturally occurring mutations, or mutations in the cancer cells that do not meet the hit gene selection criteria (e.g., a DNA mutation frequency of at least about 5% in cancer patients, and/or an RNA expression level up-regulated or down-regulated by more than about 2-fold, and/or the encoded RNA/protein being expressed intracellularly or the encoded protein being expressed on the cell surface). Endogenous mutations should not affect the target gene identification methods described herein, because the sgRNAs or hit gene mutations in the treated cancer cell population are compared with those in a control cancer cell population that contains the same endogenous mutations.
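By way of illustration only, the comparison between a treated population and a control population can be sketched as a normalized log2 fold change per sgRNA. This is a generic screen-analysis sketch and is not stated to be the statistical method of the present disclosure; the function name, the small `eps` offset, and the example read counts are hypothetical. Because both populations share the same endogenous background, that background contributes equally to both and does not by itself shift the ratio.

```python
import math


def log2_fold_changes(treated: dict, control: dict, eps: float = 1e-6) -> dict:
    """Per-sgRNA log2 ratio of read-count frequencies, treated vs. control.

    Counts are first converted to frequencies within each sample so that
    differences in sequencing depth do not bias the ratio; `eps` avoids
    division by zero for sgRNAs absent from one sample.
    """
    t_total = sum(treated.values())
    c_total = sum(control.values())
    result = {}
    for sgrna in sorted(treated.keys() & control.keys()):
        t_freq = treated[sgrna] / t_total
        c_freq = control[sgrna] / c_total
        result[sgrna] = math.log2((t_freq + eps) / (c_freq + eps))
    return result


if __name__ == "__main__":
    # Hypothetical read counts for two sgRNAs.
    treated_counts = {"sgA": 300, "sgB": 100}
    control_counts = {"sgA": 100, "sgB": 300}
    for name, lfc in log2_fold_changes(treated_counts, control_counts).items():
        print(name, round(lfc, 3))
```

A positive value indicates an sgRNA enriched after treatment (a candidate resistance-associated target), and a negative value indicates depletion (a candidate sensitivity-associated target).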

Cancer cells

In some embodiments, there is provided a method of editing a genomic locus in a cancer cell, comprising introducing into a host cancer cell (e.g., an initial cancer cell, an unmodified cancer cell) a guide RNA construct that comprises a guide sequence targeting a genomic locus (e.g., a target site of a hit gene) and a guide hairpin sequence encoding a repeat:anti-repeat duplex and a tetraloop, wherein an iBAR is embedded in the tetraloop as an internal replicate, and expressing the guide RNA targeting the genomic locus in the host cancer cell, thereby editing the targeted genomic locus (e.g., the hit gene) in the presence of a Cas nuclease (e.g., Cas9).

In some embodiments, there is provided a cancer cell library prepared by transfecting any of the sgRNA libraries or sgRNA iBAR libraries described herein into a plurality of host cancer cells (e.g., an initial cancer cell population, with or without a Cas element), wherein the sgRNA constructs or the sgRNA iBAR constructs are present in a viral vector (e.g., a lentiviral vector) or a virus (e.g., a lentivirus). In some embodiments, the method further comprises introducing into the initial cancer cell population a Cas element comprising a Cas protein or a nucleic acid encoding a Cas protein, e.g., as Cas9 mRNA. In some embodiments, the multiplicity of infection (MOI) between the viral vector or virus and the host cancer cells (e.g., the initial cancer cell population) during transfection is at least about 1. In some embodiments, the MOI is at least about any of 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, or higher. In some embodiments, the MOI is about 1, about 1.5, about 2, about 2.5, about 3, about 3.5, about 4, about 4.5, about 5, about 5.5, about 6, about 6.5, about 7, about 7.5, about 8, about 8.5, about 9, about 9.5, or about 10. In some embodiments, the MOI is about any of 1-10, 1-3, 3-5, 5-10, 2-9, 3-8, 4-6, or 2-5. In some embodiments, the MOI between the viral vector or virus and the host cancer cells (e.g., the initial cancer cell population) during transfection is less than 1, such as less than about any of 0.8, 0.5, 0.3, or lower. In some embodiments, the MOI is about 0.3 to about 1. In some embodiments, the viral sgRNA library or the viral sgRNA iBAR library is contacted with the initial cancer cell population at an MOI of at least about 2, such as at least about 3.
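By way of illustration only, the practical difference between a low MOI (e.g., about 0.3) and a high MOI (e.g., about 3) can be sketched with the Poisson approximation commonly used for viral infection. The Poisson model is an assumption of this sketch and is not recited in the present disclosure; the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k viral particles at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def infection_fractions(moi: float) -> dict:
    """Expected fractions of cells with zero, exactly one, or several integrants."""
    uninfected = poisson_pmf(0, moi)
    single = poisson_pmf(1, moi)
    return {
        "uninfected": uninfected,
        "single": single,
        "multiple": 1.0 - uninfected - single,
    }


if __name__ == "__main__":
    for moi in (0.3, 1.0, 3.0):
        fractions = infection_fractions(moi)
        print(moi, {name: round(value, 3) for name, value in fractions.items()})
```

Under this model, most cells remain uninfected at an MOI of 0.3, whereas at an MOI of 3 most cells carry more than one construct, which is the situation in which a per-construct internal barcode is useful for distinguishing independent replicates.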

In some embodiments, one or more vectors that drive expression of one or more elements of the CRISPR/Cas system are introduced into the host cancer cells (e.g., the initial cancer cell population), such that expression of the CRISPR system elements directs formation of a CRISPR complex with an sgRNA molecule or sgRNA iBAR molecule described herein at one or more target sites of one or more hit genes. In some embodiments, a Cas nuclease has already been introduced into the host cancer cells (e.g., the initial cancer cell population), for example as Cas9 mRNA, or the host cancer cells have been engineered to stably express a CRISPR/Cas nuclease.

In some embodiments, the host cancer cells (e.g., the initial cancer cell population) are a cancer cell line, such as a pre-established cancer cell line. The host cancer cells and cancer cell lines may be human cancer cells or cancer cell lines, or they may be non-human mammalian cancer cells or cancer cell lines. In some embodiments, the host cancer cells are difficult to transfect with a viral vector, such as a lentiviral vector, at a low MOI (e.g., lower than 1, 0.5, or 0.3). In some embodiments, the host cancer cells are difficult to edit with a CRISPR/Cas system at a low MOI (e.g., lower than 1, 0.5, or 0.3). In some embodiments, the host cancer cells are available in limited numbers. In some embodiments, the host cancer cells are obtained from a tumor sample of an individual (e.g., a human cancer patient).

The methods described herein are suitable for identifying drug-sensitive or drug-resistant target genes in a wide variety of cancer cells, including solid cancers and hematological cancers, and cancers at all stages, including early-stage cancer, non-metastatic cancer, primary cancer, advanced cancer, locally advanced cancer, metastatic cancer, and cancer in remission. In some embodiments, the solid cancer or hematological cancer may be at any of stages I, II, III, and IV according to the American Joint Committee on Cancer (AJCC) staging groups.

In some embodiments, the cancer is a solid cancer selected from the group consisting of: colon cancer, rectal cancer, renal cell carcinoma, liver cancer, non-small cell lung cancer, cancer of the small intestine, esophageal cancer, melanoma, bone cancer, pancreatic cancer, skin cancer, head and neck cancer, cutaneous or intraocular malignant melanoma, uterine cancer, breast cancer, ovarian cancer, cancer of the anal region, stomach cancer, testicular cancer, fallopian tube cancer, endometrial cancer, cervical cancer, vaginal cancer, vulvar cancer, Hodgkin's disease, non-Hodgkin's lymphoma (NHL), cutaneous T-cell lymphoma (CTCL), cancer of the endocrine system, thyroid cancer, parathyroid cancer, adrenal cancer, soft tissue sarcoma, urethral cancer, penile cancer, solid tumors of childhood, bladder cancer, cancer of the kidney or ureter, cancer of the renal pelvis, central nervous system (CNS) tumors, primary CNS lymphoma, tumor angiogenesis, spinal tumors, brainstem glioma, pituitary adenoma, Kaposi's sarcoma, epidermoid carcinoma, squamous cell carcinoma, T-cell lymphoma, environmentally induced cancers, combinations of the above cancers, and metastatic lesions of the above cancers.

In some embodiments, the cancer is a hematological cancer selected from one or more of: acute myeloid leukemia (AML), chronic lymphocytic leukemia (CLL), acute leukemia, acute lymphoblastic leukemia (ALL), B-cell acute lymphoblastic leukemia (B-ALL), T-cell acute lymphoblastic leukemia (T-ALL), chronic myelogenous leukemia (CML), B-cell prolymphocytic leukemia, blastic plasmacytoid dendritic cell neoplasm (BPDCN), Burkitt lymphoma, diffuse large B-cell lymphoma, follicular lymphoma, hairy cell leukemia, small cell or large cell follicular lymphoma, malignant lymphoproliferative disorders, MALT lymphoma, mantle cell lymphoma, marginal zone lymphoma, multiple myeloma, myelodysplasia and myelodysplastic syndrome, non-Hodgkin's lymphoma, Hodgkin's lymphoma, plasmablastic lymphoma, plasmacytoid dendritic cell neoplasm, Waldenstrom macroglobulinemia, or preleukemia.

In some embodiments, the cancer cells are from a cancer cell line. In some embodiments, the cancer cells are from a xenogeneic source, e.g., from mouse, rat, non-human primate, or pig. In some embodiments, the cancer cells are human cancer cells. In some aspects, the cancer cells are primary cells, such as cells isolated directly from a subject and/or cells isolated from a subject and then frozen. In some embodiments, the initial cancer cell population is homogeneous. In some embodiments, the initial cancer cell population is heterogeneous, such as primary cancer cells, or a population comprising the same cancer cells at mixed stages, or a mixture of cell lines of the same cancer type (e.g., colorectal cancer). In some embodiments, after the cancer cells are collected from the subject, they are sorted, for example using immunomagnetic beads, to obtain a subpopulation of cancer cells. In some embodiments, the cancer cells are obtained directly from a patient after cancer treatment (e.g., with an anticancer agent). It is contemplated in the context of the present invention that cancer cells are collected from patients during their recovery period, either for use as host cancer cells or for testing changes in hit gene expression levels.

In some embodiments, the cancer cells are stage III or IV colorectal cancer cells. In some embodiments, the initial cancer cell population is HCT116 (a human colon cancer cell line). In some embodiments, the initial cancer cell population is SW480 (a human colorectal adenocarcinoma cell line). In some embodiments, the colorectal cancer is any of the following: advanced colon cancer; malignant colon cancer; metastatic colon cancer; stage I, II, III, or IV colon cancer; colon cancer characterized by genomic instability; colon cancer characterized by pathway alterations; colon cancer classified as CCS1, CCS2, or CCS3 under the Colon Cancer Subtype (CCS) system; colon cancer classified as the stem-like, goblet-like, inflammatory, transit-amplifying, or enterocyte subtype under the Colorectal Cancer Assigner (CRCA) system; colon cancer classified as the C1, C2, C3, C4, C5, or C6 subtype under the Colon Cancer Molecular Subtype (CCMS) system; colon cancer classified as the type A, type B, or type C subtype under the CRC Intrinsic Subtype (CRCIS) system; or colon cancer classified as CMS1, CMS2, CMS3, or CMS4 under the Colorectal Cancer Subtyping Consortium (CRCSC) classification system. In some embodiments, the colon cancer has a microsatellite instability (MSI) status of MSI-high or MSI-low. In some embodiments, the cancer cells are obtained from an individual (e.g., a human) who has previously undergone treatment (e.g., chemotherapy, radiation, surgery, or immunomodulatory therapy). In some embodiments, the individual did not respond to the previous treatment (e.g., chemotherapy, radiation, surgery, or immunomodulatory therapy).

Cancer cells, such as the initial cancer cell population or the cancer cell libraries described herein, can be cultured using any suitable method or medium known in the art. See, e.g., Cree, Ian A. (Ed.), "Cancer Cell Culture: Methods and Protocols, 2nd Edition," 2011, Springer Science+Business Media, New York, NY, USA.

Anticancer drug treatment and obtaining cancer cells resistant to the anticancer drug

The methods described herein comprise treating a cancer cell library described herein (e.g., a cancer cell library generated with a mutagen, a Cas9+ sgRNA cancer cell library, or a Cas9+ sgRNA iBAR cancer cell library) with an anticancer drug, and obtaining, from the treated cancer cell library, cancer cells that are resistant to killing by the anticancer drug. In some embodiments, the method comprises contacting the cancer cell library described herein with an anticancer drug, and growing the cancer cell library to obtain a treated cancer cell population. See also Example 1 and Figure 2 for exemplary methods.

Anticancer drugs

Any agent used to treat cancer can be used herein as the anticancer drug. Anticancer drugs include, but are not limited to: anticancer substances for all types and stages of cancer and cancer treatment (chemotherapeutic, proliferative, acute, genetic, spontaneous, etc.), antiproliferative agents, chemosensitizers, anti-inflammatory agents (including steroidal and non-steroidal anti-inflammatory agents and antipyretics), antioxidants, hormones, immunosuppressants, enzyme inhibitors, cell growth inhibitors and anti-adhesion molecules, inhibitors of DNA, RNA, or protein synthesis, anti-angiogenic factors, antisecretory factors, and radioactive agents. In some embodiments, the anticancer drug is a small-molecule drug. In some embodiments, the anticancer drug is an antibody. In some embodiments, the anticancer drug is an antibody-drug conjugate (ADC).

In some embodiments, the anticancer drug is a PARP inhibitor. In some embodiments, the PARP inhibitor is any of the following: talazoparib, veliparib, pamiparib, olaparib, rucaparib, CEP 9722, E7016, iniparib, or 3-aminobenzamide.

Step of contacting the cancer cell library with the anticancer drug

In some embodiments, the anticancer agent treatment (hereinafter also referred to as the "anticancer drug treatment step", "anticancer drug treatment step b)", or "step b)") comprises a single step of contacting the cancer cell library with the anticancer drug. In some embodiments, step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of at least about IC5 (e.g., at least about any of IC10, IC20, IC30, IC40, IC50, IC60, IC70, IC80, IC90, IC95, or higher, or about IC20 to about IC95) for at least about 1 (e.g., at least about any of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, or more) doubling time. "IC50", or half-maximal inhibitory concentration (IC), refers to the concentration of an inhibitory substance (such as an anticancer drug) required to inhibit a given biological process (such as cancer cell proliferation) or biological component (such as cancer cells) by 50% in vitro. Likewise, IC70, or 70% inhibitory concentration, refers herein to the concentration of the anticancer drug required to inhibit 70% of cancer cell proliferation (or to kill 70% of the cancer cells). In some embodiments, a drug toxicity curve is measured before treatment step b) in order to determine the anticancer drug concentration. Briefly, a population of cancer cells (e.g., an unmodified initial cancer cell population) is tested against a series of anticancer drug concentrations, the cells are allowed to grow in the presence of the anticancer drug for several (e.g., 3) doubling times, and the percentage of cell survival or the cell killing rate is then plotted against the anticancer drug concentration to obtain the IC (e.g., IC50, IC70, etc.). In some embodiments, an ATP assay (e.g., the CellTiter-Glo® Luminescent Cell Viability Assay) can be performed to measure the drug toxicity curve. The cell killing rate or death rate can also be measured using any other known method, such as propidium iodide (PI) staining. In some embodiments, the cell culture medium containing the anticancer drug is changed once, twice, 3, 4, 5, 6, or more times per day, or per interval of 2, 3, 4, 5, 6, 7, 8, 9, 10, or more days, so that the anticancer drug is supplied continuously. In some embodiments, the cell culture medium is changed after each doubling time; for example, the medium is changed twice a day when the doubling time is 12 h. In some embodiments, the cell culture medium is changed after at least about 2 doubling times, e.g., after at least about any of 3, 4, 5, 6, 7, or more doubling times. In some embodiments, the cell culture medium containing the anticancer drug is changed every 3 days. For example, the cell culture medium containing the anticancer drug is changed every 3 days when the doubling time is about 20 to 40 h, e.g., about 21 h or about 38 h.
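By way of illustration only, reading an IC value (e.g., IC50 or IC70) off a measured drug toxicity curve can be sketched as an interpolation on the logarithm of concentration. The sketch assumes that survival decreases monotonically with concentration and that the target survival lies within the measured range; a fitted sigmoidal model could be used instead. The function name and the example data are hypothetical.

```python
import math


def inhibitory_concentration(concentrations, survival_percent, x):
    """Concentration giving x% inhibition, i.e. (100 - x)% survival.

    Interpolates linearly in log10(concentration) between the two measured
    points that bracket the target survival. Concentrations must be > 0.
    """
    target = 100.0 - x
    points = sorted(zip(concentrations, survival_percent))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= target >= s_hi and s_lo != s_hi:
            fraction = (s_lo - target) / (s_lo - s_hi)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("target survival is not bracketed by the measured range")


if __name__ == "__main__":
    # Hypothetical toxicity curve: concentration (arbitrary units) vs. % survival.
    doses = [0.01, 0.1, 1.0, 10.0]
    survival = [100.0, 80.0, 20.0, 0.0]
    print("IC50:", inhibitory_concentration(doses, survival, 50))
    print("IC70:", inhibitory_concentration(doses, survival, 70))
```

The same function yields any ICx in the range recited above, provided the dilution series spans the corresponding survival level.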

In some embodiments, anticancer drug treatment step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 (e.g., about any of IC50, IC55, IC60, IC65, IC70, or any value in between) for about 9 to about 10 (e.g., about any of 9, 9.5, 10, or any value in between) doubling times. In some embodiments, anticancer drug treatment step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 (e.g., about any of IC50, IC55, IC60, IC65, IC70, or any value in between) for about 15 to about 16 (e.g., about any of 15, 15.5, 16, or any value in between) doubling times. In some embodiments, anticancer drug treatment step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 (e.g., about any of IC50, IC55, IC60, IC65, IC70, or any value in between) for about 18 to about 19 (e.g., about any of 18, 18.5, 19, or any value in between) doubling times.

In some embodiments, the anticancer drug treatment step comprises (or consists essentially of, or consists of) contacting the cancer cell library with the anticancer agent for at least about 24 h, such as at least about any of 30 h, 36 h, 48 h, 50 h, 52 h, 54 h, 56 h, 58 h, 60 h, 62 h, 64 h, 66 h, 68 h, 70 h, 72 h, 74 h, 76 h, 78 h, 80 h, 84 h, 96 h, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 12 days, 14 days, 16 days, 18 days, 20 days, 24 days, 30 days, or longer. In some embodiments, the anticancer drug treatment step comprises (or consists essentially of, or consists of) contacting the cancer cell library with the anticancer drug for about 6 to about 10 days, about 12 to about 14 days, about 14 to about 16 days, or about 22 to about 26 days.
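By way of illustration only, treatment durations expressed in doubling times and those expressed in hours or days can be related by simple arithmetic. The sketch assumes a constant doubling time over the treatment period, which is a simplification because drug exposure may slow growth; the function names are hypothetical.

```python
def doublings_to_days(n_doublings: float, doubling_time_h: float) -> float:
    """Duration in days of a given number of doubling times."""
    return n_doublings * doubling_time_h / 24.0


def doublings_elapsed(hours: float, doubling_time_h: float) -> float:
    """Number of doubling times that fit into a given number of hours."""
    return hours / doubling_time_h


if __name__ == "__main__":
    # 9 to 10 doubling times at a doubling time of about 21 h.
    print(doublings_to_days(9, 21), doublings_to_days(10, 21))
    # Medium changed every 3 days (72 h) at doubling times of 21 h and 38 h.
    print(doublings_elapsed(72, 21), doublings_elapsed(72, 38))
```

For example, with a doubling time of about 21 h, a medium change every 3 days corresponds to roughly 3.4 doubling times between changes.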

The longer the anticancer drug contact time, and/or the higher the anticancer drug concentration, the harsher the treatment conditions.

In some embodiments, the cancer cells (e.g., those that are insensitive or less sensitive to killing by the anticancer drug) continue to grow during the anticancer drug contacting step. In some embodiments, the cancer cells are passaged every 1, 2, 3, 4, 5, or more (e.g., 3) doubling times, while the same or a similar (e.g., within about 10% difference) library fold coverage is maintained for each hit gene (or for each mutation, sgRNA, or sgRNA iBAR), for continued anticancer drug treatment. In some embodiments, the cancer cells are passaged when they reach about 90% confluence.
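By way of illustration only, the requirement that fold coverage stay the same or similar (e.g., within about 10%) at each passage can be sketched as a check on the number of cells carried forward. The sketch assumes coverage is computed as cells carried divided by the number of library members; the function name and the example numbers are hypothetical.

```python
def coverage_maintained(cells_carried: int, n_library_members: int,
                        target_fold: float, tolerance: float = 0.10) -> bool:
    """True if the coverage after passaging is within `tolerance`
    (as a fraction of the target) of the target fold coverage."""
    actual_fold = cells_carried / n_library_members
    return abs(actual_fold - target_fold) <= tolerance * target_fold


if __name__ == "__main__":
    # Hypothetical library of 10,000 members held at 100-fold coverage.
    print(coverage_maintained(950_000, 10_000, 100))   # 95-fold
    print(coverage_maintained(850_000, 10_000, 100))   # 85-fold
```

Carrying forward too few cells at a passage creates a bottleneck in which library members can be lost by chance rather than by drug selection.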

Step of growing the anticancer drug-treated cancer cell library to obtain a treated cancer cell population

In some embodiments, obtaining, from the anticancer drug-treated cancer cell library, cancer cells that are resistant to the anticancer drug (hereinafter also referred to as the "treated cancer cell population obtaining step", "cancer cell obtaining step c)", or "step c)") comprises a single step of growing the anticancer drug-treated cancer cell library to obtain a treated cancer cell population. In some embodiments, the treated cancer cell population obtained is a population of viable cells, i.e., cells that are resistant to killing by the anticancer drug.

In some embodiments, step b) and the cell growth in step c) may occur simultaneously or overlap; e.g., the drug treatment may overlap with the cell growth period, hereinafter also referred to as the "treatment/growth step". See, e.g., Example 1. For example, in some embodiments, the cancer cell library is contacted with the anticancer drug by providing the anticancer drug in the culture medium (step b)); the cancer cells are allowed to grow (step c)) while being continuously treated with the medium containing the anticancer drug (step b)); the medium containing the anticancer drug may be changed every few hours or days, such as every 3 days (step c)); and the cancer cells are collected after several doubling times (e.g., about 9 to about 10 doubling times, or about 15 to about 16 doubling times) to obtain the post-treatment cancer cell population (step c)). In some embodiments, the cancer cells are passaged every 1, 2, 3, 4, 5, or more (e.g., 3) doubling times, while the same or similar (e.g., within about 10% difference) library fold coverage is maintained for each hit gene (or for each mutation, sgRNA, or sgRNA iBAR) for continued anticancer drug treatment. In some embodiments, the cancer cells are passaged when they reach about 90% confluency.
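
The passaging rule above (passage every few doubling times while keeping library fold coverage within about 10% of the target) can be illustrated with a minimal Python sketch. The function names, the library size of 1,000 constructs, and the 500-fold coverage used in the usage note are hypothetical values chosen for illustration; they are not taken from this disclosure.

```python
def cells_to_keep(library_size: int, fold_coverage: int) -> int:
    """Cells to carry into the next passage for a given fold coverage.

    library_size: number of distinct constructs (e.g., sgRNA iBAR) in the library.
    fold_coverage: desired number of cells per construct (assumed value).
    """
    return library_size * fold_coverage


def coverage_within_tolerance(cells_kept: int, library_size: int,
                              target_coverage: float,
                              tolerance: float = 0.10) -> bool:
    """True if the realized coverage is within `tolerance` of the target.

    The default of 0.10 mirrors the "within about 10% difference" example.
    """
    actual = cells_kept / library_size
    return abs(actual - target_coverage) <= tolerance * target_coverage


def passage_schedule(doubling_time_h: float, doublings_per_passage: int,
                     total_doublings: int) -> list:
    """Hours after the start of treatment at which the cells are passaged."""
    n_passages = total_doublings // doublings_per_passage
    return [doubling_time_h * doublings_per_passage * i
            for i in range(1, n_passages + 1)]
```

For a hypothetical 24 h doubling time, passaging every 3 doublings over 9 doublings gives passages at 72 h, 144 h, and 216 h; with 1,000 constructs at 500-fold coverage, 500,000 cells would be carried forward at each passage.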

In some embodiments, growing the anticancer drug-treated cancer cell library to obtain the post-treatment cancer cell population comprises a "recovery step" after the anticancer drug treatment, i.e., the treated cancer cells are grown in fresh medium without any anticancer drug. Thus, in some embodiments, step c) comprises a recovery step, which comprises growing the treated cancer cells in the absence of the anticancer drug after the anticancer drug contacting step b). In some embodiments, the recovery step comprises growing the cancer cells after the cancer cell library has previously been contacted with the anticancer drug for at least about 24 h, such as at least about any of 26 h, 28 h, 30 h, 32 h, 34 h, 36 h, 38 h, 40 h, 48 h, 52 h, 56 h, 60 h, 64 h, 68 h, 72 h, 78 h, 84 h, 96 h, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 12 days, 14 days, 16 days, 18 days, 20 days, 24 days, 30 days, or longer.

The culture conditions during the "recovery step" should be suitable for cancer cell growth and/or proliferation. In some embodiments, the culture conditions do not induce the cancer cells to develop a particular phenotype during expansion/growth. Such culture conditions are well known in the art, for example, a 37°C, 5% CO2 incubator. See also Cree, Ian A., id. In some embodiments, the medium is complete cancer cell medium. In some embodiments, the culture conditions are the same as those of the cancer cell library before the anticancer drug treatment. The type of medium suitable for successful culture depends on the type of cancer cell. In some embodiments, the medium is further supplemented with a reagent for a selectable marker, e.g., to select for cancer cells that do not lose the transgene or mutation during proliferation.

In some embodiments, "obtaining the post-treatment cancer cell population" comprises (or consists essentially of, or consists of) a single "harvesting step": removing the culture medium (which may contain dead or floating cells) and collecting the cancer cells remaining after the anticancer drug treatment/growth step, or the cancer cells remaining after the recovery step. In some embodiments, the cancer cell harvesting step comprises collecting the treated/grown or recovered cancer cells into a container (e.g., Falcon tubes, EP tubes, or centrifuge tubes) for storage or for subsequent experiments. In some embodiments, the harvesting step comprises washing the obtained cancer cells so that they are in a condition suitable for storage (e.g., at 4°C, -20°C, or -80°C) or for subsequent experiments (e.g., cell lysis, PCR, or sequencing). For example, for adherent cancer cells, after the culture medium (containing dead or floating cells) is removed, the cancer cells remaining in the cell culture vessel (e.g., a cell culture dish) are detached with trypsin and collected (e.g., transferred to a fresh container). The post-treatment cancer cell population obtained will be viable cancer cells, i.e., those that are resistant to killing by the anticancer drug.

Optional enrichment step

If it is desired to obtain a viable (or drug-resistant) population of treated/grown or recovered cancer cells from non-adherent cancer cells (e.g., hematopoietic cancers), or if it is desired to obtain an enriched (or purer) viable population of treated/grown or recovered cancer cells (e.g., from adherent or non-adherent cancer cells), the method of "obtaining the post-treatment cancer cell population" may comprise an "enrichment step", which comprises sorting the cancer cells to obtain a pure population of viable cancer cells. In some embodiments, "obtaining the post-treatment cancer cell population" comprises sorting the treated/grown or recovered cancer cell population to obtain a viable cancer cell population, i.e., a post-treatment cancer cell population resistant to the anticancer drug (hereinafter also referred to as "viable enrichment").

In some embodiments, the enrichment step further comprises staining the treated/grown or recovered cancer cells with a cell viability marker (e.g., a dye) prior to sorting. Methods and reagents for assessing cell viability are well known in the art, e.g., fluorescence-based or colorimetric (enzyme)-based. For example, in membrane permeability-based assays, staining with DAPI, propidium iodide (PI), 7-AAD, or amine-reactive dyes indicates cell death, whereas acridine orange stains viable cells more efficiently. Carboxyfluorescein diacetate (CFDA) is a non-fluorescent, cell-permeable dye that is hydrolyzed by nonspecific intracellular esterases present only in viable cells to form the fluorescent molecule carboxyfluorescein. CFDA-SE is a derivative of CFDA that is better retained in viable cells after hydrolysis. Tetramethylrhodamine ethyl ester (TMRE) and tetramethylrhodamine methyl ester (TMRM) localize to the mitochondria of healthy cells and to the cytoplasm of dead cells. JC-1 is a commonly used membrane potential dye. In healthy cells, JC-1 localizes to the mitochondria, where it forms red fluorescent aggregates. After the mitochondrial membrane potential collapses, JC-1 diffuses throughout the cell and exists as green fluorescent monomers. Incorporation of BrdU into newly synthesized DNA indicates that the cell is viable.

In some embodiments, the enrichment step further comprises staining the treated/grown or recovered cancer cells with propidium iodide (PI) prior to sorting, wherein PI staining indicates cell death. Thus, in some embodiments, the enrichment step comprises sorting for PI-negative (no PI staining) treated/grown or recovered cancer cells, thereby obtaining a (viable) post-treatment cancer cell population resistant to the anticancer drug. Any cell sorting method may be used herein, such as fluorescence-activated cell sorting (FACS), magnetic-activated cell sorting (MACS), microfluidic cell sorting, buoyancy-activated cell sorting, and the like.
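
As a minimal illustration of PI-negative selection, the Python sketch below keeps events whose PI signal falls below a gate. The per-cell intensity list and the numeric threshold are hypothetical; in practice a gate would typically be set on the sorter from unstained and dead-cell controls, which this sketch does not model.

```python
def pi_negative_indices(pi_intensity, threshold):
    """Indices of events whose PI fluorescence is below the gate (PI-negative)."""
    return [i for i, value in enumerate(pi_intensity) if value < threshold]


def viable_fraction(pi_intensity, threshold):
    """Fraction of events that are PI-negative; 0.0 for an empty sample."""
    if not pi_intensity:
        return 0.0
    return len(pi_negative_indices(pi_intensity, threshold)) / len(pi_intensity)
```

With hypothetical intensities `[120.0, 5400.0, 80.0, 950.0]` and a gate at `1000.0`, the second event is treated as PI-positive (dead) and the other three are retained.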

Thus, in some embodiments, the anticancer drug treatment step b) and the cancer cell obtaining step c) comprise: contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (e.g., for about 9 to about 10 doubling times, or about 15 to about 16 doubling times), and harvesting the cancer cells by removing the cell culture medium containing the anticancer drug (and dead floating cells) and collecting the remaining adherent cancer cells (e.g., by trypsinization), thereby obtaining the post-treatment cancer cell population. For adherent cancer cells, most or all of these cells are viable, i.e., resistant to the anticancer drug.

In some embodiments, the anticancer drug treatment step b) and the cancer cell obtaining step c) comprise: contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (e.g., for about 9 to about 10 doubling times, or about 15 to about 16 doubling times); removing the cell culture medium containing the anticancer drug (and dead floating cells) and growing the remaining adherent cancer cells in cell culture medium without the anticancer drug (recovery step); and harvesting the cancer cells by removing the cell culture medium and collecting the remaining adherent cancer cells (e.g., by trypsinization), thereby obtaining the post-treatment cancer cell population. For adherent cancer cells, most or all of these cells are viable, i.e., resistant to the anticancer drug.

In some embodiments, the anticancer drug treatment step b) and the cancer cell obtaining step c) comprise: contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (e.g., for about 9 to about 10 doubling times, or about 15 to about 16 doubling times); optionally removing the cell culture medium containing the anticancer drug (and dead floating cells); staining the remaining cancer cells with a cell viability marker (e.g., PI); and sorting for viable cancer cells (PI-negative, e.g., by FACS), thereby obtaining the post-treatment cancer cell population. The post-treatment cancer cell population obtained is enriched in viable cancer cells, i.e., cells resistant to the anticancer drug. For non-adherent cancer cells (e.g., hematopoietic cancer cells), either the cell culture medium is not removed before staining and sorting, or a centrifugation step is added when removing the cell culture medium so that all cancer cells (a mixture of viable and dead cells) are collected.

In some embodiments, the anticancer drug treatment step b) and the cancer cell obtaining step c) comprise: contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (e.g., for about 9 to about 10 doubling times, or about 15 to about 16 doubling times); removing the cell culture medium containing the anticancer drug (and dead floating cells); growing the remaining adherent cancer cells in cell culture medium without the anticancer drug (recovery step); removing the cell culture medium; staining the remaining cancer cells with a cell viability marker (e.g., PI); and sorting for viable cancer cells (PI-negative, e.g., by FACS), thereby obtaining a post-treatment cancer cell population that is enriched in viable cancer cells and resistant to the anticancer drug.

Optional second treatment step

In some embodiments, the cancer cell library undergoes two treatment steps. In some embodiments, the methods described herein comprise a second treatment step, which comprises contacting the cancer cells from the initial treatment (with or without further culturing in a recovery step, and with or without sorting of viable cancer cells in an enrichment step) with the anticancer drug. In some embodiments, the treatment conditions of the two treatment steps are the same, i.e., the anticancer drug concentration is the same and the treatment duration is the same. In some embodiments, the treatment conditions of the two treatment steps are different. In some embodiments, the second treatment step is harsher than the first treatment step, i.e., in the second treatment step the cancer cells are contacted with a higher concentration of the anticancer drug, such as at least about any of 1.1-fold, 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 15-fold, or 20-fold higher than the concentration in the first treatment step; and/or the cancer cells are contacted with the anticancer drug for a longer time, such as about any of 10 min, 20 min, 30 min, 40 min, 50 min, 1 h, 2 h, 4 h, 6 h, 8 h, 10 h, 12 h, 24 h, 36 h, 48 h, 60 h, 72 h, 84 h, 96 h, 5 days, 6 days, 7 days, 8 days, 9 days, or 10 days longer than the contact time in the first treatment step. In some embodiments, the second treatment step is milder than the first treatment step, i.e., in the second treatment step the cancer cells are contacted with a lower concentration of the anticancer drug, such as at least about any of 1.1-fold, 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 15-fold, or 20-fold lower than the concentration in the first treatment step; and/or the cancer cells are contacted with the anticancer drug for a shorter time, such as about any of 10 min, 20 min, 30 min, 40 min, 50 min, 1 h, 2 h, 4 h, 6 h, 8 h, 10 h, 12 h, 24 h, 36 h, 48 h, 60 h, 72 h, 84 h, 96 h, 5 days, 6 days, 7 days, 8 days, 9 days, or 10 days shorter than the contact time in the first treatment step.

Optional second recovery step

In some embodiments, after the second treatment step (or after an enrichment step for viable cancer cells that follows the second treatment step), the method further comprises an additional recovery step, which comprises growing the cancer cells from the second treatment in fresh medium without any anticancer drug. In some embodiments, the second recovery step has the same culture conditions as the first recovery step, e.g., the same culture duration. In some embodiments, the second recovery step has culture conditions different from those of the first recovery step. In some embodiments, the second recovery step is longer than the first recovery step, such as at least about any of 10 min, 20 min, 30 min, 40 min, 50 min, 1 h, 2 h, 4 h, 6 h, 8 h, 10 h, 12 h, 24 h, 36 h, 48 h, 60 h, 72 h, 84 h, 96 h, 5 days, 6 days, 7 days, 8 days, 9 days, or 10 days longer. In some embodiments, the second recovery step is shorter than the first recovery step, such as at least about any of 10 min, 20 min, 30 min, 40 min, 50 min, 1 h, 2 h, 4 h, 6 h, 8 h, 10 h, 12 h, 24 h, 36 h, 48 h, 60 h, 72 h, 84 h, 96 h, 5 days, 6 days, 7 days, 8 days, 9 days, or 10 days shorter.

Optional second enrichment step

In some embodiments, the cancer cell library undergoes two enrichment steps. In some embodiments, the method of obtaining the post-treatment cancer cell population described herein further comprises sorting the cancer cells after the second treatment/growth or the second recovery to obtain a pure population of viable cancer cells. In some embodiments, the method comprises sorting the cancer cell population after the second treatment/growth or the second recovery to obtain a pure population of viable cancer cells, i.e., a post-treatment cancer cell population resistant to the anticancer drug (hereinafter also referred to as "second viable enrichment"). In some embodiments, the second enrichment method is the same as the first enrichment method, e.g., the cells are labeled with the same cell viability marker (e.g., both stained with PI) and sorted with the same sorting method (e.g., both by FACS). In some embodiments, the second enrichment method differs from the first enrichment method, e.g., the cells are labeled with different cell viability markers (e.g., PI versus DAPI staining in the two enrichment steps, or morphology under the microscope in the second enrichment step), and/or the cells are sorted with different sorting methods (e.g., FACS versus manual sorting, or washing away dead floating cells).

Thus, in some embodiments, the anticancer drug treatment step b) and the cancer cell obtaining step c) comprise: contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (first treatment step, e.g., for about 9 to about 10 doubling times); removing the cell culture medium containing the anticancer drug (and dead floating cells); growing the remaining adherent cancer cells in cell culture medium without the anticancer drug (first recovery step); removing the cell culture medium without the anticancer drug; contacting the remaining cancer cells (adherent cancer cells, most or all of them viable) with the anticancer drug (second treatment step, e.g., for about 15 to about 16 doubling times); removing the cell culture medium containing the anticancer drug (and dead floating cells); growing the remaining adherent cancer cells in cell culture medium without the anticancer drug (second recovery step); and harvesting the cancer cells by removing the cell culture medium (and dead floating cells, if any) and collecting the remaining adherent cancer cells (e.g., by trypsinization), thereby obtaining a post-treatment cancer cell population resistant to the anticancer drug. In some embodiments, between the first treatment step and the first recovery step, between the first recovery step and the second treatment step, between the second treatment step and the second recovery step, and/or between the second recovery step and the harvesting step, the method may comprise one or more enrichment steps, such as staining the cancer cells with a cell viability marker (e.g., PI) and sorting for viable cancer cells (PI-negative, first enrichment step, e.g., by FACS).

In some embodiments, the anticancer drug treatment step b) and the cancer cell obtaining step c) comprise: contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (first treatment step, e.g., for about 9 to about 10 doubling times); removing the cell culture medium containing the anticancer drug (and dead floating cells); growing the remaining adherent cancer cells in cell culture medium without the anticancer drug (first recovery step); removing the cell culture medium without the anticancer drug; contacting the remaining cancer cells (adherent cancer cells, most or all of them viable) with the anticancer drug (second treatment step, e.g., for about 15 to about 16 doubling times); and harvesting the cancer cells by removing the cell culture medium containing the anticancer drug (and dead floating cells) and collecting the remaining adherent cancer cells (e.g., by trypsinization), thereby obtaining a post-treatment cancer cell population resistant to the anticancer drug. In some embodiments, between the first treatment step and the first recovery step, between the first recovery step and the second treatment step, and/or between the second recovery step and the harvesting step, the method may comprise one or more enrichment steps, such as staining the cancer cells with a cell viability marker (e.g., PI) and sorting for viable cancer cells (PI-negative, first enrichment step, e.g., by FACS).

In some embodiments, the anticancer drug treatment step b) and the cancer cell obtaining step c) comprise: contacting the cancer cell library with the anticancer drug while allowing viable cancer cells to grow (first treatment step, e.g., for about 9 to about 10 doubling times); optionally removing the cell culture medium containing the anticancer drug; staining the remaining cancer cells with a cell viability marker (e.g., PI); sorting for viable cancer cells (PI-negative, first enrichment step, e.g., by FACS); optionally growing the sorted viable cancer cells in cell culture medium without the anticancer drug (optional first recovery step); contacting the sorted viable cancer cells with the anticancer drug (second treatment step, e.g., for about 15 to about 16 doubling times) while allowing viable cancer cells to grow; optionally removing the cell culture medium containing the anticancer drug; optionally staining the remaining cancer cells with a cell viability marker (e.g., PI); and sorting for viable cancer cells (PI-negative, second enrichment step, e.g., by FACS), thereby obtaining a post-treatment cancer cell population resistant to the anticancer drug. In some embodiments, the method further comprises, after the second enrichment step, growing the sorted viable cancer cells in cell culture medium without the anticancer drug (optional second recovery step), and then harvesting the cancer cells by removing the cell culture medium (and dead floating cells, if any) and collecting the remaining adherent cancer cells (e.g., by trypsinization).

Hit gene identification

The methods described herein comprise identifying hit genes in the post-treatment cancer cell population resistant to the anticancer drug ("hit gene identification step"). In some embodiments, the hit genes identified from the post-treatment cancer cell population resistant to the anticancer drug are considered to be target genes whose mutation renders the cancer cells sensitive or resistant to the anticancer drug.

In some embodiments, the hit gene identification step comprises: i) identifying, in the post-treatment cancer cell population obtained from the "cancer cell obtaining step c)", sequences comprising the hit gene mutations (e.g., inactivating mutations); and ii) identifying the hit genes corresponding to the sequences comprising the hit gene mutations (e.g., inactivating mutations). In some embodiments, the sequences comprising the hit gene mutations (e.g., inactivating mutations) are identified by sequencing, e.g., PCR sequencing (e.g., Sanger sequencing) or genome sequencing (or DNA-seq, such as next-generation sequencing or "NGS"). For example, in some embodiments, sequences (nucleic acid fragments, PCR fragments, or whole genomes) of the post-treatment cancer cell population resistant to the anticancer drug are determined by sequencing and compared with the wild-type (or healthy individual) genome sequence, or with the genome sequence of the initial cancer cell population, so that sequences comprising hit gene mutations (e.g., inactivating mutations) can be identified and mapped to hit genes. In some embodiments, the hit gene identification step further comprises isolating genomic DNA or RNA from the post-treatment cancer cell population of step c). In some embodiments, the hit gene identification step further comprises PCR amplification of the nucleic acid sequences comprising the hit gene mutations (e.g., inactivating mutations).

In some embodiments, the cancer cell library described herein comprises the sgRNA constructs or the sgRNA iBAR constructs directed against the hit genes described herein. Thus, in some embodiments, the hit gene identification step comprises: i) identifying the sgRNA sequences or sgRNA iBAR sequences in the post-treatment cancer cell population obtained from the "cancer cell obtaining step c)"; and ii) identifying the hit genes corresponding to (i.e., targeted by) the guide sequences of the sgRNA sequences or sgRNA iBAR sequences. In some embodiments, the sgRNA sequences or sgRNA iBAR sequences are identified by RNA sequencing (RNA-seq), e.g., RNA NGS. In some embodiments, the hit gene identification step comprises: i) identifying the nucleic acid sequences encoding the sgRNA or sgRNA iBAR in the post-treatment cancer cell population obtained from the "cancer cell obtaining step c)"; and ii) identifying the hit genes corresponding to the guide sequences encoded by the nucleic acid sequences. In some embodiments, the nucleic acid sequences encoding the sgRNA or sgRNA iBAR are identified by sequencing, e.g., PCR sequencing (e.g., Sanger sequencing) or genome sequencing (DNA-seq), e.g., NGS. In some embodiments, the iBAR sequence can be used to identify the sgRNA iBAR sequence or the nucleic acid sequence encoding the sgRNA iBAR. In some embodiments, the hit gene identification step further comprises isolating genomic DNA or RNA from the post-treatment cancer cell population obtained from the "cancer cell obtaining step c)". In some embodiments, the hit gene identification step further comprises PCR amplification of the nucleic acid sequences encoding the sgRNA or sgRNA iBAR.
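
The two-part identification above (identify the sgRNA or sgRNA iBAR sequences, then map each guide sequence to the gene it targets) can be sketched in Python as a simple counting step. This is a minimal sketch under stated assumptions: each sequencing read is assumed to have already been parsed into a `(guide, ibar)` pair, and the guide-to-gene lookup table, the sequences, and the gene names in the usage note are hypothetical placeholders rather than content of this disclosure.

```python
from collections import Counter


def count_guides(parsed_reads, guide_to_gene):
    """Count reads per (gene, guide, iBAR) combination.

    parsed_reads: iterable of (guide_sequence, ibar_sequence) tuples,
        assumed to come from an upstream read-parsing step.
    guide_to_gene: dict mapping each library guide sequence to its target gene.
    Reads whose guide is not in the library are skipped.
    """
    counts = Counter()
    for guide, ibar in parsed_reads:
        gene = guide_to_gene.get(guide)
        if gene is None:
            continue  # guide not found in the library design
        counts[(gene, guide, ibar)] += 1
    return counts


def counts_per_gene(counts):
    """Collapse (gene, guide, iBAR) counts into total reads per gene."""
    per_gene = Counter()
    for (gene, _guide, _ibar), n in counts.items():
        per_gene[gene] += n
    return per_gene
```

For example, with a hypothetical library `{"ACGTACGT": "GENE_A", "TTGGCCAA": "GENE_B"}`, reads carrying `ACGTACGT` with two different iBARs are counted separately per barcode and then summed under `GENE_A`, so each barcode can serve as an internal replicate for the same guide.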

Methods for DNA-seq, RNA-seq, PCR sequencing (e.g., Sanger sequencing), DNA/RNA extraction, cDNA preparation, and data analysis are well known in the art and may be used herein, as appropriate, to identify hit genes in the post-treatment cancer cell population resistant to the anticancer drug. The sequencing data may be analyzed and aligned to the genome using any method known in the art.

Target gene identification

In some embodiments, the hit genes identified from the post-treatment cancer cell population resistant to the anticancer drug are considered to be target genes in the cancer cells whose mutation renders the cancer cells sensitive or resistant to the anticancer drug. In some embodiments, a hit gene identified from the post-treatment cancer cell population resistant to the anticancer drug (i.e., the viable post-treatment cancer cell population) is a target gene whose mutation (e.g., inactivation) renders the cancer cells resistant to the anticancer drug.

In some embodiments, the hit genes identified from the treated cancer cell population resistant to the anticancer drug are further compared with a control, and/or further ranked and/or filtered according to a predetermined threshold level.

In some embodiments, identifying the target gene comprises: i) obtaining the sequences comprising the hit gene mutations (e.g., inactivating mutations) in the treated cancer cell population obtained in step c); ii) ranking the sequences comprising the hit gene mutations (e.g., inactivating mutations) based on sequence counts; and iii) identifying the hit genes corresponding to the sequences comprising the hit gene mutations (e.g., inactivating mutations) that rank above a predetermined threshold level. In some embodiments, the ranking step comprises adjusting the rank of each sequence comprising a hit gene mutation (e.g., an inactivating mutation) based on data consistency among all sequences comprising hit gene mutations that correspond to the same hit gene (or to the same target site of the same hit gene). For example, data inconsistency (such as different directions of fold change relative to the control) increases the variance of the sequences comprising hit gene mutations (e.g., inactivating mutations) that correspond to the same hit gene, and lowers the rank of that hit gene. In some embodiments, hit genes are identified as those corresponding to sequences comprising hit gene mutations (e.g., inactivating mutations) that consistently rank better than expected among the ranked sequences under the null hypothesis of the RRA or α-RRA algorithm. In some embodiments, the predetermined threshold level is an FDR of value "X" (e.g., 0.1), and hit genes that correspond to sequences comprising the hit gene mutations (e.g., inactivating mutations) and have FDR ≤ "X" are identified as the target genes. In some embodiments, the predetermined threshold level is an "X"-fold (e.g., about 2-fold) enrichment or depletion, and hit genes that correspond to sequences comprising the hit gene mutations (e.g., inactivating mutations) and have an enrichment or depletion ≥ "X"-fold are identified as the target genes. In some embodiments, the sequences comprising the hit gene mutations (e.g., inactivating mutations) are identified by sequencing, e.g., Sanger sequencing or genome sequencing (or DNA-seq, such as NGS).

In some embodiments, the cancer cell library described herein comprises the sgRNA constructs or sgRNA iBAR constructs directed against the hit genes described herein. Accordingly, in some embodiments, identifying the target gene comprises: i) obtaining the sgRNA sequences or sgRNA iBAR sequences in the treated cancer cell population obtained in step c); ii) ranking the corresponding guide sequences of the sgRNA sequences or sgRNA iBAR sequences based on sequence counts; and iii) identifying the hit genes corresponding to guide sequences that rank above a predetermined threshold level. In some embodiments, the ranking comprises adjusting the rank of each guide sequence of the sgRNA sequences or sgRNA iBAR sequences based on data consistency among all guide sequences corresponding to the same hit gene (or to the same target site of the same hit gene). For example, data inconsistency (such as different directions of fold change relative to the control) increases the variance of the guide sequences corresponding to the same hit gene, and lowers the rank of that hit gene. In some embodiments, hit genes are identified as those corresponding to guide sequences that consistently rank better than expected among the ranked guide sequences under the null hypothesis of the RRA or α-RRA algorithm. In some embodiments, the predetermined threshold level is an FDR of value "X" (e.g., 0.1), and hit genes that correspond to guide sequences and have FDR ≤ "X" are identified as the target genes. In some embodiments, the predetermined threshold level is an "X"-fold (e.g., about 2-fold) enrichment or depletion, and hit genes that correspond to guide sequences and have an enrichment or depletion ≥ "X"-fold are identified as the target genes. In some embodiments, the sgRNA sequences or sgRNA iBAR sequences are identified by RNA-seq, e.g., RNA NGS. In some embodiments, the nucleic acid sequences encoding the sgRNA or sgRNA iBAR are identified by genome sequencing (DNA-seq), e.g., NGS.
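
The FDR threshold described above can be illustrated with a minimal sketch. The text does not say how the FDR is computed; the sketch assumes per-gene p-values are already available and adjusts them with the Benjamini-Hochberg procedure, which is one common choice. The gene names, p-values, and the function name `benjamini_hochberg` are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR estimates).

    Assumption: the source only requires "FDR <= X"; BH is used here
    as one common way to obtain an FDR from per-gene p-values.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a ranking step.
genes = ["GENE_1", "GENE_2", "GENE_3", "GENE_4"]
p_values = [0.001, 0.04, 0.03, 0.5]

fdr = benjamini_hochberg(p_values)
X = 0.1  # the example threshold value given in the text
target_genes = [g for g, q in zip(genes, fdr) if q <= X]
```

With these illustrative numbers, the first three genes fall at or below the FDR threshold of 0.1 and the fourth does not.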

In some embodiments, the cancer cell library described herein comprises the sgRNA iBAR constructs directed against the hit genes described herein. In some embodiments, identifying the target gene comprises: i) obtaining the sgRNA iBAR sequences in the treated cancer cell population obtained in step c); ii) ranking the corresponding guide sequences of the sgRNA iBAR sequences based on sequence counts, wherein the ranking comprises adjusting the rank of each guide sequence based on data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to that guide sequence; and iii) identifying the hit genes corresponding to guide sequences that rank above a predetermined threshold level. In some embodiments, hit genes are identified as those corresponding to guide sequences that consistently rank better than expected among the ranked guide sequences under the null hypothesis of the RRA or α-RRA algorithm. In some embodiments, the predetermined threshold level is an FDR of value "X" (e.g., 0.1), and hit genes that correspond to guide sequences and have FDR ≤ "X" are identified as the target genes. In some embodiments, the predetermined threshold level is at least about 2-fold enrichment or depletion.

In some embodiments, the sequence counts of the sequences comprising the hit gene mutations (e.g., inactivating mutations) or of the guide RNAs are determined from statistical analysis. In some embodiments, the sequence counts of the guide RNAs and the corresponding iBAR sequences are determined from statistical analysis. See Figure 3 for an exemplary target gene identification workflow. Statistical methods can be used to determine the identity of the sequences comprising the hit gene mutations (e.g., inactivating mutations), or of the sequences of the sgRNA molecules or sgRNA iBAR molecules, that are enriched or depleted in the treated cancer cell population. In some embodiments, more than one (e.g., 2, 3, or more) biological or technical replicates are performed for the cancer cell library treated with the anticancer drug. In some embodiments, more than one (e.g., 2, 3, or more) biological or technical replicates are performed for the control cancer cell population. In some embodiments, the sequences comprising the hit gene mutations (e.g., inactivating mutations) or the guide RNAs from 2 or more (e.g., 2, 3, or more) replicates of the anticancer drug treatment group (or control group) are combined to calculate the mean and variance across the replicates of that group. Exemplary statistical methods include, but are not limited to, linear regression, generalized linear regression, and hierarchical regression. In some embodiments, the sequence counts are subjected to a normalization method, such as total count normalization or median ratio normalization. In some embodiments, e.g., for positive screening, median ratio normalization is preferred. In some embodiments, e.g., for sequence counts following a normal distribution, the sequence counts are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, MAGeCK (Li, W. et al. MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol 15, 554 (2014)) is used to rank the sequences comprising the hit gene mutations (e.g., inactivating mutations) or the guide RNA sequences, and/or to identify the target genes. In some embodiments, MAGeCKiBAR (Zhu et al., Genome Biol. 2019; 20:20) is used to rank the sequences comprising the hit gene mutations (e.g., inactivating mutations) or the guide RNA sequences, and/or to identify the target genes.
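
As an illustration of the median ratio normalization mentioned above, the following is a minimal sketch of one common formulation (a median-of-ratios size factor per sample). Whether this matches the exact normalization intended in the text or implemented in MAGeCK is an assumption; the function names, sample names, and counts are hypothetical.

```python
import math
from statistics import median


def median_ratio_size_factors(counts):
    """counts: dict mapping sample name -> list of read counts,
    with guides in the same order in every sample.

    Returns one size factor per sample: the median, over guides, of
    (count in this sample) / (geometric mean of that guide across samples).
    Guides with a zero count in any sample are skipped; at least one
    guide must have non-zero counts in all samples.
    """
    samples = list(counts)
    n_guides = len(counts[samples[0]])
    log_geo_means = []
    for i in range(n_guides):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][i]) - g
            for i, g in enumerate(log_geo_means)
            if g is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


def normalize(counts):
    """Divide every count by its sample's size factor."""
    factors = median_ratio_size_factors(counts)
    return {s: [c / factors[s] for c in counts[s]] for s in counts}


# Hypothetical counts: "treated" is sequenced about twice as deep as
# "control", and the last guide is strongly enriched after treatment.
raw = {
    "control": [100, 200, 300, 400],
    "treated": [200, 400, 600, 8000],
}
normalized = normalize(raw)
```

Because the size factor is a median, the single strongly enriched guide does not distort the scaling: after normalization the first three guides have equal counts in both samples, while the last guide remains about 10-fold higher in the treated sample.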

In some embodiments, the target genes whose mutation renders the cancer cell sensitive or resistant to the anticancer drug are identified based on the difference between the profiles of the sgRNAs (or sgRNA iBARs) or of the hit gene mutations in the treated cancer cell population and in the control cancer cell population. In some embodiments, the target genes are identified based on the difference between the profiles of the hit gene mutations in the treated cancer cell population and in the control cancer cell population. In some embodiments, the target genes are identified based on the difference between the profiles of the sgRNAs (or sgRNA iBARs) in the treated cancer cell population and in the control cancer cell population. In some embodiments, the control cancer cell population is obtained from the cancer cell library cultured under the same conditions without contact with the anticancer drug. In some embodiments, the profiles of the sgRNAs (or sgRNA iBARs) or of the hit gene mutations in the treated cancer cell population and the control cancer cell population are identified by next-generation sequencing (NGS), such as DNA-seq or RNA-seq. In some embodiments, the profile of the sgRNAs (or sgRNA iBARs) comprises the sequence counts of the sgRNAs (or sgRNA iBARs), or the sequence counts of the corresponding guide sequences of the sgRNAs (or sgRNA iBARs). In some embodiments, the profile of the sgRNAs (or sgRNA iBARs) comprises the sequence counts of the nucleic acids encoding the sgRNAs (or sgRNA iBARs), or the sequence counts of the nucleic acids encoding the guide sequences of the corresponding sgRNAs (or sgRNA iBARs). In some embodiments, the profile of the hit gene mutations comprises the sequence counts of the sequences comprising the hit gene mutations. In some embodiments, the methods described herein further comprise culturing the same cancer cell library under the same conditions without contact with the anticancer drug.
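
A profile of sequence counts, as described above, can be sketched as a simple tally. The sketch assumes reads have already been parsed into (guide, iBAR) pairs by an upstream step that is not shown; the function name `count_profile`, the guide labels, and the barcodes are hypothetical.

```python
from collections import Counter


def count_profile(parsed_reads):
    """Tally sequence counts per guide and per (guide, iBAR) pair.

    parsed_reads: iterable of (guide, ibar) tuples, one per sequencing read.
    Assumption: read parsing and guide/iBAR matching happen upstream.
    """
    parsed_reads = list(parsed_reads)
    per_ibar = Counter(parsed_reads)
    per_guide = Counter(guide for guide, _ in parsed_reads)
    return per_guide, per_ibar


# Hypothetical parsed reads from one sample.
reads = [("g1", "AAA"), ("g1", "CCC"), ("g1", "AAA"), ("g2", "GGG")]
per_guide, per_ibar = count_profile(reads)
```

Computing the same tallies for the treated and control populations gives the two profiles whose difference is then analyzed.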

In some embodiments, the sequence counts obtained from the treated cancer cell population of step c) (e.g., sequence counts of the sgRNA or sgRNA iBAR or its guide sequence, sequence counts of the nucleic acid encoding the sgRNA or sgRNA iBAR or its guide sequence, or sequence counts of the sequences comprising the hit gene mutations) are compared with the corresponding sequence counts obtained from a control cancer cell population or control cancer cell library, e.g., to provide fold changes (e.g., the actual fold change, or a value derived from the fold change such as the log2 or log10 fold change), for significance testing (e.g., FDR, p-value), for distribution statistics, and/or to provide a ranking of genes or sequences by scoring and/or derivation. In some embodiments, the control cancer cell population is obtained from the cancer cell library cultured under the same conditions without contact with the anticancer drug, e.g., cultured continuously under the same culture conditions, from the start of the test to final sample collection, for the same time as the test group (anticancer drug treatment group). In some embodiments, the control cancer cell population is the whole of the same cancer cell library cultured under the same conditions, without anticancer drug treatment and without any of the selection, recovery, or obtaining methods of step b) and step c), hereinafter also referred to as the "control cancer cell library". In some embodiments, the control cancer cell population is obtained from the same cancer cell library cultured under the same conditions, without anticancer drug treatment but with the same obtaining method as in step c).
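
The comparison of treated and control sequence counts can be illustrated with a log2 fold change calculation. This is a minimal sketch: the pseudocount of 1 (used to avoid division by zero) is an assumption not taken from the text, counts are assumed to be already normalized, and the guide labels and counts are hypothetical.

```python
import math


def log2_fold_changes(treated, control, pseudocount=1.0):
    """Return log2((treated + pseudocount) / (control + pseudocount)) per guide.

    treated, control: dicts mapping guide name -> normalized sequence count.
    Guides absent from the treated sample are counted as 0.
    """
    return {
        guide: math.log2(
            (treated.get(guide, 0) + pseudocount) / (control[guide] + pseudocount)
        )
        for guide in control
    }


# Hypothetical normalized counts.
control = {"sgA": 99, "sgB": 199, "sgC": 399}
treated = {"sgA": 399, "sgB": 199, "sgC": 49}

lfc = log2_fold_changes(treated, control)
# Rank guides from most enriched to most depleted.
ranking = sorted(lfc, key=lfc.get, reverse=True)
```

Here sgA is enriched 4-fold (log2 fold change 2), sgB is unchanged, and sgC is depleted 8-fold (log2 fold change -3).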

In some embodiments, the methods described herein further comprise culturing the same cancer cell library under the same conditions without contact with the anticancer drug, and optionally subjecting it to the same obtaining method as in "cancer cell obtaining step c)" to obtain a control cancer cell population, wherein a hit gene corresponding to a sequence comprising the hit gene mutation (e.g., an inactivating mutation), or to a guide sequence of the sgRNA or sgRNA iBAR, that is identified as present in the control cancer cell population or control cancer cell library but identified as absent from the treated cancer cell population obtained in step c) is identified as a target gene. For example, for a cancer cell library comprising mutations A, B, and C in separate cancer cells, if only mutation A is identified in the treated cancer cell population, the absence of mutations B and C from that treated cancer cell population indicates that hit genes B and C are target genes, e.g., genes that, when mutated, confer sensitivity to killing by the anticancer drug.
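
The presence/absence comparison in the A, B, C example can be sketched as a set difference. The read counts and the detection cutoff `min_reads` are hypothetical; the text does not specify how presence is called.

```python
def present(counts, min_reads=1):
    """Return the mutations (or guides) detected with at least min_reads reads.

    Assumption: a simple read-count cutoff decides presence.
    """
    return {name for name, n in counts.items() if n >= min_reads}


# Hypothetical counts for a library carrying mutations A, B, and C.
control_counts = {"A": 120, "B": 95, "C": 110}
treated_counts = {"A": 340, "B": 0, "C": 0}

# Present in the control but absent after treatment: candidate
# sensitivity genes, as in the A/B/C example.
candidates = present(control_counts) - present(treated_counts)
```

With these numbers, `candidates` contains B and C, matching the example in the text.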

In some embodiments, the treated cancer cell population obtained consists of viable cancer cells that are resistant to the anticancer drug. In some embodiments, identifying the target gene comprises comparing the sequence counts of the sgRNA (or of the sgRNA iBAR or its guide sequence, or of the nucleic acid encoding the sgRNA or sgRNA iBAR or its guide sequence) obtained from the treated cancer cell population with the sequence counts of the sgRNA (or of the sgRNA iBAR or its guide sequence, or of the nucleic acid encoding the sgRNA or sgRNA iBAR or its guide sequence) obtained from the control cancer cell population, wherein: i) a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as enriched in the treated cancer cell population (e.g., viable and resistant to the anticancer drug) compared with the control cancer cell population, with FDR ≤ 0.1 (e.g., FDR ≤ any one of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold enrichment, such as any one of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more enrichment), is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA (or sgRNA iBAR) guide sequence is identified as depleted in the treated cancer cell population (e.g., viable and resistant to the anticancer drug) compared with the control cancer cell population, with FDR ≤ 0.1 (e.g., FDR ≤ any one of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold depletion, such as any one of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more depletion), is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, the sequence counts of the sgRNA (or of the sgRNA iBAR or its guide sequence, or of the nucleic acid encoding the sgRNA or sgRNA iBAR or its guide sequence) are subjected to median ratio normalization followed by mean-variance modeling. In some embodiments, identifying the target gene comprises comparing the sequence counts of the hit gene mutations obtained from the treated cancer cell population with the sequence counts of the hit gene mutations obtained from the control cancer cell population, wherein: i) a hit gene whose corresponding hit gene mutation sequence is identified as enriched in the treated cancer cell population (e.g., viable and resistant to the anticancer drug) compared with the control cancer cell population, with FDR ≤ 0.1 (e.g., FDR ≤ any one of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold enrichment, such as any one of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more enrichment), is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or ii) a hit gene whose corresponding hit gene mutation sequence is identified as depleted in the treated cancer cell population (e.g., viable and resistant to the anticancer drug) compared with the control cancer cell population, with FDR ≤ 0.1 (e.g., FDR ≤ any one of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold depletion, such as any one of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more depletion), is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug. In some embodiments, the sequence counts of the hit gene mutations are subjected to median ratio normalization followed by mean-variance modeling.
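
The enriched/depleted classification described above can be sketched as a small decision function. The text allows the FDR and fold-change criteria to be combined as "and/or"; this sketch requires both, which is only one of the permitted readings. The function name `classify_hit` and its return labels are hypothetical.

```python
import math


def classify_hit(log2_fc, fdr, max_fdr=0.1, min_fold=2.0):
    """Classify a hit gene from its log2 fold change (treated vs. control) and FDR.

    Assumption: both the FDR cutoff and the fold-change cutoff must be met.
    Enriched in the surviving population -> mutation confers resistance;
    depleted -> mutation confers sensitivity.
    """
    if fdr > max_fdr or abs(log2_fc) < math.log2(min_fold):
        return "not called"
    return "resistance" if log2_fc > 0 else "sensitivity"


# Hypothetical values for four hit genes.
calls = {
    "GENE_1": classify_hit(2.0, 0.01),   # 4-fold enriched, significant
    "GENE_2": classify_hit(-1.5, 0.05),  # about 2.8-fold depleted, significant
    "GENE_3": classify_hit(3.0, 0.20),   # enriched but FDR too high
    "GENE_4": classify_hit(0.5, 0.01),   # significant but below 2-fold
}
```

Using the log2 scale makes enrichment and depletion symmetric: a 2-fold cutoff corresponds to a log2 fold change of at least 1 in absolute value.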

In some embodiments, the sgRNA library is an sgRNA iBAR library. In some embodiments, the variance of each guide sequence is adjusted based on data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to that guide sequence. In some embodiments, the variance of each guide sequence, or of each sequence comprising a hit gene mutation (e.g., an inactivating mutation), is adjusted based on data consistency within the same gene. As used herein, "data consistency" refers to, in a screening experiment, the consistency of the sequencing results (e.g., sequence counts, normalized sequence counts, ranks, or fold changes) of the same guide sequence corresponding to different iBAR sequences; or the consistency of the sequencing results of different hit gene mutations such as inactivating mutations (e.g., at different target sites of the same hit gene) or of different sgRNA sequences corresponding to the same gene. In theory, true hits from a screen should behave similarly in biologically relevant respects, such as similar normalized sequence counts, ranks, and/or fold changes for sgRNA iBAR constructs that have the same guide sequence but different iBARs; and/or similar normalized sequence counts, ranks, and/or fold changes for different hit gene mutation sequences such as inactivating mutation sequences (e.g., at different target sites of the hit gene) or different sgRNA sequences that correspond to the same gene. See also WO2020125762 for how to perform mean-variance modeling and how to adjust the variance of each guide sequence based on data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to that guide sequence.

In some embodiments, data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to each guide sequence is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of the guide sequence is increased if the fold changes of the iBAR sequences are in different directions relative to each other (e.g., increased versus decreased, increased versus unchanged, or decreased versus unchanged are all regarded as different directions). In some embodiments, data consistency among different hit gene mutation (e.g., inactivating mutation) sequences or different sgRNA sequences corresponding to the same gene is determined based on the direction of the fold change of each hit gene mutation (e.g., inactivating mutation) sequence or each sgRNA sequence, wherein the variance of the hit gene mutation (e.g., inactivating mutation) sequences or guide sequences is increased if the fold changes of the different hit gene mutation sequences or different sgRNA sequences are in different directions relative to each other. The increase in variance caused by such data inconsistency helps exclude hit gene mutation (e.g., inactivating mutation), sgRNA, or sgRNA iBAR sequences that are rare but significantly altered in positive screens under high-MOI conditions. For example, in the iBAR system, because of the high MOI during library construction, there may be "free riders", i.e., false-positive sgRNAs associated with the sgRNAs against true-positive hit genes. "Free riders" as used herein refers to sgRNAs that target irrelevant sequences (e.g., irrelevant hit genes) and enter the same cancer cell as, but are unrelated to, an sgRNA targeting a true-positive hit gene. In some embodiments, the variance of the sgRNA iBARs is modified based on the enrichment directions of the different iBARs for each guide sequence within a set of sgRNA iBAR constructs. If all iBARs of a set of sgRNA iBAR constructs (i.e., all iBARs corresponding to the same guide sequence) show the same direction of fold change, i.e., all greater than or all less than the control, the variance of that set of sgRNA iBAR constructs (or of the guide sequence) is unchanged. If the iBARs of a set of sgRNA iBAR constructs (or the iBARs corresponding to the same guide sequence) show inconsistent directions of fold change compared with the control, the corresponding guide sequence is penalized by increasing its variance. In some embodiments, the final adjusted variance for an inconsistent sgRNA iBAR is the model-estimated variance (e.g., from mean-variance modeling) plus the experimental variance calculated from the anticancer drug-treated samples and the controls. In some embodiments, a hit gene comprises 2 or more (e.g., 2, 3, 4, 5, or more, such as 3) hit gene mutations (e.g., inactivating mutations), or a hit gene is targeted at different target sites by 2 or more (e.g., 2, 3, 4, 5, or more, such as 3) different guide sequences (e.g., 2 or more different sgRNAs, or 2 or more sets of sgRNA iBAR constructs, each set comprising a guide sequence targeting a different target site). In some embodiments, data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to each guide sequence and to the same hit gene is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of a guide sequence is increased if the fold changes of its iBAR sequences are in different directions relative to each other, and the variance of the guide sequences (or of the hit gene) is further increased if the fold changes of two or more (e.g., 2, 3, 4, 5, or more, such as 3) different guide sequences targeting the same hit gene are in different directions relative to each other. For example, for sgRNA A and sgRNA B targeting different target sites of the same hit gene X, if the guide sequences of sgRNA A and sgRNA B are both enriched or both depleted compared with the control, the variance of each guide sequence or of the hit gene is unchanged; if the guide sequence of sgRNA A is enriched while the guide sequence of sgRNA B is depleted compared with the control, the variance of each guide sequence or of the hit gene is increased. In some embodiments, data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to the same hit gene is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of each guide sequence targeting the same hit gene (or the variance of the hit gene) is increased if the fold changes of the iBAR sequences corresponding to the same hit gene are in different directions relative to each other. For example, if 3 sets of sgRNA iBARs (4 sgRNA iBARs per set) target 3 different target sites of the same hit gene, the variance of all 3 guide sequences is unchanged if all 12 iBAR sequences are identified as enriched compared with the control, and the variance of all 3 guide sequences is increased if some iBAR sequences are identified as enriched while others are identified as unchanged or depleted compared with the control.
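
The direction-based variance adjustment for one guide sequence can be sketched as follows. The sketch assumes the model-estimated variance and the experimental variance are supplied by earlier steps that are not shown; how each is computed is not specified here. The tolerance `tol` for calling a fold change "unchanged" and the function names are hypothetical.

```python
def direction(log2_fc, tol=0.0):
    """Return +1 (increased), -1 (decreased), or 0 (unchanged).

    Assumption: tol is a hypothetical tolerance band around zero.
    """
    if log2_fc > tol:
        return 1
    if log2_fc < -tol:
        return -1
    return 0


def adjusted_variance(ibar_log2_fcs, model_variance, experimental_variance, tol=0.0):
    """Penalize a guide whose iBARs disagree on fold-change direction.

    All iBARs in the same direction: variance is left unchanged.
    Any disagreement (including increased vs. unchanged): the final
    variance is the model-estimated variance plus the experimental variance.
    """
    directions = {direction(fc, tol) for fc in ibar_log2_fcs}
    if len(directions) == 1:
        return model_variance
    return model_variance + experimental_variance


# Hypothetical guide with 4 iBARs, all enriched: no penalty.
consistent = adjusted_variance([1.2, 0.8, 2.0, 1.5], 0.5, 0.25)

# Hypothetical guide with mixed directions (up, down, up, unchanged): penalized.
inconsistent = adjusted_variance([1.2, -0.4, 2.0, 0.0], 0.5, 0.25)
```

A larger variance lowers the guide's significance in later testing, which is how an inconsistent "free rider" guide is pushed down the ranking.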

In some embodiments, sequences comprising hit gene mutations (e.g., inactivating mutations) at different target sites of the same hit gene, for which the fold changes of that hit gene at the respective target sites are in different directions; sgRNAs or sgRNA iBARs targeting different target sites of the same hit gene, for which the fold changes of that hit gene at the respective target sites are in different directions; or sgRNAs for which the fold changes of the respective iBARs are in different directions, can be penalized by increasing the variance, resulting in a lower score and rank for certain hit genes. For example, if 3 sets of sgRNA iBARs (4 sgRNA iBARs per set) target 3 different target sites of the same hit gene, and all 12 iBAR sequences are identified as enriched compared with the control, the hit gene has low variance and therefore a high rank and/or score (e.g., a highly ranked drug-sensitive gene with a higher sensitivity score); if some iBAR sequences are identified as enriched while others are identified as unchanged or depleted compared with the control, the hit gene has high variance and therefore a low rank and/or score (e.g., a low-ranked drug-resistant gene with a low resistance score).

Within a set of sgRNA iBAR constructs, the ranking of the guide sequence can be adjusted based on the consistency of the enrichment direction among a predetermined threshold number x of the different iBAR sequences in the set, where x is an integer from 1 to y. For example, if at least x iBAR sequences of the sgRNA iBAR set exhibit the same direction of fold change, i.e., all are greater than or all are less than that of the control cancer cell population, then the rank (or variance) of the guide sequence is unchanged. However, if more than y-x of the different iBAR sequences are inconsistent in the direction of fold change, the sgRNA iBAR set is penalized by lowering its rank, e.g., by increasing its variance. In some embodiments, the ranking of the sequences comprising hit gene mutations (e.g., inactivating mutations) or of the guide sequences can be adjusted (or further adjusted) based on the consistency of the enrichment direction among a predetermined threshold number x of the different hit gene mutations (e.g., inactivating mutations) or different guide sequences corresponding to the same hit gene, where x is an integer from 1 to y. For example, if at least x hit gene mutations (e.g., inactivating mutations) or x guide sequences corresponding to the same hit gene exhibit the same direction of fold change, i.e., all are greater than or all are less than that of the control cancer cell population, then the rank (or variance) is unchanged. However, if more than y-x of the different hit gene mutations (e.g., inactivating mutations) or more than y-x of the different guide sequences are inconsistent in the direction of fold change, the sequences comprising the hit gene mutations (e.g., inactivating mutations) or the guide sequences are penalized by lowering their rank, e.g., by increasing their variance.
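The threshold rule above can be sketched as follows. This is only an illustrative reading of the text: a set of y fold changes is treated as consistent when at least x of them point the same way, and is otherwise penalized. The `penalty` factor and the function names are hypothetical.

```python
def is_consistent(log2_fcs, x):
    """True if at least x of the y fold changes share one direction (up or down)."""
    up = sum(1 for v in log2_fcs if v > 0)
    down = sum(1 for v in log2_fcs if v < 0)
    return max(up, down) >= x

def adjust_variance(variance, log2_fcs, x, penalty=2.0):
    """Leave the variance unchanged when consistent; otherwise inflate it.

    `penalty` is a hypothetical factor chosen for illustration.
    """
    return variance if is_consistent(log2_fcs, x) else variance * penalty

# y = 4 iBARs for one guide sequence, threshold x = 3 (made-up values)
print(is_consistent([1.0, 0.8, 1.3, -0.2], x=3))   # three of four are enriched
print(is_consistent([1.0, -0.8, 1.3, -0.2], x=3))  # only two agree in either direction
```

With x = y the rule demands full agreement; smaller values of x tolerate a few discordant iBARs before the penalty applies.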

In some embodiments, the P-value of each sequence comprising a hit gene mutation (e.g., an inactivating mutation), or of each guide sequence of an sgRNA or sgRNA iBAR, is calculated using the mean and variance (e.g., experimental variance, model-estimated variance, or variance adjusted for data inconsistency) of the treatment group compared with the control group.
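To make the role of the variance concrete, the sketch below computes a two-sided P-value from a treatment mean, a control mean, and a variance. Note the substitution: it uses a simple normal approximation rather than the negative binomial model mentioned elsewhere in this document, and the function name and inputs are assumptions for illustration.

```python
from math import erfc, sqrt

def two_sided_p(treated_mean, control_mean, variance):
    """Two-sided P-value under a normal approximation (an illustrative stand-in)."""
    z = (treated_mean - control_mean) / sqrt(variance)
    return erfc(abs(z) / sqrt(2))

# Same difference in means, but the second call uses an inflated variance,
# as would happen after a penalty for inconsistent iBAR directions.
print(two_sided_p(7.0, 5.0, 1.0))
print(two_sided_p(7.0, 5.0, 4.0))
```

Because the variance sits in the denominator of the test statistic, inflating it for inconsistent data makes the same difference in means less significant, which is how a variance penalty translates into a lower rank.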

Robust Rank Aggregation (RRA; Kolde R et al. Bioinformatics. 2012;28:573–580) or a modified RRA (e.g., α-RRA in MAGeCK; Li W et al. Genome Biol. 2014;15:554) is one of the tools available in the art for statistics and ranking. It detects genes that are ranked consistently better than expected under the null hypothesis of uncorrelated inputs, assigns a significance score to each gene, and merges the ranked lists into a single ranking. It assumes that all informative normalized ranks come from a distribution strongly skewed toward zero, and detects such distributions by computing binomial probabilities under an assumed uniform distribution of ranks. In addition, a P-value is assigned to each element in the aggregated list; it is used to rank the genes and describes how much better an element is ranked than expected, so that randomly ranked genes are less significant. The underlying probabilistic model makes the RRA algorithm parameter-free and robust to outliers, noise, and errors. The significance score also provides a rigorous way to keep only statistically relevant genes in the final list. These properties make the method robust and attractive in many settings. Briefly, in RRA and α-RRA, for each sequence comprising a hit gene mutation (e.g., an inactivating mutation), each sgRNA guide sequence, or each sgRNA iBAR guide sequence corresponding to a hit gene (hereinafter also referred to as "hit gene mutation (e.g., inactivating mutation)/sgRNA guide sequence/sgRNA iBAR guide sequence") (e.g., when there are 3 sgRNAs targeting the same hit gene), the algorithm looks at how the sequence is positioned in the normalized ranked list of all hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences obtained from the post-treatment cancer cell population or the control cancer cell population/control cancer cell library, and compares this with the baseline case in which all hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences are randomly reordered ("permuted sequences"). As a result, a P-value is assigned to the hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences corresponding to each hit gene, showing how much better their positions in the ranked list are than expected by chance. This P-value is used to re-rank the hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences corresponding to the hit genes and to determine their significance. Those skilled in the art will appreciate that other tools can also be used for statistics and ranking. In some embodiments, RRA or α-RRA is used to calculate a final score for each hit gene, so as to obtain a ranking of the hit genes based on the mean and variance (e.g., adjusted variance) of each hit gene.
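The core of the RRA score can be sketched in a few lines. Under the uniform null, the k-th smallest of n normalized ranks follows a Beta(k, n-k+1) distribution, whose cumulative probability equals a binomial tail; the score is the smallest such probability over k. This is a simplified sketch only: it omits the multiple-testing correction of the published RRA method and the α cutoff and permutation step of α-RRA, and the function names are illustrative.

```python
from math import comb

def order_stat_cdf(r, k, n):
    """P(k-th smallest of n uniform ranks <= r) = P(at least k of n uniforms <= r)."""
    return sum(comb(n, j) * r**j * (1 - r) ** (n - j) for j in range(k, n + 1))

def rho_score(normalized_ranks):
    """Minimum order-statistic probability; smaller means more skewed toward zero."""
    ranks = sorted(normalized_ranks)
    n = len(ranks)
    return min(order_stat_cdf(r, k, n) for k, r in enumerate(ranks, start=1))

# Normalized ranks (rank / list length) of 3 guides for two hypothetical genes
print(rho_score([0.01, 0.02, 0.03]))  # all guides near the top of the list
print(rho_score([0.25, 0.50, 0.75]))  # guides spread evenly, as expected by chance
```

A gene whose guides cluster near the top of the ranked list obtains a much smaller score than a gene whose guides are scattered, which matches the statement that a lower RRA score indicates stronger selection.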

In some embodiments, the sequences comprising the hit gene mutations (e.g., inactivating mutations), the sgRNA guide sequences, or the sgRNA iBAR guide sequences (hit gene mutation (e.g., inactivating mutation)/sgRNA guide sequence/sgRNA iBAR guide sequence) are ranked based on P-values calculated using the mean and variance (e.g., variance adjusted for data inconsistency) from a negative binomial (NB) distribution model, which is used to estimate the probability of each hit gene mutation (e.g., inactivating mutation)/sgRNA guide sequence/sgRNA iBAR guide sequence across biological/experimental replicates and between the treatment group and the control group. The RRA or α-RRA algorithm is then applied to identify positively or negatively selected hit genes corresponding to the top-ranked (e.g., the top α%, such as the top 5%) hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences. A lower RRA score corresponds to stronger enrichment of the hit gene. In some embodiments, top-ranked hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences with a P-value below a threshold (e.g., P-value < 0.25) are selected, and the corresponding hit genes are identified as target genes. In some embodiments, top-ranked hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences with an FDR below a threshold (e.g., FDR ≤ 0.1) are selected, and the corresponding hit genes are identified as target genes. In some embodiments, when multiple hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences are designed for the same hit gene, only the top-ranked hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences of a gene are considered in the RRA or α-RRA calculation. RRA or α-RRA assumes that if a hit gene has no effect on the sensitivity/resistance of cancer cells to anticancer drug treatment, then the hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences corresponding to that hit gene should be uniformly distributed across the ranked list of all hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences obtained from the cancer cell library. In some embodiments, all hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences are ranked, and the treatment group and the control group are compared by RRA or α-RRA according to their relative ranks within each group and the different distributions of each group. All hit genes covered by the cancer cell library are ranked by comparing the skew of the β distribution of the hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences with a uniform null model, and hit genes whose corresponding hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences are ranked consistently higher than expected, with statistical significance (P-value) by a permutation test and/or an acceptable FDR by the Benjamini-Hochberg procedure, are prioritized in RRA or α-RRA (lower RRA score). Such RRA or α-RRA analysis can significantly reduce or eliminate false positives caused by perturbations in the experiment or in sampling. In some embodiments, the hit genes are ranked based on the ranking scores of the corresponding hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences obtained by median-ratio normalization followed by mean-variance modeling. In some embodiments, the hit genes are further ranked by RRA or α-RRA, taking into account the multiple hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences of the same hit gene.
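The Benjamini-Hochberg procedure referred to above converts a list of P-values into FDR-adjusted values that can then be compared with a cutoff such as FDR ≤ 0.1. The sketch below is a plain implementation of the standard step-up procedure; the gene names and P-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]  # hypothetical hit genes
p_values = [0.01, 0.04, 0.03, 0.5]                # hypothetical P-values
fdr = benjamini_hochberg(p_values)
selected = [g for g, q in zip(genes, fdr) if q <= 0.1]
print(selected)
```

Only genes whose adjusted value falls at or below the chosen threshold are carried forward as candidate target genes.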

In some embodiments, the predetermined threshold level is an FDR value from a permutation test of all hit gene mutations (e.g., inactivating mutations)/sgRNA guide sequences/sgRNA iBAR guide sequences obtained from the experiment (treatment or control). In some embodiments, the FDR value is determined by considering the maximum potential number of true target genes in a particular screen (e.g., those involved in a specific pathway that responds to anticancer drug treatment). In some embodiments, the threshold is the top β% of sequence counts (normalized or non-normalized) obtained from the cancer cell library, and the corresponding hit genes are identified as target genes.

Any target identification method known in the art can be used herein. Examples include empirical Bayesian methods (which identify targets by likelihood) or algorithms based on them, such as casTLE (cas9 High Throughput maximum Likelihood Estimator), which uses an empirical Bayesian framework to account for multiple sources of variability, including reagent efficacy and off-target effects, in the analysis of large-scale genomic perturbation screens, and which provides a casTLE score for ranking and threshold cutoffs (Morgens, D.W. et al. (2016) Nat Biotechnol 34, 634-636). In some embodiments, log2 ratio differences and P-values from t-tests can be used to identify target genes. An example is RIGER (Luo, J. et al. (2009). Cell 137, 835-848), which ranks shRNAs according to their differential effects between two classes of samples and then identifies the genes targeted by the shRNAs at the top of the list, thereby identifying genes essential to the difference between the classes. The LFC and P-value can be used for ranking and threshold cutoffs. In some embodiments, the probability mass function of the binomial distribution (or an algorithm based on the binomial distribution) can be used for target gene identification. An example is STARS (Doench, J.G., et al. (2016) Nat Biotechnol 34, 184-191), in which the STAR score can be used for ranking and threshold cutoffs. In some embodiments, algorithms based on a negative binomial model and the α-RRA algorithm can be used for target gene identification, such as MAGeCK (Li, W. et al. (2014) Genome Biol 15, 554), with the RRA score used for ranking and threshold cutoffs. In some embodiments, algorithms based on β-binomial modeling can be used for target gene identification, such as CRISPR-β-binomial (CB2) (Jeong, H.H. et al. (2019). Genome Res 29, 999-1008), with the P-value or FDR used for ranking and threshold cutoffs. In some embodiments, such as during a stringent positive screen, the ranking of raw sgRNA or sgRNA iBAR read counts, the ranking of normalized read counts, and/or the log2 fold change between the treatment group and the control group can be used for target gene identification; for example, hit genes corresponding to the top X% of read counts are identified as target genes.

In some embodiments, target gene identification is a positive screen, i.e., it is performed by identifying hit gene mutation (e.g., inactivating mutation) sequences or guide sequences that are enriched in the post-treatment cancer cell population. In some embodiments, target gene identification is a negative screen, i.e., it is performed by identifying hit gene mutation (e.g., inactivating mutation) sequences or guide sequences that are depleted in the post-treatment cancer cell population. Hit gene mutation (e.g., inactivating mutation) sequences or guide sequences that are enriched in the post-treatment cancer cell population based on sequence counts or fold change are ranked high, whereas those that are depleted in the post-treatment cancer cell population based on sequence counts or fold change are ranked low. In some embodiments, the enrichment or depletion is relative to the total sequence counts obtained from the post-treatment cancer cell population. In some embodiments, the enrichment or depletion is relative to the corresponding sequence counts in a control cancer cell population or control cancer cell library, such as a control cancer cell population from the same cancer cell library that has not been treated with the anticancer drug. In some embodiments, the enrichment or depletion is calculated based on the RRA or α-RRA algorithm.
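Enrichment or depletion relative to a control can be expressed as a log2 fold change of normalized read counts. The sketch below assumes the simplest possible normalization (each count divided by the sample's total) and a pseudocount of 1 to avoid division by zero; both choices, along with the guide names and counts, are assumptions for illustration rather than details taken from the source.

```python
from math import log2

def log2_fold_changes(treated, control, pseudocount=1.0):
    """log2 of the treated/control read fraction for each guide sequence."""
    treated_total = sum(treated.values())
    control_total = sum(control.values())
    result = {}
    for guide, control_count in control.items():
        t = (treated.get(guide, 0) + pseudocount) / treated_total
        c = (control_count + pseudocount) / control_total
        result[guide] = log2(t / c)
    return result

# Hypothetical read counts for three guide sequences
treated = {"guide_A": 799, "guide_B": 99, "guide_C": 102}
control = {"guide_A": 99, "guide_B": 799, "guide_C": 102}

lfc = log2_fold_changes(treated, control)
ranked = sorted(lfc, key=lfc.get, reverse=True)  # enriched first, depleted last
print(ranked)
```

In a positive screen the guides at the top of `ranked` are of interest; in a negative screen the guides at the bottom are.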

In some embodiments, the method comprises subjecting the cancer cell library from step a) to at least two (e.g., at least 3, 4, 5, 6, 7, 8, 10 or more) separate, different treatments with the anticancer drug in step b), and growing the cancer cell library in step c) to obtain a post-treatment cancer cell population from each treatment (e.g., each viable and resistant to the anticancer drug); identifying one or more hit genes in the post-treatment cancer cell population obtained from each treatment; and obtaining the one or more hit genes identified from all of the treatments, thereby identifying target genes in the cancer cells whose mutation makes the cancer cells sensitive or resistant to the anticancer drug. In some embodiments, identifying the target genes comprises identifying one or more hit genes in the post-treatment cancer cell populations obtained from at least two (e.g., at least 3, 4, 5, 6, 7, 8, 10 or more) separate, different treatments with the anticancer drug, wherein: i) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as enriched in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold enrichment, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more enrichment) in all of the different treatments, is identified as a target gene whose mutation makes the cancer cells resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as depleted in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold depletion, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more depletion) in all of the different treatments, is identified as a target gene whose mutation makes the cancer cells sensitive to the anticancer drug.

In some embodiments, the method comprises subjecting the cancer cell library from step a) to at least two (e.g., at least 3, 4, 5, 6, 7, 8, 10 or more) separate, different treatments with the anticancer drug in step b), and culturing the cancer cell library in step c) to obtain a post-treatment cancer cell population from each treatment (e.g., each viable and resistant to the anticancer drug); identifying one or more hit genes in the post-treatment cancer cell population obtained from each treatment; and combining the one or more hit genes identified from all of the treatments, thereby identifying target genes in the cancer cells whose mutation makes the cancer cells sensitive or resistant to the anticancer drug. In some embodiments, identifying the target genes comprises identifying one or more hit genes in the post-treatment cancer cell populations obtained from at least two (e.g., at least 3, 4, 5, 6, 7, 8, 10 or more) separate, different treatments with the anticancer drug, wherein: i) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as enriched in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold enrichment, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more enrichment) in at least one treatment, is identified as a target gene whose mutation makes the cancer cells resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as depleted in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold depletion, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more depletion) in at least one treatment, is identified as a target gene whose mutation makes the cancer cells sensitive to the anticancer drug.

In some embodiments, the methods described herein comprise subjecting the cancer cell library from step a) to two separate treatments b1) and b2): b1) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times; b2) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times; c1) growing the cancer cell library from treatment b1) to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); c2) growing the cancer cell library from treatment b2) to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); d1) identifying one or more hit genes in the post-treatment cancer cell population from treatment b1); d2) identifying one or more hit genes in the post-treatment cancer cell population from treatment b2); and d3) obtaining the one or more hit genes identified from both treatment b1) and treatment b2), thereby identifying target genes in the cancer cells whose mutation makes the cancer cells sensitive or resistant to the anticancer drug. In some embodiments, identifying the target genes comprises identifying one or more hit genes in the post-treatment cancer cell populations obtained from the two separate treatments b1) and b2), wherein: i) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as enriched in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold enrichment, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more enrichment) in both treatments b1) and b2), is identified as a target gene whose mutation makes the cancer cells resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as depleted in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold depletion, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more depletion) in both treatments b1) and b2), is identified as a target gene whose mutation makes the cancer cells sensitive to the anticancer drug.

In some embodiments, the methods described herein comprise subjecting the cancer cell library from step a) to two separate treatments b1) and b2): b1) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times; b2) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times; c1) growing the cancer cell library from treatment b1) to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); c2) growing the cancer cell library from treatment b2) to obtain a post-treatment cancer cell population (e.g., viable and resistant to the anticancer drug); d1) identifying one or more hit genes in the post-treatment cancer cell population from treatment b1); d2) identifying one or more hit genes in the post-treatment cancer cell population from treatment b2); and d3) combining the one or more hit genes identified from treatment b1) and treatment b2), thereby identifying target genes in the cancer cells whose mutation makes the cancer cells sensitive or resistant to the anticancer drug. In some embodiments, identifying the target genes comprises identifying one or more hit genes in the post-treatment cancer cell populations obtained from the two separate treatments b1) and b2), wherein: i) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as enriched in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold enrichment, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more enrichment) in treatment b1) or b2), is identified as a target gene whose mutation makes the cancer cells resistant to the anticancer drug; and/or ii) a hit gene whose corresponding sgRNA or sgRNA iBAR guide sequence or hit gene mutation is identified as depleted in the post-treatment cancer cell population (viable) that is resistant to the anticancer drug, compared to the control cancer cell population, with an FDR ≤ 0.1 (e.g., an FDR ≤ any of 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.001, or less) (and/or with at least about 2-fold depletion, such as any of about 3-, 4-, 5-, 10-, 20-, 50-, 100-fold or more depletion) in treatment b1) or b2), is identified as a target gene whose mutation makes the cancer cells sensitive to the anticancer drug.
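The two ways of pooling hits across treatments described above, requiring a hit in every treatment versus accepting a hit in at least one, correspond to a set intersection and a set union. The sketch below assumes each treatment yields a mapping from gene to FDR; the gene names, FDR values, and function names are invented for illustration.

```python
def hits(fdr_by_gene, threshold=0.1):
    """Genes whose FDR is at or below the threshold in one treatment."""
    return {gene for gene, fdr in fdr_by_gene.items() if fdr <= threshold}

def combine(per_treatment_fdr, mode="all", threshold=0.1):
    """Pool hits across treatments: 'all' = intersection, 'any' = union."""
    hit_sets = [hits(t, threshold) for t in per_treatment_fdr]
    if mode == "all":
        return set.intersection(*hit_sets)
    if mode == "any":
        return set.union(*hit_sets)
    raise ValueError("mode must be 'all' or 'any'")

# Hypothetical enrichment FDRs from treatments b1) and b2)
b1 = {"GENE_A": 0.01, "GENE_B": 0.20, "GENE_C": 0.05}
b2 = {"GENE_A": 0.03, "GENE_B": 0.08, "GENE_C": 0.40}

print(combine([b1, b2], mode="all"))  # significant in both b1) and b2)
print(combine([b1, b2], mode="any"))  # significant in b1) or b2)
```

Requiring a hit in every treatment is the more conservative choice and favors specificity, whereas accepting a hit in any treatment favors sensitivity at the cost of more candidates to validate.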

In some embodiments, the method comprises identifying target genes in cancer cells whose mutations render the cancer cells sensitive or resistant to two or more (e.g., 2, 3, 4, 5, or more) anticancer drugs. In some embodiments, the two or more different anticancer drugs target the same cancer target (e.g., PARP). In some embodiments, the two or more different anticancer drugs target different cancer targets (e.g., one targets PARP and the other targets a non-PARP target). In some embodiments, the method comprises: i) for each of two or more (e.g., 2, 3, 4, 5, or more) different anticancer drugs applied as a single treatment, using any of the methods described herein (which may, e.g., comprise one or more separate different treatments) to identify a set of one or more target genes whose mutations render the cancer cells sensitive to that anticancer drug; and ii) obtaining the one or more target genes that are present in every set of target genes identified for each anticancer drug, thereby identifying target genes whose mutations render the cancer cells sensitive to combined treatment with the two or more different anticancer drugs. In some embodiments, the method comprises: i) for each of two or more (e.g., 2, 3, 4, 5, or more) different anticancer drugs applied as a single treatment, using any of the methods described herein (which may, e.g., comprise one or more separate different treatments) to identify a set of one or more target genes whose mutations render the cancer cells resistant to that anticancer drug; and ii) obtaining the one or more target genes that are present in the combination of the target gene sets identified for all of the anticancer drugs, thereby identifying target genes whose mutations render the cancer cells resistant to combined treatment with the two or more different anticancer drugs. In some embodiments, the method comprises: a) providing a cancer cell library described herein; b) contacting the cancer cell library with a combination of two or more (e.g., 2, 3, 4, 5, or more) different anticancer drugs (e.g., simultaneously, with overlapping exposure times, or sequentially); c) growing the cancer cell library to obtain a post-treatment cancer cell population (e.g., alive and resistant to the anticancer drugs); and d) using any of the target gene identification methods described herein, identifying the target genes based on the difference between the profiles of sgRNAs, sgRNA iBARs, or hit gene mutations in the post-treatment cancer cell population and in the control cancer cell population.
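The two set operations described above differ: sensitivity to a drug combination takes the genes common to every per-drug set, whereas resistance takes the genes pooled across all per-drug sets. A minimal sketch follows, assuming each per-drug result is available as a collection of gene names; the function names are hypothetical and the gene labels used with them would be placeholders, not data from the specification.

```python
def combination_sensitive(per_drug_sets):
    """Genes present in EVERY per-drug sensitive set (set intersection)."""
    sets = [set(s) for s in per_drug_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


def combination_resistant(per_drug_sets):
    """Genes present in the COMBINED per-drug resistant sets (set union)."""
    return set().union(*per_drug_sets)
```

For example, with placeholder per-drug sensitive sets `{"GENE_A", "GENE_B"}` and `{"GENE_B", "GENE_C"}`, only `GENE_B` is retained as a combination-sensitive target, while the same two sets treated as resistant sets would yield all three genes.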

In some embodiments, the method further comprises ranking the identified target genes, wherein the target genes are ranked based on the degree of enrichment or depletion (e.g., fold enrichment, fold depletion, enrichment FDR, or depletion FDR) of the sgRNA or sgRNA iBAR guide sequences or hit gene mutations in the post-treatment cancer cell population compared to the control cancer cell population. In some embodiments, the target gene ranking is further adjusted based on data consistency among all sequences comprising the hit gene mutations (e.g., inactivating mutations) that correspond to the same target gene. In some embodiments, the sgRNA library is an sgRNA iBAR library, and the target gene ranking is further adjusted based on data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to a guide sequence of the target gene, and/or based on data consistency among all guide sequences corresponding to the same target gene (e.g., the same or different target sites). In some embodiments, the RRA or α-RRA algorithm is used to rank the identified target genes. In some embodiments, the identified target genes are ranked: i) based on data consistency among all sequences comprising the hit gene mutations (e.g., inactivating mutations) that correspond to the same target gene; or ii) based on data consistency among the iBAR sequences of the sgRNA iBAR sequences corresponding to a guide sequence of the target gene; and/or iii) based on data consistency among all guide sequences of the sgRNAs or sgRNA iBARs corresponding to the same target gene (e.g., the same or different target sites); wherein the identified target genes are ranked from high to low according to the degree of data consistency from high to low. In some embodiments, the post-treatment cancer cell population is a live cell population, i.e., resistant to the anticancer drug. In some embodiments, the method further comprises assigning a sensitivity score or a resistance score to the identified target genes, wherein the target genes whose mutations render the cancer cells resistant to the anticancer drug are ranked from high to low based on the fold enrichment of the sgRNA or sgRNA iBAR guide sequences or hit gene mutations in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to the control cancer cell population (or based on enrichment FDR, where a smaller FDR gives a higher rank; or based on the degree of data consistency, where higher consistency gives a higher rank), and each such target gene is assigned a resistance score from high to low accordingly; and/or wherein the target genes whose mutations render the cancer cells sensitive to the anticancer drug are ranked from high to low based on the fold depletion of the sgRNA or sgRNA iBAR guide sequences or hit gene mutations in the post-treatment cancer cell population (e.g., alive and resistant to the anticancer drug) compared to the control cancer cell population (or based on depletion FDR, where a smaller FDR gives a higher rank; or based on the degree of data consistency, where higher consistency gives a higher rank), and each such target gene is assigned a sensitivity score from high to low accordingly.
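The fold-based ranking and score assignment can be sketched as follows. The text does not say how a rank maps to a numeric score, so this sketch assumes the simplest mapping (the top-ranked of n genes receives n, the last receives 1); that mapping and the function name are assumptions. It is a plain sort, not an implementation of RRA or α-RRA.

```python
def rank_and_score(fold_by_gene):
    """Rank genes by fold (enrichment or depletion) from high to low and
    assign scores that decrease with rank.

    Assumption: score = n - rank_index, so the top gene gets n and the
    last gene gets 1. Ties keep their input order (Python's sort is stable).
    """
    ranked = sorted(fold_by_gene, key=lambda g: fold_by_gene[g], reverse=True)
    n = len(ranked)
    return {gene: n - i for i, gene in enumerate(ranked)}
```

For resistance scores the input would be fold enrichment per gene; for sensitivity scores, fold depletion per gene. Ranking by FDR instead would sort in ascending order, since a smaller FDR ranks higher.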

In some embodiments, the method further comprises validating the identified target gene by: a) modifying a cancer cell by generating a mutation (e.g., an inactivating mutation) in the target gene of the cancer cell; and b) determining the sensitivity or resistance of the modified cancer cell to the anticancer drug. In some embodiments, the method comprises subjecting the modified cancer cell to any of the anticancer drug treatment steps b) described herein, and optionally any of the cancer cell harvesting steps. Any cell viability assay known in the art or described herein can be used to determine the sensitivity or resistance of the modified cancer cells to the anticancer drug. When the modified cancer cells are a homogeneous population (i.e., all carry the same mutation, such as an inactivating mutation), a wider range of cell viability assays can be used, such as assays based on metabolic activity, e.g., diazole (a redox indicator), the tetrazolium salts MTT and XTT, dihydrorhodamine, calcein or luciferin, and luminescent ATP assays. The mutation (e.g., inactivating mutation) in the target gene can be generated by any method known in the art or described herein, such as by a mutagen or by TALEN-, ZFN-, or CRISPR/Cas-mediated gene editing (e.g., using Cas and an sgRNA directed to the target gene). In some embodiments, the cancer cell already contains endogenous mutations before the mutation (e.g., inactivating mutation) is generated in the target gene, as endogenous mutations frequently occur in cancer cells.

III. Methods of Treating Cancer and/or Selecting Patients

Another aspect of the invention provides, based on any of the target genes described herein or on one or more target genes identified using any of the target gene identification methods described herein, methods of treating cancer in an individual, methods of selecting an individual with cancer for anticancer drug treatment, and methods of excluding an individual with cancer from anticancer drug treatment.

An "aberration" of a gene (e.g., a target gene, drug-sensitive gene, or drug-resistant gene) refers to a genetic and/or epigenetic aberration of the gene, an abnormal expression level and/or abnormal activity level of the gene, and/or an abnormal modification level of the gene (or gene product, such as RNA or protein), which may lead to abnormal loss or reduction of function and/or abnormal expression (e.g., reduced or absent expression) of the RNA and/or protein encoded by the gene. In some embodiments, a genetic aberration comprises an alteration (i.e., mutation) of a nucleic acid (e.g., DNA or RNA) or protein sequence, or an abnormal epigenetic feature associated with the gene, including but not limited to the coding, non-coding, regulatory, enhancer, silencer, promoter, intron, exon, and untranslated regions of the gene. In some embodiments, an aberration of a gene comprises a mutation of the gene, including but not limited to: deletion, frameshift, insertion, indel, missense mutation, nonsense mutation, point mutation, silent mutation, splice site mutation, splice variant, and translocation. In some embodiments, the mutation may be a loss or deletion of the gene. In some embodiments, the mutation is a deleterious mutation. In some embodiments, an aberration of a gene comprises abnormal expression (e.g., reduced or absent expression) of the gene (e.g., mRNA or protein) compared to a control level. In some embodiments, an aberration of a gene comprises abnormal (e.g., reduced or eliminated) activity of a gene product (e.g., RNA or protein) compared to a control level, such as activation or inhibition of a downstream target. In some embodiments, an aberration of a gene comprises an aberrant modification (e.g., increased, decreased, or erroneous modification) of the gene (e.g., at the DNA level or histone level) or gene product (e.g., RNA or protein) compared to a control level, such as a post-translational modification (e.g., phosphorylation, ubiquitination). In some embodiments, an aberration of a gene comprises a copy number change of the gene. In some embodiments, the copy number change of the gene results from a structural rearrangement of the genome, including deletion, duplication, inversion, and translocation. In some embodiments, an aberration of a gene comprises an abnormal epigenetic feature of the gene, including but not limited to: DNA methylation, hydroxymethylation, increased or decreased histone binding, histone methylation, histone acetylation, chromatin remodeling, and the like. In some embodiments, the aberration is determined by comparison to a control or reference, such as a reference sequence (e.g., a nucleic acid sequence or protein sequence), a control expression (e.g., RNA or protein expression) level, a control activity (e.g., activation or inhibition of a downstream target) level, or a control modification (e.g., post-translational modification or epigenetic modification) level. In some embodiments, the abnormal expression level or abnormal activity level of the gene may be lower than the control level (e.g., any of about 10%, 20%, 30%, 40%, 60%, 70%, 80%, 90% or more lower than the control level). In some embodiments, the abnormal modification level of the gene (e.g., modification of DNA, nucleosomes, RNA, or protein) may be lower than the control level (e.g., any of about 10%, 20%, 30%, 40%, 60%, 70%, 80%, 90% or more lower than the control level), or higher than the control level (e.g., any of about 10%, 20%, 30%, 40%, 60%, 70%, 80%, 90% or more higher than the control level). In some embodiments, the abnormal modification of the gene is an erroneous modification, such as ubiquitination instead of phosphorylation. In some embodiments, the control level (e.g., expression level, activity level, or modification level) is the median level (e.g., expression level, activity level, or modification level) of a control population. In some embodiments, the control population is a population having the same cancer as the individual being or to be treated. In some embodiments, the control population is a healthy population without cancer, optionally with demographic characteristics (e.g., gender, age, race) comparable to those of the individual being or to be treated. In some embodiments, the control level (e.g., expression level, activity level, or modification level) is the level (e.g., expression level, activity level, or modification level) in healthy tissue from the same individual. An aberration of a gene can be determined by comparison to a reference sequence, including the epigenetic pattern of the reference sequence in a control sample. In some embodiments, the reference sequence is a sequence (DNA, RNA, or protein sequence) corresponding to a fully functional allele of the gene, such as an allele (e.g., the prevalent allele) of the gene present in individuals of a healthy population without cancer, optionally with demographic characteristics (e.g., gender, age, race) similar to those of the individual being or to be treated.

Herein, an aberration of a target gene is also referred to as a "target gene aberration", which includes but is not limited to a target gene mutation. An aberration of a drug-sensitive gene is also referred to herein as a "drug-sensitive aberration", which includes but is not limited to a drug-sensitive mutation that renders cancer cells sensitive to an anticancer drug. An aberration of a drug-resistant gene is also referred to herein as a "drug-resistant aberration", which includes but is not limited to a drug-resistant mutation that renders cancer cells resistant to an anticancer drug. An aberration of a patient's gene is also referred to herein as a "patient gene aberration", which includes but is not limited to a patient gene mutation. An aberration of a patient's target gene is also referred to herein as a "patient target gene aberration", which includes but is not limited to a patient target gene mutation.

The "status" of a gene aberration may refer to the presence or absence of an aberration in the gene, or to the aberration level of the gene (expression, activity, or modification level). In some embodiments, the presence of an aberration (e.g., a mutation) in one or more drug-sensitive genes, compared to a control, indicates that: (a) the individual is more likely to respond to treatment with the anticancer drug, or (b) the individual is selected for treatment with the anticancer drug. In some embodiments, the absence of an aberration (e.g., a mutation) in one or more drug-sensitive genes, compared to a control, indicates that: (a) the individual is less likely to respond to treatment with the anticancer drug, or (b) the individual is not selected for treatment with the anticancer drug. In some embodiments, an abnormal level (e.g., expression level, activity level, or modification level) of one or more drug-sensitive genes and/or one or more drug-resistant genes correlates with the likelihood that the individual will respond to treatment. For example, a larger deviation in the level (e.g., expression, activity, or modification level) of one or more drug-sensitive genes in the direction of reduced or eliminated gene function indicates that the individual is more likely to respond to treatment with the anticancer drug. In some embodiments, a predictive model (e.g., a composite score) based on the levels (e.g., expression levels, activity levels, or modification levels) of one or more drug-sensitive genes and/or one or more drug-resistant genes is used to predict: (a) the likelihood that the individual will respond to treatment with the anticancer drug, and (b) whether the individual is selected for treatment with the anticancer drug. In some embodiments, the predictive model (including, e.g., the coefficient for each level) can be obtained by statistical analysis (e.g., regression analysis) of clinical trial data.

In some embodiments, there is provided a method of treating cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected for treatment on the basis that the individual has an aberration (e.g., carries a mutation) in a target gene (a "drug-sensitive gene") that renders cancer cells sensitive to the anticancer drug (a "drug-sensitive aberration", such as a "drug-sensitive mutation"), and wherein the drug-sensitive gene (or drug-sensitive mutation) is identified using any of the target gene identification methods described herein. In some embodiments, there is provided a method of treating colorectal cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of a PARPi, wherein the individual is selected for treatment on the basis that the individual has a drug-sensitive aberration (e.g., carries a drug-sensitive mutation) in a drug-sensitive gene, and wherein the drug-sensitive gene is selected from the group consisting of: ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1.

In some embodiments, provided herein is a method of identifying an individual (e.g., a human) with cancer who may benefit from a treatment comprising administration of an anticancer drug, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes identified using any of the target gene identification methods described herein, wherein the presence of the one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in the sample identifies the individual as one who may benefit from the treatment. In some embodiments, provided herein is a method of identifying an individual (e.g., a human) with colorectal cancer who may benefit from a treatment comprising administration of a PARPi, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes selected from the group consisting of: ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1, wherein the presence of the one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in the sample identifies the individual as one who may benefit from the treatment.

In some embodiments, provided herein is a method of selecting a treatment for an individual (e.g., a human) with cancer, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes identified using any of the target gene identification methods described herein, wherein the presence of the one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in the sample identifies a treatment comprising administration of the anticancer drug as a suitable treatment for the individual. In some embodiments, provided herein is a method of selecting a treatment for an individual (e.g., a human) with colorectal cancer, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes selected from the group consisting of: ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1, wherein the presence of the one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in the sample identifies a treatment comprising administration of a PARPi as a suitable treatment for the individual.
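The selection logic in the preceding paragraphs reduces to a membership check: does the set of genes found aberrant in the individual's sample overlap the drug-sensitive gene panel? The sketch below only illustrates that logic and is not a clinical tool. The panel is the PARPi drug-sensitive gene list recited above; the function names and the assumption that detection results arrive as a set of gene symbols are illustrative.

```python
# Drug-sensitive gene panel for PARPi in colorectal cancer, as recited above.
PARPI_SENSITIVE_GENES = frozenset("""
ARID2 ATM BIRC6 BRCA1 BRCA2 CCNA2 CCND1 CDK2 FBXW7 HRAS KAT2B NBN PBRM1 PTEN
SKP2 SMAD7 TGFB2 TSC1 TSC2 ATR RIF1 POLQ AXIN1 GSK3A GSK3B CHD7 SCAF4 FANCM
NIPBL ATRX STAG1 RAD51 RAD51B RAD51C RAD51D FANCL EXO1 DIDO1 LRBA FAM71A
HDAC2 PMS2 MSH6 MSH2 MLH1 WEE1
""".split())


def detected_sensitive_genes(aberrant_genes, panel=PARPI_SENSITIVE_GENES):
    """Return the panel genes in which an aberration was detected.

    `aberrant_genes` is assumed to be an iterable of gene symbols reported
    by some upstream detection step (e.g., sequencing); that step is not
    modeled here.
    """
    return set(aberrant_genes) & panel


def may_benefit(aberrant_genes, panel=PARPI_SENSITIVE_GENES):
    """True if at least one drug-sensitive aberration is present."""
    return bool(detected_sensitive_genes(aberrant_genes, panel))
```

The same check applied to a drug-resistant gene panel would express the exclusion criterion rather than the selection criterion.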

In some embodiments, there is provided a method of excluding an individual (e.g., a human) with cancer from a treatment comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is excluded if the individual has an aberration (e.g., carries a mutation) in a target gene (a "drug-resistant gene") that renders cancer cells resistant to the anticancer drug (a "drug-resistant aberration", such as a "drug-resistant mutation"), and wherein the drug-resistant gene is identified using any of the target gene identification methods described herein. In some embodiments, there is provided a method of excluding an individual (e.g., a human) with colorectal cancer from a treatment comprising administering to the individual an effective amount of a PARPi, wherein the individual is excluded if the individual has a drug-resistant aberration (e.g., carries a drug-resistant mutation) in a drug-resistant gene, and wherein the drug-resistant gene is selected from the group consisting of: AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2.

In some embodiments, provided herein is a method of identifying an individual (e.g., a human) with cancer who would not benefit from a treatment comprising administration of an anticancer drug, the method comprising detecting, in a sample from the individual, one or more drug-resistant aberrations (e.g., drug-resistant mutations) in one or more drug-resistant genes identified using any of the target gene identification methods described herein, wherein the presence of the one or more drug-resistant aberrations (e.g., drug-resistant mutations) in the sample identifies the individual as one who would not benefit from the treatment. In some embodiments, there is provided a method of identifying an individual (e.g., a human) with colorectal cancer who would not benefit from a treatment comprising administration of a PARPi, the method comprising detecting, in a sample from the individual, one or more drug-resistant aberrations (e.g., drug-resistant mutations) in one or more drug-resistant genes selected from the group consisting of: AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2, wherein the presence of the one or more drug-resistant aberrations (e.g., drug-resistant mutations) in the sample identifies the individual as one who would not benefit from the treatment.

In some embodiments, provided herein is a method of excluding a treatment for an individual (e.g., a human) with cancer, the method comprising detecting, in a sample from the individual, one or more drug-resistant aberrations (e.g., drug-resistant mutations) in one or more drug-resistant genes identified using any of the target gene identification methods described herein, wherein the presence of the one or more drug-resistant aberrations (e.g., drug-resistant mutations) in the sample excludes a treatment comprising administration of the anticancer drug as a suitable treatment for the individual. In some embodiments, provided herein is a method of excluding a treatment for an individual (e.g., a human) with colorectal cancer, the method comprising detecting, in a sample from the individual, one or more drug-resistant aberrations (e.g., drug-resistant mutations) in one or more drug-resistant genes selected from the group consisting of: AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2, wherein the presence of the one or more drug-resistant aberrations (e.g., drug-resistant mutations) in the sample excludes a treatment comprising administration of a PARPi as a suitable treatment for the individual.

In some embodiments, there is provided a method of treating cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected based on: i) aberrations (e.g., mutations) in one or more target genes ("drug-sensitive genes") that render cancer cells sensitive to the anticancer drug ("drug-sensitive aberrations"), and ii) aberrations (e.g., mutations) in one or more target genes ("drug-resistant genes") that render cancer cells resistant to the anticancer drug ("drug-resistant aberrations"), wherein the drug-sensitive genes and drug-resistant genes are identified using any of the target gene identification methods described herein, and wherein the individual is selected for treatment if a composite score of the drug-sensitive aberrations and drug-resistant aberrations is above a composite score threshold level.
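As a minimal sketch of the composite-score criterion, the simplest form given later in this section is used: the number of drug-sensitive genes carrying an aberration minus the number of drug-resistant genes carrying an aberration, with the individual selected when the score exceeds the threshold (zero in that form). The function names are hypothetical, and no weighting by aberration severity is modeled.

```python
def composite_score(sensitive_genes_hit, resistant_genes_hit):
    """Count-based composite score.

    (number of drug-sensitive genes with an aberration in the patient)
    minus
    (number of drug-resistant genes with an aberration in the patient).
    Inputs are collections of gene symbols; duplicates are ignored.
    """
    return len(set(sensitive_genes_hit)) - len(set(resistant_genes_hit))


def select_for_treatment(score, threshold=0):
    """Select the individual only if the score is strictly above the threshold."""
    return score > threshold
```

For instance, two aberrant drug-sensitive genes and one aberrant drug-resistant gene give a score of 1, which is above zero, whereas one of each gives 0 and does not meet the criterion.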

In some embodiments, the method of treating cancer, or of selecting or excluding a patient for cancer treatment, further comprises detecting one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) and/or one or more drug-resistant aberrations (e.g., drug-resistant mutations) in a sample from the individual (e.g., by NGS). In some embodiments, the method further comprises identifying the one or more drug-sensitive genes and/or the one or more drug-resistant genes. In some embodiments, the method further comprises detecting aberrant (e.g., reduced or absent) expression (e.g., of RNA or protein) of one or more drug-sensitive genes and/or one or more drug-resistant genes compared to a control level, such as by qPCR, RNA-seq, mass spectrometry, Western blot, or any other method for detecting RNA or protein expression levels. In some embodiments, the method further comprises detecting aberrant modifications of one or more drug-sensitive genes and/or one or more drug-resistant genes compared to a control level, such as epigenetic modifications (e.g., DNA methylation, histone methylation, histone acetylation) or post-translational modifications (e.g., phosphorylation, glycosylation, ubiquitination, nitrosylation, methylation, acetylation, lipidation, and proteolysis). Any known method can be used herein to detect modifications of DNA, nucleosomes, RNA, or protein, such as ChIP-seq, ChIP-qPCR, DNase-seq, MNase-seq, mass spectrometry, Western blot, etc. In some embodiments, the method further comprises detecting aberrant (e.g., reduced or absent) activity of an expression product (e.g., RNA or protein) of one or more drug-sensitive genes and/or one or more drug-resistant genes compared to a control level. Any suitable assay of gene function/activity can be used herein, such as detection of signal transduction, activation status (e.g., phosphorylation status) of downstream pathway molecules, protein-protein binding affinity and/or specificity, metabolism, cellular behavior (e.g., cell proliferation, death, cell cycle), cytokine release, etc.

In some embodiments, the method further comprises obtaining a composite score for the individual. In some embodiments, the composite score is based on one or more drug-sensitive aberrations and/or drug-resistant aberrations, such as one or more of the following: drug-sensitive mutations, drug-resistant mutations, aberrant expression of drug-sensitive genes, aberrant expression of drug-resistant genes, aberrant activity of expression products of drug-sensitive genes, aberrant activity of expression products of drug-resistant genes, aberrant modification of drug-sensitive genes (or gene products), and aberrant modification of drug-resistant genes (or gene products). In some embodiments, the composite score is obtained as: (the number of drug-sensitive genes carrying a drug-sensitive aberration in the patient) minus (the number of drug-resistant genes carrying a drug-resistant aberration in the patient), wherein the individual is selected for treatment if the composite score is above zero. In some embodiments, the severity of a drug-sensitive mutation or drug-resistant mutation in the patient adds weight to the composite score; e.g., a drug-sensitive mutation that affects the expression and/or activity of a drug-sensitive gene adds more weight to the composite score than another drug-sensitive mutation that affects the expression and/or activity of the same gene to a lesser extent. In some embodiments, the degree of aberrant expression of a drug-sensitive gene or drug-resistant gene in the patient compared to a control level (e.g., a healthy individual) adds weight to the composite score; e.g., loss of expression of a drug-sensitive gene adds more weight to the composite score than reduced expression of the same gene. In some embodiments, the degree of aberrant activity (e.g., RNA or protein activity) of a drug-sensitive gene or drug-resistant gene in the patient compared to a control level (e.g., a healthy individual) adds weight to the composite score; e.g., loss of protein activity (e.g., abolished binding) of a drug-sensitive gene adds more weight to the composite score than reduced protein activity (e.g., reduced binding) of the same gene. In some embodiments, the degree of aberrant modification (e.g., modification of DNA, nucleosomes, RNA, or protein) of a drug-sensitive gene or drug-resistant gene in the patient compared to a control level (e.g., a healthy individual) adds weight to the composite score; e.g., loss of protein phosphorylation (e.g., abolishing signal transduction) of a drug-sensitive gene adds more weight to the composite score than reduced protein phosphorylation of the same gene. In some embodiments, the composite score is obtained as: (the absolute value of the sum of the sensitivity scores of the drug-sensitive genes) minus (the absolute value of the sum of the resistance scores of the drug-resistant genes), wherein the individual is selected for treatment if the composite score is above zero.

In some embodiments, the method further comprises ranking the drug-sensitive genes and drug-resistant genes identified using any of the target gene identification methods described herein, wherein the ranking of the drug-resistant genes or drug-sensitive genes is based on the degree of enrichment or depletion (e.g., fold enrichment, fold depletion, enrichment FDR, or depletion FDR) of the sgRNA or sgRNA iBAR guide sequences (or of sequences comprising the hit gene mutation) in the treated cancer cell population (e.g., live cells) compared to the control cancer cell population. In some embodiments, the ranking of the drug-resistant genes or drug-sensitive genes is further adjusted: i) based on data consistency among the iBAR sequences of the sgRNA iBAR sequences whose guide sequences correspond to the same target gene, or ii) based on data consistency among all guide sequences corresponding to the same target gene (or the same target site of the same target gene), or iii) based on data consistency among all sequences comprising the hit gene mutation (e.g., an inactivating mutation) that correspond to the same target gene (or the same target site of the same target gene). In some embodiments, the RRA or α-RRA algorithm is used to rank the drug-resistant genes and/or drug-sensitive genes. In some embodiments, the method further comprises assigning a sensitivity score to each identified drug-sensitive gene and/or assigning a resistance score to each identified drug-resistant gene, i) wherein the drug-resistant genes are ranked from high to low based on the fold enrichment (or based on enrichment FDR, where a smaller FDR gives a higher rank; or based on the degree of data consistency, where higher consistency gives a higher rank) of the sgRNA or sgRNA iBAR guide sequences (or of sequences comprising the hit gene mutation) in the treated cancer cell population (e.g., live cells) compared to the control cancer cell population, and each drug-resistant gene is assigned a resistance score accordingly from high to low; and/or ii) wherein the drug-sensitive genes are ranked from high to low based on the fold depletion (or based on depletion FDR, where a smaller FDR gives a higher rank; or based on the degree of data consistency, where higher consistency gives a higher rank) of the sgRNA or sgRNA iBAR guide sequences (or of sequences comprising the hit gene mutation) in the treated cancer cell population (e.g., live cells) compared to the control cancer cell population, and each drug-sensitive gene is assigned a sensitivity score accordingly from high to low.
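The ranking and the two simple composite scores described above can be sketched in Python as follows. This is a minimal illustration, not the specification's implementation: the function names, the linear rank-to-score mapping (top gene gets score n, last gene gets 1), and the restriction of the sums to genes that are aberrant in the patient are all assumptions of this sketch. The fold-change values in the usage note are invented for illustration.

```python
from typing import Dict, Set


def rank_scores(values: Dict[str, float], higher_is_better: bool = True) -> Dict[str, int]:
    """Rank genes and assign scores n, n-1, ..., 1 from the top rank down.

    Use higher_is_better=True for fold enrichment/depletion and
    higher_is_better=False for FDR (a smaller FDR ranks higher).
    The linear rank-to-score mapping is an assumption of this sketch.
    """
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=higher_is_better)
    n = len(ordered)
    return {gene: n - idx for idx, (gene, _) in enumerate(ordered)}


def count_based_score(aberrant: Set[str], sensitive: Set[str], resistant: Set[str]) -> int:
    """(# drug-sensitive genes aberrant in the patient) minus
    (# drug-resistant genes aberrant in the patient)."""
    return len(aberrant & sensitive) - len(aberrant & resistant)


def score_based_composite(
    aberrant: Set[str],
    sensitivity_scores: Dict[str, float],
    resistance_scores: Dict[str, float],
) -> float:
    """|sum of sensitivity scores| minus |sum of resistance scores|,
    summed over the genes that are aberrant in the patient (assumption)."""
    sens_total = sum(s for g, s in sensitivity_scores.items() if g in aberrant)
    res_total = sum(s for g, s in resistance_scores.items() if g in aberrant)
    return abs(sens_total) - abs(res_total)


def selected_for_treatment(score: float, threshold: float = 0.0) -> bool:
    """The individual is selected if the score is above the threshold."""
    return score > threshold
```

For example, with illustrative fold depletions `{"BRCA1": 8.0, "ATM": 4.0, "PTEN": 2.0}` and fold enrichments `{"TP53BP1": 6.0, "PARP1": 3.0}`, a patient with aberrations in BRCA1, PTEN, and PARP1 gets a count-based score of 2 - 1 = 1 and a score-based composite of (3 + 1) - 1 = 3, so both variants would select the patient under a zero threshold.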

In addition to the methods described above, any method known in the art can be used to calculate the composite score and/or to select the composite score threshold level. See, for example, the response score or Recombination Proficiency Score (RPS) in US20160369353; see also US20200254259 and US20180068083, the contents of each of which are incorporated herein by reference in their entirety.

In some embodiments, for a particular cancer type (e.g., colorectal cancer) and/or a particular anticancer drug (e.g., a PARPi), the parameter "m" is the total number of drug-resistant genes and drug-sensitive genes identified using any of the methods described herein, or the total number of genes in the combination of the following panels: drug-resistant gene panel (i) AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2; and drug-sensitive gene panel (ii) ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1. In some embodiments, one or more patient aberrations (e.g., mutations or aberrant expression/activity/modification), such as one or more patient mutations (e.g., nonsynonymous, nonsense, missense, frameshift, insertion, deletion, stop-loss, stop-gain, mutations causing mis-splicing, gene fusions, etc.), are identified from the individual's sample in one or more patient genes that belong to the combination of drug-resistant genes and drug-sensitive genes identified using any of the methods described herein, or to the combination of panels (i) and (ii) above. In some embodiments, no patient aberration (e.g., patient mutation) is identified from the individual's sample in any patient gene that belongs to the combination of drug-resistant genes and drug-sensitive genes identified using any of the methods described herein, or to the combination of panels (i) and (ii) above. A patient gene or patient aberration (e.g., a mutation or aberrant expression/activity/modification) that is identified from the patient (e.g., from a patient sample, such as by NGS) and that belongs to the drug-resistant genes or drug-sensitive genes identified using any of the methods described herein, or to the drug-resistant genes or drug-sensitive genes in target gene panels (i) and (ii) above, is hereinafter also referred to as a "patient target gene" or a "patient target aberration" (e.g., a "patient target mutation"), respectively. The parameter "m" is an integer of at least 1 and is a constant for a particular cancer type and a particular anticancer drug.
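As a rough illustration of how patient genes might be matched against the two panels to obtain "patient target genes" and the constant m, consider the following Python sketch. The panels shown are small illustrative subsets of panels (i) and (ii), not the full lists; the function name, the upper-casing of gene symbols (so that "Kras" matches "KRAS"), and the "off_panel" bucket are assumptions of this sketch.

```python
from typing import Dict, Iterable, List

# Illustrative SUBSETS of panel (i) and panel (ii); a real panel would
# contain every gene listed in the text above.
RESISTANT_PANEL = {"AKT1", "TP53", "PARP1", "TP53BP1", "KRAS"}
SENSITIVE_PANEL = {"ATM", "BRCA1", "BRCA2", "PTEN", "RAD51"}

# m: total number of genes in the combination of the two panels.
M = len(RESISTANT_PANEL | SENSITIVE_PANEL)


def patient_target_genes(mutated_genes: Iterable[str]) -> Dict[str, List[str]]:
    """Split a patient's mutated genes into drug-sensitive patient target
    genes, drug-resistant patient target genes, and genes outside both panels."""
    genes = {g.upper() for g in mutated_genes}  # symbol normalization is an assumption
    return {
        "drug_sensitive": sorted(genes & SENSITIVE_PANEL),
        "drug_resistant": sorted(genes & RESISTANT_PANEL),
        "off_panel": sorted(genes - SENSITIVE_PANEL - RESISTANT_PANEL),
    }
```

Calling `patient_target_genes(["Kras", "BRCA2", "EGFR"])` places BRCA2 among the drug-sensitive patient target genes and KRAS among the drug-resistant ones, and leaves EGFR outside both panels; an empty result for both panel buckets corresponds to the case in which no patient target aberration is identified.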

In some embodiments, the composite score is calculated from one or more patient-related parameters, such as: i) the number of deleterious mutations (e.g., nonsynonymous, nonsense, missense, frameshift, insertion, deletion, stop-loss, stop-gain, mutations causing mis-splicing, gene fusions, etc.) carried in each patient target gene identified from the patient (e.g., by NGS) (parameter [Figure 02_image001]); ii) the estimated fraction of cells carrying a particular deleterious mutation in a particular patient target gene identified from the patient (e.g., by NGS) (parameter [Figure 02_image004]); iii) the log-scale (e.g., log2) fold change in the expression level of a patient target gene in the patient's diseased tissue relative to normal tissue (parameter [Figure 02_image006]); etc. In some embodiments, one or more patient-related parameters are derived from data/information from the patient sample, such as sequencing read counts. The parameter [Figure 02_image008] denotes the estimated fraction of cells carrying the j-th mutation in the i-th patient target gene identified from the patient, where 0 < [Figure 02_image008] ≤ 1. The parameter [Figure 02_image001] is an integer of at least 1 and is the total number of deleterious patient target mutations detected for the corresponding identified patient target gene. The index j is an integer, and 1 ≤ j ≤ [Figure 02_image001]. The index i is an integer, and 0 ≤ i ≤ m. When i = 0, this indicates that no deleterious mutation was found, in the individual's sample, in any patient gene belonging to the combination of drug-resistant genes and drug-sensitive genes identified using any of the methods described herein, or to the combination of gene panels (i) and (ii) above. In some embodiments, the fraction of cells carrying the j-th mutation in the i-th patient target gene is estimated from the fraction of sequences carrying the j-th mutation among all sequences identified from the patient sample that contain a mutation in the i-th patient target gene. The parameter [Figure 02_image021] denotes the log-scale (e.g., log2) fold change in the expression level of the i-th patient target gene in the patient's diseased tissue relative to normal tissue. The expression level of a patient target gene can be measured by any known method, such as RNA-seq, qPCR, mass spectrometry, Western blot, FISH, immunofluorescence staining, etc.

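Two of the patient-related parameters described above lend themselves to a short worked sketch in Python: the estimated fraction of cells carrying each mutation in a gene, and the log2 fold change in expression. The function names are hypothetical, the assumption that each mutation-bearing read carries exactly one of the listed mutations is a simplification, and the pseudocount used to avoid division by zero is a common convention rather than something stated in the text.

```python
import math
from typing import List


def mutation_fractions(mutant_read_counts: List[int]) -> List[float]:
    """Estimate, for one patient target gene, the fraction of cells carrying
    each mutation as (reads with mutation j) / (all mutation-bearing reads
    in that gene). Assumes one listed mutation per read (simplification)."""
    if any(c <= 0 for c in mutant_read_counts):
        raise ValueError("each detected mutation needs a positive read count")
    total = sum(mutant_read_counts)
    if total == 0:
        raise ValueError("need at least one mutation-bearing read")
    return [c / total for c in mutant_read_counts]


def log2_fold_change(disease_expr: float, normal_expr: float, pseudocount: float = 1.0) -> float:
    """log2 fold change of expression in diseased vs. normal tissue.
    The pseudocount is an assumption added to keep the ratio finite."""
    return math.log2((disease_expr + pseudocount) / (normal_expr + pseudocount))
```

With read counts of 30 and 10 for two mutations in the same gene, the estimated fractions are 0.75 and 0.25, both within the stated range (greater than 0 and at most 1); an expression of 7 in diseased tissue against 1 in normal tissue gives a log2 fold change of 2 under a pseudocount of 1.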
In some embodiments, the composite score is calculated from one or more gene-related parameters, such as: i) the correlation (positive or negative) between a patient target gene and anticancer drug treatment (e.g., in terms of IC50) (parameter [Figure 02_image024]), derived from machine learning (e.g., from a model trained on public data on cell lines); ii) the normalized weight of a patient target gene in the response to anticancer drug treatment (parameter [Figure 02_image027]), derived from machine learning (e.g., from a model trained on public data on cell lines); iii) the predicted impact of a deleterious mutation in a patient target gene (parameter [Figure 02_image030]; e.g., based on deleteriousness predictions from public databases, such as aberrant gene or gene product activity); iv) the ratio of net survival to overall survival for a patient target gene at a given time point according to Kaplan-Meier survival curves (parameter [Figure 02_image032]; e.g., based on The Cancer Genome Atlas (TCGA) database and/or the cBioPortal database); v) the log-scale (e.g., log2) fold change in the expression level of a patient target gene in the patient's diseased tissue relative to normal tissue (parameter [Figure 02_image006]; e.g., based on a patient database, i.e., information collected from patients with the same cancer); etc. In some embodiments, one or more gene-related parameters are derived from data in public or patient databases and are used to train the composite score model. The parameter [Figure 02_image036] denotes the correlation (positive or negative) between the i-th patient target gene identified from the patient and anticancer drug treatment (e.g., in terms of IC50), derived from machine learning. The parameter [Figure 02_image039] denotes the normalized weight of the i-th patient target gene in the response to anticancer drug treatment (i.e., the contribution of loss of function of the i-th patient target gene to the anticancer drug treatment), derived from machine learning. The parameter [Figure 02_image041] denotes the predicted impact of the j-th deleterious mutation in the i-th patient target gene (e.g., based on deleteriousness predictions from public databases, or a manually assigned constant). The parameter [Figure 02_image043] denotes the ratio of net survival to overall survival for the i-th patient target gene at a given time point according to Kaplan-Meier survival curves (e.g., based on the TCGA and/or cBioPortal databases). The parameter [Figure 02_image021] denotes the log-scale (e.g., log2) fold change in the expression level of the i-th patient target gene in the patient's diseased tissue relative to normal tissue (e.g., based on a patient database, i.e., information collected from patients with the same cancer). The index i is an integer, and 0 ≤ i ≤ m. The index j is an integer, and 1 ≤ j ≤ [Figure 02_image001].

In some embodiments, the composite score is calculated from one or more pathway-related parameters, such as: i) the estimated weight of a patient target gene in the pathways and/or regulatory networks involving that gene (parameter [Figure 02_image050]; e.g., based on public databases such as KEGG and InterProScan); ii) the normalized weight of a patient target gene in pathways associated with the anticancer drug (parameter [Figure 02_image052]; e.g., based on public databases); etc. In some embodiments, one or more pathway-related parameters are derived from data in public databases and are used to train the composite score model. The parameter [Figure 02_image050] denotes the estimated weight of the i-th patient target gene in the pathways and/or regulatory networks involving the i-th patient target gene (e.g., based on public databases such as KEGG and InterProScan). The parameter [Figure 02_image055] denotes the normalized weight of the i-th patient target gene in pathways associated with the anticancer drug (e.g., based on public databases). The index i is an integer, and 0 ≤ i ≤ m.

In some embodiments, the composite score is calculated from one or more parameters selected from one or more of the following: the patient-related parameters, gene-related parameters, and pathway-related parameters described herein.

In some embodiments, the composite score is calculated using Formula I below:

[Formula I: Figure 02_image059]

wherein:

[Figure 02_image061], [Figure 02_image063], and [Figure 02_image065] are model tuning constants (e.g., constants from a model trained for the corresponding anticancer drug), where -1 ≤ [Figure 02_image061] ≤ 1, -1 ≤ [Figure 02_image063] ≤ 1, and -1 ≤ [Figure 02_image065] ≤ 1;

m ([Figure 02_image018]) is the total number of drug-resistant genes and drug-sensitive genes identified using any of the target gene identification methods described herein, or the total number of target genes in the combination of panels (i) and (ii) above;

[Figure 02_image001] is the number of deleterious mutations detected in the i-th patient target gene of the patient;

[Figure 02_image008] is the estimated fraction of patient cells carrying the j-th deleterious mutation in the i-th patient target gene, where 0 < [Figure 02_image008] ≤ 1;

[Figure 02_image036] is the correlation (positive or negative) between the i-th patient target gene and anticancer drug treatment (e.g., in terms of IC50);

[Figure 02_image039] is the normalized weight of the i-th patient target gene in the response to anticancer drug treatment;

[Figure 02_image041] is the predicted impact of the j-th deleterious mutation in the i-th patient target gene;

[Figure 02_image074] is the ratio of net survival to overall survival for the i-th patient target gene at a given time point according to Kaplan-Meier survival curves;

[Figure 02_image021] is the log-scale (e.g., log2) fold change in the expression level of the i-th patient target gene in diseased tissue relative to normal tissue;

[Figure 02_image077] is the estimated weight of the i-th patient target gene in the pathways and/or regulatory networks involving the i-th patient target gene;

[Figure 02_image055] is the normalized weight of the i-th patient target gene in pathways associated with the anticancer drug;

wherein i and j are both integers, 0 ≤ i ≤ m, and 1 ≤ j ≤ [Figure 02_image001]; and

[Figure 02_image082] is the standard score ("Z-score") of [Figure 02_image084]:

[Z-score formula: Figure 02_image086]

wherein [Figure 02_image088] is the median log-scale (e.g., log2) fold change in the expression level of the i-th patient target gene in diseased tissue relative to normal tissue (e.g., based on a patient database, i.e., information collected from patients with the same cancer); and

wherein [Figure 02_image091] is the standard deviation of the log-scale (e.g., log2) fold change in the expression level of the i-th patient target gene in diseased tissue relative to normal tissue (e.g., based on a patient database, i.e., information collected from patients with the same cancer).
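The Z-score formula itself is only available as an image, so the following Python sketch is an assumption built from the definitions above: it centers the patient's log2 fold change on the cohort median and divides by the cohort standard deviation. The function name is hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is a further assumption.

```python
import statistics
from typing import Sequence


def expression_z_score(patient_lfc: float, cohort_lfc: Sequence[float]) -> float:
    """Standard score of a patient's log2 fold change for one gene,
    relative to log2 fold changes from patients with the same cancer.

    Assumed form: (patient value - cohort median) / cohort standard deviation.
    """
    center = statistics.median(cohort_lfc)
    spread = statistics.stdev(cohort_lfc)  # sample SD; needs at least 2 values
    if spread == 0:
        raise ValueError("cohort values are identical; Z-score is undefined")
    return (patient_lfc - center) / spread
```

For a cohort with log2 fold changes of -1, 0, and 1 (median 0, sample standard deviation 1), a patient value of 2 yields a Z-score of 2.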

In some embodiments, the composite score threshold level is 0. In some embodiments, a patient is suitable for (i.e., likely to benefit from) anticancer drug treatment if the patient's composite score according to Formula I is above 0. In some embodiments, a patient is selected or recommended for anticancer drug treatment if the patient's composite score according to Formula I is at least 0.1 (e.g., 0.3). In some embodiments, if the patient's composite score according to Formula I is greater than 0 but less than 0.1, the patient is suitable for anticancer drug treatment, but whether the patient should be selected or recommended for the treatment should be determined by further evaluation using other methods (e.g., drug dose testing, cancer genetic testing (e.g., to look for other synergistic mutations that may favor anticancer drug treatment, or to verify the primary cancer type), etc.) or based on other information (e.g., the patient's clinical records or known drug resistance, etc.). In some embodiments, if the patient's composite score according to Formula I is less than or equal to 0, the patient is not suitable for (i.e., unlikely to benefit from) or is excluded from anticancer drug treatment. In some embodiments, if the patient's composite score according to Formula I is equal to 0 or very close to 0 (e.g., -0.1 to 0), further evaluation using other methods (e.g., drug dose testing, cancer genetic testing (e.g., to look for other synergistic mutations that may favor anticancer drug treatment, or to verify the primary cancer type), etc.) or based on other information (e.g., the patient's clinical records or known drug resistance, etc.) should be carried out before the patient is completely excluded from anticancer drug treatment.
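The threshold embodiments above can be combined into a single triage rule, sketched below in Python. Treating the separate embodiments as one decision procedure is an assumption of this sketch, as are the function name, the label strings, and the default cutoffs (0.1 for selection and a 0.1 margin below zero for "very close to 0"), which are taken from the examples in the text.

```python
def triage(score: float, select_cutoff: float = 0.1, near_zero_margin: float = 0.1) -> str:
    """Map a Formula I composite score to a recommendation tier.

    score >= select_cutoff          -> select/recommend treatment
    0 < score < select_cutoff       -> suitable, but evaluate further
    -near_zero_margin <= score <= 0 -> evaluate further before excluding
    score < -near_zero_margin       -> exclude from treatment
    """
    if score >= select_cutoff:
        return "select"
    if score > 0:
        return "suitable_needs_further_evaluation"
    if score >= -near_zero_margin:
        return "evaluate_before_excluding"
    return "exclude"
```

Under these defaults, a score of 0.3 gives "select", 0.05 gives "suitable_needs_further_evaluation", 0 gives "evaluate_before_excluding", and -0.5 gives "exclude". In practice the cutoffs would be chosen per cancer type and drug.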

Accordingly, in some embodiments, there is provided a method of treating cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected based on: i) one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes, and ii) one or more drug-resistant aberrations (e.g., drug-resistant mutations) in one or more drug-resistant genes, wherein the drug-sensitive genes and drug-resistant genes are identified using any of the target gene identification methods described herein, and wherein the individual is selected for treatment if the composite score of the drug-sensitive aberrations (e.g., drug-sensitive mutations) and drug-resistant aberrations (e.g., drug-resistant mutations) is above a composite score threshold level; wherein the composite score is obtained as (the absolute value of the sum of the sensitivity scores of the drug-sensitive genes) minus (the absolute value of the sum of the resistance scores of the drug-resistant genes), and the composite score threshold level is zero. In some embodiments, there is provided a method of treating cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected based on: i) one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes, and ii) one or more drug-resistant aberrations (e.g., drug-resistant mutations) in one or more drug-resistant genes, wherein the drug-sensitive genes and drug-resistant genes are identified using any of the target gene identification methods described herein, and wherein the individual is selected for treatment if the composite score of the drug-sensitive aberrations (e.g., drug-sensitive mutations) and drug-resistant aberrations (e.g., drug-resistant mutations) according to Formula I is above zero (e.g., at least 0.1 (e.g., 0.3)). In some embodiments, the method further comprises detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., mutation, aberrant expression, aberrant activity, aberrant modification) and one or more drug-resistant aberrations (e.g., mutation, aberrant expression, aberrant activity, aberrant modification).

In some embodiments, there is provided a method of treating colorectal cancer in an individual (e.g., a human), comprising administering to the individual an effective amount of a PARPi, wherein the individual is selected based on: i) one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes selected from the group consisting of ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1, and ii) one or more drug-resistant aberrations (e.g., drug-resistant mutations) in one or more drug-resistant genes selected from the group consisting of AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2, and wherein the individual is selected for treatment if the composite score of the drug-sensitive aberrations (e.g., drug-sensitive mutations) and drug-resistant aberrations (e.g., drug-resistant mutations) is above a composite score threshold level. In some embodiments, the method further comprises detecting one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) and one or more drug-resistant aberrations (e.g., drug-resistant mutations) in a sample from the individual. In some embodiments, the method further comprises obtaining a composite score for the individual. In some embodiments, the composite score is obtained as: (the absolute value of the sum of the sensitivity scores of the drug-sensitive genes) minus (the absolute value of the sum of the resistance scores of the drug-resistant genes), wherein the individual is selected for treatment if the composite score is above zero. In some embodiments, the composite score is obtained according to Formula I, wherein the individual is selected for treatment if the composite score is above zero (e.g., at least 0.1 (e.g., 0.3)).

In some embodiments, provided herein are methods of identifying an individual (e.g., a human) having cancer who may benefit from a treatment comprising administering an anticancer drug, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes identified by any of the target gene identification methods described herein, and one or more drug-resistance aberrations (e.g., drug-resistance mutations) in one or more drug-resistant genes identified by any of the target gene identification methods described herein, wherein the individual is identified as one who may benefit from the treatment if the composite score of the drug-sensitive aberrations (e.g., drug-sensitive mutations) and drug-resistance aberrations (e.g., drug-resistance mutations) is above a composite score threshold level. In some embodiments, provided herein is a method of identifying an individual (e.g., a human) having colorectal cancer who may benefit from a treatment comprising administering a PARPi, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes selected from the group consisting of ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1; and one or more drug-resistance aberrations (e.g., drug-resistance mutations) in one or more drug-resistant genes selected from the group consisting of AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2, wherein the individual is identified as one who may benefit from the treatment if the composite score of the drug-sensitive aberrations (e.g., drug-sensitive mutations) and drug-resistance aberrations (e.g., drug-resistance mutations) is above a composite score threshold level. In some embodiments, the method further comprises detecting one or more drug-sensitive aberrations (e.g., mutation, aberrant expression, aberrant activity, aberrant modification) and one or more drug-resistance aberrations (e.g., mutation, aberrant expression, aberrant activity, aberrant modification) in the sample from the individual. In some embodiments, the method further comprises obtaining a composite score for the individual. In some embodiments, the composite score is obtained as (the absolute value of the total of the sensitivity scores of the drug-sensitive genes) minus (the absolute value of the total of the resistance scores of the drug-resistant genes), wherein the individual is identified as one who may benefit from the treatment if the composite score is above zero. In some embodiments, the composite score is obtained according to Formula I, wherein the individual is identified as one who may benefit from the treatment if the composite score is above zero (e.g., greater than or equal to at least 0.1 (e.g., 0.3)).

In some embodiments, provided herein are methods of selecting a treatment for an individual (e.g., a human) having cancer, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes identified using any of the target gene identification methods described herein, and one or more drug-resistance aberrations (e.g., drug-resistance mutations) in one or more drug-resistant genes identified using any of the target gene identification methods described herein, wherein a treatment comprising administering an anticancer drug is identified as a suitable treatment for the individual if the composite score of the drug-sensitive aberrations (e.g., drug-sensitive mutations) and drug-resistance aberrations (e.g., drug-resistance mutations) in the sample is above a composite score threshold level. In some embodiments, provided herein are methods of selecting a treatment for an individual (e.g., a human) having colorectal cancer, the method comprising detecting, in a sample from the individual, one or more drug-sensitive aberrations (e.g., drug-sensitive mutations) in one or more drug-sensitive genes selected from the group consisting of ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1, and one or more drug-resistance aberrations (e.g., drug-resistance mutations) in one or more drug-resistant genes selected from the group consisting of AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2, wherein a treatment comprising administering a PARPi is identified as a suitable treatment for the individual if the composite score of the drug-sensitive aberrations (e.g., drug-sensitive mutations) and drug-resistance aberrations (e.g., drug-resistance mutations) in the sample is above a composite score threshold level. In some embodiments, the method further comprises detecting one or more drug-sensitive aberrations (e.g., mutation, aberrant expression, aberrant activity, aberrant modification) and one or more drug-resistance aberrations (e.g., mutation, aberrant expression, aberrant activity, aberrant modification) in the sample from the individual. In some embodiments, the method further comprises obtaining a composite score for the individual. In some embodiments, the composite score is obtained as (the absolute value of the total of the sensitivity scores of the drug-sensitive genes) minus (the absolute value of the total of the resistance scores of the drug-resistant genes), wherein a treatment comprising administering a PARPi is identified as a suitable treatment for the individual if the composite score is above zero. In some embodiments, the composite score is obtained according to Formula I, wherein a treatment comprising administering a PARPi is identified as a suitable treatment for the individual if the composite score is greater than zero (e.g., greater than or equal to at least 0.1 (e.g., 0.3)).
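The subtraction-based composite score described in the embodiments above can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the per-gene score values, and the default threshold are assumptions made for the example, and Formula I (which is not reproduced in this section) may weight or normalize the scores differently.

```python
from typing import Iterable


def composite_score(sensitivity_scores: Iterable[float],
                    resistance_scores: Iterable[float]) -> float:
    """|total sensitivity score| minus |total resistance score|."""
    return abs(sum(sensitivity_scores)) - abs(sum(resistance_scores))


def select_for_treatment(sensitivity_scores: Iterable[float],
                         resistance_scores: Iterable[float],
                         threshold: float = 0.0) -> bool:
    """Select the individual when the composite score exceeds the threshold.

    threshold=0.0 mirrors the "above zero" embodiment; a stricter cutoff
    such as 0.1 or 0.3 can be passed instead.
    """
    return composite_score(sensitivity_scores, resistance_scores) > threshold


# Hypothetical per-gene scores for aberrations detected in one sample.
example_sensitivity = [-1.5, -0.5]   # e.g., two aberrant drug-sensitive genes
example_resistance = [0.25, 0.75]    # e.g., two aberrant drug-resistant genes

score = composite_score(example_sensitivity, example_resistance)
selected = select_for_treatment(example_sensitivity, example_resistance)
```

With these illustrative values the sensitivity total has absolute value 2.0 and the resistance total has absolute value 1.0, so the composite score is 1.0 and the individual would be selected under the "above zero" rule.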

IV. Modified Cancer Cells and Methods of Production

One aspect of the present invention provides methods of producing modified cancer cells, such as modified cancer cells that are resistant to an anticancer drug or sensitive to an anticancer drug. In some embodiments, the method of producing a modified cancer cell comprises inactivating, in a cancer cell, one or more target genes identified by any of the target gene identification methods described herein. Also provided are modified cancer cells produced by any of the methods described herein.

In some embodiments, the method of producing a modified cancer cell comprises generating one or more mutations (e.g., inactivating mutations) in one or more target genes identified by any of the target gene identification methods described herein. In some embodiments, the method comprises contacting an initial population of cancer cells with a mutagen, and selecting modified cancer cells comprising one or more mutations (e.g., inactivating mutations) in one or more target genes identified herein. Methods for detecting such mutations are well known in the art, such as PCR. In some embodiments, the method comprises generating one or more mutations (e.g., inactivating mutations) in one or more target genes identified herein in a cancer cell by gene editing, such as any gene editing method known in the art or described herein, for example, non-homologous end joining (NHEJ)- or homologous recombination-mediated gene disruption, or ZFN-, TALEN-, or CRISPR/Cas-mediated gene disruption. In some embodiments, the method of producing a modified cancer cell comprises introducing an sgRNA construct into a host cancer cell, wherein the sgRNA construct comprises or encodes an sgRNA (e.g., an sgRNA, or a vector (e.g., a viral vector such as a lentiviral vector) carrying a nucleic acid encoding the sgRNA), and wherein the sgRNA comprises a guide sequence that is complementary (e.g., at least about any one of 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% complementary) to a target site in a target gene identified herein. In some embodiments, the method further comprises introducing a vector (e.g., a viral vector such as a lentiviral vector) carrying a nucleic acid encoding a Cas protein (e.g., Cas9), or a Cas (e.g., Cas9) mRNA, into the host cancer cell or into the host cancer cell comprising the sgRNA construct. In some embodiments, the host cancer cell comprises a Cas component. In some embodiments, the sgRNA construct targeting the target gene and/or the Cas component (e.g., a vector, or mRNA) comprising a Cas protein or a nucleic acid encoding a Cas protein are introduced into the host cancer cell simultaneously. In some embodiments, the nucleic acid encoding the sgRNA targeting the target gene and/or the nucleic acid encoding the Cas protein are on the same vector, under the control of the same promoter or under the control of separate promoters. In some embodiments, the nucleic acid encoding the sgRNA targeting the target gene and/or the nucleic acid encoding the Cas protein are connected by one or more IRES linking sequences and are under the control of the same promoter. In some embodiments, the nucleic acid encoding the sgRNA targeting the target gene and/or the nucleic acid encoding the Cas protein are on different vectors. In some embodiments, the sgRNA construct targeting the target gene and/or the Cas component (e.g., a vector, or mRNA) comprising a Cas protein or a nucleic acid encoding a Cas protein are introduced into the host cancer cell sequentially.

In some embodiments, when a population of host cancer cells (or an initial population of cancer cells) is used to produce the modified cancer cells described herein, the method comprises one or more isolation and/or enrichment steps, for example, isolating and/or enriching, from the population of cancer cells contacted with a modifying agent described herein, cancer cells comprising one or more mutations (e.g., inactivating mutations) in the target gene, the target gene sgRNA construct, or the Cas component. Such isolation and/or enrichment steps can be performed using any technique known in the art or described herein, such as magnetic-activated cell sorting (MACS). See also the methods described in the "Optional enrichment step" subsection above.

In some embodiments, the target gene sgRNA construct and/or the Cas component are introduced into the host cancer cell by transduction/transfection of a nucleic acid (DNA or RNA) or a vector encoding it (e.g., a non-viral vector, or a viral vector such as a lentiviral vector), or of a virus (e.g., a lentivirus) comprising a nucleic acid encoding it. In some embodiments, the Cas component (e.g., Cas9 protein) is introduced into the host cancer cell by inserting the protein through the cell membrane while passing the cell through a microfluidic system such as CELL SQUEEZE® (see, e.g., U.S. Patent Application Publication US20140287509).

Methods of introducing vectors (e.g., viral vectors) or isolated nucleic acids into mammalian cells are known in the art. The nucleic acids or vectors described herein can be transferred into cancer cells by physical, chemical, or biological methods.

Physical methods for introducing a vector (e.g., a viral vector) into cancer cells include calcium phosphate precipitation, lipofection, particle bombardment, microinjection, electroporation, and the like. Methods for producing cells comprising vectors and/or exogenous nucleic acids are well known in the art. See, e.g., Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York. In some embodiments, the vector (e.g., a viral vector) is introduced into cancer cells by electroporation.

Biological methods for introducing a vector into cancer cells include the use of DNA and RNA vectors. Viral vectors have become the most widely used method for inserting genes into mammalian (e.g., human) cells.

Chemical methods for introducing a vector (e.g., a viral vector) into cancer cells include colloidal dispersion systems, such as macromolecular complexes, nanocapsules, microspheres, beads, and lipid-based systems including oil-in-water emulsions, micelles, mixed micelles, and liposomes. An exemplary colloidal system for use as a delivery vehicle in vitro is a liposome (e.g., an artificial membrane vesicle).

In some embodiments, RNA molecules (e.g., sgRNA or mRNA encoding Cas9) can be prepared by conventional methods (e.g., in vitro transcription) and then introduced into cancer cells by known methods (e.g., mRNA electroporation). See, e.g., Rabinovich et al., Human Gene Therapy 17:1027-1035.

In some embodiments, a viral vector (e.g., a lentiviral vector) or virus (e.g., a lentivirus) comprising a nucleic acid encoding any of the target gene sgRNA, the target gene sgRNA iBAR, and/or the Cas protein described herein is contacted with the host cancer cells (or the initial population of cancer cells) at an MOI of, for example, at least about 1, such as at least about any one of 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 8, 9, or 10. In some embodiments, a viral vector (e.g., a lentiviral vector) or virus (e.g., a lentivirus) comprising a nucleic acid encoding any of the target gene sgRNA, the target gene sgRNA iBAR, and/or the Cas protein described herein is contacted with the host cancer cells (or the initial population of cancer cells) at an MOI of about 3.
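As a rough planning aid for the MOI values listed above, the following Python sketch computes how many transducing units are needed for a given number of cells and estimates the fraction of cells that receive at least one viral particle. The Poisson model of infection is a common textbook approximation and is an assumption of this example, not something stated in this document; the function names and the cell count are likewise illustrative.

```python
import math


def transducing_units_needed(n_cells: int, moi: float) -> float:
    """MOI is the ratio of infectious (transducing) units to target cells."""
    return n_cells * moi


def poisson_fraction(k: int, moi: float) -> float:
    """Assumed Poisson model: fraction of cells receiving exactly k particles."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def fraction_transduced(moi: float) -> float:
    """Fraction of cells receiving at least one particle under the same model."""
    return 1.0 - poisson_fraction(0, moi)


# Hypothetical experiment: one million cells transduced at an MOI of about 3.
units = transducing_units_needed(1_000_000, 3)
covered = fraction_transduced(3)
```

Under this approximation, an MOI of 3 leaves about 5% of cells untransduced, and most transduced cells carry more than one construct.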

In some embodiments, the transduced/transfected cancer cells are propagated ex vivo after introduction of the vector or isolated nucleic acid. In some embodiments, the transduced/transfected cancer cells are cultured to propagate for at least about any one of 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 10 days, 12 days, or 14 days. In some embodiments, the transduced/transfected cancer cells are further evaluated or screened to select the desired modified cancer cells described herein.

Reporter genes can be used to identify potentially transfected/transduced cells and to evaluate the functionality of regulatory sequences. In general, a reporter gene is a gene that is not present in or expressed by the recipient organism or tissue and that encodes a polypeptide whose expression is manifested by some easily detectable property, such as enzymatic activity. Expression of the reporter gene is assayed at a suitable time after the DNA/RNA has been introduced into the recipient cells. Suitable reporter genes may include genes encoding luciferase, β-galactosidase, chloramphenicol acetyltransferase, secreted alkaline phosphatase, or green fluorescent protein (GFP) (e.g., Ui-Tei et al. FEBS Letters 479: 79-82 (2000)). Suitable expression systems are well known and can be prepared using known techniques or obtained commercially. Antibiotic selection markers can also be used to identify potentially transfected/transduced cells.

Other methods for confirming the presence of any of the nucleic acids described herein (e.g., sgRNA constructs), or the presence of a mutation (e.g., an inactivating mutation) in a target gene of a modified cancer cell, include, for example, molecular biology assays well known to those skilled in the art, such as Southern and Northern blotting, RT-PCR, PCR, DNA-seq, or RNA-seq; and biochemical assays, such as detecting the presence or absence of a particular peptide, for example, by immunological methods (such as ELISA and Western blotting), fluorescence-activated cell sorting (FACS), or magnetic-activated cell sorting (MACS).

In some embodiments, there is provided a modified colorectal cancer cell comprising one or more mutations (e.g., inactivating mutations such as knockouts) in one or more target genes, wherein the target genes are selected from the group consisting of ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1. In some embodiments, there is provided a modified colorectal cancer cell comprising one or more mutations (e.g., inactivating mutations such as knockouts) in one or more target genes, wherein the target genes are selected from the group consisting of AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, Kras, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2.

In some embodiments, there is provided a method of screening for an anticancer drug capable of treating a cancer (e.g., colorectal cancer) in an individual (e.g., a human), wherein the cancer comprises one or more drug-resistance mutations in one or more drug-resistant genes identified using any of the target gene identification methods described herein, the method comprising: a) providing a cancer cell library comprising the one or more drug-resistance mutations in the one or more drug-resistant genes, and b) contacting the cancer cell library with one or more candidate anticancer drugs individually, wherein a candidate anticancer drug that inhibits growth of the cancer cell library beyond a certain threshold (e.g., inhibits growth by at least about 10%, 20%, 30%, 40%, 50%, or more) is identified as an anticancer drug capable of treating the cancer in the individual.
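The threshold step in b) can be illustrated with a short Python sketch. It assumes, purely for the example, that growth is read out as a single number per condition (for instance a viable-cell count) and that inhibition is computed relative to an untreated control; the function names, drug labels, and readout values are hypothetical.

```python
from typing import Dict, List


def growth_inhibition(treated: float, control: float) -> float:
    """Fractional growth inhibition relative to an untreated control."""
    if control <= 0:
        raise ValueError("control growth readout must be positive")
    return 1.0 - treated / control


def select_candidates(readouts: Dict[str, float],
                      control: float,
                      threshold: float = 0.5) -> List[str]:
    """Return the candidate drugs whose inhibition meets the threshold."""
    return [drug for drug, treated in readouts.items()
            if growth_inhibition(treated, control) >= threshold]


# Hypothetical growth readouts for a resistant cancer cell library.
control_growth = 100.0
candidate_readouts = {"drug_A": 40.0, "drug_B": 90.0, "drug_C": 50.0}

hits = select_candidates(candidate_readouts, control_growth, threshold=0.5)
```

Here drug_A (60% inhibition) and drug_C (50% inhibition) meet a 50% threshold, whereas drug_B (about 10% inhibition) does not; lowering the threshold to 0.1 or 0.2 corresponds to the less stringent cutoffs listed in the text.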

V. Kits and Articles of Manufacture

The present application also provides kits and articles of manufacture for use in any embodiment of the methods of identifying target genes in cancer cells described herein, such as those using the sgRNA libraries or sgRNA iBAR libraries described herein. Also provided are kits and articles of manufacture for producing modified cancer cells that are sensitive or resistant to an anticancer drug.

In some embodiments, there is provided a kit for identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, comprising any of the sgRNA libraries or sgRNA iBAR libraries described herein. In some embodiments, the kit further comprises a Cas protein or a nucleic acid encoding a Cas protein (e.g., Cas9). In some embodiments, the kit further comprises one or more positive and/or negative control sgRNA iBAR constructs, or one or more positive and/or negative control sgRNA constructs. In some embodiments, the kit further comprises an anticancer drug and/or the initial population of cancer cells, or cancer cells comprising the Cas component. In some embodiments, the kit further comprises data analysis software. In some embodiments, the kit comprises instructions for carrying out any of the methods described herein.

In some embodiments, there is provided a kit for identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, comprising any of the cancer cell libraries described herein, such as a cancer cell library comprising mutations (e.g., inactivating mutations) in some or all hit genes of the genome (or of cancer-associated genes), or a cancer cell library comprising any of the sgRNA libraries or sgRNA iBAR libraries described herein. In some embodiments, the kit further comprises a Cas protein or a nucleic acid encoding the Cas protein. In some embodiments, the kit further comprises an anticancer drug. In some embodiments, the kit further comprises a control cancer cell library, such as one having one or more mutations (e.g., inactivating mutations) in a non-genic region of the genome, or comprising one or more endogenous cancer mutations, or comprising one or more positive and/or negative control sgRNA constructs or one or more positive and/or negative control sgRNA iBAR constructs. In some embodiments, the kit further comprises data analysis software. In some embodiments, the kit comprises instructions for carrying out any of the methods described herein.

The kit may comprise other components, such as containers, reagents, culture media, primers, buffers, enzymes, and the like, to facilitate carrying out any of the screening methods described herein. In some embodiments, the kit comprises reagents, buffers, and vectors for introducing the sgRNA library or sgRNA iBAR library and the Cas protein or the nucleic acid encoding the Cas protein into cancer cells. In some embodiments, the kit comprises primers, reagents, and enzymes (e.g., polymerases) for preparing a sequencing library of sequences comprising the hit gene mutations (e.g., inactivating mutations), sgRNA sequences, or sgRNA iBAR sequences extracted from the post-treatment cancer cell population.

The kits of the present application are provided in suitable packaging. Suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like. Kits may optionally provide additional components such as buffers and interpretive information. The present application thus also provides articles of manufacture, which include vials (such as sealed vials), bottles, jars, flexible packaging, and the like.

An article of manufacture may comprise a container and a label or package insert on or associated with the container. Suitable containers include, for example, bottles, vials, syringes, and the like. The containers may be formed from a variety of materials such as glass or plastic. Generally, the container holds a composition (e.g., modified cancer cells that are sensitive or resistant to an anticancer drug) and may have a sterile access port. A package insert refers to the instructions customarily included in commercial packages, which contain information about the directions for use and/or warnings concerning the use of such products. The article of manufacture may further include other materials desirable from a commercial and user standpoint, including other buffers, diluents, and filters.

Examples

The following examples and exemplary embodiments are intended to be purely illustrative of the invention and therefore should not be considered to limit the invention in any way. The following examples and detailed description are offered by way of illustration and not by way of limitation.

Example 1: Identification of drug-sensitive genes and drug-resistant genes in cancer cells

This example provides an exemplary method for identifying drug-sensitive genes and/or drug-resistant genes. Briefly, a cancer cell library carrying sgRNA iBARs that target cancer-associated genes was constructed for Cas9-mediated gene knockout (KO). By testing the killing efficacy of an anticancer drug (e.g., PARPi) on the constructed Cas9+ sgRNA iBAR cancer cell library, genes whose knockout confers a resistant or a sensitive phenotype to killing by the anticancer drug can be identified. Figures 1-2 show an exemplary workflow.

1. Design and construction of the sgRNA iBAR library

Based on public databases, genes that, in patients with stage III and stage IV colorectal cancer, had a DNA mutation frequency ≥5% and an RNA expression level up-regulated or down-regulated by more than 2-fold, and that are expressed intracellularly or on the cell surface, were selected as library genes for sgRNA iBAR design (1323 genes in total).
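The two numeric selection criteria can be sketched as a simple filter. This is a minimal illustration only: the `GeneRecord` type, the field names, and the example records are hypothetical, the fold change is assumed to be expressed as a tumor/normal ratio, and the subcellular-location criterion is not modeled.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    symbol: str
    mutation_frequency: float      # fraction of patients with a DNA mutation (0-1)
    expression_fold_change: float  # assumed tumor/normal RNA expression ratio (> 0)


def is_library_gene(rec, min_mut_freq=0.05, min_fold=2.0):
    """Return True if the gene meets both selection criteria from the text."""
    up_regulated = rec.expression_fold_change > min_fold
    down_regulated = rec.expression_fold_change < 1.0 / min_fold
    return rec.mutation_frequency >= min_mut_freq and (up_regulated or down_regulated)


# Hypothetical records, not data from the source.
records = [
    GeneRecord("GENE_A", 0.12, 3.1),  # mutated and up-regulated: kept
    GeneRecord("GENE_B", 0.08, 0.4),  # mutated and down-regulated: kept
    GeneRecord("GENE_C", 0.02, 5.0),  # mutation frequency too low: dropped
    GeneRecord("GENE_D", 0.30, 1.5),  # expression change too small: dropped
]
selected = [r.symbol for r in records if is_library_gene(r)]
print(selected)
```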

The sgRNA iBAR library was designed and constructed in a manner similar to that described in WO2020125762 and in Zhu et al., "Guide RNAs with embedded barcodes boost CRISPR-pooled screens," Genome Biol. 2019; 20:20, the contents of each of which are incorporated herein by reference in their entirety. Briefly, the 1323 genes selected above were retrieved from the UCSC human genome. The sgRNAs targeting each gene were designed using the DeepRank algorithm (see Zhu et al.), with three different targeting sgRNAs per gene, and four 6-bp iBARs (iBAR6) were randomly assigned to each sgRNA ("sgRNA iBAR"). The internal barcode sequence was designed to sit in the tetraloop of the gRNA scaffold, outside the Cas9-sgRNA ribonucleoprotein complex, where it does not affect the activity of the upstream guide sequence. In addition, 500 control sgRNAs that do not target any human gene were designed as negative controls, and four iBAR6s were randomly assigned to each control sgRNA ("control sgRNA iBAR"). The designed CRISPR sgRNA iBAR library therefore comprised 17876 sgRNA iBARs in total (15876 targeting and 2000 control).
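The library size follows from the design: (1323 genes × 3 sgRNAs + 500 control sgRNAs) × 4 barcodes = 17876. The sketch below reproduces that count with a random barcode assignment. It is an illustration under assumptions: the identifiers are placeholders, the guide sequences themselves are not designed here, and barcodes are drawn uniformly from all 4096 possible 6-mers, whereas a real design may exclude some barcode sequences.

```python
import itertools
import random


def assign_ibars(sgrna_ids, ibars_per_sgrna=4, ibar_length=6, seed=0):
    """Pair every sgRNA with distinct, randomly chosen barcodes."""
    rng = random.Random(seed)
    # All possible barcodes of the given length (4**6 = 4096 for iBAR6).
    all_ibars = ["".join(p) for p in itertools.product("ACGT", repeat=ibar_length)]
    library = []
    for sgrna in sgrna_ids:
        # rng.sample draws without replacement, so the barcodes of one sgRNA differ.
        for ibar in rng.sample(all_ibars, ibars_per_sgrna):
            library.append((sgrna, ibar))
    return library


n_genes, sgrnas_per_gene, n_controls = 1323, 3, 500
targeting = [f"gene{g}_sg{s}" for g in range(n_genes) for s in range(sgrnas_per_gene)]
controls = [f"ctrl{c}" for c in range(n_controls)]

library = assign_ibars(targeting + controls)
print(len(library))  # 17876
```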

DNA oligonucleotides encoding the sgRNA iBARs were designed, synthesized (Twist Bioscience), and PCR-amplified. The PCR products were purified with a PCR purification kit and cloned by Golden Gate cloning into a lentiviral sgRNA iBAR expression backbone, modified in-house from pLenti-sgRNA-Lib (Addgene #53121), to obtain the sgRNA iBAR plasmids. These plasmids encode 15876 sgRNA iBARs covering 1323 human genes (3 sets of sgRNA iBARs per gene, targeting 3 different target sites in the gene; each set contains 4 sgRNA iBARs) and 2000 control sgRNA iBARs targeting 500 non-genic regions (1 set of sgRNA iBARs per non-genic region; each set contains 4 sgRNA iBARs).

To ensure adequate representation of the sgRNA iBARs in the cancer cell library (at least 1000-fold coverage of each sgRNA iBAR), 10 electroporation reactions were performed with the sgRNA iBAR plasmids obtained above. For each reaction, 1 μL of sgRNA iBAR plasmid was added to a sterile 1.5 mL Eppendorf tube, 50 μL of competent cells (E. coli) were added and mixed by swirling, and the mixture was electroporated. Immediately afterwards, 950 μL of antibiotic-free SOC (Super Optimal broth with Catabolite repression) medium was added to each reaction tube, mixed gently by pipetting, and incubated in a shaker at 37°C and 225 rpm for 1 h. The resulting bacteria were transferred to 1 L of LB liquid medium supplemented with ampicillin and cultured overnight in a shaker at 37°C and 225 rpm. The next day, plasmid was extracted from the bacteria using the EndoFree® Plasmid Purification Kit (QIAGEN, #12391).

The sgRNA iBAR library lentivirus was produced using a standard protocol. Briefly, 1×10^7 293T cells were seeded in a 150 mm cell culture dish with 20 mL of cell culture medium and cultured overnight in a 37°C, 5% CO2 incubator. The next day, the medium was discarded and 10 mL of fresh serum-free medium was added to the 293T cells. A transfection complex was prepared from serum-free medium (4 mL), the sgRNA iBAR library plasmid obtained above (20 μg), pCMVR8.74 plasmid (20 μg), and pCMV-VSV-G plasmid (2 μg); after mixing, 105 μL of PEI was added, the complex was mixed again, and it was left to stand at room temperature for 15 min. The transfection complex was then added to the 293T cells in 10 mL of fresh serum-free medium, and the cells were incubated at 37°C, 5% CO2 for 6 h. The medium was discarded, 20 mL of fresh complete medium was added to the 293T cells, and the cells were cultured at 37°C, 5% CO2. After 3 days, the medium was collected and centrifuged at 1000 rpm at 4°C. The supernatant containing the sgRNA iBAR library lentivirus was collected, the viral titer was measured, and the supernatant was aliquoted for later use.

2. Construction of the Cas9+ sgRNA iBAR cancer cell library

HCT116 (a human colon cancer cell line) and SW480 (a human colorectal cancer cell line) were selected for Cas9+ sgRNA iBAR cancer cell library construction and drug treatment.

For each cell line, 2×10^5 cancer cells were seeded in a 6-well plate and cultured in a 37°C, 5% CO2 incubator. After 24 h, 100 μL of Cas9-packaged lentivirus was added to the cell culture medium and the cancer cells were cultured in a 37°C, 5% CO2 incubator. After a further 24 h, the medium was discarded and fresh complete medium was added to the cancer cells. The cancer cells were grown in a 37°C, 5% CO2 incubator for 7 days and then sorted by FACS using the mCherry marker carried on the Cas9 lentiviral vector. The sorted mCherry-fluorescent cancer cells are Cas9-expressing (Cas9+) cells, which were expanded for construction of the Cas9+ sgRNA iBAR library.

To ensure at least 1000-fold coverage of the sgRNA iBARs in the Cas9+ sgRNA iBAR cancer cell library, the sgRNA iBAR library lentivirus obtained above was added at an MOI of 3 to 2×10^7 Cas9+ cancer cells in antibiotic-free medium and mixed gently. The Cas9+ cancer cells were cultured in a 37°C, 5% CO2 incubator for 24 h for infection. The next day, the medium was discarded, fresh complete medium was added to the Cas9+ cancer cells, and the cells were cultured in a 37°C, 5% CO2 incubator. The Cas9+ cancer cells were passaged every 3 days in fresh complete medium supplemented with puromycin. Cas9+ cancer cells that were not successfully transduced with the sgRNA iBAR construct die under this selection. After two consecutive passages, the sgRNA iBAR cancer cell libraries were obtained (hereinafter the "Cas9+ sgRNA iBAR HCT116 library" and the "Cas9+ sgRNA iBAR SW480 library", respectively).
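The coverage requirement is simple arithmetic on the library size (17876 sgRNA iBARs) and the number of cells. The sketch below is a rough check, not part of the described protocol: the function names are illustrative, and the optional Poisson estimate of the transduced fraction (1 − e^(−MOI)) is a modeling assumption that ignores selection losses and uneven representation.

```python
import math


def cells_required(library_size, coverage=1000):
    """Minimum number of cells for the requested average coverage."""
    return library_size * coverage


def expected_coverage(n_cells, library_size, moi=None):
    """Average cells per sgRNA iBAR; optionally discount untransduced cells.

    Assumption: integrations per cell follow a Poisson distribution, so the
    fraction of cells with at least one integration is 1 - exp(-moi).
    """
    transduced = n_cells if moi is None else n_cells * (1.0 - math.exp(-moi))
    return transduced / library_size


LIBRARY_SIZE = 17876
print(cells_required(LIBRARY_SIZE))                          # 17876000
print(round(expected_coverage(2e7, LIBRARY_SIZE)))           # 1119
print(expected_coverage(2e7, LIBRARY_SIZE, moi=3) >= 1000)   # True
```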

3. Screening of the Cas9+ sgRNA iBAR cancer cell library treated with an anticancer drug

The drug toxicity curve of each cancer cell line was measured first, and the Cas9+ sgRNA iBAR cancer cell library was then treated with the anticancer drug (e.g., a PARP inhibitor, PARPi).

2000 HCT116 or SW480 cells were seeded into each well of a 96-well plate in 100 μL of medium per well and cultured in a 37°C, 5% CO2 cell incubator. The next day, the anticancer drug (e.g., PARPi) was added to the wells at a series of concentrations, with 3 replicates per concentration. The final drug concentrations were 33 μM, 11 μM, 3.7 μM, 1.23 μM, 0.41 μM, 0.14 μM, 0.05 μM, and 0.02 μM. After three doubling times in the presence of the anticancer drug, the CellTiter-Glo® Luminescent Cell Viability Assay (an ATP assay) was performed to obtain the drug toxicity curve.
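The listed concentrations are consistent with an approximately 3-fold serial dilution from a top concentration of about 33 μM. A minimal sketch of such a series follows; the function name is illustrative, and the computed values match the listed ones only up to rounding.

```python
def serial_dilution(top_concentration, fold, n_points):
    """Concentrations of a serial dilution, highest first."""
    return [top_concentration / fold**i for i in range(n_points)]


# Eight-point, 3-fold series starting at 33 uM (values in uM).
series = serial_dilution(33.0, 3, 8)
for concentration in series:
    print(f"{concentration:.3f} uM")
```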

Based on the drug toxicity curve obtained for each cancer cell line, a drug concentration corresponding to IC50-IC70 cell growth inhibition was selected for screening the Cas9+ sgRNA iBAR cancer cell library. For example, the PARPi concentrations for HCT116 and SW480 were 5 μM and 10 μM, respectively. 1×10^6 Cas9+ sgRNA iBAR cancer cells were seeded in a 150 mm cell culture dish and cultured in a 37°C, 5% CO2 cell incubator. The next day, the Cas9+ sgRNA iBAR cancer cells were treated with the anticancer drug (e.g., PARPi; test group) or with DMSO (control group). Two biological replicates were set up per group. The medium was replaced with fresh cell culture medium (containing drug or DMSO) every 3 days. Drug or control treatment was continued, and cells were collected after 9-10 doubling times or 15-16 doubling times of treatment (see Figure 2). For adherent cells, dead cells float in the medium, so the adherent cells harvested by trypsinization are all (or mostly) live cells. Throughout screening and cell collection, the number of cells in each replicate was maintained at no less than about 1000 times the size of the sgRNA iBAR library, i.e., at least about 1000 cells per sgRNA iBAR.
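One way to read a screening concentration off a measured toxicity curve is to interpolate between the two tested concentrations that bracket the desired inhibition. The sketch below does this on a log10 concentration axis. It is an assumption-laden illustration: the source does not state how the concentrations were derived, curve fitting (e.g., a four-parameter logistic model) is a common alternative, and the viability values are invented for the example.

```python
import math


def concentration_for_inhibition(concentrations, viabilities, target_inhibition):
    """Interpolate the concentration giving the target growth inhibition.

    concentrations: ascending drug concentrations.
    viabilities: viability as a fraction of the DMSO control, assumed to
        decrease as the concentration increases.
    Returns None if the target lies outside the measured range.
    """
    target_viability = 1.0 - target_inhibition
    points = list(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= target_viability >= v_hi and v_lo != v_hi:
            fraction = (v_lo - target_viability) / (v_lo - v_hi)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10**log_c
    return None


# Hypothetical viability readings (fraction of control), not data from the source.
concs = [0.02, 0.05, 0.14, 0.41, 1.23, 3.7, 11.0, 33.0]
viab = [1.00, 0.98, 0.95, 0.88, 0.75, 0.55, 0.35, 0.15]

ic50 = concentration_for_inhibition(concs, viab, 0.50)
ic70 = concentration_for_inhibition(concs, viab, 0.70)
print(f"IC50 ~ {ic50:.2f} uM, IC70 ~ {ic70:.2f} uM")
```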

4. Identification and analysis of target genes

Genomic DNA was extracted from the treated cancer cells collected above (mostly live Cas9+ sgRNA iBAR cancer cells). For each cancer cell type there were a "9-10 PDT test group", a "15-16 PDT test group", a "9-10 PDT control group", and a "15-16 PDT control group", where PDT denotes the number of doubling times of treatment; each group had two biological replicates. For each anticancer drug (e.g., PARPi), two different cell line libraries were tested (e.g., the Cas9+ sgRNA iBAR HCT116 library and the Cas9+ sgRNA iBAR SW480 library). The sgRNA iBAR coding fragments were PCR-amplified from the extracted genomic DNA, purified, and prepared for NGS sequencing. The sequencing data were analyzed with the MAGeCKiBAR algorithm (see Zhu et al., "Guide RNAs with embedded barcodes boost CRISPR-pooled screens," Genome Biol. 2019; 20:20, the content of which is incorporated herein by reference in its entirety), which comprises three main parts: analysis preparation, statistical tests, and rank aggregation. Briefly, the gene targeted by each sgRNA iBAR was scored and ranked according to the degree of its enrichment or depletion between the test group and the control group, in order to determine whether the gene is a high-confidence candidate. See Figure 3 for the target gene identification workflow. For a candidate gene whose inactivation leads to a phenotype sensitive to killing by the anticancer drug, the sgRNA iBAR coding fragments are depleted relative to the control group (negative screening); for a candidate gene whose inactivation leads to a phenotype resistant to killing by the anticancer drug, the sgRNA iBAR coding fragments are enriched relative to the control group (positive screening). The top-ranked candidate genes were found to be involved in cell proliferation, cell death, or cell cycle regulation.
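The idea of depletion versus enrichment can be illustrated with a much simpler calculation than MAGeCKiBAR: normalize read counts to library totals, compute a log2 fold change per sgRNA iBAR, and summarize per gene. The sketch below is not the MAGeCKiBAR algorithm and performs no statistical testing or rank aggregation; the function names, the median summary, the pseudocount, and all counts are assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from statistics import median


def log2_fold_changes(test_counts, control_counts, pseudocount=1.0):
    """log2(test/control) per sgRNA iBAR after total-count normalization."""
    test_total = sum(test_counts.values())
    control_total = sum(control_counts.values())
    lfc = {}
    for key, control_reads in control_counts.items():
        test_fraction = (test_counts.get(key, 0) + pseudocount) / test_total
        control_fraction = (control_reads + pseudocount) / control_total
        lfc[key] = math.log2(test_fraction / control_fraction)
    return lfc


def gene_level_lfc(lfc):
    """Median log2 fold change over all sgRNA iBARs of each gene."""
    per_gene = defaultdict(list)
    for (gene, _sgrna, _ibar), value in lfc.items():
        per_gene[gene].append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


def toy_counts(reads_per_barcode):
    """Hypothetical counts: one sgRNA with four barcodes per gene."""
    return {(gene, f"{gene}_sg1", f"ibar{i}"): reads
            for gene, reads in reads_per_barcode.items() for i in range(4)}


control = toy_counts({"GENE_S": 100, "GENE_R": 100, "CTRL": 100})
test = toy_counts({"GENE_S": 25, "GENE_R": 175, "CTRL": 100})

gene_lfc = gene_level_lfc(log2_fold_changes(test, control))
for gene, value in sorted(gene_lfc.items(), key=lambda item: item[1]):
    print(f"{gene}: {value:+.2f}")
```

In this toy data, GENE_S is depleted under drug treatment (the pattern expected of a drug-sensitive gene), GENE_R is enriched (the pattern expected of a drug-resistant gene), and the control sgRNA is unchanged.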

5. Results

Candidate genes whose sgRNA iBAR coding fragments were depleted in the harvested live cells of the "9-10 PDT test group" or the "15-16 PDT test group" relative to the control group, with FDR ≤ 0.1 in each cell line library, were classified as drug-sensitive genes, i.e., genes whose inactivation sensitizes cancer cells to the anticancer drug. Exemplary drug-sensitive genes (e.g., for PARPi) include, but are not limited to: ARID2, ATM, BIRC6, BRCA1, BRCA2, CCNA2, CCND1, CDK2, FBXW7, HRAS, KAT2B, NBN, PBRM1, PTEN, SKP2, SMAD7, TGFB2, TSC1, TSC2, ATR, RIF1, POLQ, AXIN1, GSK3A, GSK3B, CHD7, SCAF4, FANCM, NIPBL, ATRX, STAG1, RAD51, RAD51B, RAD51C, RAD51D, FANCL, EXO1, DIDO1, LRBA, FAM71A, HDAC2, PMS2, MSH6, MSH2, MLH1, and WEE1.

Candidate genes whose sgRNA iBAR coding fragments were enriched in the harvested live cells of the "9-10 PDT test group" or the "15-16 PDT test group" relative to the control group, with FDR ≤ 0.1 in each cell line library, were classified as drug-resistant genes, i.e., genes whose inactivation renders cancer cells resistant to the anticancer drug. Exemplary drug-resistant genes (e.g., for PARPi) include, but are not limited to: AKT1, CDKN1A, CKS1B, CKS2, CTNNB1, DLG5, E2F3, E2F4, HDAC1, MAPK1, MYC, RAC1, RAF1, RICTOR, SMAD4, TP53, BRAF, HSP90B1, PARP2, PARP1, PIK3CA, EIF3A, CCNA1, RBL1, ZMYND8, MED12, GCN1, KRAS, TP53BP1, CHD2, DOCK5, IGF1R, ILK, IRS1, RAPGEF1, EP300, TCF7L2, KMT2B, CDKN2A, CHEK1, CHEK2, RHEB, SPTA1, PKMYT1, SIDT2, APC, and SETD2.
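The two classification rules above can be written as a small decision function. This sketch rests on one reading of the text, namely that the FDR cutoff must be met in every cell line library screened; the function name, the direction labels, and the FDR values are hypothetical.

```python
def classify_candidate(direction, fdr_by_library, fdr_cutoff=0.1):
    """Classify a candidate gene from its screen direction and per-library FDR.

    direction: "depleted" or "enriched" (test group relative to control group).
    fdr_by_library: mapping from library name to FDR for this gene.
    """
    if direction not in ("depleted", "enriched"):
        raise ValueError(f"unknown direction: {direction!r}")
    # Assumption: the cutoff must hold in every library that was screened.
    if any(fdr > fdr_cutoff for fdr in fdr_by_library.values()):
        return "not called"
    return "drug-sensitive gene" if direction == "depleted" else "drug-resistant gene"


# Illustrative values only; these are not results from the screen.
print(classify_candidate("depleted", {"HCT116": 0.00, "SW480": 0.03}))
print(classify_candidate("enriched", {"HCT116": 0.01, "SW480": 0.08}))
print(classify_candidate("enriched", {"HCT116": 0.02, "SW480": 0.40}))
```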

For PARPi, Table 1 shows the screening scores (reflecting the significance and magnitude of enrichment/depletion) and the FDR values (reflecting significance) of a subset of the PARPi-sensitive genes and PARPi-resistant genes identified in the "15-16 PDT test group" of the Cas9+ sgRNA iBAR HCT116 library or the Cas9+ sgRNA iBAR SW480 library.

Table 1. Drug-sensitive genes and drug-resistant genes for PARPi

| PARPi drug-sensitive gene | Screening score | FDR | PARPi drug-resistant gene | Screening score | FDR |
|---|---|---|---|---|---|
| PTEN | 27.72 | 0.00 | PARP1 | 34.15 | 0.00 |
| ATM | 26.01 | 0.00 | MED12 | 33.39 | 0.00 |
| RIF1 | 24.53 | 0.00 | MAPK1 | 30.19 | 0.00 |
| BRCA1 | 23.82 | 0.00 | TCF7L2 | 27.27 | 0.00 |
| TSC1 | 20.37 | 0.00 | CKS2 | 20.86 | 0.00 |
| STAG1 | 19.29 | 0.00 | TP53 | 20.70 | 0.00 |
| FBXW7 | 19.25 | 0.00 | HSP90B1 | 20.20 | 0.00 |
| NBN | 19.23 | 0.00 | CDKN1A | 20.17 | 0.00 |
| CHD7 | 16.84 | 0.00 | TP53BP1 | 18.56 | 0.00 |
| SKP2 | 16.56 | 0.00 | RHEB | 18.34 | 0.00 |
| GSK3B | 15.55 | 0.00 | CTNNB1 | 18.09 | 0.00 |
| TSC2 | 14.79 | 0.00 | ILK | 17.88 | 0.00 |
| POLQ | 13.56 | 0.00 | MYC | 17.75 | 0.00 |
| CDK2 | 11.13 | 0.00 | PARP2 | 17.62 | 0.00 |
| FANCM | 9.87 | 0.00 | RICTOR | 17.47 | 0.00 |
| ATR | 8.98 | 0.00 | ZMYND8 | 17.46 | 0.00 |
| SCAF4 | 7.55 | 0.00 | PIK3CA | 14.84 | 0.00 |
| BIRC6 | 7.48 | 0.00 | RAC1 | 14.76 | 0.00 |
| PBRM1 | 7.31 | 0.00 | RAPGEF1 | 14.22 | 0.00 |
| KAT2B | 7.12 | 0.00 | GCN1 | 13.96 | 0.00 |
| GSK3A | 7.07 | 0.00 | EIF3A | 13.11 | 0.00 |
| NIPBL | 6.90 | 0.00 | SMAD4 | 12.46 | 0.00 |
| HRAS | 6.80 | 0.00 | DLG5 | 11.37 | 0.00 |
| TGFB2 | 6.78 | 0.00 | CHD2 | 11.23 | 0.00 |
| WEE1 | 6.66 | 0.00 | BRAF | 10.87 | 0.00 |
| CCNA2 | 6.20 | 0.00 | AKT1 | 10.76 | 0.00 |
| HDAC2 | 5.91 | 0.00 | IGF1R | 10.50 | 0.00 |
| AXIN1 | 5.72 | 0.00 | CKS1B | 10.07 | 0.00 |
| SMAD7 | 4.77 | 0.01 | IRS1 | 9.38 | 0.00 |
| ARID2 | 4.10 | 0.03 | RBL1 | 8.59 | 0.00 |
| CCND1 | 3.52 | 0.09 | DOCK5 | 7.75 | 0.00 |
| DIDO1 | 3.42 | 0.08 | KMT2B | 6.56 | 0.00 |
| BRCA2 | 3.36 | 0.11 | EP300 | 5.90 | 0.00 |
| | | | SETD2 | 5.15 | 0.00 |
| | | | PKMYT1 | 5.08 | 0.00 |
| | | | CCNA1 | 4.72 | 0.01 |
| | | | E2F3 | 4.68 | 0.01 |
| | | | SPTA1 | 4.57 | 0.02 |
| | | | HDAC1 | 4.18 | 0.02 |
| | | | E2F4 | 3.43 | 0.15 |

The results obtained here, in particular the genes whose inactivation was found to sensitize cancer cells to killing by an anticancer drug (e.g., PARPi), indicate that these genes are valuable targets in cancer therapy and valuable biomarkers for patient selection. Drug-resistant genes, whose inactivation renders cancer cells resistant to the anticancer drug, can serve as biomarkers for not selecting such patients for the drug and/or for indicating that an alternative cancer therapeutic should be used.

6. Target gene validation

To validate the identified drug-sensitive genes and drug-resistant genes, a subset of the PARPi-sensitive genes and PARPi-resistant genes (see Table 2) was selected for experimental testing.

Briefly, nucleic acids encoding sgRNAs targeting these genes were designed and synthesized. The sense and antisense strands were annealed to form a double-stranded nucleic acid with overhangs at both ends. A lentiviral sgRNA expression backbone, modified in-house from pLenti-sgRNA-Lib (Addgene #53121), was digested with restriction enzyme, and the double-stranded nucleic acid was ligated into the cleavage site to obtain the sgRNA plasmid, which carries puromycin and ampicillin resistance genes.

To amplify the sgRNA plasmid, 2 μL of sgRNA plasmid was added to 20 μL of competent cells (E. coli) in a 1.5 mL Eppendorf tube and transformed using a standard ice/heat-shock protocol. The cells were grown in liquid LB in a 37°C shaker for 1 h, spread onto LB Amp+ plates, and grown overnight at 37°C. The next day, 5-10 single colonies were picked and grown overnight in LB Amp+ liquid medium in a 37°C shaker. The following day, the sgRNA plasmid was extracted with a kit and sequenced to verify its sequence.

sgRNA lentivirus was then produced using a standard protocol. Briefly, 5×10^6 293T cells were seeded in a 10 cm cell culture dish and cultured overnight in a 37°C, 5% CO2 incubator. The next day, the medium was discarded and fresh serum-free medium was added to the 293T cells. A transfection complex was prepared from serum-free medium (1 mL), the sgRNA plasmid purified above (10 μg), pCMVR8.74 plasmid (10 μg), and pCMV-VSV-G plasmid (1 μg); after mixing, 52.5 μL of PEI was added, the complex was mixed again, and it was left to stand at room temperature for 15 min. The transfection complex was then added to the 293T cells in fresh serum-free medium, and the cells were incubated at 37°C, 5% CO2 for 6-8 h. The medium was discarded, fresh complete medium was added to the 293T cells, and the cells were cultured at 37°C, 5% CO2. After 72 h, the culture medium was collected and centrifuged at 200 g for 5 min. The supernatant containing the sgRNA lentivirus was collected, passed through a 0.45 μm filter, and stored at -80°C until use.

To construct cancer cell lines with a target gene knockout (KO), 2×10^5 SW620 cancer cells were seeded in a 6-well plate and cultured in a 37°C, 5% CO2 incubator. After 24 h, 100 μL of Cas9-packaged lentivirus was added to the cell culture medium and the cancer cells were cultured in a 37°C, 5% CO2 incubator. After a further 24 h, the medium was discarded and fresh complete medium was added to the cancer cells. The cancer cells were grown in a 37°C, 5% CO2 incubator for 7 days and then sorted by FACS using the mCherry marker carried on the Cas9 lentiviral vector. The sorted mCherry-fluorescent cancer cells are Cas9-expressing (Cas9+) cells, which were expanded for construction of the Cas9+ sgRNA cells. 500 μL of the unconcentrated sgRNA lentivirus obtained above was added, at an MOI of 3, to 2×10^7 Cas9+ cancer cells in antibiotic-free medium and mixed gently. The Cas9+ cancer cells were cultured overnight in a 37°C, 5% CO2 incubator for infection. The next day, the medium was discarded, fresh complete medium was added to the Cas9+ cancer cells, and the cells were cultured in a 37°C, 5% CO2 incubator for 48 h. 1 μL of puromycin was then added to the medium for selection. Cas9+ cancer cells that were not successfully transduced with the sgRNA construct die under this selection.

To determine the target gene knockout (KO) efficiency (%), a portion of the puromycin-selected cancer cells described above was collected. Genomic DNA was extracted, and the target gene sequence was amplified and sequenced. KO efficiency was calculated with the Tracking of Indels by Decomposition (TIDE) web tool, which reconstructs the spectrum of insertions and deletions (indels) from sequence traces and reports the detected indels and their frequencies as the KO efficiency. The results are summarized in Table 2.
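Conceptually, an overall KO efficiency derived from an indel spectrum is the summed frequency of all non-zero indel sizes. The sketch below shows only that final summation; it does not perform trace decomposition, it ignores the significance filtering a tool such as TIDE applies, and the spectrum values are hypothetical.

```python
def knockout_efficiency(indel_spectrum):
    """Sum the percentages of all edited (non-zero indel size) sequences.

    indel_spectrum: mapping from indel size in bp (negative = deletion,
    positive = insertion, 0 = unedited) to percent of sequences.
    """
    return sum(percent for size, percent in indel_spectrum.items() if size != 0)


# Hypothetical spectrum; the percentages need not sum to 100 because part of
# the signal is typically left unexplained by the decomposition.
spectrum = {-7: 12.5, -1: 30.0, 0: 18.0, 1: 35.5}
print(knockout_efficiency(spectrum))  # 78.0
```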

To measure the response of the target gene knockout (KO) cancer cells to PARPi treatment, 1000 cancer cells of each target gene KO were seeded per well in a 96-well plate with medium and cultured overnight in a 37°C, 5% CO2 cell incubator. The next day, a 1:3 dilution series of PARPi was prepared and added to the wells, with 3 replicates per concentration. The final PARPi concentrations were 33.3 μM, 11.1 μM, 3.70 μM, 1.27 μM, 0.41 μM, 0.13 μM, 0.05 μM, and 0.015 μM. Cancer cells carrying no target gene knockout ("WT cancer cells") were used as the control, cultured and treated with PARPi under the same conditions. After 2-3 doubling times in the presence of PARPi, the CellTiter-Glo® Luminescent Cell Viability Assay (an ATP assay) was performed to obtain the drug toxicity curves (see Figure 4). The IC50 results are shown in Table 2.

Table 2. Validation of drug-sensitive genes and drug-resistant genes for PARPi

| PARPi drug-sensitive gene | KO efficiency (%) | IC50 fold change (KO vs. WT) | PARPi drug-resistant gene | KO efficiency (%) | IC50 fold change (KO vs. WT) |
|---|---|---|---|---|---|
| ATM | 80 | -56.45 | MYC | 57 | 6.43 |
| NBN | 51 | -15.96 | GCN1 | 62 | 5.50 |
| FANCM | 69 | -13.27 | ZMYND8 | 40 | 5.50 |
| BRCA1 | 90 | -9.86 | PARP1 | 75 | 3.48 |
| WEE1 | 63 | -8.69 | SETD2 | 57 | 2.50 |
| ARID2 | 87 | -5.55 | RICTOR | 60 | 2.42 |
| RIF1 | 68 | -3.55 | CTNNB1 | 47 | 2.00 |
| CHD7 | 76 | -3.06 | TP53 | 79 | 1.94 |
| POLQ | 58 | -1.88 | EIF3A | 29 | 1.77 |
| DIDO1 | 58 | -1.88 | MED12 | 70 | 1.50 |
| PTEN | 51 | -1.78 | | | |
| ATR | 32 | -1.67 | | | |
| STAG1 | 70 | -1.63 | | | |
| BIRC6 | 88 | -1.50 | | | |

As shown in Figure 4 and Table 2, knockout of the drug-sensitive genes identified by the screen did sensitize cancer cells to PARPi killing (see, e.g., ATM, BRCA1, WEE1), and knockout of the drug-resistant genes identified by the screen did confer resistance to PARPi killing (see, e.g., PARP1, MYC). In addition, the IC50 fold changes between target gene KO and WT cancer cells largely followed the screening results: target genes that were highly enriched or depleted in the screen (i.e., those with higher screening scores; see, e.g., Table 1) also showed larger differences in IC50.
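Table 2 reports negative fold changes for drug-sensitive genes and positive ones for drug-resistant genes. The source does not state the formula used. One common signed convention, assumed here purely for illustration, reports KO/WT when the knockout raises the IC50 and the negated WT/KO ratio when it lowers it; the IC50 values in the example are hypothetical.

```python
def signed_ic50_fold_change(ic50_ko, ic50_wt):
    """Signed IC50 fold change of KO cells relative to WT cells.

    Assumed convention (not confirmed by the source):
      KO IC50 >= WT IC50  ->  +(KO / WT)   (more resistant)
      KO IC50 <  WT IC50  ->  -(WT / KO)   (more sensitive)
    """
    if ic50_ko <= 0 or ic50_wt <= 0:
        raise ValueError("IC50 values must be positive")
    if ic50_ko >= ic50_wt:
        return ic50_ko / ic50_wt
    return -(ic50_wt / ic50_ko)


# Hypothetical IC50 values in uM.
print(signed_ic50_fold_change(2.0, 10.0))   # -5.0 (KO is 5-fold more sensitive)
print(signed_ic50_fold_change(25.0, 10.0))  # 2.5  (KO is 2.5-fold more resistant)
```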

These target gene validation results indicate that the screening method described herein effectively identifies drug-sensitive genes and/or drug-resistant genes, and that the drug-sensitive genes and drug-resistant genes provided herein are highly accurate and of significant value in cancer diagnosis and treatment.

7. Discussion

The method described above can be used to screen for drug-sensitive genes and/or drug-resistant genes for any anticancer drug (e.g., drugs targeting different pathways or the same pathway) and any cancer type. The drug-sensitive genes and/or drug-resistant genes obtained are of significance for cancer treatment, patient selection, and the screening or design of new drugs.

For example, suppose a cancer patient's diagnosis shows the following for a single pathway (e.g., one targeted by PARPi): 1) If the patient has only inactivating mutations in target genes whose inactivation confers sensitivity to the pathway-targeting drug, the patient is an ideal candidate for treatment with that drug. 2) If the patient has only inactivating mutations in target genes whose inactivation confers resistance to the pathway-targeting drug, the patient may not be suitable for treatment with drugs targeting that pathway, and alternative treatments should be sought. 3) If the patient has inactivating mutations both in target genes whose inactivation confers resistance to the pathway-targeting drug and in target genes whose inactivation confers sensitivity to it, further analysis is needed. Such analysis may consider, for example, whether the drug sensitivity is sufficient to help kill the cancer cells before resistance emerges, whether the resistance-conferring gene is less important in cancer development than the sensitivity-conferring gene, whether a drug targeting one pathway should be chosen over a drug targeting another, whether one drug should be used before another or together with it, and whether alternative treatments exist. The composite score of drug-sensitive aberrations and drug-resistant aberrations described herein (e.g., Formula I) may also help in treatment decisions.
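The three cases above amount to a set comparison between a patient's inactivated genes and the drug-sensitive and drug-resistant gene sets. The sketch below illustrates only that logic and is not clinical guidance: the function name and return strings are hypothetical, the gene sets are small subsets of the PARPi lists given earlier, and the composite score (Formula I) is not implemented.

```python
def triage(inactivated_genes, sensitive_genes, resistant_genes):
    """Map a patient's inactivated genes to one of the cases in the text."""
    inactivated = set(inactivated_genes)
    has_sensitive = bool(inactivated & set(sensitive_genes))
    has_resistant = bool(inactivated & set(resistant_genes))
    if has_sensitive and has_resistant:
        return "further analysis needed"
    if has_sensitive:
        return "candidate for the pathway-targeting drug"
    if has_resistant:
        return "consider an alternative therapy"
    return "no call from this gene panel"


# Small subsets of the PARPi gene lists, used only to exercise the function.
SENSITIVE = {"ATM", "BRCA1", "WEE1"}
RESISTANT = {"PARP1", "MYC", "TP53"}

print(triage({"ATM"}, SENSITIVE, RESISTANT))
print(triage({"TP53"}, SENSITIVE, RESISTANT))
print(triage({"BRCA1", "MYC"}, SENSITIVE, RESISTANT))
```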

The target genes obtained for multiple anticancer drugs (e.g., anticancer drugs targeting the same or different pathways involved in cancer development) can be combined or overlapped to find common target genes. Gene function and/or mechanism of action can be analyzed further to inform treatment decisions and/or drug design and development. For example, if a patient carries an inactivating mutation in a gene whose inactivation confers sensitivity to drugs X, Y, and Z, combination therapy with drugs X, Y, and Z may produce synergistic anticancer activity. Likewise, if a patient carries inactivating mutations in different genes (in the same pathway or in different pathways) whose inactivation confers sensitivity to drugs X, Y, and Z, combination therapy with drugs X, Y, and Z may produce synergistic anticancer activity. As another example, if a new drug can be designed to target multiple pathways involving target genes whose loss confers sensitivity to a known drug, the new drug may have a better therapeutic effect than the known drug.
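Finding common target genes across drugs is a set intersection. The sketch below assumes each drug's drug-sensitive genes are available as a set; the drug labels X, Y, and Z follow the text, while the gene assignments are invented for illustration and are not screening results.

```python
def shared_targets(targets_by_drug):
    """Genes that appear in the target gene set of every drug."""
    gene_sets = [set(genes) for genes in targets_by_drug.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


def drugs_per_gene(targets_by_drug):
    """Invert the mapping: for each gene, which drugs list it as a target gene."""
    result = {}
    for drug, genes in targets_by_drug.items():
        for gene in genes:
            result.setdefault(gene, set()).add(drug)
    return result


# Hypothetical drug-sensitive gene sets for three drugs.
targets = {
    "X": {"ATM", "BRCA1", "WEE1"},
    "Y": {"ATM", "WEE1", "PTEN"},
    "Z": {"ATM", "RIF1", "WEE1"},
}
print(sorted(shared_targets(targets)))            # ['ATM', 'WEE1']
print(sorted(drugs_per_gene(targets)["BRCA1"]))   # ['X']
```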

As a further example, consider multiple pathways, such as those targeted by PARPi, by other drugs for treating colorectal cancer, or by drugs that treat other cancer types but have not yet been tested or developed for colorectal cancer. If a patient is diagnosed as: 1) having inactivating mutations in a shared target gene (or in different genes with a shared pathway) whose inactivation confers sensitivity to drugs targeting multiple pathways, then monotherapy or, preferably, combination therapy with such drugs can be used to treat the cancer; 2) having inactivating mutations in a shared target gene (or in different genes with a shared pathway) whose inactivation confers resistance to drugs targeting multiple pathways, then alternative treatments should be sought, or gene function and/or mechanism of action should be analyzed further to determine whether combination treatment with such pathway-targeting drugs (e.g., one used before the other) can alleviate the resistance phenotype. For example, drug X, which encounters resistance from target gene mutations late in the course of treatment, can be used first, while drug Y, which encounters resistance from target gene mutations early but may be sufficiently effective, is used only at the start of, or throughout, the course in combination with drug X.

Example 2: The Composite Score Correctly Reflects the Anticancer Efficacy of Anticancer Drugs

This example provides evidence that the composite score, calculated by the methods described herein (e.g., Formula I) from the drug sensitive genes and drug resistant genes identified for an anticancer agent (e.g., a DNA damaging agent such as a PARPi or ATRi) using the screening methods described herein, correctly reflects and can predict the cancer-killing efficacy of that anticancer agent.

1. Random selection of colorectal cancer samples for composite score calculation

Based on public databases, genes that, in patients with stage III and stage IV colorectal cancer, had a DNA mutation frequency ≥5% and an RNA expression level up- or down-regulated by more than 2-fold (and were expressed intracellularly or on the cell surface) were selected as library genes for subsequent sgRNA iBAR design (1323 genes in total).
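The selection rule above (mutation frequency ≥5% together with a more than 2-fold change in expression in either direction) can be expressed as a simple filter. The sketch below is an illustration under assumptions: the `GeneStats` record, its field names, and the use of a tumor-to-reference expression ratio are hypothetical, since the text does not specify how the public database values are represented.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneStats:
    """Hypothetical summary of one gene in a patient cohort."""
    gene: str
    mutation_frequency: float      # fraction of patients with a DNA mutation
    expression_fold_change: float  # assumed tumor / reference expression ratio


def is_library_gene(stats, min_mut_freq=0.05, min_fold=2.0):
    """Apply both criteria: frequency >= 5% and > 2-fold up- or down-regulation."""
    if stats.expression_fold_change <= 0:
        return False  # a ratio must be positive to take its logarithm
    log2_fc = math.log2(stats.expression_fold_change)
    strongly_changed = abs(log2_fc) > math.log2(min_fold)
    return stats.mutation_frequency >= min_mut_freq and strongly_changed


candidates = [
    GeneStats("GENE_A", 0.08, 3.0),   # mutated often, up-regulated 3-fold
    GeneStats("GENE_B", 0.08, 1.5),   # expression change too small
    GeneStats("GENE_C", 0.05, 0.4),   # down-regulated 2.5-fold
    GeneStats("GENE_D", 0.04, 4.0),   # mutation frequency too low
]
library = [g.gene for g in candidates if is_library_gene(g)]
print(library)
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a ratio of 0.4 and a ratio of 2.5 are handled the same way.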

Collected colorectal cancer cell lines and patient-derived xenografts (PDXs) were tested for their response to PARPi treatment by measuring cell viability (reflected as IC50) or PDX growth inhibition rate, following standard methods (see also Example 1). Based on their varied responses to PARPi treatment, 16 cancer samples (10 PDXs and 6 cancer cell lines; see Figure 5A) were selected for composite score calculation. Their corresponding cell viability or PDX growth inhibition responses are shown as "drug response" in Figure 5C.

2. Detection, filtering, and annotation of mutations in the selected cancer samples

The 16 cancer samples selected above were sequenced individually by NGS. For each sample, mutations were called from the sequencing data. The raw mutation sites were then filtered by mutation quality to remove low-confidence sites. The remaining high-quality mutation sites were mapped to their corresponding genes, and the effect of each mutation on the function of its gene was annotated using databases. Only mutation sites with a deleterious effect on the corresponding gene were retained for subsequent analysis.
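The filtering described in this step can be sketched as a two-stage pass over the called variants. This is a simplified illustration, not the pipeline used in the example: the `Variant` record, the quality scale, and the cutoff of 30 are assumptions, and the `deleterious` flag stands in for whatever database annotation was actually used.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record for one called mutation site."""
    gene: str
    quality: float     # caller-reported quality; the scale is an assumption
    deleterious: bool  # assumed to be precomputed from a database annotation


def filter_variants(variants, min_quality=30.0):
    """Drop low-confidence sites, keep deleterious ones, and group them by gene."""
    kept = {}
    for variant in variants:
        if variant.quality < min_quality:
            continue  # low-confidence mutation site
        if not variant.deleterious:
            continue  # no deleterious effect on the gene's function
        kept.setdefault(variant.gene, []).append(variant)
    return kept


calls = [
    Variant("GENE_A", 55.0, True),
    Variant("GENE_A", 12.0, True),   # removed: low quality
    Variant("GENE_B", 80.0, False),  # removed: not deleterious
    Variant("GENE_C", 41.0, True),
]
by_gene = filter_variants(calls)
print({gene: len(sites) for gene, sites in by_gene.items()})
```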

3. Gene-level functional annotation and integration of database information for further filtering

The remaining mutation sites were then annotated with prevalence, clinical significance, treatment impact, Gene Ontology, and pathway information from external and internal database sources. Mutations with low clinical impact were filtered out. For each sample, the overall loss-of-function (LOF) probability was then calculated for every gene to which the remaining mutations mapped.
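The text does not state how per-mutation evidence is combined into a gene-level LOF probability. One common way to aggregate, shown here purely as an assumption, is to treat the retained mutations in a gene as independent events and compute the probability that at least one of them inactivates the gene, i.e. 1 minus the product of (1 - p_i). The function name and the inputs are hypothetical.

```python
import math


def gene_lof_probability(mutation_lof_probs):
    """Combine per-mutation LOF probabilities into one gene-level value.

    ASSUMPTION: mutations act independently, so the gene is functional only
    if every mutation leaves it functional. The source does not give the
    actual formula used in the example.
    """
    for p in mutation_lof_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return 1.0 - math.prod(1.0 - p for p in mutation_lof_probs)


print(gene_lof_probability([0.5, 0.5]))
```

With no retained mutations the function returns 0.0, and a single mutation with probability 1.0 drives the gene-level value to 1.0 regardless of the others.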

4. Composite score calculation

To calculate a composite score for each cancer sample and test its accuracy in predicting the response to PARPi treatment, a total of 51 PARPi sensitivity genes and PARPi resistance genes were selected (the "test gene panel"). These genes were obtained and/or validated in Example 1 and carried, after filtering, at least one high-confidence deleterious mutation in at least one of the 16 selected cancer samples. Their corresponding LOF probabilities are shown in Figure 5A. For each cancer sample, the gene-level LOF probabilities of the 51 PARPi sensitivity/resistance genes were used to calculate the [Figure 02_image094] term of Formula I. By integrating the corresponding weight ([Figure 02_image039]), the correlation coefficient of each gene ([Figure 02_image036]), and the pre-calculated weight of the relevant pathway in the PARPi response ([Figure 02_image036]), the gene-level and pathway-level contributions to PARPi treatment of the mutations detected in the test gene panel were quantified to calculate a raw composite score. The raw composite score was then adjusted and scaled according to sample type (i.e., cell line, PDX, or patient) to generate the final composite score for each cancer sample (see the "composite score" rows of Figures 5B and 5C).

5. Results and conclusions

As shown in Figures 5B and 5C, when a cancer sample's composite score according to Formula I was above 0, the sample did show sensitivity to PARPi killing; when the composite score was below 0, the sample did show resistance to PARPi killing. Moreover, the higher the absolute value of the composite score, the better it predicted the sample's actual PARPi response (see the "prediction" row in Figure 5C, which is based on the composite score). For example, for cancer samples with a composite score above 0.1, the prediction of actual sensitivity to PARPi killing was "true" (see PDX3, PDX6, PDX10, cell line 1, cell line 2, and cell line 6). For cancer samples with a more negative composite score (i.e., a larger absolute value), the prediction of actual resistance to PARPi killing was "true" (see PDX8 and cell line 4). No sample that actually tested as sensitive had a composite score below 0 according to Formula I, indicating that the "true positive" predictive power of the methods described herein is strong. One sample that tested as sensitive, cell line 3, had a composite score of 0. All samples that actually tested as resistant had composite scores below or equal to 0 according to Formula I (indicating that the "true positive" predictive power of the methods described herein is strong), except PDX9, whose composite score was 0.011. Therefore, for cancer samples with a composite score close or equal to 0 (e.g., -0.1 to +0.1), the prediction of PARPi treatment response based on the composite score is "uncertain". Further evaluation may be needed to identify false positives and false negatives among predictions based on the composite score.

These findings show that the drug sensitive genes and drug resistant genes for an anticancer agent (e.g., a DNA damaging agent such as a PARPi or ATRi) identified using the screening methods described herein, and the composite score obtained from them using the methods described herein, correctly reflect and can predict the cancer-killing efficacy of the corresponding anticancer agent, and can serve as tools for cancer diagnosis, treatment selection, and/or patient selection. For example, when a patient's composite score according to Formula I is above 0, the patient may be suitable for (i.e., may benefit from) treatment with the anticancer drug (e.g., a DNA damaging agent such as a PARPi or ATRi). If the patient's composite score according to Formula I is at least 0.1 (e.g., 0.3), the patient can be selected or recommended for treatment with the anticancer drug. If the patient's composite score according to Formula I is greater than 0 but less than 0.1, the patient may be suitable for treatment with the anticancer drug, but further evaluation should be performed using other methods (e.g., drug dosage testing, or cancer genetic testing such as looking for other synergistic mutations that may favor the anticancer drug treatment or verifying the primary cancer type), or based on other information (e.g., the patient's clinical records or known drug resistance), to determine whether the patient should be selected or recommended for the treatment. If the patient's composite score according to Formula I is below or equal to 0, the patient may not be suitable for (i.e., may not benefit from), or should be excluded from, treatment with the anticancer drug. If the patient's composite score according to Formula I is equal to 0 or very close to 0 (e.g., -0.1 to 0), further evaluation using other methods or based on other information described herein may be needed before the patient is completely excluded from treatment with the anticancer drug.
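The score bands described in the preceding paragraph can be written as a small decision function. This is a sketch of the stated thresholds only: the function name and the returned labels are invented for illustration, the 0.1 margin is the example value given in the text, and the output is no substitute for the further evaluation the text calls for.

```python
def treatment_recommendation(composite_score, margin=0.1):
    """Map a composite score to the bands described in the text.

    margin=0.1 follows the example value in the text; the labels are
    hypothetical names chosen for this sketch.
    """
    if composite_score >= margin:
        return "select"
    if composite_score > 0:
        return "possibly suitable; evaluate further"
    if composite_score >= -margin:
        return "likely unsuitable; evaluate further before excluding"
    return "exclude"


for score in (0.3, 0.011, 0.0, -0.5):
    print(score, "->", treatment_recommendation(score))
```

A score of 0.011, like the one reported for PDX9, falls in the band where the text recommends further evaluation, which matches the "uncertain" prediction noted for scores near 0.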


Figure 1 shows an exemplary process for screening for genes conferring sensitivity and/or resistance to anticancer drugs.
Figure 2 shows an exemplary screening workflow for a Cas9+ sgRNA iBAR cancer cell library.
Figure 3 shows an exemplary target gene identification workflow for a Cas9+ sgRNA iBAR cancer cell library.
Figure 4 shows PARPi response curves of cancer cells with a drug sensitive gene or drug resistant gene knocked out (KO). Cancer cells without the KO, treated with PARPi, served as the control (WT).
Figure 5A shows the loss-of-function (LOF) mutation probabilities of PARPi drug sensitive genes and drug resistant genes (y-axis; 51 genes in total) in 16 cancer samples.
Figure 5B shows the composite score calculated with Formula I from the 51 genes for each cancer sample.
Figure 5C shows, for each cancer sample, the composite score, the response to PARPi treatment, and the prediction of treatment efficacy based on the composite score.

Claims (43)

1. A method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive or resistant to an anticancer drug, comprising:
a) providing a cancer cell library comprising a plurality of cancer cells, wherein each of the plurality of cancer cells has a mutation at a hit gene ("hit gene mutation"), and wherein the hit genes in at least two of the plurality of cancer cells are different from each other; wherein the cancer cell library is generated by contacting an initial population of cancer cells with i) a single-guide RNA ("sgRNA") library comprising a plurality of sgRNA constructs, wherein each sgRNA construct comprises or encodes an sgRNA, and wherein each sgRNA comprises a guide sequence complementary to a target site in a corresponding hit gene; and ii) a Cas component comprising a Cas protein or a nucleic acid encoding the Cas protein, under conditions that allow the sgRNA constructs and the Cas component to be introduced into the initial population of cancer cells and to generate the mutations at the hit genes;
b) contacting the cancer cell library with the anticancer drug;
c) growing the cancer cell library to obtain a post-treatment cancer cell population; and
d) identifying the target gene based on a difference between the profiles of sgRNAs or hit gene mutations in the post-treatment cancer cell population and in a control cancer cell population.
2. The method of claim 1, wherein the control cancer cell population is obtained from a cancer cell library cultured under the same conditions but not contacted with the anticancer drug.
3. The method of claim 1 or 2, wherein the target gene is identified based on a difference between the sgRNA profiles in the post-treatment cancer cell population and the control cancer cell population.
4. The method of claim 3, wherein the sgRNA profiles in the post-treatment cancer cell population and the control cancer cell population are identified by next-generation sequencing.
5. The method of claim 4, wherein the method comprises comparing sgRNA sequence counts obtained from the post-treatment cancer cell population with sgRNA sequence counts obtained from the control cancer cell population, wherein:
i) a hit gene whose corresponding sgRNA guide sequence is identified as enriched in the post-treatment cancer cell population compared to the control cancer cell population, with a false discovery rate (FDR) ≤ 0.1, is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or
ii) a hit gene whose corresponding sgRNA guide sequence is identified as depleted in the post-treatment cancer cell population compared to the control cancer cell population, with an FDR ≤ 0.1, is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug.
6. The method of any one of claims 1-5, wherein the sgRNA library and the Cas component are introduced sequentially into the initial population of cancer cells.
7. The method of any one of claims 1-6, wherein the Cas protein is Cas9.
8. The method of claim 7, wherein each sgRNA comprises a guide sequence fused to a second sequence, wherein the second sequence comprises a repeat-anti-repeat stem loop that interacts with the Cas9.
9. The method of claim 8, wherein the second sequence of each sgRNA further comprises stem loop 1, stem loop 2, and/or stem loop 3.
10. The method of any one of claims 1-9, wherein each sgRNA further comprises an internal barcode (iBAR) sequence ("sgRNA iBAR"), wherein each sgRNA iBAR is operable with the Cas protein to modify the hit gene.
11. The method of claim 10, wherein each sgRNA iBAR comprises, in the 5'-to-3' direction, a first stem-loop sequence and a second stem-loop sequence, wherein the first stem-loop sequence hybridizes with the second stem-loop sequence to form a double-stranded RNA (dsRNA) region that interacts with the Cas protein, and wherein the iBAR sequence is located between the 3' end of the first stem-loop sequence and the 5' end of the second stem-loop sequence.
12. The method of claim 10 or 11, wherein the Cas protein is Cas9, and wherein the iBAR sequence of each sgRNA iBAR is inserted into the loop region of the repeat-anti-repeat stem loop.
13. The method of any one of claims 1-12, wherein each guide sequence comprises about 17 to about 23 nucleotides.
14. The method of any one of claims 10-13, wherein the sgRNA library is an sgRNA iBAR library, wherein the sgRNA iBAR library comprises a plurality of sets of sgRNA iBAR constructs, wherein each set of sgRNA iBAR constructs comprises four sgRNA iBAR constructs each comprising or encoding an sgRNA iBAR, wherein the guide sequences of the four sgRNA iBAR constructs are identical, wherein the iBAR sequences of the four sgRNA iBAR constructs are different from one another, and wherein the guide sequence of each set of sgRNA iBAR constructs is complementary to a different target site in the hit genes.
15. The method of any one of claims 1-14, wherein at least about 95% of the sgRNA constructs in the sgRNA library are introduced into the initial population of cancer cells.
16. The method of any one of claims 10-15, wherein the cancer cell library has at least about 100-fold coverage for each sgRNA iBAR.
17. The method of any one of claims 1-16, wherein the cancer cell library has at least about 400-fold coverage for each sgRNA.
18. The method of any one of claims 1-17, wherein the sgRNA library comprises at least about 400 sgRNA constructs.
19. The method of any one of claims 1-18, wherein each sgRNA construct in the sgRNA library is a plasmid.
20. The method of any one of claims 1-18, wherein each sgRNA construct in the sgRNA library is a viral vector.
21. The method of claim 20, wherein the viral vector is a lentiviral vector.
22. The method of claim 20 or 21, wherein the sgRNA library is contacted with the initial population of cancer cells at a multiplicity of infection (MOI) of at least about 2.
23. The method of any one of claims 1-22, wherein step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times.
24. The method of any one of claims 1-22, wherein step b) comprises contacting the cancer cell library with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times.
25. The method of any one of claims 5-24, wherein the sgRNA sequence counts are subjected to median ratio normalization followed by mean-variance modeling.
26. The method of claim 25, wherein the sgRNA library is an sgRNA iBAR library, and wherein the variance of each guide sequence is adjusted based on data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to that guide sequence.
27. The method of claim 26, wherein the data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to each guide sequence is determined based on the direction of the fold change of each iBAR sequence, wherein the variance of the guide sequence is increased if the fold changes of the iBAR sequences are in different directions relative to one another.
28. The method of any one of claims 1-27, wherein the method comprises:
subjecting the cancer cell library from step a) to at least two separate, different treatments with the anticancer drug of step b);
growing the cancer cell library to obtain a post-treatment cancer cell population from each treatment;
identifying one or more hit genes in the post-treatment cancer cell population obtained from each treatment; and
combining the one or more hit genes identified from all treatments, thereby identifying the target genes in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the anticancer drug.
29. The method of any one of claims 1-28, wherein the method comprises subjecting the cancer cell library from step a) to two separate treatments b1) and b2):
b1) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 9 to about 10 doubling times;
b2) contacting the cancer cell library from step a) with the anticancer drug at a concentration of about IC50 to about IC70 for about 15 to about 16 doubling times;
c1) growing the cancer cell library from treatment b1) to obtain a post-treatment cancer cell population;
c2) growing the cancer cell library from treatment b2) to obtain a post-treatment cancer cell population;
d1) identifying one or more hit genes in the post-treatment cancer cell population from treatment b1);
d2) identifying one or more hit genes in the post-treatment cancer cell population from treatment b2); and
d3) combining the one or more hit genes identified from treatment b1) and treatment b2), thereby identifying the target genes in the cancer cell whose mutation renders the cancer cell sensitive or resistant to the anticancer drug.
30. The method of claim 28 or 29, wherein:
i) a hit gene whose corresponding sgRNA guide sequence is identified as enriched in the post-treatment cancer cell population compared to the control cancer cell population, with an FDR ≤ 0.1 in at least one treatment, is identified as a target gene whose mutation renders the cancer cell resistant to the anticancer drug; and/or
ii) a hit gene whose corresponding sgRNA guide sequence is identified as depleted in the post-treatment cancer cell population compared to the control cancer cell population, with an FDR ≤ 0.1 in at least one treatment, is identified as a target gene whose mutation renders the cancer cell sensitive to the anticancer drug.
31. The method of any one of claims 1-30, comprising:
i) separately identifying, for each of two or more different anticancer drugs when used individually, a set of one or more target genes whose mutation renders the cancer cell sensitive to that anticancer drug;
ii) obtaining the one or more target genes present in every set of target genes identified for each anticancer drug, thereby identifying the target genes whose mutation renders the cancer cell sensitive to combination treatment with the two or more different anticancer drugs;
and/or
i) separately identifying, for each of two or more different anticancer drugs when used individually, a set of one or more target genes whose mutation renders the cancer cell resistant to that anticancer drug;
ii) obtaining the one or more target genes present in the combination of the sets of target genes identified for all of the anticancer drugs, thereby identifying the target genes whose mutation renders the cancer cell resistant to combination treatment with the two or more different anticancer drugs.
32. The method of claim 31, wherein the two or more different anticancer drugs target the same cancer target.
33. The method of claim 31, wherein the two or more different anticancer drugs target different cancer targets.
34. The method of any one of claims 5-33, further comprising ranking the identified target genes, wherein the target genes are ranked based on the degree of enrichment or depletion of the sgRNA guide sequences in the post-treatment cancer cell population compared to the control cancer cell population.
35. The method of claim 34, wherein the sgRNA library is an sgRNA iBAR library, and wherein the ranking of the target genes is further adjusted based on data consistency among the iBAR sequences in the sgRNA iBAR sequences corresponding to the guide sequences of the target gene.
36. The method of claim 34 or 35, further comprising assigning a sensitivity score or a resistance score to the identified target genes,
wherein target genes whose mutation renders the cancer cell resistant to the anticancer drug are ranked from high to low based on the fold enrichment of the sgRNA guide sequences in the post-treatment cancer cell population compared to the control cancer cell population, and each target gene is assigned a resistance score correspondingly from high to low; and/or
wherein target genes whose mutation renders the cancer cell sensitive to the anticancer drug are ranked from high to low based on the fold depletion of the sgRNA guide sequences in the post-treatment cancer cell population compared to the control cancer cell population, and each target gene is assigned a sensitivity score correspondingly from high to low.
37. The method of any one of claims 1-36, wherein the anticancer drug is a PARP inhibitor.
38. The method of any one of claims 1-37, wherein the cancer cell is a colorectal cancer cell.
39. A method of identifying a target gene in a cancer cell whose mutation renders the cancer cell sensitive to a combination treatment comprising a first anticancer drug and a second anticancer drug, the method comprising:
i) identifying, according to the method of any one of claims 1-38, a first set of one or more target genes in the cancer cell whose mutation renders the cancer cell sensitive to the first anticancer drug;
ii) identifying, according to the method of any one of claims 1-38, a second set of one or more target genes in the cancer cell whose mutation renders the cancer cell sensitive to the second anticancer drug; and
iii) obtaining the one or more target genes present in both the first set of target genes and the second set of target genes, thereby identifying the target genes whose mutation renders the cancer cell sensitive to the combination treatment.
40. A method of treating cancer in an individual, comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected for treatment based on the individual having an aberration in a target gene that renders the cancer cell sensitive to the anticancer drug ("drug sensitive gene"), and wherein the drug sensitive gene is identified using the method of any one of claims 1-39.
41. A method of excluding an individual having cancer from treatment, comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is excluded if the individual has an aberration in a target gene that renders the cancer cell resistant to the anticancer drug ("drug resistant gene"), and wherein the drug resistant gene is identified using the method of any one of claims 1-38.
42. A method of treating cancer in an individual, comprising administering to the individual an effective amount of an anticancer drug, wherein the individual is selected based on:
i) aberrations ("drug sensitivity aberrations") in one or more target genes that render the cancer cell sensitive to the anticancer drug ("drug sensitive genes"), and
ii) aberrations ("drug resistance aberrations") in one or more target genes that render the cancer cell resistant to the anticancer drug ("drug resistant genes"),
wherein the drug sensitive genes and the drug resistant genes are identified using the method of any one of claims 1-39, and
wherein the individual is selected for treatment if the composite score of the drug sensitivity aberrations and the drug resistance aberrations is above a composite score threshold level.
The method of claim 42, wherein the composite score is obtained by:
(i) (the absolute value of the total of the sensitivity scores of the drug sensitive genes) minus (the absolute value of the total of the resistance scores of the drug resistant genes), or
(ii) Formula I,
wherein the individual is selected for treatment if the composite score is higher than zero.
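Option (i) of the composite score can be sketched as below. This is only an illustration under stated assumptions: the per-gene scores, the gene names, and the function names are hypothetical, and Formula I of option (ii) is not reproduced in this text, so it is not sketched.

```python
def composite_score(sensitivity_scores, resistance_scores):
    """Option (i): |sum of sensitivity scores| - |sum of resistance scores|."""
    return abs(sum(sensitivity_scores)) - abs(sum(resistance_scores))


def select_for_treatment(sensitivity_scores, resistance_scores, threshold=0.0):
    """Select the individual when the composite score exceeds the threshold.

    The default threshold of zero follows the claim; other thresholds
    would be an assumption.
    """
    return composite_score(sensitivity_scores, resistance_scores) > threshold


# Hypothetical per-gene scores for the aberrations found in one individual.
sensitive = {"GENE_A": -2.0, "GENE_B": -1.5}  # drug sensitive genes
resistant = {"GENE_X": 1.0}                   # drug resistant genes

score = composite_score(sensitive.values(), resistant.values())
print(score)  # 2.5
print(select_for_treatment(sensitive.values(), resistant.values()))  # True
```

Because each total is taken as an absolute value before subtraction, the sign convention of the underlying screen scores does not affect the result; only the relative magnitude of the sensitivity and resistance totals does.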
TW111126186A 2021-07-12 2022-07-12 Methods of identifying drug sensitive genes and drug resistant genes in cancer cells TW202309299A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
WOPCT/CN2021/105816 2021-07-12
CN2021105822 2021-07-12
CN2021105816 2021-07-12
WOPCT/CN2021/105822 2021-07-12

Publications (1)

Publication Number Publication Date
TW202309299A (en) 2023-03-01

Family

ID=84919026

Family Applications (2)

Application Number Title Priority Date Filing Date
TW111126186A TW202309299A (en) 2021-07-12 2022-07-12 Methods of identifying drug sensitive genes and drug resistant genes in cancer cells
TW111126187A TW202317523A (en) 2021-07-12 2022-07-12 Biomarkers for colorectal cancer treatment

Family Applications After (1)

Application Number Title Priority Date Filing Date
TW111126187A TW202317523A (en) 2021-07-12 2022-07-12 Biomarkers for colorectal cancer treatment

Country Status (2)

Country Link
TW (2) TW202309299A (en)
WO (2) WO2023284735A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117210568A (en) * 2023-10-30 2023-12-12 云南省肿瘤医院(昆明医科大学第三附属医院) SNP marker for detecting familial hereditary colorectal cancer and application thereof

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2958290A1 (en) * 2013-09-23 2015-03-26 The University Of Chicago Methods and compositions relating to cancer therapy with dna damaging agents
WO2015100257A1 (en) * 2013-12-23 2015-07-02 The General Hospital Corporation Methods and assays for determining reduced brca1 pathway function in a cancer cell
US11464774B2 (en) * 2015-09-30 2022-10-11 Vertex Pharmaceuticals Incorporated Method for treating cancer using a combination of DNA damaging agents and ATR inhibitors
JOP20190197A1 (en) * 2017-02-24 2019-08-22 Bayer Pharma AG An inhibitor of atr kinase for use in a method of treating a hyper-proliferative disease
JP7337805B2 (en) * 2017-12-27 2023-09-04 テサロ, インコーポレイテッド Methods of treating cancer
CN111526889B (en) * 2017-12-29 2023-06-02 沃泰克斯药物股份有限公司 Methods of treating cancer using ATR inhibitors
CN110343724B (en) * 2018-04-02 2021-10-12 北京大学 Method for screening and identifying functional lncRNA
CN111334531A (en) * 2018-12-18 2020-06-26 博雅辑因(北京)生物科技有限公司 High signal-to-noise ratio negative genetic screening method
CA3123981A1 (en) * 2018-12-20 2020-06-25 Peking University Compositions and methods for highly efficient genetic screening using barcoded guide rna constructs
TW202039845A (en) * 2018-12-20 2020-11-01 北京大學 Compositions and methods for highly efficient genetic screening using barcoded guide rna constructs
CN110570922B (en) * 2019-07-19 2022-06-10 浙江大学 HR defect assessment model and application
KR102580824B1 (en) * 2019-10-30 2023-09-21 (재)록원바이오융합연구재단 Method and Kit for Determining Reactivity to PARP inhibitor
CN113025713B (en) * 2021-02-23 2022-11-22 浙江东睿生物科技有限公司 Use of biomarkers for predicting the sensitivity of a tumor patient to a specific anti-tumor drug

Also Published As

Publication number Publication date
WO2023284735A1 (en) 2023-01-19
WO2023284736A1 (en) 2023-01-19
TW202317523A (en) 2023-05-01
