KR20190003868A

KR20190003868A - Evaluation of specificity of oligonucleotides

Info

Publication number: KR20190003868A
Application number: KR1020197000224A
Authority: KR
Inventors: 천종윤; 윤기석
Original assignee: 주식회사 씨젠
Priority date: 2016-06-03
Filing date: 2017-06-02
Publication date: 2019-01-09
Also published as: WO2017209575A1; KR102189358B1

Abstract

The present invention relates to a method for evaluating the specificity of an oligonucleotide represented by 5'-X-Y-Z-3'. The method comprises: comparing the whole or a partial sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database and extracting from the database nucleotide sequences comprising a region homologous to the whole or partial sequence of the oligonucleotide of formula (I); and analyzing the portion-wise match/mismatch between the oligonucleotide of formula (I) and each reference nucleotide sequence to provide (i) the number or ratio of matched or mismatched bases between portion X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or ratio of matched or mismatched bases between portion Z of the oligonucleotide of formula (I) and each reference nucleotide sequence.

Description

Evaluation of specificity of oligonucleotides

The present invention relates to the evaluation of the specificity of oligonucleotides.

Nucleic acid amplification is an essential process in many methods of molecular biology, and a variety of amplification methods have been proposed. For example, Miller, H. I. et al. (WO 89/06700) amplified a nucleic acid sequence by hybridizing a promoter/primer sequence to a target single-stranded DNA ("ssDNA") and then transcribing many RNA copies of the sequence. Other known nucleic acid amplification procedures include transcription-based amplification systems (Kwoh, D. et al., Proc. Natl. Acad. Sci. U.S.A., 86:1173 (1989); and Gingeras T.R. et al., WO 88/10315).

The most widely used nucleic acid amplification method, known as the polymerase chain reaction (hereinafter "PCR"), is based on repeated cycles of denaturation of double-stranded DNA, annealing of oligonucleotide primers to the DNA template, and primer extension by a DNA polymerase (Mullis et al., U.S. Patent Nos. 4,683,195, 4,683,202 and 4,800,159; Saiki et al., (1985) Science 230, 1350-1354).

More recently, real-time PCR techniques, which detect amplification of a target nucleic acid sequence in real time, have come into wide use. Real-time PCR generally uses oligonucleotides, such as primers and/or probes, that hybridize specifically with the target nucleic acid sequence. Examples of methods that rely on hybridization between a labeled probe and a target nucleic acid sequence include the Molecular beacon method, which uses a dual-labeled probe capable of forming a hairpin structure (Tyagi et al., Nature Biotechnology v.14 MARCH 1996); the HyBeacon method (French DJ et al., Mol. Cell Probes, 15(6):363-374 (2001)); the hybridization probe method, which uses two probes labeled with a donor and an acceptor, respectively (Bernad et al, 147-148 Clin Chem 2000; 46); and the Lux method, which uses a single-labeled oligonucleotide (U.S. Patent No. 7,537,886). In addition, the TaqMan method (U.S. Patent Nos. 5,210,015 and 5,538,848), which relies on hybridization of a dual-labeled probe as well as cleavage of that probe by the 5'-nuclease activity of a DNA polymerase, has been widely used.

PCR and real-time PCR generally use primers and/or probes to amplify or detect a desired target nucleic acid sequence from a mixture of various nucleic acids. Accurate amplification or detection therefore requires that the primers and/or probes have high specificity for the target nucleic acid sequence.

In this regard, the present inventors have developed a dual specificity oligonucleotide (DSO), also called a dual priming oligonucleotide (DPO), that performs template-dependent reactions with higher specificity (see WO 2006/095981). The DSO has three different portions within the oligonucleotide molecule: a 5'-high Tm specificity portion, a 3'-low Tm specificity portion and a separation portion. Its hybridization specificity is determined dually by the two portions (the 5'-high Tm specificity portion and the 3'-low Tm specificity portion) that are separated by the separation portion, which is composed of universal bases.

The present inventors have also developed a target discriminative probe (TD probe) capable of distinguishing a target nucleic acid sequence from a non-target nucleic acid sequence (see WO 2011/028041). The TD probe comprises three distinct portions within the oligonucleotide molecule: a 5'-second hybridization portion, a 3'-first hybridization portion and a separation portion. The hybridization specificity of the TD probe is determined dually by the 5'-second hybridization portion and the 3'-first hybridization portion, which are separated by the separation portion composed of universal bases.

In general, oligonucleotides used in PCR and real-time PCR are designed and prepared to hybridize with, or match, a target nucleic acid sequence. However, even a carefully designed oligonucleotide may hybridize with non-target nucleic acid sequences that were not identified at design time. It is therefore necessary to confirm that the designed oligonucleotide hybridizes only with the intended target and not with any unintended non-target. This is generally called the specificity evaluation (verification) process.

The specificity evaluation process may comprise searching a known nucleotide sequence database (e.g., GenBank) with the designed oligonucleotide, using any sequence alignment algorithm or program (e.g., BLAST), to find homologous sequences (homology search), and analyzing the resulting homologous sequences to confirm that the designed oligonucleotide hybridizes only with the desired target nucleic acid sequence. This process has become a very useful tool for assessing the suitability or operability of primers and probes.
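The homology-search step described above can be pictured with a toy example. The sketch below is not BLAST and is not taken from the patent: it is a minimal ungapped sliding-window scan over a small in-memory "database", and the function name `find_similar_regions`, the `max_mismatches` cutoff and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Tuple


def find_similar_regions(
    query: str, database: Dict[str, str], max_mismatches: int = 2
) -> List[Tuple[str, int, int]]:
    """Return (accession, start, mismatches) for each ungapped window
    whose mismatch count does not exceed max_mismatches."""
    hits = []
    q = query.upper()
    for accession, sequence in database.items():
        s = sequence.upper()
        # Slide a window of the query length along the database sequence.
        for start in range(len(s) - len(q) + 1):
            window = s[start:start + len(q)]
            mismatches = sum(1 for a, b in zip(q, window) if a != b)
            if max_mismatches >= mismatches:
                hits.append((accession, start, mismatches))
    return hits


# Hypothetical toy database: one target, one close non-target, one unrelated.
toy_db = {
    "target_1": "TTGACCGTAGCTAGGCTAAC",
    "nontarget_1": "TTGACCGTTGCTAGGATAAC",
    "unrelated_1": "GGGGGGGGGGGGGGGGGGGG",
}
hits = find_similar_regions("ACCGTAGCTAGGC", toy_db, max_mismatches=2)
for accession, start, mismatches in hits:
    print(accession, start, mismatches)
```

A real search would use an indexed, gapped aligner against a large database; the point here is only that the search yields candidate reference sequences, which are then inspected to decide whether the oligonucleotide is likely to hybridize with anything other than its target.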

Various sequence alignment algorithms and programs have been used to evaluate the specificity of oligonucleotides. Among them, BLAST is one of the most widely used sequence similarity search tools; it compares a nucleotide query sequence with a nucleotide sequence database to find sequences in the database that are similar to the query. The program is provided free of charge by the National Center for Biotechnology Information (NCBI, http://www.ncbi.nih.gov).

BLAST is fundamentally a string-matching program. Biological string matching looks for similarity as evidence of homology. The similarity between a query and a sequence in the database can be measured by the percent identity, or by the number of bases in the query that exactly match the corresponding region of the database sequence.

The output of a BLAST search reports a set of scores and statistics for the matches found, based on the raw score S, various parameters of the scoring algorithm, and properties of the query and the database. The raw score S is a measure of the similarity and the size of the match. The BLAST output lists hits ordered by E value. The E (expect) value of a match roughly measures the likelihood that such a string match (gaps allowed) would occur in a randomly generated database of the same size and composition. The closer the E value is to zero, the less likely the match is to have occurred by chance; in other words, the lower the E value, the better the match. The E value can therefore be used as a measure of how well a primer matches a target nucleic acid sequence.
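The relationship between the raw score S and the E value can be made concrete with the Karlin-Altschul form E = K·m·n·e^(−λS), which is the standard statistical model behind BLAST but is not spelled out in this document. The sketch below is illustrative only: the λ and K values are assumed, representative constants for an ungapped nucleotide scoring scheme, and real BLAST additionally applies effective-length corrections that are omitted here.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalized (bit) score S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(
    raw_score: float, query_len: int, db_len: int, lam: float, k: float
) -> float:
    """Expected number of chance matches: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Assumed, illustrative statistical parameters (not taken from the patent).
LAMBDA, K = 1.374, 0.711

# A 25-mer query against a 1 Mb database: a higher raw score gives a lower E.
e_short = expect_value(12, 25, 1_000_000, LAMBDA, K)
e_long = expect_value(25, 25, 1_000_000, LAMBDA, K)
print(f"S=12: E={e_short:.3g}   S=25: E={e_long:.3g}")
```

Because E scales with the product of query length and database size, short oligonucleotide queries tend to produce large E values even for perfect matches, which is one reason E values alone are a coarse guide for primers and probes.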

BLAST gives relatively good results for typical oligonucleotides, but it is not well suited to atypical oligonucleotides that contain several consecutive universal bases, non-natural bases and the like within the sequence.

In particular, for oligonucleotides containing a plurality of consecutive universal bases, such as the dual specificity oligonucleotide developed by the present inventors, BLAST returns results for only one of the portions because of the universal bases, even though the entire sequence was entered as the query. Furthermore, BLAST does not provide separate mismatch results for the 5' portion and the 3' portion, which are an important consideration when designing oligonucleotides that contain a plurality of consecutive universal bases within the sequence.

In addition, BLAST treats universal bases and degenerate bases as mismatches regardless of their particular type.
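The effect of treating every universal or degenerate base as a mismatch can be illustrated with a toy comparison. This is not BLAST's implementation: the two helper functions are hypothetical, the degenerate-base table follows the standard IUPAC nucleotide codes, and representing inosine as `I` that pairs with any base is an assumption made for the example.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
    # Assumption for this sketch: inosine (I) is a universal base.
    "I": {"A", "C", "G", "T"},
}


def naive_mismatches(query: str, ref: str) -> int:
    """Literal character comparison: any non-identical symbol is a mismatch."""
    if len(query) != len(ref):
        raise ValueError("sequences must have equal length")
    return sum(1 for q, r in zip(query.upper(), ref.upper()) if q != r)


def tolerant_mismatches(query: str, ref: str) -> int:
    """A query symbol matches if the reference base is in its IUPAC set."""
    if len(query) != len(ref):
        raise ValueError("sequences must have equal length")
    return sum(
        1 for q, r in zip(query.upper(), ref.upper())
        if r not in IUPAC.get(q, {q})
    )


query = "ACGTRIIIACGT"  # one degenerate base (R) and three inosines
ref = "ACGTATTTACGT"
print(naive_mismatches(query, ref), tolerant_mismatches(query, ref))
```

Under literal comparison the degenerate base and the three inosines are all counted against the oligonucleotide, even though by design they should not reduce its fit to the reference.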

Accordingly, given that conventional sequence alignment algorithms and programs are not suitable for evaluating the specificity of atypical oligonucleotides, there is a need for a new method that evaluates the specificity of atypical oligonucleotides more accurately.

Throughout this specification, numerous publications and patent documents are referenced and their citations are indicated. The disclosures of the cited publications and patents are incorporated herein by reference in their entirety in order to describe more clearly the state of the art to which the invention pertains and the content of the invention.

The present inventors have sought to develop a method for evaluating the specificity of oligonucleotides, in particular atypical oligonucleotides that contain, within the sequence, consecutive bases that do not participate in Watson-Crick base pairing. As a result, the inventors have developed a novel method comprising: comparing an oligonucleotide sequence with a nucleotide sequence database; extracting reference nucleotide sequences comprising a region homologous to the oligonucleotide; and analyzing the portion-wise match/mismatch between the oligonucleotide and each reference nucleotide sequence to provide separate match results for the two portions separated by the consecutive bases that do not participate in Watson-Crick base pairing.

Accordingly, it is an object of the present invention to provide a method for evaluating the specificity of an oligonucleotide.

It is another object of the present invention to provide a computer-readable storage medium containing instructions that configure a processor to perform a method for evaluating the specificity of an oligonucleotide.

It is still another object of the present invention to provide a device for evaluating the specificity of an oligonucleotide.

It is yet another object of the present invention to provide a computer program, to be stored on a computer-readable storage medium, that configures a processor to perform a method for evaluating the specificity of an oligonucleotide.

Other objects and advantages of the present invention will become apparent from the following detailed description together with the appended claims and drawings.

I. Evaluation of specificity of oligonucleotides

According to one aspect of the present invention, there is provided a method for evaluating the specificity of an oligonucleotide, comprising the following steps:

(a) providing an oligonucleotide represented by the following formula (I):

5'-X-Y-Z-3' (I)

wherein X represents a portion comprising a hybridizing nucleotide sequence that hybridizes with a target nucleic acid sequence, Y represents a separation portion comprising two or more consecutive bases that do not participate in Watson-Crick base pairing, and Z represents a portion comprising a hybridizing nucleotide sequence that hybridizes with the target nucleic acid sequence;

(b) comparing the whole or a partial sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database, and extracting from the database reference nucleotide sequences comprising a region homologous to the whole or partial sequence of the oligonucleotide of formula (I); and

(c) analyzing the portion-wise match/mismatch between the oligonucleotide of formula (I) and each reference nucleotide sequence to provide (i) the number or ratio of matched or mismatched bases between portion X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or ratio of matched or mismatched bases between portion Z of the oligonucleotide of formula (I) and each reference nucleotide sequence.
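Step (c) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `RegionResult`, `compare_region` and `analyze_by_region` are hypothetical, the reference window is assumed to be already aligned to the oligonucleotide without gaps and on the same strand sense, portion Y is skipped entirely, and degenerate bases inside X or Z are not handled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RegionResult:
    length: int
    matches: int
    mismatches: int

    @property
    def mismatch_ratio(self) -> float:
        # Guard against an empty portion to avoid division by zero.
        return self.mismatches / self.length if self.length > 0 else 0.0


def compare_region(oligo_part: str, ref_part: str) -> RegionResult:
    """Count identical and non-identical positions in one portion."""
    if len(oligo_part) != len(ref_part):
        raise ValueError("portion and reference segment differ in length")
    matches = sum(
        1 for o, r in zip(oligo_part.upper(), ref_part.upper()) if o == r
    )
    return RegionResult(len(oligo_part), matches, len(oligo_part) - matches)


def analyze_by_region(
    x: str, y: str, z: str, ref_window: str
) -> Tuple[RegionResult, RegionResult]:
    """Return separate results for portion X and portion Z."""
    if len(ref_window) != len(x) + len(y) + len(z):
        raise ValueError("reference window must span X, Y and Z")
    ref_x = ref_window[:len(x)]
    ref_z = ref_window[len(x) + len(y):]  # positions under Y are ignored
    return compare_region(x, ref_x), compare_region(z, ref_z)


# Hypothetical oligonucleotide: 10-nt X, 5-inosine Y, 6-nt Z.
res_x, res_z = analyze_by_region(
    "GCTAGCTAGG", "IIIII", "GATCGT",
    "GCTAGATAGG" + "ACGTA" + "GATCGA",
)
print("X:", res_x, res_x.mismatch_ratio)
print("Z:", res_z, res_z.mismatch_ratio)
```

Reporting the two portions separately, rather than one pooled mismatch count, is what lets a user weigh a mismatch in X differently from a mismatch in Z.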

As used herein, the term "specificity" encompasses "annealing or hybridization specificity" and "target specificity".

The term "annealing or hybridization specificity" refers to the fidelity of hybridization between perfectly complementary bases. The term is used to describe the relationship between two nucleic acid sequences. Under this definition, an oligonucleotide with high specificity can hybridize with another oligonucleotide or polynucleotide under particular conditions, whereas an oligonucleotide with low specificity cannot.

The term "target specificity" refers to the property of an oligonucleotide that matches, hybridizes with, amplifies, or detects a target nucleic acid sequence of interest but does not match, hybridize with, amplify, or detect any other nucleic acid sequence (a non-target nucleic acid sequence). It is used interchangeably with "specificity for a target nucleic acid" and "specific for a target nucleic acid sequence". Under this definition, an oligonucleotide with high target specificity can amplify or detect only the desired target nucleic acid sequence, by PCR or real-time PCR, from a sample containing a mixture of various nucleic acids, whereas an oligonucleotide with low target specificity amplifies or detects non-targets as well as the target of interest, which can reduce target amplification efficiency and cause false-positive results.

As used herein, the term "specificity" may refer to either or both of annealing specificity and target specificity.

Specificity may vary with several factors, such as hybridization conditions (e.g., temperature), but it is determined primarily by the homology between the oligonucleotide sequence and the reference nucleotide sequence. That is, specificity may depend on the match/mismatch results between the oligonucleotide and the reference nucleotide sequence. Based on the match/mismatch between a designed oligonucleotide and a nucleotide sequence, a person skilled in the art will be able to determine whether the designed oligonucleotide can hybridize with the nucleic acid sequence under particular conditions and selectively amplify or detect it.

As used herein, the term "information on specificity" means any information that helps in evaluating the specificity of an oligonucleotide. As described above, information on specificity as used herein refers to information obtained by analyzing the similarity between the oligonucleotide sequence and a reference nucleotide sequence, i.e., the match/mismatch between them. Information on specificity is described in detail below.

As used herein, the term "evaluating specificity" or "specificity evaluation" includes determining the specificity of an oligonucleotide based on the information provided, i.e., the match/mismatch between the oligonucleotide sequence and the reference nucleotide sequence.

Based on the match/mismatch, a person skilled in the art will be able to determine whether a designed oligonucleotide can hybridize with a particular target nucleic acid sequence under particular conditions.

Likewise, based on the match/mismatch between the designed oligonucleotide and the reference nucleotide sequences, a person skilled in the art will be able to determine whether the designed oligonucleotide hybridizes only with the target nucleic acid sequence under particular conditions and can selectively amplify or detect it.

The present invention relates to a method for evaluating the specificity of an atypical oligonucleotide that contains, within its sequence, two or more consecutive bases that do not participate in Watson-Crick base pairing. The invention provides separate match/mismatch results for the two portions (portions X and Z) that are separated by the consecutive bases that do not participate in Watson-Crick base pairing.
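Before the two portions can be scored separately, the oligonucleotide has to be divided at the run of non-pairing bases. The following is a minimal sketch assuming the separation portion is written as a single run of two or more `I` (inosine) symbols in a one-letter sequence string; the symbol choice, the `SplitOligo` type and the `split_oligo` helper are hypothetical and only illustrate the 5'-X-Y-Z-3' layout.

```python
import re
from typing import NamedTuple


class SplitOligo(NamedTuple):
    x: str  # 5' hybridizing portion
    y: str  # separation portion (run of non-pairing bases)
    z: str  # 3' hybridizing portion


def split_oligo(seq: str, universal: str = "I") -> SplitOligo:
    """Split a 5'->3' sequence at its single run of >= 2 universal bases."""
    s = seq.upper()
    pattern = re.compile(f"{re.escape(universal.upper())}{{2,}}")
    runs = list(pattern.finditer(s))
    if len(runs) != 1:
        raise ValueError("expected exactly one run of 2+ universal bases")
    run = runs[0]
    if run.start() == 0 or run.end() == len(s):
        raise ValueError("separation portion must be internal (X and Z non-empty)")
    return SplitOligo(s[:run.start()], s[run.start():run.end()], s[run.end():])


parts = split_oligo("GCTAGCTAGGCTAACGTTAGC" + "IIIII" + "GATCGTAC")
print(parts.x, parts.y, parts.z)
```

In practice the portion boundaries would more likely be supplied with the oligonucleotide design rather than inferred from the string; inferring them is shown here only to keep the example self-contained.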

In particular, for an oligonucleotide comprising two portions that affect specificity differently, the user can evaluate the specificity of the oligonucleotide accurately from the mismatch results that the invention provides for each portion. The method of the invention is therefore especially useful for evaluating the specificity of such atypical oligonucleotides.

FIG. 1 is a flowchart illustrating a process for evaluating the specificity of an oligonucleotide according to an exemplary embodiment of the present invention. The method (100) of the invention is described below with reference to FIG. 1:

Step (a): Providing an oligonucleotide (110)

First, in this step, the oligonucleotide whose specificity is to be evaluated is provided (110). The oligonucleotide is a primer or probe used to amplify or detect a target nucleic acid sequence.

As used herein, the term "target nucleic acid sequence", "target sequence", or "target" means a nucleic acid sequence that is to be amplified or detected using the oligonucleotide of the invention. The target nucleic acid sequence may be double-stranded or single-stranded. It may be either or both strands of a double-stranded nucleic acid, i.e., the (+) strand (coding strand, sense strand, non-template strand) or the (-) strand (non-coding strand, antisense strand, template strand). The target nucleic acid sequence may be a single polynucleotide sequence comprising a region that can hybridize with the oligonucleotide of the invention. Alternatively, it may be at least two polynucleotide sequences comprising a common region that can hybridize with the oligonucleotide of the invention. The target nucleic acid sequence may be a nucleotide sequence having genetic diversity. It may be a genetically identical gene family, i.e., a group consisting of a gene and its variants. It may also be a group consisting of a gene and the subtypes belonging to that gene according to conventionally known classification criteria. For example, if the oligonucleotide is intended to amplify or detect human papillomavirus (HPV) type 16, the target nucleic acid sequence may consist of a plurality of genes belonging to HPV type 16.

In contrast, as used herein, the term "non-target nucleic acid sequence", "non-target sequence", or "non-target" means a nucleic acid sequence other than the target nucleic acid sequence that is amplified or detected using the oligonucleotide of the invention. Non-target nucleic acid sequences also include nucleic acid sequences that are not intended to be amplified or detected using the oligonucleotide of the invention but may be amplified or detected by chance.

As used herein, the term "oligonucleotide" means a short polynucleotide whose specificity is to be evaluated. The oligonucleotide may be referred to as a "query" or "query sequence".

The oligonucleotide is a linear oligomer of natural or modified monomers or linkages, including deoxyribonucleotides and ribonucleotides, that may occur naturally or be synthesized artificially and that can hybridize specifically with a target nucleic acid sequence. The oligonucleotide is preferably single-stranded for maximum efficiency in hybridization. Preferably, the oligonucleotide is an oligodeoxyribonucleotide. The oligonucleotide of the invention may comprise naturally occurring dNMPs (i.e., dAMP, dGMP, dCMP and dTMP), modified nucleotides, or non-natural nucleotides. The oligonucleotide may also comprise ribonucleotides. For example, the oligonucleotide of the invention may comprise: backbone-modified nucleotides, such as peptide nucleic acid (PNA) (M. Egholm et al., Nature, 365:566-568 (1993)), phosphorothioate DNA, phosphorodithioate DNA, phosphoramidate DNA, amide-linked DNA, MMI-linked DNA, 2'-O-methyl RNA, alpha-DNA and methylphosphonate DNA; sugar-modified nucleotides, such as 2'-O-methyl RNA, 2'-fluoro RNA, 2'-amino RNA, 2'-O-alkyl DNA, 2'-O-allyl DNA, 2'-O-alkynyl DNA, hexose DNA, pyranosyl RNA and anhydrohexitol DNA; and nucleotides with base modifications, such as C-5 substituted pyrimidines (substituents including fluoro-, bromo-, chloro-, iodo-, methyl-, ethyl-, vinyl-, formyl-, ethynyl-, propynyl-, alkynyl-, thiazolyl-, imidazolyl- and pyridyl-), 7-deazapurines with C-7 substituents (substituents including fluoro-, bromo-, chloro-, iodo-, methyl-, ethyl-, vinyl-, formyl-, alkynyl-, alkenyl-, thiazolyl-, imidazolyl- and pyridyl-), inosine and diaminopurine.

For example, the oligonucleotide of the invention may comprise bases other than the natural bases (A, T, C or G).

In the method of the invention, the oligonucleotide whose specificity is to be evaluated is a primer or a probe.

As used herein, the term "primer" refers to an oligonucleotide that can act as a point of initiation of synthesis under conditions in which synthesis of a primer extension product complementary to a target nucleic acid strand (template) is induced, i.e., in the presence of nucleotides and a polymerizing agent such as a DNA polymerase, and at a suitable temperature and pH. The primer must be long enough to prime the synthesis of extension products in the presence of the polymerizing agent. The exact length of the primer depends on many factors, including the temperature, the application, and the source of the primer.

As used herein, the term "probe" refers to a single-stranded nucleic acid molecule comprising a region or regions substantially complementary to a target nucleic acid sequence. The probe may contain a label capable of generating a signal for detection of the target nucleic acid sequence. The 3'-end of the probe may be "blocked" to prevent its extension. The blocking may be achieved by conventional methods. For example, blocking may be performed by adding a chemical moiety such as biotin, a label, a phosphate group, an alkyl group, a non-nucleotide linker, a phosphorothioate or an alkane-diol to the 3'-hydroxyl group of the last nucleotide. Alternatively, blocking may be performed by removing the 3'-hydroxyl group of the last nucleotide or by using a nucleotide that lacks a 3'-hydroxyl group, such as a dideoxynucleotide.

As used herein, the term "annealing" or "priming" refers to the apposition of an oligodeoxynucleotide or nucleic acid to a template nucleic acid, whereby the apposition enables a polymerase to polymerize nucleotides into a nucleic acid molecule complementary to the template nucleic acid or a portion thereof. As used herein, the term "hybridization" refers to the formation of a double-stranded nucleic acid from complementary single-stranded nucleic acids. The terms "annealing" and "hybridization" do not differ in meaning and are used interchangeably herein.

The oligonucleotide whose specificity is to be evaluated in the present invention is an oligonucleotide represented by the following formula (I):

5'-X-Y-Z-3' (I)

wherein X represents a site comprising a hybridizing nucleotide sequence that hybridizes to a target nucleic acid sequence; Y represents a separation site comprising two or more consecutive bases that do not participate in Watson-Crick base pairing; and Z represents a site comprising a hybridizing nucleotide sequence that hybridizes to the target nucleic acid sequence.

The oligonucleotide of formula (I) has three distinct sites with different properties, and its annealing specificity toward a target nucleic acid sequence is dually determined by its two separated sites, namely site X and site Z.
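The three-site structure of formula (I) can be pictured with a small sketch. It assumes, purely for illustration, that the oligonucleotide is written 5'→3' as a string and that site Y is a run of deoxyinosine written as the letter `I`; the function name `split_xyz` and the example sequence are hypothetical and do not come from the specification.

```python
import re
from typing import NamedTuple


class OligoParts(NamedTuple):
    x: str  # 5' hybridizing site
    y: str  # separation site (bases that do not Watson-Crick pair)
    z: str  # 3' hybridizing site


def split_xyz(oligo: str, separator: str = "I", min_run: int = 2) -> OligoParts:
    """Split a 5'->3' oligo at its first internal run of >= min_run separator bases."""
    pattern = re.compile(f"{re.escape(separator)}{{{min_run},}}")
    match = pattern.search(oligo)
    # Site Y must be flanked by a non-empty X on the 5' side and Z on the 3' side.
    if match is None or match.start() == 0 or match.end() == len(oligo):
        raise ValueError("no internal separation site found")
    return OligoParts(oligo[:match.start()], match.group(), oligo[match.end():])


# Hypothetical oligo: 20-nt X, five inosines as Y, 8-nt Z.
parts = split_xyz("ACGTACGTACGTACGTACGT" + "IIIII" + "GCATGCAT")
print(parts.x, parts.y, parts.z)
```

Splitting first keeps any later match/mismatch counting for site X and site Z separate, which mirrors the dual determination of specificity described above.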

In general, the annealing specificity of a conventional (typical) primer or probe is governed by its entire sequence. In contrast, the annealing specificity of the oligonucleotide of formula (I) is dually determined by two sites separated by site Y, namely site X and site Z.

In the oligonucleotide of formula (I), site Y comprises two or more consecutive bases, none of which participates in Watson-Crick base pairing.

As used herein, Watson-Crick base pairing means that adenine (A) pairs with thymine (T) or uracil (U), while guanine (G) pairs with cytosine (C).

Accordingly, a base that does not participate in Watson-Crick base pairing refers to any base that does not form a Watson-Crick base pair with the opposite base in the target nucleic acid sequence. In particular, such bases include any base whose pairing with the opposite base in the target nucleic acid sequence is weaker (has a lower melting temperature) than base pairing between natural bases.
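The definition above can be illustrated with a short sketch that flags the positions of an oligonucleotide that do not form a Watson-Crick pair with the base facing them. The function names and the convention that the target bases are supplied in the order in which they face the oligonucleotide are assumptions made for this example only.

```python
# The Watson-Crick pairs named in the text: A with T or U, and G with C.
WATSON_CRICK = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def is_watson_crick(base: str, opposite: str) -> bool:
    """True if the two bases form one of the Watson-Crick pairs listed above."""
    return (base.upper(), opposite.upper()) in WATSON_CRICK


def non_pairing_positions(oligo: str, facing: str) -> list[int]:
    """Indices of oligo bases that do not Watson-Crick pair with the facing target base.

    `facing` lists the target bases in the order they sit opposite the oligo,
    so facing[i] is the base opposite oligo[i] (an assumption of this sketch).
    """
    if len(oligo) != len(facing):
        raise ValueError("oligo and facing strand must have equal length")
    return [i for i, (b, o) in enumerate(zip(oligo, facing)) if not is_watson_crick(b, o)]


# Hypothetical example: two inosines (written I) sit opposite C and A.
positions = non_pairing_positions("ACIIG", "TGCAC")
print(positions)
```

Under this definition a universal base such as inosine is reported as non-pairing regardless of the opposite base, which is the behavior the text assigns to the bases of site Y.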

In one embodiment, site Y is designed to have the lowest Tm value among the three sites when the oligonucleotide is annealed to the target nucleic acid sequence.

These bases that do not participate in Watson-Crick base pairing generate a bubble structure during annealing (hybridization) or amplification, in particular under conditions in which site X and/or site Z anneals (hybridizes) specifically to the target nucleic acid sequence. The bubble separates site X from site Z and thereby improves the annealing specificity of the primer or probe toward the target nucleic acid sequence.

Examples of bases that do not participate in Watson-Crick base pairing include: (i) non-natural bases; (ii) universal bases; and (iii) mismatched bases. In one embodiment, the bases contained in the separation site Y are selected from non-natural bases, universal bases, mismatched bases and combinations thereof.

As used herein, the term "non-natural base" refers to derivatives of natural bases such as adenine (A), guanine (G), thymine (T), cytosine (C) and uracil (U) that are capable of forming hydrogen-bonded base pairs with one another (see U.S. Pat. No. 8,440,406). The term "non-natural base" as used herein includes bases having base-pairing patterns different from those of the natural bases as parent compounds, as described, for example, in U.S. Pat. Nos. 5,432,272, 5,965,364, 6,001,983, 6,037,120 and 8,440,406. Base pairing between non-natural bases involves two or three hydrogen bonds, as with natural bases. Base pairing between non-natural bases also occurs in a specific manner.

A non-natural base contained in the oligonucleotide of formula (I) does not participate in Watson-Crick base pairing when the opposite base in the target nucleic acid sequence is a natural base. Base pairing between the non-natural base and the opposite base in the target nucleic acid sequence is weaker (has a lower melting temperature) than base pairing between natural bases. Such pairing therefore generates a bubble structure and serves to separate sites X and Z.

Specific examples of non-natural bases include the following bases in base-pair combinations: iso-C/iso-G, iso-dC/iso-dG, K/X, H/J, and M/N (see U.S. Pat. Nos. 7,422,850 and 8,440,406).

As used herein, the term "universal base" refers to a base capable of forming base pairs with each of the natural DNA/RNA bases with little discrimination between them; such base pairs are not Watson-Crick base pairs.

Base pairing between a universal base contained in the oligonucleotide of formula (I) and the opposite base contained in the target nucleic acid sequence is weaker (has a lower melting temperature) than base pairing between natural bases.

Examples of the universal base include deoxyinosine, inosine, 7-deaza-2'-deoxyinosine, 2-aza-2'-deoxyinosine, 2'-OMe inosine, 2'-F inosine, deoxy 3-nitropyrrole, 3-nitropyrrole, 2'-OMe 3-nitropyrrole, 2'-F 3-nitropyrrole, 1-(2'-deoxy-beta-D-ribofuranosyl)-3-nitropyrrole, deoxy 5-nitroindole, 5-nitroindole, 2'-OMe 5-nitroindole, 2'-F 5-nitroindole, deoxy 4-nitrobenzimidazole, 4-nitrobenzimidazole, deoxy 4-aminobenzimidazole, 4-aminobenzimidazole, deoxy nebularine, 2'-F nebularine, 2'-F 4-nitrobenzimidazole, PNA-5-nitroindole, PNA-nebularine, PNA-inosine, PNA-4-nitrobenzimidazole, PNA-3-nitropyrrole, morpholino-5-nitroindole, morpholino-nebularine, morpholino-inosine, morpholino-4-nitrobenzimidazole, morpholino-3-nitropyrrole, phosphoramidate-5-nitroindole, phosphoramidate-nebularine, phosphoramidate-inosine, phosphoramidate-4-nitrobenzimidazole, phosphoramidate-3-nitropyrrole, 2'-O-methoxyethyl inosine, 2'-O-methoxyethyl nebularine, 2'-O-methoxyethyl 5-nitroindole, 2'-O-methoxyethyl 4-nitrobenzimidazole, 2'-O-methoxyethyl 3-nitropyrrole, and combinations thereof. In particular, the universal base is deoxyinosine, inosine, 1-(2'-deoxy-beta-D-ribofuranosyl)-3-nitropyrrole or 5-nitroindole, and more particularly deoxyinosine or inosine.

As used herein, the term "mismatched base" refers to a base that cannot form a hydrogen-bonded base pair with the opposite base in the target nucleic acid sequence (see WO 2013/123552 and WO 2014/124290). The mismatched base may vary depending on the type of the opposite base in the target nucleic acid.

Because a mismatched base contained in the oligonucleotide of formula (I) cannot form a base pair with the opposite base contained in the target nucleic acid, a site Y comprising mismatched bases generates a bubble structure and serves to separate sites X and Z.

Site Y may have two consecutive bases that do not participate in Watson-Crick base pairing, preferably three, four, five, six, seven or more such consecutive bases. According to particular embodiments, site Y has 2-10, 2-9, 2-8, 2-7, 2-6, 2-5, 2-4 or 2-3 consecutive bases that do not participate in Watson-Crick base pairing, more particularly 3-10, 3-9, 3-8, 3-7, 3-6, 3-5 or 3-4 such consecutive bases, and most particularly 4-10, 4-9, 4-8, 4-7, 4-6 or 4-5 such consecutive bases.

In one embodiment, site Y has two consecutive non-natural bases, preferably three, four, five, six, seven, eight or more consecutive non-natural bases. In another embodiment, site Y has two consecutive universal bases, preferably three, four, five, six, seven, eight or more consecutive universal bases. In another embodiment, site Y has two consecutive mismatched bases, preferably three, four, five, six, seven, eight or more consecutive mismatched bases. In yet another embodiment, site Y has two, preferably three, four, five, six, seven, eight or more consecutive bases, each base being independently selected from non-natural bases, universal bases and mismatched bases.

In the oligonucleotide of formula (I), sites X and Z are each a site having a nucleotide sequence that hybridizes to the target nucleic acid sequence, that is, a site having a hybridizing nucleotide sequence complementary to the position on the template nucleic acid to which it hybridizes.

The term "complementary" is used herein to mean sufficiently complementary to hybridize selectively to a target nucleic acid sequence under the designated annealing conditions or stringent conditions. It encompasses both "substantially complementary" and "perfectly complementary", and preferably means perfectly complementary.

Site X and/or site Z in the oligonucleotide of formula (I) may have one or more mismatches to the template (target nucleic acid sequence) to the extent that the oligonucleotide can still function as a primer or probe. For example, site X and/or site Z in the oligonucleotide of formula (I) may have 1-2, 1-3 or 1-4 non-complementary nucleotides.

Most particularly, site X and/or site Z in the oligonucleotide of formula (I) has a nucleotide sequence that is perfectly complementary to a position on the template, that is, a sequence with no mismatches.

The lengths of site X and site Z may each range from 3 to 50 nucleotide residues.

In one embodiment, site X is longer than site Z. Specifically, the length of site X is 15 to 50, 15 to 40, 15 to 30 or 15 to 25 nucleotide residues, more particularly 17 to 50, 17 to 40, 17 to 30 or 17 to 25 nucleotide residues, and most particularly 20 to 50, 20 to 40, 20 to 30 or 20 to 25 nucleotide residues. The length of site Z is 3 to 15, 3 to 12 or 3 to 10 nucleotide residues, more particularly 5 to 15, 5 to 12 or 5 to 10 nucleotide residues, and most particularly 6 to 12 nucleotide residues.

In another embodiment, site Z is longer than site X. Specifically, the length of site Z is 15 to 50, 15 to 40, 15 to 30 or 15 to 25 nucleotide residues, more particularly 17 to 50, 17 to 40, 17 to 30 or 17 to 25 nucleotide residues, and most particularly 20 to 50, 20 to 40, 20 to 30 or 20 to 25 nucleotide residues. The length of site X is 3 to 15, 3 to 12 or 3 to 10 nucleotide residues, more particularly 5 to 15, 5 to 12 or 5 to 10 nucleotide residues, and most particularly 6 to 12 nucleotide residues.
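The length embodiments above lend themselves to a simple check. The sketch below uses the broadest ranges stated (15 to 50 residues for the longer of sites X and Z, 3 to 15 for the shorter, at least 2 bases for site Y) as defaults; the function name `check_lengths` and its parameters are hypothetical, and the narrower preferred ranges can be passed in instead.

```python
def check_lengths(x: str, y: str, z: str,
                  long_range: tuple[int, int] = (15, 50),
                  short_range: tuple[int, int] = (3, 15),
                  min_y: int = 2) -> list[str]:
    """Return a list of problems; an empty list means all length checks pass."""
    # Either X or Z may be the longer site, so compare lengths first.
    long_site, short_site = (x, z) if len(x) >= len(z) else (z, x)
    problems = []
    if not long_range[0] <= len(long_site) <= long_range[1]:
        problems.append(
            f"longer site has {len(long_site)} nt, expected {long_range[0]}-{long_range[1]}")
    if not short_range[0] <= len(short_site) <= short_range[1]:
        problems.append(
            f"shorter site has {len(short_site)} nt, expected {short_range[0]}-{short_range[1]}")
    if len(y) < min_y:
        problems.append(f"site Y has {len(y)} base(s), expected at least {min_y}")
    return problems


ok = check_lengths("A" * 20, "III", "ACGTAC")      # 20-nt X, 3-base Y, 6-nt Z
bad = check_lengths("A" * 10, "I", "ACG")          # X too short, Y too short
print(ok, bad)
```

The check does not require the two hybridizing sites to differ in length; it only assigns the two ranges to whichever site is longer and shorter.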

In one embodiment, the Tm of each of sites X and Z is in the range of 6°C to 80°C, 6°C to 70°C, 6°C to 50°C, 6°C to 40°C, 10°C to 80°C, 10°C to 70°C, 10°C to 60°C, 10°C to 50°C, 10°C to 40°C, 20°C to 80°C, 20°C to 70°C, 20°C to 60°C, 20°C to 50°C, 20°C to 40°C, 30°C to 80°C, 30°C to 70°C, 30°C to 60°C, 30°C to 50°C, or 30°C to 40°C. In one embodiment, the Tm of site Y is 1°C to 15°C, 1°C to 20°C, 1°C to 5°C, 2°C to 15°C, 2°C to 10°C, 2°C to 5°C, 3°C to 15°C, 3°C to 10°C, or 3°C to 5°C. In one embodiment, the Tm of site Y is lower than the Tm of each of sites X and Z.

In one embodiment, the Tm of site X is higher than the Tm of site Z. In particular embodiments, the Tm of site X is 5°C, 10°C, 15°C, 20°C or 25°C higher than the Tm of site Z. In another embodiment, the Tm of site Z is higher than the Tm of site X. In particular embodiments, the Tm of site Z is 5°C, 10°C, 15°C, 20°C or 25°C higher than the Tm of site X.
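The Tm relationships above can be explored with a rough estimate. The sketch below uses the Wallace rule, Tm ≈ 2°C × (A+T) + 4°C × (G+C), which is a crude approximation intended for short oligonucleotides; the specification does not state which Tm model is used, so the choice of this rule, the function names and the example sequences are all assumptions.

```python
def wallace_tm(seq: str) -> int:
    """Crude Tm estimate in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))


def tm_margin(x: str, z: str) -> int:
    """Estimated Tm of site X minus estimated Tm of site Z (positive if X is higher)."""
    return wallace_tm(x) - wallace_tm(z)


# Hypothetical sites: a 20-nt X and an 8-nt Z.
site_x = "ACGTACGTACGTACGTACGT"
site_z = "GCATGCAT"
margin = tm_margin(site_x, site_z)
print(wallace_tm(site_x), wallace_tm(site_z), margin)
```

A nearest-neighbor thermodynamic model that accounts for salt and oligonucleotide concentration would give more realistic values; the rule here is only meant to show how a margin between the two sites might be compared with the 5°C to 25°C differences mentioned above.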

In the oligonucleotide of formula (I), either or both of sites X and Z may comprise at least one universal base or degenerate base.

In one embodiment, when either or both of site X and site Z in the oligonucleotide of formula (I) comprise two or more universal bases, those universal bases are not consecutive in the oligonucleotide sequence but are separated from one another. When site Y also contains two or more consecutive universal bases, the two or more universal bases contained in site X and/or site Z are distinguished from the two or more consecutive universal bases in site Y in that they are separated from one another in the sequence.

In another embodiment, when either or both of site X and site Z in the oligonucleotide of formula (I) comprise two or more universal bases, those universal bases are consecutive in the oligonucleotide sequence. When site Y also contains two or more consecutive universal bases, the two or more universal bases contained in site X and/or site Z cannot be distinguished from the two or more consecutive universal bases in site Y. In this case, any one of these runs may be treated or regarded as site Y. As one example, the run of universal bases closer to the 5'-end may be treated as site Y, with the region on the 5' side of that site Y treated as site X and the region on its 3' side treated as site Z. As another example, the run farther from the 5'-end may be treated as site Y, with the region on the 5' side of that site Y treated as site X and the region on its 3' side treated as site Z. As yet another example, the run having more universal bases may be treated as site Y, with the region on the 5' side of that site Y treated as site X and the region on its 3' side treated as site Z.
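The three ways of designating site Y described above can be sketched as selectable rules. The example assumes that universal bases are written as the letter `I` in a 5'→3' string; the function names and the rule labels `nearest_5prime`, `farthest_5prime` and `longest` are hypothetical names chosen for this illustration.

```python
import re


def universal_runs(oligo: str, universal: str = "I", min_run: int = 2) -> list[tuple[int, int]]:
    """(start, end) spans of every run of >= min_run consecutive universal bases."""
    pattern = f"{re.escape(universal)}{{{min_run},}}"
    return [(m.start(), m.end()) for m in re.finditer(pattern, oligo)]


def choose_y(oligo: str, rule: str = "nearest_5prime") -> tuple[str, str, str]:
    """Pick one run as site Y and return (X, Y, Z) for a 5'->3' oligo string."""
    runs = universal_runs(oligo)
    if not runs:
        raise ValueError("no run of consecutive universal bases found")
    if rule == "nearest_5prime":      # run closest to the 5' end
        start, end = runs[0]
    elif rule == "farthest_5prime":   # run farthest from the 5' end
        start, end = runs[-1]
    elif rule == "longest":           # run with the most universal bases
        start, end = max(runs, key=lambda r: r[1] - r[0])
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Everything 5' of Y is treated as X, everything 3' of Y as Z.
    return oligo[:start], oligo[start:end], oligo[end:]


# Hypothetical oligo containing three runs of inosine (2, 4 and 3 bases long).
OLIGO = "ACGTACGT" + "II" + "ACGTAC" + "IIII" + "GTACGT" + "III" + "GCAT"
for rule in ("nearest_5prime", "farthest_5prime", "longest"):
    print(rule, choose_y(OLIGO, rule))
```

Whichever rule is applied, the runs that are not chosen remain part of site X or site Z, so any later match/mismatch counting needs to handle universal bases inside those sites.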

As used herein, the term "degenerate base" means that any of the four bases (A, C, G or T), or a specific subset of the four bases (two or three bases), may be present at a designated nucleotide position. The term thus denotes the possibility of two or more bases at a particular position. A single oligonucleotide sequence can be synthesized with multiple bases at the same position; such a position is referred to as a degenerate base and is often also called a "wobble" position or a "mixed base".

The degenerate base may have different degrees of degeneracy. The term "degree of degeneracy" refers to the number of bases that can occupy a given nucleotide position. "Full degeneracy" occurs when all four bases (A, C, G or T) can occupy a given degenerate position. In this case, four oligonucleotides may be used together: one having base A, one having base C, one having base G, and one having base T at the given degenerate position. In contrast, "partial degeneracy" occurs when a specific subset (two or three) of the four bases, such as A/G, C/T, A/C/G or A/T/G, can occupy a given degenerate position.

For the notation of degenerate bases, the IUB degenerate code for nucleotide bases is used herein. In this code, R means either of the purine bases A or G; Y means either of the pyrimidine bases C or T; M means either of the amino bases A or C; K means either of the keto bases G or T; S means either of the strong hydrogen-bonding partners C or G; W means either of the weak hydrogen-bonding partners A or T; H means A, C or T; B means G, T or C; V means G, C or A; D means G, A or T; and N means G, A, C or T.
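The IUB code listed above maps directly onto a lookup table, from which both the degree of degeneracy of a position and the set of concrete oligonucleotides represented by a degenerate sequence can be derived. The function names below are hypothetical, and note that the IUB letter Y (pyrimidine) is unrelated to site Y of formula (I).

```python
from itertools import product

# IUB degenerate code as listed in the text, plus the four unambiguous bases.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
    "N": "ACGT",
}


def degree_of_degeneracy(code: str) -> int:
    """Number of bases that can occupy a position carrying this IUB code."""
    return len(IUB[code.upper()])


def expand(oligo: str) -> list[str]:
    """All concrete sequences represented by an oligo containing IUB codes."""
    return ["".join(p) for p in product(*(IUB[b] for b in oligo.upper()))]


# Hypothetical oligo with two partially degenerate positions (R and Y).
variants = expand("ACRYG")
print(variants)
```

The number of variants is the product of the per-position degrees of degeneracy, so it grows quickly when several fully degenerate (N) positions are present.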

According to a particular embodiment of the present invention, the oligonucleotide represented by formula (I) is a dual specificity oligonucleotide (referred to as DSO or DPO) as disclosed in WO 2006/095981. For details of the dual specificity oligonucleotide, see that publication.

According to another particular embodiment of the present invention, the oligonucleotide represented by formula (I) is a target discriminative (TD) probe as disclosed in WO 2011/028041. For details of the target discriminative probe, see that publication.

The oligonucleotide of formula (I) provided in this step may be a pre-existing oligonucleotide (primer or probe).

Alternatively, the oligonucleotide of formula (I) provided in this step may be an oligonucleotide designed on the basis of the target nucleic acid sequence to be amplified or detected.

The oligonucleotide may be designed manually or by a design program well known in the art. Examples of conventional primer/probe design programs include, but are not limited to, Primer3 (http://frodo.wi.mit.edu/), Visual OMP™ software (DNA Software, Inc., Ann Arbor, Mich.), the Integrated DNA Technology (IDT) OligoAnalyzer 3.0 program (http://scitools.idtdna.com/Analvzer/oligocalc.asp), the DINAmelt™ program (http://dinamelt.bioinfo.rpi.edu/), OLIGO 7 (Wojciech Rychlik (2007). "OLIGO 7 Primer Analysis Software". Methods Mol. Biol. 402: 35-60), and Primer Express 3.0 software (Applied Biosystems, U.S.A.).

The oligonucleotide of formula (I) is designed so that its X and Z sites have sequences that can substantially hybridize to the target nucleic acid sequence. To this end, the X and Z sites in the oligonucleotide of formula (I) are designed to match (to have significant sequence similarity to) specific regions of the target nucleic acid sequence.

When the oligonucleotide of formula (I) is intended to amplify or detect a plurality of target nucleic acid sequences (for example, nucleotide sequences having genetic diversity; a group consisting of a genetically identical gene family, that is, a gene and its variants; or a group of a gene and its subtypes), the oligonucleotide may be prepared by aligning the plurality of target nucleic acid sequences, finding a common sequence such as a conserved region, and designing the oligonucleotide to match the conserved region. The oligonucleotide of formula (I) may be designed to have 100% identity with the plurality of target nucleic acid sequences. Alternatively, the oligonucleotide of formula (I) may be designed to have a few mismatches to the plurality of target nucleic acid sequences, as long as it can hybridize to the target nucleic acid sequences under controlled hybridization conditions (for example, temperature).
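The procedure of aligning several target sequences and locating a conserved region can be sketched as a scan over the columns of an existing alignment. The example assumes the sequences are already aligned to equal length, with `-` marking gaps; the function name `conserved_windows`, its parameters and the toy alignment are hypothetical.

```python
def conserved_windows(aligned: list[str], window: int, max_variable: int = 0) -> list[int]:
    """Start indices of windows with at most max_variable non-conserved columns.

    A column counts as variable if the aligned sequences disagree at that
    position or if any of them has a gap ('-') there.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    variable = [len(set(col)) > 1 or "-" in col for col in zip(*aligned)]
    return [
        start
        for start in range(length - window + 1)
        if sum(variable[start:start + window]) <= max_variable
    ]


# Toy alignment of three variants; columns 3 and 8 differ between sequences.
ALIGNED = [
    "ACGTACGTTAGC",
    "ACGTACGTAAGC",
    "ACGAACGTTAGC",
]
strict = conserved_windows(ALIGNED, window=4)                   # 100% identity
relaxed = conserved_windows(ALIGNED, window=4, max_variable=1)  # one mismatch allowed
print(strict, relaxed)
```

Setting `max_variable` to zero corresponds to designing for 100% identity with all targets, while a small positive value corresponds to tolerating a few mismatches as described above.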

The oligonucleotide of formula (I) may be one of a plurality of candidate oligonucleotides designed on the basis of the target nucleic acid sequence(s). A person skilled in the art can design a plurality of candidate oligonucleotides of formula (I) on the basis of known target nucleic acid sequence(s), and the oligonucleotide of formula (I) used in the method of the present invention may be one of these candidate oligonucleotides.

The oligonucleotide of formula (I) may be one of the oligonucleotides used in multiplex amplification or detection. The oligonucleotide of formula (I) may be one of a plurality of oligonucleotides (or candidate oligonucleotides) for amplifying or detecting a plurality of target nucleic acid sequences.

In addition, the oligonucleotide of formula (I) may be one member of a primer pair (that is, a forward primer and a reverse primer) for amplifying a target nucleic acid sequence.

The oligonucleotide of formula (I) is an oligonucleotide that can be used for PCR or real-time PCR. The oligonucleotide of formula (I) is useful in various fields, for example: (i) primer-related nucleic acid amplification methods, such as the methods of Miller, H. I. (WO 89/06700) and Davey, C. et al. (EP 329,822), ligase chain reaction (LCR; Wu, D.Y. et al., Genomics 4:560 (1989)), polymerase ligase chain reaction (Barany, PCR Methods and Applic., 1:5-16 (1991)), Gap-LCR (WO 90/01069), repair chain reaction (EP 439,182), 3SR (Kwoh et al., PNAS, USA, 86:1173 (1989)) and NASBA (U.S. Pat. No. 5,130,238); (ii) primer extension-related techniques, such as cycle sequencing (Kretz et al., (1994) Cycle sequencing. PCR Methods Appl. 3:S107-S112) and pyrosequencing (Ronaghi et al., (1996) Anal. Biochem., 242:84-89; and (1998) Science 281:363-365); and (iii) hybridization-related techniques, such as the detection of target nucleotide sequences using oligonucleotide microarrays. The oligonucleotide of the present invention is thus applicable to a variety of nucleic acid amplification, sequencing and hybridization-related techniques.

Step (b): Comparison with a nucleotide sequence database and extraction of reference nucleotide sequences comprising a homologous region (120)

In this step, the whole or a partial sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database, and reference nucleotide sequences comprising a region homologous to the whole or partial sequence of the oligonucleotide of formula (I) are extracted from the database (120).

As used herein, the terms "database of nucleotide sequences", "nucleotide sequence database", "nucleotide database" and "database" refer to a set or collection of data on two or more nucleotide sequences derived from various sources. The nucleotide sequence database may contain information related to the nucleotide sequences, for example their specific sequences and identities. The database may be publicly available, commercially available, or generated by the present inventors. The database is a collection arranged for ease and speed of search by computer.

Examples of databases known in the art include, but are not limited to, the GenBank database, the EST database, the EMBL nucleotide sequence database, the Entrez nucleotide database and the LIFESEQ™ database. A nucleotide sequence database may also be referred to herein as a "reference database".

The database compared with the oligonucleotide of formula (I) herein may be any one of the databases described above or a combination thereof.

In step (b), comparing all or part of the sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database comprises searching the database using a sequence alignment algorithm or program. The comparison also comprises aligning all or part of the sequence of the oligonucleotide with the nucleotide sequences in the database using a sequence alignment algorithm or program. The comparison further comprises aligning all or part of the sequence of the oligonucleotide with each nucleotide sequence in the database and analyzing the alignment. The comparison further comprises aligning all or part of the sequence of the oligonucleotide with each nucleotide sequence in the database and determining the homology or similarity between them.

In this step, the comparison between two sequences, i.e., between all or part of the sequence of the oligonucleotide of formula (I) and a nucleotide sequence in the database, may be performed using a sequence alignment algorithm or program.

Sequence alignment algorithms and programs are known in the art. Examples include the local homology algorithm of Smith and Waterman (1981, Adv. Appl. Math. 2:482), the homology alignment algorithm of Needleman and Wunsch (1970, J. Mol. Biol. 48:443), the search-for-similarity method of Pearson and Lipman (1988, Proc. Nat'l. Acad. Sci. USA 85:2444), computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), and manual alignment and visual inspection.
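As an illustration of the local homology approach named above, the following is a minimal sketch of Smith-Waterman scoring in Python. The scoring values (match, mismatch and a linear gap penalty) and the function name are assumptions chosen for the example; they are not taken from the source, and the sketch returns only the best local score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty.
    The default scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# A short query embedded in a longer subject aligns locally over its full length.
print(smith_waterman_score("ACGT", "TTACGTTT"))
```

Because the score is floored at zero, unrelated sequences score 0 while a query embedded in a longer reference scores as if the flanking bases were absent, which is the behavior wanted when looking for a short homologous region inside a database entry.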

Other examples of algorithms or programs for determining homology include the BLAST program (Basic Local Alignment Search Tool at the National Center for Biotechnology Information), ALIGN, AMAS (Analysis of Multiply Aligned Sequences), AMPS (Protein Multiple Sequence Alignment), ASSET (Aligned Segment Statistical Evaluation Tool), BANDS, BESTSCOR, BIOSCAN (Biological Sequence Comparative Analysis Node), BLIMPS (BLocks IMProved Searcher), FASTA, Intervals & Points, BMB, CLUSTAL V, CLUSTAL W, CONSENSUS, LCONSENSUS, WCONSENSUS, the Smith-Waterman algorithm, DARWIN, the Las Vegas algorithm, FNAT (Forced Nucleotide Alignment Tool), Framealign, Framesearch, DYNAMIC, FILTER, FSAP (Fristensky Sequence Analysis Package), GAP (Global Alignment Program), GENAL, GIBBS, GenQuest, ISSC (Sensitive Sequence Comparison), LALIGN (Local Sequence Alignment), LCP (Local Content Program), MACAW (Multiple Alignment Construction & Analysis Workbench), MAP (Multiple Alignment Program), MBLKP, MBLKN, PIMA (Pattern-Induced Multi-sequence Alignment), SAGA (Sequence Alignment by Genetic Algorithm) and WHAT-IF. In particular, the sequence alignment algorithm or program is selected from the group consisting of the Smith-Waterman, Needleman-Wunsch, BLAST and FASTA algorithms or programs.

The sequence alignment algorithm or program uses appropriate parameters to find regions homologous to the oligonucleotide (the query sequence). The algorithm or program used in the method of the present invention may use parameters set to their default values or parameters suitably adjusted by those skilled in the art. For example, the BLAST algorithm, a representative sequence alignment algorithm or program, uses parameters such as E-value, Reward/penalty, Gap penalty, Gap creation, Word size, Scoring matrix, PSSM and Filter. These parameters may be adjusted by those skilled in the art to control the amount (number) of reference nucleotide sequences extracted, by adjusting the homology cutoff between all or part of the sequence of the oligonucleotide of formula (I) and each reference nucleotide sequence in the database. In particular, because the oligonucleotide of formula (I) is short, it is preferable to lower the word size and raise the E-value relative to their default values in order to increase the probability of a match.
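The parameter adjustment described above can be sketched as a command line for the NCBI BLAST+ `blastn` program, assembled here in Python. The file name `oligo.fasta`, the database name `reference_db` and the numeric values for word size and E-value are hypothetical placeholders for illustration; the source does not specify concrete values.

```python
import shlex


def build_blastn_command(query_fasta, database, word_size=7, evalue=1000.0):
    """Return an argument list for blastn tuned for a short oligonucleotide query.

    The numeric defaults here are illustrative assumptions, not values
    prescribed by the source.
    """
    return [
        "blastn",
        "-task", "blastn-short",        # task preset intended for short queries
        "-query", query_fasta,
        "-db", database,
        "-word_size", str(word_size),   # small seed word so short matches are found
        "-evalue", str(evalue),         # permissive E-value so weak hits are kept
        "-outfmt", "6 qseqid sseqid pident length mismatch qstart qend sstart send evalue",
    ]


command = build_blastn_command("oligo.fasta", "reference_db")
print(shlex.join(command))
```

The list form can be passed to `subprocess.run` as is; building it separately from execution makes it easy to inspect or log the exact parameters used for each search.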

In one embodiment of the present invention, the sequence alignment algorithm or program used may be one developed by the present inventors. It may be an algorithm or program developed to evaluate the specificity of oligonucleotides that contain, within their sequence, two or more consecutive bases not involved in Watson-Crick base pairing, or optionally non-consecutive universal or degenerate bases. The algorithm or program may disregard the sequence of the Y site in the oligonucleotide of formula (I). For example, it does not consider the homology between the sequence of the Y site in the oligonucleotide of formula (I) and the corresponding reference nucleotide sequence in the database. That is, a comparison using the algorithm or program may comprise determining homology at sites X and Z, excluding site Y.
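One way to picture a comparison that disregards site Y is the sketch below. It is not the inventors' algorithm, which the source does not disclose; it assumes an ungapped, equal-length pairing of the oligonucleotide with a reference segment, uses the letter `I` as a hypothetical symbol for a universal base, and takes the Y site boundaries as 0-based indices supplied by the caller.

```python
def homology_outside_y(oligo, ref_segment, y_start, y_end):
    """Return the fraction of matching bases, ignoring positions y_start..y_end-1.

    Both sequences must have the same length (ungapped comparison).
    Indices are 0-based and y_end is exclusive; these conventions are assumptions.
    """
    if len(oligo) != len(ref_segment):
        raise ValueError("sequences must have the same length")
    compared = 0
    matched = 0
    for i, (a, b) in enumerate(zip(oligo, ref_segment)):
        if y_start <= i < y_end:
            continue                    # site Y is excluded from the comparison
        compared += 1
        matched += a == b
    return matched / compared


# Hypothetical 20-mer: X (10 nt), Y (5 universal bases), Z (5 nt).
oligo = "ACGTACGTAC" + "IIIII" + "TTGCA"
reference = "ACGTACGTAC" + "GATCC" + "TTGGA"
print(homology_outside_y(oligo, reference, 10, 15))
```

In this example the five Y positions would all count as mismatches in a naive comparison, whereas skipping them leaves 14 matches out of 15 compared positions.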

After the comparison described above, reference nucleotide sequences comprising a region homologous to all or part of the sequence of the oligonucleotide of formula (I) are extracted from the database.

As used herein, the term "reference nucleotide sequence" refers to a sequence in the database that comprises a region homologous to all or part of the sequence of the oligonucleotide of formula (I). The number of reference nucleotide sequences extracted may be at least one.

Each reference nucleotide sequence comprises a homologous region and, optionally, its flanking regions.

As used herein with reference to all or part of the sequence of the oligonucleotide of formula (I), the terms "region homologous to", "homologous region" and "homology region" mean a specific region within a reference nucleotide sequence from the database that is identical or similar to all or part of the sequence of the oligonucleotide of formula (I). In other words, the homologous region is a specific region within a reference nucleotide sequence that matches all or part of the sequence of the oligonucleotide of formula (I).

The extracted reference nucleotide sequences may have homologous sequences of different sizes.

In one embodiment, the homologous region has the same length as the oligonucleotide provided in step (a). For example, when the oligonucleotide provided in step (a) contains a relatively small number of consecutive bases not involved in Watson-Crick base pairing (e.g., two or three universal bases), a reference nucleotide sequence extracted by the BLAST algorithm may comprise a homologous region of the same length as the oligonucleotide provided in step (a). In this case, the homologous region has the same length as the entire sequence of the oligonucleotide provided in step (a) and is homologous to that sequence.

In another embodiment, the homologous region is shorter than the oligonucleotide provided in step (a). For example, when the oligonucleotide provided in step (a) contains a relatively large number of consecutive bases not involved in Watson-Crick base pairing (e.g., four, five, or six or more universal bases), a nucleic acid sequence extracted by the BLAST algorithm may comprise a homologous region shorter than the oligonucleotide provided in step (a). Specifically, when an oligonucleotide represented by 5'-X-Y-Z-3' (in particular, one having a relatively large number of consecutive bases not involved in Watson-Crick base pairing in site Y) is compared with a database using BLAST, a region homologous only to site X (a homologous region of the same length as site X) may be obtained. In this case, the homologous region is shorter than the entire sequence of the oligonucleotide provided in step (a) and is homologous to a partial sequence of the oligonucleotide, namely site X.

The phrase "region homologous to all or part of the sequence of an oligonucleotide" refers to a region within a reference nucleotide sequence that has substantial homology (similarity) to all or part of the sequence of the oligonucleotide. Substantial homology means that the homology between the region within the reference nucleotide sequence and all or part of the sequence of the oligonucleotide is higher than a defined or selected degree of homology (a specific threshold). The defined degree of homology is the criterion or threshold for extracting from the database reference nucleotide sequences that have high similarity or homology to the designed oligonucleotide. For example, the defined degree of homology may be 50% or more, 60% or more, 70% or more, 80% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more, based on the total number of bases in either one of the two aligned nucleotide sequences. In one embodiment of the present invention, the defined degree of homology between the sequence of either one of sites X and Z of the oligonucleotide and the corresponding reference nucleotide sequence may be 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more, based on the total number of bases in either one of the two aligned nucleotide sequences. In another embodiment of the present invention, the defined degree of homology between the sequence of site X of the oligonucleotide and the corresponding reference nucleotide sequence is 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more, based on the total number of bases in either one of the two aligned nucleotide sequences, and the degree of homology between the sequence of site Z of the oligonucleotide and the homologous region in the corresponding reference nucleotide sequence is 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more.
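The homology threshold described above can be illustrated with a small sketch. It assumes an ungapped alignment of two equal-length sequences, so that the "total number of bases" is the same for both; the function names and the 90% default cutoff are chosen for the example only.

```python
def percent_identity(query, subject):
    """Return the percentage of identical positions in two equal-length sequences."""
    if not query or len(query) != len(subject):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == s for q, s in zip(query, subject))
    return 100.0 * matches / len(query)


def meets_cutoff(query, subject, cutoff=90.0):
    """Return True if the identity reaches the defined degree of homology."""
    return percent_identity(query, subject) >= cutoff


# Hypothetical 15-nt site X compared with two candidate reference segments.
site_x = "ACGTACGTACGTACG"
one_mismatch = "ACGTACGTACGTACC"    # 14 of 15 bases match
two_mismatches = "TCGTACGTACGTACC"  # 13 of 15 bases match

print(meets_cutoff(site_x, one_mismatch))
print(meets_cutoff(site_x, two_mismatches))
```

For a 15-nt site, a 90% cutoff tolerates one mismatch (about 93.3% identity) but not two (about 86.7%), which shows how strongly the choice of threshold controls the number of reference sequences extracted for short sites.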

In one embodiment, the entire sequence of the oligonucleotide of formula (I) is used for the comparison in step (b).

In a particular embodiment, when the entire sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database, reference nucleotide sequences comprising a region homologous to the entire sequence of the oligonucleotide of formula (I) may be extracted from the database in step (b). For example, the entire sequence of an oligonucleotide consisting of 30 nucleotide residues may be compared with the GenBank database, and reference nucleotide sequences each comprising a homologous region 30 nucleotides in length may be extracted from the database in step (b).

In another particular embodiment, when the entire sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database, reference nucleotide sequences comprising a region homologous to a partial sequence of the oligonucleotide of formula (I) (e.g., site X, site Y or a part thereof) may be extracted from the database in step (b). For example, the entire sequence of an oligonucleotide 30 nucleotides in length may be compared with the GenBank database, and reference sequences comprising a homologous region shorter than 30 nucleotides may be extracted from the database in step (b).

In another embodiment, a partial sequence of the oligonucleotide of formula (I) is used for the comparison in step (b).

The partial sequence of the oligonucleotide of formula (I) used for the comparison in step (b) of the present invention may be site X, site Z, or a part thereof.

In a particular embodiment, when a partial sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database, reference nucleotide sequences comprising a region homologous to that partial sequence may be extracted from the database in step (b). For example, only site X, consisting of 15 nucleotide residues, may be compared with the GenBank database, and reference nucleotide sequences comprising a homologous region 15 nucleotides in length may be extracted from the database in step (b).

In another particular embodiment, when a partial sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database, reference nucleotide sequences comprising a region homologous to a part of that partial sequence may be extracted from the database in step (b). For example, only site X, consisting of 15 nucleotide residues, may be compared with the GenBank database, and reference nucleotide sequences comprising a homologous region shorter than 15 nucleotides may be extracted from the database in step (b).

In one embodiment of the present invention, only the sequence of the X site in the oligonucleotide is compared with at least one nucleotide sequence database, and reference nucleotide sequences comprising a region homologous to the X site may then be extracted from the database in step (b).

In another embodiment of the present invention, only the sequence of the Z site in the oligonucleotide is compared with at least one nucleotide sequence database, and reference nucleotide sequences comprising a region homologous to the Z site may then be extracted from the database in step (b).

In another embodiment of the present invention, only the sequence of a part of the X site in the oligonucleotide is compared with at least one nucleotide sequence database, and reference nucleotide sequences comprising a region homologous to that part of the X site may then be extracted from the database in step (b).

In another embodiment of the present invention, only the sequence of a part of the Z site in the oligonucleotide is compared with at least one nucleotide sequence database, and reference nucleotide sequences comprising a region homologous to that part of the Z site may then be extracted from the database in step (b).

According to the embodiments that use a partial sequence of the oligonucleotide of formula (I), the comparison (i.e., the homology determination) between the oligonucleotide and the nucleotide sequence database is made between the sequence of the X or Z site, or a part thereof, in the oligonucleotide and the reference nucleotide sequences in the database. That is, the homology determination is characterized by using a partial sequence, in particular a partial sequence that excludes the Y site.

Using a partial sequence rather than the entire sequence of the oligonucleotide prevents the Y site from adversely affecting the homology determination, so that reference nucleotide sequences with more accurate homology can be extracted. That is, using a partial sequence of the oligonucleotide avoids the problem of the homologous region being misjudged because of the bases in the Y site that are not involved in Watson-Crick base pairing.
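Preparing partial-sequence queries that leave out the Y site can be sketched as follows. The helper names, the use of `I` for a universal base and the site lengths are hypothetical; the sketch only assumes that the lengths of sites X and Y are known for the designed oligonucleotide.

```python
from typing import NamedTuple


class OligoSites(NamedTuple):
    """The three sites of a 5'-X-Y-Z-3' oligonucleotide."""
    x: str
    y: str
    z: str


def split_oligo(oligo, x_len, y_len):
    """Split an oligonucleotide into sites X, Y and Z from the given lengths."""
    if x_len <= 0 or y_len <= 0 or x_len + y_len >= len(oligo):
        raise ValueError("site lengths must leave a non-empty X, Y and Z")
    return OligoSites(
        x=oligo[:x_len],
        y=oligo[x_len:x_len + y_len],
        z=oligo[x_len + y_len:],
    )


def partial_queries(oligo, x_len, y_len):
    """Return the partial sequences usable as database queries, excluding site Y."""
    sites = split_oligo(oligo, x_len, y_len)
    return {"X": sites.x, "Z": sites.z}


queries = partial_queries("ACGTACGTAC" + "IIIII" + "TTGCA", x_len=10, y_len=5)
print(queries)
```

Each returned sequence can then be searched against the database on its own, so that bases in site Y never enter the homology calculation.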

The reference nucleotide sequences extracted according to any of the above embodiments are nucleotide sequences comprising a region homologous to the sequence of the X or Z site, or a part thereof.

FIG. 2 shows an exemplary procedure in which only the sequence of the X site in the oligonucleotide is compared with a nucleotide sequence database and reference nucleotide sequences comprising a region homologous to the X-site sequence are then extracted from the database.

Step (c): Match/mismatch analysis (130)

Next, the site-by-site match/mismatch between the oligonucleotide of formula (I) and each reference nucleotide sequence is analyzed to provide (i) the number or ratio of matched or mismatched bases between site X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or ratio of matched or mismatched bases between site Z of the oligonucleotide of formula (I) and each reference nucleotide sequence (130).

In this step, the match/mismatch between the oligonucleotide of formula (I) provided in step (a) and each reference nucleotide sequence extracted in step (b) is analyzed site by site.

As used herein, the term "site-by-site match/mismatch" means the match/mismatch at each site of the oligonucleotide of formula (I). The term is used interchangeably with "local match/mismatch".

Also, as used herein, the phrase "analyzing the site-by-site match/mismatch" refers to analyzing the match/mismatch for each site of the oligonucleotide of formula (I). Accordingly, "analyzing the site-by-site match/mismatch between the oligonucleotide of formula (I) and each reference nucleotide sequence" refers to analyzing the match/mismatch between the sequence of each of sites X and Z in the oligonucleotide of formula (I) and the sequence of the corresponding site in each reference nucleotide sequence.

The analysis of the site-by-site match/mismatch comprises comparing the sequence of site X in the oligonucleotide of formula (I) with the corresponding sequence in each reference nucleotide sequence to calculate the match/mismatch between them, and comparing the sequence of site Z in the oligonucleotide of formula (I) with the corresponding sequence in each reference nucleotide sequence to calculate the match/mismatch between them.

As a result, (i) the number or ratio of matched or mismatched bases between site X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or ratio of matched or mismatched bases between site Z of the oligonucleotide of formula (I) and each reference nucleotide sequence are provided.
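The separate results for sites X and Z can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the reference segment is taken to be already aligned to the oligonucleotide without gaps and at the same length, the site lengths are supplied by the caller, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteResult:
    """Match/mismatch counts for one site of the oligonucleotide."""
    matched: int
    mismatched: int

    @property
    def match_ratio(self):
        return self.matched / (self.matched + self.mismatched)


def count_site(oligo_site, ref_site):
    """Count matched and mismatched bases between two equal-length segments."""
    matched = sum(a == b for a, b in zip(oligo_site, ref_site))
    return SiteResult(matched, len(oligo_site) - matched)


def analyze_by_site(oligo, ref_segment, x_len, y_len):
    """Return separate match/mismatch results for sites X and Z; site Y is skipped."""
    if len(oligo) != len(ref_segment):
        raise ValueError("aligned sequences must have the same length")
    z_start = x_len + y_len
    return {
        "X": count_site(oligo[:x_len], ref_segment[:x_len]),
        "Z": count_site(oligo[z_start:], ref_segment[z_start:]),
    }


# Hypothetical 20-mer (X: 10 nt, Y: 5 universal bases, Z: 5 nt) and one reference.
report = analyze_by_site("ACGTACGTAC" + "IIIII" + "TTGCA",
                         "ACGTTCGTAC" + "GATCC" + "TTGCA", x_len=10, y_len=5)
for site, result in report.items():
    print(site, result.matched, result.mismatched, round(result.match_ratio, 3))
```

Running the same function over every extracted reference nucleotide sequence yields, per reference, one count for site X and one for site Z, which is the information on specificity that step (c) provides.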

The number or ratio of matched or mismatched bases at sites X and Z helps in evaluating the specificity of the oligonucleotide of formula (I). These values are therefore collectively referred to herein as information on specificity.

In the case of an oligonucleotide containing consecutive universal bases within its sequence, such as a dual specificity oligonucleotide, specificity is determined dually by the X site and the Z site, which are separated by the consecutive universal bases. Therefore, to evaluate the specificity of the oligonucleotide, it is very important to check the annealing specificity at each of the X and Z sites.

However, conventional sequence alignment algorithms or programs do not provide separate mismatch information for each of the X and Z sites as described above. Moreover, when the homology score between a reference nucleotide sequence and the entire sequence of the oligonucleotide is rather low, a conventional sequence alignment algorithm or program may provide a match/mismatch result only for a partial sequence of the oligonucleotide, not for its entire sequence. For example, when an oligonucleotide of 20 nucleotide residues is searched with BLAST, the BLAST algorithm may provide a match/mismatch result for a length of fewer than 20 nucleotides. In such a case, match/mismatch results may not be obtained for either or both of sites X and Z.
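The limitation described above can be made concrete by checking which sites a reported local alignment actually spans. The sketch assumes query coordinates that are 1-based and inclusive, as in the `qstart` and `qend` fields of BLAST tabular output; the function name and the site lengths are hypothetical.

```python
def covered_sites(qstart, qend, x_len, y_len, z_len):
    """Report whether a query alignment interval fully covers site X and site Z.

    qstart and qend are 1-based, inclusive positions on the oligonucleotide.
    """
    x_range = (1, x_len)
    z_range = (x_len + y_len + 1, x_len + y_len + z_len)

    def inside(site_range):
        return qstart <= site_range[0] and site_range[1] <= qend

    return {"X": inside(x_range), "Z": inside(z_range)}


# Hypothetical 20-mer with X = 10 nt, Y = 5 nt, Z = 5 nt.
print(covered_sites(1, 12, 10, 5, 5))   # alignment stops inside site Y
print(covered_sites(1, 20, 10, 5, 5))   # alignment spans the whole oligonucleotide
```

When the alignment ends inside site Y, as in the first call, the conventional result carries no match/mismatch information for site Z at all, which is the gap the site-by-site analysis is meant to close.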

By contrast, the method of the present invention provides separate match/mismatch results for the X and Z sites. The user can therefore evaluate the specificity of the oligonucleotide more accurately on the basis of these results.

According to the present invention, the number or ratio of matched or mismatched bases in each of the X and Z sites is provided for every extracted reference nucleotide sequence. Based on these results, the user can therefore check whether the designed oligonucleotide hybridizes only to the target nucleic acid sequence.

For an oligonucleotide in which, in terms of specificity, a match at the Z site is more important than a match at the X site, the presence of a mismatch between the Z site of the oligonucleotide and the target nucleic acid sequence gives the user strong grounds to select another oligonucleotide instead of the designed one. By contrast, an oligonucleotide with mismatched bases at the X site can still hybridize to the target nucleic acid sequence under certain conditions, so the presence of a mismatch at the X site gives the user a hint for deciding, in view of the hybridization conditions, whether to use the oligonucleotide. Thus, the match/mismatch results for the X and Z sites are very useful for evaluating the specificity of the oligonucleotide of formula (I).
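The reasoning above, for an oligonucleotide whose Z site matters more than its X site, can be summarized as a simple rule. The function name and the three labels are hypothetical, and the rule is only a sketch of the guidance in the text; the final decision remains with the user and depends on the hybridization conditions.

```python
def advise_on_target(x_mismatches, z_mismatches):
    """Suggest how to treat a designed oligonucleotide given mismatches to the target.

    Applies to designs where a match at site Z is more important than at site X.
    """
    if x_mismatches < 0 or z_mismatches < 0:
        raise ValueError("mismatch counts cannot be negative")
    if z_mismatches > 0:
        # A mismatch in site Z is strong grounds for choosing another oligonucleotide.
        return "select another oligonucleotide"
    if x_mismatches > 0:
        # A mismatch in site X may be tolerated under some hybridization conditions.
        return "review hybridization conditions"
    return "no mismatch to target"


print(advise_on_target(x_mismatches=1, z_mismatches=0))
print(advise_on_target(x_mismatches=0, z_mismatches=1))
```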

The number or ratio of matched or mismatched bases provided in this step may be calculated by comparing the sequence of the X portion of the oligonucleotide with the corresponding sequence in each reference nucleotide sequence, and by comparing the sequence of the Z portion of the oligonucleotide with the corresponding sequence in each reference nucleotide sequence.

In one embodiment, the entire sequence of the oligonucleotide of formula (I) is aligned with each extracted reference nucleotide sequence on the basis of its homologous region, and the number or ratio of matched or mismatched bases in the X and Z portions is then analyzed. In one embodiment, this alignment information (or result) may be obtained when the reference nucleotide sequences are extracted.

In one embodiment of the present invention, when the entire sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database and reference nucleotide sequences comprising a region homologous to the entire sequence of the oligonucleotide of formula (I) are extracted in step (b), the portion-wise matches/mismatches between the entire sequence of the oligonucleotide of formula (I) and the homologous region in each reference nucleotide sequence are analyzed, and the number or ratio of matched or mismatched bases in portions X and Z is provided.

For example, when the entire sequence of an oligonucleotide of formula (I) that is 40 nucleotides long is compared with at least one nucleotide sequence database and reference nucleotide sequences comprising a region (40 nucleotides long) homologous to the entire sequence of the oligonucleotide of formula (I) are extracted in step (b), the homologous region already contains the sequences corresponding to portions X and Z of the oligonucleotide of formula (I). The number or ratio of matched or mismatched bases in portions X and Z can therefore be calculated directly.
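As an illustrative sketch only (not the disclosed implementation), the direct calculation for a full-length homologous region might look as follows in Python. It assumes an ungapped alignment, a reference window written in the same orientation as the oligonucleotide, and plain character identity as the match criterion. The function name `count_xz_mismatches` and the example sequences are hypothetical.

```python
from typing import Tuple


def count_xz_mismatches(oligo: str, ref_window: str,
                        x_len: int, y_len: int) -> Tuple[int, int]:
    """Count mismatches in portion X and portion Z separately.

    Assumes oligo and ref_window are aligned base by base (no gaps)
    and read in the same 5'->3' orientation. Portion Y (the y_len
    bases that follow X) is skipped entirely.
    """
    if len(oligo) != len(ref_window):
        raise ValueError("full-length comparison needs equal lengths")
    z_start = x_len + y_len
    xm = sum(o != r for o, r in zip(oligo[:x_len], ref_window[:x_len]))
    zm = sum(o != r for o, r in zip(oligo[z_start:], ref_window[z_start:]))
    return xm, zm


# Hypothetical 19-mer: X = 8 nt, Y = 5 universal bases ("I"), Z = 6 nt.
oligo = "ACGTACGT" + "IIIII" + "TTGACC"
ref_window = "ACGAACGT" + "CCCCC" + "TTGACC"
print(count_xz_mismatches(oligo, ref_window, x_len=8, y_len=5))  # (1, 0)
```

The Y portion is excluded by slicing rather than by masking, which keeps the X and Z counts independent of whatever bases the reference carries opposite Y.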

In another embodiment of the present invention, when the entire sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database and reference nucleotide sequences comprising a region homologous to a partial sequence of the oligonucleotide of formula (I) are extracted in step (b), the portion-wise matches/mismatches between the entire sequence of the oligonucleotide of formula (I) and the homologous region of each reference nucleotide sequence together with its flanking region are analyzed, and the number or ratio of matched or mismatched bases in portions X and Z is provided.

For example, when the entire sequence of an oligonucleotide of formula (I) that is 40 nucleotides long is compared with at least one nucleotide sequence database and reference nucleotide sequences comprising a region (e.g., 10-15, 10-20, 10-30 or 10-35 nucleotides long) homologous to a partial sequence of the oligonucleotide of formula (I) are extracted in step (b), the homologous region does not contain the complete sequences corresponding to portions X and Z of the oligonucleotide of formula (I), so the number or ratio of matched or mismatched bases in portions X and Z cannot be calculated directly. In this case, the flanking region is used in addition to the homologous region to calculate the number or ratio of matched or mismatched bases in portions X and Z. That is, the entire sequence of the oligonucleotide of formula (I) is compared with the corresponding sequence in each reference nucleotide sequence, comprising the homologous region and its flanking region, to calculate the number or ratio of matched or mismatched bases in portions X and Z.

The flanking region refers to the region of the reference nucleotide sequence other than the homologous region. For example, when the homologous region is homologous to portion X of the oligonucleotide of formula (I), the flanking region includes the region corresponding to the Y portion and the region corresponding to the Z portion. When the homologous region is homologous to portion Z of the oligonucleotide of formula (I), the flanking region includes the region corresponding to the Y portion and the region corresponding to the X portion.

In still another embodiment of the present invention, when a partial sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database and reference nucleotide sequences comprising a region homologous to a partial sequence of the oligonucleotide of formula (I) are extracted in step (b), the portion-wise matches/mismatches between the entire sequence of the oligonucleotide of formula (I) and the homologous region of each reference nucleotide sequence together with its flanking region are analyzed, and the number or ratio of matched or mismatched bases in portions X and Z is provided.

For example, when a partial sequence (e.g., 10-15, 10-20, 10-30 or 10-35 nucleotides long) of an oligonucleotide of formula (I) that is 40 nucleotides long is compared with at least one nucleotide sequence database and reference nucleotide sequences comprising a region (e.g., 10-15, 10-20, 10-30 or 10-35 nucleotides long) homologous to that partial sequence are extracted in step (b), the homologous region does not contain the complete sequences corresponding to portions X and Z of the oligonucleotide of formula (I), so the number or ratio of matched or mismatched bases in portions X and Z cannot be calculated directly from the homologous region alone. In this case, the flanking region is used in addition to the homologous region to calculate the number or ratio of matched or mismatched bases in portions X and Z. That is, the entire sequence of the oligonucleotide of formula (I) is compared with the corresponding sequence in each reference nucleotide sequence, comprising the homologous region and its flanking region, to calculate the number or ratio of matched or mismatched bases in portions X and Z.

As described above, the homologous region in each reference nucleotide sequence may be equal in length to, or shorter than, the oligonucleotide of formula (I) provided in step (a). Specifically, when the entire sequence of the oligonucleotide of formula (I) is compared with a nucleotide sequence database and the number of bases in the Y portion that do not participate in Watson-Crick base pairing is relatively small, reference nucleotide sequences comprising a region homologous to the entire sequence of the oligonucleotide of formula (I) may be extracted. Conversely, when the number of bases in the Y portion that do not participate in Watson-Crick base pairing is relatively large, reference nucleotide sequences comprising a region homologous to a partial sequence of the oligonucleotide of formula (I) may be extracted. Likewise, when a partial sequence of the oligonucleotide of formula (I) is used for the comparison, reference nucleotide sequences comprising a region homologous to a partial sequence of the oligonucleotide of formula (I) may be extracted.

Such a comparison or analysis may also be called an "extension of the comparison", in that the comparison in step (b) uses a partial sequence of the oligonucleotide whereas the comparison in step (c) uses its entire sequence.

When a reference nucleotide sequence comprising a region homologous to a partial sequence of the oligonucleotide of formula (I) has been extracted, the homologous region may be extended before the number or ratio of matched or mismatched bases in portions X and Z is calculated. Extending the homologous region to calculate the number or ratio of matched or mismatched bases means that the homologous region is extended to the sequence corresponding to the entire sequence of the oligonucleotide, after which the number or ratio of matched or mismatched bases is calculated. In other words, the sequence of the flanking region is retrieved (or restored) from the extracted nucleic acid sequence or from the database, and the number or ratio of matched or mismatched bases in portions X and Z is then calculated.
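The extension of a partial hit to the full oligonucleotide length can be sketched as below. This is a minimal Python illustration under stated assumptions: the hit is ungapped, positions are 0-based, and the reference is written in the same orientation as the oligonucleotide. The function names and the sequences are hypothetical and are not taken from the disclosure.

```python
from typing import Optional, Tuple


def extend_to_full_window(reference: str, hit_ref_start: int,
                          hit_oligo_start: int, oligo_len: int) -> Optional[str]:
    """Return the reference window that spans the whole oligonucleotide.

    hit_ref_start and hit_oligo_start are the start positions of the
    homologous region in the reference and in the oligonucleotide.
    Returns None if the flanking region would run past either end
    of the reference.
    """
    win_start = hit_ref_start - hit_oligo_start
    win_end = win_start + oligo_len
    if win_start < 0 or win_end > len(reference):
        return None
    return reference[win_start:win_end]


def xz_mismatches(oligo: str, window: str,
                  x_len: int, y_len: int) -> Tuple[int, int]:
    """Count mismatches in X and Z after extension; Y is ignored."""
    z_start = x_len + y_len
    xm = sum(o != r for o, r in zip(oligo[:x_len], window[:x_len]))
    zm = sum(o != r for o, r in zip(oligo[z_start:], window[z_start:]))
    return xm, zm


oligo = "ACGTACGT" + "IIIII" + "TTGACC"           # X (8) + Y (5) + Z (6)
reference = "GGG" + "ACGAACGTCCCCCTTGACC" + "AAA"
# Suppose only the Z portion (oligo positions 13..18) was reported as
# homologous, starting at reference position 16.
window = extend_to_full_window(reference, hit_ref_start=16,
                               hit_oligo_start=13, oligo_len=len(oligo))
print(window)                                     # ACGAACGTCCCCCTTGACC
print(xz_mismatches(oligo, window, 8, 5))         # (1, 0)
```

Returning `None` for a window that overruns the reference leaves it to the caller to decide whether such truncated references are skipped or reported separately.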

The process of obtaining match/mismatch results for portions X and Z using a partial sequence of the oligonucleotide of formula (I) is illustrated in FIG. 2.

As shown in FIG. 2, when a reference nucleotide sequence comprising a region homologous to the X portion of the oligonucleotide of formula (I) is extracted, the flanking region lying opposite the Z portion is retrieved from the database or from the extracted reference nucleotide sequence, and the number of mismatched bases in portions X and Z is calculated. Conversely, when a reference nucleotide sequence comprising a region homologous to the Z portion of the oligonucleotide of formula (I) is extracted, the flanking region lying opposite the X portion is retrieved from the database or from the extracted reference nucleotide sequence, and the number of mismatched bases in portions X and Z is calculated.

In the oligonucleotide of formula (I), the bases contained in the Y portion hybridize to the corresponding bases in the target nucleic acid sequence with relatively low affinity compared with bases that form Watson-Crick base pairs. That is, when the oligonucleotide of formula (I) hybridizes to the target nucleic acid sequence, the Y portion may form a loop structure. Such loop formation in the Y portion can shorten the distance between the region to which the X portion hybridizes and the region to which the Z portion hybridizes.

Accordingly, in view of this variability in hybridization, the flanking region opposite the portion of interest (X or Z) that is used to calculate the number or ratio of matched or mismatched bases is determined by taking into account both the portion of interest and the regions that may lie opposite it.

For example, assuming that the Y portion contains five bases in total, when a reference nucleotide sequence comprising a region homologous to the X portion of the oligonucleotide of formula (I) is extracted, the flanking region opposite the Z portion is ordinarily the region spaced 5 nucleotides from the homologous region to which the X portion hybridizes. Owing to loop formation in the Y portion, however, it may instead be the region spaced 4 nucleotides or 3 nucleotides from that homologous region.

For example, when the Y portion contains five bases in total, the number of matched or mismatched bases may be calculated between the Z portion and the region spaced 5 nucleotides from the region to which the X portion hybridizes, between the Z portion and the region spaced 4 nucleotides from that region, and between the Z portion and the region spaced 3 nucleotides from that region.
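The spacing variants described above can be sketched as follows. This is a hedged Python illustration that assumes an ungapped comparison, a reference in the same orientation as the oligonucleotide, and a known end position of the X-homologous region. The function name `z_mismatches_by_spacing` and the sequences are hypothetical.

```python
from typing import Dict, Iterable


def z_mismatches_by_spacing(z_portion: str, reference: str, x_end: int,
                            spacings: Iterable[int]) -> Dict[int, int]:
    """Compare portion Z against the reference at several X-to-Z spacings.

    x_end is the 0-based position just past the region homologous to X.
    For each spacing, Z is compared with the reference region that starts
    that many nucleotides downstream of x_end. Spacings whose window would
    run past the end of the reference are left out of the result.
    """
    results = {}
    for spacing in spacings:
        start = x_end + spacing
        window = reference[start:start + len(z_portion)]
        if len(window) < len(z_portion):
            continue
        results[spacing] = sum(z != r for z, r in zip(z_portion, window))
    return results


# Hypothetical reference in which the Z-complementary region sits only
# 4 nt away from the X-homologous region, although Y is 5 nt long.
reference = "ACGTACGT" + "CCCC" + "TTGACC" + "GG"
print(z_mismatches_by_spacing("TTGACC", reference, x_end=8, spacings=(5, 4, 3)))
# {5: 4, 4: 0, 3: 4}
```

In this example the 4-nucleotide spacing gives a perfect Z match that a fixed 5-nucleotide spacing would miss, which is the situation the loop-formation discussion is concerned with.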

In one embodiment, the number of matched bases in each of portions X and Z is provided.

In one embodiment, the ratio of the number of mismatched bases to the number of matched bases in each of portions X and Z is provided.

In one embodiment, the ratio of the number of mismatched bases to the total number of nucleotides in each of portions X and Z is provided.

In one embodiment, the ratio of the number of matched bases to the number of mismatched bases in each of portions X and Z is provided.

In one embodiment, the ratio of the number of matched bases to the total number of nucleotides in each of portions X and Z is provided.
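The ratios named in these embodiments can be derived from the matched and mismatched counts of a single portion, as in the following Python sketch. The function name `portion_ratios` and the dictionary keys are hypothetical, and returning `None` for a zero denominator is an assumption made here for illustration.

```python
from typing import Dict, Optional


def portion_ratios(matched: int, mismatched: int) -> Dict[str, Optional[float]]:
    """Return the four ratios for one portion (X or Z).

    A ratio whose denominator is zero is reported as None.
    """
    total = matched + mismatched

    def ratio(numerator: int, denominator: int) -> Optional[float]:
        return numerator / denominator if denominator else None

    return {
        "mismatch/match": ratio(mismatched, matched),
        "mismatch/total": ratio(mismatched, total),
        "match/mismatch": ratio(matched, mismatched),
        "match/total": ratio(matched, total),
    }


# A 15-nt portion with 12 matched and 3 mismatched bases.
print(portion_ratios(12, 3))
```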

When either or both of portions X and Z contain at least one universal base or degenerate base, the method of the present invention may change the criterion by which the universal base or degenerate base is treated as a match or a mismatch, and then provide, in step (c), the number of matched or mismatched bases based on the changed criterion.

In one embodiment of the present invention, when either or both of portions X and Z of the oligonucleotide of formula (I) contain at least one universal base, the universal base may not be counted as a mismatched base in step (c). That is, when at least one universal base is present in either or both of portions X and Z of the oligonucleotide of formula (I), the universal base is treated as a matched base regardless of the type of the corresponding nucleotide in each reference nucleotide sequence. For example, if an X portion consisting of 15 nucleotides contains three mismatched bases and, in addition, one universal base, an embodiment of the present invention may determine the total number of mismatched bases to be three.

In embodiments that provide the number of matched bases in portions X and Z, the universal base may or may not be counted as a matched base. For example, when an X portion 15 nucleotides long contains three mismatched bases and, in addition, one universal base, the total number of matched bases in the X portion may be determined to be 12. Alternatively, the total number of matched bases in the X portion may be determined to be 11.
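The two counting conventions for universal bases can be illustrated with the short Python sketch below. It assumes that a universal base is written as "I" (for example, deoxyinosine) in the oligonucleotide sequence; that symbol, the function name `count_with_universal` and the sequences are assumptions made for illustration.

```python
from typing import Tuple

UNIVERSAL_BASES = {"I"}  # assumption: universal bases are encoded as "I"


def count_with_universal(oligo_part: str, ref_part: str,
                         universal_is_match: bool = True) -> Tuple[int, int]:
    """Return (matched, mismatched) for one portion.

    A universal base is never counted as a mismatch. It is counted as
    a match only when universal_is_match is True.
    """
    matched = mismatched = 0
    for o, r in zip(oligo_part, ref_part):
        if o in UNIVERSAL_BASES:
            if universal_is_match:
                matched += 1
        elif o == r:
            matched += 1
        else:
            mismatched += 1
    return matched, mismatched


# 15-nt X portion with one universal base (position 8) and three mismatches.
x_oligo = "ACGTACGIACGTACG"
x_ref = "TCGTACGAACGAACC"
print(count_with_universal(x_oligo, x_ref, universal_is_match=True))   # (12, 3)
print(count_with_universal(x_oligo, x_ref, universal_is_match=False))  # (11, 3)
```

Both calls report three mismatches; only the matched count changes between 12 and 11, mirroring the 15-nucleotide example in the text.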

In one embodiment of the present invention, when either or both of portions X and Z of the oligonucleotide of formula (I) contain at least one degenerate base, the method of the present invention takes into account the match between the degenerate base and the corresponding base in the reference nucleotide sequence. That is, when a degenerate base is present in either or both of portions X and Z of the oligonucleotide of formula (I), the degenerate base may or may not be counted as a mismatched base in step (c), depending on the type of the degenerate base (i.e., on the bases that the degenerate base represents).

In a particular embodiment, when either or both of portions X and Z of the oligonucleotide of formula (I) contain at least one degenerate base, the degenerate base is not counted as a mismatched base in step (c) if any one of the bases it represents matches the corresponding base in the reference nucleotide sequence. Conventional sequence alignment algorithms or programs such as BLAST treat a degenerate base as a mismatch regardless of its type. The method of the present invention, in contrast, is characterized in that a match or mismatch is determined on the basis of the type of the degenerate base. For example, when the degenerate base "R" (either of the purine bases A or G) is present in the oligonucleotide and the corresponding base in the reference nucleotide sequence being compared is adenine (A) or guanine (G), the method of the present invention treats the degenerate base as a match. When the corresponding base in the reference nucleotide sequence being compared is cytosine (C) or thymine (T), the method of the present invention treats the degenerate base as a mismatch. Compared with conventional sequence alignment algorithms, the method of the present invention can therefore produce more accurate match/mismatch results even when a degenerate base is present within the oligonucleotide of formula (I).
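A degenerate-base-aware match test of this kind might be sketched as follows in Python. The table uses the standard IUPAC nucleotide ambiguity codes; the names `IUPAC`, `is_match` and `count_mismatches` are hypothetical, and the reference is assumed to contain only A, C, G and T.

```python
# Standard IUPAC nucleotide codes: each symbol maps to the bases it represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def is_match(oligo_base: str, ref_base: str) -> bool:
    """True if the reference base is one of the bases the oligo base represents."""
    return ref_base in IUPAC.get(oligo_base, oligo_base)


def count_mismatches(oligo_part: str, ref_part: str) -> int:
    """Count mismatches in one portion, honouring degenerate bases in the oligo."""
    return sum(not is_match(o, r) for o, r in zip(oligo_part, ref_part))


print(is_match("R", "A"), is_match("R", "G"))  # True True
print(is_match("R", "C"), is_match("R", "T"))  # False False
print(count_mismatches("ACRT", "ACGT"))        # 0
print(count_mismatches("ACRT", "ACCT"))        # 1
```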

In still another embodiment of the present invention, when either or both of portions X and Z of the oligonucleotide of formula (I) contain at least one degenerate base, the degenerate base is converted into each of the bases it represents, and steps (b) and (c) are then carried out.

For example, when the degenerate base "R" (either of the purine bases A or G) is present in the oligonucleotide of formula (I), a first oligonucleotide in which the degenerate base "R" has been converted to adenine (A) and a second oligonucleotide in which the degenerate base "R" has been converted to guanine (G) are prepared, and the method of the present invention is carried out on each. This approach prevents the degenerate base from being judged a mismatch and thereby affecting the extraction of nucleotide sequences having a homologous region.
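The conversion of a degenerate oligonucleotide into its concrete variants can be sketched in Python with `itertools.product`. The `IUPAC` table holds the standard ambiguity codes; the function name `expand_degenerate` is hypothetical, and symbols not in the table (for example, a universal base written as "I") are passed through unchanged as an assumption of this sketch.

```python
from itertools import product
from typing import List

# Standard IUPAC ambiguity codes; unambiguous bases are handled by the default.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> List[str]:
    """Return every concrete oligonucleotide encoded by a degenerate one."""
    choices = [IUPAC.get(base, base) for base in oligo]
    return ["".join(combo) for combo in product(*choices)]


print(expand_degenerate("ACRT"))   # ['ACAT', 'ACGT']
print(len(expand_degenerate("RY")))  # 4
```

Each expanded sequence would then be searched and analyzed separately. The number of variants grows multiplicatively with the number of degenerate positions, so this approach is practical mainly for oligonucleotides with few such positions.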

According to one embodiment, the number or ratio of matched or mismatched bases between each of portions X and Z of the oligonucleotide of formula (I) and each reference nucleotide sequence may be expressed in various ways.

For example, the number of mismatched bases in the X portion and the number of mismatched bases in the Z portion may be expressed together as Xm|Zm, (Xm, Zm), Xm-Zm, Xm & Zm, and the like, where Xm denotes the number of mismatched bases in the X portion and Zm denotes the number of mismatched bases in the Z portion.

For example, the notation "0|0" indicates that the number of mismatched bases between the X portion and the reference nucleotide sequence is zero and that the number of mismatched bases between the Z portion and the reference nucleotide sequence is zero. In other words, this notation means that the oligonucleotide of formula (I), excluding the Y portion, is perfectly matched to the reference nucleotide sequence. Likewise, "1|0" indicates that the number of mismatched bases between the X portion and the reference nucleotide sequence is one and that the number of mismatched bases between the Z portion and the reference nucleotide sequence is zero, and "0|1" indicates that the number of mismatched bases between the X portion and the reference nucleotide sequence is zero and that the number of mismatched bases between the Z portion and the reference nucleotide sequence is one.
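As a small illustration of this notation, the Python sketch below formats per-reference results as "Xm|Zm" labels and tallies how many reference sequences fall under each label. The function names and the input list are hypothetical; the tally is an illustrative convenience and is not described in the text.

```python
from collections import Counter
from typing import Iterable, Tuple


def xz_label(xm: int, zm: int) -> str:
    """Format mismatch counts in the 'Xm|Zm' notation."""
    return f"{xm}|{zm}"


def tally_labels(results: Iterable[Tuple[int, int]]) -> Counter:
    """Count how many reference sequences share each 'Xm|Zm' label."""
    return Counter(xz_label(xm, zm) for xm, zm in results)


# Hypothetical (Xm, Zm) results for four extracted reference sequences.
results = [(0, 0), (1, 0), (0, 1), (0, 0)]
print(xz_label(1, 0))         # 1|0
print(tally_labels(results))  # Counter({'0|0': 2, '1|0': 1, '0|1': 1})
```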

Those skilled in the art will understand that, besides the notations above, the number of mismatched bases in portions X and Z may be expressed in other ways.

In one embodiment, the total number of nucleotides in each of portions X and Z, or the number of matched bases in each of portions X and Z, may additionally be indicated.

The number of mismatched bases is closely related to the specificity of the oligonucleotide of formula (I).

The numbers of mismatched bases in the X and Z portions may affect the evaluation of the specificity of the oligonucleotide of formula (I), in particular its annealing specificity, to different extents, whereas the Y portion does not affect the evaluation of specificity. As discussed above, the number of mismatched bases in the X portion and the number of mismatched bases in the Z portion adversely affect the specificity of the oligonucleotide to different degrees. Taking this difference into account, the method of the present invention may assign different weights to the two values, namely the number or ratio of mismatched bases in the X portion and the number or ratio of mismatched bases in the Z portion, in order to evaluate the specificity of the oligonucleotide of formula (I) more accurately.

According to one embodiment, a match in the Z portion is more important than a match in the X portion in determining specificity (e.g., the dual specificity oligonucleotide disclosed in WO 2006/095981). In this case, an oligonucleotide having one mismatch in the Z portion may be evaluated as less specific than an oligonucleotide having one mismatch in the X portion. An oligonucleotide having one mismatch in the Z portion may even be evaluated as less specific than an oligonucleotide having two, three or four mismatches in the X portion. Given that the number of mismatched bases in the X portion and the number of mismatched bases in the Z portion affect the evaluation of specificity differently, as described above, the weight assigned to the number of mismatched bases in the Z portion may be greater than the weight assigned to the number of mismatched bases in the X portion. The weights may be assigned in various ways by those skilled in the art.

According to another embodiment, a match in the X portion is more important than a match in the Z portion in determining specificity (see, e.g., the target discriminative (TD) probe disclosed in WO 2011/028041). In this case, the weight assigned to the number of mismatched bases in the X portion may be greater than the weight assigned to the number of mismatched bases in the Z portion.

In addition, an embodiment of the present invention may assign a penalty score to the oligonucleotide of formula (I) based on the number of mismatched bases in the X and Z portions. The penalty score is a value reflecting the reduction in specificity of the oligonucleotide of formula (I).

The penalty score may be assigned per mismatched base. The penalty score assigned per mismatched base in the X portion and the penalty score assigned per mismatched base in the Z portion may differ from each other.

In one embodiment, when a match in the Z portion is more important than a match in the X portion in determining specificity, the penalty score assigned per mismatched base in the X portion is smaller than the penalty score assigned per mismatched base in the Z portion. This difference in penalty scores can be achieved by assigning weighted penalty scores. For example, if the specificity of an oligonucleotide of formula (I) having no mismatched base in either the X or the Z portion (i.e., an oligonucleotide whose portions X and Z are perfectly matched to the target nucleic acid sequence) is taken to be "100", a penalty score of "10" may be assigned per mismatched base in the X portion, and a penalty score of "20", "30", "40", "50" or "60" may be assigned per mismatched base in the Z portion. In this case, the specificity of an oligonucleotide having one mismatched base in the X portion would be "90" (= 100 - 10), and the specificity of an oligonucleotide having one mismatched base in the Z portion would be "80", "70", "60", "50" or "40", respectively. In this way, the present invention enables accurate evaluation of specificity by assigning different weighted penalty scores to portion X and portion Z according to the number of mismatched bases in portions X and Z.
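The weighted penalty arithmetic in this example can be written out as a short Python sketch. The function name `specificity_score` and its parameter names are hypothetical; the default values (100 for a perfect match, 10 per X mismatch, 20 per Z mismatch) are taken from the example figures, and the simple linear subtraction is an assumption of this sketch rather than a formula stated in the text.

```python
def specificity_score(xm: int, zm: int, x_penalty: int = 10,
                      z_penalty: int = 20, perfect: int = 100) -> int:
    """Subtract a per-base penalty for each mismatch in X and in Z.

    xm, zm: mismatched bases in portions X and Z.
    x_penalty, z_penalty: penalty per mismatched base in X and in Z.
    """
    return perfect - x_penalty * xm - z_penalty * zm


print(specificity_score(0, 0))                # 100
print(specificity_score(1, 0))                # 90
for z_penalty in (20, 30, 40, 50, 60):
    print(specificity_score(0, 1, z_penalty=z_penalty))  # 80, 70, 60, 50, 40
```

For the opposite case, in which the X portion matters more, the same function can be called with `x_penalty` larger than `z_penalty`.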

In another embodiment, when a match in the X portion is more important than a match in the Z portion in determining specificity, the penalty score assigned per mismatched base in the Z portion is smaller than the penalty score assigned per mismatched base in the X portion.

Unlike the X and Z portions described above, the Y portion does not affect the evaluation of the specificity of the oligonucleotide and is therefore not considered in evaluating specificity.

Because the present invention provides the match/mismatch results for the X and Z sites of the oligonucleotide separately, the specificity of each of the X and Z sites can be evaluated individually from the match/mismatch result for that site.

In one embodiment, the specificity of each of the X and Z sites can be determined by evaluating the match/mismatch result at each site against a different criterion (e.g., a different match/mismatch threshold).

For example, for the specificity of the X site, it is determined whether there are two or fewer mismatches between the X site and the reference nucleotide sequence, and for the specificity of the Z site, it is determined whether there is one or fewer mismatches between the Z site and the reference nucleotide sequence.

The specificity of the oligonucleotide can be evaluated by combining the specificity evaluations at each site to determine the nucleotide sequences to which the oligonucleotide anneals or hybridizes.
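As a hedged sketch of the per-site thresholds in the preceding example, the following Python function applies a separate mismatch limit to each site and combines the two results. The name `passes_site_thresholds` is hypothetical, and combining the two site-level checks with a logical AND is an assumption; the text only says that the evaluations are combined.

```python
def passes_site_thresholds(x_mismatches, z_mismatches, max_x=2, max_z=1):
    """Check each site against its own mismatch threshold.

    Defaults follow the example in the text: at most 2 mismatches at
    site X and at most 1 mismatch at site Z. Requiring both checks to
    pass (logical AND) is an assumption made for this sketch.
    """
    x_ok = x_mismatches <= max_x
    z_ok = z_mismatches <= max_z
    return x_ok and z_ok


print(passes_site_thresholds(2, 1))  # True
print(passes_site_thresholds(0, 2))  # False
```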

In one embodiment, the number of matches/mismatches allowed per site is specified in order to evaluate specificity, and the coverage, inclusivity and exclusivity of the oligonucleotide can then be evaluated. The coverage, inclusivity and exclusivity of the oligonucleotide can also be adjusted, if necessary, by adjusting the hybridization conditions and the like.

In addition to the number or proportion of matched or mismatched bases at sites X and Z, the method of the present invention can further provide the match direction between the oligonucleotide of formula (I) and each reference nucleotide sequence. Specifically, the match direction can be provided to distinguish an oligonucleotide of formula (I) matched to the (+) strand (coding strand, sense strand, non-template strand) of the reference nucleotide sequence from one matched to the (-) strand (non-coding strand, antisense strand, template strand) of the reference nucleotide sequence. For example, when the oligonucleotide of formula (I) is matched to the (+) strand of the reference nucleotide sequence, an indication such as "F" or "+" can be provided; otherwise, an indication such as "R" or "-" can be provided. The match direction can be provided together with the number of mismatched bases at the X and Z sites described above. For example, notations such as "F Xm|Zm", "+ Xm|Zm", "R Xm|Zm" and "- Xm|Zm" can be used. This notation is illustrated in Figure 3. As shown in Figure 3, the notation "- 1|0" indicates, simply and intuitively, that the oligonucleotide of formula (I) is matched to the (-) strand of the reference nucleotide sequence and that it has one mismatched base at the X site and zero mismatched bases at the Z site.
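The notation described above can be produced with a small formatting helper. The sketch below is only illustrative; the function name `format_match` and its `letters` flag are hypothetical and are not part of the described method.

```python
def format_match(strand, x_mismatches, z_mismatches, letters=False):
    """Format a result as '<direction> Xm|Zm', e.g. '- 1|0'.

    strand: '+' for a match to the (+) strand, '-' for the (-) strand.
    letters: if True, use 'F'/'R' instead of '+'/'-'.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    label = {"+": "F", "-": "R"}[strand] if letters else strand
    return f"{label} {x_mismatches}|{z_mismatches}"


print(format_match("-", 1, 0))                # - 1|0
print(format_match("+", 0, 2, letters=True))  # F 0|2
```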

The method of the present invention can further provide biological characteristics of the reference nucleotide sequences.

The biological characteristics of a reference nucleotide sequence include the source, gene ID, or description of the extracted reference nucleotide sequence. They may also include the position of the region corresponding to the oligonucleotide (for example, the position numbers of the nucleotides at its 5' end and 3' end). They may further include a list of reference nucleotide sequences having substantial homology with the oligonucleotide of interest. The biological characteristics may include one or more features provided by a sequence alignment algorithm or program, such as the conventional BLAST algorithm. The biological characteristics of the reference nucleotide sequences can be useful in evaluating the specificity of the oligonucleotide. The user analyzes the list of reference nucleotide sequences containing a region homologous to the oligonucleotide of interest, together with their specific sequence information, to determine whether the designed oligonucleotide amplifies or detects (or hybridizes to) only the target nucleic acid sequence and not non-target nucleic acid sequences. Furthermore, the degree of mismatch of the oligonucleotide, specifically the degree of mismatch at the X and Z sites relative to the target nucleic acid sequence, can be controlled. The presence of the target nucleic acid sequence and the absence of non-target nucleic acid sequences in the list of reference nucleotide sequences indicates that the oligonucleotide is suitable for amplifying or detecting the target nucleic acid sequence. Conversely, the presence of a non-target nucleic acid sequence in the list indicates that the oligonucleotide is not suitable for amplifying or detecting the target nucleic acid sequence, which is a strong reason to select a different oligonucleotide.

The biological characteristics of the reference nucleotide sequences include information that helps determine the target coverage of the oligonucleotide.

The method of the present invention can further provide a classification of the reference nucleotide sequences according to the number of mismatched bases at the X site and the number of mismatched bases at the Z site.

The user needs to identify the reference nucleotide sequences that are homologous to the designed oligonucleotide, so providing this classification is very useful for determining the specificity of the designed oligonucleotide.

The classification of the reference nucleotide sequences is the result obtained by grouping (classifying) the reference nucleotide sequences based on the number of mismatched bases at the X site and the number of mismatched bases at the Z site. It includes, for example, the list and number of reference nucleotide sequences belonging to each group and the biological characteristics of each reference nucleotide sequence.

Under certain hybridization conditions, a primer or probe can also hybridize to a reference nucleotide sequence with which it has several mismatches. Therefore, to evaluate the suitability or operability of a designed primer or probe, it is necessary to identify not only the perfectly matched reference nucleotide sequences but also the partially matched ones. To this end, the method of the present invention provides the list and number of reference nucleotide sequences belonging to each group, and the biological characteristics of each reference nucleotide sequence, in a simple and intuitive manner.

Specifically, the number of reference nucleotide sequences having a mismatch type of "0|0" with respect to the oligonucleotide of formula (I) (zero mismatched bases at the X site and zero mismatched bases at the Z site) can be provided. Likewise, the number of reference nucleotide sequences having a mismatch type of "1|0" (one mismatched base at the X site and zero mismatched bases at the Z site), "0|1", "1|2", "2|2", "3|0", "3|1", "3|2" and so on can be provided.

For example, if the number of reference nucleotide sequences belonging to mismatch type "0|0" is given as "30", this means that there are 30 reference nucleotide sequences that are 100% identical to the X and Z sites of the oligonucleotide of formula (I). For an accurate evaluation of the specificity of the oligonucleotide of formula (I), a person skilled in the art will also consider the information on the reference nucleotide sequences of types "1|0", "0|1", "1|1", "2|0", "2|1", "0|2", "2|2", "3|0", "3|1", "3|2" and so on.
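The grouping of reference nucleotide sequences by mismatch type can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `classify_by_mismatch_type`, the `(identifier, x_mismatches, z_mismatches)` tuple layout, and the identifiers `ref_A` to `ref_D` are all hypothetical.

```python
from collections import defaultdict


def classify_by_mismatch_type(hits):
    """Group reference sequences by their 'Xm|Zm' mismatch type.

    hits: iterable of (reference_id, x_mismatches, z_mismatches).
    Returns a dict mapping each mismatch type to the list of
    reference identifiers in that group.
    """
    groups = defaultdict(list)
    for ref_id, x_mm, z_mm in hits:
        groups[f"{x_mm}|{z_mm}"].append(ref_id)
    return dict(groups)


# Hypothetical results for four reference sequences.
hits = [("ref_A", 0, 0), ("ref_B", 1, 0), ("ref_C", 0, 0), ("ref_D", 0, 1)]
groups = classify_by_mismatch_type(hits)
for mismatch_type, members in sorted(groups.items()):
    print(mismatch_type, len(members), members)
```

The per-group count is simply the length of each list, so the same structure yields both the list and the number of sequences in each group.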

When a match at the Z site is more important than a match at the X site in evaluating specificity, a reference nucleotide sequence of mismatch type "1|0" is likely to be amplified or detected using the oligonucleotide of formula (I). Therefore, if a non-target nucleic acid sequence is present among the reference nucleotide sequences of mismatch type "1|0", the user may design another oligonucleotide to avoid amplifying or detecting that non-target sequence, or may disregard its amplification or detection if such non-target sequences are few or of low importance. If a target nucleic acid sequence is present in mismatch type "0|1", that target sequence may not be amplified or detected using the oligonucleotide of formula (I). The user may therefore modify the sequence of the oligonucleotide of formula (I) (e.g., by incorporating a degenerate base) or design another oligonucleotide in order to cover the target nucleic acid sequences of mismatch type "0|1". In addition, if a non-target nucleic acid sequence is present among the reference nucleotide sequences of mismatch type "0|1", it is preferable to check whether that non-target sequence is amplified or detected before deciding whether to use the oligonucleotide. The classification of reference nucleotide sequences based on the number of mismatched bases at the X site and at the Z site is useful for evaluating the specificity of the designed oligonucleotide in a simple and intuitive manner.

The classification result may further include information about each reference nucleotide sequence.

The information provided can also be used to determine whether the oligonucleotide shows the same match results as those considered at the time of its initial design. For example, suppose the oligonucleotide of formula (I) was designed to match five target nucleic acid sequences with mismatch type "0|0" (zero mismatched bases at the X site and zero at the Z site), three target nucleic acid sequences with mismatch type "1|0", and two target nucleic acid sequences with mismatch type "1|1". The designed oligonucleotide can then be compared with a database containing only those target nucleic acid sequences, and the number of target sequences belonging to each of the mismatch types "0|0", "1|0" and "1|1" can be checked to determine whether the results match those considered at design time.
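The design check in this example (5 targets at "0|0", 3 at "1|0", 2 at "1|1") amounts to comparing expected and observed counts per mismatch type. A minimal sketch, assuming the observed mismatch types are available as a list of strings; the function name `matches_design` is hypothetical:

```python
from collections import Counter


def matches_design(expected_counts, observed_types):
    """Return True if the observed per-type counts equal the expected ones.

    expected_counts: dict such as {"0|0": 5, "1|0": 3, "1|1": 2}.
    observed_types: one 'Xm|Zm' string per target sequence in the
    target-only database.
    """
    return Counter(observed_types) == Counter(expected_counts)


expected = {"0|0": 5, "1|0": 3, "1|1": 2}
observed = ["0|0"] * 5 + ["1|0"] * 3 + ["1|1"] * 2
print(matches_design(expected, observed))  # True
```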

In addition, the classification results can be used to confirm the coverage of the oligonucleotide of formula (I). Because the user can analyze the classification results and identify the target nucleic acid sequences that are amplified or detected using the designed oligonucleotide, the classification results serve to confirm the coverage of the oligonucleotide of formula (I).

The method of the present invention can further provide information on the sequence similarity between the oligonucleotide and each reference nucleotide sequence.

The similarity information can be expressed in various ways. In one embodiment, it can be expressed as the number of matched nucleotides relative to the total number of nucleotides in the designed oligonucleotide, or as the corresponding percent-identity score.

In particular, the similarity can be calculated by excluding the similarity between site Y of the oligonucleotide and the corresponding region of the reference nucleotide sequence. For example, when the X site is p nucleotides long, the Y site is q nucleotides long, and the Z site is r nucleotides long, the similarity (%) can be calculated as [(total number of matched nucleotides at the X and Z sites) / (p + r)] * 100.

Alternatively, the similarity can be calculated by treating site Y of the oligonucleotide and the corresponding region of the reference nucleotide sequence as matching each other. For example, when the X site is p nucleotides long, the Y site is q nucleotides long, and the Z site is r nucleotides long, the similarity (%) can be calculated as [(total number of matched nucleotides at the X and Z sites + q) / (p + q + r)] * 100.
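The two similarity formulas above can be written out directly. The sketch below uses hypothetical function names and hypothetical example lengths (p = 10, q = 5, r = 6, with 15 matched nucleotides across X and Z); only the formulas themselves come from the text.

```python
def similarity_excluding_y(matched_xz, p, r):
    """Similarity (%) with site Y excluded: matched / (p + r) * 100."""
    return matched_xz / (p + r) * 100


def similarity_counting_y(matched_xz, p, q, r):
    """Similarity (%) with site Y treated as fully matched:
    (matched + q) / (p + q + r) * 100."""
    return (matched_xz + q) / (p + q + r) * 100


# Hypothetical oligonucleotide: X = 10 nt, Y = 5 nt, Z = 6 nt,
# with 15 of the 16 X and Z positions matched.
print(similarity_excluding_y(15, p=10, r=6))             # 93.75
print(round(similarity_counting_y(15, p=10, q=5, r=6), 2))  # 95.24
```

Counting Y as matched always gives a value at least as high as excluding it, so the two figures are not interchangeable when comparing oligonucleotides.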

As a further alternative, the similarity between the X site of the oligonucleotide and the corresponding region of the reference nucleotide sequence, and the similarity between the Z site of the oligonucleotide and the corresponding region of the reference nucleotide sequence, are provided separately.

When at least one universal base or degenerate base is present in either or both of sites X and Z of the oligonucleotide of formula (I), the sequence similarity can be determined by treating the universal or degenerate base in the same way as it is treated when counting mismatched bases.
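One way to handle universal and degenerate bases when counting mismatches is sketched below, assuming that a universal base counts as a match against any base and that a degenerate base counts as a match when the reference base is one of the bases it represents. The degenerate codes are the standard IUPAC nucleotide codes; representing the universal base as "I" (deoxyinosine) and the names `base_matches` and `count_mismatches` are assumptions made for this sketch.

```python
# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Assumption: the universal base is written as "I" (deoxyinosine).
UNIVERSAL = {"I"}


def base_matches(oligo_base, ref_base):
    """Return True if the oligonucleotide base is counted as a match."""
    oligo_base, ref_base = oligo_base.upper(), ref_base.upper()
    if oligo_base in UNIVERSAL:
        return True  # universal base: always treated as a match
    return ref_base in set(IUPAC.get(oligo_base, ""))


def count_mismatches(oligo_site, ref_site):
    """Count mismatched positions between two equal-length sites."""
    if len(oligo_site) != len(ref_site):
        raise ValueError("sites must have the same length")
    return sum(not base_matches(a, b) for a, b in zip(oligo_site, ref_site))


print(count_mismatches("ACRT", "ACGT"))  # 0: R represents A or G
print(count_mismatches("ACRT", "ACCT"))  # 1: C is not covered by R
print(count_mismatches("AIIT", "ACGT"))  # 0: universal bases match
```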

As described above, the method of the present invention provides information on the specificity of an oligonucleotide in various ways, allowing the user to analyze the homology of the oligonucleotide with target and non-target nucleic acid sequences easily, quickly and intuitively.

Because the method of the present invention is characterized by providing information on the specificity of an oligonucleotide, it may also be referred to as a method of providing information on the specificity of an oligonucleotide.

A person skilled in the art can evaluate the specificity of a designed oligonucleotide using the information provided by the method of the present invention. Accordingly, the method of the present invention may further comprise the step of evaluating the specificity of the oligonucleotide of formula (I) using the information provided in step (c).

Evaluating the specificity of the oligonucleotide of formula (I) using the information provided in step (c) can be accomplished by determining the inclusivity and exclusivity of the oligonucleotide of formula (I).
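As a rough illustration of how inclusivity and exclusivity might be derived from per-site mismatch counts, the sketch below makes several assumptions that the text does not state: inclusivity is taken as the fraction of target sequences predicted to be detected, exclusivity as the fraction of non-target sequences predicted not to be detected, and a sequence is predicted to be detected when it stays within per-site mismatch thresholds. The function name, the thresholds, and the identifiers are all hypothetical.

```python
def inclusivity_exclusivity(hits, target_ids, non_target_ids,
                            max_x=2, max_z=1):
    """Estimate inclusivity and exclusivity from mismatch counts.

    hits: iterable of (reference_id, x_mismatches, z_mismatches).
    target_ids, non_target_ids: non-empty sets of reference identifiers.
    Assumption: a sequence is 'detected' when it is within both
    per-site thresholds; sequences absent from hits are not detected.
    """
    detected = {ref_id for ref_id, x_mm, z_mm in hits
                if x_mm <= max_x and z_mm <= max_z}
    inclusivity = len(detected & target_ids) / len(target_ids)
    exclusivity = 1 - len(detected & non_target_ids) / len(non_target_ids)
    return inclusivity, exclusivity


# Hypothetical data: four targets (t1..t4) and two non-targets (n1, n2).
hits = [("t1", 0, 0), ("t2", 1, 0), ("t3", 0, 2), ("n1", 3, 2)]
print(inclusivity_exclusivity(hits, {"t1", "t2", "t3", "t4"}, {"n1", "n2"}))
```

Whether a borderline sequence is actually amplified or detected depends on the hybridization conditions, so thresholds like these would need to be confirmed experimentally.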

The method of the present invention can be used to evaluate the operability of an oligonucleotide as a primer or probe, particularly an oligonucleotide represented by formula (I).

The match/mismatch results at sites X and Z provided in step (c) make it possible to check whether the oligonucleotide hybridizes to a particular target nucleic acid sequence. The method of the present invention can therefore be used to determine whether the oligonucleotide will function as a primer or probe for a particular target nucleic acid sequence.

The method described above can be carried out on a computer by software containing instructions that implement a process for executing the method.

II. Recording medium, computer program and apparatus

The recording medium, apparatus and computer program of the present invention described below are intended to carry out the method of the present invention on a computer. Content common to them and the method is omitted to avoid the excessive redundancy that would complicate this specification.

According to another aspect of the present invention, there is provided a computer-readable recording medium containing instructions that configure a processor to execute a method of evaluating the specificity of an oligonucleotide, the method comprising the following steps:

5'-X-Y-Z-3' (I)

(b) comparing all or part of the sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database, and extracting from the database reference nucleotide sequences that contain a region homologous to all or part of the sequence of the oligonucleotide of formula (I); and

According to another aspect of the present invention, there is provided a computer program, stored on a computer-readable recording medium, that configures a processor to execute a method of evaluating the specificity of an oligonucleotide, the method comprising the following steps:

5'-X-Y-Z-3' (I)

The program instructions, when executed by a processor, cause the processor to carry out the method of the present invention described above. The program instructions for carrying out the method may include the following: (i) an instruction to compare all or part of the sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database; (ii) an instruction to extract from the database reference nucleotide sequences that contain a region homologous to all or part of the sequence of the oligonucleotide of formula (I); and (iii) an instruction to analyze the site-by-site matches/mismatches between the oligonucleotide of formula (I) and each reference nucleotide sequence and to provide (i) the number or proportion of matched or mismatched bases between site X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or proportion of matched or mismatched bases between site Z of the oligonucleotide of formula (I) and each reference nucleotide sequence.

The method of the present invention is executed on a processor, which may be a processor in a stand-alone computer, a network-attached computer, or a data acquisition device such as a real-time PCR instrument.

The computer-readable recording medium includes, but is not limited to, various storage media known in the art, such as CD-R, CD-ROM, DVD, flash memory, floppy disk, hard drive, portable HDD, USB, magnetic tape, MINIDISC, non-volatile memory card, EEPROM, optical disk, optical storage medium, RAM, ROM, system memory and web server.

The oligonucleotide of formula (I) for amplifying or detecting a target nucleic acid sequence can be provided in various ways. For example, the sequence of the oligonucleotide of formula (I) can be provided to a separate system, such as a desktop computer system, over a network connection (e.g., LAN, VPN, Internet or intranet) or a direct connection (e.g., USB or another direct wired or wireless connection), or it can be provided on a portable medium such as a CD, DVD, floppy disk or portable HDD.

The instructions that configure a processor to carry out the present invention may be included in a logic system. Although the instructions may be provided on a software recording medium (e.g., portable HDD, USB, floppy disk, CD or DVD), they may also be downloaded and stored in a memory module (e.g., a hard drive or other memory such as local or attached RAM or ROM). The computer code implementing the present invention may be written in a variety of coding languages, such as C, C++, Java, Visual Basic, VBScript, JavaScript, Perl and XML. In addition, a variety of languages and protocols may be used for the external and internal storage and transmission of signals and commands according to the present invention.

According to another aspect of the present invention, there is provided an apparatus for evaluating the specificity of an oligonucleotide, comprising (a) a computer processor and (b) the computer-readable recording medium of the present invention coupled to the computer processor.

The processor may be configured so that a single processor performs all of the operations described above. Alternatively, the processor unit may be configured so that several processors each perform one of the operations.

The features and advantages of the present invention are summarized as follows:

(a) Conventional sequence alignment algorithms or programs do not provide match/mismatch results for atypical oligonucleotides such as those represented by 5'-X-Y-Z-3' (where Y represents a separation site comprising two or more consecutive bases that do not participate in Watson-Crick base pairing). In contrast, the method of the present invention provides the match/mismatch results at the X site and the Z site separately, allowing the user to evaluate specificity, in particular the annealing specificity at the X and Z sites, with different weights.

(b) For atypical oligonucleotides such as those represented by 5'-X-Y-Z-3' (where Y represents a separation site comprising two or more consecutive bases that do not participate in Watson-Crick base pairing), conventional sequence alignment algorithms or programs can provide the match/mismatch result for only one of the X site and the Z site. In contrast, the method of the present invention uses the homologous region in each extracted reference nucleotide sequence together with its flanking regions to provide match/mismatch results for both the X and Z sites. The method of the present invention therefore enables accurate evaluation of specificity, in particular the annealing specificity of oligonucleotides with an atypical structure, and helps in selecting an appropriate oligonucleotide in view of the relative importance of the X and Z sites.

(c) For oligonucleotides containing consecutive or non-consecutive universal bases within their sequence, conventional sequence alignment algorithms or programs score the universal bases as mismatches. In contrast, the method according to one embodiment of the present invention scores the universal bases as matches, making it possible to evaluate the specificity of the oligonucleotide accurately.

(d) For oligonucleotides containing one or more degenerate bases within their sequence, conventional sequence alignment algorithms or programs score the degenerate base(s) as mismatches regardless of the type of the corresponding nucleotide. In contrast, the method according to one embodiment of the present invention determines a match or mismatch according to the types of bases that the degenerate base represents, making it possible to evaluate the specificity of the oligonucleotide accurately.

(e) The method of the present invention provides not only a classification of the reference nucleotide sequences according to the number of mismatched bases at sites X and Z but also their biological characteristics, allowing the user to evaluate the specificity of the oligonucleotide, in particular its target specificity, simply and intuitively.

Figure 1 is a flowchart showing a process for evaluating the specificity of an oligonucleotide according to one embodiment of the present invention.
Figure 2 is a schematic representation of a process for evaluating the specificity of an oligonucleotide (DPO primer) according to one embodiment of the present invention. The sequence of site X in the DPO primer represented by 5'-X-Y-Z-3' is used as the query and compared with a database using BLAST to extract a plurality of reference nucleotide sequences containing a region homologous to site X. The site-by-site matches/mismatches between the entire sequence of the DPO primer and the homologous region of each reference nucleotide sequence together with its flanking region are then analyzed, and the number of mismatched bases at sites X and Z is provided.
Figure 3 shows the result of the site-by-site match/mismatch analysis (sequence alignment) between the entire sequence of an exemplary DPO primer represented by 5'-X-Y-Z-3' (upper row) and a reference nucleotide sequence extracted according to one embodiment of the present invention (lower row). As shown in Figure 3, the DPO primer matched the (-) strand of the reference nucleotide sequence, with one mismatched base at site X and zero mismatched bases at site Z. This information is indicated as "- 1|0" in Figure 3.

Hereinafter, the present invention is described in more detail by way of examples. These examples are intended only to illustrate the present invention more specifically, and it will be apparent to those skilled in the art that the scope of the present invention, as set forth in its gist, is not limited by these examples.

Examples

Example 1: Evaluation of the specificity of an oligonucleotide according to one embodiment of the present invention

<1-1> Design of a dual specificity oligonucleotide (DPO)

With reference to the disclosure of WO 2006/095981, a DPO primer (SEQ ID NO: 1) was designed to amplify the 16S ribosomal RNA of Bacteroides fragilis (GenBank ID No: HM352993.1) as a target nucleic acid sequence. The nucleotide sequence of the designed DPO primer is shown below.

5'-GACTCTAGAGAGACTGCCGTCGTAAIIIIIGAGGAAGGTG-3' (SEQ ID NO: 1)

As shown above, the DPO primer has three distinct sites: (i) site "X" at the 5' end: GACTCTAGAGAGACTGCCGTCGTAA; (ii) a separation site "Y" consisting of five deoxyinosines (I) as universal bases (shown in bold); and (iii) site "Z" at the 3' end: GAGGAAGGTG.
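The division of the primer into its three sites can be sketched as follows. This is an illustration only: it assumes the Y site is written as a run of two or more "I" characters flanked by plain A/C/G/T bases, and the function name `split_dpo` is hypothetical.

```python
import re

# X: plain bases, Y: two or more deoxyinosines, Z: plain bases.
DPO_PATTERN = re.compile(r"([ACGT]+)(I{2,})([ACGT]+)")


def split_dpo(primer: str) -> tuple[str, str, str]:
    """Split a 5'-X-Y-Z-3' primer string into its X, Y and Z sites."""
    match = DPO_PATTERN.fullmatch(primer.upper())
    if match is None:
        raise ValueError("primer does not have the expected X-Y-Z layout")
    x, y, z = match.groups()
    return x, y, z


# SEQ ID NO: 1 from the text above.
x, y, z = split_dpo("GACTCTAGAGAGACTGCCGTCGTAAIIIIIGAGGAAGGTG")
print(x, y, z)
print(len(x), len(y), len(z))
```

For SEQ ID NO: 1 the three sites have lengths 25, 5 and 10, which add up to the 40 nucleotides stated in the sequence listing.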

<1-2> Evaluation of specificity using BLAST

To evaluate the specificity of the DPO primer, site X of the DPO primer (i.e., 5'-GACTCTAGAGAGACTGCCGTCGTAA-3') was compared with the GenBank database using BLAST for homology analysis. The parameters used for the BLAST algorithm were as follows:

- query : name of the primer sequence file in FASTA format

- db : name of the nucleotide database file

- out : name of the output file

- evalue : 1000

- word_size : 4

- perc_identity : 60

- num_alignments : 1000000

- num_descriptions : 1000000
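The parameter list above can be assembled into a command line as in the sketch below. The option names and values come from the list; mapping them onto the NCBI BLAST+ `blastn` executable is an assumption, since the text does not name the program, and the file names are hypothetical placeholders. The sketch only builds the argument list and does not run BLAST.

```python
def build_blast_command(query_fasta: str, database: str, out_file: str) -> list[str]:
    """Build an argument list using the parameter values listed above."""
    return [
        "blastn",                        # assumption: BLAST+ nucleotide search program
        "-query", query_fasta,           # primer sequence file in FASTA format
        "-db", database,                 # nucleotide database
        "-out", out_file,                # output file
        "-evalue", "1000",               # permissive E-value so short hits are kept
        "-word_size", "4",               # small seed size for a 25-nt query
        "-perc_identity", "60",
        "-num_alignments", "1000000",
        "-num_descriptions", "1000000",
    ]


# Hypothetical file names, for illustration only.
command = build_blast_command("site_x.fasta", "nt_database", "site_x_hits.txt")
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run` on a machine where the program and database are installed. The permissive E-value and small word size suit a short query such as site X, which a default search tuned for long sequences might miss.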

As a result of the comparison, a total of 2387 reference nucleotide sequences containing a region homologous to site X of the DPO primer were extracted. Each extracted reference nucleotide sequence contained the homologous region and, optionally, a flanking region that can be used for comparison with site Z.

Each extracted reference nucleotide sequence, containing the homologous region and its flanking region, was compared with the entire sequence of the DPO primer to obtain the number of mismatched bases between site X of the DPO primer and the reference nucleotide sequence as well as the number of mismatched bases between site Z of the DPO primer and the reference nucleotide sequence (see Figure 2).
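A minimal sketch of this site-wise comparison is given below. It assumes an ungapped comparison, a reference string already oriented in the same sense as the primer, and a known 0-based start position of the region homologous to site X; the reference sequence in the example is invented for illustration, and the function name is hypothetical. Site Y is skipped because it consists of universal bases.

```python
def sitewise_mismatches(x: str, y_len: int, z: str,
                        reference: str, x_start: int) -> tuple[int, int]:
    """Return (mismatches at site X, mismatches at site Z).

    The region homologous to X starts at reference[x_start]; the flanking
    region downstream of it is compared with site Z after skipping y_len bases.
    """
    z_start = x_start + len(x) + y_len
    ref_x = reference[x_start:x_start + len(x)].upper()
    ref_z = reference[z_start:z_start + len(z)].upper()
    if len(ref_x) < len(x) or len(ref_z) < len(z):
        raise ValueError("reference lacks the flanking region needed for site Z")
    xm = sum(a != b for a, b in zip(x, ref_x))
    zm = sum(a != b for a, b in zip(z, ref_z))
    return xm, zm


X = "GACTCTAGAGAGACTGCCGTCGTAA"
Z = "GAGGAAGGTG"
# Hypothetical reference: two leading bases, site X with its first base changed,
# five arbitrary bases opposite site Y, site Z, and two trailing bases.
reference = "TT" + "A" + X[1:] + "ACGTA" + Z + "CC"
print(sitewise_mismatches(X, 5, Z, reference, x_start=2))
```

Counting the two sites separately is what distinguishes this analysis from a single whole-primer identity score.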

The result of comparing the DPO primer with one of the reference nucleotide sequences is shown in Figure 3.

Figure 3 shows that, compared with the exemplary reference nucleotide sequence, the DPO primer has one mismatched base at site X and zero mismatched bases at site Z. The DPO primer was also found to match the (-) strand of the reference nucleotide sequence.

The above information was expressed using the notation "D Xm｜Zm". In this notation, "D" denotes the direction in which the oligonucleotide of interest matches the reference nucleotide sequence. Specifically, "+" means that the oligonucleotide of interest matches the (+) strand of the reference nucleotide sequence, and "-" means that it matches the (-) strand. "Xm" denotes the number of mismatched bases at site X, and "Zm" denotes the number of mismatched bases at site Z. The result is given in Figure 3 as "- 1｜0".
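The notation can be produced as in the following sketch. Several points are assumptions made for illustration: a (-) strand match is modelled by comparing the primer with the reverse complement of the stored (+) strand, the start position refers to the strand being compared, the comparison is ungapped, the strand is long enough to cover site Z, and an ASCII vertical bar stands in for the separator. The reference sequence is invented and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a plain A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatch_notation(x: str, y_len: int, z: str, plus_strand: str,
                      direction: str, x_start: int) -> str:
    """Format a hit as 'D Xm|Zm' for direction '+' or '-'."""
    if direction not in ("+", "-"):
        raise ValueError("direction must be '+' or '-'")
    # Orient the reference so that it reads in the same sense as the primer.
    strand = reverse_complement(plus_strand) if direction == "-" else plus_strand.upper()
    z_start = x_start + len(x) + y_len
    xm = sum(a != b for a, b in zip(x, strand[x_start:x_start + len(x)]))
    zm = sum(a != b for a, b in zip(z, strand[z_start:z_start + len(z)]))
    return f"{direction} {xm}|{zm}"


X = "GACTCTAGAGAGACTGCCGTCGTAA"
Z = "GAGGAAGGTG"
# Hypothetical (-) strand carrying one changed base in the region opposite site X.
minus_strand = "TT" + "A" + X[1:] + "ACGTA" + Z + "CC"
plus_strand = reverse_complement(minus_strand)
print(mismatch_notation(X, 5, Z, plus_strand, "-", x_start=2))
```

With this invented reference the sketch reproduces the form of the result reported for Figure 3.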

The reference nucleotide sequences were then classified according to the number of mismatched bases at site X and the number of mismatched bases at site Z. The results are shown in Table 1 below.

Among the mismatch types, the sources of the reference nucleotide sequences belonging to the types likely to hybridize with the DPO primer of the present invention, namely "0｜0" (230 reference nucleotide sequences), "1｜0" (422 reference nucleotide sequences), and "1｜1" (10 reference nucleotide sequences), were examined. All reference nucleotide sequences belonging to these mismatch types were found to be derived from Bacteroides fragilis. This shows that the designed DPO primer is specific for the nucleic acid sequence of Bacteroides fragilis.
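The classification and source check described above can be sketched as follows. The hit records are invented for illustration and do not reproduce the counts reported in the example; an ASCII vertical bar stands in for the separator, and the function names are hypothetical.

```python
from collections import defaultdict


def classify_by_mismatch_type(hits):
    """Group source organisms by mismatch type 'Xm|Zm'.

    Each hit is a tuple (source organism, mismatches at X, mismatches at Z).
    """
    groups = defaultdict(list)
    for organism, xm, zm in hits:
        groups[f"{xm}|{zm}"].append(organism)
    return dict(groups)


def sources_for_types(groups, mismatch_types):
    """Collect the distinct source organisms found in the given mismatch types."""
    return {org for t in mismatch_types for org in groups.get(t, [])}


# Hypothetical hit records (organism, Xm, Zm).
hits = [
    ("Bacteroides fragilis", 0, 0),
    ("Bacteroides fragilis", 1, 0),
    ("Bacteroides fragilis", 1, 1),
    ("Hypothetical organism A", 3, 2),
]
groups = classify_by_mismatch_type(hits)
print({mismatch_type: len(orgs) for mismatch_type, orgs in groups.items()})
print(sources_for_types(groups, ["0|0", "1|0", "1|1"]))
```

If the set returned for the low-mismatch types contains only the intended target organism, the primer can be regarded as specific under this criterion, which is the reasoning applied in the example.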

The results also provide information on the coverage of target nucleic acid sequences amplified with the designed DPO primer under given hybridization conditions. Specifically, those skilled in the art will recognize from the results that target nucleic acid sequences belonging to the mismatch types "0｜0", "1｜0", and "1｜1" can be amplified by adjusting the hybridization conditions.

The results further provide information on whether the DPO primer has annealing specificity for each extracted reference nucleotide sequence.

In this way, the specificity of a designed oligonucleotide can be evaluated in a simpler and more intuitive manner.

Specific parts of the present invention have been described in detail above. It will be apparent to those skilled in the art that this specific description is merely a preferred embodiment and does not limit the scope of the present invention. Accordingly, the substantial scope of the present invention is defined by the appended claims and their equivalents.

<110> SEEGENE, INC.
<120> Method for evaluating target specific workability of oligonucleotides
<130> PI180020KR
<150> KR 10-2016-0069487
<151> 2016-06-03
<160> 1
<170> KoPatentIn 3.0
<210> 1
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic nucleotide
<400> 1
gactctagag agactgccgt cgtaannnnn gaggaaggtg 40

Claims

A method for evaluating the specificity of an oligonucleotide, comprising the steps of:
(a) providing an oligonucleotide represented by the following formula (I):
5'-X-Y-Z-3' (I)
wherein X represents a site having a nucleotide sequence that hybridizes to a target nucleic acid sequence, Y represents a separation site comprising two or more consecutive bases not involved in Watson-Crick base pairing, and Z represents a site having a nucleotide sequence that hybridizes to the target nucleic acid sequence;
(b) comparing all or part of the sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database, and extracting from the database reference nucleotide sequences comprising a region homologous to all or part of the sequence of the oligonucleotide of formula (I); and
(c) analyzing the site-specific match/mismatch between the oligonucleotide of formula (I) and each reference nucleotide sequence to provide (i) the number or proportion of matched or mismatched bases between site X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or proportion of matched or mismatched bases between site Z of the oligonucleotide of formula (I) and each reference nucleotide sequence.

3. The method of claim 1, wherein, when the entire sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database in step (b) and a reference nucleotide sequence comprising a region homologous to a partial sequence of the oligonucleotide of formula (I) is extracted, the site-specific match/mismatch between the entire sequence of the oligonucleotide of formula (I) and the homologous region and its flanking region of each reference nucleotide sequence is analyzed, and the number or proportion of matched or mismatched bases is provided.

2. The method according to claim 1, wherein, when a partial sequence of the oligonucleotide of formula (I) is compared with at least one nucleotide sequence database in step (b) and a reference nucleotide sequence comprising a region homologous to the partial sequence of the oligonucleotide of formula (I) is extracted, the site-specific match/mismatch between the entire sequence of the oligonucleotide of formula (I) and the homologous region and its flanking region of each reference nucleotide sequence is analyzed, and the number or proportion of matched or mismatched bases is provided.

4. The method of claim 3, wherein the partial sequence of the oligonucleotide of formula (I) used in the comparison of step (b) is site X, site Z, or a portion thereof.

6. The method of claim 5, wherein the sequence alignment algorithm or program is selected from the group consisting of Smith-Waterman, Needleman-Wunsch, BLAST, and FASTA.

3. The method of claim 1, wherein one or both of the X and Z sites comprise at least one universal base or a degenerate base.

8. The method of claim 7, wherein the universal base is not counted as a mismatched base in step (c).

8. The method of claim 7, wherein the degenerate base is not counted as a mismatched base in step (c) if any of the bases represented by the degenerate base matches the corresponding base in the reference nucleotide sequence.

2. The method of claim 1, wherein a biological characteristic of each reference nucleotide sequence is further provided in step (c).

2. The method of claim 1, further comprising providing a result of classification of the reference nucleotide sequence according to the number of mismatched bases at site X and the number of mismatched bases at site Z.

The method according to claim 1, wherein the oligonucleotide of the formula (I) is a primer or a probe.

The method according to claim 1, wherein the bases contained in the separation site Y are selected from the group consisting of a non-natural base, a universal base, a mismatched base, and combinations thereof.

A computer-readable recording medium comprising instructions that configure a processor to perform a method for evaluating the specificity of an oligonucleotide, the method comprising the steps of:
(a) providing an oligonucleotide represented by the following formula (I):
5'-X-Y-Z-3' (I)
wherein X represents a site having a nucleotide sequence that hybridizes to a target nucleic acid sequence, Y represents a separation site comprising two or more consecutive bases not involved in Watson-Crick base pairing, and Z represents a site having a nucleotide sequence that hybridizes to the target nucleic acid sequence;
(b) comparing all or part of the sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database, and extracting from the database reference nucleotide sequences comprising a region homologous to all or part of the sequence of the oligonucleotide of formula (I); and
(c) analyzing the site-specific match/mismatch between the oligonucleotide of formula (I) and each reference nucleotide sequence to provide (i) the number or proportion of matched or mismatched bases between site X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or proportion of matched or mismatched bases between site Z of the oligonucleotide of formula (I) and each reference nucleotide sequence.

15. An apparatus for evaluating the specificity of an oligonucleotide, comprising: (a) a computer processor; and (b) the computer-readable recording medium of claim 15 coupled to the computer processor.

A computer program stored on a computer-readable recording medium, for configuring a processor to perform a method for evaluating the specificity of an oligonucleotide, the method comprising the steps of:
(a) providing an oligonucleotide represented by the following formula (I):
5'-X-Y-Z-3' (I)
wherein X represents a site having a nucleotide sequence that hybridizes to a target nucleic acid sequence, Y represents a separation site comprising two or more consecutive bases not involved in Watson-Crick base pairing, and Z represents a site having a nucleotide sequence that hybridizes to the target nucleic acid sequence;
(b) comparing all or part of the sequence of the oligonucleotide of formula (I) with at least one nucleotide sequence database, and extracting from the database reference nucleotide sequences comprising a region homologous to all or part of the sequence of the oligonucleotide of formula (I); and
(c) analyzing the site-specific match/mismatch between the oligonucleotide of formula (I) and each reference nucleotide sequence to provide (i) the number or proportion of matched or mismatched bases between site X of the oligonucleotide of formula (I) and each reference nucleotide sequence and, separately, (ii) the number or proportion of matched or mismatched bases between site Z of the oligonucleotide of formula (I) and each reference nucleotide sequence.