JP2005160359A

JP2005160359A - Method for determining whale species using SINE insertion polymorphism as an index

Info

Publication number: JP2005160359A
Application number: JP2003402154A
Authority: JP
Inventors: Norihiro Okada; 典弘岡田; Masahito Nikaido; 雅人二階堂
Original assignee: Rikogaku Shinkokai
Current assignee: Rikogaku Shinkokai
Priority date: 2003-12-01
Filing date: 2003-12-01
Publication date: 2005-06-23

Abstract

PROBLEM TO BE SOLVED: To provide a method for determining whale species by the SINE (short interspersed element) method.

SOLUTION: The method comprises preparing a genomic DNA library from each of one or more whale species to be determined, and isolating from each library a clone containing an orthologous locus into which a SINE family or subfamily is specifically inserted in at least one of the whale species. The sequence of the locus in each of the whale species is amplified by PCR (polymerase chain reaction) using a set of PCR primers that anneal to the flanking sequences located on both sides of the SINE family or subfamily. The resulting PCR products are subjected to gel electrophoresis, and the whale species is determined by the presence or absence of a band indicating the locus into which the SINE family or subfamily is inserted. Locus sequences and primer sequences useful in the method are also provided.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a method for discriminating whale species by the SINE method.

The automation of DNA sequencing and molecular cloning techniques has dramatically expanded access to the genomes of organisms. In addition, the flood of digital genomic information has drawn many biologists to computer-based bioinformatics. Comprehensive efforts such as the Human Genome Project have confirmed in great detail what nucleic acid renaturation experiments revealed about 30 years earlier: most of a eukaryotic genome does not encode specific protein products and consists of repetitive elements with no apparent function (Kazazian et al., 1998). As a result of genome analysis, the non-coding so-called "junkyard" has come to be recognized anew as a dynamic molecular "jungle" filled with active elements such as DNA-type and RNA-type transposable elements. Retroposons, which belong to the RNA-type transposable elements, are intriguing, ubiquitous repetitive sequences that are increasingly regarded as usable taxonomic tools for phylogenetic reconstruction and population analysis (Brosius, 1991; Shedlock and Okada, 2000). Retroposon-based diagnosis therefore forms an important bridge between related subfields of evolutionary biology at the individual and species levels.

The term "retroposition" refers to the way these elements move between a parent locus and a target locus in the genome via an RNA intermediate (Rogers, 1983). This copy-and-paste process creates a reverse flow of genetic information from RNA to chromosomal DNA (Weiner, et al., 1986, Schmid, 1996). It differs from DNA-type transposons such as mariner and P elements, which jump around the chromosomes in a cut-and-paste manner that leaves no original copy at the parent locus (Clark and Kidwell, 1997; Hartl et al, 1997). An important feature used to classify retroposons is whether or not they encode reverse transcriptase (RTase), the enzyme essential for self-amplification (Okada, 1991; Eickbush, 1994).

Among the retroposons that do not self-amplify, short interspersed elements (SINEs) are present in the genome in large numbers and are therefore considered extremely useful for studies of eukaryotic taxonomy. SINEs range in size from 70 to 500 bp and fall into two categories: those derived from tRNA and those derived from 7SL RNA. The 7SL RNA-derived SINEs comprise only two families, the primate Alu family and the rodent B1 family. All other SINEs examined so far have been shown to be derived from tRNA. LINEs, which encode the RTase for their own amplification, also fall into two categories: mammalian L1, which requires only a simple poly(A) sequence for reverse transcription and no specific sequence for amplification, and a second category, comprising most LINEs, which requires a strict sequence motif at the 3' end in order to be recognized by the RTase. Most non-mammalian SINEs and LINEs share the same 3' tail sequence, which allows the SINE to be amplified by the RTase encoded by the LINE (Ohshima et al, 1997; Okada et al, 1997; Terai et al., 1998). In mammals, SINEs, including those derived from tRNA and from 7SL RNA, appear to be amplified with the help of the RTase encoded by mammalian L1. In this case, therefore, the conserved sequence motif described above is not observed in the 3' tail of the SINE. A general diagram of SINE retroposition is shown in FIG. 1.

The fact that retroposition always occurs unidirectionally, together with the fact that 10⁴ or more copies of a SINE are interspersed throughout the host genome, highlights SINE insertion analysis as a powerful new approach to molecular taxonomy. SINE insertion analysis complements analyses of other forms of character data, most notably DNA sequences and morphology (e.g., Murata et al, 1993; Shimamura et al., 1997; Takahashi et al, 1998; Nikaido et al, 1999). Reviews of SINE evolution and of the importance of SINEs as taxonomic tools are available (Weiner et al, 1996; Deininger and Batzer, 1993; Schmid, 1996; Rokas and Holland, 2000; Shedlock and Okada, 2000; Shedlock et al. 2000).

In discriminating whale species, a specific region of a mitochondrial gene has conventionally been amplified by PCR, the amplification confirmed by agarose gel electrophoresis, and the base sequence then determined by sequencing, with differences in sequence used as the index for species discrimination. Because sequencing equipment is expensive, such analyses are currently carried out mostly by specialized companies. Discriminating the species of a very large number of individuals therefore requires a correspondingly large number of analyses, and the cost becomes enormous. Furthermore, base sequence analysis remains insufficient and uncertain in some respects, as described below, and its interpretation is often questioned.

JP 2003-009866 A
Shimamura M, Yasue H, Ohshima K, Abe H, Kato H, Kishiro T, Goto M, Munechika I, Okada N. (1997) Molecular evidence that whales form a clade within even-toed ungulates. Nature 388: 666-670.
Nikaido M, Rooney AP, Okada N. (1999) Phylogenetic relationships among cetartiodactyls based on insertions of short and long interspersed elements: hippopotamuses are the closest extant relatives of whales. Proc. Natl. Acad. Sci. U.S.A. 96: 10261-10266.
Shedlock AM, Okada N. (2000) SINE insertions: powerful tools for molecular systematics. BioEssays 22: 148-160.
Nikaido M, Matsuno F, Hamilton H, Brownell Jr. LR, Cao Y, Ding W, Zuoyan Z, Shedlock AM, Fordyce RE, Hasegawa M, Okada N. (2001) Retroposon analysis of major cetacean lineages: the monophyly of toothed whales and the paraphyly of river dolphins. Proc. Natl. Acad. Sci. U.S.A. 98: 7384-7389.
Okada N. (1997) The origin of whales and the evolution of mammals: did the mammalian radiation occur after the extinction of the dinosaurs? Kagaku vol. 67, Iwanami Shoten, pp. 908-916.
Okada N. (2000) When whales lived on land: the relationship with hippopotamuses revealed by phylogeny determination using the SINE method. Kagaku vol. 70, Iwanami Shoten, pp. 119-125.

Therefore, there is still a need for an assay that discriminates cetaceans simply, rapidly, and inexpensively without base sequence analysis.

The present inventors have now found that SINEs can be used not only for phylogenetic and taxonomic studies as described above, but also in a simple, rapid, and inexpensive assay for discriminating whale species. After intensive experiments on a wide range of samples, they demonstrated its reliability and completed the present invention. As described in detail below, when species share a SINE insertion, it can be shown phylogenetically that those species form a monophyletic group; and when the insertion of a SINE family or subfamily is unique to a particular species, the presence or absence of that insertion can be used for species discrimination. Methods for isolating and identifying such species-specific SINE subfamily sequences are also within the scope of the present invention.
Phylogenetic trees have previously been constructed using SINE families, but without classifying the SINEs into subfamilies. Species identification is possible only with SINEs that were amplified recently, so recently amplified subfamilies must be isolated specifically. The present method provides a way to do so.

In one aspect of the present invention, there is provided a method for discriminating whale species by the SINE method, comprising the following steps:
preparing a genomic DNA library from each of one or more whale species to be discriminated;
isolating, from each library, a clone containing an orthologous locus into which a SINE belonging to a SINE family or subfamily is specifically inserted in at least one of the whale species;
amplifying the sequence of the locus by PCR for each of the whale species, using a set of PCR primers that anneal to the flanking sequences located on both sides of the SINE family or subfamily at the locus; and
subjecting the resulting PCR products to gel electrophoresis and discriminating the whale species by the presence or absence of a band indicating the locus carrying the SINE family or subfamily insertion;
wherein the orthologous locus is selected from the group consisting of GRY20 corresponding to the sequence shown in SEQ ID NO: 31, BRY50 corresponding to the sequence shown in SEQ ID NO: 32, Sperm28 corresponding to the sequence shown in SEQ ID NO: 33, Hump42 corresponding to the sequence shown in SEQ ID NO: 34, Nag18 corresponding to the sequence shown in SEQ ID NO: 35, SIR13 corresponding to the sequence shown in SEQ ID NO: 36, Sei35 corresponding to the sequence shown in SEQ ID NO: 37, BRY63 corresponding to the sequence shown in SEQ ID NO: 38, GRY8 corresponding to the sequence shown in SEQ ID NO: 39, MNK31 corresponding to the sequence shown in SEQ ID NO: 40, NM1 corresponding to the sequence shown in SEQ ID NO: 41, SEM3 corresponding to the sequence shown in SEQ ID NO: 42, sNR2 corresponding to the sequence shown in SEQ ID NO: 43, na102 corresponding to the sequence shown in SEQ ID NO: 44, Tuti35 corresponding to the sequence shown in SEQ ID NO: 45, Tuti35 corresponding to the sequence shown in SEQ ID NO: 46, Sp9 corresponding to the sequence shown in SEQ ID NO: 47, and Sp2 corresponding to the sequence shown in SEQ ID NO: 48.
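The last step of the method comes down to a size comparison on the gel: an allele carrying the SINE gives a longer PCR product than the empty site. The following is a minimal sketch of that call and is not taken from the patent; the function name, the band sizes, and the 30 bp sizing tolerance are all illustrative assumptions.

```python
def call_insertion(band_bp, empty_bp, sine_bp, tol_bp=30):
    """Classify one PCR band as '+' (SINE present), '-' (empty site) or '?'.

    empty_bp: expected product size without the insertion (hypothetical value)
    sine_bp:  length added by the SINE insertion (hypothetical value)
    tol_bp:   gel sizing tolerance (assumed figure, not from the patent)
    """
    if abs(band_bp - (empty_bp + sine_bp)) <= tol_bp:
        return "+"
    if abs(band_bp - empty_bp) <= tol_bp:
        return "-"
    return "?"  # band matches neither expected size


# Hypothetical band sizes (bp) read from a gel for a single locus.
observed = {"species_1": 610, "species_2": 305, "species_3": 450}
calls = {sp: call_insertion(bp, empty_bp=300, sine_bp=300)
         for sp, bp in observed.items()}
print(calls)
```

A band that fits neither expected size is reported as "?" rather than forced into one class, since an unexpected size may indicate a failed or nonspecific amplification.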

In another aspect of the present invention, there is provided a DNA having the sequence shown in any one of SEQ ID NOs: 31 to 48, or an orthologous DNA that hybridizes with said DNA under stringent conditions.
The PCR primers can be selected from the group consisting of SEQ ID NOs: 49 to 84.
In another aspect of the present invention, there is provided a primer DNA having the sequence shown in any one of SEQ ID NOs: 49 to 84.

As described above, the advantage of the present invention is that sequencing can be omitted. First, acquiring the sequencer-related equipment needed for sequencing requires a capital investment of at least 15 million yen. As for consumables, a single sequencing run costs approximately 1,200 yen, so discriminating a large number of specimens (several thousand samples) becomes enormously expensive. Outsourcing removes the need for capital investment, but at present it costs 10,000 yen or more per specimen. In addition, sequencing takes five hours or more after PCR and electrophoresis, so the SINE method also offers a large gain in efficiency in terms of time.
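The sequencing costs quoted above can be put side by side with simple arithmetic. The sketch below uses only the three figures given in the text; the sample count of 3,000 and the assumption of one sequencing run per specimen are illustrative choices, not statements from the patent.

```python
EQUIPMENT_YEN = 15_000_000   # minimum sequencer-related investment (from the text)
CONSUMABLES_YEN = 1_200      # consumables per sequencing run (from the text)
OUTSOURCE_YEN = 10_000       # lower bound per outsourced specimen (from the text)


def in_house_cost(n_samples):
    """Equipment plus consumables, assuming one run per specimen."""
    return EQUIPMENT_YEN + CONSUMABLES_YEN * n_samples


def outsourced_cost(n_samples):
    """Lower bound on the cost of outsourcing every specimen."""
    return OUTSOURCE_YEN * n_samples


n = 3_000  # assumed reading of "several thousand samples"
print(in_house_cost(n))    # 18600000
print(outsourced_cost(n))  # 30000000
```

Under these assumptions, either route costs tens of millions of yen for sequencing alone, and this is the expense the SINE method is meant to avoid.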

As for data analysis, species discrimination based on DNA base sequences requires statistical processing, and the results are often seen to differ depending on the analysis method, which is hardly ideal from the standpoint of species discrimination. In the SINE method, by contrast, species are discriminated using the irreversible event of SINE insertion as the index, so the result is clear-cut, no special statistical processing is needed, and the result can be interpreted at a glance.

For these reasons, species discrimination using SINE insertion as the index is superior, in terms of both cost and time, to conventional species discrimination based on base sequence comparison, and can also be said to be highly reliable.

Definitions
In the present specification, "SINE" refers to a short interspersed repetitive sequence whose primary structure is about 100 to 400 bases long and which comprises, in order from its 5' end, a tRNA-homologous region, a tRNA-nonhomologous region, and an AT-rich region. It is characterized by direct repeats of about 5 to 20 bases at both ends, which are thought to be generated when the SINE is inserted into the genome.
In the present specification, a "SINE family" refers to SINE members with mutually similar sequences that are derived from the same tRNA.
In the present specification, a "SINE subfamily" refers to members of a group of SINEs that derive from the same origin (family) but, because they show diagnostic mutations, appear to have been amplified from the same ancestral member within the SINE family. "Subfamily" is used interchangeably with "type".
In the present specification, "orthologous loci" means that two loci derive from speciation from a common ancestor. By contrast, when two genes arise by gene duplication rather than speciation, they are "paralogous". It is orthologous genes, not paralogous genes, that provide the information needed to estimate a molecular phylogenetic tree.

In the present specification, "flanking sequences" refers to the sequences located on both sides of the insertion site when a SINE family or subfamily is specifically inserted into an orthologous locus. PCR primers can be designed freely within these flanking sequences.
In the present specification, "stringent conditions" refers to conditions under which a probe hybridizes with the template DNA when the following procedure is carried out: the template DNA is placed in a hybridization bag; an appropriate amount of prehybridization solution (6X SSC, 1% SDS) is added; the bag is sealed with a sealer and incubated for 1 hour or more; the solution is discarded; hybridization solution (6X SSC, 1% SDS, 1X Denhardt's solution, carrier DNA (sheared herring sperm DNA solution)) is added, followed by a previously prepared probe that has been heat-denatured at 95 °C for 3 minutes; the bag is sealed again and incubated overnight (about 15 hours) in a 42 °C water bath; and the membrane is rinsed briefly with an appropriate amount of wash solution (2X SSC, 1% SDS).

In the present specification, "outgroup comparison" refers, in phylogenetic inference, to a method of determining character polarity based on an outgroup, that is, a species or group of species assumed to be most closely related to the group of organisms under study (the ingroup). Including an outgroup in the analysis makes it possible to locate the ingroup root (the common ancestor of the whole ingroup tree). In the early theory of cladistics, the polarity of the character states of the ingroup, that is, which states are primitive and which are derived, is first determined from the character states of the outgroup; species sharing derived character states are then grouped into monophyletic groups on the basis of the predetermined polarity. Outgroup comparison is widely used as the principal method for determining the polarity of morphological characters in particular. Its rationale is the principle of maximum parsimony: the hypothetical character state on the branch leading to the ingroup root (the reference for polarity determination within the ingroup) is estimated most parsimoniously from the distribution of characters in the outgroup. Maddison, Donoghue & Maddison (1984) and Swofford & Maddison (1987) proved that inferring the ingroup phylogeny by outgroup comparison is logically equivalent to maximum-parsimony inference, without polarity determination, for the ingroup and outgroup combined.
This made maximum-parsimony phylogenetic inference based on cladistics possible even from molecular data for which polarity is difficult to determine, such as restriction enzyme cleavage sites and nucleic acid base sequences.

In the present specification, a "shared derived character" (synapomorphy) refers to the sharing of a character state judged to be derived (an apomorphy) as a result of polarity estimation by outgroup comparison or the like. Cladistic theory holds that only shared derived characters provide information for inferring phylogenetic relationships, because a direct common ancestor can be postulated for the group of species sharing the inferred derived character. Shared derived characters are therefore the clue for constructing monophyletic (strictly, holophyletic) groups. If the character distribution turns out to be inconsistent, the derived states of some characters must be regarded as homoplasy, having evolved on separate branches rather than from a single common ancestor. The validity of a hypothesis of shared derived character is tested by its consistency with the phylogenetic hypothesis selected under the principle of maximum parsimony, that is, with the cladogram.

In the present specification, "polarity" refers to the evolutionary direction of the order of transitions between the states of a character. Polarity therefore depends on the model of transition order. The transition order of character states reflects the nature of the character data used for phylogenetic inference. For qualitative morphological characters, a linear or branching transition order can often be assumed. By contrast, much molecular data, such as nucleotide sequences and restriction enzyme cleavage sites, consists of unordered characters whose states cannot be ranked. Cladistics and evolutionary taxonomy require the polarity of character evolution to be estimated before phylogenetic analysis. The criteria used to estimate polarity include outgroup comparison, the fossil record, and ontogenetic information, of which outgroup comparison is the most widely used.

Evolution of SINEs
To use SINEs effectively for phylogenetic inference and for the isolation, identification, and species discrimination methods of the present invention, it is important to understand how SINEs have evolved. As described in detail below, SINEs can be classified into families based on sequence similarity and into subfamilies based on the presence and/or deletion of diagnostic nucleotides. The fate of a SINE once inserted into the genome is determined by the balance between various factors in the chromosomal environment (Schmid and Maraia, 1992) and mutations that accumulate in the SINE and impede its amplification. Furthermore, because amplification of a SINE requires the RTase encoded by a LINE, if the LINE loses its ability to amplify in the genome and dies, this also means the death of the SINE in that organism (Okada et al., 1997).

Two competing models of SINE evolution have been proposed to explain how SINEs amplify within the genome over evolutionary time: the master gene model and the multiple source gene model. In the master gene model (FIG. 2(A)), only one or a few "master" loci are the source of all descendant copies, and the descendant copies have no ability to replicate themselves. In this scenario, the amplification rate over time depends entirely on the condition and activity of the master gene. In the multiple source gene model (FIG. 2(B)), descendants can have the same ability to amplify as the parent copy, and so function as "multiple source genes" over evolutionary time. In this model, the amplification rate is a function of the rate of increase or decrease in the total number of copies derived from all the source genes. The master gene model was proposed on the basis of empirical evidence from rodent ID SINEs (Kim et al, 1994) and early studies of human Alu repeats (Shen et al., 1991). However, subsequent detailed studies of Alu sequences and comparative studies of various other taxa have led to the view that the amplification of most SINE families is best explained by the multiple source gene model (Matera et al, 1990; Schmid and Maraia, 1992; Leeflang et al, 1992; Kido et al., 1994; Takasaki et al. 1994; Shedlock and Okada, 2000). In practice, each subfamily characterized by the presence and/or deletion of diagnostic nucleotides represents its own source gene.
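The difference between the two models can be seen in a toy copy-number calculation. The sketch below is only an illustration under assumed parameters: the discrete "generations", the fixed rate of two new copies per source, and the active fraction of 0.5 are all hypothetical and do not come from the patent or the cited studies.

```python
def master_gene_model(generations, masters=1, rate=2):
    """Only the master loci produce copies; descendant copies are inert."""
    copies = masters
    history = [copies]
    for _ in range(generations):
        copies += masters * rate        # growth depends on the masters alone
        history.append(copies)
    return history


def multiple_source_model(generations, start=1, rate=2, active_fraction=0.5):
    """A fraction of all existing copies can themselves act as source genes."""
    copies = start
    history = [copies]
    for _ in range(generations):
        sources = max(1, int(copies * active_fraction))
        copies += sources * rate        # growth scales with total copy number
        history.append(copies)
    return history


print(master_gene_model(5))      # [1, 3, 5, 7, 9, 11]
print(multiple_source_model(5))  # [1, 3, 5, 9, 17, 33]
```

With a constant master activity the first model grows linearly, whereas in the second the number of sources grows with the copy number, so the increase accelerates.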

The period between the birth and death of a SINE is its period of activity in the host genome, and this lifespan bears directly on the use of SINEs as tools in taxonomy and in the present invention. When the average sequence divergence among members of a SINE subfamily is small, it is reasonable to infer that the subfamily is fairly young and still actively amplifying in its host. When the average sequence divergence is large, it is reasonable to infer that the subfamily is relatively old and already inactive or dead in its host. However, the diagnosis of common ancestry among host taxa using SINE insertions is possible only within the active lifespan of the given subfamily. This basic principle of the SINE method is explained further in the "Procedure" section below.
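The average sequence divergence used above to judge the age of a subfamily can be computed as a mean pairwise distance. The following is a minimal sketch assuming pre-aligned, gap-free sequences of equal length and a simple proportion-of-differences distance; the sequences are invented for illustration, and the text gives no numerical threshold separating "young" from "old", so none is applied here.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_divergence(seqs):
    """Average p-distance over all pairs of subfamily members."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Invented toy alignments standing in for members of two subfamilies.
young = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
old = ["ACGTACGTAC", "ATGTTCGAAC", "CCGTACTTAG"]

print(mean_pairwise_divergence(young))
print(mean_pairwise_divergence(old))
```

Only the relative ordering matters in this sketch: the subfamily whose members have diverged less would be the better candidate for a recently active source.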

SINE insertion dynamics and character theory
The key to why SINEs are powerful taxonomic tools is that they insert into the host genome irreversibly and independently (Murata et al, 1993; Shedlock and Okada 2000). No mechanism is known that specifically removes SINEs from the genome, and the probability that two elements insert at exactly the same locus, or that one is precisely excised from a locus, is extremely low. The presence of a SINE at the same locus in two different taxa can therefore be regarded as a polarized derived character in the genome. This defines a clade, or monophyletic group, in the strict sense of the method of Hennig (1966), which uses only synapomorphies (shared derived characters) to construct phylogenetic hypotheses. For such a clade, the known ancestral condition, namely the absence of the SINE insertion at the given locus, unambiguously defines the outgroup, with no need to establish character polarity through competing methods that produce well-known artifacts (Hendy and Penny, 1989) (see FIG. 3).
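Because presence of the insertion is the derived state and absence the ancestral one, a single locus splits the sampled taxa into a clade and the remainder without any further polarity analysis. The sketch below illustrates this reading; the taxon names and calls are hypothetical.

```python
def clade_from_locus(calls):
    """Split taxa by one locus: '+' = SINE present (derived), '-' = absent.

    Returns (clade, outside). Taxa with any other call (e.g. '?') are
    left out of both sets rather than guessed.
    """
    clade = {taxon for taxon, call in calls.items() if call == "+"}
    outside = {taxon for taxon, call in calls.items() if call == "-"}
    return clade, outside


# Hypothetical presence/absence calls at one orthologous locus.
locus = {"taxon_1": "+", "taxon_2": "+", "taxon_3": "-", "taxon_4": "?"}
clade, outside = clade_from_locus(locus)
print(sorted(clade), sorted(outside))
```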

Methods for isolating and characterizing SINEs
The following describes a strategy for isolating novel SINEs from a genomic library, the screening method, the sequencing of clones and the characterization of SINEs from family down to subfamily, the quantification of copy number in representative host taxa, and the definitive diagnosis of SINE insertion patterns that provide phylogenetic information.

Procedure
Selecting species for constructing a genomic library
The SINE method comprises the following basic steps: 1) construction of a genomic library from a selected species; 2) isolation of clones containing SINE loci; 3) determination of the DNA sequences of the clones; 4) design of polymerase chain reaction (PCR) primers within the flanking sequences of each SINE locus; and 5) PCR diagnosis of the presence or absence of the SINE among the related species of interest. Once established, the SINE method gives definitive results quickly and reliably, but establishing the procedure and its conditions takes considerable time and effort.

For example, consider determining the phylogenetic relationships among the closely related species A, B, C, and D. The actual phylogenetic tree of these species is shown in FIG. 4(A). In this case, if species D is chosen as the host from which SINEs are isolated, no SINE locus that provides phylogenetic information will be obtained. This is because species D contains SINE loci that were inserted in the common ancestor of the four taxa of interest or at an even earlier time. A locus inserted in the ancestor of all four species is shown as SINE3 in FIG. 4(B). The PCR electrophoresis gel pattern showing the presence (+) or absence (−) of the SINE3 insertion is shown in the lower part of FIG. 4(C). To isolate SINEs that provide phylogenetic information, species A or species B must be chosen, since either can supply the three SINE loci shown in FIG. 4(B): SINE1, SINE2, and SINE3. Each of these loci defines a clade, or monophyletic group, whose members share a common ancestor in the evolutionary history of these species.
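
The choice of host species can be made concrete with a short sketch. FIG. 4 is not reproduced here, so the tree (((A,B),C),D) and the placement of SINE1 in the ancestor of A and B and of SINE2 in the ancestor of A, B, and C are assumptions made only for illustration; only the placement of SINE3 in the ancestor of all four species is stated in the text. The function names are hypothetical.

```python
# Assumed carriers of each insertion (illustration only; see the note above).
LOCI = {
    "SINE1": {"A", "B"},            # assumed: inserted in the A+B ancestor
    "SINE2": {"A", "B", "C"},       # assumed: inserted in the A+B+C ancestor
    "SINE3": {"A", "B", "C", "D"},  # stated: inserted in the ancestor of all four
}
TAXA = {"A", "B", "C", "D"}


def loci_recoverable_from(host: str) -> list[str]:
    """Loci a library made from `host` can yield: the host must carry the insertion."""
    return sorted(name for name, carriers in LOCI.items() if host in carriers)


def is_informative(carriers: set[str]) -> bool:
    """A locus groups taxa only if at least two, but not all, of the taxa carry it."""
    return 2 <= len(carriers) < len(TAXA)


def informative_loci_from(host: str) -> list[str]:
    """Recoverable loci that also say something about relationships among the taxa."""
    return [n for n in loci_recoverable_from(host) if is_informative(LOCI[n])]


print(loci_recoverable_from("D"))   # ['SINE3']
print(informative_loci_from("D"))   # []
print(informative_loci_from("A"))   # ['SINE1', 'SINE2']
```

Under these assumptions a library from species D yields only SINE3, which is present in every taxon and so groups none of them, whereas a library from species A yields loci that resolve the tree.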

Isolating a novel SINE family from a particular species
When no SINE family is known in a particular species, for example species A in FIG. 4(A), a SINE family must be newly isolated from its genome and analyzed. There are two ways to isolate a novel SINE family from the selected species. One involves in vitro transcription of total genomic DNA (Endoh and Okada, 1986); the other involves sequencing about 60 kbp or more of genomic DNA, which has been made easier by new high-throughput automated DNA sequencing methods.

In vitro transcription of total genomic DNA
Most SINEs are known to be derived from tRNA and therefore carry an internal promoter for RNA polymerase III. Although SINEs are transcribed only very rarely in vivo, they are readily transcribed from naked DNA in vitro. When the total genomic DNA of a species is transcribed in a HeLa cell extract with a radiolabeled precursor nucleotide such as [α-32P]GTP, radiolabeled RNA is usually produced. In some cases these radiolabeled transcripts form distinct bands when subjected to gel electrophoresis (Endoh and Okada 1986; Matsumoto et al. 1986).

This radiolabeled RNA can be used as a probe to screen a genomic library from the selected species of interest. If the transcript forms a distinct band in the gel, it represents a genuine SINE family, because the identical transcripts from every locus of a given SINE family collect into one discrete band. Even if the genomic DNA yields only indistinct bands, the transcripts can still be used as probes for screening. In that case, however, the transcripts may represent more than one SINE family.

In the present inventors' experience, when SINEs are present in a vertebrate or invertebrate genome at a copy number of 10,000 or more, they can be detected as in vitro transcripts of total genomic DNA. FIG. 4 shows several example patterns of transcripts from the total genomic DNA of selected animal species (Endoh and Okada, 1986).

Sequencing more than 60 kbp of genomic DNA with an automated sequencer
In the present inventors' experience, the copy number of a SINE family in a genome usually exceeds 10,000. Assuming a total genome length of 3 × 10⁹ bp and a SINE size of 300 bp, such a SINE family occupies 0.1% of the genome (for example, 300 × 10⁴ = 3 × 10⁶ bp). It should therefore be possible to find two independent SINE sequences within randomly isolated DNA fragments of such a species by sequencing 60 kbp or more (for example, 600 × 100 / 0.1 = 6 × 10⁵).
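
The estimate above is simple proportional arithmetic, sketched here under the assumption that SINE copies are spread uniformly over the genome. The function names are hypothetical. With the figures given in the text (10,000 copies of 300 bp in a genome of 3 × 10⁹ bp), the span expected to contain two copies works out to about 6 × 10⁵ bp. Under the same assumption, a span of 60 kbp corresponds to a copy number of about 100,000.

```python
def sine_genome_fraction(copy_number: int, sine_length_bp: int, genome_size_bp: float) -> float:
    """Fraction of the genome occupied by one SINE family."""
    return copy_number * sine_length_bp / genome_size_bp


def span_for_n_copies(n_copies: int, sine_length_bp: int, fraction: float) -> float:
    """Length of random sequence (bp) expected to contain n copies, assuming uniform spacing."""
    return n_copies * sine_length_bp / fraction


# Figures stated in the text.
fraction = sine_genome_fraction(10_000, 300, 3e9)   # about 0.001, i.e. 0.1 %
span = span_for_n_copies(2, 300, fraction)          # about 6e5 bp

# Same calculation for a family with ten times the copy number.
fraction_high = sine_genome_fraction(100_000, 300, 3e9)  # about 0.01, i.e. 1 %
span_high = span_for_n_copies(2, 300, fraction_high)     # about 6e4 bp

print(f"{fraction:.4f} {span:.3g}")
print(f"{fraction_high:.4f} {span_high:.3g}")
```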

This has become easy to achieve in the laboratory with access to the latest models of high-throughput automated DNA sequencers. For example, a novel SINE family was recently characterized from the elephant genome and was shown to be distributed among all species of Afrotheria. This method can be applied to the genomes of all mammals and probably of most vertebrates.

Accurately identifying a SINE family and deducing its tRNA structure
After the sequences of multiple copies of a repeat unit have been determined by the methods above, they can be aligned and a consensus sequence for the repeat family can be deduced. Because the genome contains many repetitive sequences other than SINE elements, it is essential to diagnose the sequence properly. Since most SINEs are known to be derived from tRNA, they contain a promoter for RNA polymerase III. The RNA polymerase III promoter consists of conserved sequence blocks, a first promoter and a second promoter, that are separated from each other. The second promoter is highly conserved and, with experience, is easily recognized.

Consider the CHR-2 SINE as an example. The tRNA-like structure of these elements can be established as follows (see FIG. 6):
1. First, construct a consensus sequence of the CHR-2 SINE from an alignment of several CHR-2 sequences (see FIG. 8).
2. Search by eye for the consensus sequence of the second promoter for RNA polymerase III. This sequence is 5'-GT(or A)TCG(or A)-3'. When screening for this promoter, there are no exceptions to this motif. When the motif is present, build a stem-loop structure that contains this second promoter sequence. The loop has 7 bases and the stem has 5 base pairs. Even if not all of the bases in the stem region form base pairs, place them at the appropriate positions in the tRNA, as shown in FIG. 6(A).
3. Treat the 5 bases immediately 5' upstream of the stem region as one unit, because in cytoplasmic class I tRNAs this extra loop region consists of 5 bases. In the CHR-2 SINE, the sequence of this 5-base unit is 3'-CAGGG-5'. The sequence 3'-PyPyPuPuPu-5' is typical of this extra loop in several tRNAs (Sprinzl et al. 1987), which allows the tRNA origin of the SINE to be inferred (see FIG. 6(B)).
4. Treat the next 5 bases, which form the anticodon-stem region of the tRNA structure, as another unit. Here the sequence is 3'-GACGT-5' (see FIG. 6(C)).
5. Treat the next 7 bases, which form the anticodon-loop region of the tRNA structure, as a further unit. Here the sequence is 3'-AACCGTC-5'. The AA residues at its 3' end and the 3'-TC-5' residues at its 5' end are good indicators of the tRNA origin of this SINE, because these bases are highly conserved in most tRNAs (Galli et al., 1981). This further supports the tRNA origin of the SINE (see FIG. 6(D)).
6. Usually the next 5 bases are treated as another unit, and they should base-pair with the 5 bases assigned to the anticodon-stem region (FIG. 6(C)). Here the sequence is 3'-CGTCT-5', and only its first 4 bases match the partner strand of the anticodon stem well. To align this unit correctly, place a deletion at the position of the first base on the 3' side of the anticodon stem (see FIG. 6(E)).
7. Next, build the stem-loop structure for the D region of the tRNA-like structure. The stem and loop usually have 4 and 8 bases, respectively, but these numbers can vary by 1 to 2 bases, particularly in the tRNA-like structures of SINEs. Evidently, the next several bases of CHR-2 do not form a meaningful secondary structure. In that case, focus on the first promoter region. Its most conspicuous feature is the presence of two Gs within the region. Other features are the G at position 15 (according to the tRNA numbering system; the same applies below) and the A at position 14 within this loop. The A at position 14 is the first base of the loop on the 5' side. Place these bases at the corresponding positions in the loop of the tRNA-like structure (see FIG. 6(F)). It is well known that T, the first base in FIG. 6(F), is highly conserved in all tRNA molecules.
8. The tRNA-like structure of the CHR-2 SINE can then be deduced by merging the sequence shown in FIG. 6(F) with the remaining sequence of this family (FIG. 7(A)).
9. Next, use the BLASTN program (Altschul et al. 1990) on the GenBank DNA database to search for similarity between the CHR-2 SINE and actual tRNAs. In this example, tRNA(Glu) is the most similar to the CHR-2 sequence. FIG. 7(B) shows the secondary structure of human tRNA(Glu).
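
Step 2 describes a search by eye for the second-promoter motif. The same search can be sketched with a regular expression, as a convenience only and not as the procedure of the specification. The test sequence below is hypothetical and is not a real CHR-2 consensus, and the function name `find_second_promoter` is likewise an assumption. The sketch only locates the motif; it does not attempt to build the 7-base loop and 5-base-pair stem.

```python
import re

# Motif from step 2: 5'-G(T or A)TC(G or A)-3'.
# A lookahead is used so that overlapping occurrences are also reported.
SECOND_PROMOTER = re.compile(r"(?=(G[TA]TC[GA]))")


def find_second_promoter(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched bases) for every occurrence of the motif."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in SECOND_PROMOTER.finditer(seq)]


# Hypothetical sequence for illustration only.
example = "ttagcGTTCGaacctgGATCAcc"
print(find_second_promoter(example))  # [(5, 'GTTCG'), (16, 'GATCA')]
```

Each reported position is a candidate around which the stem-loop of step 2 would then be built and checked by hand.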

Characterizing a SINE family into subfamilies
Suppose that a SINE family has been characterized in the genome of species A, and that the time at which this family first arose during evolution is unknown. Suppose further that the family first arose in an old common ancestor of all the taxa of clade Y in FIG. 9. In that case, the SINE copies present in the genome of species A include both old SINEs amplified at time t in FIG. 9 and young SINEs amplified at time u. When a genomic library of species A is screened with the consensus sequence of this SINE family, both the old and the young SINEs are isolated. Since only the phylogenetic relationships among species A, B, C, and D are sought, examining isolated SINEs from every period of amplification is inefficient. It is far more efficient to try to isolate SINE loci that were amplified around the time species D diverged and that span all the divergences of the four taxa comprising clade X in FIG. 9.

As noted above, SINEs and LINEs are believed to amplify according to the multiple source gene model (Schmid and Maraia 1992; Smit et al. 1995). When a source gene acquires mutations and is then successfully amplified during evolution, the copies of this mutated source gene can be recognized as a subfamily within the corresponding SINE family (Britten et al. 1988; Jurka et al. 1995). A subfamily is amplified during a particular stage of evolution. Therefore, if a subfamily was amplified only in the common ancestors of species A, B, C, and D, copies of that subfamily can be used effectively to determine the phylogenetic relationships among these four lineages.

The consensus sequence of the SINE family in species A is established as part of the procedure above, and it allows one PCR primer to be designed at the 5' end of the SINE sequence and another in a conserved region near its 3' end. The sequence amplified by this primer set covers the SINE sequence as a whole. The primer set can be used to amplify many copies of the SINE by PCR from the genomic DNA of species A. The PCR products can be cloned into a suitable vector and sequenced. Sequencing 100 copies of the SINE from the PCR products is no longer a difficult task. By aligning these sequences, characteristic nucleotides or possible deletions that mark subfamilies of this SINE family can be identified (see FIG. 10). Characteristic nucleotides are defined as nucleotides that are changed in a coordinated manner at three or more positions within a particular subfamily and that can be distinguished from the spontaneous mutations that accumulate at random in SINE sequences during evolution. Only after subfamilies have been successfully characterized on the basis of characteristic nucleotides, and often specific deletions, can the taxonomic distribution of a given subfamily be examined by dot-blot hybridization or PCR.
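
The definition of characteristic nucleotides (coordinated changes at three or more positions) can be turned into a simple screen over an alignment. The following is a rough sketch under stated assumptions: the copies are already aligned to equal length, the family consensus is taken as the per-column majority base, and the thresholds `min_positions=3` and `min_copies=2` as well as all function names and sequences are hypothetical. It is not the classification procedure used for FIG. 10.

```python
from collections import Counter, defaultdict


def consensus(seqs: list[str]) -> str:
    """Per-column majority base of an alignment."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def candidate_subfamilies(seqs: list[str], min_positions: int = 3, min_copies: int = 2) -> dict:
    """Group copies that share the same set of >= min_positions changes from the consensus.

    Returns {((position, base), ...): [indices of copies]}, positions 0-based.
    """
    cons = consensus(seqs)
    variants = [
        frozenset((i, b) for i, (b, c) in enumerate(zip(s, cons)) if b != c)
        for s in seqs
    ]
    counts = Counter(v for vs in variants for v in vs)
    # Changes seen in only one copy are treated as random mutations and ignored.
    shared = {v for v, n in counts.items() if n >= min_copies}
    groups = defaultdict(list)
    for idx, vs in enumerate(variants):
        key = frozenset(vs & shared)
        if len(key) >= min_positions:
            groups[key].append(idx)
    return {
        tuple(sorted(key)): members
        for key, members in groups.items()
        if len(members) >= min_copies
    }


# Hypothetical aligned copies: 4-6 share three coordinated changes, 3 and 6 carry random ones.
copies = [
    "ACGTACGTACGT",
    "ACGTACGTACGT",
    "ACGTACGTACGT",
    "ACGTACGAACGT",
    "TCGTAGGTACCT",
    "TCGTAGGTACCT",
    "TCGTAGGTACCA",
]
print(candidate_subfamilies(copies))
# {((0, 'T'), (5, 'G'), (10, 'C')): [4, 5, 6]}
```

Single changes such as the one in copy 3 are discarded, which mirrors the distinction drawn in the text between coordinated changes and randomly accumulated mutations.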

Subfamilies of the CHR-2 SINE
FIG. 8 shows an alignment of copies of the CHR-2 SINE family (Shimamura et al. 1997), which was originally characterized as being present in the genomes of cetaceans, hippopotamuses, and ruminants. It is readily apparent that the CHR-2 SINE has six subfamilies. On the basis of the deletions present, the FL (full length), MDI (middle deletion I), MDII (middle deletion II), and shortest groups were characterized. The shortest group can in turn be divided into the DT (deletion type), CD (cetacean deletion type), and CDO (cetacean deletion, toothed-whale specific) subfamilies. FIG. 11 shows an alignment of the consensus sequences of these subfamilies. SEQ ID NO: 1 is the overall consensus sequence, SEQ ID NO: 2 is the FL consensus sequence, SEQ ID NO: 3 is the MDI consensus sequence, SEQ ID NO: 4 is the MDII consensus sequence, SEQ ID NO: 5 is the DT consensus sequence, SEQ ID NO: 6 is the CD consensus sequence, and SEQ ID NO: 7 is the CDO consensus sequence.

FIG. 12 shows the results of dot hybridization experiments using probes specific to the CD subfamily, the CDO subfamily, and the other subfamilies, respectively. The results clearly show that the CD subfamily is specific to the genomes of cetaceans (toothed whales and baleen whales) and that the CDO subfamily is specific to the genomes of toothed whales. SINEs belonging to the CD subfamily are therefore useful for inferring phylogenetic relationships among cetaceans, particularly baleen whales, whereas SINEs belonging to the CDO subfamily are useful for inferring phylogenetic relationships among toothed whales. The distribution of CDO subfamily copies also suggests the monophyly of the toothed whales, including the sperm whale, which has been one of the most contested points in mammalian systematics.

Flanking SINE PCR
After SINE loci belonging to a subfamily generated in the common ancestor of clade X shown in FIG. 9 have been isolated from species A and sequenced, PCR experiments can be performed to diagnose the presence or absence of the insertion at a given locus. Primer sequences are chosen by inspecting the flanking (adjacent) sequences. In designing the primers, care must be taken to avoid the formation of secondary-structure folds and annealing between the upstream and downstream primers. Both can be checked easily with any of the standard software programs written to assist PCR primer design, which are available commercially or over the Internet. The melting temperature of the oligonucleotide primers is set at around 55°C, and the annealing temperature for PCR must be based on this temperature when optimizing the amplification of the orthologous loci from species B, C, and D. Sometimes mutations accumulated in the primer-binding regions of species B, C, and D prevent efficient primer-template annealing during the reaction. In that case the annealing temperature must be lowered to about 45 to 50°C. If no PCR product is amplified with the initial primers, new PCR primers should be designed, with further attention to potential artifacts that could lower the efficiency of PCR. FIG. 13 shows the principle of flanking SINE PCR schematically.
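
The two primer checks mentioned above, melting temperature and annealing between the two primers, can be approximated in a few lines. This is only a sketch. The specification refers to standard primer-design software and does not name a formula, so the Wallace rule used here (2°C per A or T, 4°C per G or C) is a substituted rule of thumb for short oligonucleotides. The primer sequences, the 4-base window, and the function names are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    if set(p) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G, T")
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def three_prime_overlap(p1: str, p2: str, n: int = 4) -> bool:
    """True if the last n bases of p1 can pair with the last n bases of p2."""
    return p1.upper()[-n:] == reverse_complement(p2[-n:])


# Hypothetical flanking primers (illustration only).
upstream = "AGCTGACCTGAAGTCTGA"
downstream = "TCAGGTCTACTGGAACTG"

print(wallace_tm(upstream), wallace_tm(downstream))   # 54 54
print(three_prime_overlap(upstream, downstream))      # False
```

Both hypothetical primers come out close to the 55°C target, and their 3' ends are not complementary. A real design would still be verified with dedicated primer-design software, as the text recommends.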

FIG. 14 shows an example of PCR results, together with the results of hybridization experiments performed on the same filter with two different probes. FIG. 14(A) is a PCR pattern providing evidence that the oceanic dolphins are monophyletic: the PCR products from oceanic dolphins have the fragment size expected when the inserted element is present, whereas the fragments from other toothed whales have the size expected when the insertion at Mago 19 is absent. FIG. 14(B) shows a hybridization experiment using the SINE probe, and FIG. 14(C) shows a hybridization experiment using the flanking DNA of the locus. The latter experiment was performed to verify that the orthologous locus was faithfully amplified by PCR in species other than the short-finned pilot whale (magondo), from which the locus was originally isolated and characterized.

Interpreting PCR data
When relatively recently diverged species are examined, the flanking sequences at orthologous SINE loci are well conserved and typically cause no problems that interfere with PCR diagnosis. When taxa that diverged longer ago are examined, however, PCR fails more often, which makes the experimental results harder to interpret. An unsuccessful PCR must be distinguished from SINE-minus data, that is, a successful PCR amplification at a given locus that shows the absence of a SINE insertion. An unsuccessful PCR represents missing data and is coded as such (for example, "?") when a maximum parsimony analysis is performed on the SINE character matrix that encodes the pattern of presence or absence of insertions.
When inconsistent insertion patterns are found among the independent loci examined, ancestral polymorphism followed by incomplete lineage sorting is the expected cause.
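
The coding rule above (insertion present, insertion absent, or missing data) can be sketched as a small function that turns observed band sizes into one character per taxon. The expected fragment sizes, the 20 bp tolerance, the gel readings, and the function name are all hypothetical. Treating a lane with both bands as "?" is a simplification made here, not a rule from the specification.

```python
def code_locus(band_sizes_bp: list[int], filled_bp: int, empty_bp: int, tolerance_bp: int = 20) -> str:
    """Code one taxon at one locus: '1' insertion present, '0' absent, '?' unscorable."""
    def near(size: int, expected: int) -> bool:
        return abs(size - expected) <= tolerance_bp

    has_filled = any(near(s, filled_bp) for s in band_sizes_bp)
    has_empty = any(near(s, empty_bp) for s in band_sizes_bp)
    if has_filled and not has_empty:
        return "1"
    if has_empty and not has_filled:
        return "0"
    # No product, unexpected size, or both bands: left as missing data in this sketch.
    return "?"


# Hypothetical gel readings (bp) for one locus; 550 bp with the SINE, 250 bp without.
gel = {"A": [548], "B": [555], "C": [252], "D": []}
row = "".join(code_locus(gel[t], filled_bp=550, empty_bp=250) for t in "ABCD")
print(row)  # 110?
```

Species D, where PCR failed, is coded "?" rather than "0", so a failed amplification is never mistaken for evidence that the insertion is absent.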

Distribution of SINEs in mammalian genomes
In general, most mammals carry large numbers of SINEs. Judging from hybridization patterns among the species examined (for example, the distribution of the CHR-2 SINE in FIG. 12), these SINEs are clearly specific to an order, suborder, superfamily, family, genus, or species. Such empirical evidence indicates that SINE families arose de novo in many ancestral mammalian lineages, although the mechanism by which they arise is not well understood. The reason so many SINEs and other retroposons are present in mammalian genomes appears to be that the reverse transcriptase (RTase) encoded by mammalian L1 changed the specificity of its template recognition in the common ancestor of mammals. Whereas many LINEs strictly recognize a 3' tail responsible for forming a stem-loop structure, this change made it possible to recognize the poly(A) tail required for retroposition (Ohshima et al. 1996; Okada et al. 1997; Kajikawa et al.). Such a scenario could have allowed poly(A)-containing RNAs to become pseudogenes through the L1 RTase in mammalian genomes.

FIG. 15 shows a recently proposed phylogenetic tree of mammals (Waddell et al. 1999; Cao et al. 2000; Nikaido et al. 2000). The mammalian SINE families characterized to date are marked on FIG. 15. Briefly, the oldest SINE family, distributed across all mammalian genomes, is MIR (Smit and Riggs 1995; Jurka et al. 1995). The Alu family is clearly specific to primate genomes, but its distribution among close relatives such as colugos has not been examined in detail. The rodent B1, B2, and ID families are specific to the genomes of Rodentia. The rabbit C family has been reported in lagomorph genomes, but its distribution among related species has not been reported. The SINE families present in cetartiodactyl genomes, such as CHR-1, CHR-2, CHRS, CHRS-S, PRE-1, and Bov-tA, have been examined in detail (Shimamura et al. 1997; Shimamura et al. 1999; Nikaido et al. 1999). CanSINEA was first reported from the genomes of Canidae (Minnick et al. 1992; Coltman and Wright 1994) and was later shown to have arisen during evolution in the genomes of many other carnivores (van der Vlugt and Lenstra 1995). A horse SINE named the ERE family has been reported and its distribution examined (Sakagami et al. 1994; Gallagher et al. 1999). A bat SINE family was isolated by Borodulina and Kramerov (1999) and named VES, and another bat SINE family has also been characterized recently (Kawai et al.). An elephant SINE family was recently isolated and shown to be distributed among the species of Afrotheria (Nikaido and Okada). Here, SINEs that derive from different tRNAs or have entirely different sequences are distinguished as families, whereas SINEs that share the same origin but show diagnostic mutations are distinguished as types or subfamilies. A given SINE increases its copy number explosively during a certain period, and the relation between the timing of this burst of amplification and the time axis of organismal evolution is thought to link the distribution of SINEs to the phylogenetic relationships of their hosts.

Importantly, although many SINE families have been isolated from mammalian genomes to date, they have not yet been characterized down to their subfamily structure. In accordance with the present invention, these subfamily structures can be elucidated in the future, and new mammalian SINE families or subfamilies can also be obtained.

ＳＩＮＥの新しい増幅：固定及び短期間の種分岐
ＳＩＮＥがゲノムの進化においてひじょうに新しく増幅され、そして１の種の集団の間で固定されていない場合、共通派生形質としての地位は不安定であり、そしてそれは系統樹作成のために使用されるべきではない。しかしながら、このようなＳＩＮＥの分布は、集団構造の分析のためには使用されることができる (例えば、Hamada et al. 1998)。種分岐が短期間に生じた場合、すなわち、大部分のＳＩＮＥが遺伝的浮動を介して集団の間に固定される前には、先祖の多型性その後の不完全な系統分類は矛盾したＳＩＮＥ挿入パターンを作り出す。この現象は、ＳＩＮＥレトロポジションの不可逆な性質と結合して、爆発的に派生した分類群における系統分類の歴史的パターンを調べるためにＳＩＮＥを使用するためのバイアスを提供する。先祖に多型性があり、その後不完全にSINEが子孫に分配され、矛盾する挿入パターンを作り出した場合には、系統樹作成法の一つである最節約法を用いて、各遺伝子座における挿入の存在又は非存在についてのＳＩＮＥキャラクター・マトリックスを評価することが有用である。固定されていない、多形性ＳＩＮＥは、高レベルの系統関係のためには有用ではないけれども、それらは、集団分析のための優れた分類学的ツールとなることが知られている(Deininger and Batzer, 1994, Stoneking et al, 1997; Hamada et al, 1998)。したがって、本願明細書に記載する方法に従って、ＳＩＮＥファミリー又はサブファミリーを同定し、かつ、特徴付けすることにより、かかるＳＩＮＥファミリー又はサブファミリーを利用して動物の種判別方法を行うことができる。 Recent amplification of SINEs: fixation and rapid species divergence
If a SINE was amplified very recently in the evolution of the genome and is not yet fixed among the populations of a species, its status as a shared derived character is unstable, and it should not be used for phylogenetic tree construction. Such SINE distributions can, however, be used to analyze population structure (e.g., Hamada et al., 1998). When species divergence occurs over a short period, that is, before most SINEs have been fixed among populations by genetic drift, ancestral polymorphism followed by incomplete lineage sorting produces conflicting SINE insertion patterns. This phenomenon, combined with the irreversible nature of SINE retroposition, introduces a bias when SINEs are used to examine historical patterns of lineage sorting in explosively radiating taxa. Where ancestral polymorphism was followed by incomplete sorting of SINEs among the descendants, producing conflicting insertion patterns, it is useful to evaluate the SINE character matrix of insertion presence or absence at each locus by maximum parsimony, one of the tree-building methods. Although unfixed, polymorphic SINEs are not useful for resolving higher-level phylogenetic relationships, they are known to be excellent taxonomic tools for population analysis (Deininger and Batzer, 1994; Stoneking et al., 1997; Hamada et al., 1998). Therefore, by identifying and characterizing a SINE family or subfamily according to the method described herein, that SINE family or subfamily can be used in a method for discriminating animal species.
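The presence/absence character matrix discussed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the taxon names, locus names, and insertion states are hypothetical placeholders, and a pairwise compatibility check (two irreversible insertions fit the same tree only if their carrier sets are nested or disjoint) is used as a simple stand-in for a full maximum-parsimony search.

```python
from itertools import combinations

# Hypothetical SINE character matrix: 1 = insertion present, 0 = absent.
# Taxon and locus names are placeholders, not data from this specification.
MATRIX = {
    "locus1": {"taxonA": 1, "taxonB": 1, "taxonC": 0, "taxonD": 0},
    "locus2": {"taxonA": 1, "taxonB": 1, "taxonC": 1, "taxonD": 0},
    "locus3": {"taxonA": 0, "taxonB": 1, "taxonC": 1, "taxonD": 0},
}

def carriers(locus):
    """Return the set of taxa that carry the insertion at a locus."""
    return {taxon for taxon, state in MATRIX[locus].items() if state == 1}

def compatible(locus_x, locus_y):
    """Two irreversible insertions fit one tree if their carrier sets
    are nested or disjoint."""
    x, y = carriers(locus_x), carriers(locus_y)
    return x <= y or y <= x or not (x & y)

def conflicting_pairs():
    """List the locus pairs whose insertion patterns contradict each other."""
    return [(a, b) for a, b in combinations(sorted(MATRIX), 2)
            if not compatible(a, b)]

print(conflicting_pairs())
```

Pairs reported by `conflicting_pairs()` are the candidates for ancestral polymorphism with incomplete lineage sorting; they would then be weighed against the remaining loci rather than discarded outright.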

フランキング配列の機能及び価値（有用性）
ＳＩＮＥ挿入について調べられた遺伝子座内のヌクレオチド・フランキング配列の情報は有用である。もちろん挿入データは系統樹の作成及び本願発明に係る方法に使用するために有用であるけれども、フランキング配列との挿入配列の統合は、独立したＳＩＮＥ挿入事件により定められる分岐群間の枝の長さについての情報を提供する (Lum et al, 2000; Shedlock and Okada, 2000)。さらに、与えられた挿入要素と会合したフランキング配列は文字通り連結されているので、ＳＩＮＥ由来のトポロジー対フランキング配列の間の一貫性は、各遺伝子座における不可逆挿入の基本的仮定の評価への数多くのアプローチを提供する (Lum et al, 2000)。独立したＳＩＮＥ遺伝子座にホモプラシー（成因的相同）、又は形質矛盾が存在する場合、系統発生樹の間には明らかに矛盾があるかもしれない。このようなアプローチはＳＩＮＥ分析の新たな次元として始まりつつあり、そしてＳＩＮＥ法の統計学的評価を高めるための基礎を提供する。 Function and value of flanking sequences (usefulness)
Information on the nucleotide sequences flanking the SINE insertion at each locus examined is useful. The insertion data themselves are of course useful for constructing phylogenetic trees and for the method of the present invention, but combining the insertion data with the flanking sequences also provides information on the branch lengths between clades defined by independent SINE insertion events (Lum et al., 2000; Shedlock and Okada, 2000). Furthermore, because the flanking sequences associated with a given inserted element are physically linked to it, the consistency between SINE-derived topologies and flanking-sequence topologies offers a number of ways to evaluate the basic assumption of irreversible insertion at each locus (Lum et al., 2000). If homoplasy or character conflict exists at independent SINE loci, the resulting phylogenetic trees may clearly conflict with one another. Such approaches are emerging as a new dimension of SINE analysis and provide a basis for strengthening the statistical evaluation of the SINE method.

以下、本発明を実施例により詳細に説明する。但し、以下の実施例により本発明の技術的範囲が限定されるものではない。 Hereinafter, the present invention will be described in detail with reference to examples. However, the technical scope of the present invention is not limited by the following examples.

材料及び方法
緩衝液、酵素類その他試薬
核酸の操作に用いる種々の緩衝液、酵素、大腸菌の培養に用いる培地、その他の試薬類は、和光純薬（株）、Sigma社、Difco社、FMC社より購入したものをSambrook et al. (1989)の文献を参考にして調整した。各種制限酵素、修飾酵素、ベクターDNAは、宝酒造（株）、東洋紡績（株）、アマシャムライフサイエンス社より購入した。ラジオアイソトープは、第一化学薬品（株）より購入した。PCRプライマーに関してはOligoExpress(アマシャムファルマシアバイオテク（株）)に合成を委託注文したものを使用した。 Materials and methods
Buffers, enzymes and other reagents
The buffers, enzymes, E. coli culture media, and other reagents used for nucleic acid manipulation were purchased from Wako Pure Chemical Industries, Ltd., Sigma, Difco, and FMC and were prepared with reference to Sambrook et al. (1989). Restriction enzymes, modifying enzymes, and vector DNA were purchased from Takara Shuzo Co., Ltd., Toyobo Co., Ltd., and Amersham Life Science. Radioisotopes were purchased from Daiichi Pure Chemicals Co., Ltd. PCR primers were custom-synthesized by OligoExpress (Amersham Pharmacia Biotech).

ゲノムDNA
本研究に使用した各種サンプル（組織もしくはDNA）について以下に示す。尚解析に用いた鯨、偶蹄類の英名、和名、学名等について以下の表1： Genomic DNA
The samples (tissue or DNA) used in this study are described below. The English, Japanese, and scientific names of the whales and artiodactyls analyzed are shown in Table 1 below.

に示す。
フタコブラクダは東京大学医学部の吉田穣博士より提供された筋肉組織からDNAを抽出した。ブタについては当研究室に保管されていたDNAをそのまま用いた。ペッカリーは農水省畜産試験場の安江博博士より提供されたDNAを用いた。ジャワマメジカは米国サンディエゴ動物園から提供されたDNAを用いた。アクシスジカ、アミメキリン、セーブルアンテロープ、ニホンカモシカ、マーコール、ムフロン、カバに関しては千葉市立動物公園の宗近功氏より提供されたDNAを用いた。ヒツジは長野県上田食肉衛生検査場の向井康氏より提供された肝臓から抽出したものを使用した。ウシに関しては神奈川県食肉衛生試験場相模出張所より提供された腎臓からDNAを抽出して用いた。 Shown in
For the Bactrian camel, DNA was extracted from muscle tissue provided by Dr. Kei Yoshida of the University of Tokyo School of Medicine. For the pig, DNA stored in our laboratory was used as is. For the peccary, DNA provided by Dr. Hiroshi Yasue of the MAFF Livestock Experiment Station was used. For the Java mouse deer, DNA provided by the San Diego Zoo (USA) was used. For the axis deer, reticulated giraffe, sable antelope, Japanese serow, markhor, mouflon, and hippopotamus, DNA provided by Mr. Isao Munechika of the Chiba City Zoological Park was used. Sheep DNA was extracted from liver provided by Mr. Yasushi Mukai of the Ueda Meat Sanitation Inspection Center in Nagano Prefecture. Cattle DNA was extracted from kidney provided by the Sagami Branch of the Kanagawa Meat Sanitation Laboratory.

イッカク、オオギハクジラのサンプルについては、千葉県立中央博物館の宮正樹博士より筋肉と思われる組織片を入手しDNAを抽出して用いた。オオギハクジラのサンプルに関してはアカボウクジラ科オオギハクジラ属であることは判明しているが種名は不明である。カワイルカ類を除くその他のサンプルは、水産庁遠洋水産研究所大型鯨類研究室の加藤秀弘博士より提供された各種組織片からDNAを抽出して用いた。カワイルカ類のサンプルについては、アメリカ合衆国南西漁業科学センターのブラウネル博士（Dr. Robert L. Brownell, Jr.: Chief Marine Mammal Division, Southeast Fisheries Science Center, P.O. BOX 271, La Jolla, California 92038）により日本国への入手手続きを行ってもらった。それに基づいて、アマゾンカワイルカの皮下組織片、ラプラタカワイルカの肝臓組織片についてはカリフォルニア州立大学バークレー校のHealy H. Hamilton 氏より提供を受けDNAを抽出した。ガンジスカワイルカは乾燥骨サンプルを国立科学博物館の山田格氏より提供されDNAを抽出した。ヨウスコウカワイルカに関しては中国科学院水生生物研究所の付属水族館に飼育されている個体から鮮血を採取しその場でヘパリン処理したものを研究所へ搬送し即座にDNAを抽出した。抽出したDNAは、日本への持ち込みが不可能なため現在は中国の研究所において保管されている。 For the narwhal and the beaked whale, tissue fragments believed to be muscle were obtained from Dr. Masaki Miya of the Chiba Prefectural Central Museum, and DNA was extracted from them. The beaked whale sample is known to belong to the genus Mesoplodon of the family Ziphiidae, but the species is unknown. For the other samples, except those of the river dolphins, DNA was extracted from tissue fragments provided by Dr. Hidehiro Kato of the Large Cetacean Laboratory, National Research Institute of Far Seas Fisheries, Fisheries Agency. For the river dolphin samples, Dr. Brownell of the U.S. Southwest Fisheries Science Center (Dr. Robert L. Brownell, Jr.: Chief Marine Mammal Division, Southeast Fisheries Science Center, P.O. BOX 271, La Jolla, California 92038) arranged the procedures for bringing them into Japan. On that basis, a subcutaneous tissue fragment of the Amazon river dolphin and a liver tissue fragment of the La Plata river dolphin were provided by Healy H. Hamilton of the University of California, Berkeley, and DNA was extracted from them. For the Ganges river dolphin, DNA was extracted from a dried bone sample provided by Tadasu Yamada of the National Science Museum. For the Yangtze river dolphin, fresh blood was collected from an individual kept at the aquarium attached to the Institute of Hydrobiology, Chinese Academy of Sciences, heparinized on the spot, and transported to the institute, where DNA was extracted immediately. Because the extracted DNA cannot be brought into Japan, it is currently stored at the institute in China.

ゲノムDNAの抽出
解析に用いた動物種の組織からのゲノムDNA抽出は、簡便法（Rapid法;ProteinaseKを用いる方法）で行った。抽出したゲノムDNAはライブラリーの作製及びPCRの鋳型として使用した。また、それらは４℃で保管している。−４０℃のフリザーに保存しておいた鯨、偶蹄類の組織（肝臓、筋肉）をメスで数ミリ角に切り出し、マイクロチューブに移してから組織の重量を測った後、そのマイクロチューブに1X TNE buffer (10mM Tris-HCl(pH8.0),100mM NaCl, 1mM EDTA (pH 8.0))を500μl加えた。チューブ内の組織片をパスツールピペットを用いて細かく粉砕したらその、組織及びTNE bufferに1X Lysis buffer (20mg/ml Proteinase K,2% SDS, 10mM Tris-HCl(pH8.0), 150mM NaCl, 10mM EDTA (pH8.0)) を加え、転倒撹拌した後、55℃のウォーターバスを用いてインキュベートし組織片を完全に溶解させた。その溶液と等量のフェノールを加え、ベリーダンサー(200〜300rpm)で１〜２時間撹拌した。遠心分離後 (室温、12000rpm, 5min), 上清を別の2.0ml マイクロチューブに移した。同様にしてフェノール／クロロホルム抽出に続きクロロホルム抽出を繰り返し、最終的な上清に0.1倍量の3M 酢酸ナトリウム及び2倍量のエタノールを加えた。この際多くの場合ファイバーが確認され、その後70%エタノールを用いてリンスした後適当量のTE bufferを加えペレットを溶解させた。ヨウスコウカワイルカの血液サンプルからのDNAの抽出は市販のカラム（QIAamp Tissue/Blood Kit Cat. No.29306: キアゲン（株）社製）を用いて行った。 Extraction of genomic DNA
Genomic DNA was extracted from the tissues of the animal species analyzed by a simple method (the Rapid method, which uses Proteinase K). The extracted genomic DNA was used for library construction and as a PCR template, and it is stored at 4 °C. Tissue (liver or muscle) from whales and artiodactyls that had been stored in a -40 °C freezer was cut into pieces a few millimeters square with a scalpel and transferred to a microtube; after the tissue was weighed, 500 μl of 1X TNE buffer (10 mM Tris-HCl (pH 8.0), 100 mM NaCl, 1 mM EDTA (pH 8.0)) was added to the microtube. The tissue in the tube was finely ground with a Pasteur pipette, 1X Lysis buffer (20 mg/ml Proteinase K, 2% SDS, 10 mM Tris-HCl (pH 8.0), 150 mM NaCl, 10 mM EDTA (pH 8.0)) was added to the tissue and TNE buffer, and the tube was mixed by inversion and then incubated in a 55 °C water bath until the tissue was completely dissolved. An equal volume of phenol was added to the solution, and the mixture was agitated on a Belly Dancer shaker (200-300 rpm) for 1-2 hours. After centrifugation (room temperature, 12,000 rpm, 5 min), the supernatant was transferred to a fresh 2.0 ml microtube. Phenol/chloroform extraction and then chloroform extraction were performed in the same manner, and 0.1 volume of 3 M sodium acetate and 2 volumes of ethanol were added to the final supernatant. In most cases DNA fibers were visible at this point; the precipitate was then rinsed with 70% ethanol, and an appropriate amount of TE buffer was added to dissolve the pellet. DNA was extracted from the blood sample of the Yangtze river dolphin with a commercially available column (QIAamp Tissue/Blood Kit, Cat. No. 29306; Qiagen).
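The ethanol precipitation ratios in the extraction procedure can be turned into pipetting volumes with a few lines of Python. This is a minimal sketch; it assumes that both ratios (0.1 volume of 3 M sodium acetate and 2 volumes of ethanol) are taken relative to the recovered supernatant volume, which the text does not state explicitly, and the function name is hypothetical.

```python
def precipitation_volumes(supernatant_ul):
    """Volumes to add for ethanol precipitation of a DNA supernatant.

    Assumption: both ratios are relative to the supernatant volume
    (0.1 volume of 3 M sodium acetate, 2 volumes of ethanol).
    """
    if supernatant_ul <= 0:
        raise ValueError("supernatant volume must be positive")
    return {
        "sodium_acetate_3M_ul": round(0.1 * supernatant_ul, 1),
        "ethanol_ul": round(2.0 * supernatant_ul, 1),
    }

print(precipitation_volumes(400))
```

For a 400 μl supernatant the total volume after both additions is 1,240 μl, which still fits in the 2.0 ml microtube mentioned in the procedure.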

フランキングPCR（主に相同遺伝子座の増幅に用いる）
フランキングPCRとはSINE配列を挟む様にその両側（上流及び下流）の近傍領域にプライマーを設計し、ゲノムDNAを鋳型としてPCRを行う事を意味している。その際に用いるプライマーは長さにしておよそ17〜30ヌクレオチドでTm(Melting Temperature)値はおおよそ55℃周辺になるように設計した。その設計に関しては、マッキントッシュ版フリーソフトとしてインターネット上でダウンロードが可能なCPrimer (Ver.1.08, Bristol and Anderson(1995))を使用した。
反応組成：Template DNA (100 〜 500 ng); 10 X PCR Buffer (100 mM Tris -HCl (pH8.3), 500 mM KCl, 15 mM MgCl₂ 5 μl; dNTP Mixture (2.5 mM each) 4 μl; Forward and Reverse Primer (5 pmol/μl) 2 μl each; TaKaRa TaqTM (5 U/μl) 0.25 μl; adds ddH₂O up to final volume 50 μl。
反応条件： 94 ℃（Pre-denature） 2〜3 min.; and 30 cycles of 94 ℃ (Denature) 30 sec.; 45 ℃〜60 ℃(Annealing) 1 min; 72 ℃ (Extension) 30〜90sec.。この際Annealingの温度はプライマーのTm値はもちろんのこと、その遺伝子座の増えやすさなどに応じて適宜変えていき最適な温度を探した。 Flanking PCR (mainly used for amplification of homologous loci)
Flanking PCR refers to PCR performed on genomic DNA as the template with primers designed in the regions immediately upstream and downstream of a SINE sequence so that they flank it. The primers were designed to be approximately 17 to 30 nucleotides long with a Tm (melting temperature) of around 55 °C. They were designed with CPrimer (Ver. 1.08; Bristol and Anderson, 1995), free software for the Macintosh that can be downloaded from the Internet.
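The primer design criteria above (17 to 30 nucleotides, Tm around 55 °C) can be checked with a short Python sketch. The algorithm used by CPrimer is not described in this specification, so the Wallace rule, Tm = 2(A+T) + 4(G+C), is swapped in here as a rough approximation; the tolerance window and the example sequence are hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def meets_design_criteria(primer, tm_target=55, tm_tolerance=5):
    """Check the 17-30 nt length range and a Tm near the target.

    tm_tolerance is a hypothetical window; the text only says 'around 55'.
    """
    return (17 <= len(primer) <= 30
            and abs(wallace_tm(primer) - tm_target) <= tm_tolerance)

example = "AGCTGACTGCATGACTGA"  # hypothetical 18-mer, not a primer from Table 3
print(len(example), wallace_tm(example), meets_design_criteria(example))
```

The Wallace rule tends to overestimate Tm for longer primers, so a nearest-neighbor calculation would be a better choice for primers near the 30-nucleotide end of the range.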
Reaction composition: template DNA (100-500 ng); 10X PCR Buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl, 15 mM MgCl₂), 5 μl; dNTP Mixture (2.5 mM each), 4 μl; forward and reverse primers (5 pmol/μl), 2 μl each; TaKaRa Taq™ (5 U/μl), 0.25 μl; ddH₂O to a final volume of 50 μl.
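The final concentrations implied by this reaction composition follow from simple dilution arithmetic (stock concentration × volume added / 50 μl). The sketch below works them out in Python. The template volume is an assumption, because the recipe gives the template as a mass (100-500 ng) rather than a volume, and the function and dictionary names are hypothetical.

```python
FINAL_VOLUME_UL = 50.0

# (stock concentration, volume per reaction in ul), taken from the recipe above.
COMPONENTS = {
    "PCR buffer (X)":        (10.0, 5.0),
    "MgCl2 (mM)":            (15.0, 5.0),   # supplied in the 10X buffer
    "each dNTP (mM)":        (2.5, 4.0),
    "each primer (pmol/ul)": (5.0, 2.0),
    "Taq (U/ul)":            (5.0, 0.25),
}

def final_concentration(stock, volume_ul, total_ul=FINAL_VOLUME_UL):
    """Concentration after dilution into the final reaction volume."""
    return stock * volume_ul / total_ul

def water_volume(template_ul, total_ul=FINAL_VOLUME_UL):
    """ddH2O needed to reach the final volume for a given template volume."""
    fixed = 5.0 + 4.0 + 2.0 + 2.0 + 0.25   # buffer, dNTPs, two primers, Taq
    water = total_ul - fixed - template_ul
    if water < 0:
        raise ValueError("template volume exceeds the space left in the reaction")
    return water

for name, (stock, vol) in COMPONENTS.items():
    print(name, final_concentration(stock, vol))
print("water for 1 ul of template:", water_volume(1.0))
```

Under these figures each reaction contains 1X buffer, 1.5 mM MgCl₂, 0.2 mM of each dNTP, 10 pmol of each primer, and 1.25 U of Taq polymerase.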
Reaction conditions: 94 °C (pre-denaturation), 2-3 min; then 30 cycles of 94 °C (denaturation), 30 sec; 45-60 °C (annealing), 1 min; 72 °C (extension), 30-90 sec. The annealing temperature was varied as appropriate according to the Tm of the primers and how readily each locus amplified, in order to find the optimum temperature.
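Because the flanking primers sit on either side of the insertion site, a locus that carries the SINE yields a longer product than the empty site. The Python sketch below shows one way such bands might be read; it is an illustration only, and the expected sizes, the SINE length, and the sizing tolerance are all hypothetical inputs rather than values from this specification.

```python
def classify_band(observed_bp, empty_site_bp, sine_bp, tolerance_bp=30):
    """Classify a flanking-PCR band by comparing it with the expected sizes.

    empty_site_bp: expected product without the insertion (hypothetical input).
    sine_bp: length of the inserted SINE (hypothetical input).
    tolerance_bp: allowed sizing error on the gel (assumed value).
    """
    if abs(observed_bp - (empty_site_bp + sine_bp)) <= tolerance_bp:
        return "insertion present"
    if abs(observed_bp - empty_site_bp) <= tolerance_bp:
        return "insertion absent"
    return "unexpected size"

print(classify_band(620, empty_site_bp=300, sine_bp=310))
print(classify_band(305, empty_site_bp=300, sine_bp=310))
```

The check is only meaningful when the SINE is longer than twice the tolerance, so that the two expected sizes cannot be confused; a band of unexpected size would call for sequencing or Southern hybridization before any conclusion is drawn.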

コロニーPCR
コロニーPCRはサブクローニングを行う時、目的のDNA断片がインサートとしてあるクローンに含まれているか否かを選別するのに大変簡便な方法である。従来はプラスミドDNAをMiniprepによって調整した後に、制限酵素で消化してそのインサートの有無を確認していたが、この行程をPCRによるチェックのみで済ませる事が可能となるので、時間と労力の大幅な短縮につながる。まずプレートに生えた大腸菌のコロニーを爪楊枝で軽くつつき0.5 ml PCR 用マイクロチューブに擦り付け、それを鋳型にして通常のPCRを20 μlの反応系にして行うだけである。この際用いるプライマーはベクターに特異的な配列に基づいて作製されている。PCR反応における熱変性の際に大腸菌の細胞膜が破壊され鋳型となるベクターDNAが溶液中に溶け出すことでこのPCRが可能になる。後で述べるがこのPCRによってインサートの有無を確認した後、このPCR産物を直接ABIシークエンサーを用いて配列決定の鋳型とすることが可能なので、インサートチェックの直後にそのインサートのシークエンスも可能なので時間的にもかなりの短縮になった。 Colony PCR
Colony PCR is a very convenient way to check, during subcloning, whether a clone contains the target DNA fragment as an insert. Conventionally, plasmid DNA was prepared by miniprep and digested with restriction enzymes to confirm the presence of the insert; because this step can be replaced by a PCR check alone, considerable time and labor are saved. An E. coli colony growing on a plate is simply touched lightly with a toothpick and rubbed onto the wall of a 0.5 ml PCR microtube, and ordinary PCR is performed in a 20 μl reaction with this material as the template. The primers used are designed from vector-specific sequences. During heat denaturation in the PCR, the E. coli cell membrane is disrupted and the vector DNA that serves as the template is released into the solution, which makes this PCR possible. As described later, once the presence of the insert has been confirmed, the PCR product can be used directly as a template for sequencing on the ABI sequencer, so the insert can be sequenced immediately after the insert check, which also saves considerable time.

塩基配列の決定
塩基配列の決定には以下に示すシークエンサー及びシークエンス反応キットを使用した。LI-COR dNA Sequencer (Model 4000) を用いたシークエンス：SequiTherm EXCELL（商標）II Long-ReadTM DNA Sequence Kit-LC (Cat. No. SE7701LC, EPICENTRE TECHNOLOGIES社製)；ABI PRIZM（商標） 310 Genetic Analyser を用いたシークエンス： BigDye Terminator Cycle Sequencing FS Ready Reaction Kit (P/N 4303152, Perkin Elmer社製)。
サブクローニングしたプラスミドDNAをシークエンス反応の鋳型に用いる場合は、 SDS-Alkaline and Mg沈殿法を用いて調整したものを使用した。
スクリーニングによって単離したSINE配列を含んだ遺伝子座の塩基配列を決定する場合はベクター配列上のM4及びRV、そしてSINEのコンセンサス配列に基づいて作製した蛍光の付加しているオリゴヌクレオチドプライマー（アロカ（株）社製）を用いた。
PCR産物をダイレクトシークエンスする場合は、PCR産物にShrimp Alkaline Phosphatase (SAP) 2U/μl 0.5μl; Exonuclease I 10U/μl 0.5 μl (共にアマシャムライフサイエンス社製)を直接加え、37 ℃で30 分インキュベートしてオリゴヌクレオチドを分解した後、85 ℃で15 分間インキュベートして酵素を失活させたものをシークエンス反応の鋳型として用いた。 Determination of nucleotide sequences
The following sequencers and sequencing reaction kits were used to determine nucleotide sequences. Sequencing on the LI-COR DNA Sequencer (Model 4000): SequiTherm EXCELL™ II Long-Read™ DNA Sequence Kit-LC (Cat. No. SE7701LC; EPICENTRE TECHNOLOGIES). Sequencing on the ABI PRISM™ 310 Genetic Analyzer: BigDye Terminator Cycle Sequencing FS Ready Reaction Kit (P/N 4303152; Perkin Elmer).
When subcloned plasmid DNA was used as the template for the sequencing reaction, it was prepared by the SDS-alkaline lysis and Mg precipitation method.
To determine the nucleotide sequence of a locus containing a SINE sequence isolated by screening, fluorescently labeled oligonucleotide primers (Aloka Co., Ltd.) were used: M4 and RV, which anneal to the vector sequence, and primers designed from the SINE consensus sequence.
For direct sequencing of PCR products, 0.5 μl of Shrimp Alkaline Phosphatase (SAP; 2 U/μl) and 0.5 μl of Exonuclease I (10 U/μl) (both from Amersham Life Science) were added directly to the PCR product, which was incubated at 37 °C for 30 minutes to degrade the oligonucleotides and then at 85 °C for 15 minutes to inactivate the enzymes. The treated product was used as the template for the sequencing reaction.

ゲノム・ライブラリーの作製
スクロース勾配
ゲノムDNA約50 μgを適当な制限酵素（本研究においては主にHind IIIを使用した）で完全消化し、フェノール／クロロホルム抽出、エタノール沈殿を行った後、200 μ l の TE buffer に溶解した。15 ml の超遠心分離用プラスチックチューブ（Centrifuge Tubes - 50 Ultra - Clear（商標）Tube 14 X 89 mm, Order No. 344059: BECKMAN 社製）に10-40 %スクロース勾配を作製し、この上にDNA 溶液を加え、ローターに取り付け、超遠心分離機（L8-70M, Serial No. 7C869: BECKMAN 社製）で遠心分離（25000 rpm., 15 ℃, 15 時間）した。遠心後、1.5 ml マイクロチューブに10〜15滴ずつ滴下して分画（約250〜400 μl/フラクション）した。その分画を0.7〜1 %アガロースゲル電気泳動し、約2〜4 kbpの断片を含むフラクションを挟むように4〜6本のフラクションチューブを選別しエタノール沈殿を行った後、適当量のTE bufferに溶解した。再び0.7〜1 %アガロース電気泳動を行い、ライブラリー作製に用いるフラクションを決定した。 Genome library creation
Sucrose gradient
Approximately 50 μg of genomic DNA was completely digested with an appropriate restriction enzyme (mainly HindIII in this study), subjected to phenol/chloroform extraction and ethanol precipitation, and dissolved in 200 μl of TE buffer. A 10-40% sucrose gradient was prepared in a 15 ml plastic ultracentrifuge tube (Centrifuge Tubes - 50 Ultra-Clear™ Tube, 14 X 89 mm, Order No. 344059; BECKMAN), the DNA solution was layered on top, and the tube was mounted in the rotor and centrifuged in an ultracentrifuge (L8-70M, Serial No. 7C869; BECKMAN) at 25,000 rpm and 15 °C for 15 hours. After centrifugation, the gradient was fractionated by collecting 10-15 drops at a time into 1.5 ml microtubes (about 250-400 μl per fraction). The fractions were run on a 0.7-1% agarose gel, and four to six fraction tubes bracketing the fraction that contained fragments of about 2-4 kbp were selected, ethanol-precipitated, and dissolved in an appropriate amount of TE buffer. These were run again on a 0.7-1% agarose gel to decide which fraction to use for library construction.

プラスミド・ライブラリーの構築
今回の実験においてはpUC18/HindIIIもしくはpUC19/HindIIIを使用したプラスミド・ライブラリーのみを用い、ファージライブラリーは作製しなかった。それは鯨、偶蹄類ゲノム中にはSINEのコピー数が十分量存在し、ファージライブラリーを使用しなくても十分量のポジティブクローンが得られる事が、当研究室での経験上わかっていたため、時間の節約を考えて比較的操作行程の少ないプラスミド・ライブラリーを活用した。プラスミド・ライブラリーの作製は以下の様にして行った。組成はTaKaRa Ligation Kit Ver.1 A液, 12 μl; 同 B液, 1.5 μl; Vector DNA (pUC18 or 19/HindIII〜100 ng/μl), 0.5 μl; Insert DNA (Genomic DNA Sucrose Density Gradient Fraction), 1.0 μl : 以上の溶液を0.5 mlチューブ中で混合した後、クールブロック上で16 ℃で30分以上放置した。このゲノム・ライブラリーは-20 ℃で保存している。 Construction of plasmid library In this experiment, only a plasmid library using pUC18 / HindIII or pUC19 / HindIII was used, and no phage library was prepared. In our laboratory, we know that there are enough copies of SINE in the genomes of whales and cloven-hoofed animals, and that enough positive clones can be obtained without using a phage library. In order to save time, a plasmid library with relatively few operation steps was used. A plasmid library was prepared as follows. Composition: TaKaRa Ligation Kit Ver.1 solution A, 12 μl; solution B, 1.5 μl; Vector DNA (pUC18 or 19 / HindIII to 100 ng / μl), 0.5 μl; Insert DNA (Genomic DNA Sucrose Density Gradient Fraction), 1.0 μl: The above solution was mixed in a 0.5 ml tube and allowed to stand at 16 ° C. for 30 minutes or longer on a cool block. This genomic library is stored at -20 ° C.

SINE配列を含むクローン単離までの流れ
図１６にＳＩＮＥ法における実験の流れを示す。
スクリーニング
スクリーニングに使用するメンブレンの作製
0.5 mlチューブに前述のゲノム・ライブラリー混合液を1 μl、大腸菌（ E.coli JM105株）のコンピテントセルを適当量加えて混合した後、氷上で30分間静置した。42〜45 ℃で 45 秒間ヒートショックした後、あらかじめ 37 ℃ でインキュベートしておいた L/amp/X-gal/IPTG プレートにプレーティングし、37 ℃インキュベーターに一晩放置した。プレート１枚当たりのコロニー数が200〜300個になるように調整し、プレート10〜20枚分のライブラリーをまいた。このプレート上に生えたコロニーをナイロンメンブレン(Colony/Plaque Screen（商標）NEF-978:NEN Research Products 社製)にトランスファーし、変性溶液（0.4 M NaOH, 0.6 M NaCl）, 中和溶液（1 M NaCl, 0.5 M Tris-HCl (pH7.0)）の順に３分間程度放置した。そのメンブレンは水分を切った後よく乾燥させておいた。コロニーをトランスファーした後のプレートは37℃でインキュベートして再度コロニーが十分に生えた状態にしておき、後述のポジティブクローンのピックアップをしやすいようにした。 Workflow up to the isolation of clones containing SINE sequences
FIG. 16 shows the experimental workflow of the SINE method.
Screening
Preparation of membranes for screening
In a 0.5 ml tube, 1 μl of the genomic library mixture described above was mixed with an appropriate amount of competent E. coli cells (strain JM105), and the tube was left on ice for 30 minutes. After heat shock at 42-45 °C for 45 seconds, the cells were plated on L/amp/X-gal/IPTG plates that had been pre-warmed at 37 °C and were left overnight in a 37 °C incubator. The plating density was adjusted to 200-300 colonies per plate, and enough library for 10-20 plates was plated. The colonies that grew on these plates were transferred to nylon membranes (Colony/Plaque Screen™ NEF-978; NEN Research Products), which were left for about 3 minutes each in denaturing solution (0.4 M NaOH, 0.6 M NaCl) and then in neutralization solution (1 M NaCl, 0.5 M Tris-HCl (pH 7.0)). The membranes were drained and dried thoroughly. After the transfer, the plates were incubated at 37 °C until the colonies had grown back sufficiently, so that the positive clones described later could be picked easily.
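The plating figures above give a rough sense of how much genomic DNA one screening round covers. The Python sketch below multiplies them out; it assumes that every colony carries a single insert in the 2-4 kbp range selected on the sucrose gradient, which is an idealization, and the function name is hypothetical.

```python
def screening_scale(colonies_per_plate, plates, insert_kbp):
    """Return (total colonies, total insert DNA screened in kbp).

    Assumes one insert of insert_kbp per colony (an idealization).
    """
    total_colonies = colonies_per_plate * plates
    return total_colonies, total_colonies * insert_kbp

low = screening_scale(200, 10, 2)    # fewest colonies, shortest inserts
high = screening_scale(300, 20, 4)   # most colonies, longest inserts
print(low, high)
```

Under these assumptions one round screens between 2,000 and 6,000 colonies, or roughly 4 to 24 Mbp of insert DNA.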

スクリーニングに使用するプローブの作製
スクリーニングにはオリゴヌクレオチドもしくはPCR産物を使用した。実際に使用したオリゴヌクレオチド配列を以下の表２： Preparation of probes used for screening Oligonucleotides or PCR products were used for screening. The oligonucleotide sequences actually used are listed in Table 2 below:

に示す。その作製方法はまず、オリゴヌクレオチドの場合：ddH2O 27〜37μl; Oligo Nucleotide (5 pmol/μl) 1 μl; 10 X T4 Polynucleotide Kinase Buffer 5 μl; T4 polynucleotide Kinase 2μl; [γ-32P]ATP 5〜15μlを0.5mlチューブに混合した後、37℃でインキュベートした。プローブとして用いたオリゴヌクレオチドの配列を上記表２に示す。PCR産物を用いる場合（Primer-Extension法）： Template DNA, 500 μg; Forward primer (12.5 nmol/μl) 2 μl; Reverse primer (12.5 nmol/μl) 2 μl dDTP Mixture (dATP, dGTP, dTTP) 2.5 μl; 以上の混合液を95 ℃で5分間熱変性し、続いて55 ℃で1分間アニールさせた後氷上に移した。さらに以下の試薬を順に混合した後60 ℃で30分間インキュベートした。10XBcaBEST（商標）Buffer 2.5 μl; BcaBEST（商標）DNA polymerase 2 μl; [α- 32P] dCTP。反応後のそれぞれのプローブはNICK Column (Sephadex G-50 DNA Grade: アマシャムファルマシアバイオテク（株）社製)で溶出させて精製した。RIのカウントは液体シンチレーションカウンターで測定した。（簡易的測定にガイガーカウンターも使用した。） The probes were prepared as follows. For oligonucleotide probes, ddH₂O, 27-37 μl; oligonucleotide (5 pmol/μl), 1 μl; 10X T4 Polynucleotide Kinase Buffer, 5 μl; T4 polynucleotide kinase, 2 μl; and [γ-³²P]ATP, 5-15 μl were mixed in a 0.5 ml tube and incubated at 37 °C. The sequences of the oligonucleotides used as probes are shown in Table 2 above. For PCR-product probes (primer-extension method), template DNA, 500 μg; forward primer (12.5 nmol/μl), 2 μl; reverse primer (12.5 nmol/μl), 2 μl; and dNTP mixture (dATP, dGTP, dTTP), 2.5 μl were mixed, heat-denatured at 95 °C for 5 minutes, annealed at 55 °C for 1 minute, and transferred to ice. The following reagents were then added in order, and the mixture was incubated at 60 °C for 30 minutes: 10X BcaBEST™ Buffer, 2.5 μl; BcaBEST™ DNA polymerase, 2 μl; [α-³²P]dCTP. After the reaction, each probe was purified by elution through a NICK Column (Sephadex G-50 DNA Grade; Amersham Pharmacia Biotech). Radioactivity was measured with a liquid scintillation counter (a Geiger counter was also used for quick checks).
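The water and [γ-³²P]ATP ranges in the oligonucleotide labeling recipe are linked: 5 μl of 10X kinase buffer implies a 50 μl reaction, so more ATP means less water. The Python sketch below makes that relationship explicit. The 50 μl total is inferred from the recipe rather than stated in it, and the function name is hypothetical.

```python
def kinase_water_ul(atp_ul, total_ul=50.0):
    """ddH2O volume for the T4 kinase labeling reaction.

    total_ul = 50 is inferred from 5 ul of 10X buffer, not stated in the text.
    """
    oligo_ul, buffer_10x_ul, kinase_ul = 1.0, 5.0, 2.0
    if not 5.0 <= atp_ul <= 15.0:
        raise ValueError("the recipe uses 5-15 ul of labeled ATP")
    return total_ul - (oligo_ul + buffer_10x_ul + kinase_ul + atp_ul)

for atp in (5.0, 10.0, 15.0):
    print(atp, kinase_water_ul(atp))
```

The two ends of the ATP range reproduce the 27-37 μl of water given in the recipe, which supports the inferred 50 μl total.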

ハイブリダイゼーション
メンブレンをハイブリバックに入れ、そこにプレハイブリ溶液（6 X SSC, 1 % SDS ) を適当量加えた。ハイブリバックをシーラーでパックし1時間以上インキュベートした。溶液を捨て、ハイブリ溶液（6 X SSC, 1 % SDS, 1 X Denhart's solution , Carrier DNA (Shared Herring Sperm DNA solution)を加えた後、さらに予め調整したプローブを95 ℃で3分間熱変性しておいたものを加え、またハイブリバックをシールした。その後42 ℃のウォーターバスで一晩（〜１５時間程度）インキュベートした。それらのメンブレンを適当量のウォッシュ溶液（2X SSC, 1 % SDS)で軽くすすぎカウントを測定し確認してから、さらに新たなウォッシュ溶液を用いてウォーターバスの温度を55〜60 ℃に設定してウォッシュを行った。 Hybridization
The membrane was placed in a hybridization bag, and an appropriate amount of prehybridization solution (6X SSC, 1% SDS) was added. The bag was closed with a heat sealer and incubated for at least 1 hour. The solution was discarded, and hybridization solution (6X SSC, 1% SDS, 1X Denhardt's solution, carrier DNA (sheared herring sperm DNA solution)) was added, followed by the previously prepared probe, which had been heat-denatured at 95 °C for 3 minutes. The bag was resealed and incubated overnight (about 15 hours) in a 42 °C water bath. The membranes were rinsed briefly with an appropriate amount of wash solution (2X SSC, 1% SDS), the counts were checked, and the membranes were then washed in fresh wash solution with the water bath set at 55-60 °C.

現像
ウォッシュ溶液を捨て新たなウォッシュ溶液で軽くすすいだ後、カウントを、ガイガーカウンターを用いて確認した。暗室内で台紙、メンブレン、X線フィルム、Intensifierの順にカセットを入れ、これを−８０ ℃のフリザーに入れて感光させた。暗室内でカセットからX線フィルムを取り出し、現像液、停止液、定着液の順にX線フィルムを入れて現像した。 Development
The wash solution was discarded, the membrane was rinsed briefly with fresh wash solution, and the counts were checked with a Geiger counter. In a darkroom, a backing sheet, the membrane, X-ray film, and an intensifying screen were placed in a cassette in that order, and the cassette was placed in a -80 °C freezer for exposure. The X-ray film was then removed from the cassette in the darkroom and developed by immersing it in developer, stop solution, and fixer, in that order.

Positive Clone の単離
現像したX線フィルムとライブラリーをまいたプレートを重ね合わせる様にしてそのポジティブクローンの位置を確認した。そのポジティブクローンだと思われるコロニーについてはMiniprepを行う前に、そのクローン中におけるSINE配列の有無の確認のためにベクターの配列ではなく、スクリーニングに用いたSINE配列のコンセンサス配列に基づいて作製したプライマーを用いてコロニーPCRを行った。その後、ポジティブクローンのプラスミドを調製し、LI-CORシークエンサーを用いてその近傍領域の配列決定を行った。その後の操作に関しては上述の通りであるので、以下、簡単に述べる。 Isolation of positive clones
The positions of the positive clones were identified by superimposing the developed X-ray film on the plates on which the library had been plated. For each colony thought to be a positive clone, colony PCR was performed before the miniprep to confirm the presence of a SINE sequence in the clone; for this purpose the primers were designed not from the vector sequence but from the consensus sequence of the SINE used for screening. Plasmids were then prepared from the positive clones, and the regions flanking the SINE were sequenced on the LI-COR sequencer. The subsequent operations are as described above and are therefore only outlined below.

まず、そのフランキング配列を基にしてプライマーを作製し、種々の生物のゲノムDNAを鋳型にしてフランキングPCRを行った。そして多くの場合、スクリーニングに用いた生物種の近縁種に関しては相同遺伝子座を簡単に増幅させることが可能であるが、それとは遠縁のグループもしくは分岐が古い時代に起こったと考えられる種の相同遺伝子座については、その近傍領域における塩基置換がより多く蓄積しているのでスクリーニングに用いた１種の配列に基づいたプライマーではアニールがうまくいかず、最初のフランキングPCRでは増幅できない場合もしばしば見られた。その際には増幅した様々な種の相同遺伝子座の塩基配列を決定してそれらのコンセンサスをとりそれを考慮してまた新たなフランキングプライマーを作製し再度PCRを行った。 First, primers were designed from the flanking sequences, and flanking PCR was performed with genomic DNA from various organisms as templates. In most cases the orthologous locus could be amplified easily in species closely related to the species used for screening. In distantly related groups, or in species thought to have diverged long ago, however, more base substitutions have accumulated in the flanking regions, so primers based on the sequence of the single species used for screening often failed to anneal, and the locus could not be amplified in the first flanking PCR. In such cases, the orthologous loci that had been amplified from various species were sequenced, a consensus was derived from them, new flanking primers were designed with that consensus in mind, and PCR was performed again.
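The consensus step described above can be sketched in Python. This is one possible reading, not the inventors' documented procedure: it assumes the orthologous flanking sequences are already aligned without gaps and encodes each variable position with the standard IUPAC ambiguity code, so that a degenerate primer could be ordered. The example sequences are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def degenerate_consensus(aligned):
    """Column-wise IUPAC consensus of equal-length, gap-free sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    columns = zip(*(seq.upper() for seq in aligned))
    return "".join(IUPAC[frozenset(column)] for column in columns)

# Hypothetical aligned flanking sequences from three species.
flanks = ["ACGTAC", "ACGTGC", "ATGTAC"]
print(degenerate_consensus(flanks))
```

Placing the 3' end of a new primer on a fully conserved stretch of the consensus, and keeping degenerate positions toward the 5' end, is a common way to use such a result.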

鯨・偶蹄目間の系統関係の推定の際にはアガロース・ゲル電気泳動の後、増幅されたバンドが相同遺伝子座なのか否かを確認するためにサザンハイブリダイゼーションを行った。その行程は以下に示す。アガロース・ゲル電気泳動後エチジウム・ブロマイド(EtBr)溶液で染色し、デジタルカメラによって泳動パターンを確認、撮影した。その後変性溶液（0.4 NaOH, 0.6M NaCl）に十分浸したメンブレン1枚、ろ紙2枚をこの順で、間に気泡が入らないようにゲルにのせてその上にJKワイパー、キムタオル（商標）、重しとして辞典などを載せ、一晩（〜15時間）放置した。メンブレンを中和溶液（1 M NaCl, 0.5 M Tris - HCl (pH 7.0)）に15分程度放置して中和した後十分に乾燥させた。その後のハイブリダイゼーション、現像などの操作は前述のスクリーニングの時と同じなので省略する。また鯨目内の系統関係推定の際には殆どの遺伝子座についてそれらの塩基配列を決定したので基本的にサザンハイブリダイゼーションによる確認は行わなかった。 To estimate the phylogenetic relationships between cetaceans and artiodactyls, Southern hybridization was performed after agarose gel electrophoresis to confirm whether the amplified bands were the orthologous loci. The procedure was as follows. After agarose gel electrophoresis, the gel was stained with ethidium bromide (EtBr) solution, and the band pattern was checked and photographed with a digital camera. One membrane and two sheets of filter paper that had been thoroughly soaked in denaturing solution (0.4 M NaOH, 0.6 M NaCl) were then placed on the gel in that order, taking care not to trap air bubbles; JK Wiper and Kim Towel™ were stacked on top, a weight such as a dictionary was placed on the stack, and the assembly was left overnight (about 15 hours). The membrane was neutralized by leaving it in neutralization solution (1 M NaCl, 0.5 M Tris-HCl (pH 7.0)) for about 15 minutes and was then dried thoroughly. The subsequent hybridization and development steps are the same as in the screening described above and are therefore omitted. For the phylogenetic relationships within Cetacea, the nucleotide sequences of most loci were determined, so confirmation by Southern hybridization was generally not performed.

実施例１：鯨偶蹄目ゲノム内におけるCHR-2各サブファミリーの分布
鯨の起源ついてはその内部系統について形態による分類と分子による分類で見解がわかれている。鯨目は現生の多くの種類を含む歯鯨亜目と髭鯨亜目、そしてこれらの祖先となったと考えられている原鯨亜目（ムカシクジラ亜目とも呼ばれている）の３つの亜目に分けられており(Fordyce et al., 1994)、現生の鯨類の分類はその分類名からも明らかなように、歯を持つか否かによって区別されている。しかし、髭鯨類も発生のかなり最後の方の段階まで歯が確認できるし原鯨類の化石種の中には当然のことながら歯鯨と髭鯨の中間段階にあると思われるような形質を備えたもの（つまり歯を持った髭鯨）も存在しているので、実際の問題として歯の存在だけで歯鯨の単系統性を主張することはできない。それは系統的に考えた時に髭鯨の「髭」はおそらく共有派生形質であるが歯鯨の「歯」という形質は原始形質とみなされるためである。しかし歯鯨のみがエコロケーションをする能力を持ち、それに付随するメロン体の存在も歯鯨の単系統性を示唆する最も有名な形質のひとつである。 Example 1: Distribution of the CHR-2 subfamilies in cetacean and artiodactyl genomes
Regarding the origin of whales and their internal phylogeny, morphological and molecular classifications disagree. The order Cetacea is divided into three suborders (Fordyce et al., 1994): Odontoceti (toothed whales) and Mysticeti (baleen whales), which include the many extant species, and Archaeoceti (archaic whales), which are thought to be ancestral to them. As the names indicate, extant cetaceans are classified by whether or not they have teeth. However, teeth can be observed in baleen whales until quite late in development, and some fossil archaeocetes naturally show characters that appear intermediate between toothed and baleen whales (that is, baleen whales with teeth), so in practice the monophyly of toothed whales cannot be asserted from the presence of teeth alone. This is because, in phylogenetic terms, the baleen of baleen whales is probably a shared derived character, whereas the teeth of toothed whales are regarded as a primitive character. Nevertheless, only toothed whales are capable of echolocation, and the associated melon is one of the best-known characters suggesting that toothed whales are monophyletic.

形態的分類が歯鯨、髭鯨類のそれぞれの単系統性を強く主張しているのに対して、現在までに行われてきた分子を用いた系統解析によればその殆どが歯鯨が単系統群を形成しないことを示唆している。その中でも大きな論争の火種となったのがMilinkovith and Meyer(1993)のミトコンドリア遺伝子配列の比較解析により提唱された系統樹でこの解析によれば歯鯨に含まれているマッコウクジラ上科のグループが他の歯鯨類よりむしろ髭鯨類に近縁である（つまり歯鯨は多系統である）という。この結果に続いて多くの分子統計学的研究がなされたが多くは歯鯨が多系統もしくは解決不可能という結果となっていた(e. g., Arnason et al., 1994; Adachi et al., 1996; Smith et al., 1996：図１７参照)。そこで、以下の問題を解決すべくＳＩＮＥ法を用いて種判別を行った：問題（1）歯鯨の単系統性の問題；（２）全てのカワイルカ類を含めたそれらの鯨目における系統関係；（３）鯨類それぞれの分岐年代。本願発明により、系統関係の決定のみならず現在まで絶対的な信頼性をもって進められてきた統計学的解析の問題点なども明らかにすることができた。 Whereas morphological classification strongly asserts the monophyly of both toothed whales and baleen whales, most molecular phylogenetic analyses performed to date suggest that toothed whales do not form a monophyletic group. The greatest source of controversy was the tree proposed by Milinkovith and Meyer (1993) from a comparative analysis of mitochondrial gene sequences, according to which the sperm whale superfamily, which is classified among the toothed whales, is more closely related to baleen whales than to the other toothed whales (that is, toothed whales are polyphyletic). Many statistical analyses of molecular data followed, but most concluded either that toothed whales are polyphyletic or that the question could not be resolved (e.g., Arnason et al., 1994; Adachi et al., 1996; Smith et al., 1996; see FIG. 17). Species discrimination by the SINE method was therefore performed to address the following problems: (1) the monophyly of toothed whales; (2) the phylogenetic relationships within Cetacea, including all river dolphins; and (3) the divergence time of each cetacean lineage. The present invention made it possible not only to determine phylogenetic relationships but also to reveal problems in the statistical analyses that had until now been carried out with absolute confidence.

上述のように、本願発明者は、偶蹄目ゲノム中に存在するCHR-2には大きく分けてFL(Full Length), MD(Middle Deletion type)、DT(Deletion Type)、CD(Cetacea Deletion type)、 CDO(Cetacea Deletion type Odontoceti)のサブファミリーが存在することを発見した。それぞれのサブファミリーは進化の過程における増幅時期が異なるためそれらの分布は系統関係を反映すると考えられる。各サブファミリーのコンセンサス配列のアラインメントを図１８に示す。図１８から明らかなように各サブファミリーはそれらにDiagnosticな塩基置換や欠失が認められる。 As described above, the present inventors discovered that the CHR-2 elements present in artiodactyl genomes fall broadly into the following subfamilies: FL (Full Length), MD (Middle Deletion type), DT (Deletion Type), CD (Cetacea Deletion type), and CDO (Cetacea Deletion type Odontoceti). Because the subfamilies were amplified at different times during evolution, their distribution is thought to reflect phylogenetic relationships. An alignment of the consensus sequences of the subfamilies is shown in FIG. 18. As is clear from FIG. 18, each subfamily has its own diagnostic base substitutions and deletions.

これらSINE各サブファミリーの鯨、偶蹄目における分布を確認するためにドットハイブリダイゼーションを行った。用いたプローブは、CDとCDOに関してはそれらの区別が可能な位置にオリゴプローブを設計し（アラインメントの太線）、FLに関してはPCR産物を使用した。ドットハイブリダイゼーションの結果を図１２に示す。まずFLは全ての鯨目そして偶蹄目ではカバ、ウシ（反すう亜目を代表）のみに分布し、ブタやラクダ及びその他外群として加えた哺乳類ゲノム中には存在しないことが示唆された。次にCDは鯨目のみに特異的に分布し偶蹄目には分布してないと考えられ、鯨目の単系統性を強く支持している。さらにCDOに関しては、歯鯨にのみ強くシグナルが確認されることからこのサブファミリーは歯鯨亜目特異的に爆発的な増幅をしたと考えられ、分子による解析から疑問視されている歯鯨の単系統性を支持するデータである。しかしドットハイブリダイゼーションによる解析においてシグナルが確認されないだけでは、単にその系統でコピー数が少なかっただけという可能性を否定できないので、その結果だけで系統関係を推定することはできない。そこで実際にそれらのSINEが鯨目ゲノム中に挿入している様な遺伝子座を単離して系統関係の推定を行った。 Dot hybridization was performed to confirm the distribution of these SINE subfamilies in cetaceans and artiodactyls. For CD and CDO, oligonucleotide probes were designed at positions where the two can be distinguished (bold lines in the alignment); for FL, a PCR product was used as the probe. The results of the dot hybridization are shown in FIG. 12. First, FL was suggested to be distributed in all cetaceans and, among the artiodactyls, only in the hippopotamus and cattle (representing the ruminants), and to be absent from the genomes of pigs, camels, and the other mammals included as outgroups. Second, CD appears to be distributed specifically in cetaceans and not in artiodactyls, which strongly supports the monophyly of Cetacea. Third, a strong CDO signal was observed only in toothed whales, so this subfamily is thought to have undergone explosive amplification specifically in the suborder Odontoceti; these data support the monophyly of toothed whales, which molecular analyses have called into question. However, the absence of a signal in dot hybridization cannot rule out the possibility that the copy number in that lineage was simply low, so phylogenetic relationships cannot be inferred from these results alone. Loci at which these SINEs are actually inserted in cetacean genomes were therefore isolated, and the phylogenetic relationships were inferred from them.
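The dot hybridization results described above amount to a small lookup from signal pattern to the group of animals showing it. The Python sketch below encodes only the patterns reported in the text; the table layout and function name are hypothetical, and, as the text itself cautions, a missing signal may only reflect low copy number, so the output is a reading of the blot rather than a phylogenetic conclusion.

```python
# Signal pattern (FL, CD, CDO) -> group reported in the text to show it.
PATTERNS = {
    (True, True, True): "toothed whales (Odontoceti)",
    (True, True, False): "cetaceans other than toothed whales",
    (True, False, False): "hippopotamus or ruminants",
    (False, False, False): "pig, camel, or outgroup mammals",
}

def interpret_dot_blot(fl, cd, cdo):
    """Map dot-blot signals for FL, CD, and CDO probes to a reported group."""
    return PATTERNS.get((bool(fl), bool(cd), bool(cdo)),
                        "pattern not described in the text")

print(interpret_dot_blot(fl=True, cd=True, cdo=True))
print(interpret_dot_blot(fl=False, cd=True, cdo=False))
```

Any pattern outside the four listed falls through to the default message instead of being forced into a group.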

Example 2: Analysis of the internal phylogeny of Cetacea using SINE insertions in the genome as an index
In the study of the internal phylogeny of Cetacea, attention was focused on two of the SINEs present in cetaceans: CD, which is distributed specifically throughout Cetacea, and CDO, which is thought to have been amplified specifically in Odontoceti (toothed whales). Loci containing SINE sequences belonging to these subfamilies were isolated from genomic libraries of various cetacean species. The PCR primers used to isolate the loci that indicate phylogenetic relationships within Cetacea are shown in Table 3.

The loci were isolated from genomic libraries of the following species: locus A (GRY20), gray whale; locus B (BRY50), Bryde's whale; locus C (Sperm28), sperm whale; locus D (Hump42), humpback whale; locus E (Nag18), fin whale; locus F (SIR13), blue whale; locus G (Sei35), sei whale; locus H (BRY), Bryde's whale; locus I (GRY8), gray whale; locus J (Mnk31), minke whale; locus K (NM1), minke whale; locus L (SEM3), right whale; locus M (sNR2), right whale; locus N (na107), narwhal; locus O (na102), narwhal; locus P (Tuti35), beaked whale (Stejneger's beaked whale); locus Q (SP9), sperm whale; and locus R (SP2), sperm whale.

These loci were sequenced, primers were designed in the regions flanking each SINE, PCR was performed using genomic DNA from cetaceans and artiodactyls as templates, and the products were separated by agarose gel electrophoresis. The results are shown in FIGS. 19 to 21.

The numbers assigned to the lanes in FIGS. 19 to 21 correspond to the whale species shown in FIGS. 22 and 23.

As seen in FIG. 20D, for example, at the locus named Hump42 the SINE insertion is found only in the humpback whale. Thus, analyzing a specimen with primers that amplify this locus makes it easy to determine whether whale meat of unknown origin is humpback whale. As seen in FIG. 19F, SIR13 can determine whether a specimen is blue whale. Sperm28 and BRY50 can roughly distinguish whether a specimen is a toothed whale or a baleen whale. Therefore, experiments using the primers shown in Table 3 allow rapid, inexpensive, and highly reliable species identification for nearly all kinds of whale meat on the market.
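The band-based decision described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the amplicon sizes, the SINE length, and the sizing tolerance are hypothetical values chosen for the example and are not taken from this specification, and the function names are likewise introduced here for illustration.

```python
def call_insertion(observed_bp, empty_site_bp, sine_bp, tolerance_bp=30):
    """Classify one PCR band as 'inserted', 'empty', or 'ambiguous'.

    observed_bp   : band size read from the gel, in base pairs
    empty_site_bp : expected product when no SINE lies between the primers
    sine_bp       : assumed length added by the SINE insertion (hypothetical)
    tolerance_bp  : assumed sizing error of the gel (hypothetical)
    """
    # The two size windows must not overlap, or the call is meaningless.
    if sine_bp <= 2 * tolerance_bp:
        raise ValueError("SINE length must exceed twice the tolerance")
    if abs(observed_bp - (empty_site_bp + sine_bp)) <= tolerance_bp:
        return "inserted"
    if abs(observed_bp - empty_site_bp) <= tolerance_bp:
        return "empty"
    return "ambiguous"


def has_species_specific_insertion(observed_bp, empty_site_bp, sine_bp):
    """True when the band indicates the locus carries the SINE insertion."""
    return call_insertion(observed_bp, empty_site_bp, sine_bp) == "inserted"


# Hypothetical sizes for a locus such as Hump42 (not values from the patent).
EMPTY_BP, SINE_BP = 250, 300
for band in (560, 245, 400):
    print(band, call_insertion(band, EMPTY_BP, SINE_BP))
```

With a species-specific locus, an "inserted" call would point to the species carrying the insertion, while an "empty" call would only exclude it; an "ambiguous" band would call for repeating the PCR rather than forcing a decision.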

In short, there is one PCR primer set each for determining whether a specimen is a whale (A), whether it is a toothed whale (C), and whether it is a baleen whale (B). These sets first allow specimens to be sorted roughly. PCR with primers that identify particular whale species then shows immediately which species each specimen corresponds to. In practice, it suffices to use a total of 18 species-discrimination primer sets: 5 sets for toothed whales (N, O, P, Q, R), 10 sets for baleen whales (D, E, F, G, H, I, J, K, L, M), and the three sets described above (A, B, C). Moreover, because the whale species sold on the market are far more limited, not all of these primers need to be used; they can be chosen according to the purpose, so that species can be identified more efficiently and with fewer experiments. For example, if the only question is whether a specimen is minke whale, a single primer set is sufficient. As for time, if the specimen is tissue, a final conclusion can be reached in 3.5 hours in total: 30 minutes for DNA extraction, 2.5 hours for PCR, and 30 minutes for agarose gel electrophoresis. In a pilot experiment, the rate of correct identification was 100%.
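The two-stage use of the primer sets can be sketched as follows. The primer-set letters, the count of 18 sets, and the step durations come from the description above; the decision rule itself (treating a positive result as the presence of the insertion band, and falling back to all species-level sets when the coarse screen is inconclusive) is an assumption made for this illustration, as are the function and variable names.

```python
# Coarse screening sets and species-level sets, as listed in the description.
SCREENING_SETS = {"whale": "A", "baleen": "B", "toothed": "C"}
TOOTHED_SETS = ["N", "O", "P", "Q", "R"]
BALEEN_SETS = ["D", "E", "F", "G", "H", "I", "J", "K", "L", "M"]


def species_level_sets(a_inserted, b_inserted, c_inserted):
    """Choose which species-level primer sets to run after the coarse screen.

    Each argument is True when the corresponding screening locus (A, B, C)
    shows the insertion band. The rule below is an assumed reading of the
    workflow, not a procedure stated in the specification.
    """
    if not a_inserted:
        return []                      # not a whale: nothing further to run
    if c_inserted and not b_inserted:
        return list(TOOTHED_SETS)      # toothed whale: run N to R
    if b_inserted and not c_inserted:
        return list(BALEEN_SETS)       # baleen whale: run D to M
    return TOOTHED_SETS + BALEEN_SETS  # inconclusive screen: run everything


# Step durations in minutes, as stated for a tissue specimen.
STEP_MINUTES = {"DNA extraction": 30, "PCR": 150, "agarose gel electrophoresis": 30}

total_sets = len(SCREENING_SETS) + len(TOOTHED_SETS) + len(BALEEN_SETS)
total_hours = sum(STEP_MINUTES.values()) / 60

print(total_sets)   # 3 + 5 + 10 primer sets
print(total_hours)  # (30 + 150 + 30) / 60 hours
print(species_level_sets(True, True, False))
```

Running the coarse screen first reduces the number of species-level reactions from 15 to 5 or 10 per specimen, which is the efficiency gain the description points to.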

Example 3: Construction of a phylogenetic tree within Cetacea
The SINE insertion patterns obtained from the results of Example 2 were compiled into a matrix (see FIG. 22). The phylogenetic tree within Cetacea inferred from this matrix is shown in FIG. 23.
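How an insertion matrix can be read as a set of nested groups may be sketched as follows. The matrix below is a simplified toy example, not the matrix of FIG. 22: it covers only four species and five loci and encodes only what the description states for those loci (A for whales in general, B for baleen whales, C for toothed whales, Hump42 for humpback whale only, SIR13 for blue whale only). The function names and the compatibility check are assumptions introduced for illustration.

```python
from itertools import combinations

# Toy presence (1) / absence (0) matrix: locus -> species -> insertion present.
# Illustrative only; NOT the data of FIG. 22.
TOY_MATRIX = {
    "GRY20":   {"humpback": 1, "blue": 1, "sperm": 1, "narwhal": 1},
    "BRY50":   {"humpback": 1, "blue": 1, "sperm": 0, "narwhal": 0},
    "Sperm28": {"humpback": 0, "blue": 0, "sperm": 1, "narwhal": 1},
    "Hump42":  {"humpback": 1, "blue": 0, "sperm": 0, "narwhal": 0},
    "SIR13":   {"humpback": 0, "blue": 1, "sperm": 0, "narwhal": 0},
}


def groups_from_matrix(matrix):
    """Map each distinct set of insertion-carrying species to its loci."""
    groups = {}
    for locus, row in matrix.items():
        carriers = frozenset(sp for sp, present in row.items() if present)
        if carriers:
            groups.setdefault(carriers, []).append(locus)
    return groups


def compatible(a, b):
    """Two groups fit on one tree if one contains the other or they are disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def is_tree_like(groups):
    """True when every pair of groups is nested or disjoint."""
    return all(compatible(a, b) for a, b in combinations(groups, 2))


groups = groups_from_matrix(TOY_MATRIX)
for carriers, loci in sorted(groups.items(), key=lambda kv: -len(kv[0])):
    print(sorted(carriers), loci)
print(is_tree_like(groups))
```

Each set of species sharing an insertion is treated as a candidate group descended from the ancestor in which the insertion occurred; groups that partially overlap would signal a conflict that the toy data do not contain.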

FIG. 1 is a general diagram of SINE retroposition.
FIG. 2(A) shows the master gene model, and FIG. 2(B) shows the multiple source gene model.
FIG. 3 illustrates the usefulness of SINEs as markers of evolution.
FIG. 4 shows a phylogenetic tree model of species A to D; FIG. 4(C) shows the PCR pattern.
FIG. 5 shows several example patterns of transcripts from total genomic DNA of selected animal species.
FIG. 6 shows the tRNA-like structure of the CHR-2 SINE.
FIG. 7(A) shows the tRNA-like secondary structure of the CHR-2 SINE, and FIG. 7(B) shows the secondary structure of human tRNA Glu.
FIG. 8 shows the construction of the CHR-2 SINE consensus sequence from an alignment of several CHR-2 sequences.
FIG. 9 shows a phylogenetic tree containing an old SINE amplified at time t and a young SINE amplified at time u.
FIG. 10 shows the identification, by alignment, of diagnostic nucleotides or possible deletions that characterize the subfamilies of a SINE family.
FIG. 11 shows an alignment of the consensus sequences of the subfamilies of the CHR-2 SINE family.
FIG. 12 shows the results of dot hybridization experiments using probes specific for CD, CDO, and the other subfamilies, respectively.
FIG. 13 schematically shows the principle of flanking SINE PCR.
FIG. 14 shows an example of flanking PCR results; FIGS. 14(B) and 14(C) show the results of hybridization experiments performed on the same filter with two different probes.
FIG. 15 shows a phylogenetic tree of mammals.
FIG. 16 shows the experimental workflow of the SINE method.
FIG. 17 shows phylogenetic trees proposed from molecular statistical studies (the controversy over the monophyly of toothed whales).
FIG. 18 shows an alignment of the consensus sequences of the CHR-2 subfamilies.
FIG. 19 is an agarose gel electrophoresis image showing the SINE insertion results for loci A to C and F to H.
FIG. 20 is an agarose gel electrophoresis image showing the SINE insertion results for loci D, E, K and I, J, O.
FIG. 21 is an agarose gel electrophoresis image showing the SINE insertion results for loci L to N and P to R.
FIG. 22 shows a matrix summarizing the SINE insertion patterns.
FIG. 23 shows the phylogenetic tree within Cetacea inferred from the matrix shown in FIG. 22.

Claims

A method for discriminating whale species by the SINE method, comprising the following steps:
preparing a genomic DNA library from each of one or more whale species to be discriminated;
isolating, from each of the libraries, a clone containing an orthologous locus into which a SINE belonging to a SINE family or subfamily is specifically inserted in at least one of the whale species;
amplifying the sequence of the locus by PCR for each of the whale species, using a PCR primer set that anneals to the flanking sequences located on either side of the SINE family or subfamily at the locus; and
subjecting the resulting PCR products to gel electrophoresis and discriminating the whale species by the presence or absence of a band indicating the locus into which the SINE family or subfamily is inserted;
wherein the orthologous locus is selected from the group consisting of GRY20 corresponding to the sequence shown in SEQ ID NO: 31, BRY50 corresponding to the sequence shown in SEQ ID NO: 32, Sperm28 corresponding to the sequence shown in SEQ ID NO: 33, Hump42 corresponding to the sequence shown in SEQ ID NO: 34, Nag18 corresponding to the sequence shown in SEQ ID NO: 35, SIR13 corresponding to the sequence shown in SEQ ID NO: 36, Sei35 corresponding to the sequence shown in SEQ ID NO: 37, BRY63 corresponding to the sequence shown in SEQ ID NO: 38, GRY8 corresponding to the sequence shown in SEQ ID NO: 39, MNK31 corresponding to the sequence shown in SEQ ID NO: 40, NM1 corresponding to the sequence shown in SEQ ID NO: 41, SEM3 corresponding to the sequence shown in SEQ ID NO: 42, sNR2 corresponding to the sequence shown in SEQ ID NO: 43, na107 corresponding to the sequence shown in SEQ ID NO: 44, na102 corresponding to the sequence shown in SEQ ID NO: 45, Tuti35 corresponding to the sequence shown in SEQ ID NO: 46, Sp9 corresponding to the sequence shown in SEQ ID NO: 47, and Sp2 corresponding to the sequence shown in SEQ ID NO: 48.

A DNA having the sequence shown in any one of SEQ ID NOs: 31 to 48, or an orthologous DNA that hybridizes with said DNA under stringent conditions.

The method of claim 1, wherein the PCR primer is selected from the group consisting of SEQ ID NOs: 49-84.

A primer DNA having the sequence shown in any one of SEQ ID NOs: 49 to 84.