KR20150057047A

KR20150057047A - GPCR Ligand Peptides and a Conserved Motif Theirof

Info

Publication number: KR20150057047A
Application number: KR1020130139856A
Authority: KR
Inventors: 김영준
Original assignee: 애니젠 주식회사
Priority date: 2013-11-18
Filing date: 2013-11-18
Publication date: 2015-05-28
Also published as: KR101657152B1

Abstract

The present invention relates to novel GPCR ligand peptides and a screening method for identifying the same. The present invention provides GPCR ligand peptides represented by general formula 1: NH₂-Xa₁-Xa₂-Xa₃-Xa₄-Xa₅-Cys-His-Phe-Lys-Xa₆-Cys-Asn-Met-CONH₂. In general formula 1, Xa₁ is Gln, Asn, His, Ser, Thr or Pro; Xa₂ is Tyr or Arg; Xa₃ is Met, Leu or Ile; Xa₄ is Ser, Thr, Ala or Gln; Xa₅ is Pro, Leu or Met; and Xa₆ is Ile or Leu. In addition, the screening method of the present invention identifies signaling peptide ligands from precursor sequences that (i) have one transmembrane domain, (ii) contain a signal peptide sequence, and (iii) have no functional domain.

Description

GPCR ligand peptides and a conserved motif thereof {GPCR Ligand Peptides and a Conserved Motif Theirof}

The present invention relates to novel GPCR ligand peptides and to a screening method for identifying them.

Signaling between regulatory peptides and their receptors, the GPCRs (G-protein coupled receptors), is important in a variety of biological processes such as development, metabolism, reproduction and behavior [1, 2]. Over recent decades, biochemical purification of peptides using simple biological assay systems has led to the identification of a large number of regulatory peptides in a variety of species. In the current post-genomic era, genome-wide sequence analyses have facilitated the discovery of novel peptide genes, particularly those encoding peptides with some degree of sequence similarity to peptides found in earlier biochemical studies [3]. Likewise, genomic analysis is effective in identifying GPCRs, because GPCRs can be reliably recognized in a genome by their seven transmembrane domains, a molecular signature present in almost all GPCRs [4].

Bioinformatic analysis of the Drosophila genome has identified about 200 GPCRs, of which about 49 are predicted to have peptides as ligands [3-5]. Of these peptide GPCRs, 14 remain orphan GPCRs whose ligands have not yet been identified. Similarly, 44 peptide precursor genes, including those of the seven insulin-like peptides (ILP1-7), have been identified in the genome [1, 6, 7], and 11 or more peptides derived from 8 of these precursor genes have not yet been matched to a GPCR partner. Some peptides act through non-GPCR receptors. For example, the ILPs and prothoracicotropic hormone (PTTH) are known to act through receptor tyrosine kinases, and eclosion hormone is known to act through a receptor guanylyl cyclase. In addition, some GPCRs have two unrelated peptide ligands. For example, SPR (Drosophila sex peptide receptor) has two groups of peptide ligands, SP (sex peptide) and Mip (myoinhibitory peptides) [5, 8]. Taken together, the number of regulatory peptides known to date is too small to account for the orphan GPCRs, so the identification of novel peptides is an important step in determining the partners of GPCRs.

Identifying novel regulatory peptide genes from a genome is becoming increasingly difficult. Peptide precursors do not always show a clear molecular signature apart from the signal sequence, which is shared by all secretory proteins, and the very short mono- and dibasic cleavage sites (R, K and combinations thereof); in addition, the mature peptide sequence occupies a relatively short region and varies in length from 4 to 50 amino acids [9]. Even when peptide precursor-like genes can be identified in a genome, it is not easy to prove that they encode genuinely functional peptides.

Numerous papers and patent documents are referenced throughout this specification, and their citations are indicated. The disclosures of the cited papers and patent documents are incorporated herein by reference in their entirety, in order to describe more clearly the state of the art to which the present invention pertains and the content of the present invention.

The present inventor sought to identify novel signaling peptide ligands in arthropods. To that end, the inventor analyzed on a large scale, from known arthropod genome sequences, signaling peptide precursors that have one transmembrane domain sequence and lack a functional domain, so that their molecular function cannot be predicted. This analysis identified novel signaling peptide ligands having at least one sequence conserved between species, and the inventor confirmed that they are functional peptides, thereby completing the present invention.

Accordingly, an object of the present invention is to provide a method for screening signaling peptide ligands.

Another object of the present invention is to provide a GPCR (G-protein coupled receptor) ligand peptide.

Another object of the present invention is to provide a recombinant vector comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2.

Yet another object of the present invention is to provide a cell transformed with the above-mentioned recombinant vector.

Other objects and advantages of the present invention will become more apparent from the following detailed description of the invention, the claims and the drawings.

According to one aspect of the present invention, there is provided a method for screening signaling peptide ligands, comprising the step of identifying, from the genome sequence of an arthropod, signaling peptide precursors having a signal sequence, wherein the candidate peptide precursor has one transmembrane domain sequence and lacks a functional domain, so that its molecular function cannot be predicted.

According to another aspect of the present invention, there is provided a GPCR (G-protein coupled receptor) ligand peptide represented by the following general formula 1:

General Formula 1

NH₂-Xa₁-Xa₂-Xa₃-Xa₄-Xa₅-Cys-His-Phe-Lys-Xa₆-Cys-Asn-Met-CONH₂

In general formula 1, Xa₁ is Gln, Asn, His, Ser, Thr or Pro; Xa₂ is Tyr or Arg; Xa₃ is Met, Leu or Ile; Xa₄ is Ser, Thr, Ala or Gln; Xa₅ is Pro, Leu or Met; and Xa₆ is Ile or Leu.

According to still another aspect of the present invention, there is provided a recombinant vector comprising (a) a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2; (b) a promoter operatively linked to the nucleotide sequence; and (c) a terminator.

According to still another aspect of the present invention, there is provided a cell transformed with the above-mentioned recombinant vector.

The present inventor sought to identify novel signaling peptide ligands in arthropods. To that end, the inventor analyzed on a large scale, from known arthropod genome sequences, signaling peptide precursors that have one transmembrane domain sequence and lack a functional domain, so that their molecular function cannot be predicted. This analysis identified novel signaling peptide ligands having at least one sequence conserved between species, and the inventor confirmed that they are functional peptides.

Identifying novel regulatory peptide genes from known genome sequences is very difficult in the art, because regulatory peptide genes have almost no features by which they can be distinguished through GOA (gene ontology annotation). A more effective method of identifying signaling peptide precursors is therefore urgently needed in the art.

The present inventor first selected, from known genome sequences, a large number of candidate peptide precursors containing one or more regions conserved between species. From these, the inventor selected peptide precursors that have one transmembrane domain sequence and lack a functional domain, so that their molecular function cannot be predicted, and performed a functional assay on them, finally identifying novel signaling peptide ligands.

According to the method of the present invention, signaling peptide precursors having a signal sequence are first identified from the genome sequence of an arthropod.

Noting that known peptide precursor genes generally contain a signal sequence, the present inventor used SignalP4.1 (http://www.cbs.dtu.dk/services/SignalP/) to extract predicted protein sequences from a known genome sequence (specifically, the genome sequence of Drosophila melanogaster), and then selected from the extracted protein sequences only those having the following characteristics: (a) a sequence having one transmembrane domain; (b) the presence of a signal peptide sequence; and (c) a sequence having no functional domain. The method of the present invention therefore has the advantage that only protein sequences having these characteristics can be isolated and identified from a known genome sequence simply and efficiently.
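The three selection criteria (a) to (c) amount to a simple filter over per-protein prediction results. The sketch below assumes that signal peptide, transmembrane domain and functional domain predictions have already been produced by external tools and stored in a record; the `PredictedProtein` class, its field names and the example gene identifiers are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PredictedProtein:
    """Hypothetical record holding precomputed predictions for one protein."""
    gene_id: str
    has_signal_peptide: bool          # e.g. parsed from a signal peptide predictor
    tm_domain_count: int              # number of predicted transmembrane domains
    functional_domains: List[str] = field(default_factory=list)

def is_candidate_precursor(p: PredictedProtein) -> bool:
    """Apply criteria (a) one TM domain, (b) signal peptide, (c) no functional domain."""
    return p.tm_domain_count == 1 and p.has_signal_peptide and not p.functional_domains

# Illustrative, made-up records.
proteins = [
    PredictedProtein("gene_A", True, 1),
    PredictedProtein("gene_B", True, 7),                      # GPCR-like, rejected
    PredictedProtein("gene_C", True, 1, ["kinase domain"]),   # annotated, rejected
    PredictedProtein("gene_D", False, 1),                     # no signal peptide
]
candidates = [p.gene_id for p in proteins if is_candidate_precursor(p)]
print(candidates)
```

Keeping the predictions in a plain record separates the filtering logic from whichever prediction tools are actually used upstream.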

According to some embodiments of the present invention, genome sequences usable in the method of the present invention can be obtained from, but are not limited to, NCBI (http://www.ncbi.nlm.nih.gov/nucleotide/), Beebase (http://hymenopteragenome.org/beebase/), Beetlebase (http://beetlebase.org) and VectorBase (https://www.vectorbase.org).

According to some embodiments of the present invention, the candidate peptide precursor of the present invention comprises at least one sequence conserved between species, and comprises mono- or dibasic cleavage sites, more specifically dibasic cleavage sites, at both ends (the N-terminus and the C-terminus) of the conserved sequence.
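As a rough illustration of checking for mono- or dibasic sites next to a conserved segment, the sketch below counts the run of basic residues (K or R, capped at two) immediately before and after a core sequence inside a precursor. The function name and the toy precursor strings are assumptions made for this example; they are not real precursor sequences from the specification.

```python
BASIC = set("KR")  # basic residues that form mono- and dibasic cleavage sites

def flanking_basic_sites(precursor: str, core: str):
    """Return (n_run, c_run): basic residues (0, 1 or 2) flanking `core`.

    Returns None when `core` does not occur in `precursor`.
    Hypothetical helper for illustration only.
    """
    start = precursor.find(core)
    if start < 0:
        return None
    end = start + len(core)
    n_run = 0
    while n_run < 2 and start - n_run - 1 >= 0 and precursor[start - n_run - 1] in BASIC:
        n_run += 1
    c_run = 0
    while c_run < 2 and end + c_run < len(precursor) and precursor[end + c_run] in BASIC:
        c_run += 1
    return n_run, c_run

core = "QYMSPCHFKICNM"
toy_dibasic = "MSTLLAKR" + core + "KKAAS"   # made-up flanks, dibasic on both sides
toy_monobasic = "MSTLLAR" + core + "AAS"    # made-up flanks, monobasic N-side only
print(flanking_basic_sites(toy_dibasic, core))
print(flanking_basic_sites(toy_monobasic, core))
```

A result of `(2, 2)` corresponds to dibasic sites on both sides, while `(1, 0)` indicates a single basic residue on the N-terminal side only.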

According to some embodiments of the present invention, the method of the present invention further comprises the step of performing a functional assay on the candidate signaling peptides obtained from the candidate peptide precursors described above.

Using the simple and convenient method described above, the present inventor identified a large number of candidate peptide ligands.

The present invention provides a GPCR (G-protein coupled receptor) ligand peptide represented by the following general formula 1:

General Formula 1

NH₂-Xa₁-Xa₂-Xa₃-Xa₄-Xa₅-Cys-His-Phe-Lys-Xa₆-Cys-Asn-Met-CONH₂

According to some embodiments of the present invention, the peptide of the present invention includes a GPCR ligand peptide represented by the following general formula 2:

General Formula 2

NH₂-Xa₁-Tyr-Met-Xa₂-Xa₃-Cys-His-Phe-Lys-Xa₄-Cys-Asn-Met-CONH₂

In general formula 2, Xa₁ is Gln, Asn, His, Ser or Thr; Xa₂ is Ser, Thr or Ala; Xa₃ is Pro or Leu; and Xa₄ is Ile or Leu.
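Because both general formulas list a finite set of residues at each position, the peptides they cover can be enumerated directly. The following sketch, with variable names chosen for illustration, expands each formula with `itertools.product` and checks that every peptide of general formula 2 also satisfies general formula 1.

```python
from itertools import product

# Allowed one-letter residues per position, in order, for each general formula.
FORMULA_1 = ["QNHSTP", "YR", "MLI", "STAQ", "PLM", "C", "H", "F", "K", "IL", "C", "N", "M"]
FORMULA_2 = ["QNHST", "Y", "M", "STA", "PL", "C", "H", "F", "K", "IL", "C", "N", "M"]

def expand(formula):
    """Enumerate every sequence covered by a per-position residue list."""
    return {"".join(residues) for residues in product(*formula)}

f1 = expand(FORMULA_1)
f2 = expand(FORMULA_2)
print(len(f1), len(f2))        # 864 60
print(f2 <= f1)                # True: formula 2 is a subset of formula 1
print("QYMSPCHFKICNM" in f2)   # True
```

The counts follow from the products 6 × 2 × 3 × 4 × 3 × 2 = 864 and 5 × 3 × 2 × 2 = 60 of the numbers of alternatives at the variable positions.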

According to some embodiments of the present invention, in general formula 2, Xa₁ is Gln, Asn or His; Xa₂ is Ser or Thr; Xa₃ is Pro or Leu; and Xa₄ is Ile or Leu.

According to some embodiments of the present invention, Xa₁ in general formula 1 is pyroglutamic acid.

As used herein, the term "peptide" refers to a linear molecule formed by amino acid residues joined to one another by peptide bonds. Peptides may be prepared by chemical synthesis methods known in the art, for example solid-phase synthesis techniques (Merrifield, J. Amer. Chem. Soc. 85:2149-54(1963); Stewart, et al., Solid Phase Peptide Synthesis, 2nd. ed., Pierce Chem. Co.: Rockford, Ill. (1984)) or liquid-phase synthesis techniques (US Patent No. 5,516,891). As used herein, the term "signaling peptide ligand" refers to a molecule that induces peptide-receptor signaling by activating a receptor through a peptide-receptor interaction.

According to some embodiments of the present invention, the GPCR ligand peptide of the present invention binds to a GPCR and induces its activation; more specifically, it binds to the GPCR CG33696 and induces its activation.

According to some embodiments of the present invention, the Cys residue at position 6 and the Cys residue at position 11 of general formula 1 of the GPCR ligand peptide of the present invention are linked to each other by a disulfide bond.
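The Cys positions named above can be checked against the example sequences given elsewhere in this specification as SEQ ID NO: 1 and SEQ ID NO: 2. The sketch below simply lists the 1-based positions of Cys residues; the helper name is an assumption. Note that in the 15-residue SEQ ID NO: 2, which carries two extra N-terminal residues, the same two Cys residues fall at positions 8 and 13.

```python
def cysteine_positions(seq: str):
    """Return the 1-based positions of Cys (C) residues in a one-letter sequence."""
    return [i for i, aa in enumerate(seq.upper(), start=1) if aa == "C"]

seq1 = "QYMSPCHFKICNM"      # SEQ ID NO: 1
seq2 = "NVQYMSPCHFKICNM"    # SEQ ID NO: 2

print(cysteine_positions(seq1))  # [6, 11]
print(cysteine_positions(seq2))  # [8, 13]
```

In both sequences the two Cys residues are five positions apart, so a disulfide bond between them would close a loop over the intervening His-Phe-Lys-Ile residues.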

According to some embodiments of the present invention, the N-terminus and the C-terminus of the GPCR ligand peptide of the present invention comprise mono- or dibasic cleavage site sequences.

According to some embodiments of the present invention, the dibasic cleavage site sequence present at the N-terminus and the C-terminus of the GPCR ligand peptide of the present invention is a combination of Lys and Arg, more specifically a Lys-Lys or Arg-Lys sequence.

According to some embodiments of the present invention, the GPCR ligand peptide of the present invention is a sequence derived from an arthropod and is expressed mainly in the adult head or in the digestive system of wandering larvae.

According to some embodiments of the present invention, the GPCR ligand peptide of the present invention comprises SEQ ID NO: 1 (NH₂-QYMSPCHFKICNM-CONH₂). SEQ ID NO: 2 (NH₂-NVQYMSPCHFKICNM-CONH₂) is additionally presented as a variant of the GPCR ligand peptide of the present invention (see FIG. 1a).

According to some embodiments of the present invention, the peptide ligand of SEQ ID NO: 1 or SEQ ID NO: 2 of the present invention is a peptide ligand for CG33696 (the Irin receptor, named 'IrinR'; GenBank Accession No. BT024209, SEQ ID NO: 5; GenBank Accession No. ABC86271, SEQ ID NO: 6).

The present invention provides a recombinant vector comprising (a) a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2; (b) a promoter operatively linked to the nucleotide sequence; and (c) a terminator, and also provides a cell transformed with the recombinant vector.

According to some embodiments of the present invention, the recombinant vector usable in the functional assay of the present invention includes a recombinant vector comprising (i) a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2 described above; (ii) a promoter that is operatively linked to the nucleotide sequence of (i) and acts in animal cells (for example, CHO cells) to form an RNA molecule; and (iii) a 3'-untranslated region that acts in animal cells to cause polyadenylation of the 3'-end of the RNA molecule.

As used herein, the term "promoter" means a DNA sequence that regulates the expression of a coding sequence or functional RNA. In the recombinant vector of the present invention, the target nucleotide sequence is operatively linked to the promoter. As used herein, the term "operatively linked" means a functional linkage between a nucleic acid expression control sequence (e.g., a promoter sequence, a signal sequence, or an array of transcription factor binding sites) and another nucleic acid sequence, whereby the control sequence regulates the transcription and/or translation of the other nucleic acid sequence.

The vector system of the present invention can be constructed by various methods known in the art. Specific methods are disclosed in Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press (2001), which is incorporated herein by reference.

When the recombinant vector of the present invention is applied to eukaryotic cells (e.g., CHO cells), usable promoters are those capable of regulating transcription of the nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2 of the present invention, and include promoters derived from mammalian viruses, promoters derived from the genome of mammalian cells and promoters derived from yeast cells. Examples include, but are not limited to, the CMV (cytomegalovirus) promoter, the adenovirus late promoter, the vaccinia virus 7.5K promoter, the SV40 promoter, the tk promoter of HSV, the RSV promoter, the EF1 alpha promoter, the metallothionein promoter, the beta-actin promoter, the promoter of the human IL-2 gene, the promoter of the human IFN gene, the promoter of the human IL-4 gene, the promoter of the human lymphotoxin gene, the promoter of the human GM-CSF gene, the yeast (S. cerevisiae) GAPDH (glyceraldehyde 3-phosphate dehydrogenase) promoter, the yeast (S. cerevisiae) GAL1 to GAL10 promoters and the yeast (Pichia pastoris) AOX1 or AOX2 promoter.

According to some embodiments of the present invention, the expression construct usable in the present invention comprises a polyadenylation sequence (e.g., the bovine growth hormone terminator (BGH pA) or an SV40-derived polyadenylation sequence).

In addition, the vector of the present invention further comprises a selection marker. According to some embodiments of the present invention, the vector of the present invention comprises an antibiotic resistance gene commonly used in the art, including but not limited to genes conferring resistance to neomycin, geneticin, ampicillin, kanamycin, hygromycin, streptomycin, penicillin, chloramphenicol, gentamycin, carbenicillin and tetracycline.

The vector of the present invention can be delivered into a host cell by various methods known in the art. When the host cell is a prokaryotic cell, delivery can be carried out by, for example, the CaCl₂ method (Cohen, et al., Proc. Natl. Acad. Sci. USA, 9:2110-2114(1973)), the Hanahan method (Cohen, et al., Proc. Natl. Acad. Sci. USA, 9:2110-2114(1973); and Hanahan, D., J. Mol. Biol., 166:557-580(1983)) or electroporation (Dower, et al., Nucleic Acids Res., 16:6127-6145(1988)). When the host cell is a eukaryotic cell, delivery can be carried out by lipofection, electroporation, liposome-mediated transfer (Wong, et al., 1980), retrovirus-mediated transfer (Chen, H.Y., et al., (1990), J. Reprod. Fert. 41:173-182; Kopchick, J.J. et al., (1991) Methods for the introduction of recombinant DNA into chicken embryos. In Transgenic Animals, ed. N.L. First & F.P. Haseltine, pp.275-293, Boston; Butterworth-Heinemann; Lee, M.-R. and Shuman, R. (1990) Proc. 4th World Congr. Genet. Appl. Livestock Prod. 16, 107-110), microinjection, particle bombardment, the yeast spheroplast/cell fusion used with YACs, the Agrobacterium-mediated transformation used in plant cells, and the like; more specifically, it is carried out by lipofection.

For the functional assay, the recombinant vector of the present invention can be prepared in a form that inhibits the candidate peptide ligand identified by the method of the present invention. For example, it can be prepared in the form of, but is not limited to, an antisense oligonucleotide, siRNA (small interference RNA), shRNA (small hairpin RNA or short hairpin RNA) or miRNA (microRNA).

The term "antisense oligonucleotide" as used herein means DNA, RNA or a derivative thereof that contains a nucleic acid sequence complementary to the sequence of a specific mRNA and that binds to the complementary sequence in the mRNA to inhibit translation of the mRNA into protein. The term "complementary" as used herein means sufficiently complementary for the antisense oligonucleotide to hybridize selectively to the target (e.g., a nucleotide sequence encoding a GPCR ligand peptide) under given hybridization or annealing conditions, preferably physiological conditions. The oligonucleotide may contain one or more mismatched bases, and the term covers both substantially complementary and perfectly complementary sequences; more specifically, it means perfectly complementary. The length of the antisense oligonucleotide is 6 to 100 bases, more specifically 8 to 60 bases, and most specifically 10 to 40 bases.
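The notions of a perfectly complementary antisense sequence and of mismatched bases can be made concrete with a short sketch. The code below derives the perfectly complementary antisense RNA for a target mRNA segment as its reverse complement, and counts mismatches between a given oligonucleotide and that perfect antisense. The function names and the six-base target are made up for illustration and do not come from the specification.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def antisense_rna(target_mrna: str) -> str:
    """Return the perfectly complementary antisense RNA (5'->3') for a target mRNA."""
    return target_mrna.upper().translate(COMPLEMENT)[::-1]

def mismatches(oligo: str, target_site: str) -> int:
    """Count positions where `oligo` differs from the perfect antisense of `target_site`.

    Assumes both sequences have the same length (hypothetical helper).
    """
    perfect = antisense_rna(target_site)
    return sum(a != b for a, b in zip(oligo.upper(), perfect))

def length_in_stated_range(oligo: str) -> bool:
    """Check the broadest length range stated in the text (6 to 100 bases)."""
    return 6 <= len(oligo) <= 100

target = "AUGCAG"                       # toy target segment, not a real gene sequence
print(antisense_rna(target))            # CUGCAU
print(mismatches("CUGCAU", target))     # 0 -> perfectly complementary
print(mismatches("CUGCAA", target))     # 1 -> one mismatched base
```

A mismatch count of zero corresponds to the "perfectly complementary" case, while small non-zero counts correspond to "substantially complementary" oligonucleotides.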

The term "siRNA" as used herein means a nucleic acid molecule capable of mediating RNA interference or gene silencing (see WO 00/44895, WO 01/36646, WO 99/32619, WO 01/29058, WO 99/07409 and WO 00/44914). Because siRNA can suppress the expression of a target gene, it serves as an efficient gene knockdown method or as a gene therapy method. siRNA was first discovered in plants, worms, fruit flies and parasites, and has more recently been developed and applied to studies in mammalian cells (Degot S, et al., 2002; Degot S, et al., 2004; Ballut L, et al., 2005).

The siRNA molecule usable in the present invention may have a double-stranded structure in which a sense strand (for example, a sequence corresponding to the nucleotide sequence (mRNA) encoding the GPCR ligand peptide) and an antisense strand (for example, a sequence complementary to the nucleotide sequence (mRNA) encoding the GPCR ligand peptide) are positioned opposite each other. Alternatively, the siRNA molecule usable in the present invention may have a single-stranded structure with self-complementary sense and antisense strands.

The siRNA is not limited to molecules in which the double-stranded RNA portion is perfectly paired; it may include unpaired portions caused by a mismatch (the corresponding bases are not complementary), a bulge (one strand has no corresponding base) and the like. Specifically, the total length is 10 to 100 bases, more specifically 15 to 80 bases, and still more specifically 20 to 70 bases.

The term "shRNA (small hairpin RNA or short hairpin RNA)" as used herein refers to an RNA sequence that forms a tight hairpin turn and can be used to silence gene expression through RNA interference. shRNA is delivered into cells using a vector, typically one carrying a U6 promoter that can drive shRNA expression. Such a vector is always passed on to daughter cells, so the gene silencing can be inherited. The shRNA hairpin structure is cleaved by the cellular machinery into siRNA, which then binds to the RNA-induced silencing complex. This complex binds to and cleaves mRNA that matches the bound siRNA. shRNA is transcribed by RNA polymerase III, and shRNA production in mammalian cells may trigger an interferon response, as the cell recognizes the shRNA as a viral attack and mounts a defense. shRNA can also be used in plants and other systems, where the U6 promoter is not strictly required. In plants, the CaMV (cauliflower mosaic virus) 35S promoter, a conventional promoter with very strong constitutive expression, can be used.

The term "microRNA (miRNA)" as used herein refers to a single-stranded RNA molecule of 21-25 nucleotides that binds to the 3'-UTR of mRNA (messenger RNA) and regulates gene expression in eukaryotes (Bartel DP, et al., Cell, 23;116(2): 281-297 (2004)). In miRNA biogenesis, Drosha (an RNase III type enzyme) produces a precursor miRNA (pre-miRNA) with a stem-loop structure, which moves to the cytoplasm and is cleaved by Dicer into mature miRNA [Kim VN, et al., Nat Rev Mol Cell Biol., 6(5): 376-385 (2005)]. By regulating the expression of target proteins, miRNAs produced in this way participate in development, cell proliferation and death, lipid metabolism, tumorigenesis and the like [Wienholds E, et al., Science, 309(5732): 310-311 (2005); Nelson P, et al., Trends Biochem Sci., 28: 534-540 (2003); Lee RC, et al., Cell, 75: 843-854 (1993); and Esquela-Kerscher A, et al., Nat Rev Cancer, 6: 259-269 (2006)].

The features and advantages of the present invention are summarized as follows:

(a) The present invention relates to novel GPCR ligand peptides and a screening method for identifying them.

(b) The present invention provides a GPCR ligand peptide represented by the following General Formula 1:

General Formula 1
NH_2-Xa_1-Xa_2-Xa_3-Xa_4-Xa_5-Cys-His-Phe-Lys-Xa_6-Cys-Asn-Met-CONH_2

(c) In addition, the screening method of the present invention can provide signaling peptide ligands having the following features: (i) a sequence with one transmembrane domain; (ii) the presence of a signal peptide sequence; and (iii) a sequence lacking a functional domain.
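General Formula 1, as given in the abstract (NH_2-Xa_1-Xa_2-Xa_3-Xa_4-Xa_5-Cys-His-Phe-Lys-Xa_6-Cys-Asn-Met-CONH_2, with the listed choices for Xa_1 to Xa_6), can be expressed as a pattern over one-letter amino acid codes. The sketch below is an illustration under stated assumptions: the regular expression and the function name are not part of the specification, and terminal modifications (pyroglutamate, amidation, the disulfide bond) are not modeled.

```python
import re

# One-letter encoding of General Formula 1 (residue choices from the abstract):
# Xa1 = Gln/Asn/His/Ser/Thr/Pro, Xa2 = Tyr/Arg, Xa3 = Met/Leu/Ile,
# Xa4 = Ser/Thr/Ala/Gln, Xa5 = Pro/Leu/Met, Xa6 = Ile/Leu.
FORMULA_1 = re.compile(r"[QNHSTP][YR][MLI][STAQ][PLM]CHFK[IL]CNM")

def matches_formula_1(peptide: str) -> bool:
    """True if the whole peptide fits the 13-residue pattern."""
    return FORMULA_1.fullmatch(peptide.upper()) is not None

# Backbone of the synthetic DrmIrin peptide named in the Examples.
print(matches_formula_1("QYMSPCHFKICNM"))
```

Using `fullmatch` rather than `search` keeps longer sequences that merely contain the motif from being reported as instances of the formula.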

FIG. 1 shows the results of sequence analysis of Irin in Drosophila melanogaster and other arthropod species. FIG. 1a shows the sequences of the two precursors predicted in flybase. The putative mature peptide (bold) is delimited by dibasic cleavage sequences (underlined RK or KK) and carries an amidation signal at its C-terminus (underlined GRK). A pair of cysteine residues likely to form a disulfide bond is present. FIG. 1b is an amino acid sequence alignment showing that conservation among Irin precursors from six different arthropod orders is confined to the putative mature peptide sequence (marked with asterisks). Identical amino acid residues are shaded black, and highly conserved residues are shaded gray. Dots indicate gaps introduced for alignment. FIG. 1c is an alignment of all Irins predicted from sequences captured by BLAST searches.
FIG. 2 shows that CG33696 encodes a receptor with very high selectivity and sensitivity for Irin. FIG. 2a shows the luminescence (LU) responses of CHO cells expressing the indicated GPCR, aequorin, Gq and three chimeric G-proteins (Gα-qs, Gα-qi or Gα-qo) upon treatment with ATP (50 μM) or various peptide ligands (10 μM). Luminescence responses were normalized to the response to ATP (100%). Abbreviations: SP (sex peptide) and Mip4. FIG. 2b shows the luminescence responses of CHO cells expressing CG33696 (Drosophila melanogaster IrinR), aequorin and Gαq upon treatment with various peptide ligands (10 μM), normalized to the response to 10 μM Irin. Each data point represents the mean ± standard deviation (n = 4). FIG. 2c shows the dose-response curve for Irin in CHO cells expressing Drosophila melanogaster IrinR (Drm-IrinR), aequorin and Gαq. FIG. 2d shows the dose-response curve for Irin in CHO cells expressing honeybee (Apis mellifera) IrinR (Am-IrinR), aequorin and Gαq. FIG. 2e shows the dose-response curve for Irin in CHO cells expressing silkworm (Bombyx mori) IrinR (Bom-IrinR), aequorin and Gαq.
FIG. 3 shows the phylogenetic analysis of the Irin receptor (IrinR). Evolutionary relationships were inferred from a tree built with the UPGMA method. Evolutionary distances were calculated with the Poisson correction method, in units of amino acid substitutions per site. Most clusters were supported by bootstrap values above 70% in 500 replicates. Phylogenetically, IrinRs cluster into IrinR-1 and IrinR-2. The Drosophila melanogaster IrinR is boxed. Several hymenopteran taxa (gray boxes) were found to have both IrinR-1 and IrinR-2 paralogs (black circles before the taxon names).
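The FIG. 3 legend states that evolutionary distances were computed with the Poisson correction method. Under the standard definition, if p is the proportion of aligned sites at which two sequences differ, the distance is d = -ln(1 - p). The sketch below assumes pre-aligned sequences of equal length and skips gapped columns; the function name and the toy sequences are hypothetical, and this is not the MEGA5 implementation.

```python
import math

GAP_CHARS = {"-", "."}

def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b)
             if a not in GAP_CHARS and b not in GAP_CHARS]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)

# Toy aligned fragments (hypothetical): one difference in four ungapped columns.
d = poisson_distance("ACD-E", "ACDQF")
print(round(d, 4))
```

A matrix of such pairwise distances is the usual input to a distance-based tree builder such as UPGMA.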

Hereinafter, the present invention is described in more detail through examples. These examples serve only to illustrate the invention more concretely, and it will be apparent to those of ordinary skill in the art that, in accordance with the gist of the invention, its scope is not limited by them.

Examples

Materials and Methods

Identification of signaling peptide precursors in the Drosophila genome

To identify novel signaling peptides, the inventors performed a multi-step analysis of 2,648 protein entries using the NCBI (http://www.ncbi.nlm.nih.gov/nucleotide/), Beebase (http://hymenopteragenome.org/beebase/), Beetlebase (http://beetlebase.org) and VectorBase (https://www.vectorbase.org) databases. First, because all known peptide precursor genes, like other secreted proteins, carry a signal sequence, the inventors used SignalP4.1 (http://www.cbs.dtu.dk/services/SignalP; 14) to select proteins with a predicted signal sequence. Because the signal peptide that most peptide precursors carry at the C-terminal end is often predicted as a single transmembrane domain, the inventors checked the number of predicted transmembrane domains and removed peptide precursors with multiple transmembrane domains from the candidates. Next, because many unidentified peptide precursor genes were expected to lack known functional domains (e.g., kinase, proteinase) and therefore to have no predicted molecular function, the inventors screened for peptide precursors without a clear GOA (gene ontology annotation). To this end, the inventors set the 2,648 protein entries as candidate genes encoding peptide precursors, ran DELTA-BLAST on the 2,648 candidates against the NCBI nr database, and screened for proteins with similar counterparts in non-Drosophilidae insects. Finally, the inventors manually inspected the BLAST results to identify proteins containing one or more regions that are conserved across species and have mono- or dibasic cleavage sites. Through this approach, the inventors identified CG13936 and additional peptide precursor candidates in the Drosophila genome.
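The multi-step filter described above can be sketched as a chain of predicates. This is a minimal illustration, assuming that the per-protein annotations (signal peptide prediction, predicted transmembrane-domain count, GO terms, DELTA-BLAST hits) have already been computed elsewhere. The `Candidate` record, its field names and the example entries are hypothetical, and the final manual inspection of conserved regions and cleavage sites is not automated here.

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    """Hypothetical per-protein summary; field names are illustrative."""
    name: str
    has_signal_peptide: bool           # e.g. from a SignalP prediction
    n_transmembrane: int               # number of predicted transmembrane domains
    go_terms: list = field(default_factory=list)  # gene ontology annotations
    non_drosophilid_hit: bool = False  # similar protein outside Drosophilidae

def passes_screen(c: Candidate) -> bool:
    """Apply the automated filters in the order given in the text."""
    return (
        c.has_signal_peptide          # step 1: predicted signal sequence
        and c.n_transmembrane <= 1    # step 2: drop multi-pass membrane proteins
        and not c.go_terms            # step 3: no clear GO annotation
        and c.non_drosophilid_hit     # step 4: counterpart in other insects
    )

# Made-up entries for demonstration only.
proteins = [
    Candidate("toy_precursor", True, 1, [], True),
    Candidate("toy_kinase", True, 0, ["kinase activity"], True),
    Candidate("toy_receptor", True, 7, [], True),
]
shortlist = [c.name for c in proteins if passes_screen(c)]
print(shortlist)
```

Ordering the cheap boolean checks before any expensive search step keeps the candidate set small by the time similarity searches are run.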

Phylogenetic analysis

Sequences were obtained from NCBI (http://www.ncbi.nlm.nih.gov/nucleotide/), Beebase (http://hymenopteragenome.org/beebase/), Beetlebase (http://beetlebase.org) and VectorBase (https://www.vectorbase.org). Data from the NCBI nr database were used preferentially, and the search was extended to the assembled genomes in organism-specific databases. For sequences identified only in assembled genomes, the partial sequences captured by BLAST searches were used. In the phylogenetic analysis of the GPCR CG33696 and its related receptors (orthologues and paralogues), only the sequence region encoding transmembrane domains 1 to 7 was analyzed with MEGA5 [11]. Some mosquito species, such as Culex and Aedes, contain more than two copies (about 4-6 copies) of CG33696 homologs. Some of these are annotated as different genes but were found to have identical nucleotide sequences, and are most likely artifacts of genome assembly errors. They were therefore excluded from further analysis.

GPCR assay

cDNA clones for Drosophila melanogaster CG33639 (IP11344), CG13229 (AT19640) or CG33696 (RE32713) were obtained from the DGRC (Drosophila Genomics Resource Center; Indiana University). A codon-optimized ORF clone of the amino acid sequence predicted from GB42369-RA (BeeBase ID; http://hymenopteragenome.org/beebase/), the honeybee (Apis mellifera) orthologue of CG33696, was synthesized by Bioneer. For expression in CHO cells, the full-length coding sequences of these clones were subcloned into the pcDNA3.1(+) vector (Invitrogen). Plasmids for SPR, human codon-optimized aequorin, wild-type Gαq protein or chimeric Gαq proteins (Gα-qi, -qo and -qs) were as described previously [5]. The silkworm (Bombyx mori) GPCR BNGR-A18 (GenBank Accession No. AB330439) was kindly provided by Dr. H. Kataoka (University of Tokyo). The plasmids were transfected into CHO-K1 cells cultured in DMEM/F-12 medium (Welgene, Korea) using Fugene6 according to the manufacturer's instructions. The procedures used for the receptor assays were as described previously (15). HPLC-purified synthetic DrmIrin (pQYMSPCHFKICNM-amide; pQ, pyroglutamic acid; disulfide bond between the two C residues) was obtained from Anygene (Korea).

Results and Further Discussion

To identify novel signaling peptides, the inventors performed an in silico analysis (see Materials and Methods) of 2,648 Drosophila melanogaster protein entries obtained from the NCBI (http://www.ncbi.nlm.nih.gov/nucleotide/), Beebase (http://hymenopteragenome.org/beebase/), Beetlebase (http://beetlebase.org) and VectorBase (https://www.vectorbase.org) databases, and identified CG13936 as a candidate. CG13936 is predicted to encode two different precursors (CG13936-PB and CG13936-PD; SEQ ID NO: 3 and SEQ ID NO: 4, respectively) through alternative mRNA splicing (www.flybase.org). The two variants are highly similar to each other, and each product contains one putative mature peptide (bold in FIG. 1a) delimited by dibasic cleavage sites (combinations of K and R; underlined at both ends of the bold text in FIG. 1a). Notably, the putative peptide has a canonical amidation site at its C-terminal end (the underlined G residue at the C-terminus of the bold text in FIG. 1a), which indicates that the peptide has an amidated C-terminus. The mature peptide derived from CG13936-PB, but not the one from CG13936-PD, has a glutamine residue at its N-terminus, an amino acid that is often converted to pyroglutamic acid. The putative mature peptide also has two cysteine residues flanking four amino acids (underlined cysteines in FIG. 1a), which suggests that it is cyclized by a disulfide bond between these cysteines. The inventors named this group of peptides, which share the C-terminal end motif, Insulin regulatory protein (Irin).
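The features used above to recognize a putative mature peptide (flanking dibasic cleavage sites, a C-terminal glycine as amidation signal, an N-terminal glutamine, and two cysteines flanking four residues) can be checked mechanically. The sketch below is a simplified illustration: the regular expressions, the function name and the toy precursor fragment (the DrmIrin backbone placed between KK and GRK) are assumptions made for demonstration, not the actual CG13936 precursor sequence.

```python
import re

# Dibasic site, captured peptide, then Gly followed by a dibasic site
# (the Gly acting as the amidation signal). Simplified on purpose.
MATURE = re.compile(r"[KR]{2}([A-Z]+?)G[KR]{2}")

def describe(peptide: str) -> dict:
    """Report two of the sequence features discussed in the text."""
    return {
        "n_terminal_gln": peptide.startswith("Q"),  # pyroglutamate candidate
        "cys_x4_cys": re.search(r"C[A-Z]{4}C", peptide) is not None,
    }

toy_fragment = "AAKKQYMSPCHFKICNMGRKAA"  # hypothetical flanks, for illustration
match = MATURE.search(toy_fragment)
peptide = match.group(1) if match else ""
print(peptide, describe(peptide))
```

A real precursor may contain several basic residues in a row or more than one candidate peptide, so a pattern like this is a starting point for manual inspection rather than a replacement for it.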

The inventors identified Irin precursors in the genome and mRNA databases of many arthropod species, including the major insect orders (data not shown). Despite extensive searches, the inventors could not identify Irin in genome-sequenced Lepidoptera such as the silkworm (Bombyx mori) and Danaus plexippus. Comparative analysis of precursor sequences from several species revealed marked conservation, particularly in the predicted mature peptide region (FIG. 1b), which supports the inventors' prediction. Furthermore, the C-terminal amidation site and the two cysteine residues were well conserved in all identified Irins (FIG. 1c), indicating their functional importance.

The inventors examined the expression pattern of CG13936 (Irin) by checking the modENCODE tissue expression data available from flybase (http://flybase.org). Irin is expressed weakly but predominantly in the adult head and in the digestive system of wandering larvae.

To test whether Irin encodes a functional peptide, the inventors synthesized Drosophila melanogaster Irin with an N-terminal pyroglutamic acid, C-terminal amidation and a disulfide bridge between the two cysteine residues, and used a heterologous GPCR assay system to examine whether the synthetic peptide could activate a small panel of related orphan GPCRs (CG33639, CG13229 and CG33696 (aka CG16726)) (FIG. 2a). Because the downstream G-protein partners coupled to these receptors were unknown, the inventors expressed each receptor together with Gα-q, the Ca²⁺ reporter aequorin and three chimeric G proteins (Gα-qs, Gα-qi or Gα-qo), each designed to couple Gα-s-, Gα-i- or Gα-o-dependent GPCRs, respectively, to the Ca²⁺ pathway. In this assay, CHO cells expressing the orphan GPCR CG33696 showed a significant Ca²⁺ response to Irin, whereas cells expressing CG33639 or CG13229 showed none (FIG. 2a). Next, the inventors examined the specificity of CG33696 by comparing Irin with 19 other signaling peptides, some of which (e.g., orcokinins and neuropeptide-like peptides) have no known receptor. When tested at 10 μM, CG33696 showed no measurable [Ca²⁺]_i response to any tested peptide other than Irin (FIG. 2b).

The inventors then examined the dose-response relationship in CHO cells expressing CG33696 and Gα-q. Irin proved to be a very potent ligand for CG33696 (EC₅₀ of 7.1 nM; FIG. 2c), comparable to other peptide-receptor interactions of known physiological relevance (e.g., SP has an EC₅₀ of 1.3 nM for SPR). The inventors therefore named CG33696 the Irin receptor (IrinR).
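To make the EC₅₀ value concrete: EC₅₀ is the ligand concentration that produces a half-maximal response. A common way to model such dose-response curves is the Hill equation, response = c^n / (EC₅₀^n + c^n). The sketch below assumes a Hill coefficient of 1 and a response normalized to a maximum of 1.0. The source reports only the EC₅₀ (7.1 nM), not the fitted model or curve, so these choices and the function name are illustrative assumptions.

```python
def hill_response(conc_nm: float, ec50_nm: float, hill_n: float = 1.0) -> float:
    """Fractional response (0 to 1) at a given ligand concentration in nM."""
    if conc_nm < 0 or ec50_nm <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    top = conc_nm ** hill_n
    return top / (ec50_nm ** hill_n + top)

EC50_IRIN_NM = 7.1  # value reported in the text for Irin on CG33696

# Concentrations in nM; 10,000 nM corresponds to the 10 uM screening dose.
for conc in (0.71, 7.1, 71.0, 10_000.0):
    print(f"{conc:>8} nM -> {hill_response(conc, EC50_IRIN_NM):.3f}")
```

Under this model the response at the EC₅₀ is exactly one half by construction, and a 10 μM dose lies far into the saturating part of the curve for a ligand with a nanomolar EC₅₀.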

Phylogenetic analysis indicated that SPR (sex peptide receptor; aka CG16752) is related to IrinR (FIG. 3). SPR showed very low but significant sensitivity to Irin: 10 μM Irin induced significant SPR activation, but the response did not exceed about 40% of the 50 μM control (data not shown). In contrast, SP (sex peptide) and Mip, the known agonists of SPR, can activate the receptor at concentrations as low as 1 nM (data not shown). Based on these results, the inventors conclude that Irin is a genuine functional peptide and that its receptor, IrinR, is highly sensitive to and selective for Irin.

Like the Irin gene, the gene for its receptor, IrinR, was widely distributed across arthropod genomes. To understand the relationships among IrinRs from different species, the inventors built phylogenetic trees of IrinR and related GPCRs using four different methods: UPGMA, maximum parsimony, NJ (neighbor-joining) and maximum likelihood [10, 11]. The tree-building methods each gave slightly different results. In the UPGMA tree (FIG. 3), most clusters were supported by bootstrap values >70% in 500 replicates. With the other three methods, the IrinR clade included the Aplysia californica sequences, which the UPGMA tree separated as an outgroup, with low bootstrap values (<30%). Nevertheless, all four methods resolved two distinct IrinR clades: IrinR-1 and IrinR-2 (FIG. 3).

Notably, many hymenopteran species (gray boxes) have both IrinR-1 and IrinR-2, which means that IrinR-1 and IrinR-2 are paralogous to each other (black circles in FIG. 3). The one exception is the honeybee (Apis mellifera), which has only IrinR-2. Mosquito species also have two IrinRs, but both belong to the IrinR-1 clade. For example, Anopheles gambiae has two genes in the IrinR-1 clade, tandemly repeated within a 10 kb genomic region. The genomes of other mosquito species, such as Aedes aegypti and Culex quinquefasciatus, also showed multiple receptor copies in the IrinR-1 clade, but these were excluded from the phylogenetic analysis because of ambiguity in the genome assemblies, which potentially contain allelic contigs or scaffolds. The phylogenetic positions of the mosquito IrinRs show that an additional gene duplication occurred during mosquito speciation, after the divergence of IrinR-1 and IrinR-2. All other species, however, have either IrinR-1 or IrinR-2: Lepidopteran, Dipteran, Pediculus (louse) and Acyrthosiphon (aphid) species have only IrinR-1, whereas Tribolium, crustaceans and arachnids have only IrinR-2. Having searched the most recent databases in great depth, the inventors are confident that the IrinRs of moth species missing from the tree are truly absent. Taken together, these results led the inventors to conclude that an ancient duplication, followed by relatively late and independent losses of either the IrinR-1 or the IrinR-2 gene in different lineages, produced the current phylogenetic distribution of IrinRs. Similarly, various duplications and losses of the CCAP receptor in different insect lineages have been reported previously (13).

To examine whether genes in the IrinR-2 clade also encode functional Irin receptors, the inventors expressed the honeybee IrinR (AmIrinR) in CHO cells and tested whether Drosophila melanogaster Irin could induce a [Ca²⁺]_i response in the AmIrinR-expressing cells. Irin activated AmIrinR with a strong EC₅₀ value (5.8 nM) (FIG. 2d), which means that a receptor in the IrinR-2 clade is as sensitive to Irin as the Drosophila melanogaster IrinR, which belongs to the IrinR-1 clade.

The phylogenetic distribution patterns of ligand peptides and their GPCRs consistently support coevolution of ligand-receptor pairs [8, 12]. Interestingly, the inventors found a marked disparity between the phylogenetic distributions of Irin and IrinR: the IrinR gene is found in lepidopteran species (e.g., Bombyx mori and Danaus plexippus) (FIG. 3), whereas the Irin peptide gene is not observed in these species. The inventors therefore examined whether the silkworm (Bombyx mori) IrinR (also known as BNGR-A18) encodes a functional receptor for Irin. BNGR-A18 expressed in CHO cells was indeed activated by Irin, but with very low sensitivity (EC₅₀ > 1,000 nM; FIG. 2e). One possible hypothesis for why the silkworm, which lacks Irin, retains a receptor for it is that the Irin peptide gene, like the IrinR receptor gene, was duplicated during arthropod evolution and a gene copy was then lost in some lineages. This hypothesis predicts the existence of additional peptide ligands for IrinR, especially in lepidopteran species that lack a recognizable Irin. Because the conserved regions of peptide genes are always very short, identifying Irin paralogs by BLAST search is very difficult. Further investigation using a reverse pharmacology approach is therefore needed to identify the ligand(s) of the silkworm IrinR.

Because Irin is expressed in the nervous system, the inventors explored the biological functions of the Irin signaling system in the following major behavioral paradigms: feeding, learning and memory, sexual behavior, and sleep. After knocking down Irin in the nervous system using RNAi, the inventors examined male courtship behavior, female reproductive behavior (mating rate, egg laying and re-mating rate), and sleep. Despite the absence of Irin expression in the nervous system, the RNAi flies did not differ from control flies in any of the behaviors examined.

In this study, the inventors identified a novel peptidergic signaling system that evolved specifically in arthropods, which make up a major group of animal species. Comparative analysis of this signaling system may provide important insight into how regulatory peptides and their receptor GPCRs have evolved over long periods of time.

Specific parts of the present invention have been described in detail above. It will be apparent to those of ordinary skill in the art that this specific description is merely one embodiment and does not limit the scope of the present invention. Accordingly, the substantive scope of the present invention is defined by the appended claims and their equivalents.

References

1. Nässel DR & Winther AME (2010) Drosophila neuropeptides in regulation of physiology and behavior. Prog. Neurobiol. 92, 42-104.

2. Taghert PH & Nitabach MN (2012) Peptide neuromodulation in invertebrate model systems. Neuron 76, 82-97.

3. Hewes RS & Taghert PH (2001) Neuropeptides and neuropeptide receptors in the Drosophila melanogaster genome. Genome Res. 11, 1126-1142.

4. Brody T (2000) Drosophila melanogaster G protein-coupled receptors. J. Cell Biol. 150, F83-F88.

5. Yapici N, Kim Y-J, Ribeiro C & Dickson BJ (2008) A receptor that mediates the post-mating switch in Drosophila reproductive behaviour. Nature 451, 33-37.

6. Ida T, Takahashi T, Tominaga H, Sato T, Kume K, Ozaki M, Hiraguchi T, Maeda T, Shiotani H, Terajima S, Sano H, Mori K, Yoshida M, Miyazato M, Kato J, Murakami N, Kangawa K & Kojima M (2011) Identification of the novel bioactive peptides dRYamide-1 and dRYamide-2, ligands for a neuropeptide Y-like receptor in Drosophila. Biochem. Biophys. Res. Commun. 410, 872-877.

7. Ida T, Takahashi T, Tominaga H, Sato T, Kume K, Yoshizawa-Kumagaye K, Nishio H, Kato J, Murakami N, Miyazato M, Kangawa K & Kojima M (2011) Identification of the endogenous cysteine-rich peptide trissin, a ligand for an orphan G protein-coupled receptor in Drosophila. Biochem. Biophys. Res. Commun. 414, 44-48.

8. Kim Y-J, Bartalska K, Audsley N, Yamanaka N, Yapici N, Lee J-Y, Kim Y-C, Markovic M, Isaac E, Tanaka Y & Dickson BJ (2010) MIPs are ancestral ligands for the sex peptide receptor. Proc. Natl. Acad. Sci. U.S.A. 107, 6520-6525.

9. Liu F, Baggerman G, D'Hertog W, Verleyen P, Schoofs L & Wets G (2006) In silico identification of new secretory peptide genes in Drosophila melanogaster. Mol. Cell. Proteomics 5, 510-522.

10. Sneath PHA & Sokal RR (1973) Numerical Taxonomy: The Principles and Practice of Numerical Classification, 1st ed. W. H. Freeman and Company.

11. Tamura K, Peterson D, Peterson N, Stecher G, Nei M & Kumar S (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731-2739.

12. Park Y, Kim Y-J & Adams ME (2002) Identification of G protein-coupled receptors for Drosophila PRXamide peptides, CCAP, corazonin, and AKH supports a theory of ligand-receptor coevolution. Proc. Natl. Acad. Sci. U.S.A. 99, 11423-11428.

13. Li B, Beeman RW & Park Y (2011) Functions of duplicated genes encoding CCAP receptors in the red flour beetle Tribolium castaneum. J. Insect Physiol. 57, 1190-1197.

14. Petersen TN, Brunak S, von Heijne G & Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 8, 785-786.

15. Kim et al., PNAS April 27, 2004 vol. 101 no. 17 6704-6709.

<110> Gwangju Institute of Science and Technology (GIST)
<120> GPCR Ligand Peptides and a Conserved Motif Thereof
<130> PN130540
<160> 6
<170> KopatentIn 2.0

<210> 1
<211> 13
<212> PRT
<213> Artificial Sequence
<220>
<223> Irin peptide
<400> 1
Gln Tyr Met Ser Pro Cys His Phe Lys Ile Cys Asn Met
1               5                   10

<210> 2
<211> 15
<212> PRT
<213> Artificial Sequence
<220>
<223> another Irin peptide
<400> 2
Asn Val Gln Tyr Met Ser Pro Cys His Phe Lys Ile Cys Asn Met
1               5                   10                  15

<210> 3
<211> 193
<212> PRT
<213> Artificial Sequence
<220>
<223> CG13936-PB
<400> 3
Met Ser Ala Leu Ser Ala Pro Thr Thr Cys Gly Cys Ser Pro Val His
1               5                   10                  15
Trp Ala Ile Val Ile Val Leu Leu Ser Val Ala Ile Gly Pro Gly Asp
            20                  25                  30
Ala Met Ala Arg Pro Ala Arg Asn Thr Gln Leu Leu Phe Ser Glu Leu
        35                  40                  45
Leu Gly Gly Gly Asn Asp Asp Asn Asn Tyr Tyr Gly Asp Gln Leu Lys
    50                  55                  60
Tyr Gln Gln Gln Gln Gln Gln Gln Gln Glu Gln Lys Gln Gln Arg Val
65                  70                  75                  80
Pro Ala Phe Ala Arg Lys Trp Pro Ser Leu Arg Asp Leu Leu Leu Thr
                85                  90                  95
Val Asp Tyr Asp Asp Phe Gly Val Thr Gln Glu Ser Glu Glu Gln Val
            100                 105                 110
Ala Pro Ser Ser Arg Leu Leu Ala Arg Leu His Arg Leu Gly Asp Asn
        115                 120                 125
Gly Gly Gly Glu Glu Leu Arg Tyr Asn Val Val Asn Glu Leu Thr Asn
    130                 135                 140
Met Pro Ser Lys Lys Val Met Pro Gly His Pro Leu Lys Asp His Asn
145                 150                 155                 160
Thr Lys Lys Asn Val Gln Phe Arg Lys Gln Tyr Met Ser Pro Cys His
                165                 170                 175
Phe Lys Ile Cys Asn Met Gly Arg Lys Arg Asn Ala Gly Phe Asn Ser
            180                 185                 190
Tyr

<210> 4
<211> 189
<212> PRT
<213> Artificial Sequence
<220>
<223> CG13936-PD
<400> 4
Met Ser Ala Leu Ser Ala Pro Thr Thr Cys Gly Cys Ser Pro Val His
1               5                   10                  15
Trp Ala Ile Val Ile Val Leu Leu Ser Val Ala Ile Gly Pro Gly Asp
            20                  25                  30
Ala Met Ala Arg Pro Ala Arg Asn Thr Gln Leu Leu Phe Ser Glu Leu
        35                  40                  45
Leu Gly Gly Gly Asn Asp Asp Asn Asn Tyr Tyr Gly Asp Gln Leu Lys
    50                  55                  60
Tyr Gln Gln Gln Gln Gln Gln Gln Gln Glu Gln Lys Gln Gln Arg Val
65                  70                  75                  80
Pro Ala Phe Ala Arg Lys Trp Pro Ser Leu Arg Asp Leu Leu Leu Thr
                85                  90                  95
Val Asp Tyr Asp Asp Phe Gly Val Thr Gln Glu Ser Glu Glu Gln Val
            100                 105                 110
Ala Pro Ser Ser Arg Leu Leu Ala Arg Leu His Arg Leu Gly Asp Asn
        115                 120                 125
Gly Gly Gly Glu Glu Leu Arg Tyr Asn Val Val Asn Glu Leu Thr Asn
    130                 135                 140
Met Pro Ser Lys Lys Val Met Pro Gly His Pro Leu Lys Asp His Asn
145                 150                 155                 160
Thr Lys Lys Asn Val Gln Tyr Met Ser Pro Cys His Phe Lys Ile Cys
                165                 170                 175
Asn Met Gly Arg Lys Arg Asn Ala Gly Phe Asn Ser Tyr
            180                 185

<210> 5
<211> 1440
<212> DNA
<213> Drosophila melanogaster
<400> 5
atggatatgg agtatataac tagcagtagt ggcaacataa ctgccactac agaagcagat   60
ttcagttcat ctttgggcga atccaatgtc acggaataca atacaacgga aatggacgcg   120
aatgaatcag ctggcgagga tgaggagatg ctaaggatag ccttttttat agggcacttc   180
gtgcatcaat actatatacc agtgctttgc tgcacgggca gcattggcaa tatcctctcc   240
gtgtttgtct tctttaggac caagttgaga aagttgtcct ccagctttta cctggccgct   300
ctcgcagtga gcgacacctg cttcctggcc ggactcttcg cacagtggct gaacttcctc   360
aatgtggata tctataatca aaactacttt tgccagttct tcacgttctt cagctacctg   420
gccagcttct gctccgtctg gtttgtggtg gccttcaccg tggagcgctt cattgccgtc   480
atcaatccac tgaagcgcca gaccatgtgc accgtgcgcc gggccaagat cgtcctgttt   540
tgtttgacac tcgtgggatg tctgcactgt ctgccctata tagtcattgc caagccggtg   600
tttatgccca aattgaacac gaccatttgc gatctcaact cggaatacaa agaacaactg   660
gccctcttca actactggga caccatagtt gtctatgcgg tgcccttcac caccatcgcc   720
gttctgaaca cctgtacagg ttgtacggtg tggaagttcg ccaccgttcg ccgaacgctg   780
accatgcaca agatgaagcc ccaaacgaac agcatgccat cgaactcgtc gaactcatcg   840
ggcggagcct catctgcagt ggcctcgtac cgcctatcgg catcgctgaa gcgccagaag   900
tcgacgggaa cgcatccaag tggacagcat aatgtggcca acaggcagac ggatgatcag   960
gagcagcaac agcagtcgca gcagcatcaa ataaacaatt gccaacacca ttgcgagatt   1020
acgcagaaac cagcccgtcg caaggtgcaa aactcatcgc agctcaaggt aaccaaaatg   1080
ctgctaattg tctcgacggt ttttgtttgc ctcaatttgc ccagctgcct gctgcgcatc   1140
gaggcctatt gggagacgga gtcggcccga aaccagaatt ccacaattgc tttgcaatat   1200
atttttcacg ctttcttcat caccaatttt ggcatcaatt tcgtgcttta ctgcgtcagt   1260
ggacagaatt tccgcaaagc cgttttgagt attttccgaa gggtttcatc cgctcaacgg   1320
gaggcgggaa acacccaagt gacagtttct gaatactgtc gcaatactgg cacatccaca   1380
cgtcgtcgaa tgatgacgca acattgttgg aacgaaatgc acgagttgca tccactcaag   1440
                                                                    1440

<210> 6
<211> 480
<212> PRT
<213> Drosophila melanogaster
<400> 6
Met Asp Met Glu Tyr Ile Thr Ser Ser Ser Gly Asn Ile Thr Ala Thr
1               5                   10                  15
Thr Glu Ala Asp Phe Ser Ser Ser Leu Gly Glu Ser Asn Val Thr Glu
            20                  25                  30
Tyr Asn Thr Thr Glu Met Asp Ala Asn Glu Ser Ala Gly Glu Asp Glu
        35                  40                  45
Glu Met Leu Arg Ile Ala Phe Phe Ile Gly His Phe Val His Gln Tyr
    50                  55                  60
Tyr Ile Pro Val Leu Cys Cys Thr Gly Ser Ile Gly Asn Ile Leu Ser
65                  70                  75                  80
Val Phe Val Phe Phe Arg Thr Lys Leu Arg Lys Leu Ser Ser Ser Phe
                85                  90                  95
Tyr Leu Ala Ala Leu Ala Val Ser Asp Thr Cys Phe Leu Ala Gly Leu
            100                 105                 110
Phe Ala Gln Trp Leu Asn Phe Leu Asn Val Asp Ile Tyr Asn Gln Asn
        115                 120                 125
Tyr Phe Cys Gln Phe Phe Thr Phe Phe Ser Tyr Leu Ala Ser Phe Cys
    130                 135                 140
Ser Val Trp Phe Val Val Ala Phe Thr Val Glu Arg Phe Ile Ala Val
145                 150                 155                 160
Ile Asn Pro Leu Lys Arg Gln Thr Met Cys Thr Val Arg Arg Ala Lys
                165                 170                 175
Ile Val Leu Phe Cys Leu Thr Leu Val Gly Cys Leu His Cys Leu Pro
            180                 185                 190
Tyr Ile Val Ile Ala Lys Pro Val Phe Met Pro Lys Leu Asn Thr Thr
        195                 200                 205
Ile Cys Asp Leu Asn Ser Glu Tyr Lys Glu Gln Leu Ala Leu Phe Asn
    210                 215                 220
Tyr Trp Asp Thr Ile Val Val Tyr Ala Val Pro Phe Thr Thr Ile Ala
225                 230                 235                 240
Val Leu Asn Thr Cys Thr Gly Cys Thr Val Trp Lys Phe Ala Thr Val
                245                 250                 255
Arg Arg Thr Leu Thr Met His Lys Met Lys Pro Gln Thr Asn Ser Met
            260                 265                 270
Pro Ser Asn Ser Ser Asn Ser Ser Gly Gly Ala Ser Ser Ala Val Ala
        275                 280                 285
Ser Tyr Arg Leu Ser Ala Ser Leu Lys Arg Gln Lys Ser Thr Gly Thr
    290                 295                 300
His Pro Ser Gly Gln His Asn Val Ala Asn Arg Gln Thr Asp Asp Gln
305                 310                 315                 320
Glu Gln Gln Gln Gln Ser Gln Gln His Gln Ile Asn Asn Cys Gln His
                325                 330                 335
His Cys Glu Ile Thr Gln Lys Pro Ala Arg Arg Lys Val Gln Asn Ser
            340                 345                 350
Ser Gln Leu Lys Val Thr Lys Met Leu Leu Ile Val Ser Thr Val Phe
        355                 360                 365
Val Cys Leu Asn Leu Pro Ser Cys Leu Leu Arg Ile Glu Ala Tyr Trp
    370                 375                 380
Glu Thr Glu Ser Ala Arg Asn Gln Asn Ser Thr Ile Ala Leu Gln Tyr
385                 390                 395                 400
Ile Phe His Ala Phe Phe Ile Thr Asn Phe Gly Ile Asn Phe Val Leu
                405                 410                 415
Tyr Cys Val Ser Gly Gln Asn Phe Arg Lys Ala Val Leu Ser Ile Phe
            420                 425                 430
Arg Arg Val Ser Ser Ala Gln Arg Glu Ala Gly Asn Thr Gln Val Thr
        435                 440                 445
Val Ser Glu Tyr Cys Arg Asn Thr Gly Thr Ser Thr Arg Arg Arg Met
    450                 455                 460
Met Thr Gln His Cys Trp Asn Glu Met His Glu Leu His Pro Leu Lys
465                 470                 475                 480
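
As a non-limiting illustration, the relationship between the Irin peptides (SEQ ID NO: 1 and 2) and the precursor sequences (SEQ ID NO: 3 and 4) can be checked with a short script. The sketch below converts the three-letter listing to one-letter codes, locates each peptide in the C-terminal part of its precursor (residues 161 onward, copied from the listing), and reports the flanking residues. The helper names `to_one_letter` and `flanking` are hypothetical and are not part of the original disclosure.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


# SEQ ID NO: 1 and 2 as given in the sequence listing.
IRIN_SEQ1 = to_one_letter("Gln Tyr Met Ser Pro Cys His Phe Lys Ile Cys Asn Met")
IRIN_SEQ2 = to_one_letter(
    "Asn Val Gln Tyr Met Ser Pro Cys His Phe Lys Ile Cys Asn Met")

# Residue 161 to the C-terminus of SEQ ID NO: 3 (CG13936-PB) and 4 (CG13936-PD).
PB_TAIL = to_one_letter(
    "Thr Lys Lys Asn Val Gln Phe Arg Lys Gln Tyr Met Ser Pro Cys His "
    "Phe Lys Ile Cys Asn Met Gly Arg Lys Arg Asn Ala Gly Phe Asn Ser Tyr")
PD_TAIL = to_one_letter(
    "Thr Lys Lys Asn Val Gln Tyr Met Ser Pro Cys His Phe Lys Ile Cys "
    "Asn Met Gly Arg Lys Arg Asn Ala Gly Phe Asn Ser Tyr")


def flanking(precursor: str, peptide: str, upstream: int = 2, downstream: int = 4):
    """Return (upstream, downstream) residues around the first occurrence of
    peptide in precursor, or None if the peptide does not occur."""
    start = precursor.find(peptide)
    if start < 0:
        return None
    end = start + len(peptide)
    return precursor[max(0, start - upstream):start], precursor[end:end + downstream]


print(flanking(PB_TAIL, IRIN_SEQ1))  # ('RK', 'GRKR')
print(flanking(PD_TAIL, IRIN_SEQ2))  # ('KK', 'GRKR')
print(flanking(PB_TAIL, IRIN_SEQ2))  # None
```

In both precursors the peptide is preceded by a pair of basic residues and followed by Gly-Arg-Lys-Arg. This is consistent with the dibasic cleavage sites recited in the claims, and the Gly immediately after the C-terminal Met is consistent with the amidated C-terminus (CONH₂) of general formula 1.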

Claims

1. A method for screening a signaling peptide ligand, comprising identifying, from the genomic sequence of an arthropod, a candidate signaling peptide precursor having a signal sequence, wherein the candidate peptide precursor has a single transmembrane domain sequence and has no domain from which a molecular function can be predicted.

2. The method of claim 1, wherein the genomic sequence is obtained from NCBI (http://www.ncbi.nlm.nih.gov/nucleotide/), BeeBase (http://hymenopteragenome.org/beebase/), BeetleBase (beetlebase.org) or VectorBase (https://www.vectorbase.org).

3. The method of claim 1, wherein the candidate peptide precursor has a signal peptide at the C-terminus of the transmembrane domain.

4. The method of claim 1, wherein the candidate peptide precursor comprises at least one sequence conserved across species.

5. The method of claim 1, wherein the candidate peptide precursor comprises mono- or dibasic cleavage sites.

6. The method of claim 1, further comprising performing a functional assay of a candidate signaling peptide derived from the candidate peptide precursor.
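
As a non-limiting illustration, the selection criteria recited in the method claims (a signal sequence, a single transmembrane domain, no domain of predictable function, and cleavage sites) can be expressed as a simple filter over pre-computed annotations. The sketch below assumes that such annotations have already been produced by separate prediction tools. The record type `PrecursorAnnotation`, its field names, and the toy records are hypothetical, and only the dibasic form of the cleavage site is checked.

```python
import re
from dataclasses import dataclass


@dataclass
class PrecursorAnnotation:
    """Hypothetical per-gene annotation; all field names are assumptions."""
    name: str
    sequence: str                   # one-letter amino acid sequence
    has_signal_sequence: bool       # result of a signal-sequence predictor
    transmembrane_domains: int      # number of predicted transmembrane domains
    functional_domains: tuple = ()  # known domain hits; empty if none


# Two consecutive basic residues (Lys or Arg), e.g. KR, RK, KK or RR.
DIBASIC_SITE = re.compile(r"[KR]{2}")


def is_candidate(p: PrecursorAnnotation, require_dibasic: bool = True) -> bool:
    """Apply the screening criteria, optionally with a dibasic-site check."""
    if not p.has_signal_sequence:
        return False
    if p.transmembrane_domains != 1:
        return False
    if p.functional_domains:
        return False
    if require_dibasic and DIBASIC_SITE.search(p.sequence) is None:
        return False
    return True


# Toy records with made-up sequences and annotations, for illustration only.
records = [
    PrecursorAnnotation("toy_A", "MSALLVAIGPRKQYMSPCHFKICNMGRKR", True, 1),
    PrecursorAnnotation("toy_B", "MSALLVAIGPRKQYMSPCHFKICNMGRKR", True, 7),
    PrecursorAnnotation("toy_C", "MSALLVAIGPAQYMSPCHFKICNMG", True, 1, ("kinase",)),
    PrecursorAnnotation("toy_D", "MSALLVAIGPAQYMSPCHFKICNMG", False, 1),
]

candidates = [p.name for p in records if is_candidate(p)]
print(candidates)  # ['toy_A']
```

Candidates that pass such a filter would still require the functional assay of claim 6 before being treated as ligands.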

7. A GPCR ligand peptide represented by the following general formula 1:
Formula 1
NH₂-Xa₁-Xa₂-Xa₃-Xa₄-Xa₅-Cys-His-Phe-Lys-Xa₆-Cys-Asn-Met-CONH₂
In general formula 1, Xa₁ is Gln, Asn, His, Ser, Thr or Pro; Xa₂ is Tyr or Arg; Xa₃ is Met, Leu or Ile; Xa₄ is Ser, Thr, Ala or Gln; Xa₅ is Pro, Leu or Met; and Xa₆ is Ile or Leu.
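
As a non-limiting illustration, general formula 1 can be read as a fixed-length pattern of 13 residues over one-letter amino acid codes. The sketch below encodes it as a regular expression. The names `FORMULA_1` and `matches_formula_1` are hypothetical, and the terminal groups (NH₂, CONH₂) are not represented in the string.

```python
import re

# General formula 1 in one-letter codes:
# Xa1 [QNHSTP], Xa2 [YR], Xa3 [MLI], Xa4 [STAQ], Xa5 [PLM],
# then Cys-His-Phe-Lys, Xa6 [IL], then Cys-Asn-Met.
FORMULA_1 = re.compile(r"[QNHSTP][YR][MLI][STAQ][PLM]CHFK[IL]CNM")


def matches_formula_1(peptide: str) -> bool:
    """Return True if the whole peptide string fits general formula 1."""
    return FORMULA_1.fullmatch(peptide) is not None


print(matches_formula_1("QYMSPCHFKICNM"))    # True  (SEQ ID NO: 1)
print(matches_formula_1("NVQYMSPCHFKICNM"))  # False (15 residues, SEQ ID NO: 2)
print(matches_formula_1("QYMSPCHFKVCNM"))    # False (Val is not allowed at Xa6)
```

In this encoding the two Cys residues fall at the 6th and 11th positions of the 13-residue pattern.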

8. The peptide of claim 7, wherein the 6th Cys residue and the 11th Cys residue in general formula 1 are connected to each other by a disulfide bond.

9. The peptide of claim 7, wherein the N-terminus and C-terminus of the peptide comprise mono- or dibasic cleavage sites.

10. The peptide of claim 9, wherein the dibasic cleavage site sequence is a combination of Lys and Arg.

11. The peptide of claim 7, wherein the peptide is a GPCR ligand peptide represented by the following general formula 2:
Formula 2
NH₂-Xa₁-Tyr-Met-Xa₂-Xa₃-Cys-His-Phe-Lys-Xa₄-Cys-Asn-Met-CONH₂
In general formula 2, Xa₁ is Gln, Asn, His, Ser or Thr; Xa₂ is Ser, Thr or Ala; Xa₃ is Pro or Leu; and Xa₄ is Ile or Leu.

12. The peptide of claim 11, wherein Xa₁ is Gln, Asn or His; Xa₂ is Ser or Thr; Xa₃ is Pro or Leu; and Xa₄ is Ile or Leu.
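
As a non-limiting illustration, the number of distinct sequences covered by general formula 1, general formula 2 and the narrower selection of claim 12 can be enumerated directly, which also confirms that each set is contained in the one before it. The sketch below lists the allowed residues per position in one-letter codes. The names `expand`, `FORMULA_1`, `FORMULA_2` and `CLAIM_12` are hypothetical.

```python
from itertools import product


def expand(positions):
    """Enumerate every sequence allowed by a list of per-position residue sets."""
    return {"".join(residues) for residues in product(*positions)}


# Allowed residues at each of the 13 positions, in one-letter codes.
FORMULA_1 = ["QNHSTP", "YR", "MLI", "STAQ", "PLM",
             "C", "H", "F", "K", "IL", "C", "N", "M"]
FORMULA_2 = ["QNHST", "Y", "M", "STA", "PL",
             "C", "H", "F", "K", "IL", "C", "N", "M"]
CLAIM_12 = ["QNH", "Y", "M", "ST", "PL",
            "C", "H", "F", "K", "IL", "C", "N", "M"]

set_1 = expand(FORMULA_1)
set_2 = expand(FORMULA_2)
set_12 = expand(CLAIM_12)

print(len(set_1), len(set_2), len(set_12))  # 864 60 24
print(set_12 <= set_2 <= set_1)             # True
print("QYMSPCHFKICNM" in set_12)            # True (SEQ ID NO: 1)
```

The counts follow from the products of the per-position choices: 6 × 2 × 3 × 4 × 3 × 2 = 864, 5 × 3 × 2 × 2 = 60, and 3 × 2 × 2 × 2 = 24.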

13. The peptide of claim 7, wherein Xa₁ in general formula 1 is Gln, and the Gln is converted to pyroglutamic acid.

14. The peptide of claim 7, wherein the peptide has the amino acid sequence of SEQ ID NO: 1.

15. A recombinant vector comprising: (a) a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 2; (b) a promoter operably linked to the nucleotide sequence; and (c) a terminator.

16. A cell transformed with the recombinant vector of claim 15.