WO2011071209A1 - System and method for identifying and classifying plant resistance genes using a hidden Markov model - Google Patents

System and method for identifying and classifying plant resistance genes using a hidden Markov model

Info

Publication number
WO2011071209A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
domain
resistance gene
gene
resistance
Prior art date
Application number
PCT/KR2010/000333
Other languages
English (en)
Korean (ko)
Inventor
허철구
김정은
이봉우
이승원
홍지만
Original Assignee
한국생명공학연구원
Priority date
Filing date
Publication date
Application filed by 한국생명공학연구원
Priority to US13/515,006 (US20120271558A1)
Publication of WO2011071209A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis

Definitions

  • The present invention relates to a system and method that construct a scoring (profile) matrix, using a hidden Markov model, for finding the domains encoded by plant resistance genes and that identify and classify resistance gene domains based on that matrix, and to a recording medium having a computer readable program recorded thereon.
  • Plants are attacked in various ways by pathogens such as bacteria, fungi and nematodes from the external environment. Plants have their own immune system, which induces defense mechanisms to resist these attacks. The defense mechanism is initiated by signaling from resistance genes that recognize foreign molecules. Resistance genes recognize pathogen-associated molecular patterns, such as effector proteins, lipopolysaccharides, peptidoglycans and glycoproteins that are delivered from pathogens into plant cells, and trigger a hypersensitive response by initiating a signal that activates the immune system (Gohre, V. and S. Robatzek, 2008, Breaking the Barriers: Microbial Effector Molecules Subvert Plant Immunity. Annu Rev Phytopathol).
  • Plant resistance genes consist of several conserved functional domain sets, and are largely divided into five groups according to the combination of these functional domains (Dangl, JL and JD Jones, 2001, Plant pathogens and integrated defence responses to infection. Nature. 411 (6839): p. 826-33).
  • The largest category is the NBS-LRR group, which encodes a nucleotide binding site (NBS) and a leucine rich repeat (LRR) domain.
  • TIR-NBS-LRR
  • CC-NBS
  • TIR: toll interleukin-1 like receptor
  • CC: coiled-coil
  • LZ: leucine zipper
  • Resistance genes whose products are located in the cell membrane encode a leucine rich repeat domain in the extracellular region and a transmembrane (TM) domain.
  • Resistance genes belonging to this group are divided into the leucine rich repeat receptor kinase (LRR-RK) group and the leucine rich repeat receptor protein (LRR-RP) group, depending on whether they encode a kinase domain in the cytoplasmic region.
  • The final class consists of proteins that have a kinase domain in the cytoplasm and no transmembrane (TM) domain.
  • Similarity search has the disadvantage of low accuracy, because even a protein with low overall similarity, or with only high local similarity, is assigned to the same candidate group as the resistance gene it is compared with.
  • The present invention therefore constructs a profile matrix with a hidden Markov model from conserved protein sequences of the domains encoded by resistance genes, and devises a method of identifying those domains based on the constructed profile matrix and a method of classifying resistance genes by the combination of identified domains.
  • The present invention, derived from this need, seeks to develop systems and methods for effectively identifying plant resistance genes, whether or not they are known from previous studies, from large numbers of nucleotide or protein sequences.
  • The present invention provides systems and methods that use protein sequences corresponding to the functional domains of resistance genes to construct profile matrices with the hidden Markov model, identify resistance gene domains using those profile matrices, and classify resistance genes using combinations of the identified domains.
  • the present invention also provides a recording medium having recorded thereon a computer readable program for performing the method.
  • Previously unknown resistance gene candidates can be identified quickly and efficiently from large plant sequences. Large numbers of sequences can be downloaded from public databases to identify previously unknown resistance genes. Not only resistance genes encoding the entire domain, but also genes encoding only some domains can be found, which can help find candidates for resistance genes from large sequences.
  • FIG. 1 shows a schematic of a system for identifying and classifying resistance genes in plants.
  • FIG. 3 shows the results of phylogenetic analysis using sequences of NBS domains having a TIR domain at the amino terminus and NBS domains having no TIR domain.
  • The clade marked by the red bar on the right contains genes encoding an NBS domain with a TIR domain, and the clade marked by the blue bar contains genes encoding an NBS domain without a TIR domain.
  • FIG. 4 is a schematic comparing the NBS domain alignments of the TNL group and the CNL group, showing the names and aligned sequences of the active motifs.
  • FIG. 5 is a graph of the scores obtained by searching protein sequences belonging to the CNL, TNL and NL groups with the two NBS domain profile matrices.
  • The blue and pink lines represent the expected values from hmmpfam using the NBS_CC and NBS_TIR profile matrices, respectively.
  • The Y axis represents the expected value and the X axis represents the resistance gene class of the input sequence.
  • FIG. 6 is a schematic of the series of steps for constructing the profile matrices of the domains encoded by resistance genes.
  • the rhombus shape represents the domain name.
  • Red rhombuses are domains identified by the profile matrices, green is the coiled-coil domain identified by the COILS program, and purple is the TM domain identified by TMHMM.
  • The red line marks the five major resistance gene groups, and the blue line marks groups of genes with the same structure as genes known to take part in plant immune signaling together with, or in association with, resistance genes.
  • The black line marks groups of genes that have not yet been identified as resistance genes but may have been, or may have evolved into, resistance genes.
  • FIG. 13 shows the search unit: 1) under Genomic Data, the distribution of Medicago truncatula resistance genes by class and the IDs of the proteins belonging to the CNL class; 2) under UniGene, the distribution of resistance genes across 32 plant species and, as a detail view, the resistance gene classification and distribution for Arabidopsis.
  • FIG. 14 shows an example of identifying a domain of a resistance gene using a profile matrix.
  • An input unit for inputting a protein or nucleotide sequence for identifying and classifying resistance genes
  • a processing unit for identifying each domain encoding a resistance gene using a profile matrix from the input sequence, and classifying the resistance gene
  • An output unit showing detailed information of the resistance gene using data from the results stored in the database
  • An input unit for inputting a protein or nucleotide sequence for finding a domain encoding a resistance gene
  • a processor capable of identifying a domain of the resistance gene using a hidden Markov model
  • An output unit which shows the gene structure of the resistance gene identified from the retrieved gene, the similar gene search result, the tree and sequence alignment result with the similar gene;
  • The invention provides a system comprising the above units, which processes large numbers of plant protein or nucleotide sequences to identify resistance gene related domains and classifies resistance genes from the combination of those domains.
  • The profile matrices can be constructed by the following steps:
  • the public database of step a) may be UniProt, but is not limited thereto.
  • The domain encoded by the resistance gene in step d) may be NBS (nucleotide binding site), LZ (leucine zipper), LRR (leucine rich repeat), TIR (toll interleukin-1 receptor) or kinase, but is not limited thereto.
  • the algorithm may be an algorithm for identifying domains using appropriate boundary values of each matrix and classifying resistance genes using a combination of identified domains.
  • The present invention also provides a method of identifying resistance gene related domains of a plant and classifying the identified resistance genes.
  • The profile matrices of step c) may be constructed by the following steps:
  • The public database may be UniProt, but is not limited thereto.
  • The domain encoded by the resistance gene may be NBS (nucleotide binding site), LZ (leucine zipper), LRR (leucine rich repeat), TIR (toll interleukin-1 receptor) or kinase, but is not limited thereto.
  • the present invention also provides a recording medium having recorded thereon a computer readable program for performing the method.
  • the processor algorithm may construct a profile matrix in the following manner to identify a domain from an input protein or nucleotide sequence.
  • the entire plant sequence was downloaded from UniProt, a public database.
  • Resistance gene candidates for the training set used to construct the profile matrices were selected from the UniProt flat file by domain name search (FIG. 2-1), technical term search (FIG. 2-2) and keyword search (FIG. 2-3). Among them, genes having only a fragment sequence and genes with predicted sequences were removed, and the protein sequences of resistance genes were collected from sequences with experimental evidence.
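  • The filtering step described above can be sketched as follows. This is a minimal illustration, not the procedure used in the patent: it assumes the usual UniProt flat-file line codes (`ID`, `DE   Flags: Fragment;`, `PE`), and it assumes that "experimental evidence" means protein existence levels 1 and 2. The function name, the sample entries and the accepted levels are hypothetical.

```python
def select_training_ids(flatfile_text, accepted_pe_levels=("1", "2")):
    """Return entry names that are not fragments and have an accepted PE level.

    Assumption: PE levels 1 (protein) and 2 (transcript) count as experimental
    evidence; the source text does not state which levels were accepted.
    """
    selected = []
    # Entries in a flat file are terminated by a line containing only "//".
    for entry in flatfile_text.split("\n//"):
        entry_id, is_fragment, pe_level = None, False, None
        for line in entry.strip().splitlines():
            if line.startswith("ID   "):
                entry_id = line.split()[1]
            elif line.startswith("DE   Flags:") and "Fragment" in line:
                is_fragment = True
            elif line.startswith("PE   "):
                pe_level = line[5:].split(":")[0].strip()
        if entry_id and not is_fragment and pe_level in accepted_pe_levels:
            selected.append(entry_id)
    return selected


# Hypothetical entries, written only to exercise the filter.
SAMPLE = """ID   RGENE1_ARATH   Reviewed;   900 AA.
DE   RecName: Full=Hypothetical disease resistance protein 1;
PE   1: Evidence at protein level;
//
ID   RGENE2_ARATH   Unreviewed;   120 AA.
DE   SubName: Full=Hypothetical NBS fragment;
DE   Flags: Fragment;
PE   2: Evidence at transcript level;
//
ID   RGENE3_ARATH   Unreviewed;   850 AA.
DE   SubName: Full=Hypothetical predicted protein;
PE   4: Predicted;
//
"""

print(select_training_ids(SAMPLE))
```

  • Only the first sample entry passes: the second is flagged as a fragment and the third has a predicted-only existence level.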
  • The five domains encoded by resistance genes, NBS (nucleotide binding site), LZ (leucine zipper), LRR (leucine rich repeat), TIR and kinase, were identified using the Pfam and Multiple Em for Motif Elicitation (MEME) programs.
  • The construction of the profile matrix for each domain is illustrated by the following example.
  • The example shows how to build the profile matrix of the NBS domain; the profile matrices of the other four domains were constructed by a similar process.
  • NBS domains have been reported to show a marked difference in sequence between the group having a TIR domain in the amino-terminal region and the group having a CC or LZ.
  • The NBS protein sequences belonging to the TNL group were named NBS_TIR and those belonging to the CNL group were named NBS_CC, and the two groups were pooled and analyzed. As a result, the NBS domains of the TNL group and the NBS domains of the CNL group were placed in completely separate clades of the tree (FIG. 3).
  • Seven active motifs have been reported in the NBS domain: P-loop, RNBS-A, kinase-2 (Kin-2), RNBS-B, RNBS-C, GLPL and RNBS-D.
  • the degree of conservation was compared based on the active motifs conserved in the sequence alignment results (FIG. 4).
  • The P-loop motif is well conserved over a wider range in the sequences of the NBS_TIR group than in those of the NBS_CC group.
  • The last amino acid of the kinase-2 (Kin-2) motif is a conserved aspartic acid (D) in the NBS_TIR group, whereas tryptophan (W) is conserved in the NBS_CC group.
  • The RNBS-A, RNBS-C and RNBS-D motifs differ significantly between the two groups in sequence and length, and the RNBS-C and RNBS-D motifs appear to be more highly conserved in the NBS_CC group. These differences can explain why the NBS domains of the NBS_TIR group and the NBS_CC group are grouped separately in the phylogenetic analysis.
  • The NBS_TIR and NBS_CC profile matrices were therefore built independently, and it was verified with UniProt protein sequences that the two NBS profile matrices can distinguish protein sequences belonging to the different groups.
  • Sequences encoding only the NBS domain (N) and some sequences of the NBS-LRR (NL) group, which has no amino-terminal domain, were also taken and searched with the NBS domain profile matrices using the hmmpfam program to compare the expected values (FIG. 5).
  • The hmmpfam expected value obtained with the NBS profile matrix built from sequences having a coiled-coil at the amino terminus (NBS_CC) is shown in blue, and the expected value obtained with the NBS profile matrix built from sequences having a TIR at the amino terminus (NBS_TIR) is shown in pink.
  • CNL protein sequences scored higher with the NBS_CC profile matrix and TNL protein sequences scored higher with the NBS_TIR profile matrix. The two matrices gave clearly different values even when only an NBS fragment sequence was entered, so it was concluded that the NBS domain can be classified using the two profile matrices (FIG. 5).
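  • The comparison of the two NBS profiles can be illustrated with a small helper. This is a sketch under assumptions: the function name and the dictionary layout are hypothetical, and it compares scores where higher is better, as in the statement that CNL sequences score higher with NBS_CC. If the compared quantity were an E-value, where lower is better, the comparison would have to be reversed.

```python
def nbs_subtype(scores):
    """Pick the NBS subtype from the scores of the two NBS profiles.

    `scores` maps the profile names "NBS_CC" and "NBS_TIR" to a score
    (higher is better). Returns "CC", "TIR", or None when neither profile
    matched or the two scores are equal.
    """
    cc = scores.get("NBS_CC")
    tir = scores.get("NBS_TIR")
    if cc is None and tir is None:
        return None
    if tir is None or (cc is not None and cc > tir):
        return "CC"
    if cc is None or tir > cc:
        return "TIR"
    return None  # equal scores: no decision


print(nbs_subtype({"NBS_CC": 120.5, "NBS_TIR": 30.2}))  # CC
print(nbs_subtype({"NBS_CC": 12.0, "NBS_TIR": 95.0}))   # TIR
```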
  • The profile matrices of the other domains encoded by resistance genes were constructed in the same way as the profile matrix of the NBS domain (FIG. 6).
  • Profile matrices are constructed through sequence alignment, manual inspection of the aligned sequences, construction of the profile matrix using a hidden Markov model, and setting of a lowest reference value for each domain, chosen by repeated experiments in consideration of the length and similarity of the domain.
  • The algorithm may identify significant resistance gene domains in the protein sequence passed from the input unit by applying, for each domain, the profile matrix and its lowest reference value.
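  • Applying a per-domain lowest reference value to search hits might look like the sketch below. The threshold numbers are placeholders invented for illustration; the text does not list the values determined by repeated experiments. The function name and the `(domain, score)` hit format are also assumptions.

```python
# Placeholder thresholds: NOT the values used in the patent.
PLACEHOLDER_MIN_SCORE = {
    "NBS": 25.0,
    "LRR": 10.0,
    "TIR": 15.0,
    "LZ": 10.0,
    "Pkinase": 30.0,
}


def significant_domains(hits, min_score=PLACEHOLDER_MIN_SCORE):
    """Keep the domains whose hit score reaches the domain's own threshold.

    `hits` is an iterable of (domain_name, score) pairs, for example parsed
    from a profile search report. Domains without a threshold are ignored.
    """
    kept = set()
    for domain, score in hits:
        threshold = min_score.get(domain)
        if threshold is not None and score >= threshold:
            kept.add(domain)
    return kept


hits = [("NBS", 80.0), ("LRR", 4.0), ("TIR", 15.0)]
print(sorted(significant_domains(hits)))  # ['NBS', 'TIR']
```

  • Keeping one threshold per domain, rather than a single global cutoff, mirrors the statement that the reference value depends on the length and similarity of each domain.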
  • Resistance genes are identified and classified with the profile matrices on the basis of protein sequences. To analyze a nucleotide sequence, it is therefore translated in 6 reading frames and the reading frame encoding the longest protein sequence is selected for the resistance gene analysis. The hmmpfam program is used to search for resistance gene related domains with the profile matrices created as described above, and the lowest reference value of each domain, determined through repeated experiments, is applied to decide whether the domain is encoded. The combination of resistance gene domains identified in this way is used to determine which group the resistance gene belongs to (FIG. 7).
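  • The 6-frame translation step can be sketched in plain Python with the standard genetic code. This is an illustration under assumptions: "longest protein sequence" is taken here to mean the longest stop-free stretch in any frame, without requiring a start methionine, which the text does not specify. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate three frames on each strand (forward first, then reverse)."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def longest_protein(dna):
    """Longest stop-free stretch over all six frames (assumed selection rule)."""
    segments = (
        seg for frame in six_frame_translations(dna) for seg in frame.split("*")
    )
    return max(segments, key=len)


print(six_frame_translations("ATGGCCAAATAG")[0])  # MAK*
print(longest_protein("ATGGCCAAATAG"))            # LFGH
```

  • In this short example the longest stop-free stretch comes from the reverse strand, which shows why all six frames have to be examined rather than only the forward frame.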
  • The algorithm for identifying the domains encoded by a resistance gene may translate the nucleotide sequence passed from the input unit into a protein sequence and then identify significant domains by applying the profile matrix and lowest reference value of each domain.
  • The NBS domain is assigned to whichever of the NBS_TIR and NBS_CC profile matrices gives the higher expected value in the hmmpfam search, so the two types can be distinguished.
  • When an LRR domain with an expected value above the lowest reference value is identified at the carboxyl terminus, the gene is classified into the TNL group if a TIR is identified at the amino terminus, and into the CNL group if a coiled-coil (CC) domain or leucine zipper (LZ) domain is identified there.
  • When the NBS domain is identified but no LRR is identified at the carboxyl terminus, the gene is classified as TN if a TIR is identified at the amino terminus and as CN if a coiled-coil domain or LZ domain is identified. If the gene contains only an LRR domain in addition to the identified NBS domain, it is classified as NLtir or NLcc, and if it contains no other domain encoded by resistance genes, it is classified as Ntir or Ncc. For these four groups, whether a gene belongs to the TIR type or the CC/LZ type is determined by the expected values obtained with the NBS profile matrices.
  • the coiled-coil domain is predicted using the COILS (version 2.2) program.
  • the TMHMM (version 2.0c) program is used to identify the transmembrane (TM) structure that is expected to be located in the cell membrane.
  • When a TM structure is identified, the gene is classified into the LRR-RK or LRR-RP group according to whether a kinase domain with an expected value above the lowest reference value is present at the carboxyl terminus. If a kinase domain with an expected value above the lowest reference value is found without a TM structure, the gene is classified as a pto-like kinase.
  • Resistance genes classified by the above process belong to the five representative classes of plant resistance genes.
  • Resistance gene groups were classified into 12 groups (TNL, pto-like kinase, LRR-RP, LRR-RK, NLcc, Tx, NLtir, CNL, Ntir, TN, CN, Ncc).
  • A gene with a TIR domain having an expected value above the lowest reference value may be classified as Tx when no domain with an NBS or LRR structure is identified.
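  • The classification rules described above can be collected into one function. This is a simplified sketch, not the patented algorithm: it takes the set of significant domains as given, ignores domain positions (amino versus carboxyl terminus), and evaluates the rules in an order chosen for this illustration. The function name and the domain labels ("NBS", "LRR", "TIR", "CC", "LZ", "TM", "Pkinase") are assumptions.

```python
def classify_resistance_gene(domains, nbs_subtype=None):
    """Map a set of identified domains to one of the 12 groups, or None.

    `nbs_subtype` is "TIR" or "CC", taken from the comparison of the two NBS
    profiles; it is only needed when no TIR, CC or LZ domain was found.
    """
    d = set(domains)
    has_tir = "TIR" in d
    has_cc = bool(d & {"CC", "LZ"})
    suffix = {"TIR": "tir", "CC": "cc"}.get(nbs_subtype)

    if "NBS" in d:
        if "LRR" in d:
            if has_tir:
                return "TNL"
            if has_cc:
                return "CNL"
            return f"NL{suffix}" if suffix else None
        if has_tir:
            return "TN"
        if has_cc:
            return "CN"
        return f"N{suffix}" if suffix else None
    if "TM" in d and "LRR" in d:
        return "LRR-RK" if "Pkinase" in d else "LRR-RP"
    if "Pkinase" in d and "TM" not in d:
        return "pto-like kinase"
    if has_tir and "LRR" not in d:
        return "Tx"
    return None


print(classify_resistance_gene({"TIR", "NBS", "LRR"}))    # TNL
print(classify_resistance_gene({"NBS", "LRR"}, "CC"))     # NLcc
print(classify_resistance_gene({"LRR", "TM", "Pkinase"})) # LRR-RK
```

  • A fuller implementation would also check that the TIR, CC or LZ domain lies on the amino-terminal side of the NBS domain and the LRR or kinase domain on the carboxyl-terminal side, as the text describes.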
  • the data corresponding to the UniGene search unit of the present invention was made by downloading and processing sequence and library information from the UniGene database of NCBI, which is a public database.
  • tissue specificity was verified using Audic's test using the distribution of the protein and the distribution of the EST (expressed sequence tag) library included in UniGene.
  • Audic's test may be an algorithm for calculating tissue specificity by Equation 1.
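  • Equation 1 is not reproduced in this text. As a hedged illustration only, the test published by Audic and Claverie for comparing tag counts between two libraries gives the probability of observing y tags in a library of size N2 given x tags in a library of size N1 as p(y|x) = (N2/N1)^y (x+y)! / (x! y! (1 + N2/N1)^(x+y+1)). The sketch below computes that published formula; whether it matches Equation 1 of the patent term by term is an assumption, and the function name is hypothetical.

```python
import math


def audic_claverie_p(x, y, n1, n2):
    """p(y | x) for counts x, y observed in libraries of sizes n1, n2.

    Computed in log space with lgamma so that large counts do not overflow.
    """
    ratio = n2 / n1
    log_p = (
        y * math.log(ratio)
        + math.lgamma(x + y + 1)
        - math.lgamma(x + 1)
        - math.lgamma(y + 1)
        - (x + y + 1) * math.log1p(ratio)
    )
    return math.exp(log_p)


# With equal library sizes and no tags in either library, p(0 | 0) = 1/2.
print(audic_claverie_p(0, 0, 1000, 1000))
```

  • For a fixed x the probabilities over all y sum to 1, so a tissue-specificity p-value can be obtained by summing p(y|x) over the counts at least as extreme as the observed one.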
  • the present invention also provides a recording medium having recorded thereon a computer readable program for carrying out a method for identifying and classifying a resistance gene of a plant of the present invention.
  • a recording medium having a computer readable program recorded thereon for performing a method for identifying a domain of a plant resistance gene and classifying a resistance gene by using a protein or nucleotide sequence.
  • Computer-readable recording medium refers to any recording medium that can be read directly and accessed by a computer.
  • Such recording media include magnetic recording media such as floppy disks, hard disks and magnetic tapes, optical recording media such as CD-ROMs, CD-Rs, CD-RWs, DVD-ROMs, DVD-RAMs and DVD-RWs, electrical recording media such as RAMs and ROMs, and mixtures of these categories (for example, magnetic/optical recording media such as MO), but are not limited to these.
  • the selection of a device or apparatus for recording or inputting the above-described recording medium or a device or apparatus for reading information in the recording medium is based on the type of recording medium and the access method.
  • Various data processor programs, software, comparators, and formats are also used to record a program for performing the method of the present invention on the medium.
  • the information can be represented, for example, in the form of a binary file, a text file or an ASCII file formatted with commercially available software.
  • FIG. 1 shows a schematic diagram of a system for identifying domains of resistance genes of plants and classifying resistance genes.
  • The system of the present invention comprises the input unit, processing unit, database, output unit and search unit described above.
  • The input unit performs a function of inputting a protein or nucleotide sequence.
  • FIG. 8 shows the input unit screen. The sequence type (protein or nucleotide) and the protein or nucleotide sequence in FASTA format are required inputs.
  • the processing unit functions to identify the resistance gene domain using the profile matrix from the input sequence information, classify the resistance gene, and store the resistance gene in a database.
  • the database stores data derived from an analysis process in the processing unit by using an algorithm for identifying a resistance gene coding domain and classifying a resistance gene.
  • the domain database stores the predicted results of domains encoding resistance genes
  • the resistance gene classification database stores classification information and protein and nucleotide base sequences through the resistance gene classification algorithm.
  • the UniProt BLAST and RefSeq BLAST databases store the results for the degree of similarity and the family of genes that have similarities between genes classified as resistant genes and resistant gene proteins derived from public databases such as UniProt and NCBI.
  • the output unit functions to output the information processed in the processing unit stored in the database on the web.
  • FIG. 9 is an overall view showing the results processed by the processing unit in the system.
  • the output part displays the result predicted using the protein sequence (FIG. 9-1) and the result predicted using the nucleotide sequence of UniGene (FIG. 9-2).
  • the output of the protein sequence can be divided into seven sub-categories: HMM results, sequence information, gene structure and similar protein groups, blast results, related references, trees, and sequence alignment results.
  • the HMM results show the results of identifying resistance gene domains using the profile matrix constructed in the algorithm using hmmpfam.
  • the table shows the domain of the resistance gene and the position of the domain on the protein sequence and the position on the matrix for each domain, and the View Info item shows the actual pfam results.
  • the sequence information section shows the amino acid sequence of proteins classified as resistance genes.
  • The domain structure of the resistance gene is shown using the domain identification results, the BLAST algorithm is used to search for similar proteins in public databases such as UniProt or NCBI, and the relative positions of the similar regions are shown.
  • the blast result is a table of similarity positions and degrees of similarity for proteins similar to the above resistance genes.
  • Relevant references include information about journals that publish experimental results of proteins that are similar to resistance genes in a database, and links each journal to the PubMed web for easy access.
  • Trees are constructed using the Neighbor-Joining (NJ) algorithm, which shows the association between query sequences and similar sequences.
  • The sequence alignment result is obtained by performing multiple sequence alignment (MSA) with ClustalW and indicates the regions that are similar between the query sequence received from the input unit and the sequences similar to it.
  • FIG. 12 summarizes the output for resistance genes predicted and classified from a nucleotide sequence, showing the parts that differ from the output for predictions made from a protein sequence.
  • FIG. 13 shows the system corresponding to the search unit, which classifies sequence information provided by public databases into resistance gene groups using the algorithm implemented in the system and stores the classified gene groups in a database.
  • the gene information of the protein corresponding to the id can be output and viewed in the same format as in the output unit.
  • Clicking displays the resistance gene information for the 32 species provided by NCBI, and clicking a species name or the graph showing the number of resistance genes of each species displays the classification for that species and the number of resistance genes in each class (FIG. 13-2).
  • the input unit for identifying the domain of the resistance gene using the profile matrix described in the algorithm is the same as the input unit of FIG. 8.
  • Profile matrices are built for five different domains (LRR, LZ, NBS, Pkinase, TIR). Clicking a domain name and entering a sequence searches the selected profile matrix and outputs the result. A protein sequence is searched directly; a nucleotide sequence is first translated in 6 reading frames, and the protein sequence of the longest ORF among the translations is searched against the profile matrix. FIG. 14 shows the results of searching the profile matrix of the Pkinase domain.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Bioethics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates to a system and method for quickly and accurately identifying and classifying the resistance genes of a plant from a protein or DNA sequence. To identify and classify plant resistance genes using a hidden Markov model, a profile matrix is built from the protein sequences of a domain encoded by the resistance genes, together with a system that identifies resistance gene domains using the profile matrix and classifies the resistance genes by their combination of domains. The present invention enables efficient identification and classification of plant resistance genes from an input nucleotide base sequence or protein sequence using the profile matrices of the program.
PCT/KR2010/000333 2009-12-11 2010-01-19 System and method for identifying and classifying plant resistance genes using a hidden Markov model WO2011071209A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/515,006 US20120271558A1 (en) 2009-12-11 2010-01-19 System and method for identifying and classifying resistance genes of plant using hidden marcov model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020090123006A KR101140780B1 (ko) 2009-12-11 2009-12-11 System and method for identifying and classifying plant resistance genes using a hidden Markov model
KR10-2009-0123006 2009-12-11

Publications (1)

Publication Number Publication Date
WO2011071209A1 true WO2011071209A1 (fr) 2011-06-16

Family

ID=44145741

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2010/000333 WO2011071209A1 (fr) 2009-12-11 2010-01-19 System and method for identifying and classifying plant resistance genes using a hidden Markov model

Country Status (3)

Country Link
US (1) US20120271558A1 (fr)
KR (1) KR101140780B1 (fr)
WO (1) WO2011071209A1 (fr)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101289403B1 (ko) * 2011-04-27 2013-07-29 한국생명공학연구원 Method for building a comparative expressed-genome analysis system for studying the evolution and function of genes of cruciferous plants
US9857328B2 (en) 2014-12-18 2018-01-02 Agilome, Inc. Chemically-sensitive field effect transistors, systems and methods for manufacturing and using the same
US10006910B2 (en) 2014-12-18 2018-06-26 Agilome, Inc. Chemically-sensitive field effect transistors, systems, and methods for manufacturing and using the same
WO2016100049A1 (fr) 2014-12-18 2016-06-23 Edico Genome Corporation Chemically-sensitive field effect transistor
US9618474B2 (en) 2014-12-18 2017-04-11 Edico Genome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US9859394B2 (en) 2014-12-18 2018-01-02 Agilome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US10020300B2 (en) 2014-12-18 2018-07-10 Agilome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US11031094B2 (en) 2015-07-16 2021-06-08 Dnastar, Inc. Protein structure prediction system
WO2017201081A1 (fr) 2016-05-16 2017-11-23 Agilome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
CN113470751A (zh) * 2021-06-30 2021-10-01 南方科技大学 Method for screening protein nanopore amino acid sequences, protein nanopore and application thereof
CN113628687A (zh) * 2021-08-13 2021-11-09 南京大学 Method for constructing a database of paired plant NLR resistance genes, and multi-species paired NLR gene database
CN114550827B (zh) * 2022-01-14 2022-11-22 山东师范大学 Gene sequence alignment method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000078944A1 (fr) * 1999-06-17 2000-12-28 Dna Plant Technology Corporation Methods for designing and identifying novel plant resistance genes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANDERS KROGH ET AL.: "Hidden Markov Models in Computational Biology : Applications to Protein Modeling", JOURNAL OF MOLECULAR BIOLOGY, vol. 235, no. ISS.5, February 1994 (1994-02-01), pages 1501 - 1531, XP024008598, DOI: doi:10.1006/jmbi.1994.1104 *
GRZEGORZ KOCZYK ET AL.: "AN ASSESSMENT OF THE RESISTANCE GENE ANALOGUES OF Oryza sativa ssp.japonica THEIR PRESENCE AND STRUCTURE", CELLULAR & MOLECULAR BIOLOGY LETTERS, vol. 8, 2003, pages 963 - 972 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108491692A (zh) * 2018-03-09 2018-09-04 中国科学院生态环境研究中心 Method for constructing an antibiotic resistance gene database
CN108491692B (zh) * 2018-03-09 2023-07-21 中国科学院生态环境研究中心 Method for constructing an antibiotic resistance gene database

Also Published As

Publication number Publication date
US20120271558A1 (en) 2012-10-25
KR20110066380A (ko) 2011-06-17
KR101140780B1 (ko) 2012-05-03

Similar Documents

Publication Publication Date Title
WO2011071209A1 (fr) Système et procédé d'identification et de classification de gènes de résistance de plantes à l'aide du modèle de markov caché
Nayfach et al. CheckV assesses the quality and completeness of metagenome-assembled viral genomes
Anderson et al. Transposable elements contribute to dynamic genome content in maize
Sun et al. SHOREmap v3. 0: fast and accurate identification of causal mutations from forward genetic screens
Merkel et al. Detecting short tandem repeats from genome data: opening the software black box
Chang et al. Identifying earthworms through DNA barcodes: Pitfalls and promise
Gouin et al. Whole-genome re-sequencing of non-model organisms: lessons from unmapped reads
Grant et al. Building a phylogenomic pipeline for the eukaryotic tree of life-addressing deep phylogenies with genome-scale data
WO2013065944A1 (fr) Procédé de recombinaison de séquence, et appareil pour séquençage de nouvelle génération
Makunin et al. A targeted amplicon sequencing panel to simultaneously identify mosquito species and Plasmodium presence across the entire Anopheles genus
CN115064215B (zh) 一种通过相似度进行菌株溯源及属性鉴定的方法
Kim et al. Hisat-genotype: Next generation genomic analysis platform on a personal computer
Cornman Relative abundance and molecular evolution of Lake Sinai Virus (Sinaivirus) clades
Dylus et al. Inference of phylogenetic trees directly from raw sequencing reads using Read2Tree
Gauthier et al. DiscoSnp-RAD: de novo detection of small variants for RAD-Seq population genomics
CN115662516A (zh) 一种基于二代测序技术的高通量预测噬菌体宿主的分析方法
Congrains et al. Phylogenomic approach reveals strong signatures of introgression in the rapid diversification of neotropical true fruit flies (Anastrepha: Tephritidae)
CN107862177B (zh) 一种区分鲤群体的单核苷酸多态性分子标记集的构建方法
Lindner et al. Performance of methods to detect genetic variants from bisulphite sequencing data in a non‐model species
Avedi et al. Metagenomic analyses and genetic diversity of Tomato leaf curl Arusha virus affecting tomato plants in Kenya
Jin et al. Haplotype-resolved genomes of wild octoploid progenitors illuminate genomic diversifications from wild relatives to cultivated strawberry
WO2023158253A1 (fr) Procédé d'analyse des variations génétiques basé sur un séquençage d'acide nucléique
Fletcher et al. AFLAP: assembly-free linkage analysis pipeline using k-mers from genome sequencing data
Groza et al. GraffiTE: a unified framework to analyze transposable element insertion polymorphisms using genome-graphs
JP2008161056A (ja) Dna配列解析装置、dna配列解析方法およびプログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10836107

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 13515006

Country of ref document: US

122 Ep: pct application non-entry in european phase

Ref document number: 10836107

Country of ref document: EP

Kind code of ref document: A1