WO2001038568A3 - Automated method for identifying related biomolecular sequences

Info

Publication number
WO2001038568A3
WO2001038568A3 PCT/IB2000/001676 IB0001676W WO0138568A3 WO 2001038568 A3 WO2001038568 A3 WO 2001038568A3 IB 0001676 W IB0001676 W IB 0001676W WO 0138568 A3 WO0138568 A3 WO 0138568A3
Authority
WO
WIPO (PCT)
Prior art keywords
family members
sequences
similarity values
automated method
identifying related
Prior art date
Application number
PCT/IB2000/001676
Other languages
French (fr)
Other versions
WO2001038568A2 (en)
Inventor
Van Huijsduijnen Rob Hooft
Jacques Colinge
Original Assignee
Applied Research Systems
Van Huijsduijnen Rob Hooft
Jacques Colinge
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to DK00973154T, patent DK1232282T3 (en)
Application filed by Applied Research Systems, Van Huijsduijnen Rob Hooft, Jacques Colinge
Priority to DE60017586T, patent DE60017586T2 (en)
Priority to EP00973154A, patent EP1232282B8 (en)
Priority to CA002386706A, patent CA2386706C (en)
Priority to IL14976700A, patent IL149767A0 (en)
Priority to AU11697/01A, patent AU782633B2 (en)
Priority to US10/148,124, patent US6996474B1 (en)
Priority to SI200030616T, patent SI1232282T1 (en)
Priority to AT00973154T, patent ATE287453T1 (en)
Priority to JP2001539910A, patent JP2003515148A (en)
Publication of WO2001038568A2 (en)
Publication of WO2001038568A3 (en)
Priority to IL149767A, patent IL149767A (en)

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention relates to an automated method for identifying related biomolecular sequences having defined features of interest from databases, the databases comprising at least a first and a second set of sequences, each set being derived from a different type of organism, comprising the steps of:
a) establishing from the first set of sequences a non-redundant list of query sequences having the defined features of interest (first family members), using a database search program;
b) performing sequence alignments with the first family members in a second set of sequences derived from a second type of organism, using a database search program and a preset similarity threshold, giving a list of second family members;
c) establishing a two-dimensional matrix displaying the first and second family members and their respective similarity values resulting from step (b), optionally displaying only those second family members having similarity values exceeding a preset threshold value;
d) selecting from the matrix those pairs of first and second family members for which the similarity values are the best among all of the alignments that involve either of the pair's two members (orthologs).
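Steps (c) and (d) amount to picking, from a matrix of similarity values, the cells that are the best in both their row and their column. The following is a minimal sketch of that selection under stated assumptions, not the patented implementation: the function name select_orthologs, the dictionary layout, the sequence labels and the similarity values are all hypothetical, the scores are assumed to have been produced already by some database search program in step (b), and the abstract does not say how ties should be handled (here every tied pair is kept).

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (first family member, second family member) -> similarity value.
Scores = Dict[Tuple[str, str], float]


def select_orthologs(scores: Scores, threshold: float = 0.0) -> List[Tuple[str, str, float]]:
    """Return (first, second, score) pairs that are mutual best hits."""
    # Step (c), optional part: keep only cells whose similarity exceeds the threshold.
    kept = {pair: s for pair, s in scores.items() if s > threshold}

    # Best score in each row (first family member) and each column (second family member).
    best_row: Dict[str, float] = {}
    best_col: Dict[str, float] = {}
    for (first, second), s in kept.items():
        best_row[first] = max(s, best_row.get(first, s))
        best_col[second] = max(s, best_col.get(second, s))

    # Step (d): a pair qualifies when its score is the best in its row and in its column.
    return sorted(
        (first, second, s)
        for (first, second), s in kept.items()
        if s == best_row[first] and s == best_col[second]
    )


# Made-up similarity values for three first-set and two second-set sequences.
example_scores: Scores = {
    ("h1", "m1"): 0.9,
    ("h1", "m2"): 0.4,
    ("h2", "m1"): 0.7,
    ("h2", "m2"): 0.8,
    ("h3", "m2"): 0.5,
}

if __name__ == "__main__":
    for first, second, score in select_orthologs(example_scores):
        print(first, second, score)
```

With these made-up values, h3 is paired with nothing: its best hit is m2, but m2 scores higher against h2, so the pair fails the column test. Storing the matrix as a sparse dictionary is a design choice for the sketch only; a dense array would work equally well when most pairs have a score.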
PCT/IB2000/001676 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences WO2001038568A2 (en)

Priority Applications (11)

Application Number Priority Date Filing Date Title
AU11697/01A AU782633B2 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
DE60017586T DE60017586T2 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
EP00973154A EP1232282B8 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
CA002386706A CA2386706C (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
IL14976700A IL149767A0 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
DK00973154T DK1232282T3 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
US10/148,124 US6996474B1 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
JP2001539910A JP2003515148A (en) 1999-11-25 2000-11-16 Automated methods for identifying cognate biomolecular sequences
AT00973154T ATE287453T1 (en) 1999-11-25 2000-11-16 AUTOMATIC METHOD FOR IDENTIFYING SIMILAR BIOMOLECULAR SEQUENCES
SI200030616T SI1232282T1 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences
IL149767A IL149767A (en) 1999-11-25 2002-05-20 Automated method for identifying related biomolecular sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP99811086.0 1999-11-25
EP99811086A EP1103911A1 (en) 1999-11-25 1999-11-25 Automated method for identifying related biomolecular sequences

Publications (2)

Publication Number Publication Date
WO2001038568A2 (en) 2001-05-31
WO2001038568A3 (en) 2001-12-20

Family

ID=8243161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2000/001676 WO2001038568A2 (en) 1999-11-25 2000-11-16 Automated method for identifying related biomolecular sequences

Country Status (13)

Country Link
US (1) US6996474B1 (en)
EP (2) EP1103911A1 (en)
JP (1) JP2003515148A (en)
AT (1) ATE287453T1 (en)
AU (1) AU782633B2 (en)
CA (1) CA2386706C (en)
DE (1) DE60017586T2 (en)
DK (1) DK1232282T3 (en)
ES (1) ES2234687T3 (en)
IL (2) IL149767A0 (en)
PT (1) PT1232282E (en)
SI (1) SI1232282T1 (en)
WO (1) WO2001038568A2 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2476412C (en) 2002-03-22 2008-02-19 Phenomenome Discoveries Inc. Method of visualizing non-targeted metabolomic data generated from fourier transform ion cyclotron resonance mass spectrometers
WO2004025416A2 (en) * 2002-09-13 2004-03-25 The Texas A & M University System Bioinformatic method for identifying surface-anchored proteins from gram-positive bacteria and proteins obtained thereby
US8032310B2 (en) * 2004-07-02 2011-10-04 The United States Of America As Represented By The Secretary Of The Navy Computer-implemented method, computer readable storage medium, and apparatus for identification of a biological sequence
AU2011203297B8 (en) * 2004-07-02 2013-07-11 The Government Of The United States Of America, As Represented By The Secretary Of The Navy Computer-Implemented Biological Sequence Identifier System and Method
US20080250016A1 (en) * 2007-04-04 2008-10-09 Michael Steven Farrar Optimized smith-waterman search
WO2011137368A2 (en) 2010-04-30 2011-11-03 Life Technologies Corporation Systems and methods for analyzing nucleic acid sequences
US9268903B2 (en) 2010-07-06 2016-02-23 Life Technologies Corporation Systems and methods for sequence data alignment quality assessment
KR101809046B1 (en) 2016-03-18 2017-12-14 고려대학교 산학협력단 Method and device for re-arranging data for analyzing the gene expression of orthologous gene

Citations (1)

Publication number Priority date Publication date Assignee Title
US5843732A (en) * 1995-06-06 1998-12-01 Nexstar Pharmaceuticals, Inc. Method and apparatus for determining consensus secondary structures for nucleic acid sequences

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
GB2283840B (en) 1993-11-12 1998-07-22 Fujitsu Ltd Genetic motif extracting method and apparatus
US5701256A (en) 1995-05-31 1997-12-23 Cold Spring Harbor Laboratory Method and apparatus for biological sequence comparison
US5873052A (en) 1996-11-06 1999-02-16 The Perkin-Elmer Corporation Alignment-based similarity scoring methods for quantifying the differences between related biopolymer sequences

Non-Patent Citations (7)

Title
D BOTSTEIN ET AL: "Yeast as a Model Organism", SCIENCE, vol. 277, 29 August 1997 (1997-08-29), pages 1259 - 1260, XP000914591 *
D.J. LIPMAN AND W.R. PEARSON: "Rapid and Sensitive Protein Similarity Searches", SCIENCE, vol. 227, 1985, pages 1435 - 1441, XP002920456 *
H.B. NICHOLAS ET AL: "A Tutorial on Searching Sequence Databases and Sequence Scoring Methods", HTTP://WWW.PSC.EDU/BIOMED/TUTORIALS/SEQUENCE/DBSEARCH/TUTORIAL.HTML, March 1998 (1998-03-01), Pittsburgh, USA, XP002139943 *
PEARSON W R ET AL: "COMPARISON OF DNA SEQUENCES WITH PROTEIN SEQUENCES", GENOMICS,US,ACADEMIC PRESS, SAN DIEGO, vol. 46, no. 1, 15 November 1997 (1997-11-15), pages 24 - 36, XP000857023, ISSN: 0888-7543 *
PEARSON W R ET AL: "IMPROVED TOOLS FOR BIOLOGICAL SEQUENCE COMPARISON", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF USA,US,NATIONAL ACADEMY OF SCIENCE. WASHINGTON, vol. 85, 1 April 1988 (1988-04-01), pages 2444 - 2448, XP002060460, ISSN: 0027-8424 *
PEARSON W R: "RAPID AND SENSITIVE SEQUENCE COMPARISON WITH FASTP AND FASTA", METHODS IN ENZYMOLOGY,US,ACADEMIC PRESS INC, SAN DIEGO, CA, vol. 183, 1 January 1990 (1990-01-01), pages 63 - 98, XP000670614, ISSN: 0076-6879 *
S.F. ALTSCHUL ET AL: "Issues in Searching Molecular Sequence Databases", NATURE GENETICS, vol. 6, 1994, pages 119 - 129, XP002920457 *

Also Published As

Publication number Publication date
EP1232282B1 (en) 2005-01-19
IL149767A0 (en) 2002-11-10
DE60017586D1 (en) 2005-02-24
WO2001038568A2 (en) 2001-05-31
PT1232282E (en) 2005-05-31
SI1232282T1 (en) 2005-06-30
DK1232282T3 (en) 2005-05-23
ATE287453T1 (en) 2005-02-15
CA2386706C (en) 2008-08-05
US6996474B1 (en) 2006-02-07
EP1103911A1 (en) 2001-05-30
CA2386706A1 (en) 2001-05-31
DE60017586T2 (en) 2005-12-22
JP2003515148A (en) 2003-04-22
AU782633B2 (en) 2005-08-18
ES2234687T3 (en) 2005-07-01
AU1169701A (en) 2001-06-04
IL149767A (en) 2008-11-03
EP1232282A2 (en) 2002-08-21
EP1232282B8 (en) 2006-01-18

Similar Documents

Publication Publication Date Title
EP1050830A3 (en) System and method for collaborative ranking of search results employing user and group profiles
WO2005036351A3 (en) Systems and methods for search processing using superunits
WO2001006414A8 (en) Method and system for organizing data
EP0326927A3 (en) Method and apparatus for processing a database
EP2261820A3 (en) Data profiling
DE69808079D1 (en) SYSTEM, METHOD AND PROGRAM PRODUCT FOR THE GROUP-ORGANIZED DATA PROCESSING OF PATENTS
WO2003044725A3 (en) Image identification system
SE9702763D0 (en) Method at database
WO2002046465A3 (en) Method for identification of genes involved in specific diseases
WO2006073951B1 (en) Adaptive fingerprint matching method and apparatus
WO2002042945A3 (en) A system and method for processing patient medical information
WO2006058986A3 (en) Method for identifying an individual based on fragments
WO2001038568A3 (en) Automated method for identifying related biomolecular sequences
WO2003050720A3 (en) Database system having heterogeneous object types
WO2003060807A3 (en) Methods for determining polypeptide structure, function or pharmacophore from comparison of polypeptide sequences
WO2002080649A3 (en) Methods and systems for searching genomic databases
WO2002006974A3 (en) Method for comparing search profiles
EP1653378A3 (en) System and method for data entry and search
EP0210866A2 (en) Inductive inference apparatus
CA2343379A1 (en) Process for cyclic, interactive image analysis, and also computer system and computer program for performing the process
WO2003085552A3 (en) Comparison of source files
WO2003009180A3 (en) Method and system for reorganizing a tablespace in a database
WO2002095394A3 (en) Method for analyzing a biological sample
CN1248152C (en) System and method for discovering patterns with noise
ATE375564T1 (en) DATA PROCESSING METHOD AND ASSOCIATED SOFTWARE PROGRAM

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE WIPO information: entry into national phase

Ref document number: 11697/01

Country of ref document: AU

WWE WIPO information: entry into national phase

Ref document number: 2386706

Country of ref document: CA

WWE WIPO information: entry into national phase

Ref document number: 2000973154

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 149767

Country of ref document: IL

ENP Entry into the national phase

Ref country code: JP

Ref document number: 2001 539910

Kind code of ref document: A

Format of ref document f/p: F

WWP WIPO information: published in national office

Ref document number: 2000973154

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWE WIPO information: entry into national phase

Ref document number: 10148124

Country of ref document: US

WWG WIPO information: grant in national office

Ref document number: 2000973154

Country of ref document: EP

WWG WIPO information: grant in national office

Ref document number: 11697/01

Country of ref document: AU