EP1563444A2 - Method for producing virtual chromosomes - Google Patents
Method for producing virtual chromosomes
- Publication number
- EP1563444A2 (application EP03798163A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- chromosomes
- chromosome
- virtual
- value
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
- 238000004519 manufacturing process Methods 0.000 title claims abstract description 12
- 210000000349 chromosome Anatomy 0.000 claims abstract description 126
- 210000003918 fraction a Anatomy 0.000 claims abstract description 3
- 230000002559 cytogenic effect Effects 0.000 claims description 29
- 238000000034 method Methods 0.000 claims description 21
- 241000282414 Homo sapiens Species 0.000 claims description 16
- 230000002759 chromosomal effect Effects 0.000 claims description 16
- 238000012986 modification Methods 0.000 claims description 15
- 230000004048 modification Effects 0.000 claims description 15
- 230000002068 genetic effect Effects 0.000 claims description 14
- 230000000877 morphologic effect Effects 0.000 claims description 14
- 230000005945 translocation Effects 0.000 claims description 9
- 210000003917 human chromosome Anatomy 0.000 claims description 7
- 238000013507 mapping Methods 0.000 claims description 7
- 238000012937 correction Methods 0.000 claims description 5
- 238000004590 computer program Methods 0.000 claims description 3
- 238000010230 functional analysis Methods 0.000 claims description 3
- 238000009499 grossing Methods 0.000 claims description 2
- 108020004414 DNA Proteins 0.000 description 13
- 108091028043 Nucleic acid sequence Proteins 0.000 description 12
- 238000004458 analytical method Methods 0.000 description 10
- 239000000523 sample Substances 0.000 description 9
- 230000000052 comparative effect Effects 0.000 description 6
- 238000009833 condensation Methods 0.000 description 6
- 230000005494 condensation Effects 0.000 description 6
- 230000000875 corresponding effect Effects 0.000 description 6
- 238000009826 distribution Methods 0.000 description 6
- 238000009396 hybridization Methods 0.000 description 6
- 108090000623 proteins and genes Proteins 0.000 description 6
- 238000010186 staining Methods 0.000 description 6
- 230000005856 abnormality Effects 0.000 description 5
- 238000000126 in silico method Methods 0.000 description 5
- 230000008707 rearrangement Effects 0.000 description 5
- 240000006829 Ficus sundaica Species 0.000 description 4
- 238000013459 approach Methods 0.000 description 4
- 239000012634 fragment Substances 0.000 description 4
- 239000002773 nucleotide Substances 0.000 description 4
- 125000003729 nucleotide group Chemical group 0.000 description 4
- 238000012800 visualization Methods 0.000 description 4
- 208000031404 Chromosome Aberrations Diseases 0.000 description 3
- 206010028980 Neoplasm Diseases 0.000 description 3
- 102000004142 Trypsin Human genes 0.000 description 3
- 108090000631 Trypsin Proteins 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 238000010276 construction Methods 0.000 description 3
- 230000035772 mutation Effects 0.000 description 3
- 238000010606 normalization Methods 0.000 description 3
- 238000010422 painting Methods 0.000 description 3
- 239000012588 trypsin Substances 0.000 description 3
- 108020005351 Isochores Proteins 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 238000004422 calculation algorithm Methods 0.000 description 2
- 210000002230 centromere Anatomy 0.000 description 2
- 238000005056 compaction Methods 0.000 description 2
- 230000008602 contraction Effects 0.000 description 2
- 230000002596 correlated effect Effects 0.000 description 2
- 230000014509 gene expression Effects 0.000 description 2
- 230000001295 genetical effect Effects 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 230000016507 interphase Effects 0.000 description 2
- 238000003909 pattern recognition Methods 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 206010008805 Chromosomal abnormalities Diseases 0.000 description 1
- 239000003298 DNA probe Substances 0.000 description 1
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 238000007792 addition Methods 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 239000006227 byproduct Substances 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 230000032823 cell division Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 231100000005 chromosome aberration Toxicity 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 239000013068 control sample Substances 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 210000001840 diploid cell Anatomy 0.000 description 1
- 239000000975 dye Substances 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000007901 in situ hybridization Methods 0.000 description 1
- 238000012886 linear function Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000031864 metaphase Effects 0.000 description 1
- 244000005700 microbiome Species 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 210000004940 nucleus Anatomy 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 230000003334 potential effect Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 102000004169 proteins and genes Human genes 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- WFKWXMTUELFFGS-UHFFFAOYSA-N tungsten Chemical compound [W] WFKWXMTUELFFGS-UHFFFAOYSA-N 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 238000011179 visual inspection Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B45/00—ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
Definitions
- the present invention relates to a method for producing a virtual chromosome representing a respective natural chromosome, as well as to a virtual chromosome or a part thereof represented by values according to its CG content, a set of virtual chromosomes or parts thereof, and the use of a set of virtual chromosomes.
- the human genome is ordered in a highly structured hierarchical fashion.
- the nucleus of a diploid cell contains approximately 2 × 3.17 × 10^9 nucleotide bases, which are arranged into 46 intricately packed DNA threads that become visible in the form of separate chromosomes during cell division.
- the specific features of these chromosomes, such as number, form, structure and banding patterns, provide the basis for their microscopic assessment through conventional cytogenetic means. This technique is still the most important screening tool for the identification of constitutional as well as acquired karyotype abnormalities.
- ISCN International System for Human Cytogenetic Nomenclature
- the chromosomal banding pattern reflects alternating CG-rich and CG-poor sequence compartments. Nevertheless, the prevalent notion was that the correlation between chromosome bands and CG content could only be considered on the whole as a rather weak approximation. In this context, it seemed highly unlikely that the banding phenomenon could simply result from long-range variations in the linear base pair composition alone. Instead, it was presumed that the banding pattern is significantly co-determined and modified by structural factors, such as the folding, protein coverage, packaging and condensation of the DNA as well as by the accessibility of dyes to DNA.
- CCAP Cancer Chromosome Aberration Project
- an aim of the present application is to provide a method for producing a virtual chromosome, which chromosome provides an accurate banding pattern in a scale independent and highly region specific fashion with high resolution.
- these produced virtual chromosomes should comprise not only morphological information as the conventional ideograms or ISCN strings, but also the corresponding genetical information, e.g. the sequence data.
- Such a virtual chromosome comprising the complete sequence data, being comparable to the conventional representations of chromosomes and having a high resolution has not been produced so far.
- a further aim of the present application is to provide a set of chromosomes which can be used as an interface between conventional representations of chromosomes and genetical information and sequence data, in order to directly compare DNA sequence derived data and natural chromosomal banding patterns.
- Chromosomes produced by this method were found to provide an excellent correlation between their banding pattern and that of their corresponding natural counterparts.
- This astonishing concordance not only indicates that the chromosomal banding pattern is to a large extent directly determined by the underlying DNA sequence, but can also provide a unique basis for the joint processing of morphological and molecular genetic data within a single DNA sequence based framework.
- although it was presumed that Giemsa banding patterns cannot be explained by differences in base composition alone, the present method shows that a sequence based display of chromosomes according to CG content is possible and results in high resolution virtual chromosomes.
- a method for producing a virtual chromosome representing a respective natural chromosome refers not only to complete chromosomes but also to parts thereof, for example separate chromosome arms or ends.
- the sequence data of the natural chromosome can be retrieved for example from any electronically available data; in the case of human chromosomes this may be for example sequences from the Human Genome Project Working Draft (http://genome.ucsc.edu/).
- chromosome refers to any chromosomes of any organism.
- the organism is for example the human being, however also any animal, in particular mammal, may be the organism from which the chromosome is derived.
- Virtual chromosomes from mammals and human beings are particularly preferred since they can be used for evolutionary studies.
- the step of dividing the sequence data into fractions and determining the CG content in every fraction will preferably be carried out electronically.
- value between a minimum and a maximum value refers to any parameter which is suitable to define and preferably virtually represent a specific CG content. This may be for example a percentage between 0% and 100% or a value between 0 and 1 which can for example be visualised in a two- or three-dimensional picture. The values can furthermore be represented by light values or colour values. Any value which can be visualised is useful for representing a given CG content.
- the step of calculating the value for every fraction according to its CG content may be carried out with any suitable table, algorithm, formula or programme, whereby for example a minimum value is assigned to a minimum amount of CG in a fraction and a maximum value is assigned to a maximum amount of CG in a fraction.
- the values in between are then assigned as a linear function between the two extremes to the varying CG content in the fractions.
- the value is a light value.
- the advantage of a light value is that the visualisation can be interpreted very rapidly and simply and can furthermore be compared to conventionally produced, for example microscopically taken or ISCN chromosomes.
- the maximum value is represented by white, the minimum value is represented by black and values in between are represented by grey shades.
- This representation of a virtual chromosome is directly comparable to the conventionally scanned chromosomes, however the resolution is very high and the virtual chromosome further comprises the sequence data information which is missing in the chromosome representations according to the state of the art.
- the natural chromosome is divided into fractions of a minimum length of 1000, preferably 5000, more preferred 10,000, especially 20,000, and/or into fractions of a length of 10,000 to 1,000,000 bp, preferably a length of 50,000 to 500,000 bp, still preferred a length of 100,000 to 300,000 bp (or a combination of these values).
- fractions are sufficiently small, in order to provide high resolutions and a maximum amount of information.
- a fraction corresponds to the estimated average size of a DNA loop as well as to that of an isochore.
- An optimal length of a fraction is for example 200,000 base pairs.
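The fraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the applicants' implementation: the function name `cg_content_per_fraction` is hypothetical, unsequenced bases (`N`) are left out of the count, and returning `None` for a fraction without any sequenced base is an assumption made here so that such fractions can be filled in by other means.

```python
def cg_content_per_fraction(sequence, fraction_length=200_000):
    """Split a sequence into consecutive fractions and return the CG share of each.

    Unsequenced bases ('N') are ignored. A fraction without any sequenced
    base yields None (an assumption of this sketch).
    """
    contents = []
    for start in range(0, len(sequence), fraction_length):
        fraction = sequence[start:start + fraction_length].upper()
        cg = fraction.count("C") + fraction.count("G")
        at = fraction.count("A") + fraction.count("T")
        sequenced = cg + at
        contents.append(cg / sequenced if sequenced else None)
    return contents


# Toy example with a fraction length of 4 instead of 200,000 bases
print(cg_content_per_fraction("CCGGAATTNNNN", fraction_length=4))
```

With a real chromosome the default fraction length of 200,000 bases would be used; the toy call only shows the three possible outcomes (CG only, no CG, no sequence).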
- the fraction with a CG content of 30 to 35%, preferably 33%, is assigned to a minimum value and the fraction with a CG content of 60 to 65%, preferably 62%, is assigned to a maximum value. It was found that a variation between these percentages produces chromosomes with grey values of bands corresponding to the conventionally represented chromosomes. Therefore this visualisation of the virtual chromosome can be directly compared to conventional chromosome representations, as for example perceived through a microscope.
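A minimal sketch of the linear assignment between the two extremes, assuming the preferred limits of 33% and 62% CG. The function name `cg_to_grey`, the use of 256 grey levels, and the clamping of values outside the range are assumptions made for illustration; the text only specifies black for the minimum, white for the maximum and grey shades in between.

```python
def cg_to_grey(cg_percent, cg_min=33.0, cg_max=62.0, levels=256):
    """Map a CG percentage linearly onto an integer grey level.

    cg_min maps to 0 (black) and cg_max to levels - 1 (white).
    Values outside the range are clamped (an assumption of this sketch).
    """
    fraction = (cg_percent - cg_min) / (cg_max - cg_min)
    fraction = min(1.0, max(0.0, fraction))
    return round(fraction * (levels - 1))


for cg in (33, 41, 62):
    print(cg, cg_to_grey(cg))
```

Under these assumptions a fraction with 41% CG lands in the darker third of the grey scale, because 41% lies much closer to the 33% minimum than to the 62% maximum.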
- for fractions whose sequence is missing, the value is assigned according to their morphological appearance. Although the number of such fractions has become very low, in particular for human chromosomes owing to the nearly complete human genome, and will generally decrease rapidly, the few missing sequence fractions can be supplemented with data derived from the morphological appearance, as for example taken from an ideogram, in order to provide a complete chromosome.
- a filter for smoothing the appearance is applied, preferably a Gaussian convolution filter. The resulting shades were also used to gradually fill the last few pixels at the chromosome boundaries.
- a scale correction filter is applied for the production of the virtual chromosome.
- This may be a normalisation and non-linear gamma-like grey scale correction filter.
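The two filters can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the filter actually used: the kernel length, the standard deviation, the edge handling (edge values are repeated) and the gamma exponent are illustrative choices, and all function names are hypothetical.

```python
import math


def gaussian_kernel(length, sigma):
    """Return a normalised 1-D Gaussian kernel with `length` taps."""
    centre = (length - 1) / 2
    weights = [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(length)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values, kernel):
    """Convolve values with kernel, repeating the edge values at the boundaries."""
    half = len(kernel) // 2
    last = len(values) - 1
    smoothed = []
    for i in range(len(values)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(last, max(0, i + k - half))  # clamp index to the valid range
            acc += weight * values[j]
        smoothed.append(acc)
    return smoothed


def normalise_and_gamma(values, gamma):
    """Rescale values to 0..1, then apply a power-law (gamma-like) correction."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [((v - lo) / (hi - lo)) ** gamma for v in values]


# A sharp band edge becomes a gradual transition after smoothing
kernel = gaussian_kernel(5, 1.0)
print(smooth([0, 0, 0, 0, 1, 1, 1, 1], kernel))
print(normalise_and_gamma([0, 5, 10], 2.0))
```

An odd kernel length keeps the kernel centred on the stripe being smoothed; with an even length the window in this sketch is shifted by half a stripe.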
- a further aspect of the present application relates to a virtual chromosome or a part thereof, represented by values according to its CG content which is characterised in that it is produced according to the inventive method as defined above.
- a virtual chromosome is provided which does not only comprise morphological information and can be compared to for example conventional ideograms but which also comprises sequence data.
- This sequence based visualised chromosome according to the present invention shows an excellent correlation between the banding pattern of virtual chromosomes and that of the corresponding natural counterparts.
- the same definitions and preferred embodiments as mentioned above apply.
- the value is a light value, still preferred the maximum value is represented by white, the minimum value is represented by black and values in between are represented by grey shades.
- this allows a representation of the chromosome as seen through a microscope and therefore is optimal for comparing with conventional chromosome representations, as for example ideograms or microscopically perceived chromosomes.
- a set of virtual chromosomes or parts thereof is provided which is characterised in that it comprises two or more chromosomes or parts thereof according to the present invention as defined above.
- the set comprises a maximum number of chromosomes whereby this set can continuously be supplemented with additional newly found or newly identified chromosomes.
- the set comprises chromosomes or parts thereof specific for one or more organism(s).
- the advantage of a set which is specific for one organism lies in the fact that this set is useful for comparing any newly identified modifications or rearrangements of chromosomes of that organism.
- the set comprises 24 human chromosomes or parts thereof.
- This set will be a standard for the normal human chromosomes and can be used for comparing chromosomes of a patient with normal chromosomes in order to detect any modifications or rearrangements.
- the set further comprises additional modified chromosomes or parts thereof, preferably chromosomes with translocations.
- modified chromosomes which are related to a specific illness, for example a specific tumor.
- with a classification of such modified virtual chromosomes, in which preferably every modification carries a reference to a specific illness, it is possible to readily assign an illness, or the risk of developing an illness, to a set of chromosomes isolated from a patient by comparing the set of virtual chromosomes with the patient's chromosomes. Because new modifications in chromosomes are constantly being detected, the set can be rapidly and continuously updated with the newest medical information.
- chromosome modification refers to any sequence modification, e.g. any mutation or translocation of a chromosome fragment.
- a further aspect of the present invention is the use of the set of virtual chromosomes according to the present invention as mentioned above for cataloguing chromosome modifications.
- the inventive set is particularly useful for providing electronic information on chromosome modifications and their connection with any illness or risk of illness.
- chromosomal modifications relates to any modification in the chromosome. This may be on the level of a sequence mutation or on the level of a complete translocation of a chromosome fragment. Due to the high resolution any chromosome modification can be detected and catalogued.
- the inventive set allows chromosome abnormalities to be described with a hitherto unknown molecular precision, while it remains possible to interpret fuzzier large scale events on the plain chromosomal level, such as those obtained with conventional cytogenetic analysis, chromosome painting, multicolour FISH, comparative genome hybridisation (CGH) and comparative expressed sequence hybridisation (CESH).
- a further aspect of the present application relates to the use of an inventive set of virtual chromosomes as defined above for virtually mapping the chromosomal position of a sequence. Since the inventive set of virtual chromosomes derives from and represents the complete human DNA sequence, it is possible to map and display the chromosomal position of any given, known or unknown sequence or set of sequences that are contained in a data base used to produce the virtual chromosomes.
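For a sequence whose base-pair coordinate on the chromosome is already known, placing it on the virtual chromosome reduces to integer arithmetic. The following sketch assumes a 0-based coordinate and the 200,000 base pair fraction length mentioned earlier; the function name `locate` is hypothetical, and finding the coordinate of an unknown sequence (for example by alignment against the database) is outside the scope of this sketch.

```python
def locate(position_bp, chromosome_length_bp, fraction_length=200_000):
    """Return (fraction index, relative position 0..1) for a 0-based base position."""
    if not 0 <= position_bp < chromosome_length_bp:
        raise ValueError("position lies outside the chromosome")
    return position_bp // fraction_length, position_bp / chromosome_length_bp


# Hypothetical 1 Mb chromosome: position 450,000 falls into the third fraction
print(locate(450_000, 1_000_000))
```

The fraction index identifies the stripe of the virtual chromosome to mark, and the relative position can be multiplied by the image height to obtain a pixel row.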
- a further aspect of the present application relates to the use of an inventive set of virtual chromosomes as defined above as an interface between morphological and molecular genetic data.
- the morphological data is derived from information based on the International System for Human Cytogenetic Nomenclature (ISCN).
- a virtual chromosome-based system will ensure that previously collected data remain accessible and analyzable.
- a combination of ISCN nomenclature- and virtual chromosome-based graphic interface tools that may be superimposed onto existing cytogenetic databases as mentioned above is extremely useful.
- Such an interface enables the transformation of ISCN information into the corresponding karyotype image.
- a karyotype picture that is generated with such a virtual chromosome tool can be translated into an ISCN string.
- Such a graphic interface is extremely valuable for visually crosschecking the ISCN description, by comparing the karyotype picture with the virtual chromosome image.
- the superimposition of virtual chromosomes as a graphic interface on molecular genetic databases aids in the visualization of any type of FISH-, DNA- and RNA-derived data sets as well as gene expression profiles in a standardized "chromosomal" fashion.
- the advantages of such a chromosomal representation are that it is independent of the probe distribution on the diverse arrays, and that its "natural" appearance makes it easier to comprehend and compare by visual inspection.
- the representation of gene expression profiles in such a style is of growing interest, since there is accumulating evidence that even functionally unrelated genes are expressed in transcriptional territories in Drosophila, as well as in the human genome.
- the resulting distribution pattern will resemble those that derive from CGH and CESH analyses, in which differentially labeled DNA or cDNA from a tissue of interest and a control sample are simultaneously hybridized directly onto chromosomes. Consequently, such data sets can be directly correlated and cross-analyzed with other karyotype patterns, for example with the associated karyotype abnormalities.
- the set of chromosomes serves as reference for classifying a phenotype to a sequence arrangement.
- sequence arrangement refers to any sequence modification, e.g. sequence mutations or translocations of chromosome fragments.
- the phenotype may refer to normal or abnormal phenotypes, e.g. various illnesses such as tumors.
- any chromosome isolated from a patient and analysed with conventional microscopic methods can be compared to the inventive set of chromosomes. Similarities between the modifications of the chromosomes would imply also similar phenotypes or at least the likelihood or risk of developing a similar phenotype.
- the set of chromosomes serves as a tool for carrying out structural and functional analyses, respectively, of a sequence arrangement.
- the analysis can be carried out by gene mapping or virtual hybridisation due to the sequence data comprised in the virtual chromosome.
- the set of chromosomes serves as a tool for determining the influence of a given factor on a sequence arrangement.
- the influence of an external factor, such as a chemical substance, energy of various wavelengths or even microorganisms, can be analysed on a cytogenetical basis and transferred into or compared to the inventive set of chromosomes, whereby the implications or resulting phenotypes can be retrieved or foreseen.
- the chromosomes or set of chromosomes may also be stored on a computer program product (a CD, DVD, diskette or on a web server) as well as the method according to the present invention (as computer readable program means for causing a computer to control execution of the method according to the present invention).
- Fig. 1 shows images of DNA sequence derived human chromosomes compared to Trypsin/Giemsa banded chromosome images
- Fig. 2 represents a modelling of uneven condensation of G bands depending on their CG content
- Fig. 3 shows virtual chromosomes compared to the cytogenetic and molecular genetic maps
- Fig. 4 shows a virtual in situ hybridisation
- Fig. 5 shows the construction of virtual chromosome abnormalities
- Fig. 6 represents a graphic interface between cytogenetic and molecular genetic data sets.
- Gaps arising from unsequenced nucleotides (N's) were not taken into consideration.
- the CG content of the individual strips ranged from 33% to 62%, with a mean value of 41%.
- the tables with these data were stored in temporary files together with the information about the segment and band coordinates of the particular chromosome, and were further computed and assembled with Mathematica as shown below:
- The derived bars were then integrated into the particular chromosome boundaries, as defined by their length and centromere position. The centromeric, heterochromatic and satellite regions, for which no appropriate sequence information is as yet available, were artificially supplemented according to their morphological appearance.
- a Gaussian convolution filter N(0.1), 22 stripes in length
- misplacements of sequence portions may change the local banding pattern in a noticeable way, as becomes particularly obvious when comparing the long arms of the virtual chromosomes 1 and 11 from various releases.
- the location of the centromeres of chromosomes 5 (December 2001 release), 7 and 12 (both April 2002 release) moved to odd positions.
- with the advances and continuous corrections, the sequence assembly improved significantly, as did the concordance between the natural and virtual banding patterns.
- Such a comparison of virtual chromosomes that derive from different sequence releases can thus also provide an independent validation of the sequence map.
- Virtual chromosomes allow structural and functional analyses of the genome and provide the means to study the influence of different factors on the large-scale chromosomal banding pattern.
- the example shown here relates to the potential effects of the unequal contraction of light and dark bands during chromosome condensation.
- the analysis is based on the notion that Giemsa dark bands may contain up to 11 times more DNA than the light bands and that the DNA compaction ratio is on the order of the cubic root of the respective DNA length (see Fig. 2).
- the ISCN chromosome 7 (850 band-stage) is shown together with its virtual counterpart and three different ideograms in Fig. 3a.
- the banding pattern of the left ideogram is based on the location of the turning points between CG-richer and CG-poorer regions in the sequence-based virtual chromosome.
- the curve follows the mean CG content.
- the UCSC ideogram (August 2001 release) is placed in the middle and the corresponding ISCN (850 band-stage) one on the right.
- the left half of the virtual chromosome displays the raw, unenhanced grey values of the respective CG content, whereas in the right half the contrast has been enhanced according to the curve shown in the graph (b).
- the thin horizontal lines provide an absolute 10 Mb scale.
- band 4(q21) encompasses between 11.5 and 14.2 Mb (6.0% - 7.4% of chromosome 4) and band 11(q23) between 10.9 and 11.7 Mb (7.9% - 8.5% of chromosome 11).
- the breakpoints might be located anywhere within these bands. To demonstrate this point, we have assigned them to the outer boundaries of the bands in question. Most likely, this is also already one of the highest resolutions that can be achieved with an average morphological chromosome analysis.
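As a quick plausibility check on the band sizes quoted above, each megabase figure can be divided by its percentage to give the chromosome length it implies. The helper name `implied_length_mb` is hypothetical; the numbers are those given in the text.

```python
def implied_length_mb(band_mb, band_percent):
    """Chromosome length (Mb) implied by a band size and its share of the chromosome."""
    return band_mb / (band_percent / 100.0)


bands = {
    "4(q21) lower bound": (11.5, 6.0),
    "4(q21) upper bound": (14.2, 7.4),
    "11(q23) lower bound": (10.9, 7.9),
    "11(q23) upper bound": (11.7, 8.5),
}
for name, (size_mb, percent) in bands.items():
    print(f"{name}: {implied_length_mb(size_mb, percent):.1f} Mb")
```

Both bounds for 4(q21) imply a chromosome 4 of roughly 192 Mb, and both bounds for 11(q23) imply a chromosome 11 of roughly 138 Mb, so the megabase and percentage figures are mutually consistent.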
- virtual chromosomes cover the nine orders of magnitude of the whole genome in a highly condensed, easily expandable and the most natural imaginable "morphological" fashion. Therefore, they can be utilized as a unique front-end tool for the visualization of the information contained in any sequence database; in effect basically from a single base pair up to whole chromosomes in a cytogenetic manner.
- we show here on a one Mb scale the distribution of the approximately 15,000 genes and CpGs from the UCSC database along virtual chromosomes and the corresponding UCSC color ideograms. For practical reasons, the bars for the genes are drawn at only half the height of the CpG bars, and the CpG bars on chromosome 19 are cut off.
- the vertical bars on the left of the chromosome indicate the size and position of heterochromatic and satellite regions that were artificially supplemented, since their sequence is not yet available.
- the fine horizontal bars on the left of the chromosomes indicate sequence gaps.
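The one Mb distribution of genes and CpGs described above amounts to counting annotated positions per bin along the chromosome. A minimal sketch, assuming 0-based base-pair start positions as input; the function name `count_per_bin` and the example positions are hypothetical and do not come from the UCSC database.

```python
from collections import Counter


def count_per_bin(positions_bp, bin_size=1_000_000):
    """Count annotated positions per consecutive bin of `bin_size` bases."""
    counts = Counter(pos // bin_size for pos in positions_bp)
    n_bins = max(counts) + 1 if counts else 0
    return [counts.get(i, 0) for i in range(n_bins)]


# Hypothetical gene start positions on one chromosome
print(count_per_bin([100, 999_999, 1_000_000, 3_500_000]))
```

Each count would then be drawn as a horizontal bar next to the matching stretch of the virtual chromosome, scaled as required.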
Landscapes
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Biophysics (AREA)
- Medical Informatics (AREA)
- Theoretical Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Data Mining & Analysis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AT14302002 | 2002-09-24 | | |
| AT0143002A AT412476B (de) | 2002-09-24 | 2002-09-24 | Method for producing a virtual chromosome |
| PCT/EP2003/010254 WO2004029747A2 (en) | 2002-09-24 | 2003-09-16 | Method for producing virtual chromosomes |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP1563444A2 (de) | 2005-08-17 |
Family
ID=32034594
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP03798163A Withdrawn EP1563444A2 (de) | 2002-09-24 | 2003-09-16 | Method for producing virtual chromosomes |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP1563444A2 (de) |
| AT (1) | AT412476B (de) |
| AU (1) | AU2003275968A1 (de) |
| WO (1) | WO2004029747A2 (de) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3573066B1 (de) | 2012-03-13 | 2023-09-27 | The Chinese University Of Hong Kong | Method for analyzing massively parallel sequencing data for noninvasive prenatal diagnosis |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6136540A (en) * | 1994-10-03 | 2000-10-24 | Ikonisys Inc. | Automated fluorescence in situ hybridization detection of genetic abnormalities |
2002
- 2002-09-24 AT: application AT0143002A, patent AT412476B (de), not active (IP right cessation)

2003
- 2003-09-16 AU: application AU2003275968A, patent AU2003275968A1 (en), not active (abandoned)
- 2003-09-16 WO: application PCT/EP2003/010254, patent WO2004029747A2 (en), not active (ceased)
- 2003-09-16 EP: application EP03798163A, patent EP1563444A2 (de), not active (withdrawn)
Non-Patent Citations (1)
| Title |
|---|
| See references of WO2004029747A2 * |
Also Published As
| Publication number | Publication date |
|---|---|
| ATA14302002A (de) | 2004-08-15 |
| AT412476B (de) | 2005-03-25 |
| WO2004029747A3 (en) | 2005-05-26 |
| AU2003275968A1 (en) | 2004-04-19 |
| WO2004029747A2 (en) | 2004-04-08 |
| AU2003275968A8 (en) | 2004-04-19 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20050418 |
| | AK | Designated contracting states | Kind code of ref document: A2. Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
| | AX | Request for extension of the European patent | Extension state: AL LT LV MK |
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| | 18D | Application deemed to be withdrawn | Effective date: 20060331 |