US20140368550A1 - Distributed System Providing Dynamic Indexing And Visualization Of Genomic Data - Google Patents
- Publication number
- US20140368550A1, US14/363,788
- Authority
- US
- United States
- Prior art keywords
- genomic
- data
- information
- sequence
- scale
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B45/00—ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G5/00—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
- G09G5/36—Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
- G09G5/37—Details of the operation on graphic patterns
- G09G5/373—Details of the operation on graphic patterns for modifying the size of the graphic pattern
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/10—Ontologies; Annotations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/24—Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2340/00—Aspects of display data processing
- G09G2340/04—Changes in size, position or resolution of an image
- G09G2340/045—Zooming at least part of an image, i.e. enlarging it or shrinking it
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09G—ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
- G09G2354/00—Aspects of interface with display user
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
Definitions
- the field of the invention is computational genomics, especially as it relates to dynamic graphic representation of complex genetic information.
- sequencing speed is no longer the bottleneck in genome analysis; the bottleneck is now data storage, retrieval, and coordinated analysis.
- the difficulties associated with data storage, retrieval, and analysis are further compounded by the varying requirements for displayed information from different users. Viewed from a different perspective, information-dense and selective presentation of genomic data is paramount to making use of the massive quantity of data now available.
- the inventive subject matter is directed to methods and devices for dynamic visualization of genomic data in which a genomic visualization system adapts presentation of information content according to scale-relevant annotations within a sequence object.
- adaptive content display can be achieved with significantly reduced data analysis and transfer.
- a genomic visualization system comprising an indexed genomic database that stores a sequence object representative of a genomic region.
- the sequence object includes a plurality of scale-relevant annotations.
- a scaling engine is coupled with the indexed genomic data storage and is configured to (a) adjust scale-relevant information derived from the scale-relevant annotations of the sequence object as a function of a user selected zoom level, (b) dynamically generate a genomic display object representative of the scale-relevant information based on the zoom level, and (c) configure an output device to present the genomic display objects to a user.
- the sequence object has a SAM/BAM or BAMBAM format, and/or the genomic region is a whole genome, a chromosome, a chromosomal fragment, or an allele.
- one or more bamservers and/or visualization servers may operate as the scaling engine.
- the scaling engine may be further configured to adjust the scale-relevant information by downsampling based on the zoom-level (wherein downsampling may be a function of data density derived from the zoom-level).
- the scaling engine is configured to determine the zoom level, and optionally to summarize a full data set of the sequence object according to the zoom level.
- the scaling engine may also be configured to derive the scale relevant information from differences in scale-relevant annotations in different sequence objects.
- the sequence object comprises a reference sequence object, which is most preferably raw sequence data, sequence data from homo statisticus, and/or sequence data from a specified point in time.
- the sequence object comprises a differential sequence object with respect to a reference genomic region (e.g., reference genomic region from homo statisticus or to a specific point in time).
- the scale relevant annotations may vary considerably and will preferably include genomic structure information (e.g., chromosome identification, location within a chromosome, allele, etc.), genomic change information (e.g., a mutation, a translocation, an inversion, a deletion, a repeat, and a copy number), disease information (e.g., type of disease, a status of disease, and a treatment option for the disease), gene relevant information (e.g., raw sequence data or processed sequence data, gene identification, information on gene regulation, and information of association of the gene with a disease), differential information relative to a reference sequence, and/or metadata (e.g., patient identification, facility identification, physician identification, and insurance information).
- the genomic visualization system will further include a genomic graphic library that stores a graphic object representative of scale relevant annotations.
- the scaling engine maps the scale relevant information to graphic objects from the graphic library according to the zoom level, and that the genomic display object comprises the mapped graphic objects.
- among suitable output devices, a display, a browser, a printer, a 3D printer, and/or a speaker are typically preferred.
- FIG. 1 provides an overview of a distributed genomic visualization environment.
- FIG. 2 illustrates a possible genomic visualization system including a visualization scaling engine.
- FIG. 3 is an exemplary display view at base zoom level.
- FIG. 4 is the exemplary display view of FIG. 3 at a sub-kilobase zoom level.
- FIG. 5 is the exemplary display view of FIG. 4 at a kilobase zoom level.
- FIG. 6 is the exemplary display view of FIG. 5 at a chromosome zoom level.
- the inventive subject matter is directed to devices and methods for dynamic visualization of genomic data.
- Contemplated systems and methods allow for selective and scalable display of information-rich content while reducing data aggregation and traffic.
- computing devices comprise a processor configured to execute software instructions stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, solid state drive, RAM, flash, ROM, etc.).
- the software instructions preferably configure the computing device to provide the roles, responsibilities, or other functionality as discussed below with respect to the disclosed apparatus.
- the various servers, systems, databases, or interfaces exchange data using standardized protocols or algorithms, possibly based on HTTP, HTTPS, AES, public-private key exchanges, web service APIs, known financial transaction protocols, or other electronic information exchanging methods.
- Data exchanges are preferably conducted over a packet-switched network such as the Internet, a LAN, a WAN, a VPN, or another type of packet-switched network.
- a server can include one or more computers operating as a web server, database server, or other type of computer server in a manner to fulfill described roles, responsibilities, or functions.
- the term “coupled to” is intended to include both direct coupling (in which two elements that are coupled to each other contact each other) and indirect coupling (in which at least one additional element is located between the two elements). Therefore, the terms “coupled to” and “coupled with” are used synonymously.
- Contemplated devices and methods combine advantageous features of a bamserver and a genome visualization engine that are loosely coupled such as to allow for trivial integration with other alternative genomic powered engines or other genomic data storage solutions.
- each component can scale as necessary to accommodate multiple bamservers or multiple visualization engines, as schematically and exemplarily illustrated in FIG. 1 .
- each server is flexible enough to maintain independent storage, authentication, and data retrieval on its own as well as in a distributed nature where each server may coordinate some parts with other servers.
- the ability of both the bamserver and visualization engine to dynamically scale the data provided from large data sources will help mitigate against significant increases in data sizes of future data formats and file types.
- FIG. 2 illustrates genomic visualization system 200 capable of generating a visual display of genomic information at different scales of observation.
- System 200 includes indexed genomic database 220 and scaling engine 230 .
- system 200 can also include genomic graphics library 237 or even devices 250 , possibly operating as clients of the services offered by system 200 .
- devices 250 can include a browser-enabled computing device (e.g., a cell phone, tablet, computer, etc.), through which a healthcare provider or a patient can access genomic information of interest over network 215 .
- Scaling engine 230 can provide a visual display of the genomic information to the user's browser via HTTP, or other suitable protocol.
- a genomic visualization system 200 will comprise an indexed genomic database 220 that stores one or more of sequence objects 223 representative of a genomic region, wherein the sequence object 223 includes a plurality of scale-relevant annotations 225 .
- Scaling engine 230 is coupled with the indexed genomic database 220 and configured to adjust scale-relevant information 233 that is derived from the scale-relevant annotations 225 of the sequence object 223 as a function of a user selected zoom level 252 .
- the scaling engine 230 will then dynamically generate a genomic display object 235 that is representative of the scale-relevant information 233 based on the zoom level 252 , and configure an output device 250 to present the genomic display objects 235 to a user.
- genomic region typically refers to a sequence name and a start and end coordinate that specify a closed interval within that sequence.
- An example genomic region is: chr1:1234-5678, where chr1 specifies the sequence of chromosome 1 from a human reference genome, 1234 is the start coordinate, and 5678 is the end coordinate.
- suitable formats will include particular references to the chromosomal location and/or sub-location, to gene names or functions, regulatory aspects of the gene(s) in the region, chromatin structural aspects of the gene(s) in the region, length of sequence, etc.
- the genomic region may be a whole genome, a chromosome, a chromosomal fragment, or an allele. Moreover, it should be noted that specification of multiple genomic regions in a single request is possible by using any known delimiter between the genomic regions.
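- The region notation described above (a sequence name followed by a closed start-end interval, with several regions separated by a delimiter) can be illustrated with a minimal parsing sketch. The names `GenomicRegion`, `parse_region`, and `parse_regions` are hypothetical, and the comma delimiter is an assumption; the text only requires "any known delimiter".

```python
from typing import List, NamedTuple


class GenomicRegion(NamedTuple):
    """A sequence name plus a closed [start, end] interval."""
    sequence: str
    start: int
    end: int


def parse_region(text: str) -> GenomicRegion:
    """Parse a string such as 'chr1:1234-5678' into a GenomicRegion."""
    # Split on the last colon so the sequence name keeps any earlier colons.
    sequence, _, span = text.strip().rpartition(":")
    start_text, _, end_text = span.partition("-")
    if not sequence or not start_text or not end_text:
        raise ValueError(f"malformed genomic region: {text!r}")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start exceeds end in region: {text!r}")
    return GenomicRegion(sequence, start, end)


def parse_regions(text: str, delimiter: str = ",") -> List[GenomicRegion]:
    """Parse several regions; the comma delimiter is an assumption."""
    return [parse_region(part) for part in text.split(delimiter) if part.strip()]
```

- For example, `parse_region("chr1:1234-5678")` yields the sequence name `chr1` with start 1234 and end 5678, both coordinates being part of the interval.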
- sequence object 223 may have numerous data formats, and that all known formats are deemed suitable so long as such formats also include one or more scale-relevant annotations.
- particularly preferred formats for contemplated sequence objects include SAM/BAM and BAMBAM format.
- sequence object 223 may represent a genomic region of a reference genome (e.g., from homo statisticus) or a genomic region of a test sample. Where the sequence object 223 is from a test sample to be analyzed, it is typically preferred that the analysis is performed with respect to a reference genome and/or a genome of the same test subject from a different point in time.
- suitable reference sequence objects 223 may include raw sequence data, sequence data from homo statisticus, and/or sequence data of a test subject from a specified point in time.
- the sequence object 223 need not necessarily be confined to a raw data read or assembled sequence (e.g., full-length gene), but that the sequence object 223 may be or comprise a differential sequence object 223 with respect to a reference genomic region (e.g., in which only discordant corresponding bases are listed).
- reference genomic region may be from the same test proband taken at an earlier point in time, or from an actual healthy proband or a hypothetical, consensus sequence from multiple healthy probands (homo statisticus).
- annotations 225 may vary considerably and that all annotations known in genomics analysis are deemed suitable for use herein.
- particularly preferred annotations 225 include those related to the genomic structure on various scale levels (e.g., location of sequence on a chromosome, location within a chromosome, allele information, etc.) and those related to genomic changes on various scale levels (e.g., chromosomal translocation, repeat or copy number, insertions, deletions, inversions, various mutations such as SNPs, transitions, transversions, etc.).
- scale relevant annotations 225 may also include disease information on various scale levels (e.g., polyploidy, copy and/or repeat numbers, type/status/treatment options of a disease associated with mutations or copy numbers, etc.).
- the scale relevant annotations 225 may also include gene relevant information on various scale levels (e.g., gene as part of a functional or regulatory network of genes, gene name or functional identification, raw sequence data or processed sequence data, gene identification, information on gene regulation, and information of association of the gene with a disease).
- scale relevant annotations 225 will typically also include metadata associated with the sequence object, and most typically include patient identification, facility identification, physician identification, and/or insurance information.
- scale relevant annotations 225 will include annotations that are suitable for display for selected audiences (e.g., physician, researcher, patient, insurance, etc.). For example, where the audience is a physician, scale relevant annotations 225 may be relevant to a display format of an entire genome in simplified format (e.g., circle plot, metaphase spread, etc.) where mutations are indicated by simple pointers or other graphical tools. On the other hand, where the audience is a researcher, scale relevant annotations 225 may be relevant to a display format in which actual raw sequence data and copy number/allele frequency is provided.
- scale relevant annotations 225 may further include data that indicate suitability for the particular annotation for a specific zoom level or levels 252 .
- suitability for display at a given zoom level may also be determined independently of such data as further discussed below.
- Zoom level 252 selected by a user can be determined through various techniques. In some embodiments, zoom level 252 can be determined based on the user profile: healthcare provider, patient, insurance company, researcher, or other type of profile.
- zoom level 252 representing a highest level zoom can be selected as a default when a patient is viewing the data.
- a researcher might have a default zoom level 252 that targets specific regions of interest.
- Other techniques for establishing zoom level 252 include receiving a user selected bounding box from the visualization device (e.g., browser, application, etc.), automatically triggering on anomalous genomic regions relative to a reference region (homo statisticus), receiving genomic information from a sequence device indicative of a region of interest, or other techniques.
- scaling engine 230 receives zoom level 252 from a healthcare provider who is reviewing a patient's genomic information with respect to known mutations. Scaling engine 230 obtains sequence object 223 from indexed genomic database 220 along with the associated scale-relevant annotations 225 .
- Scaling engine 230 derives scale-relevant information 233 as a function of the scale-relevant annotations 225 , the healthcare provider information (e.g., authorization, profile, etc.), and zoom level 252 .
- Scale relevant information 233 thus represents the genomic region of sequence object 223 at a proper zoom level as well as at an appropriate level of detail with respect to the observer. In other words, at the given level of zoom, the scale-relevant information 233 represents the information that would be appropriate for the healthcare provider. If the observer were a patient, scale relevant information 233 would likely carry a different presentation of the genomic information that would be appropriate for the patient, even though zoom level 252 and sequence object 223 are identical.
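- The dependence of scale-relevant information on both zoom level and observer can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Annotation` fields, the integer zoom numbering (0 for the base readout, larger numbers for wider views), and the profile names are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Annotation:
    label: str
    min_zoom: int              # most detailed zoom level at which it is shown
    max_zoom: int              # widest zoom level at which it is shown
    audiences: FrozenSet[str]  # observer profiles allowed to see it


def scale_relevant(annotations: List[Annotation], zoom_level: int,
                   profile: str) -> List[Annotation]:
    """Keep annotations suited to this zoom level and this observer."""
    return [a for a in annotations
            if a.min_zoom <= zoom_level <= a.max_zoom and profile in a.audiences]


ANNOTATIONS = [
    Annotation("raw reads", 0, 1, frozenset({"researcher"})),
    Annotation("known mutation", 0, 3, frozenset({"physician", "researcher"})),
    Annotation("treatment option", 0, 3, frozenset({"physician", "patient"})),
]
```

- With the same annotations and the same zoom level, a physician profile and a patient profile receive different subsets, which mirrors the behavior described for scale-relevant information 233.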
- Scaling engine 230 maps the scale relevant information 233 to one or more graphic objects in genomic graphic library 237 to create genomic display object 235 .
- genomic graphic library 237 is configured to store genomic graphic objects rather than mere graphic primitives. Genomic graphic library 237 can be updated with additional genomic graphic objects as desired or existing genomic graphic objects can be modified, possibly with different graphics (e.g., textures, skins, themes, etc.). Such an approach is considered advantageous within the market as it allows for branding or customization of visual presentations.
- the bamserver is or comprises a distributed network server system capable of efficient random access to data indexed by genomic region, supporting protected access to encrypted data both over secured connections and via encrypted file access.
- a user will: 1. connect to the bamserver over the network, 2. issue a request with two parameters—A) a data archive and B) a list of genomic regions, and 3. receive all data entries from the archive that overlap any of the provided genomic regions.
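- The third step of the request, returning every entry that overlaps any requested region, can be sketched as follows. The tuple layout and the names `overlaps` and `query_archive` are hypothetical, and the linear scan is used only for clarity; it stands in for the indexed lookup that the bamserver is described as performing.

```python
from typing import Any, Iterable, List, Tuple

# (sequence, start, end) with closed coordinates, as in "chr1:1234-5678".
Region = Tuple[str, int, int]
# A data entry pairs a genomic region with an arbitrary payload.
Entry = Tuple[Region, Any]


def overlaps(a: Region, b: Region) -> bool:
    """True when two closed intervals on the same sequence share a base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def query_archive(archive: Iterable[Entry], regions: List[Region]) -> List[Entry]:
    """Return every entry overlapping any requested region, each at most once."""
    return [entry for entry in archive
            if any(overlaps(entry[0], region) for region in regions)]
```

- Because the intervals are closed, two regions that share only a single end coordinate still count as overlapping.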
- data archive refers to a set of data entries where each entry is associated with a genomic region.
- a data entry can be any data, including a single number, a string of characters, and a list of numbers and/or strings.
- Some common examples of data entries are a sequence read and associated read quality from a sequencing machine, a known gene location, or a detected mutation.
- the bamserver sorts the data entries by genomic region, then preferably creates an R-tree like binning index, as is commonly used in genomic applications and has been described fully in its use in the UCSC Genome Browser and the SAM Tools software library. Briefly, an indexed sequence is broken up into overlapping bins. Starting with one bin covering the entire sequence, two new bins are added which split the previous bin in half. The index then has pointers from each bin to the data entries that fit within that bin, but no smaller bin. Retrieving data entries that overlap a query is then a matter of examining only the bins that overlap the query.
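- The binning idea can be sketched with a simplified binary hierarchy that follows the halving description above. This is an illustration only, not the exact scheme of the UCSC Genome Browser or SAMtools, which use their own bin sizes and fan-out. The class name `BinningIndex`, the 0-based closed coordinates, and the `max_level` parameter are assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Bin = Tuple[int, int]  # (level, index); level 0 is one bin over the whole sequence


class BinningIndex:
    def __init__(self, sequence_length: int, max_level: int = 10) -> None:
        # Round the covered span up to a power of two so every bin halves evenly.
        self.span = 1
        while self.span < sequence_length:
            self.span *= 2
        # Never split below a bin width of one base.
        self.max_level = min(max_level, self.span.bit_length() - 1)
        self.bins: Dict[Bin, List[Tuple[int, int, Any]]] = defaultdict(list)

    def _smallest_bin(self, start: int, end: int) -> Bin:
        """Deepest bin that fully contains the closed interval [start, end]."""
        if not 0 <= start <= end < self.span:
            raise ValueError("interval lies outside the indexed sequence")
        level = self.max_level
        while start // (self.span >> level) != end // (self.span >> level):
            level -= 1  # terminates at level 0, whose single bin holds everything
        return (level, start // (self.span >> level))

    def add(self, start: int, end: int, payload: Any) -> None:
        self.bins[self._smallest_bin(start, end)].append((start, end, payload))

    def query(self, start: int, end: int) -> List[Any]:
        """Examine only bins that overlap [start, end], then test each entry."""
        hits = []
        for level in range(self.max_level + 1):
            width = self.span >> level
            for index in range(start // width, end // width + 1):
                for s, e, payload in self.bins.get((level, index), ()):
                    if s <= end and start <= e:
                        hits.append(payload)
        return hits
```

- An entry that straddles the midpoint of the sequence fits in no bin smaller than the top-level one, so it is stored there; a short entry lands in a deep bin and is never visited by queries elsewhere on the sequence.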
- the bamserver restricts access to non-public data archives by checking each request against a data file access server. If the client does not provide sufficient security credentials according to the data file access server, access to any results is denied.
- Each bamserver can be configured for a unique data file access server, allowing flexible permission schemes and federated authentication methods.
- the data archives of the bamserver are stored on a file system that appears local to the bamserver.
- This file system may use disks attached directly to the bamserver and/or network-accessible disks.
- protected data archives are stored in an encrypted form (e.g., AES symmetric block encryption, using CTR mode).
- the bamserver will typically not have access to the encryption key.
- if the data file access server grants access, it will provide the encryption key for the requested file.
- the bamserver will use the key while processing the request, and discard the key as soon as the request is completely processed.
- Suitable request methods are typically made using RESTful (conforming to representational state transfer constraints) queries over HTTPS, an SSL-secured HTTP protocol, or using an alternative encrypted tunneling mechanism within which HTTPS queries are made.
- the RESTful nature of the queries allows bamservers to be distributed both geographically and locally to provide maximum throughput to consuming applications.
- the only constraint on locality of the bamserver is direct file access to the underlying data, which could even be presented over a wide-area network using the appropriate protocols (NFS over VPN, or other such solutions).
- dynamic scaling of the data is implemented.
- the bamserver possibly operating as scaling engine 230 , has capabilities of dynamically scaling (“downsampling”) the data to provide a more condensed version that will reduce processing and transfer times. This downsampling is most preferably accomplished in two parallel mechanisms. The first mechanism requires no knowledge of the underlying data, and is accomplished by providing the bamserver files that are pre-condensed to certain levels. The bamserver can then dynamically decide at the time of query if it should provide a “raw” level of data, or alternatively one of the condensed files.
- where the consuming application is a visualization engine, which could also operate as scaling engine 230, one example of a useful data point count might be based upon the number of pixels that will be drawn to the screen.
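- The first downsampling mechanism, choosing between raw data and pre-condensed files at query time, might look like the following sketch. The function name, the file labels, and the idea of describing each file by "bases per data point" are assumptions made for illustration; the pixel count stands in for the useful data point count mentioned above.

```python
from typing import Dict


def choose_condensed_level(region_length: int, target_points: int,
                           levels: Dict[str, int]) -> str:
    """Pick the least condensed file whose point count fits the target.

    `levels` maps a file label to the number of bases each of its data
    points represents (1 means raw data).
    """
    # Try the finest resolution first so detail is dropped only when needed.
    for label, bases_per_point in sorted(levels.items(), key=lambda kv: kv[1]):
        points = -(-region_length // bases_per_point)  # ceiling division
        if points <= target_points:
            return label
    # Nothing fits: fall back to the most condensed file available.
    return max(levels, key=levels.get)


LEVELS = {"raw": 1, "condensed_100": 100, "condensed_10000": 10000}
```

- With a target of 1000 points, a 52-base request is served from the raw file, while a 2-megabase request is served from the most condensed file in this hypothetical set.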
- the second mechanism for downsampling is dynamic summarization of the full data accessible to the bamserver. This mechanism requires providing additional information about the file type to the bamserver so that it can understand which fields are possible to summarize, and the mechanism of summarization. Given a file with only a single data column beyond the genomic coordinate index, this could be automatically determined and a median or mean summarization could automatically be performed.
- the bamserver will require parameters outlining how to perform that summarization.
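- The second mechanism, dynamic summarization of a single data column by mean or median, can be sketched as below. The function name `summarize`, the `(position, value)` layout, and the choice to report each bin at its first base are assumptions; only the mean and median reductions come from the text.

```python
from statistics import mean, median
from typing import Callable, Dict, List, Sequence, Tuple

SUMMARIZERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": mean,
    "median": median,
}


def summarize(points: List[Tuple[int, float]], start: int, end: int,
              n_bins: int, method: str = "mean") -> List[Tuple[int, float]]:
    """Condense (position, value) points in closed [start, end] into at most n_bins values."""
    reduce_fn = SUMMARIZERS[method]
    width = -(-(end - start + 1) // n_bins)  # bases per bin, rounded up
    grouped: Dict[int, List[float]] = {}
    for position, value in points:
        if start <= position <= end:
            grouped.setdefault((position - start) // width, []).append(value)
    # Report each non-empty bin at the coordinate of its first base.
    return [(start + b * width, reduce_fn(values))
            for b, values in sorted(grouped.items())]
```

- Setting `n_bins` to the number of horizontal pixels gives one summarized value per pixel column, which is one plausible reading of the pixel-based data point count discussed earlier.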
- One example is downsampling of a file in SAM/BAM format, which would perform a downsampling by sub-sampling the individual reads at each position, only providing a limited number back to the consuming application.
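- Read sub-sampling can be sketched as follows, assuming that "position" means the start coordinate of a read and that reads are chosen at random when a position holds too many. Both points are assumptions, as are the names `Read` and `subsample_reads`; a real implementation would read SAM/BAM records through a dedicated library rather than plain tuples.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Read = Tuple[int, str]  # (start position, read name)


def subsample_reads(reads: List[Read], max_per_position: int,
                    seed: Optional[int] = None) -> List[Read]:
    """Keep at most max_per_position reads starting at each position."""
    rng = random.Random(seed)  # a seed makes the selection reproducible
    by_start: Dict[int, List[Read]] = defaultdict(list)
    for read in reads:
        by_start[read[0]].append(read)
    kept: List[Read] = []
    for start in sorted(by_start):
        group = by_start[start]
        if len(group) > max_per_position:
            group = rng.sample(group, max_per_position)
        kept.extend(group)
    return kept
```

- Positions with few reads pass through untouched, so sparse regions keep all of their data while deeply covered positions are capped.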
- bamserver is capable of reading files from multiple formats and understanding both genomically indexed data and additional storage formats such as SQLite and JSON.
- the format of the requested file is currently provided by the consuming application, but auto-detection of file format is also contemplated.
- the architecture of the bamserver preferably supports additional data formats in the form of plugins that can understand foreign indexing schemes and still provide a unified interface. These plugins are either specified via the universal resource identifier (URI) REST request, or by auto detection of the appropriate format within the bamserver.
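- A plugin arrangement of this kind could be sketched as a registry in which each format supplies a reader and a detection function. Everything here is hypothetical: the registry, the `read_archive` entry point, and the two toy formats (JSON and a four-column tab-delimited layout) are stand-ins chosen to keep the example short, not formats or interfaces defined by the disclosure.

```python
import json
from typing import Callable, Dict, List, Optional

Reader = Callable[[bytes], List[dict]]
Sniffer = Callable[[bytes], bool]

_READERS: Dict[str, Reader] = {}
_SNIFFERS: Dict[str, Sniffer] = {}


def register(name: str, reader: Reader, sniffer: Sniffer) -> None:
    """Add a format plugin: a reader plus a function that recognizes the format."""
    _READERS[name] = reader
    _SNIFFERS[name] = sniffer


def read_archive(data: bytes, fmt: Optional[str] = None) -> List[dict]:
    """Use the requested format, or auto-detect it when none is given."""
    if fmt is None:
        fmt = next((name for name, sniff in _SNIFFERS.items() if sniff(data)), None)
    if fmt not in _READERS:
        raise ValueError(f"no plugin for format: {fmt!r}")
    return _READERS[fmt](data)


def _read_tsv(data: bytes) -> List[dict]:
    rows = []
    for line in data.decode("utf-8").splitlines():
        if line.strip():
            sequence, start, end, value = line.split("\t")[:4]
            rows.append({"sequence": sequence, "start": int(start),
                         "end": int(end), "value": value})
    return rows


register("json", json.loads, lambda d: d.lstrip()[:1] in (b"[", b"{"))
register("tsv", _read_tsv, lambda d: b"\t" in d.split(b"\n", 1)[0])
```

- Callers see one interface regardless of format: passing `fmt` corresponds to naming the plugin in the request, and omitting it corresponds to auto-detection within the server.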
- a dynamic genome visualization engine is capable of interpreting multiple types of data with the common attribute of being mapped to a location in the genome, and producing image-based interpretations of the data.
- a genome “browser” in some sense is already known (e.g., University of California, Santa Cruz Genome Browser, established in 2001 (see URL genome.ucsc.edu)).
- browsers limit views of data to user specified densities and are unable to respond to requests past certain limits in a timely and meaningful manner.
- the dynamic genome visualization engine contemplated herein is capable of understanding the amount of data being requested by a user and altering the visualizations presented to provide more compact and summarized versions when appropriate.
- the level of downsampling is handled by the bamserver, which understands the region that is attempting to be visualized, and will automatically reduce the data sent to the visualization engine.
- when the engine itself recognizes that a sufficiently large amount of data is being requested, the underlying visualizations produced will change to provide summaries that are more useful to the end-user.
- FIGS. 3-6 represent some examples of how these displays change based on the number of bases the user is viewing in the window, where the displays are generated from genomic graphic objects used to generate genomic display objects 235 within a browser. It is important to emphasize that these displays are dynamically generated and not pre-computed, although for certain use cases pre-generated static images are not excluded and are supported by contemplated devices and methods.
- FIG. 3 52 bases of the human genome are shown across approximately 1000 horizontal pixels, with graphical representations of overall copy-number, allele specific copy-number, raw sequencing data from BAM, and an annotation track of UCSC Known Genes.
- each of these tracks is pulled dynamically from the bamserver architecture outlined earlier, and each track can query an independent bamserver to obtain the data necessary. Because such a small number of bases are being shown, no downsampling on either the bamserver or the visualization engine is being performed. Thus, it is particularly preferred that the lowest zoom level is at the base readout of the raw or computed sequence.
- FIG. 4 represents a sub-kilobase zoom level showing about 1000 bases from that same region of the genome. At this resolution and number of bases, no downsampling is taking place on the bamserver; however, the visualization engine has begun to alter the display of each data source to accommodate the increased viewport. In particular, the letters on each base no longer appear on either the top reference base bar or within the individual BAM reads; instead, simple colors represent the changes identified.
- FIG. 5 shows approximately 2 megabases (2 million bases) at a kilobase zoom level while the number of pixels is held constant.
- both the bamserver and the visualization engine have downsampled the data being drawn.
- the bamserver has reduced the amount of copy-number data it provides the visualization engine, and the visualization engine has ignored the raw data track because viewing would be impractical.
- the visualization engine has begun to summarize one of the variant tracks (the bottom-most track) by producing a graphical histogram at the top.
- the visualization engine has averaged together the multiple datapoints for the copy-number variation that sit beneath each pixel to produce a more accurate image.
- FIG. 6 represents all of chromosome 12 at a chromosome zoom level. All of the previous downsampling is occurring at this resolution, with additional downsampling being done to remove the text and display a more graphical representation of both the UCSC Known Gene and COSMIC variant tracks at the bottom of the image. While one clear example has been represented in these diagrams, this engine provides a framework for dynamic visualization that is not limited to pre-determined and pre-drawn resolution levels, and furthermore can accommodate many different types of underlying data beyond what has been shown here.
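The progression across FIGS. 3-6 can be illustrated with a minimal sketch that picks a rendering plan from the number of bases in view. The thresholds, the function names `zoom_category` and `render_plan`, and the flag names are assumptions chosen only to reproduce the four example views; the source does not specify numeric cut-offs.

```python
# Hypothetical sketch: choose how tracks are drawn from the number of bases in view.
# All thresholds and names below are illustrative assumptions, not values from the source.

def zoom_category(num_bases: int, width_px: int = 1000) -> str:
    """Classify a view by how many bases fall under one horizontal pixel."""
    bases_per_pixel = num_bases / width_px
    if bases_per_pixel < 0.5:
        return "base"
    if bases_per_pixel < 100:
        return "sub-kilobase"
    if bases_per_pixel < 50_000:
        return "kilobase"
    return "chromosome"


# What each zoom category draws, following the behavior described for FIGS. 3-6.
RENDER_RULES = {
    "base":         {"base_letters": True,  "raw_reads": True,  "variant_histogram": False, "text_labels": True},
    "sub-kilobase": {"base_letters": False, "raw_reads": True,  "variant_histogram": False, "text_labels": True},
    "kilobase":     {"base_letters": False, "raw_reads": False, "variant_histogram": True,  "text_labels": True},
    "chromosome":   {"base_letters": False, "raw_reads": False, "variant_histogram": True,  "text_labels": False},
}


def render_plan(num_bases: int, width_px: int = 1000) -> dict:
    """Return the drawing flags for a view of num_bases across width_px pixels."""
    return RENDER_RULES[zoom_category(num_bases, width_px)]
```

Because the category is computed per request, no resolution level has to be pre-drawn; changing the window size or the pixel width simply yields a different plan.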
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Computer Hardware Design (AREA)
- Data Mining & Analysis (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- User Interface Of Digital Computer (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Systems and methods for dynamic visualization of genomic data are provided in which a genomic visualization system adapts presentation of information content according to scale-relevant annotations within a sequence object.
Description
- This application claims the benefit of U.S. provisional application with the Ser. No. 61/568478, which was filed Dec. 8, 2011. This and all other extrinsic materials discussed herein are incorporated by reference in their entirety.
- The field of the invention is computational genomics, especially as it relates to dynamic graphic representation of complex genetic information.
- The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
- With the advent of high-throughput sequencing and the availability of entire genome data sets, sequencing speed is no longer the bottleneck in genome analysis; instead, the bottleneck is data storage, retrieval, and coordinated analysis. The difficulties associated with data storage, retrieval, and analysis are further compounded by the varying requirements for displayed information from different users. Viewed from a different perspective, information-dense and selective presentation of genomic data is paramount to making use of the massive quantity of data now available.
- While there are several genomic browsers known in the art, all of the known browsers have substantial difficulties. For example, the UCSC Genome Browser (http://genome.ucsc.edu) provides massive data in a graphical format but fails to accommodate a user specified information density, because its predefined displays are independent of the zoom level. Therefore, such browsers are unable to respond optimally to requests at all zoom levels. Similarly, graphic viewers like that of NCBI (http://www.ncbi.nlm.nih.gov/nuccore/) are also limited to certain predefined parameters and thus fail to allow for dynamic presentation and adaptation of content.
- Consequently, even though various systems and methods of display of complex genomic information are known in the art, numerous disadvantages nevertheless remain. Therefore there is still a need to provide improved devices and methods for graphic representation of complex genetic information, and especially dynamic graphic representation.
- The inventive subject matter is directed to methods and devices for dynamic visualization of genomic data in which a genomic visualization system adapts presentation of information content according to scale-relevant annotations within a sequence object. Thus, adaptive content display can be achieved with significantly reduced data analysis and transfer.
- In one especially preferred aspect of the inventive subject matter, a genomic visualization system is contemplated comprising an indexed genomic database that stores a sequence object representative of a genomic region. Most typically, the sequence object includes a plurality of scale-relevant annotations. A scaling engine is coupled with the indexed genomic data storage and is configured to (a) adjust scale-relevant information derived from the scale-relevant annotations of the sequence object as a function of a user selected zoom level, (b) dynamically generate a genomic display object representative of the scale-relevant information based on the zoom level, and (c) configure an output device to present the genomic display objects to a user.
- While not limiting to the inventive subject matter, it is generally preferred that the sequence object has a SAM/BAM or BAMBAM format, and/or that the genomic region is a whole genome, a chromosome, a chromosomal fragment, or an allele.
- With respect to the scaling engine it is contemplated that one or more bamservers and/or visualization servers may operate as the scaling engine. Furthermore, it is contemplated that the scaling engine may be further configured to adjust the scale-relevant information by downsampling based on the zoom-level (wherein downsampling may be a function of data density derived from the zoom-level). Alternatively, or additionally, it is contemplated that the scaling engine is configured to determine the zoom level, and optionally to summarize a full data set of the sequence object according to the zoom level. Where desired, the scaling engine may also be configured to derive the scale relevant information from differences in scale-relevant annotations in different sequence objects.
- In still further contemplated aspects, the sequence object comprises a reference sequence object, which is most preferably raw sequence data, sequence data from homo statisticus, and/or sequence data from a specified point in time. Alternatively, or additionally, the sequence object comprises a differential sequence object with respect to a reference genomic region (e.g., reference genomic region from homo statisticus or to a specific point in time). Similarly, the scale relevant annotations may vary considerably and will preferably include genomic structure information (e.g., chromosome identification, location within a chromosome, allele, etc.), genomic change information (e.g., a mutation, a translocation, an inversion, a deletion, a repeat, and a copy number), disease information (e.g., type of disease, a status of disease, and a treatment option for the disease), gene relevant information (e.g., raw sequence data or processed sequence data, gene identification, information on gene regulation, and information of association of the gene with a disease), differential information relative to a reference sequence, and/or metadata (e.g., patient identification, facility identification, physician identification, and insurance information).
- While not limiting to the inventive subject matter, it is generally preferred that the genomic visualization system will further include a genomic graphic library that stores a graphic object representative of scale relevant annotations. In such systems, it is particularly preferred that the scaling engine maps the scale relevant information to graphic objects from the graphic library according to the zoom level, and that the genomic display object comprises the mapped graphic objects. With respect to suitable output devices, a display, a browser, a printer, a 3D printer, and/or a speaker are typically preferred.
- Various objects, features, aspects and advantages of the inventive subject matter will become more apparent from the following detailed description of preferred embodiments, along with the accompanying drawing figures in which like numerals represent like components.
- FIG. 1 provides an overview of a distributed genomic visualization environment.
- FIG. 2 illustrates a possible genomic visualization system including a visualization scaling engine.
- FIG. 3 is an exemplary display view at base zoom level.
- FIG. 4 is the exemplary display view of FIG. 3 at a sub-kilobase zoom level.
- FIG. 5 is the exemplary display view of FIG. 4 at a kilobase zoom level.
- FIG. 6 is the exemplary display view of FIG. 5 at a chromosome zoom level.
- The inventive subject matter is directed to devices and methods for dynamic visualization of genomic data. Contemplated systems and methods allow for selective and scalable display of information-rich content while reducing data aggregation and traffic.
- It should be noted that while the following description is drawn to computer/server based genomic visualization systems, various alternative configurations are also deemed suitable and may employ various computing devices including servers, interfaces, systems, databases, agents, peers, engines, controllers, or other types of computing devices operating individually or collectively. One should appreciate that the computing devices comprise a processor configured to execute software instructions stored on a tangible, non-transitory computer readable storage medium (e.g., hard drive, solid state drive, RAM, flash, ROM, etc.). The software instructions preferably configure the computing device to provide the roles, responsibilities, or other functionality as discussed below with respect to the disclosed apparatus. In especially preferred embodiments, the various servers, systems, databases, or interfaces exchange data using standardized protocols or algorithms, possibly based on HTTP, HTTPS, AES, public-private key exchanges, web service APIs, known financial transaction protocols, or other electronic information exchanging methods. Data exchanges preferably are conducted over a packet-switched network, the Internet, LAN, WAN, VPN, or other type of packet switched network.
- Throughout the following discussion, numerous references will be made regarding servers, services, interfaces, portals, platforms, or other systems formed from computing devices. It should be appreciated that the use of such terms is deemed to represent one or more computing devices having at least one processor configured to execute software instructions stored on a computer readable tangible, non-transitory medium. For example, a server can include one or more computers operating as a web server, database server, or other type of computer server in a manner to fulfill described roles, responsibilities, or functions.
- As used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
- The recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g. “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.
- Groupings of alternative elements or embodiments of the invention disclosed herein are not to be construed as limitations. Each group member can be referred to and claimed individually or in any combination with other members of the group or other elements found herein. One or more members of a group can be included in, or deleted from, a group for reasons of convenience and/or patentability. When any such inclusion or deletion occurs, the specification is herein deemed to contain the group as modified thus fulfilling the written description of all Markush groups used in the appended claims. Although each embodiment represents a single combination of inventive elements, the inventive subject matter is considered to include all possible combinations of the disclosed elements. Thus if one embodiment comprises elements A, B, and C, and a second embodiment comprises elements B and D, then the inventive subject matter is also considered to include other remaining combinations of A, B, C, or D, even if not explicitly disclosed.
- As used herein, and unless the context dictates otherwise, the term “coupled to” is intended to include both direct coupling (in which two elements that are coupled to each other contact each other) and indirect coupling (in which at least one additional element is located between the two elements). Therefore, the terms “coupled to” and “coupled with” are used synonymously.
- Contemplated devices and methods combine advantageous features of a bamserver and a genome visualization engine that are loosely coupled such as to allow for trivial integration with other alternative genomic powered engines or other genomic data storage solutions. In addition, each component can scale as necessary to accommodate multiple bamservers or multiple visualization engines, as schematically and exemplarily illustrated in FIG. 1. Most preferably, each server is flexible enough to maintain independent storage, authentication, and data retrieval on its own as well as in a distributed nature where each server may coordinate some parts with other servers. Moreover, the ability of both the bamserver and visualization engine to dynamically scale the data provided from large data sources will help mitigate significant increases in data sizes of future data formats and file types.
- FIG. 2 illustrates genomic visualization system 200 capable of generating a visual display of genomic information at different scales of observation. System 200 includes indexed genomic database 220 and scaling engine 230. In some embodiments, system 200 can also include genomic graphics library 237 or even devices 250, possibly operating as clients of the services offered by system 200. For example, devices 250 can include a browser-enabled computing device (e.g., a cell phone, tablet, computer, etc.), through which a healthcare provider or a patient can access genomic information of interest over network 215. Scaling engine 230 can provide a visual display of the genomic information to the user's browser via HTTP, or other suitable protocol.
- It is generally contemplated that a genomic visualization system 200 will comprise an indexed genomic database 220 that stores one or more sequence objects 223 representative of a genomic region, wherein the sequence object 223 includes a plurality of scale-relevant annotations 225. Scaling engine 230 is coupled with the indexed genomic database 220 and configured to adjust scale-relevant information 233 that is derived from the scale-relevant annotations 225 of the sequence object 223 as a function of a user selected zoom level 252. The scaling engine 230 will then dynamically generate a genomic display object 235 that is representative of the scale-relevant information 233 based on the zoom level 252, and configure an output device 250 to present the genomic display objects 235 to a user.
- As used herein, the term “genomic region” typically refers to a sequence name and a start and end coordinate that specify a closed interval within that sequence. An example genomic region is: chr1:1234-5678, where chr1 specifies the sequence of chromosome 1 from a human reference genome, 1234 is the start coordinate, and 5678 is the end coordinate. However, it should be readily apparent to the person of ordinary skill in the art that the particular format of the genomic region may vary considerably and that suitable formats will include particular references to the chromosomal location and/or sub-location, to gene names or functions, regulatory aspects of the gene(s) in the region, chromatin structural aspects of the gene(s) in the region, length of sequence, etc. Therefore, and viewed from a different perspective, the genomic region may be a whole genome, a chromosome, a chromosomal fragment, or an allele. Moreover, it should be noted that specification of multiple genomic regions in a single request is possible by using any known delimiter between the genomic regions.
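As a minimal sketch of the genomic region notation defined above, the following parses strings such as chr1:1234-5678 into closed intervals and tests two regions for overlap. The names `GenomicRegion`, `parse_region`, and `parse_regions` are hypothetical, and the comma used to separate multiple regions is an assumption, since the text allows any known delimiter.

```python
from typing import List, NamedTuple


class GenomicRegion(NamedTuple):
    """A closed interval [start, end] on a named sequence (names are illustrative)."""
    sequence: str
    start: int  # inclusive
    end: int    # inclusive

    def overlaps(self, other: "GenomicRegion") -> bool:
        # Closed intervals overlap when each one starts no later than the other ends.
        return (self.sequence == other.sequence
                and self.start <= other.end
                and other.start <= self.end)


def parse_region(text: str) -> GenomicRegion:
    """Parse 'chr1:1234-5678' into a GenomicRegion."""
    sequence, _, span = text.strip().rpartition(":")
    start_text, dash, end_text = span.partition("-")
    if not sequence or not dash:
        raise ValueError(f"not a genomic region: {text!r}")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start exceeds end in {text!r}")
    return GenomicRegion(sequence, start, end)


def parse_regions(text: str, delimiter: str = ",") -> List[GenomicRegion]:
    """Parse several regions separated by a delimiter (comma assumed here)."""
    return [parse_region(part) for part in text.split(delimiter) if part.strip()]
```

Because the interval is closed, a region ending at 5678 and a region starting at 5678 share one base and therefore overlap.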
- Consequently, it should be recognized that the sequence object 223 may have numerous data formats, and that all known formats are deemed suitable so long as such formats also include one or more scale-relevant annotations. For example, particularly preferred formats for contemplated sequence objects include SAM/BAM and BAMBAM format. Likewise, it should be appreciated that the sequence object 223 may represent a genomic region of a reference genome (e.g., from homo statisticus) or a genomic region of a test sample. Where the sequence object 223 is from a test sample to be analyzed, it is typically preferred that the analysis is performed with respect to a reference genome and/or a genome of the same test subject from a different point in time. Thus, suitable reference sequence objects 223 may include raw sequence data, sequence data from homo statisticus, and/or sequence data of a test subject from a specified point in time. Moreover, it should be recognized that the sequence object 223 need not necessarily be confined to a raw data read or assembled sequence (e.g., full-length gene), but that the sequence object 223 may be or comprise a differential sequence object 223 with respect to a reference genomic region (e.g., in which only discordant corresponding bases are listed). As before, such reference genomic region may be from the same test proband taken at an earlier point in time, or from an actual healthy proband or a hypothetical, consensus sequence from multiple healthy probands (homo statisticus).
- With respect to scale relevant annotations 225 it is contemplated that the annotations 225 may vary considerably and that all annotations known in genomics analysis are deemed suitable for use herein. For example, particularly preferred annotations 225 include those related to the genomic structure on various scale levels (e.g., location of sequence on a chromosome, location within a chromosome, allele information, etc.) and those related to genomic changes on various scale levels (e.g., chromosomal translocation, repeat or copy number, insertions, deletions, inversions, various mutations such as SNPs, transitions, transversions, etc.). Likewise, scale relevant annotations 225 may also include disease information on various scale levels (e.g., polyploidy, copy and/or repeat numbers, type/status/treatment options of a disease associated with mutations or copy numbers, etc.). In further contemplated aspects, the scale relevant annotations 225 may also include gene relevant information on various scale levels (e.g., gene as part of a functional or regulatory network of genes, gene name or functional identification, raw sequence data or processed sequence data, gene identification, information on gene regulation, and information of association of the gene with a disease).
- Of course, it should be appreciated that all or part of the relevant information may also be expressed as differential information relative to a reference sequence (e.g., homo statisticus or earlier point in time), which will advantageously reduce data size and complexity. Additionally, scale relevant annotations 225 will typically also include metadata associated with the sequence object, and most typically include patient identification, facility identification, physician identification, and/or insurance information.
- Viewed from a different perspective, scale relevant annotations 225 will include annotations that are suitable for display for selected audiences (e.g., physician, researcher, patient, insurance, etc.). For example, where the audience is a physician, scale relevant annotations 225 may be relevant to a display format of an entire genome in simplified format (e.g., circle plot, metaphase spread, etc.) where mutations are indicated by simple pointers or other graphical tools. On the other hand, where the audience is a researcher, scale relevant annotations 225 may be relevant to a display format in which actual raw sequence data and copy number/allele frequency is provided.
- Moreover, and regardless of the audience, it should be recognized that the type of visual presentation will dynamically change as a function of zoom level 252 such that appropriate content relative to the zoom is displayed. Consequently, scale relevant annotations 225 may further include data that indicate suitability of the particular annotation for a specific zoom level or levels 252. Of course, suitability for display at a given zoom level may also be determined independently of such data as further discussed below. Zoom level 252 selected by a user can be determined through various techniques. In some embodiments, zoom level 252 can be determined based on the user profile: healthcare provider, patient, insurance company, researcher, or other type of profile. For example, zoom level 252 representing a highest level zoom (i.e., maximum view of the genomic region) can be selected as a default when a patient is viewing the data. Alternatively, a researcher might have a default zoom level 252 that targets specific regions of interest. Other techniques for establishing zoom level 252 include receiving a user selected bounding box from the visualization device (e.g., browser, application, etc.), automatically triggering on anomalous genomic regions relative to a reference region (homo statisticus), receiving genomic information from a sequence device indicative of a region of interest, or other techniques.
- There are numerous options to graphically represent the scale relevant annotations 225 and it is especially preferred that graphic representation is performed using known symbols and notations. Most preferably, known symbols and annotations can be stored in a genomic graphic library 237 that is configured to store graphic objects representative of the scale relevant annotations 225. In such case, it is particularly preferred that the scaling engine is configured to map the scale relevant information 233 to graphic objects from graphic library 237 according to the zoom level 252, and that the genomic display object 235 comprises the mapped graphic objects. For example, scaling engine 230 receives zoom level 252 from a healthcare provider who is reviewing a patient's genomic information with respect to known mutations. Scaling engine 230 obtains sequence object 223 from indexed genomic database 220 along with the associated scale-relevant annotations 225. Scaling engine 230 derives scale-relevant information 233 as a function of the scale-relevant annotations 225, the healthcare provider information (e.g., authorization, profile, etc.), and zoom level 252. Scale relevant information 233 thus represents the genomic region of sequence object 223 at a proper zoom level as well as at an appropriate level of detail with respect to the observer. In other words, at the given level of zoom, the scale-relevant information 233 represents the information that would be appropriate for the healthcare provider. If the observer were a patient, scale relevant information 233 would likely carry a different presentation of the genomic information that would be appropriate for the patient even though zoom level 252 and sequence object 223 are identical. Scaling engine 230 then maps the scale relevant information 233 to one or more graphic objects in genomic graphic library 237 to create genomic display object 235.
- One should appreciate that genomic graphic library 237 is configured to store genomic graphic objects rather than mere graphic primitives. Genomic graphic library 237 can be updated with additional genomic graphic objects as desired or existing genomic graphic objects can be modified, possibly with different graphics (e.g., textures, skins, themes, etc.). Such an approach is considered advantageous within the market as it allows for branding or customization of visual presentations.
- With respect to hardware it should be noted that contemplated devices and methods may be configured and operated in numerous manners, and it should be appreciated that the particular configuration and/or manner of operation will at least in part dictate the functional components and interconnections. Thus, the following description of preferred aspects should only be viewed as exemplary guidance to the person of ordinary skill in the art.
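The mapping from scale relevant annotations 225 to graphic objects of genomic graphic library 237 can be sketched as below. This is an illustration under stated assumptions rather than the disclosed implementation: the `Annotation` fields, the zoom level and audience labels, the contents of `GRAPHIC_LIBRARY`, and the function `build_display_object` are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Annotation:
    """A scale-relevant annotation with its own suitability data (fields are assumptions)."""
    kind: str                     # e.g. "mutation" or "copy_number"
    region: str                   # e.g. "chr12:25398284-25398284"
    zoom_levels: FrozenSet[str]   # zoom levels at which the annotation is suitable
    audiences: FrozenSet[str]     # observer profiles for which it is suitable


# Hypothetical genomic graphic library: (annotation kind, zoom level) -> graphic object name.
GRAPHIC_LIBRARY = {
    ("mutation", "base"): "colored_base_letter",
    ("mutation", "chromosome"): "pointer_marker",
    ("copy_number", "base"): "per_base_bar",
    ("copy_number", "chromosome"): "averaged_line_plot",
}


def build_display_object(annotations: Iterable[Annotation],
                         zoom_level: str,
                         audience: str) -> List[Tuple[str, str]]:
    """Keep annotations suited to this zoom level and observer, then map each to a graphic."""
    display = []
    for ann in annotations:
        if zoom_level not in ann.zoom_levels or audience not in ann.audiences:
            continue  # not scale-relevant for this request
        graphic = GRAPHIC_LIBRARY.get((ann.kind, zoom_level))
        if graphic is not None:
            display.append((ann.region, graphic))
    return display
```

With this layout, the same sequence object and zoom level can yield different display objects for a physician and a patient, and rebranding only requires swapping entries in the library.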
- With respect to suitable bamservers it is generally preferred that the bamserver is or comprises a distributed network server system capable of efficient random access to data indexed by genomic region, supporting protected access to encrypted data both over secured connections and via encrypted file access. In a typical use case, a user will: 1. connect to the bamserver over the network, 2. issue a request with two parameters—A) a data archive and B) a list of genomic regions, and 3. receive all data entries from the archive that overlap any of the provided genomic regions. As used herein, the term “data archive” refers to a set of data entries where each entry is associated with a genomic region. A data entry can be any data, including a single number, a string of characters, and a list of numbers and/or strings. Some common examples of data entries are a sequence read and associated read quality from a sequencing machine, a known gene location, or a detected mutation.
- Indexing genomic regions: When a data archive is added to the bamserver, the bamserver sorts the data entries by genomic region, then preferably creates an R-tree like binning index, as is commonly used in genomic applications and has been described fully in its use in the UCSC Genome Browser and the SAM Tools software library. Briefly, an indexed sequence is broken up into overlapping bins. Starting with one bin covering the entire sequence, two new bins are added which split the previous bin in half. The index then has pointers from each bin to the data entries that fit within that bin, but no smaller bin. Retrieving data entries that overlap a query is then a matter of examining only the bins that overlap the query.
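The binning scheme described above can be sketched as follows. This is a simplified illustration, not the index used by any particular tool: each level halves the bins of the level above as the text describes, coordinates are assumed to be 0-based closed intervals, and the class name `BinIndex` and the `max_depth` parameter are hypothetical.

```python
from collections import defaultdict


class BinIndex:
    """Hierarchical binning index: level d splits the sequence into 2**d equal bins."""

    def __init__(self, seq_len: int, max_depth: int = 4):
        # Round the indexed span up to a power of two so every level halves cleanly.
        span = 1
        while span < seq_len or span < (1 << max_depth):
            span *= 2
        self.span = span
        self.max_depth = max_depth
        self.bins = defaultdict(list)  # (depth, bin number) -> [(start, end, entry)]

    def _smallest_bin(self, start: int, end: int):
        # Walk from the finest level up until one bin contains the whole interval.
        for depth in range(self.max_depth, -1, -1):
            size = self.span >> depth
            if start // size == end // size:
                return depth, start // size
        raise ValueError("interval lies outside the indexed sequence")

    def add(self, start: int, end: int, entry) -> None:
        if not 0 <= start <= end < self.span:
            raise ValueError("interval lies outside the indexed sequence")
        self.bins[self._smallest_bin(start, end)].append((start, end, entry))

    def query(self, qstart: int, qend: int) -> list:
        """Return entries overlapping [qstart, qend], visiting only overlapping bins."""
        hits = []
        for depth in range(self.max_depth + 1):
            size = self.span >> depth
            for number in range(qstart // size, qend // size + 1):
                for start, end, entry in self.bins.get((depth, number), ()):
                    if start <= qend and qstart <= end:
                        hits.append(entry)
        return hits
```

An entry that straddles a bin boundary is stored in the smallest enclosing bin at a coarser level, which is why a query has to look at every level but only at the bins it overlaps.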
- Data access protections: Most typically, the bamserver restricts access to non-public data archives by checking each request against a data file access server. If the client does not provide sufficient security credentials according to the data file access server, access to any results is denied. Each bamserver can be configured for a unique data file access server, allowing flexible permission schemes and federated authentication methods.
- With respect to data storage it is generally contemplated that the data archives of the bamserver are stored on a file system that appears local to the bamserver. This file system may use disks attached directly to the bamserver and/or network-accessible disks. It is further preferred that protected data archives are stored in an encrypted form (e.g., AES symmetric block encryption, using CTR mode). The bamserver will typically not have access to the encryption key. When processing a request for a protected data archive, if the data file access server grants access, the data file access server will provide the encryption key for the requested file. The bamserver will use the key while processing the request, and discard the key as soon as the request is completely processed.
- Suitable request methods are typically made using RESTful (conforming to representational state transfer constraints) queries over HTTPS, an SSL-secured HTTP protocol, or using an alternative encrypted tunneling mechanism within which HTTPS queries are made. The RESTful nature of the queries allows bamservers to be distributed both geographically and locally to provide maximum throughput to consuming applications. The only constraint on locality of the bamserver is direct file access to the underlying data, which could even be presented over a wide-area network using the appropriate protocols (NFS over VPN, or other such solutions).
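A request of the kind described above, carrying a data archive and a list of genomic regions, might be encoded as a URL as in the following sketch. The host name, the `/query` path, and the parameter names `archive` and `regions` are hypothetical; the source does not define a concrete URI layout.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(host: str, archive: str, regions: list, extra: dict = None) -> str:
    """Build an HTTPS query URL for one archive and several regions (layout is assumed)."""
    params = {"archive": archive, "regions": ",".join(regions)}
    if extra:
        params.update(extra)  # e.g. a data point count for downsampling
    return f"https://{host}/query?{urlencode(params)}"


# Usage: the round trip shows that the region list survives URL encoding.
url = build_query_url("bamserver.example.org", "tumor_sample.bam",
                      ["chr1:1234-5678", "chr12:100-200"])
decoded = parse_qs(urlsplit(url).query)
print(decoded["regions"][0].split(","))
```

Keeping the whole request in the URL means any bamserver with file access to the archive can answer it, which is what lets the servers be distributed freely.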
- In further preferred aspects, dynamic scaling of the data is implemented. Based on the size of the genomic region requested and knowledge about the resolution with which the data will be displayed, the bamserver, possibly operating as scaling engine 230, has capabilities of dynamically scaling (“downsampling”) the data to provide a more condensed version that will reduce processing and transfer times. This downsampling is most preferably accomplished by two parallel mechanisms. The first mechanism requires no knowledge of the underlying data, and is accomplished by providing the bamserver files that are pre-condensed to certain levels. The bamserver can then dynamically decide at the time of query if it should provide a “raw” level of data, or alternatively one of the condensed files. This decision is made by including an additional parameter in the request that indicates the number of data points that will be utilized by the consuming application. If the consuming application is a visualization engine, which could also operate as scaling engine 230, one example of a useful data point count might be based upon the number of pixels that will be drawn to the screen. The second mechanism for downsampling is dynamic summarization of the full data accessible to the bamserver. This mechanism requires providing additional information about the file type to the bamserver so that it can understand which fields are possible to summarize, and the mechanism of summarization. Given a file with only a single data column beyond the genomic coordinate index, this could be automatically determined and a median or mean summarization could automatically be performed. For more complex data types or more complex summarization techniques, the bamserver will require parameters outlining how to perform that summarization. One example is downsampling of a file in SAM/BAM format, which would perform a downsampling by sub-sampling the individual reads at each position, only providing a limited number back to the consuming application.
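Both downsampling mechanisms can be illustrated with a short sketch. The function names `choose_level` and `summarize_mean`, the representation of pre-condensed files as a list of bases-per-data-point values, and the use of `None` for empty bins are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence, Tuple


def choose_level(region_length: int, max_points: int,
                 bases_per_point_levels: Sequence[int]) -> int:
    """First mechanism: pick the finest pre-condensed level that fits the point budget.

    A level of 1 stands for the raw file; larger values stand for condensed files.
    """
    levels = sorted(bases_per_point_levels)
    for bases_per_point in levels:
        if region_length / bases_per_point <= max_points:
            return bases_per_point
    return levels[-1]  # nothing fits, so fall back to the coarsest file


def summarize_mean(points: Sequence[Tuple[int, float]], start: int, end: int,
                   n_bins: int) -> List[Optional[float]]:
    """Second mechanism: average single-column data into n_bins equal-width bins.

    The region [start, end] is a closed interval; bins without data yield None.
    """
    length = end - start + 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for pos, value in points:
        if start <= pos <= end:
            i = (pos - start) * n_bins // length
            sums[i] += value
            counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

If the consuming application is a visualization engine, `max_points` and `n_bins` would naturally be set to the number of horizontal pixels to be drawn.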
- It should further be appreciated that contemplated systems and methods are readily extensible, as the bamserver is capable of reading files in multiple formats and understanding both genomically indexed data and additional storage formats such as SQLite and JSON. The format of the requested file is currently provided by the consuming application, but auto-detection of file format is also contemplated. The architecture of the bamserver preferably supports additional data formats in the form of plugins that can understand foreign indexing schemes and still provide a unified interface. These plugins are either specified via the uniform resource identifier (URI) of the REST request, or selected by auto-detection of the appropriate format within the bamserver.
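One way to picture the plugin arrangement is a registry keyed by format name, consulted first through the request URI and then through auto-detection. The sketch below is a hedged illustration only: the `format` query parameter, the example host name, the extension-based detection, and the two toy readers are assumptions and do not come from the specification.

```python
import json
from urllib.parse import urlparse, parse_qs

# Hypothetical plugin registry: format name -> reader callable.
READERS = {}


def register(fmt):
    """Decorator that registers a reader plugin under a format name."""
    def decorator(func):
        READERS[fmt] = func
        return func
    return decorator


@register("json")
def read_json(payload):
    return json.loads(payload)


@register("tsv")
def read_tsv(payload):
    return [line.split("\t") for line in payload.splitlines() if line]


def detect_format(path):
    """Fallback auto-detection from the file extension (an assumption)."""
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    if ext not in READERS:
        raise ValueError(f"no plugin for {path!r}")
    return ext


def resolve_reader(uri):
    """Use the 'format' query parameter if present, else auto-detect."""
    parsed = urlparse(uri)
    params = parse_qs(parsed.query)
    fmt = params["format"][0] if "format" in params else detect_format(parsed.path)
    if fmt not in READERS:
        raise ValueError(f"unsupported format {fmt!r}")
    return READERS[fmt]
```

Every reader returns plain Python structures, which stands in for the unified interface the text describes; a real plugin would also need to expose its own indexing scheme.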
- With respect to dynamic genome visualization engines, it is generally contemplated that a dynamic genome visualization engine is capable of interpreting multiple types of data with the common attribute of being mapped to a location in the genome, and of producing image-based interpretations of the data. It should be noted that the concept of a genome “browser” in some sense is already known (e.g., University of California, Santa Cruz Genome Browser, established in 2001 (see URL genome.ucsc.edu)). However, currently known browsers limit views of data to user-specified densities and are unable to respond to requests past certain limits in a timely and meaningful manner. In contrast, the dynamic genome visualization engine contemplated herein is capable of understanding the amount of data being requested by a user and altering the visualizations presented to provide more compact and summarized versions when appropriate. At one level, downsampling is handled by the bamserver, which understands the region being visualized and will automatically reduce the data sent to the visualization engine. At a higher level, if the engine itself recognizes that a sufficiently large amount of data is being requested, the underlying visualizations produced will change to provide summaries that are more useful to the end-user.
- Displays can vary widely based on the density of the data being viewed. FIGS. 3-6 represent some examples of how these displays change based on the number of bases the user is viewing in the window, where the displays are generated from genomic graphic objects used to generate genomic display objects 235 within a browser. It is important to emphasize that these displays are dynamically generated and not pre-computed, although for certain use cases pre-generated static images are not excluded and are supported by contemplated devices and methods. In FIG. 3, 52 bases of the human genome are shown across approximately 1000 horizontal pixels, with graphical representations of overall copy-number, allele-specific copy-number, raw sequencing data from BAM, and an annotation track of UCSC Known Genes. Each of these tracks is pulled dynamically from the bamserver architecture outlined earlier, and each track can query an independent bamserver to obtain the data necessary. Because such a small number of bases is being shown, no downsampling is performed on either the bamserver or the visualization engine. Thus, it is particularly preferred that the lowest zoom level is at the base readout of the raw or computed sequence.
- FIG. 4 represents a sub-kilobase zoom level showing about 1000 bases from that same region of the genome. At this resolution and number of bases, no downsampling is taking place on the bamserver; however, the visualization engine has begun to alter the display of each data source to accommodate the increased viewport. In particular, the letters on each base no longer appear on the top reference base bar or within the individual BAM reads; simple colors are used instead to represent the changes identified.
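The zoom-dependent behavior walked through for FIGS. 3-6 can be thought of as a choice driven by the number of bases per horizontal pixel. The following sketch is a hypothetical illustration of that idea; the threshold values and mode names are assumptions chosen only so that the four figures fall into four different modes, and the specification does not state any thresholds.

```python
def bases_per_pixel(num_bases, width_px=1000):
    """Density of the view: how many bases share one horizontal pixel."""
    return num_bases / width_px


def display_mode(num_bases, width_px=1000):
    """Choose a rendering style; all thresholds are illustrative assumptions."""
    bpp = bases_per_pixel(num_bases, width_px)
    if bpp <= 0.1:
        return "letters"      # room to print each base, as in FIG. 3
    if bpp <= 10:
        return "colors"       # colored marks instead of letters, as in FIG. 4
    if bpp <= 10_000:
        return "histogram"    # summarized tracks, as in FIG. 5
    return "chromosome"       # graphical overview, as in FIG. 6
```

With a 1000-pixel window, 52 bases give 0.052 bases per pixel, 1000 bases give 1, and 2 megabases give 2000, so each view in the walkthrough lands in a different mode under these assumed cut-offs.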
- FIG. 5 shows approximately 2 megabases (2 million bases) at a kilobase zoom level while the number of pixels is held constant. As a result, both the bamserver and the visualization engine have downsampled the data being drawn. The bamserver has reduced the amount of copy-number data it provides to the visualization engine, and the visualization engine has omitted the raw data track because viewing it would be impractical. In addition, the visualization engine has begun to summarize one of the variant tracks (the bottom-most track) by producing a graphical histogram at the top. Finally, the visualization engine has averaged together the multiple data points for the copy-number variation that sit beneath each pixel to produce a more accurate image.
- The final resolution, FIG. 6, represents all of chromosome 12 at a chromosome zoom level. All of the previous downsampling is occurring at this resolution, with additional downsampling being done to remove the text and display a more graphical representation of both the UCSC Known Gene and COSMIC variant tracks at the bottom of the image. While one clear example has been represented in these diagrams, this engine provides a framework for dynamic visualization that is not limited to pre-determined and pre-drawn resolution levels, and furthermore can accommodate many different types of underlying data beyond what has been shown here.
- It should be apparent to those skilled in the art that many more modifications besides those already described are possible without departing from the inventive concepts herein. The inventive subject matter, therefore, is not to be restricted except in the spirit of the appended claims. Moreover, in interpreting both the specification and the claims, all terms should be interpreted in the broadest possible manner consistent with the context. In particular, the terms “comprises” and “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, indicating that the referenced elements, components, or steps may be present, or utilized, or combined with other elements, components, or steps that are not expressly referenced. Where a definition or use of a term in an incorporated reference is inconsistent or contrary to the definition of that term provided herein, the definition of that term provided herein applies and the definition of that term in the reference does not apply. Where the specification claims refers to at least one of something selected from the group consisting of A, B, C . . . and N, the text should be interpreted as requiring only one element from the group, not A plus N, or B plus N, etc.
Claims (26)
1. A genomic visualization system comprising:
an indexed genomic database configured to store a sequence object representative of a genomic region, the sequence object comprising a plurality of scale-relevant annotations; and
a scaling engine coupled with the indexed genomic data storage and configured to:
adjust scale-relevant information derived from the scale-relevant annotations of the sequence object as a function of a user selected zoom level;
dynamically generate a genomic display object representative of the scale-relevant information based on the zoom level; and
configure an output device to present the genomic display objects to a user.
2. The system of claim 1, wherein the sequence object has a SAM/BAM or BAMBAM format.
3. The system of claim 1, wherein the genomic region is one of the following: a whole genome, a chromosome, a chromosomal fragment, and an allele.
4. The system of claim 1, further comprising a bamserver operating as the scaling engine.
5. The system of claim 4, further comprising a plurality of bamservers.
6. The system of claim 1, further comprising a visualization server operating as the scaling engine.
7. The system of claim 6, further comprising a plurality of visualization servers.
8. The system of claim 1, wherein the output device comprises at least one of the following: a display, a browser, a printer, a 3D printer, and a speaker.
9. The system of claim 1, wherein the scaling engine is further configured to adjust the scale-relevant information by downsampling based on the zoom-level.
10. The system of claim 9, wherein the scaling engine is further configured to downsample as a function of data density derived from the zoom-level.
11. The system of claim 1, wherein the scaling engine is further configured to determine the zoom level.
12. The system of claim 11, wherein the scaling engine is further configured to summarize a full data set of the sequence object according to the zoom level.
13. The system of claim 1, wherein the scaling engine is further configured to derive the scale relevant information from differences in scale-relevant annotations in different sequence objects.
14. The system of claim 1, wherein the sequence object comprises a reference sequence object.
15. The system of claim 14 wherein the reference sequence object is selected from the group consisting of raw sequence data, sequence data from homo statisticus, and sequence data from a specified point in time.
16. The system of claim 1, wherein the sequence object comprises a differential sequence object with respect to a reference genomic region.
17. The system of claim 16 wherein the reference genomic region is from homo statisticus or specific to a point in time.
18. The system of claim 1, wherein the scale relevant annotations include at least one of the following: genomic structure information, genomic change information, disease information, gene relevant information, differential information relative to a reference sequence, and metadata.
19. The system of claim 18, wherein the genomic structure includes at least one of the following: chromosome identification, location within a chromosome, allele,
20. The system of claim 18, wherein the genomic change information includes at least one of the following: a mutation, a translocation, an inversion, a deletion, a repeat, and a copy number.
21. The system of claim 18, wherein the disease information includes at least one of the following: a type of disease, a status of disease, and a treatment option for the disease.
22. The system of claim 18, wherein the gene relevant information comprises raw sequence data or processed sequence data, gene identification, information on gene regulation, and information of association of the gene with a disease.
23. The system of claim 18, wherein the metadata includes at least one of the following: patient identification, facility identification, physician identification, and insurance information.
24. The system of claim 1, further comprising a genomic graphic library configured to store a graphic object representative of scale relevant annotations.
25. The system of claim 24, wherein the scaling engine is further configured to map the scale relevant information to graphic objects from graphic library according to the zoom level.
26. The system of claim 25, wherein the genomic display object comprises the mapped graphic objects.
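As a non-limiting illustration of the arrangement recited in claims 24 to 26, the mapping from scale-relevant annotations to graphic objects could be sketched as a lookup keyed by annotation type and zoom level. The library contents, the glyph names, the zoom level labels, and the `DisplayObject` type below are hypothetical and are offered only as one possible reading.

```python
from dataclasses import dataclass

# Hypothetical graphic library: (annotation type, zoom level) -> glyph name.
GRAPHIC_LIBRARY = {
    ("gene", "base"): "labeled_exon_boxes",
    ("gene", "chromosome"): "density_bar",
    ("copy_number", "base"): "points",
    ("copy_number", "chromosome"): "averaged_line",
}


@dataclass
class DisplayObject:
    """Stand-in for a genomic display object made of mapped graphic objects."""
    zoom_level: str
    glyphs: list


def build_display_object(annotations, zoom_level, library=GRAPHIC_LIBRARY):
    """Map each annotation type to its graphic object for the given zoom level."""
    glyphs = []
    for kind in annotations:
        glyph = library.get((kind, zoom_level))
        if glyph is not None:  # skip annotations with no glyph at this scale
            glyphs.append((kind, glyph))
    return DisplayObject(zoom_level=zoom_level, glyphs=glyphs)
```

In this sketch an annotation type with no entry for the requested zoom level is simply left out of the display object, which mirrors how the raw data track is dropped at coarse zoom levels in the description.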
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/363,788 US10140683B2 (en) | 2011-12-08 | 2012-12-07 | Distributed system providing dynamic indexing and visualization of genomic data |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161568478P | 2011-12-08 | 2011-12-08 | |
PCT/US2012/068493 WO2013086355A1 (en) | 2011-12-08 | 2012-12-07 | Distributed system providing dynamic indexing and visualization of genomic data |
US14/363,788 US10140683B2 (en) | 2011-12-08 | 2012-12-07 | Distributed system providing dynamic indexing and visualization of genomic data |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2012/068493 A-371-Of-International WO2013086355A1 (en) | 2011-12-08 | 2012-12-07 | Distributed system providing dynamic indexing and visualization of genomic data |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/169,946 Division US10733701B2 (en) | 2011-12-08 | 2018-10-24 | Distributed system providing dynamic indexing and visualization of genomic data |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140368550A1 true US20140368550A1 (en) | 2014-12-18 |
US10140683B2 US10140683B2 (en) | 2018-11-27 |
Family
ID=48574927
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/363,788 Active 2033-04-14 US10140683B2 (en) | 2011-12-08 | 2012-12-07 | Distributed system providing dynamic indexing and visualization of genomic data |
US16/169,946 Active US10733701B2 (en) | 2011-12-08 | 2018-10-24 | Distributed system providing dynamic indexing and visualization of genomic data |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/169,946 Active US10733701B2 (en) | 2011-12-08 | 2018-10-24 | Distributed system providing dynamic indexing and visualization of genomic data |
Country Status (10)
Country | Link |
---|---|
US (2) | US10140683B2 (en) |
EP (2) | EP3534368B1 (en) |
JP (3) | JP6025859B2 (en) |
KR (5) | KR20190016149A (en) |
CN (1) | CN104246689B (en) |
AU (1) | AU2012347547B2 (en) |
CA (1) | CA2858686C (en) |
ES (1) | ES2729714T3 (en) |
IL (3) | IL233016A (en) |
WO (1) | WO2013086355A1 (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9792405B2 (en) | 2013-01-17 | 2017-10-17 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9858384B2 (en) | 2013-01-17 | 2018-01-02 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9940266B2 (en) | 2015-03-23 | 2018-04-10 | Edico Genome Corporation | Method and system for genomic visualization |
US10049179B2 (en) | 2016-01-11 | 2018-08-14 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing |
US10068054B2 (en) | 2013-01-17 | 2018-09-04 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10068183B1 (en) | 2017-02-23 | 2018-09-04 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on a quantum processing platform |
US20190066262A1 (en) * | 2011-12-08 | 2019-02-28 | Five3 Genomics, Llc | Distributed System Providing Dynamic Indexing And Visualization Of Genomic Data |
CN110506272A (en) * | 2016-10-11 | 2019-11-26 | 基因组系统公司 | For accessing with the method and apparatus of the biological data of access unit structuring |
US10691775B2 (en) | 2013-01-17 | 2020-06-23 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10847251B2 (en) | 2013-01-17 | 2020-11-24 | Illumina, Inc. | Genomic infrastructure for on-site or cloud-based DNA and RNA processing and analysis |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9953137B2 (en) | 2012-07-06 | 2018-04-24 | Nant Holdings Ip, Llc | Healthcare analysis stream management |
CN104871164B (en) | 2012-10-24 | 2019-02-05 | 南托米克斯有限责任公司 | Processing and the genome browser system that the variation of genomic sequence data nucleotide is presented |
CN106687965B (en) * | 2013-11-13 | 2019-10-01 | 凡弗3基因组有限公司 | System and method for transmitting and pre-processing sequencing data |
JP6576957B2 (en) * | 2014-02-26 | 2019-09-18 | ナントミクス,エルエルシー | Safe portable genome browsing device and method thereof |
BR112017019373A2 (en) * | 2015-03-12 | 2018-06-05 | Koninklijke Philips N.V. | computer-implemented method and computer-readable media |
CN107004069B (en) * | 2015-04-30 | 2021-12-03 | 株式会社Xcoo | Genome analysis device and genome visualization method |
SG11201903174SA (en) * | 2016-10-11 | 2019-05-30 | Genomsys Sa | Method and system for the transmission of bioinformatics data |
CN107506618B (en) * | 2017-07-07 | 2020-12-08 | 北京中科晶云科技有限公司 | Storage method and query method of high-throughput sequencing sequence |
CN110993033A (en) * | 2019-11-14 | 2020-04-10 | 北京诺禾致源科技股份有限公司 | Method, system and device for processing genome data |
US11662938B2 (en) | 2020-05-11 | 2023-05-30 | Nantcell, Inc. | Object storage and access management systems and methods |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040002818A1 (en) * | 2001-12-21 | 2004-01-01 | Affymetrix, Inc. | Method, system and computer software for providing microarray probe data |
US20100281401A1 (en) * | 2008-11-10 | 2010-11-04 | Signature Genomic Labs | Interactive Genome Browser |
US20120066601A1 (en) * | 2010-09-14 | 2012-03-15 | Apple Inc. | Content configuration for device platforms |
US20120102041A1 (en) * | 2010-10-22 | 2012-04-26 | Samsung Sds Co., Ltd. | Genetic information management system and method |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6519583B1 (en) * | 1997-05-15 | 2003-02-11 | Incyte Pharmaceuticals, Inc. | Graphical viewer for biomolecular sequence data |
EP1067466A2 (en) * | 1999-07-09 | 2001-01-10 | Smithkline Beecham | Genome browser interface |
AU2002228739A1 (en) * | 2000-10-27 | 2002-05-06 | Entigen Corporation | Integrating heterogeneous data and tools |
US20030204317A1 (en) * | 2002-04-26 | 2003-10-30 | Affymetrix, Inc. | Methods, systems and software for displaying genomic sequence and annotations |
GB0202809D0 (en) * | 2002-02-07 | 2002-03-27 | Riverwood Int Corp | A paperboard carton |
US20050038776A1 (en) * | 2003-08-15 | 2005-02-17 | Ramin Cyrus | Information system for biological and life sciences research |
JP2006065501A (en) * | 2004-08-25 | 2006-03-09 | Nittetsu Hitachi Systems Engineering Inc | Genome information display system |
US7868888B2 (en) * | 2006-02-10 | 2011-01-11 | Adobe Systems Incorporated | Course grid aligned counters |
US20090125248A1 (en) * | 2007-11-09 | 2009-05-14 | Soheil Shams | System, Method and computer program product for integrated analysis and visualization of genomic data |
KR102218512B1 (en) | 2010-05-25 | 2021-02-19 | 더 리젠츠 오브 더 유니버시티 오브 캘리포니아 | Bambam: parallel comparative analysis of high-throughput sequencing data |
CN101944151B (en) | 2010-09-30 | 2012-06-27 | 重庆大学 | Wall boundary simulation method in molecular dynamics simulation |
CN104246689B (en) * | 2011-12-08 | 2020-06-02 | 凡弗3基因组有限公司 | Distributed system providing dynamic indexing and visualization of genomic data |
-
2012
- 2012-12-07 CN CN201280068298.9A patent/CN104246689B/en not_active Expired - Fee Related
- 2012-12-07 KR KR1020197003895A patent/KR20190016149A/en active Application Filing
- 2012-12-07 WO PCT/US2012/068493 patent/WO2013086355A1/en active Application Filing
- 2012-12-07 EP EP19167102.3A patent/EP3534368B1/en active Active
- 2012-12-07 JP JP2014546125A patent/JP6025859B2/en not_active Expired - Fee Related
- 2012-12-07 US US14/363,788 patent/US10140683B2/en active Active
- 2012-12-07 KR KR1020167013318A patent/KR101949569B1/en active IP Right Grant
- 2012-12-07 KR KR1020197024130A patent/KR20190099105A/en not_active Application Discontinuation
- 2012-12-07 AU AU2012347547A patent/AU2012347547B2/en not_active Ceased
- 2012-12-07 ES ES12856007T patent/ES2729714T3/en active Active
- 2012-12-07 EP EP12856007.5A patent/EP2788861B1/en not_active Not-in-force
- 2012-12-07 KR KR1020207011314A patent/KR20200044149A/en not_active Application Discontinuation
- 2012-12-07 KR KR20147016583A patent/KR20140135945A/en active Search and Examination
- 2012-12-07 CA CA2858686A patent/CA2858686C/en active Active
-
2014
- 2014-06-08 IL IL233016A patent/IL233016A/en active IP Right Grant
-
2016
- 2016-07-08 JP JP2016135820A patent/JP6171058B2/en not_active Expired - Fee Related
-
2017
- 2017-06-11 IL IL252817A patent/IL252817B/en active IP Right Grant
- 2017-07-03 JP JP2017130451A patent/JP6368832B2/en not_active Expired - Fee Related
-
2018
- 2018-10-24 US US16/169,946 patent/US10733701B2/en active Active
-
2019
- 2019-07-10 IL IL267977A patent/IL267977A/en unknown
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190066262A1 (en) * | 2011-12-08 | 2019-02-28 | Five3 Genomics, Llc | Distributed System Providing Dynamic Indexing And Visualization Of Genomic Data |
US10733701B2 (en) * | 2011-12-08 | 2020-08-04 | Five3 Genomics, Llc | Distributed system providing dynamic indexing and visualization of genomic data |
US10083276B2 (en) | 2013-01-17 | 2018-09-25 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10622096B2 (en) | 2013-01-17 | 2020-04-14 | Edico Genome Corporation | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9953134B2 (en) | 2013-01-17 | 2018-04-24 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9953132B2 (en) | 2013-01-17 | 2018-04-24 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9953135B2 (en) | 2013-01-17 | 2018-04-24 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US20180196917A1 (en) | 2013-01-17 | 2018-07-12 | Edico Genome Corporation | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9792405B2 (en) | 2013-01-17 | 2017-10-17 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10210308B2 (en) | 2013-01-17 | 2019-02-19 | Edico Genome Corporation | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10068054B2 (en) | 2013-01-17 | 2018-09-04 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10847251B2 (en) | 2013-01-17 | 2020-11-24 | Illumina, Inc. | Genomic infrastructure for on-site or cloud-based DNA and RNA processing and analysis |
US11842796B2 (en) | 2013-01-17 | 2023-12-12 | Edico Genome Corporation | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US11043285B2 (en) | 2013-01-17 | 2021-06-22 | Edico Genome Corporation | Bioinformatics systems, apparatus, and methods executed on an integrated circuit processing platform |
US10262105B2 (en) | 2013-01-17 | 2019-04-16 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9898424B2 (en) | 2013-01-17 | 2018-02-20 | Edico Genome, Corp. | Bioinformatics, systems, apparatus, and methods executed on an integrated circuit processing platform |
US10216898B2 (en) | 2013-01-17 | 2019-02-26 | Edico Genome Corporation | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9858384B2 (en) | 2013-01-17 | 2018-01-02 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10691775B2 (en) | 2013-01-17 | 2020-06-23 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US10622097B2 (en) | 2013-01-17 | 2020-04-14 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US9940266B2 (en) | 2015-03-23 | 2018-04-10 | Edico Genome Corporation | Method and system for genomic visualization |
US10068052B2 (en) | 2016-01-11 | 2018-09-04 | Edico Genome Corporation | Bioinformatics systems, apparatuses, and methods for generating a De Bruijn graph |
US11049588B2 (en) | 2016-01-11 | 2021-06-29 | Illumina, Inc. | Bioinformatics systems, apparatuses, and methods for generating a De Brujin graph |
US10049179B2 (en) | 2016-01-11 | 2018-08-14 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing |
CN110506272A (en) * | 2016-10-11 | 2019-11-26 | 基因组系统公司 | For accessing with the method and apparatus of the biological data of access unit structuring |
US10068183B1 (en) | 2017-02-23 | 2018-09-04 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on a quantum processing platform |
Also Published As
Publication number | Publication date |
---|---|
KR20190016149A (en) | 2019-02-15 |
EP2788861A1 (en) | 2014-10-15 |
US10140683B2 (en) | 2018-11-27 |
EP3534368A1 (en) | 2019-09-04 |
EP3534368B1 (en) | 2020-09-16 |
CN104246689A (en) | 2014-12-24 |
AU2012347547A1 (en) | 2014-07-03 |
KR20160062211A (en) | 2016-06-01 |
KR20190099105A (en) | 2019-08-23 |
KR101949569B1 (en) | 2019-02-18 |
WO2013086355A1 (en) | 2013-06-13 |
EP2788861B1 (en) | 2019-05-15 |
CA2858686C (en) | 2018-10-02 |
KR20200044149A (en) | 2020-04-28 |
AU2012347547B2 (en) | 2015-10-22 |
US10733701B2 (en) | 2020-08-04 |
KR20140135945A (en) | 2014-11-27 |
IL233016A0 (en) | 2014-07-31 |
JP2017208115A (en) | 2017-11-24 |
IL267977A (en) | 2019-09-26 |
CA2858686A1 (en) | 2013-06-13 |
JP6171058B2 (en) | 2017-07-26 |
ES2729714T3 (en) | 2019-11-05 |
US20190066262A1 (en) | 2019-02-28 |
IL252817B (en) | 2019-07-31 |
JP6368832B2 (en) | 2018-08-01 |
EP2788861A4 (en) | 2015-04-15 |
IL252817A0 (en) | 2017-08-31 |
JP6025859B2 (en) | 2016-11-16 |
CN104246689B (en) | 2020-06-02 |
JP2015500535A (en) | 2015-01-05 |
JP2016212900A (en) | 2016-12-15 |
IL233016A (en) | 2017-06-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |