WO2020191391A2 - Artificial intelligence-based sequencing - Google Patents

Artificial intelligence-based sequencing

Info

Publication number
WO2020191391A2
WO2020191391A2 PCT/US2020/024092 US2020024092W
Authority
WO
WIPO (PCT)
Prior art keywords
cluster
image
subpixels
pixel
sequencing
Prior art date
Application number
PCT/US2020/024092
Other languages
English (en)
Other versions
WO2020191391A3 (fr)
Inventor
Anindita Dutta
Dorna KASHEFHAGHIGHI
Amirali Kia
Kishore Jaganathan
John Randall GOBBEL
Original Assignee
Illumina, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from NL2023316A external-priority patent/NL2023316B1/en
Priority claimed from NL2023312A external-priority patent/NL2023312B1/en
Priority claimed from NL2023310A external-priority patent/NL2023310B1/en
Priority claimed from NL2023314A external-priority patent/NL2023314B1/en
Priority claimed from NL2023311A external-priority patent/NL2023311B9/en
Priority claimed from US16/825,991 external-priority patent/US11210554B2/en
Priority claimed from US16/825,987 external-priority patent/US11347965B2/en
Priority to MX2020014302A priority Critical patent/MX2020014302A/es
Priority to KR1020217003270A priority patent/KR20210145116A/ko
Priority to BR112020026455-5A priority patent/BR112020026455A2/pt
Priority to JP2020572706A priority patent/JP2022535306A/ja
Priority to AU2020240141A priority patent/AU2020240141A1/en
Application filed by Illumina, Inc. filed Critical Illumina, Inc.
Priority to CA3104951A priority patent/CA3104951A1/fr
Priority to SG11202012463YA priority patent/SG11202012463YA/en
Priority to CN202080004529.4A priority patent/CN112689875A/zh
Priority to EP20757979.8A priority patent/EP3942074A2/fr
Publication of WO2020191391A2 publication Critical patent/WO2020191391A2/fr
Publication of WO2020191391A3 publication Critical patent/WO2020191391A3/fr
Priority to IL279533A priority patent/IL279533A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23211Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with adaptive number of clusters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/69Microscopic objects, e.g. biological cells or cellular parts
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20Supervised data analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/47Detecting features for summarising video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images

Definitions

  • the technology disclosed relates to artificial intelligence type computers and digital data processing systems and corresponding data processing methods and products for emulation of intelligence (i.e., knowledge based systems, reasoning systems, and knowledge acquisition systems); and including systems for reasoning with uncertainty (e.g., fuzzy logic systems), adaptive systems, machine learning systems, and artificial neural networks.
  • Deep neural networks are a type of artificial neural networks that use multiple nonlinear and complex transforming layers to successively model high-level features. Deep neural networks provide feedback via backpropagation which carries the difference between observed and predicted output to adjust parameters. Deep neural networks have evolved with the availability of large training datasets, the power of parallel and distributed computing, and sophisticated training algorithms. Deep neural networks have facilitated major advances in numerous domains such as computer vision, speech recognition, and natural language processing.
  • Convolutional neural networks and recurrent neural networks (RNNs) are components of deep neural networks.
  • Convolutional neural networks have succeeded particularly in image recognition with an architecture that comprises convolution layers, nonlinear layers, and pooling layers.
  • Recurrent neural networks are designed to utilize sequential information of input data with cyclic connections among building blocks like perceptrons, long short-term memory units, and gated recurrent units.
  • many other emergent deep neural networks have been proposed for limited contexts, such as deep spatio-temporal neural networks, multi-dimensional recurrent neural networks, and convolutional auto-encoders.
  • the goal of training deep neural networks is optimization of the weight parameters in each layer, which gradually combines simpler features into complex features so that the most suitable hierarchical representations can be learned from data.
  • a single cycle of the optimization process is organized as follows. First, given a training dataset, the forward pass sequentially computes the output in each layer and propagates the function signals forward through the network. In the final output layer, an objective loss function measures error between the inferenced outputs and the given labels. To minimize the training error, the backward pass uses the chain rule to backpropagate error signals and compute gradients with respect to all weights throughout the neural network. Finally, the weight parameters are updated using optimization algorithms based on stochastic gradient descent.
  • stochastic gradient descent provides stochastic approximations by performing the updates for each small set of data examples.
  • optimization algorithms stem from stochastic gradient descent.
  • the Adagrad and Adam training algorithms perform stochastic gradient descent while adaptively modifying learning rates based on update frequency and moments of the gradients for each parameter, respectively.
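The optimization cycle described above can be made concrete with a minimal sketch, assuming a single linear layer and a squared-error loss (both illustrative choices, not taken from the disclosure):

```python
import numpy as np

def sgd_step(w, x_batch, y_batch, lr=0.01):
    """One stochastic gradient descent update on a mini-batch of examples."""
    # Forward pass: propagate function signals through the (single) layer.
    y_pred = x_batch @ w
    # Objective loss: error between predicted outputs and given labels.
    loss = np.mean((y_pred - y_batch) ** 2)
    # Backward pass: gradient of the loss with respect to the weights
    # (the chain rule collapses to one step for a single layer).
    grad = 2.0 * x_batch.T @ (y_pred - y_batch) / len(x_batch)
    # Update the weight parameters using stochastic gradient descent.
    return w - lr * grad, loss
```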
  • regularization refers to strategies intended to avoid overfitting and thus achieve good generalization performance.
  • weight decay adds a penalty term to the objective loss function so that weight parameters converge to smaller absolute values.
  • Dropout randomly removes hidden units from neural networks during training and can be considered an ensemble of possible subnetworks.
  • maxout, a new activation function, and rnnDrop, a variant of dropout for recurrent neural networks, are further regularization approaches.
  • batch normalization provides a new regularization method through normalization of scalar features for each activation within a mini-batch and learning each mean and variance as parameters.
  • Convolutional neural networks have been adapted to solve sequence-based problems in genomics such as motif discovery, pathogenic variant identification, and gene expression inference. Convolutional neural networks use a weight-sharing strategy that is especially useful for studying DNA because it can capture sequence motifs, which are short, recurring local patterns in DNA that are presumed to have significant biological functions. A hallmark of convolutional neural networks is the use of convolution filters.
  • convolution filters perform adaptive learning of features, analogous to a process of mapping raw input data to the informative representation of knowledge.
  • the convolution filters serve as a series of motif scanners, since a set of such filters is capable of recognizing relevant patterns in the input and updating themselves during the training procedure.
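As an illustration of the motif-scanner view of convolution filters, the following sketch slides a one-hot filter across a one-hot encoded DNA string; the sequence, filter, and function names are hypothetical:

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """One-hot encode a DNA string into a (len(seq), 4) array."""
    return np.array([[float(b == base) for base in BASES] for b in seq])

def motif_scan(seq, motif_filter):
    """Slide a (k, 4) convolution filter along the sequence and return the
    activation at each valid position; high activations mark motif matches."""
    x = one_hot(seq)
    k = motif_filter.shape[0]
    return np.array([np.sum(x[i:i + k] * motif_filter)
                     for i in range(len(seq) - k + 1)])

# A filter tuned to "TATA" responds most strongly where that motif occurs.
tata_filter = one_hot("TATA")
print(motif_scan("GGTATACC", tata_filter))  # peak of 4.0 at position 2
```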
  • Recurrent neural networks can capture long-range dependencies in sequential data of varying lengths, such as protein or DNA sequences.
  • nucleic acid cluster-based genomics methods extend to other areas of genome analysis as well.
  • nucleic acid cluster-based genomics can be used in sequencing applications, diagnostics and screening, gene expression analysis, epigenetic analysis, genetic analysis of polymorphisms, and the like.
  • Each of these nucleic acid cluster-based genomics technologies is limited when there is an inability to resolve data generated from closely proximate or spatially overlapping nucleic acid clusters.
  • nucleic acid sequencing data that can be obtained rapidly and cost-effectively for a wide variety of uses, including for genomics (e.g., for genome characterization of any and all animal, plant, microbial or other biological species or populations), pharmacogenomics, transcriptomics, diagnostics, prognostics, biomedical risk assessment, clinical and research genetics, personalized medicine, drug efficacy and drug interactions assessments, veterinary medicine, agriculture, evolutionary and biodiversity studies, aquaculture, forestry, oceanography, ecological and environmental management, and other purposes.
  • the technology disclosed provides neural network-based methods and systems that address these and similar needs, including increasing the level of throughput in high-throughput nucleic acid sequencing technologies, and offers other related advantages.
  • Figure 1 shows one implementation of a processing pipeline that determines cluster metadata using subpixel base calling.
  • Figure 2 depicts one implementation of a flow cell that contains clusters in its tiles.
  • Figure 3 illustrates one example of the Illumina GA-IIx flow cell with eight lanes.
  • Figure 4 depicts an image set of sequencing images for four-channel chemistry, i.e., the image set has four sequencing images, captured using four different wavelength bands (image/imaging channel) in the pixel domain.
  • Figure 5 is one implementation of dividing a sequencing image into subpixels (or subpixel regions).
  • Figure 6 shows preliminary center coordinates of the clusters identified by the base caller during the subpixel base calling.
  • Figure 7 depicts one implementation of merging subpixel base calls produced over the plurality of sequencing cycles to generate the so-called “cluster maps” that contain the cluster metadata.
  • Figure 8a illustrates one example of a cluster map generated by the merging of the subpixel base calls.
  • Figure 8b depicts one implementation of subpixel base calling.
  • Figure 9 shows another example of a cluster map that identifies cluster metadata.
  • Figure 10 shows how a center of mass (COM) of a disjointed region in a cluster map is calculated.
  • Figure 11 depicts one implementation of calculation of a weighted decay factor based on the Euclidean distance from a subpixel in a disjointed region to the COM of the disjointed region.
  • Figure 12 illustrates one implementation of an example ground truth decay map derived from an example cluster map produced by the subpixel base calling.
  • Figure 13 illustrates one implementation of deriving a ternary map from a cluster map.
  • Figure 14 illustrates one implementation of deriving a binary map from a cluster map.
  • Figure 15 is a block diagram that shows one implementation of generating training data that is used to train the neural network-based template generator and the neural network-based base caller.
  • Figure 16 shows characteristics of the disclosed training examples used to train the neural network-based template generator and the neural network-based base caller.
  • Figure 17 illustrates one implementation of processing input image data through the disclosed neural network-based template generator and generating an output value for each unit in an array.
  • the array is a decay map.
  • the array is a ternary map.
  • the array is a binary map.
  • Figure 18 shows one implementation of post-processing techniques that are applied to the decay map, the ternary map, or the binary map produced by the neural network-based template generator to derive cluster metadata, including cluster centers, cluster shapes, cluster sizes, cluster background, and/or cluster boundaries.
  • Figure 19 depicts one implementation of extracting cluster intensity in the pixel domain.
  • Figure 20 illustrates one implementation of extracting cluster intensity in the subpixel domain.
  • Figure 21a shows three different implementations of the neural network-based template generator.
  • Figure 21b depicts one implementation of the input image data that is fed as input to the neural network-based template generator
  • the input image data comprises a series of image sets with sequencing images that are generated during a certain number of initial sequencing cycles of a sequencing run.
  • Figure 22 shows one implementation of extracting patches from the series of image sets in Figure 21b to produce a series of “downsized” image sets that form the input image data.
  • Figure 23 depicts one implementation of upsampling the series of image sets in Figure 21b to produce a series of “upsampled” image sets that forms the input image data.
  • Figure 24 shows one implementation of extracting patches from the series of upsampled image sets in Figure 23 to produce a series of “upsampled and down-sized” image sets that form the input image data.
  • Figure 25 illustrates one implementation of an overall example process of generating ground truth data for training the neural network-based template generator.
  • Figure 26 illustrates one implementation of the regression model.
  • Figure 27 depicts one implementation of generating a ground truth decay map from a cluster map.
  • the ground truth decay map is used as ground truth data for training the regression model.
  • Figure 28 is one implementation of training the regression model using a backpropagation-based gradient update technique.
  • Figure 29 is one implementation of template generation by the regression model during inference.
  • Figure 30 illustrates one implementation of subjecting the decay map to post-processing to identify cluster metadata.
  • Figure 31 depicts one implementation of a watershed segmentation technique identifying non-overlapping groups of contiguous cluster/cluster interior subpixels that characterize the clusters.
  • Figure 32 is a table that shows an example U-Net architecture of the regression model.
  • Figure 33 illustrates different approaches of extracting cluster intensity using cluster shape information identified in a template image.
  • Figure 34 shows different approaches of base calling using the outputs of the regression model.
  • Figure 35 illustrates the difference in base calling performance when the RTA base caller uses ground truth center of mass (COM) location as the cluster center, as opposed to using a non-COM location as the cluster center.
  • Figure 36 shows, on the left, an example decay map produced by the regression model. On the right, Figure 36 also shows an example ground truth decay map that the regression model approximates during the training.
  • Figure 37 portrays one implementation of the peak locator identifying cluster centers in the decay map by detecting peaks.
  • Figure 38 compares peaks detected by the peak locator in a decay map produced by the regression model with peaks in a corresponding ground truth decay map.
  • Figure 39 illustrates performance of the regression model using precision and recall statistics.
  • Figure 40 compares performance of the regression model with the RTA base caller for 20pM library concentration (normal run).
  • Figure 41 compares performance of the regression model with the RTA base caller for 30pM library concentration (dense run).
  • Figure 42 compares the number of non-duplicate proper read pairs, i.e., the number of paired reads that are not duplicates and have both reads aligned inwards within a reasonable distance, detected by the regression model versus the same detected by the RTA base caller.
  • Figure 43 shows, on the right, a first decay map produced by the regression model. On the left, Figure 43 shows a second decay map produced by the regression model.
  • Figure 44 compares performance of the regression model with the RTA base caller for 40pM library concentration (highly dense run).
  • Figure 45 shows, on the left, a first decay map produced by the regression model. On the right, Figure 45 shows the results of the thresholding, the peak locating, and the watershed segmentation technique applied to the first decay map.
  • Figure 46 illustrates one implementation of the binary classification model.
  • Figure 47 is one implementation of training the binary classification model using a backpropagation-based gradient update technique that involves softmax scores.
  • Figure 48 is another implementation of training the binary classification model using a backpropagation-based gradient update technique that involves sigmoid scores.
  • Figure 49 illustrates another implementation of the input image data fed to the binary classification model and the corresponding class labels used to train the binary classification model.
  • Figure 50 is one implementation of template generation by the binary classification model during inference.
  • Figure 51 illustrates one implementation of subjecting the binary map to peak detection to identify cluster centers.
  • Figure 52a shows, on the left, an example binary map produced by the binary classification model. On the right, Figure 52a also shows an example ground truth binary map that the binary classification model approximates during the training.
  • Figure 52b illustrates performance of the binary classification model using a precision statistic.
  • Figure 53 is a table that shows an example architecture of the binary classification model.
  • Figure 54 illustrates one implementation of the ternary classification model.
  • Figure 55 is one implementation of training the ternary classification model using a backpropagation-based gradient update technique.
  • Figure 56 illustrates another implementation of the input image data fed to the ternary classification model and the corresponding class labels used to train the ternary classification model.
  • Figure 57 is a table that shows an example architecture of the ternary classification model.
  • Figure 58 is one implementation of template generation by the ternary classification model during inference.
  • Figure 59 shows a ternary map produced by the ternary classification model.
  • Figure 60 depicts an array of units produced by the ternary classification model 5400, along with the unit-wise output values.
  • Figure 61 shows one implementation of subjecting the ternary map to post-processing to identify cluster centers, cluster background, and cluster interior.
  • Figure 62a shows example predictions of the ternary classification model.
  • Figure 62b illustrates other example predictions of the ternary classification model.
  • Figure 62c shows yet other example predictions of the ternary classification model.
  • Figure 63 depicts one implementation of deriving the cluster centers and cluster shapes from the output of the ternary classification model in Figure 62a.
  • Figure 64 compares base calling performance of the binary classification model, the regression model, and the RTA base caller.
  • Figure 65 compares the performance of the ternary classification model with that of the RTA base caller under three contexts, five sequencing metrics, and two run densities.
  • Figure 66 compares the performance of the regression model with that of the RTA base caller under the three contexts, the five sequencing metrics, and the two run densities discussed in Figure 65.
  • Figure 67 focuses on the penultimate layer of the neural network-based template generator.
  • Figure 68 visualizes what the penultimate layer of the neural network-based template generator has learned as a result of the backpropagation-based gradient update training.
  • the illustrated implementation visualizes twenty-four out of the thirty-two trained convolution filters of the penultimate layer depicted in Figure 67.
  • Figure 69 overlays cluster center predictions of the binary classification model (in blue) onto those of the RTA base caller (in pink).
  • Figure 70 overlays cluster center predictions made by the RTA base caller (in pink) onto visualization of the trained convolution filters of the penultimate layer of the binary classification model.
  • Figure 71 illustrates one implementation of training data used to train the neural network-based template generator.
  • Figure 72 is one implementation of using beads for image registration based on cluster center predictions of the neural network-based template generator.
  • Figure 73 illustrates one implementation of cluster statistics of clusters identified by the neural network-based template generator.
  • Figure 74 shows how the neural network-based template generator's ability to distinguish between adjacent clusters improves when the number of initial sequencing cycles for which the input image data is used increases from five to seven.
  • Figure 75 illustrates the difference in base calling performance when an RTA base caller uses ground truth center of mass (COM) location as the cluster center, as opposed to when a non-COM location is used as the cluster center.
  • Figure 76 portrays the performance of the neural network-based template generator on extra detected clusters.
  • Figure 77 shows different datasets used for training the neural network-based template generator.
  • Figure 78 shows the processing stages used by the RTA base caller for base calling, according to one implementation.
  • Figure 79 illustrates one implementation of base calling using the disclosed neural network-based base caller.
  • Figure 80 is one implementation of transforming, from subpixel domain to pixel domain, location/position information of cluster centers identified from the output of the neural network-based template generator.
  • Figure 81 is one implementation of using cycle-specific and image channel-specific transformations to derive the so-called “transformed cluster centers” from the reference cluster centers.
  • Figure 82 illustrates an image patch that is part of the input data fed to the neural network-based base caller.
  • Figure 83 depicts one implementation of determining distance values for a distance channel when a single target cluster is being base called by the neural network-based base caller.
  • Figure 84 shows one implementation of pixel-wise encoding the distance values that are calculated between the pixels and the target cluster.
  • Figure 85a depicts one implementation of determining distance values for a distance channel when multiple target clusters are being simultaneously base called by the neural network-based base caller.
  • Figure 85b shows, for each of the target clusters, some nearest pixels determined based on the pixel center-to-nearest cluster center distances.
  • Figure 86 shows one implementation of pixel-wise encoding the minimum distance values that are calculated between the pixels and the nearest one of the clusters.
  • Figure 87 illustrates one implementation using pixel-to-cluster classification/attribution/categorization, referred to herein as “cluster shape data”.
  • Figure 88 shows one implementation of calculating the distance values using the cluster shape data.
  • Figure 89 shows one implementation of pixel-wise encoding the distance values that are calculated between the pixels and the assigned clusters.
  • Figure 90 illustrates one implementation of the specialized architecture of the neural network-based base caller that is used to segregate processing of data for different sequencing cycles.
  • Figure 91 depicts one implementation of segregated convolutions.
  • Figure 92a depicts one implementation of combinatory convolutions.
  • Figure 92b depicts another implementation of the combinatory convolutions.
  • Figure 93 shows one implementation of convolution layers of the neural network-based base caller in which each convolution layer has a bank of convolution filters.
  • Figure 94 depicts two configurations of the scaling channel that supplements the image channels.
  • Figure 95a illustrates one implementation of input data for a single sequencing cycle that produces a red image and a green image.
  • Figure 95b illustrates one implementation of the distance channels supplying additive bias that is incorporated in the feature maps generated from the image channels.
  • Figures 96a, 96b, and 96c depict one implementation of base calling a single target cluster.
  • Figure 97 shows one implementation of simultaneously base calling multiple target clusters.
  • Figure 98 shows one implementation of simultaneously base calling multiple target dusters at a plurality of successive sequencing cycles, thereby simultaneously producing a base call sequence for each of the multiple target clusters.
  • Figure 99 illustrates the dimensionality diagram for the single cluster base calling implementation.
  • Figure 100 illustrates the dimensionality diagram for the multiple clusters, single sequencing cycle base calling implementation.
  • Figure 101 illustrates the dimensionality diagram for the multiple clusters, multiple sequencing cycles base calling implementation.
  • Figure 102a depicts an example arrayed input configuration of the multi-cycle input data.
  • Figure 102b shows an example stacked input configuration of the multi-cycle input data.
  • Figure 103a depicts one implementation of reframing pixels of an image patch to center a center of a target cluster being base called in a center pixel.
  • Figure 103b depicts another example reframed/shifted image patch in which (i) the center of the center pixel coincides with the center of the target cluster and (ii) the non-center pixels are equidistant from the center of the target cluster.
  • Figure 104 shows one implementation of base calling a single target cluster at a current sequencing cycle using a standard convolution neural network and the reframed input.
  • Figure 105 shows one implementation of base calling multiple target clusters at the current sequencing cycle using the standard convolution neural network and the aligned input.
  • Figure 106 shows one implementation of base calling multiple target clusters at a plurality of sequencing cycles using the standard convolution neural network and the aligned input.
  • Figure 107 shows one implementation of training the neural network-based base caller.
  • Figure 108a depicts one implementation of a hybrid neural network that is used as the neural network-based base caller.
  • Figure 108b shows one implementation of 3D convolutions used by the recurrent module of the hybrid neural network to produce the current hidden state representations.
  • Figure 109 illustrates one implementation of processing, through a cascade of convolution layers of the convolution module, per-cycle input data for a single sequencing cycle among the series of t sequencing cycles to be base called.
  • Figure 110 depicts one implementation of mixing the single sequencing cycle's per-cycle input data with its corresponding convolved representations produced by the cascade of convolution layers of the convolution module.
  • Figure 111 shows one implementation of arranging flattened mixed representations of successive sequencing cycles as a stack.
  • Figure 112a illustrates one implementation of subjecting the stack of Figure 111 to recurrent application of 3D convolutions in forward and backward directions and producing base calls for each of the clusters at each of the t sequencing cycles in the series.
  • Figure 112b shows one implementation of processing a 3D input volume x(t), which comprises groups of flattened mixed representations, through an input gate, an activation gate, a forget gate, and an output gate of a long short-term memory (LSTM) network that applies the 3D convolutions.
  • the LSTM network is part of the recurrent module of the hybrid neural network.
  • Figure 113 shows one implementation of balancing trinucleotides (3-mers) in the training data used to train the neural network-based base caller.
  • Figure 114 compares base calling accuracy of the RTA base caller against the neural network-based base caller.
  • Figure 115 compares tile-to-tile generalization of the RTA base caller with that of the neural network-based base caller on a same tile.
  • Figure 116 compares tile-to-tile generalization of the RTA base caller with that of the neural network-based base caller on a same tile and on different tiles.
  • Figure 117 also compares tile-to-tile generalization of the RTA base caller with that of the neural network-based base caller on different tiles.
  • Figure 118 shows how different sizes of the image patches fed as input to the neural network-based base caller affect the base calling accuracy.
  • Figures 119, 120, 121, and 122 show lane-to-lane generalization of the neural network-based base caller on training data from A. baumannii and E. coli.
  • Figure 123 depicts an error profile for the lane-to-lane generalization discussed above with respect to Figures 119, 120, 121, and 122.
  • Figure 124 attributes the source of the error detected by the error profile of Figure 123 to low cluster intensity in the green channel.
  • Figure 125 compares error profiles of the RTA base caller and the neural network-based base caller for two sequencing runs (Read 1 and Read 2).
  • Figure 126a shows run-to-run generalization of the neural network-based base caller on four different instruments.
  • Figure 126b shows run-to-run generalization of the neural network-based base caller on four different runs executed on a same instrument.
  • Figure 127 shows the genome statistics of the training data used to train the neural network-based base caller.
  • Figure 128 shows the genome context of the training data used to train the neural network-based base caller.
  • Figure 129 shows the base calling accuracy of the neural network-based base caller in base calling long reads (e.g., 2 x 250).
  • Figure 130 illustrates one implementation of how the neural network-based base caller attends to the central cluster pixel(s) and its neighboring pixels across image patches.
  • Figure 131 shows various hardware components and configurations used to train and run the neural network-based base caller, according to one implementation. In other implementations, different hardware components and configurations are used.
  • Figure 132 shows various sequencing tasks that can be performed using the neural network-based base caller.
  • Figure 133 is a scatter plot visualized by t-Distributed Stochastic Neighbor Embedding (t-SNE) and portrays base calling results of the neural network-based base caller.
  • Figure 134 illustrates one implementation of selecting the base call confidence probabilities made by the neural network-based base caller for quality scoring.
  • Figure 135 shows one implementation of the neural network-based quality scoring.
  • Figures 136a-136b depict one implementation of correspondence between the quality scores and the base call confidence predictions made by the neural network-based base caller.
  • Figure 137 shows one implementation of inferring quality scores from base call confidence predictions made by the neural network- based base caller during inference.
  • Figure 138 shows one implementation of training the neural network-based quality scorer to process input data derived from the sequencing images and directly produce quality indications.
  • Figure 139 shows one implementation of directly producing quality indications as outputs of the neural network-based quality scorer during inference.
  • Figure 140 depicts one implementation of using lossless transformation to generate transformed data that can be fed as input to the neural network-based template generator, the neural network-based base caller, and the neural network-based quality scorer.
  • Figure 141 illustrates one implementation of integrating the neural network-based template generator with the neural network-based base caller using area weighting factoring.
  • Figure 142 illustrates another implementation of integrating the neural network-based template generator with the neural network- based base caller using upsampling and background masking.
  • Figure 143 depicts one example of area weighting factoring 14300 for contribution from only a single cluster per pixel.
  • Figure 144 depicts one example of area weighting factoring for contributions from multiple clusters per pixel.
  • Figure 145 depicts one example of using interpolation for upsampling and background masking.
  • Figure 146 depicts one example of using subpixel count weighting for upsampling and background masking.
  • Figures 147A and 147B depict one implementation of a sequencing system.
  • the sequencing system comprises a configurable processor.
  • Figure 147C is a simplified block diagram of a system for analysis of sensor data from the sequencing system, such as base call sensor outputs.
  • Figure 148A is a simplified diagram showing aspects of the base calling operation, including functions of a runtime program executed by a host processor.
  • Figure 148B is a simplified diagram of a configuration of a configurable processor such as the one depicted in Figure 147C.
  • Figure 149 is a computer system that can be used by the sequencing system of Figure 147A to implement the technology disclosed herein.
  • Figure 150 shows different implementations of data pre-processing, which can include data normalization and data augmentation.
  • Figure 151 shows that the data normalization technique (DeepRTA (norm)) and the data augmentation technique (DeepRTA (augment)) of Figure 150 reduce the base calling error percentage when the neural network-based base caller is trained on bacterial data and tested on human data, where the bacterial data and the human data share the same assay (e.g., both contain intronic data).
  • Figure 152 shows that the data normalization technique (DeepRTA (norm)) and the data augmentation technique (DeepRTA (augment)) of Figure 151 reduce the base calling error percentage when the neural network-based base caller is trained on non-exonic data (e.g., intronic data) and tested on exonic data.
  • the signal from an image set being evaluated is increasingly faint as classification of bases proceeds in cycles, especially over increasingly long strands of bases.
  • the signal-to-noise ratio decreases as base classification extends over the length of a strand, so reliability decreases. Updated estimates of reliability are expected as the estimated reliability of base classification changes.
  • Digital images are captured from amplified clusters of sample strands. Samples are amplified by duplicating strands using a variety of physical structures and chemistries. During sequencing by synthesis, tags are chemically attached in cycles and stimulated to glow. Digital sensors collect photons from the tags that are read out of pixels to produce images.
  • Cluster positions are not mechanically regulated, so cluster centers are not aligned with pixel centers.
  • a pixel center can be the integer coordinate assigned to a pixel. In other implementations, it can be the top-left corner of the pixel. In yet other implementations, it can be the centroid or center-of-mass of the pixel. Amplification does not produce uniform cluster shapes. Distribution of cluster signals in the digital image is, therefore, a statistical distribution rather than a regular pattern. We call this positional uncertainty.
  • One of the signal classes may produce no detectable signal and be classified at a particular position based on a “dark” signal.
  • templates are necessary for classification during dark cycles. Production of templates resolves initial positional uncertainty using multiple imaging cycles to avoid missing dark signals.
  • the physical, sensor pixel is a region of an optical sensor that reports detected photons.
  • a logical pixel simply referred to as a pixel, is data corresponding to at least one physical pixel, data read from the sensor pixel.
  • the pixel can be subdivided or “upsampled” into subpixels, such as 4 x 4 subpixels.
  • values can be assigned to subpixels by interpolation, such as bilinear interpolation or area weighting. Interpolation or bilinear interpolation also is applied when pixels are re-framed by applying an affine transformation to data from physical pixels.
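One plausible way to perform such 4 x 4 upsampling is linear-spline interpolation, sketched below with scipy.ndimage.zoom (order=1 approximates bilinear interpolation); the function name and factor are illustrative:

```python
import numpy as np
from scipy.ndimage import zoom

def upsample_to_subpixels(pixel_image, factor=4):
    """Subdivide each logical pixel into factor x factor subpixels, assigning
    subpixel values by (bi)linear interpolation of the pixel-domain image."""
    return zoom(pixel_image.astype(float), factor, order=1)

# Example: a 3 x 3 pixel patch becomes a 12 x 12 subpixel patch.
patch = np.arange(9, dtype=float).reshape(3, 3)
print(upsample_to_subpixels(patch).shape)  # (12, 12)
```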
  • High resolution sensors capture only part of an imaged media at a time. The sensor is stepped over the imaged media to cover the whole field. Thousands of digital images can be collected during one processing cycle.
  • Sensor and illumination design are combined to distinguish among at least four illumination response values that are used to classify bases. If a traditional RGB camera with a Bayer color filter array were used, four sensor pixels would be combined into a single RGB value. This would reduce the effective sensor resolution by four-fold.
  • multiple images can be collected at a single position using different illumination wavelengths and/or different filters rotated into position between the imaged media and the sensor. The number of images required to distinguish among four base classifications varies between systems. Some systems use one image with four intensity levels for different classes of bases. Other systems use two images with different illumination wavelengths (red and green, for instance) and/or filters with a sort of truth table to classify bases. Systems also can use four images with different illumination wavelengths and/or filters tuned to specific base classes.
  • Massively parallel processing of digital images is practically necessary to align and combine relatively short strands, on the order of 30 to 2000 base pairs, into longer sequences, potentially millions or even billions of bases in length. Redundant samples are desirable over an imaged media, so a part of a sequence may be covered by dozens of sample reads. Millions or at least hundreds of thousands of sample clusters are imaged from a single imaged media. Massively parallel processing of so many clusters has increased sequencing capacity while decreasing cost.
  • the technology disclosed improves processing both during template generation to resolve positional uncertainty and during base classification of clusters at resolved positions. Applying the technology disclosed, less expensive hardware can be used to reduce the cost of machines. Near real time analysis can become cost effective, reducing the lag between image collection and base classification.
  • the technology disclosed can use upsampled images produced by interpolating sensor pixels into subpixels and then producing templates that resolve positional uncertainty.
  • a resulting subpixel is submitted to a base caller for classification that treats the subpixel as if it were at the center of a cluster.
  • Clusters are determined from groups of adjoining subpixels that repeatedly receive the same base classification.
  • This aspect of the technology leverages existing base calling technology to determine shapes of clusters and to hyper-locate cluster centers with a subpixel resolution.
  • Another aspect of the technology disclosed is to create ground truth training data sets that pair images with confidently determined cluster centers and/or cluster shapes.
  • Deep learning systems and other machine learning approaches require substantial training sets.
  • Human curated data is expensive to compile.
  • the technology disclosed can be used to leverage existing classifiers, in a non-standard mode of operation, to generate large sets of confidently classified training data without intervention or the expense of a human curator.
  • the training data correlates raw images with cluster centers and/or cluster shapes available from existing classifiers, in a non-standard mode of operation, such as CNN-based deep learning systems, which can then directly process image sequences.
  • One training image can be rotated and reflected to produce additional, equally valid examples.
  • Training examples can focus on regions of a predetermined size within an overall image. The context evaluated during base calling determines the size of example training regions, rather than the size of an image frame or the overall imaged media.
  • the technology disclosed can produce different types of maps, usable as training data or as templates for base classification, which correlate cluster centers and/or cluster shapes with digital images.
  • a subpixel can be classified as a cluster center, thereby localizing a cluster center within a physical sensor pixel.
  • a cluster center can be calculated as the centroid of a cluster shape. This location can be reported with a selected numeric precision.
  • a cluster center can be reported with surrounding subpixels in a decay map, either at subpixel or pixel resolution.
  • a decay map reduces weight given to photons detected in regions as separation of the regions from the cluster center increases, attenuating signals from more distant positions.
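A minimal sketch of such a decay map follows; the inverse-distance attenuation used here is an assumed functional form, since the description above only requires that weight fall off with Euclidean distance from the cluster center:

```python
import numpy as np

def decay_map(cluster_ids, centers, max_value=1.0):
    """Build a decay map over a subpixel grid.

    cluster_ids : (H, W) int array; 0 marks background, k > 0 marks cluster k
    centers     : dict mapping cluster id -> (row, col) center of mass
    Each subpixel of a cluster gets a weight that attenuates with its
    Euclidean distance from that cluster's center; background stays zero.
    """
    rows, cols = np.indices(cluster_ids.shape)
    out = np.zeros(cluster_ids.shape, dtype=float)
    for k, (cr, cc) in centers.items():
        mask = cluster_ids == k
        dist = np.hypot(rows[mask] - cr, cols[mask] - cc)
        out[mask] = max_value / (1.0 + dist)  # assumed attenuation curve
    return out
```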
  • binary or ternary classifications can be applied to subpixels or pixels in clusters of adjoining regions.
  • a region is classified as belonging to a cluster center or as background.
  • the third class type is assigned to the region that contains the cluster interior, but not the cluster center.
  • Subpixel classification of cluster center locations could be substituted for real-valued cluster center coordinates within a larger optical pixel.
  • the alternative styles of maps can initially be produced as ground truth data sets, or, with training, they can be produced using a neural network. For instance, clusters can be depicted as disjoint regions of adjoining subpixels with appropriate classifications. Intensity-mapped clusters from a neural network can be post-processed by a peak detector filter, to calculate cluster centers, if the centers have not already been determined. Applying a so-called watershed analysis, abutting regions can be assigned to separate clusters. When produced by a neural network inference engine, the maps can be used as templates for evaluating a sequence of digital images and classifying bases over cycles of base calling.
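The post-processing chain named above (peak detection followed by watershed analysis) can be sketched with standard scikit-image primitives; the threshold and minimum peak distance are illustrative parameters:

```python
import numpy as np
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def separate_clusters(decay_map, threshold=0.5):
    """Threshold a decay/intensity map, detect peaks as cluster centers, and
    apply watershed analysis so abutting regions go to separate clusters."""
    foreground = decay_map > threshold
    # Peak detection: candidate cluster centers within the foreground.
    peaks = peak_local_max(decay_map, min_distance=2,
                           labels=foreground.astype(int))
    markers = np.zeros(decay_map.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # Flooding the negated map grows one basin outward from each peak.
    return watershed(-decay_map, markers, mask=foreground)
```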
  • the neural network processes multiple image channels in a current cycle together with image channels of past and future cycles.
  • some of the strands may run ahead or behind the main course of synthesis, which out-of-phase tagging is known as pre-phasing or phasing.
  • Reframing image data means interpolating image data, typically by applying an affine transformation. Reframing can put a cluster center of interest in the middle of the center pixel of a pixel patch.
  • Reframing involves adjusting intensity values of all pixels in the pixel patch.
  • Bi-linear and bi-cubic interpolation and weighted area adjustments are alternative strategies.
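A sketch of this reframing, assuming scipy-style bilinear shifting and hypothetical (row, column) coordinate conventions with subpixel precision, is:

```python
import numpy as np
from scipy.ndimage import shift

def reframe_patch(image, cluster_center, patch_size=15):
    """Reframe a patch so the cluster center of interest sits in the middle of
    the center pixel. The fractional offset between the cluster center and the
    nearest pixel center is removed by shifting the whole image with order-1
    (bilinear) interpolation, which adjusts every pixel's intensity value."""
    target = np.round(np.asarray(cluster_center, dtype=float))  # nearest pixel
    dy, dx = target[0] - cluster_center[0], target[1] - cluster_center[1]
    shifted = shift(image.astype(float), (dy, dx), order=1, mode="nearest")
    half = patch_size // 2
    r, c = int(target[0]), int(target[1])
    return shifted[r - half:r + half + 1, c - half:c + half + 1]
```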
  • cluster center coordinates can be fed to a neural network as an additional image channel.
  • Distance signals also can contribute to base classification. Several types of distance signals reflect separation of regions from cluster centers. The strongest optical signal is deemed to coincide with the cluster center. The optical signal along the cluster perimeter sometimes includes a stray signal from a nearby cluster. Classification has been observed to be more accurate when contribution of a signal component is attenuated according to its separation from the cluster center.
  • Distance signals that work include a single cluster distance channel, a multi-cluster distance channel, and a multi-cluster shape-based distance channel. A single cluster distance channel applies to a patch with a cluster center in the center pixel. Then, distance of all regions in the patch is a distance from the cluster center in the center pixel.
  • a multi-cluster distance channel pre-calculates distance of each region to the closest cluster center. This has the potential of connecting a region to the wrong cluster center, but that potential is low.
  • a multi-cluster shape-based distance channel associates regions (sub-pixels or pixels) through adjoining regions to a pixel center that produces a same base classification. At some computational expense, this avoids the possibility of measuring a distance to the wrong pixel.
  • the multi-cluster and multi-cluster shape-based approaches to distance signals have the advantage of being subject to pre-calculation and use with multiple clusters in an image.
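As one concrete example, the multi-cluster distance channel can be pre-calculated for a whole patch in a few lines; the names and array shapes below are assumptions:

```python
import numpy as np

def multi_cluster_distance_channel(shape, cluster_centers):
    """For every pixel (or subpixel) position in a patch of the given shape,
    pre-calculate the Euclidean distance to the closest cluster center.

    cluster_centers : (N, 2) array of (row, col) center coordinates
    Returns an (H, W) distance channel that can be fed alongside the image.
    """
    rows, cols = np.indices(shape)
    coords = np.stack([rows, cols], axis=-1).astype(float)      # (H, W, 2)
    centers = np.asarray(cluster_centers, dtype=float)          # (N, 2)
    dists = np.linalg.norm(coords[:, :, None, :] - centers[None, None], axis=-1)
    return dists.min(axis=-1)                                   # nearest center
```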
  • Shape information can be used by a neural network to separate signal from noise, to improve the signal-to-noise ratio.
  • regions can be marked as background, as not being part of a cluster, to define cluster edges.
  • a neural network can be trained to take advantage of the resulting information about irregular cluster shapes.
  • Distance information and background classification can be combined or used separately. Separating signals from abutting clusters will be increasingly important as cluster density increases.
  • One direction for increasing the scale of parallel processing is to increase cluster density on the imaged media.
  • Increasing density has the downside of increasing background noise when reading a cluster that has an adjacent neighbor.
  • base classification scores also can be leveraged to predict quality.
  • An advantage of real time calculation of quality scores during base classification is that a flawed sequencing run can be terminated early. Applicant has found that occasional (rare) decisions to terminate runs can be made one-eighth to one-quarter of the way through the analysis sequence. A decision to terminate can be made after 50 cycles or after 25 to 75 cycles. In a sequential process that would otherwise run 300 to 1000 cycles, early termination results in substantial resource savings.
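A possible early-termination rule along these lines is sketched below; the 25-cycle window and the quality floor are assumptions used only for illustration:

```python
def should_terminate_early(mean_q_by_cycle, cycle, min_cycles=25, q_floor=20.0):
    """Decide whether to stop a flawed run early. After at least `min_cycles`
    cycles (within the 25-75 cycle decision window mentioned above), terminate
    if the mean predicted quality over the most recent cycles falls below a
    floor; otherwise let the 300-1000 cycle run continue."""
    if cycle < min_cycles:
        return False
    recent = mean_q_by_cycle[cycle - min_cycles:cycle]
    return sum(recent) / len(recent) < q_floor
```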
  • Specialized convolutional neural network (CNN) architectures can be used to classify bases over multiple cycles.
  • One specialization involves segregation among digital image channels during initial layers of processing.
  • Convolution filter stacks can be structured to segregate processing among cycles, preventing cross-talk between digital image sets from different cycles.
  • the motivation for segregating processing among cycles is that images taken at different cycles have residual registration error and are thus misaligned and have random translational offsets with respect to each other. This occurs due to the finite accuracy of the movements of the sensor's motion stage and also because images taken in different frequency channels have different optical paths and wavelengths.
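The segregation can be pictured as one independent spatial convolution per cycle, with cross-cycle mixing deferred to a later combinatory step; the sketch below uses a shared 2-D filter and per-cycle weights purely as stand-ins:

```python
import numpy as np
from scipy.signal import convolve2d

def segregated_then_combined(per_cycle_images, spatial_filter, cycle_weights):
    """Segregated convolutions: convolve each sequencing cycle's image on its
    own, so misregistered cycles do not cross-talk in the early layers; only
    afterwards combine the per-cycle feature maps across cycles."""
    per_cycle_features = [convolve2d(img, spatial_filter, mode="same")
                          for img in per_cycle_images]          # no cross-talk
    # Combinatory step: information is mixed across cycles only here.
    return sum(w * f for w, f in zip(cycle_weights, per_cycle_features))
```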
  • the convolutional neural network structure also can be specialized in handling information about clustering. Templates for cluster centers and/or shapes provide additional information, which the convolutional neural network combines with the digital image data. The cluster center classification and distance data can be applied repeatedly across cycles.
  • the convolutional neural network can be structured to classify multiple clusters in an image field.
  • the distance channel for a pixel or subpixel can more compactly contain distance information relative to either the closest cluster center or to the adjoining cluster center, to which a pixel or subpixel belongs.
  • a large distance vector could be supplied for each pixel or subpixel, or at least for each one that contains a cluster center, which gives complete distance information from a cluster center to all other pixels that are context for the given pixel.
  • Performing base classification in the pixel domain has the advantage of not calling for an increase in calculations, such as 16 fold, which results from upsampling.
  • even the top layer of convolutions may have sufficient cluster density to justify performing calculations that would not be harvested, instead of adding logic to cancel unneeded calculations.
  • classification focuses on a particular cluster.
  • pixels on the perimeter of a cluster may have different modified intensity values, depending on which adjoining cluster is the focus of classification.
  • the template image in the subpixel domain can indicate that an overlap pixel contributes intensity value to two different clusters.
  • we refer to an optical pixel as an “overlap pixel” when two or more adjacent or abutting clusters both overlap the pixel; both contribute to the intensity reading from the optical pixel.
  • Watershed analysis, named after separating rain flows into different watersheds at a ridge line, can be applied to separate even abutting clusters.
  • the template image can be used to modify intensity data for overlap pixels along the perimeter of clusters.
  • the overlap pixels can have different modified intensities, depending on which cluster is the focus of classification.
  • the modified intensity of a pixel can be reduced based on subpixel contribution in the overlap pixel to a home cluster (i.e., the cluster to which the pixel belongs or the cluster whose intensity emissions the pixel primarily depicts), as opposed to an away cluster (i.e., the non-home cluster whose intensity emissions the pixel depicts).
  • intensity is reduced by 5/16, based on the area of subpixels contributing to the home cluster divided by the total number of subpixels.
  • intensity is reduced by 5/7, based on the area of subpixels contributing to the home cluster divided by the total area of contributing subpixels. The latter two calculations change when the focus turns to the away cluster, producing fractions with 2 in the numerator.
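Those fractions can be reproduced with a small helper; treating the fraction as the share of intensity assigned to the cluster in focus is an interpretation, and the counts (5 home subpixels, 2 away subpixels in a 4 x 4 grid) come from the example above:

```python
def area_weighted_intensity(pixel_intensity, home_subpixels, away_subpixels,
                            subpixels_per_pixel=16, denominator="all"):
    """Area weighting for an overlap pixel shared by a home and an away cluster.

    denominator="all"          -> home_subpixels / subpixels_per_pixel (e.g., 5/16)
    denominator="contributing" -> home_subpixels / (home + away)       (e.g., 5/7)
    Swapping focus to the away cluster swaps the numerator (e.g., 2/16 or 2/7).
    """
    if denominator == "all":
        weight = home_subpixels / subpixels_per_pixel
    else:
        weight = home_subpixels / (home_subpixels + away_subpixels)
    return pixel_intensity * weight

# Example from the text: 5 of 16 subpixels belong to the home cluster, 2 to the away cluster.
print(area_weighted_intensity(100.0, 5, 2))                              # 31.25 (5/16)
print(area_weighted_intensity(100.0, 5, 2, denominator="contributing"))  # ~71.4 (5/7)
```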
  • the modified pixel values are convolved through layers of a neural network-based classifier to produce modified images.
  • the modified images are used to classify bases in successive sequencing cycles.
  • classification in the pixel domain can proceed in parallel for all pixels or all clusters in a chunk of an image. Only one modification of a pixel value can be applied in this scenario to assure reusability of intermediate calculations. Any of the fractions given above can be used to modify pixel intensity, depending on whether a smaller or larger attenuation of intensity is desired.
  • pixels and surrounding context can be convolved through layers of a neural network-based classifier to produce modified images.
  • Performing convolutions on an image chunk allows reuse of intermediate calculations among pixels that have shared context.
  • the modified images are used to classify bases in successive sequencing cycles.
  • This description can be paralleled for application of area weights in the subpixel domain.
  • the parallel is that weights can be calculated for individual subpixels.
  • the weights can, but do not need to, be the same for different subpixel parts of an optical pixel.
  • the assignment of intensity to a subpixel belonging to the home cluster can be 7/16, 5/16, or 5/7 of the pixel intensity. Again, further reduction in intensity can be applied if a distance channel is being considered along with a subpixel map of cluster shapes.
  • subpixel intensities for the image chunk have been modified using the template image
  • subpixels and surrounding context can be convolved through layers of a neural network-based classifier to produce modified images.
  • Performing convolutions on an image chunk allows reuse of intermediate calculations among subpixels that have shared context.
  • the modified images are used to classify bases in successive sequencing cycles.
  • Another alternative is to apply the template image as a binary mask, in the subpixel domain, to image data interpolated into the subpixel domain.
  • the template image can either be arranged to require a background pixel between clusters or to allow subpixels from different clusters to abut.
  • the template image can be applied as a mask. The mask determines whether an interpolated pixel keeps the value assigned by interpolation or receives a background value (e.g., zero), if it is classified in the template image as background.
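A minimal sketch of the masking alternative just described, assuming the template is supplied as a subpixel-resolution array with zero marking background; the function name and the encoding are illustrative only.

```python
import numpy as np

def mask_interpolated_image(interp_subpixel_image, template_mask, background=0.0):
    """Apply a binary template mask to image data interpolated into the subpixel domain.

    interp_subpixel_image: (H, W) array of interpolated intensities.
    template_mask: (H, W) array, nonzero where a subpixel belongs to a cluster,
                   zero where the template classifies it as background (assumed encoding).
    Subpixels classified as background receive the background value (e.g., zero).
    """
    return np.where(np.asarray(template_mask) != 0, interp_subpixel_image, background)
```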
  • subpixels and surrounding context can be convolved through layers of a neural network-based classifier to produce modified images.
  • Performing convolutions on an image chunk allows reuse of intermediate calculations among subpixels that have shared context.
  • the modified images are used to classify bases in successive sequencing cycles.
  • Features of the technology disclosed are combinable to classify an arbitrary number of clusters within a shared context, reusing intermediate calculations.
  • At optical pixel resolution in one implementation, about ten percent of pixels hold cluster centers to be classified.
  • three-by-three optical pixels were grouped for analysis as potential signal contributors for a cluster center, given observation of irregularly shaped clusters. Even one 3-by-3 filter away from the top convolution layer, the roll-up is likely to aggregate, into pixels at cluster centers, optical signals from substantially more than half of the optical pixels. Only at supersampled resolution does cluster center density for the top convolution layer drop below one percent.
  • Shared context is substantial in some implementations. For instance, a 15-by-15 optical pixel context may contribute to accurate base classification. An equivalent 4x upsampled context would be 60-by-60 subpixels. This extent of context helps the neural network recognize impacts of non-uniform illumination and background during imaging.
  • the technology disclosed uses small filters at a lower convolution layer to combine cluster boundaries in template input with boundaries detected in digital image input. Cluster boundaries help the neural network separate signal from background conditions and normalize image processing against the background.
  • the technology disclosed substantially reuses intermediate calculations. Suppose that 20 to 25 cluster centers appear within a context area of 15-by-15 optical pixels. Then, first layer convolutions stand to be reused 20 to 25 times in blockwise convolution roll-ups. The reuse factor is reduced layer-by-layer until the penultimate layer, which is the first time that the reuse factor at optical resolution drops below 1x.
  • Blockwise roll-up training and inference from multiple convolution layers applies successive roll-ups to a block of pixels or subpixels.
  • an overlap zone in which data used during roll-up of a first data block overlaps with and can be reused for a second block of roll-ups.
  • Within the block, in a center area surrounded by the overlap zone, are pixel values and intermediate calculations that can be rolled up and reused.
  • convolution results that progressively reduce the size of a context field, for instance from 15-by-15 to 13-by-13 by application of a 3-by-3 filter, can be written into the same memory block that holds the values convolved, conserving memory without impairing reuse of underlying calculations within the block.
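The following sketch, using placeholder filters and a single channel, illustrates how successive 3-by-3 "valid" convolutions over a shared block shrink the context field (15-by-15 to 13-by-13 and so on) while every cluster center in the block reuses the same intermediate results; it is a shape-level illustration only, not the disclosed network.

```python
import numpy as np
from scipy.signal import convolve2d

def blockwise_rollup(block, filters):
    """Apply successive 3-by-3 'valid' convolutions to a shared block of pixels.

    Each layer shrinks the field by two in each dimension, and every intermediate
    value is computed once per block and shared by all cluster centers whose
    context windows cover it. Channels, nonlinearities, and learned weights are
    omitted; the filters here are placeholders.
    """
    out = block
    for f in filters:
        out = convolve2d(out, f, mode="valid")
    return out

block = np.random.rand(15, 15)                      # shared context for many cluster centers
filters = [np.random.rand(3, 3) for _ in range(7)]
print(blockwise_rollup(block, filters).shape)       # (1, 1) after seven 3-by-3 layers
```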
  • sharing intermediate calculations in the overlap zone requires less resources. With smaller blocks, it can be possible to calculate multiple blocks in parallel, to share the intermediate calculations in the overlap zones.
  • the input channels for template data can be chosen to make the template structure consistent with classifying multiple cluster centers in a digital image field.
  • Two alternatives described above do not satisfy this consistency criterion: reframing and distance mapping over an entire context. Reframing places the center of just one cluster in the center of an optical pixel. Better for classifying multiple clusters is supplying center offsets for pixels classified as holding cluster centers.
  • Distance mapping, if provided, is difficult to perform across a whole context area unless every pixel has its own distance map over a whole context. Simpler distance maps provide the useful consistency for classifying multiple clusters from a digital image input block.
  • a neural network can learn from classification in a template of pixels or subpixels at the boundary of a cluster, so a distance channel can be supplanted by a template that supplies binary or ternary classification, accompanied by a cluster center offset channel.
  • a distance map can give a distance of a pixel from a cluster center to which the pixel (or subpixel) belongs. Or the distance map can give a distance to the closest cluster center.
  • the distance map can encode binary classification with a flag value assigned to background pixels or it can be a separate channel from pixel classification. Combined with cluster center offsets, the distance map can encode ternary classification. In some implementations, particularly ones that encode pixel classifications with one or two bits, it may be desirable, at least during development, to use separate channels for pixel classification and for distance.
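A minimal sketch of the two distance map variants described above (distance to the "home" cluster center versus distance to the closest cluster center), assuming a subpixel-resolution cluster map with integer labels and a dictionary of center coordinates; the names and the background flag value are illustrative.

```python
import numpy as np

def distance_channel(cluster_map, centers, mode="home", background_flag=0.0):
    """Build a per-subpixel distance channel (illustrative only).

    cluster_map: (H, W) array of cluster ids, 0 = background (assumed encoding).
    centers: dict mapping cluster id -> (row, col) center coordinates.
    mode "home":    distance to the center of the cluster the subpixel belongs to;
                    background subpixels receive the flag value.
    mode "closest": distance to the nearest cluster center for every subpixel.
    """
    H, W = cluster_map.shape
    rows, cols = np.indices((H, W))
    if mode == "closest":
        stacked = [np.hypot(rows - cy, cols - cx) for cy, cx in centers.values()]
        return np.min(stacked, axis=0)
    out = np.full((H, W), background_flag, dtype=float)
    for cid, (cy, cx) in centers.items():
        mask = cluster_map == cid
        out[mask] = np.hypot(rows[mask] - cy, cols[mask] - cx)
    return out
```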
  • the technology disclosed can include reduction of calculations to save some calculation resources in upper layers.
  • the cluster center offset channel or a ternary classification map can be used to identify centers of pixel convolutions that do not contribute to an ultimate classification of a pixel center.
  • performing a lookup during inference and skipping a convolution roll-up can be more efficient in upper layer(s) than performing even nine multiplies and eight adds to apply a 3-by-3 filter.
  • every pixel can be classified within the pipeline.
  • the cluster center map can be used after the final convolution to harvest results for only pixels that coincide with cluster centers, because an ultimate classification is only desired for those pixels.
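A small illustration of harvesting results only at cluster-center pixels after the final convolution, assuming a hypothetical score array and a boolean cluster center map; the shapes and base ordering are placeholders.

```python
import numpy as np

# Hypothetical final-layer output: per-pixel scores for the four bases.
scores = np.random.rand(13, 13, 4)

# Cluster center map from the template: True only where a pixel holds a cluster center.
center_mask = np.zeros((13, 13), dtype=bool)
center_mask[[2, 5, 9], [3, 7, 1]] = True

# Harvest base calls only for cluster-center pixels; all other results are discarded.
bases = "ACTG"
calls = [bases[i] for i in scores[center_mask].argmax(axis=1)]
print(calls)   # one call per cluster center
```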
  • the first step of template generation is determining cluster metadata.
  • Cluster metadata identifies spatial distribution of clusters, including their centers, shapes, sizes, background, and/or boundaries.

Determining Cluster Metadata
  • Figure 1 shows one implementation of a processing pipeline that determines cluster metadata using subpixel base calling.
  • Figure 2 depicts one implementation of a flow cell that contains clusters in its tiles.
  • the flow cell is partitioned into lanes.
  • the lanes are further partitioned into non-overlapping regions called “tiles”.
  • the clusters and their surrounding background on the tiles are imaged.
  • Figure 3 illustrates an example Illumina GA-IIx™ flow cell with eight lanes. Figure 3 also shows a zoom-in on one tile and its clusters and their surrounding background.
  • Figure 4 depicts an image set of sequencing images for four-channel chemistry, i.e., the image set has four sequencing images, captured using four different wavelength bands (image/imaging channel) in the pixel domain.
  • Each image in the image set covers a tile of a flow cell and depicts intensity emissions of clusters on the tile and their surrounding background captured for a particular image channel at a particular one of a plurality of sequencing cycles of a sequencing run performed on the flow cell.
  • each imaged channel corresponds to one of a plurality of filter wavelength bands.
  • each imaged channel corresponds to one of a plurality of imaging events at a sequencing cycle.
  • each imaged channel corresponds to a combination of illumination with a specific laser and imaging through a specific optical filter.
  • the intensity emissions of a cluster comprise signals detected from an analyte that can be used to classify a base associated with the analyte.
  • the intensity emissions may be signals indicative of photons emitted by tags that are chemically attached to an analyte during a cycle when the tags are stimulated and that may be detected by one or more digital sensors, as described above.
  • Figure 5 is one implementation of dividing a sequencing image into subpixels (or subpixel regions).
  • quarter (0.25) subpixels are used, which results in each pixel in the sequencing image being divided into sixteen subpixels.
  • the illustrated sequencing image has a resolution of 20 x 20 pixels, i.e., 400 pixels, so the division produces 6400 subpixels.
  • Each of the subpixels is treated by a base caller as a region center for subpixel base calling. In some implementations, this base caller does not use neural network-based processing. In other implementations, this base caller is a neural network-based base caller.
  • the base caller is configured with logic to produce a base call for a particular subpixel at a given sequencing cycle by performing image processing steps and extracting intensity data for the subpixel from the corresponding image set of the sequencing cycle. This is done for each of the subpixels and for each of a plurality of sequencing cycles. Experiments have also been carried out with quarter subpixel division of 1800 x 1800 pixel resolution tile images of the Illumina MiSeq sequencer. Subpixel base calling was performed for fifty sequencing cycles and for ten tiles of a lane.
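The subpixel addressing can be pictured with the short sketch below, which enumerates quarter-subpixel center coordinates for a 20 x 20 pixel image; the function is illustrative only, and the non-integer coordinates shown are one possible way of identifying subpixels to a base caller.

```python
def quarter_subpixel_centers(width_px, height_px, factor=4):
    """Enumerate subpixel center coordinates for quarter (0.25) subpixel division.

    A 20 x 20 pixel image divided 4x in each dimension yields 80 x 80 = 6400
    subpixels; each subpixel center is a non-integer location in pixel coordinates.
    """
    step = 1.0 / factor
    return [(col * step + step / 2.0, row * step + step / 2.0)
            for row in range(height_px * factor)
            for col in range(width_px * factor)]

centers = quarter_subpixel_centers(20, 20)
print(len(centers), centers[0], centers[-1])   # 6400 (0.125, 0.125) (19.875, 19.875)
```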
  • Figure 6 shows preliminary center coordinates of the clusters identified by the base caller during the subpixel base calling.
  • Figure 6 also shows “origin subpixels” or “center subpixels” that contain the preliminary center coordinates.
  • Figure 7 depicts one example of merging subpixel base calls produced over the plurality of sequencing cycles to generate the so-called “cluster maps” that contain the cluster metadata.
  • the subpixel base calls are merged using a breadth-first search approach.
  • Figure 8a illustrates one example of a cluster map generated by the merging of the subpixel base calls.
  • Figure 8b depicts one example of subpixel base calling.
  • Figure 8b also shows one implementation of analyzing subpixel-wise base call sequences produced from the subpixel base calling to generate a cluster map.
  • Cluster metadata determination involves analyzing image data produced by a sequencing instrument 102 (e.g., Illumina's iSeq, HiSeqX, HiSeq3000, HiSeq4000, HiSeq2500, NovaSeq 6000, NextSeq, NextSeqDx, MiSeq and MiSeqDx).
  • Base calling is the process in which the raw signal of the sequencing instrument 102, i.e., intensity data extracted from images, is decoded into DNA sequences and quality scores.
  • the Illumina platforms employ cyclic reversible termination (CRT) chemistry for base calling.
  • the process relies on growing nascent DNA strands complementary to template DNA strands with modified nucleotides, while tracking the emitted signal of each newly added nucleotide.
  • the modified nucleotides have a 3’ removable block that anchors a fluorophore signal of the nucleotide type.
  • Sequencing occurs in repetitive cycles, each comprising three steps: (a) extension of a nascent strand by adding a modified nucleotide; (b) excitation of the fluorophores using one or more lasers of the optical system 104 and imaging through different filters of the optical system 104, yielding sequencing images 108; and (c) cleavage of the fluorophores and removal of the 3’ block in preparation for the next sequencing cycle. Incorporation and imaging cycles are repeated up to a designated number of sequencing cycles, defining the read length of all clusters. Using this approach, each cycle interrogates a new position along the template strands.
  • the tremendous power of the Illumina platforms stems from their ability to simultaneously execute and sense millions or even billions of clusters undergoing CRT reactions.
  • the sequencing process occurs in a flow cell 202 - a small glass slide that holds the input DNA fragments during the sequencing process.
  • the flow cell 202 is connected to the high-throughput optical system 104, which comprises microscopic imaging, excitation lasers, and fluorescence filters.
  • the flow cell 202 comprises multiple chambers called lanes 204.
  • the lanes 204 are physically separated from each other and may contain different tagged sequencing libraries, distinguishable without sample cross contamination.
  • the imaging device 106 (e.g., a solid-state imager such as a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) sensor)
  • a tile 206 holds hundreds of thousands to millions of clusters.
  • An image generated from a tile with clusters shown as bright spots is shown at 208.
  • a cluster 302 comprises approximately one thousand identical copies of a template molecule, though clusters vary in size and shape.
  • the clusters are grown from the template molecule, prior to the sequencing run, by bridge amplification of the input library.
  • the purpose of the amplification and cluster growth is to increase the intensity of the emitted signal since the imaging device 106 cannot reliably sense a single fluorophore.
  • the physical distance between the DNA fragments within a cluster 302 is small, so the imaging device 106 perceives the cluster of fragments as a single spot 302.
  • the output of a sequencing run is the sequencing images 108, each depicting intensity emissions of clusters on the tile in the pixel domain for a specific combination of lane, tile, sequencing cycle, and fluorophore (208A, 208C, 208T, 208G).
  • a biosensor comprises an array of light sensors.
  • a light sensor is configured to sense information from a corresponding pixel area (e.g., a reaction site/well/nanowell) on the detection surface of the biosensor.
  • An analyte disposed in a pixel area is said to be associated with the pixel area, i.e., the associated analyte.
  • the light sensor corresponding to the pixel area is configured to detect/capture/sense emissions/photons from the associated analyte and, in response, generate a pixel signal for each imaged channel.
  • each imaged channel corresponds to one of a plurality of filter wavelength bands.
  • each imaged channel corresponds to one of a plurality of imaging events at a sequencing cycle.
  • each imaged channel corresponds to a combination of illumination with a specific laser and imaging through a specific optical filter.
  • Pixel signals from the light sensors are communicated to a signal processor coupled to the biosensor (e.g., via a communication port). For each sequencing cycle and each imaged channel, the signal processor produces an image whose pixels respectively depict the pixel signals generated by the light sensors.
  • a pixel in the image corresponds to: (i) a light sensor of the biosensor that generated the pixel signal depicted by the pixel, (ii) an associated analyte whose emissions were detected by the corresponding light sensor and converted into the pixel signal, and (iii) a pixel area on the detection surface of the biosensor that holds the associated analyte.
  • Pixels in the red and green images have one-to-one correspondence within a sequencing cycle. This means that corresponding pixels in a pair of the red and green images depict intensity data for the same associated analyte, albeit in different imaged channels. Similarly, pixels across the pairs of red and green images have one-to-one correspondence between the sequencing cycles. This means that corresponding pixels in different pairs of the red and green images depict intensity data for the same associated analyte, albeit for different acquisition events/timesteps (sequencing cycles) of the sequencing run.
  • Corresponding pixels in the red and green images can be considered a pixel of a“per-cycle image” that expresses intensity data in a first red channel and a second green channel.
  • a per-cycle image whose pixels depict pixel signals for a subset of the pixel areas, i.e., a region (tile) of the detection surface of the biosensor, is called a“per-cycle tile image.”
  • a patch extracted from a per-cycle tile image is called a“per-cycle image patch.”
  • the patch extraction is performed by an input preparer.
  • the image data comprises a sequence of per-cycle image patches generated for a series of k sequencing cycles of a sequencing run.
  • the pixels in the per-cycle image patches contain intensity data for associated analytes and the intensity data is obtained for one or more imaged channels (e.g., a red channel and a green channel) by corresponding light sensors configured to detect emissions from the associated analytes.
  • the per-cycle image patches are centered at a center pixel that contains intensity data for a target associated analyte and non-center pixels in the per-cycle image patches contain intensity data for associated analytes adjacent to the target associated analyte.
  • the image data is prepared by an input preparer.
  • the technology disclosed accesses a series of image sets generated during a sequencing run.
  • the image sets comprise the sequencing images 108.
  • Each image set in the series is captured during a respective sequencing cycle of the sequencing run.
  • Each image (or sequencing image) in the series captures clusters on a tile of a flow cell and their surrounding background.
  • the sequencing run utilizes four-channel chemistry and each image set has four images. In another implementation, the sequencing run utilizes two-channel chemistry and each image set has two images. In yet another implementation, the sequencing run utilizes one-channel chemistry and each image set has two images. In yet other implementations, each image set has only one image.
  • the sequencing images 108 in the pixel domain are first converted into the subpixel domain by a subpixel addresser 110 to produce sequencing images 112 in the subpixel domain.
  • each pixel in the sequencing images 108 is divided into sixteen subpixels 502.
  • the subpixels 502 are quarter subpixels.
  • the subpixels 502 are half subpixels.
  • each of the sequencing images 112 in the subpixel domain has a plurality of subpixels 502.
  • the subpixels are then separately fed as input to a base caller 114 to obtain, from the base caller 114, a base call classifying each of the subpixels as one of four bases (A, C, T, and G).
  • the subpixels 502 are identified to the base caller 114 based on their integer or non-integer coordinates. By tracking the emission signal from the subpixels 502 across image sets generated during the plurality of sequencing cycles, the base caller 114 recovers the underlying DNA sequence for each subpixel. An example of this is illustrated in Figure 8b.
  • the technology disclosed obtains, from the base caller 114, the base call classifying each of the subpixels as one of five bases (A, C, T, G, and N).
  • An N base call denotes an undecided base call, usually due to low levels of extracted intensity.
  • Some examples of the base caller 114 include non-neural network-based Illumina offerings such as the RTA (Real Time Analysis), the Firecrest program of the Genome Analyzer Analysis Pipeline, the IPAR (Integrated Primary Analysis and Reporting) machine, and the OLB (Off-Line Basecaller).
  • the base caller 114 produces the base call sequences by interpolating intensity of the subpixels, including at least one of nearest neighbor intensity extraction, Gaussian based intensity extraction, intensity extraction based on average of 2 x 2 subpixel area, intensity extraction based on brightest of 2 x 2 subpixel area, intensity extraction based on average of 3 x 3 subpixel area, bilinear intensity extraction, bicubic intensity extraction, and/or intensity extraction based on weighted area coverage.
  • the base caller 114 can be a neural network-based base caller, such as the neural network-based base caller 1514 disclosed herein.
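As one example of the intensity extraction options listed above, the following sketch implements bilinear interpolation at a non-integer subpixel location; it is illustrative only and not the base caller's actual extraction code.

```python
import numpy as np

def bilinear_intensity(image, x, y):
    """Extract intensity at a non-integer (subpixel) location by bilinear interpolation."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, image.shape[1] - 1)
    y1 = min(y0 + 1, image.shape[0] - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1 - fy) * top + fy * bottom
```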
  • the subpixel-wise base call sequences 116 are then fed as input to a searcher 118.
  • the searcher 118 searches for substantially matching base call sequences of contiguous subpixels.
  • the searcher 118 then generates a cluster map 802 that identifies clusters as disjointed regions, e.g., 804a-d, of contiguous subpixels that share a substantially matching base call sequence.
  • This application uses “disjointed”, “disjoint”, and “non-overlapping” interchangeably.
  • the search involves base calling the subpixels that contain parts of clusters to allow linking the called subpixels to contiguous subpixels with which they share a substantially matching base call sequence.
  • the searcher 118 requires that at least some of the disjointed regions have a predetermined minimum number of subpixels (e.g., more than 4, 6, or 10 subpixels) to be processed as a cluster.
  • the base caller 114 also identifies preliminary center coordinates of the clusters. Subpixels that contain the preliminary center coordinates are referred to as origin subpixels. Some example preliminary center coordinates (604a-c) identified by the base caller 114 and corresponding origin subpixels (606a-c) are shown in Figure 6. However, identification of the origin subpixels (preliminary center coordinates of the clusters) is not needed, as explained below.
  • the searcher 118 uses breadth-first search for identifying substantially matching base call sequences of the subpixels by beginning with the origin subpixels 606a-c and continuing with successively contiguous non-origin subpixels 702a-c. This again is optional, as explained below.
  • Figure 8a illustrates one example of a cluster map 802 generated by the merging of the subpixel base calls.
  • the cluster map identifies a plurality of disjointed regions (depicted in various colors in Figure 8a).
  • Each disjointed region comprises a non-overlapping group of contiguous subpixels that represents a respective cluster on a tile (from whose sequencing images and for which the cluster map is generated via the subpixel base calling).
  • the region between the disjointed regions represents the background on the tile.
  • the subpixels in the background region are called “background subpixels”.
  • the subpixels in the disjointed regions are called “cluster subpixels” or “cluster interior subpixels”.
  • origin subpixels are those subpixels in which the preliminary center cluster coordinates determined by the RTA or another base caller are located.
  • the origin subpixels contain the preliminary center cluster coordinates. This means that the area covered by an origin subpixel includes a coordinate location that coincides with a preliminary center cluster coordinate location. Since the cluster map 802 is an image of logical subpixels, the origin subpixels are some of the subpixels in the cluster map.
  • the search to identify clusters with substantially matching base call sequences of the subpixels does not need to begin with identification of the origin subpixels (preliminary center coordinates of the clusters) because the search can be done for all the subpixels and can start from any subpixel (e.g., 0,0 subpixel or any random subpixel).
  • since each subpixel is evaluated to determine whether it shares a substantially matching base call sequence with another contiguous subpixel, the search does not depend on origin subpixels; the search can start with any subpixel.
  • whether origin subpixels are used or not, certain clusters are identified that do not contain the origin subpixels (preliminary center coordinates of the clusters) predicted by the base caller 114.
  • Some examples of clusters identified by the merging of the subpixel base calls and not containing an origin subpixel are clusters 812a, 812b, 812c, 812d, and 812e in Figure 8a.
  • the technology disclosed identifies additional or extra clusters for which the centers may not have been identified by the base caller 114. Therefore, use of the base caller 114 for identification of origin subpixels (preliminary center coordinates of the clusters) is optional and not essential for the search of substantially matching base call sequences of contiguous subpixels.
  • the origin subpixels (preliminary center coordinates of the clusters) identified by the base caller 114 are used to identify a first set of clusters (by identification of substantially matching base call sequences of contiguous subpixels). Then, subpixels that are not part of the first set of clusters are used to identify a second set of clusters (by identification of substantially matching base call sequences of contiguous subpixels). This allows the technology disclosed to identify additional or extra clusters for which the centers are not identified by the base caller 114. Finally, subpixels that are not part of the first and second sets of clusters are identified as background subpixels.
  • FIG 8b depicts one example of subpixel base calling.
  • each sequencing cycle has an image set with four distinct images (i.e., A, C, T, G images) captured using four different wavelength bands (image/imaging channel) and four different fluorescent dyes (one for each base).
  • pixels in images are divided into sixteen subpixels. Subpixels are then separately base called at each sequencing cycle by the base caller 114. To base call a given subpixel at a particular sequencing cycle, the base caller 114 uses intensities of the given subpixel in each of the four A, C, T, G images. For example, intensities in image regions covered by subpixel 1 in each of the four A, C, T, G images of cycle 1 are used to base call subpixel 1 at cycle 1. For subpixel 1, these image regions include the top-left one-sixteenth area of the respective top-left pixels in each of the four A, C, T, G images of cycle 1.
  • intensities in image regions covered by subpixel m in each of the four A, C, T, G images of cycle n are used to base call subpixel m at cycle n.
  • for subpixel m, these image regions include the bottom-right one-sixteenth area of the respective bottom-right pixels in each of the four A, C, T, G images of cycle n.
  • This process produces subpixel-wise base call sequences 116 across the plurality of sequencing cycles.
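A simplified sketch of per-cycle subpixel base calling: for each subpixel and cycle, intensities are extracted from the four A, C, T, G images and the brightest channel is taken as the call. A real base caller applies corrections (e.g., for cross-talk and phasing) that are omitted here; the function names are hypothetical.

```python
import numpy as np

BASES = "ACTG"

def call_subpixel(cycle_images, x, y, extract):
    """Call one subpixel at one cycle from the four A, C, T, G images of that cycle.

    cycle_images: dict mapping base letter -> 2-D image for this cycle.
    extract: intensity extraction function, e.g. bilinear interpolation above.
    Picking the brightest channel is a simplification of a real base caller.
    """
    intensities = [extract(cycle_images[b], x, y) for b in BASES]
    return BASES[int(np.argmax(intensities))]

def subpixel_call_sequences(cycles, subpixel_centers, extract):
    """Produce a base call sequence per subpixel across all sequencing cycles."""
    return {
        (x, y): "".join(call_subpixel(images, x, y, extract) for images in cycles)
        for (x, y) in subpixel_centers
    }
```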
  • the searcher 118 evaluates pairs of contiguous subpixels to determine whether they have a substantially matching base call sequence. If yes, then the pair of subpixels is stored in the cluster map 802 as belonging to a same cluster in a disjointed region. If no, then the pair of subpixels is stored in the cluster map 802 as not belonging to a same disjointed region.
  • the cluster map 802 therefore identifies contiguous sets of sub-pixels for which the base calls for the sub-pixels substantially match across a plurality of cycles. Cluster map 802 therefore uses information from multiple cycles to provide a plurality of clusters with a high confidence that each cluster of the plurality of clusters provides sequence data for a single DNA strand.
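The merging step can be sketched as a breadth-first search over 4-connected subpixels whose base call sequences substantially match, as below. The mismatch tolerance, minimum region size, and names are illustrative assumptions, not the disclosed searcher 118.

```python
from collections import deque

def merge_into_cluster_map(call_seqs, min_size=5, max_mismatches=0):
    """Merge subpixel base call sequences into a cluster map via breadth-first search.

    call_seqs: dict mapping (row, col) subpixel coordinates -> base call string across cycles.
    Returns dict mapping (row, col) -> cluster id, with 0 for background subpixels.
    """
    def substantially_match(a, b):
        return sum(x != y for x, y in zip(a, b)) <= max_mismatches

    cluster_map = {pos: 0 for pos in call_seqs}
    visited, next_id = set(), 1
    for start in call_seqs:
        if start in visited:
            continue
        region, queue = [start], deque([start])
        visited.add(start)
        while queue:
            r, c = queue.popleft()
            for nbr in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (nbr in call_seqs and nbr not in visited
                        and substantially_match(call_seqs[(r, c)], call_seqs[nbr])):
                    visited.add(nbr)
                    region.append(nbr)
                    queue.append(nbr)
        if len(region) >= min_size:            # regions below the minimum size stay background
            for pos in region:
                cluster_map[pos] = next_id
            next_id += 1
    return cluster_map
```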
  • a cluster metadata generator 122 then processes the cluster map 802 to determine cluster metadata, including determining spatial distribution of clusters, including their centers (810a), shapes, sizes, background, and/or boundaries based on the disjointed regions (Figure 9).
  • the cluster metadata generator 122 identifies as background those subpixels in the cluster map 802 that do not belong to any of the disjointed regions and therefore do not contribute to any clusters. Such subpixels are referred to as background subpixels 806a-c.
  • the cluster map 802 identifies cluster boundary portions 808a-c between two contiguous subpixels whose base call sequences do not substantially match.
  • the cluster map is stored in memory (e.g., cluster maps data store 120) for use as ground truth for training a classifier such as the neural network-based template generator 1512 and the neural network-based base caller 1514.
  • the cluster metadata can also be stored in memory (e.g., cluster metadata data store 124).
  • Figure 9 shows another example of a cluster map that identifies cluster metadata, including spatial distribution of the clusters, along with cluster centers, cluster shapes, cluster sizes, cluster background, and/or cluster boundaries.
  • Figure 10 shows how a center of mass (COM) of a disjointed region in a cluster map is calculated.
  • the COM can be used as the “revised” or “improved” center of the corresponding cluster in downstream processing.
  • a center of mass generator 1004, on a cluster-by-cluster basis, determines hyperlocated center coordinates 1006 of the clusters by calculating centers of mass of the disjointed regions of the cluster map as an average of coordinates of respective contiguous subpixels forming the disjointed regions. It then stores the hyperlocated center coordinates of the clusters in the memory on the cluster-by-cluster basis for use as ground truth for training the classifier.
  • a subpixel categorizer, on the cluster-by-cluster basis, identifies centers of mass subpixels 1008 in the disjointed regions 804a-d of the cluster map 802 at the hyperlocated center coordinates 1006 of the clusters.
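A minimal sketch of the center-of-mass computation, assuming the cluster map is an integer-labeled subpixel array with zero for background; names are illustrative.

```python
import numpy as np

def centers_of_mass(cluster_map):
    """Hyperlocate cluster centers as the mean of subpixel coordinates per disjointed region.

    cluster_map: (H, W) integer array of cluster ids, 0 = background (assumed encoding).
    Returns dict mapping cluster id -> (row, col) center of mass in subpixel units.
    """
    centers = {}
    for cid in np.unique(cluster_map):
        if cid == 0:
            continue
        rows, cols = np.nonzero(cluster_map == cid)
        centers[int(cid)] = (rows.mean(), cols.mean())
    return centers
```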
  • the cluster map is upsampled using interpolation. The upsampled cluster map is stored in the memory for use as ground truth for training the classifier.
  • Figure 11 depicts one implementation of calculation of a weighted decay factor for a subpixel based on the Euclidean distance from the subpixel to the center of mass (COM) of the disjointed region to which the subpixel belongs.
  • the weighted decay factor gives the highest value to the subpixel containing the COM and decreases for subpixels further away from the COM.
  • the weighted decay factor is used to derive a ground truth decay map 1204 from a cluster map generated from the subpixel base calling discussed above.
  • the ground truth decay map 1204 contains an array of units and assigns at least one output value to each unit in the array. In some implementations, the units are subpixels and each subpixel is assigned an output value based on the weighted decay factor.
  • the ground truth decay map 1204 is then used as ground truth for training the disclosed neural network-based template generator 1512. In some implementations, information from the ground truth decay map 1204 is also used to prepare input for the disclosed neural network-based base caller 1514.
  • Figure 12 illustrates one implementation of an example ground truth decay map 1204 derived from an example cluster map produced by the subpixel base calling as discussed above.
  • a value is assigned to each contiguous subpixel in the disjointed regions based on a decay factor 1102 that is proportional to distance 1106 of a contiguous subpixel from a center of mass subpixel 1104 in a disjointed region to which the contiguous subpixel belongs.
  • Figure 12 depicts a ground truth decay map 1204.
  • the subpixel value is an intensity value normalized between zero and one.
  • a same predetermined value is assigned to all the subpixels identified as the background.
  • the predetermined value is a zero intensity value.
  • the ground truth decay map 1204 is generated by a ground truth decay map generator 1202 from the upsampled cluster map that expresses the contiguous subpixels in the disjointed regions and the subpixels identified as the background based on their assigned values.
  • the ground truth decay map 1204 is stored in the memory for use as ground truth for training the classifier.
  • each subpixel in the ground truth decay map 1204 has a value normalized between zero and one.
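A sketch of deriving a decay map from the cluster map and the centers of mass. The linear, per-cluster normalized decay used here is an assumption; the description only requires values normalized between zero and one that peak at the COM subpixel and decrease with Euclidean distance.

```python
import numpy as np

def ground_truth_decay_map(cluster_map, centers):
    """Derive a decay map: highest at each center-of-mass subpixel, decreasing with
    Euclidean distance inside the cluster, 0 for background subpixels."""
    H, W = cluster_map.shape
    rows, cols = np.indices((H, W))
    out = np.zeros((H, W), dtype=float)
    for cid, (cy, cx) in centers.items():
        mask = cluster_map == cid
        d = np.hypot(rows - cy, cols - cx)
        d_max = d[mask].max()
        out[mask] = 1.0 - d[mask] / (d_max + 1e-9)   # normalized to (0, 1]
    return out
```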
  • Figure 13 illustrates one implementation of deriving a ground truth ternary map 1304 from a cluster map.
  • the ground truth ternary map 1304 contains an array of units and assigns at least one output value to each unit in the array.
  • ternary map implementations of the ground truth ternary map 1304 assign three output values to each unit in the array, such that, for each unit, a first output value corresponds to a classification label or score for a background class, a second output value corresponds to a classification label or score for a cluster center class, and a third output value corresponds to a classification label or score for a cluster/cluster interior class.
  • the ground truth ternary map 1304 is used as ground truth data for training the neural network-based template generator 1512. In some implementations, information from the ground truth ternary map 1304 is also used to prepare input for the neural network-based base caller 1514.
  • Figure 13 depicts an example ground truth ternary map 1304.
  • on the cluster-by-cluster basis, a ground truth ternary map generator 1302 categorizes the contiguous subpixels in the disjointed regions as cluster interior subpixels belonging to a same cluster, the centers of mass subpixels as cluster center subpixels, and the subpixels not belonging to any cluster as background subpixels.
  • the categorizations are stored in the ground truth ternary map 1304. These categorizations and the ground truth ternary map 1304 are stored in the memory for use as ground truth for training the classifier.
  • coordinates of the cluster interior subpixels, the cluster center subpixels, and the background subpixels are stored in the memory for use as ground truth for training the classifier. Then, the coordinates are downscaled by a factor used to upsample the cluster map. Then, on the cluster-by-cluster basis, the downscaled coordinates are stored in the memory for use as ground truth for training the classifier.
  • the ground truth ternary map generator 1302 uses the cluster maps to generate the ternary ground truth data 1304 from the upsampled cluster map.
  • the ternary ground truth data 1304 labels the background subpixels as belonging to a background class, the cluster center subpixels as belonging to a cluster center class, and the cluster interior subpixels as belonging to a cluster interior class.
  • color coding can be used to depict and distinguish the different class labels.
  • the ternary ground truth data 1304 is stored in the memory for use as ground truth for training the classifier.
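A sketch of building the ternary ground truth as a one-hot, three-channel array per subpixel; the channel ordering and the rounding of the COM coordinates are illustrative choices, not the disclosed encoding.

```python
import numpy as np

def ground_truth_ternary_map(cluster_map, centers):
    """Per-subpixel ternary ground truth: background, cluster center, cluster interior.

    Returns an (H, W, 3) one-hot array; channel 0 = background, channel 1 = cluster
    center, channel 2 = cluster interior (assumed ordering).
    """
    H, W = cluster_map.shape
    out = np.zeros((H, W, 3), dtype=np.float32)
    out[..., 0] = (cluster_map == 0)
    out[..., 2] = (cluster_map > 0)
    for cid, (cy, cx) in centers.items():
        r, c = int(round(cy)), int(round(cx))
        out[r, c] = (0.0, 1.0, 0.0)        # the COM subpixel takes the center class
    return out
```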
  • Figure 14 illustrates one implementation of deriving a ground truth binary map 1404 from a cluster map.
  • the binary map 1404 contains an array of units and assigns at least one output value to each unit in the array.
  • the binary map assigns two output values to each unit in the array, such that, for each unit, a first output value corresponds to a classification label or score for a cluster center class and a second output value corresponds to a classification label or score for a non-center class.
  • the binary map is used as ground truth data for training the neural network-based template generator 1512. In some implementations, information from the binary map is also used to prepare input for the neural network-based base caller 1514.
  • Figure 14 depicts a ground truth binary map 1404.
  • the ground truth binary map generator 1402 uses the cluster maps 120 to generate the binary ground truth data 1404 from the upsampled cluster maps.
  • the binary ground truth data 1404 labels the cluster center subpixels as belonging to a cluster center class and labels all other subpixels as belonging to a non-center class.
  • the binary ground truth data 1404 is stored in the memory for use as ground truth for training the classifier.
  • the technology disclosed generates cluster maps 120 for a plurality of tiles of the flow cell, stores the cluster maps in memory, and determines spatial distribution of clusters in the tiles based on the cluster maps 120, including their shapes and sizes. Then, the technology disclosed, in the upsampled cluster maps 120 of the clusters in the tiles, categorizes, on a cluster-by-cluster basis, subpixels as cluster interior subpixels belonging to a same cluster, cluster center subpixels, and background subpixels.
  • the technology disclosed then stores the categorizations in the memory for use as ground truth for training the classifier, and stores, on the cluster-by-cluster basis across the tiles, coordinates of the cluster interior subpixels, the cluster center subpixels, and the background subpixels in the memory for use as ground truth for training the classifier.
  • the technology disclosed then downscales the coordinates by the factor used to upsample the cluster map and stores, on the cluster-by-cluster basis across the tiles, the downscaled coordinates in the memory for use as ground truth for training the classifier.
  • the flow cell has at least one patterned surface with an array of wells occupied by the clusters.
  • the technology disclosed determines: (1) which ones of the wells are substantially occupied by at least one cluster, (2) which ones of the wells are minimally occupied, and (3) which ones of the wells are co-occupied by multiple clusters. This allows for determining respective metadata of multiple clusters that co-occupy a same well, i.e., centers, shapes, and sizes of two or more clusters that share a same well.
  • the solid support on which samples are amplified into clusters comprises a patterned surface.
  • A“patterned surface” refers to an arrangement of different regions in or on an exposed layer of a solid support.
  • one or more of the regions can be features where one or more amplification primers are present.
  • the features can be separated by interstitial regions where amplification primers are not present.
  • the pattern can be an x-y format of features that are in rows and columns.
  • the pattern can be a repeating arrangement of features and/or interstitial regions.
  • the pattern can be a random arrangement of features and/or interstitial regions.
  • the solid support comprises an array of wells or depressions in a surface.
  • This may be fabricated as is generally known in the art using a variety of techniques, including, but not limited to, photolithography, stamping techniques, molding techniques and microetching techniques. As will be appreciated by those in the art, the technique used will depend on the composition and shape of the array substrate.
  • the features in a patterned surface can be wells in an array of wells (e.g., microwells or nanowells) on glass, silicon, plastic or other suitable solid supports with patterned, covalently-linked gel such as poly(N-(5-azidoacetamidylpentyl)acrylamide-co-acrylamide) (PAZAM, see, for example, US Pub. No. 2013/184796, WO 2016/066586, and WO 2015/002813, each of which is incorporated herein by reference in its entirety).
  • the covalent linking of the polymer to the wells is helpful for maintaining the gel in the structured features throughout the lifetime of the structured substrate during a variety of uses.
  • the gel need not be covalently linked to the wells.
  • for example, silane free acrylamide (SFA; see, for example, US Pat. No. 8,563,477, which is incorporated herein by reference in its entirety) can be used as the gel material.
  • a structured substrate can be made by patterning a solid support material with wells (e.g. microwells or nanowells), coating the patterned support with a gel material (e.g. PAZAM, SFA or chemically modified variants thereof, such as the azidolyzed version of SFA (azido-SFA)) and polishing the gel coated support, for example via chemical or mechanical polishing, thereby retaining gel in the wells but removing or inactivating substantially all of the gel from the interstitial regions on the surface of the structured substrate between the wells.
  • a fragmented human genome can then be contacted with the polished substrate such that individual target nucleic acids will seed individual wells via interactions with primers attached to the gel material; however, the target nucleic acids will not occupy the interstitial regions due to absence or inactivity of the gel material.
  • Amplification of the target nucleic acids will be confined to the wells since absence or inactivity of gel in the interstitial regions prevents outward migration of the growing nucleic acid colony.
  • the process is conveniently manufacturable, being scalable and utilizing micro- or nano-fabrication methods.
  • flow cell refers to a chamber comprising a solid surface across which one or more fluid reagents can be flowed.
  • flow cells and related fluidic systems and detection platforms that can be readily used in the methods of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008), WO 04/018497; US 7,057,026; WO 91/06678; WO 07/123744; US 7,329,492; US 7,211,414; US 7,315,019; US 7,405,281, and US 2008/0108082, each of which is incorporated herein by reference.
  • P5 and P7 are used when referring to amplification primers. It will be understood that any suitable amplification primers can be used in the methods presented herein, and that the use of P5 and P7 is exemplary only. The use of amplification primers such as P5 and P7 on flow cells is known in the art, as exemplified by the disclosures of WO 2007/010251, WO 2006/064199, WO 2005/065814, WO 2015/106941, WO 1998/044151, and WO 2000/018957, each of which is incorporated by reference in its entirety.
  • any suitable forward amplification primer can be useful in the methods presented herein for hybridization to a complementary sequence and amplification of a sequence.
  • any suitable reverse amplification primer can be useful in the methods presented herein for hybridization to a complementary sequence and amplification of a sequence.
  • One of skill in the art will understand how to design and use primer sequences that are suitable for capture, and amplification of nucleic acids as presented herein.
  • the flow cell has at least one nonpatterned surface and the clusters are unevenly scattered over the nonpatterned surface.
  • density of the clusters ranges from about 100,000 clusters/mm² to about 1,000,000 clusters/mm². In other implementations, density of the clusters ranges from about 1,000,000 clusters/mm² to about 10,000,000 clusters/mm².
  • the preliminary center coordinates of the clusters determined by the base caller are defined in a template image of the tile.
  • a pixel resolution, an image coordinate system, and measurement scales of the image coordinate system are the same for the template image and the images.
  • the technology disclosed relates to determining metadata about clusters on a tile of a flow cell.
  • the technology disclosed accesses (1) a set of images of the tile captured during a sequencing run and (2) preliminary center coordinates of the clusters determined by a base caller.
  • the technology disclosed obtains a base call classifying, as one of four bases, (1) origin subpixels that contain the preliminary center coordinates and (2) a predetermined neighborhood of contiguous subpixels that are successively contiguous to respective ones of the origin subpixels.
  • the predetermined neighborhood of contiguous subpixels can be an m x n subpixel patch centered at subpixels containing the origin subpixels.
  • the subpixel patch is 3 x 3 subpixels.
  • the image patch can be of any size, such as 5 x 5, 15 x 15, 20 x 20, and so on.
  • the predetermined neighborhood of contiguous subpixels can be an n-connected subpixel neighborhood centered at subpixels containing the origin subpixels.
  • the technology disclosed identifies as background those subpixels in the cluster map that do not belong to any of the disjointed regions.
  • the technology disclosed generates a cluster map that identifies the clusters as disjointed regions of contiguous subpixels that: (a) are successively contiguous to at least some of the respective ones of the origin subpixels and (b) share a substantially matching base call sequence of the one of four bases with the at least some of the respective ones of the origin subpixels.
  • the technology disclosed then stores the cluster map in memory and determines the shapes and the sizes of the clusters based on the disjointed regions in the cluster map. In other implementations, centers of the clusters are also determined.
  • Figure 15 is a block diagram that shows one implementation of generating training data that is used to train the neural network-based template generator 1512 and the neural network-based base caller 1514.
  • Figure 16 shows characteristics of the disclosed training examples used to train the neural network-based template generator 1512 and the neural network-based base caller 1514.
  • Each training example corresponds to a tile and is labelled with a corresponding ground truth data representation.
  • the ground truth data representation is a ground truth mask or a ground truth map that identifies the ground truth cluster metadata in the form of the ground truth decay map 1204, the ground truth ternary map 1304, or the ground truth binary map 1404.
  • multiple training examples correspond to a same tile.
  • the technology disclosed relates to generating training data 1504 for neural network-based template generation and base calling.
  • the technology disclosed accesses a multitude of images 108 of a flow cell 202 captured over a plurality of cycles of a sequencing run.
  • the flow cell 202 has a plurality of tiles.
  • each of the tiles has a sequence of image sets generated over the plurality of cycles.
  • Each image in the sequence of image sets 108 depicts intensity emissions of clusters 302 and their surrounding background 304 on a particular one of the tiles at a particular one of the cycles.
  • a training set constructor 1502 constructs a training set 1504 that has a plurality of training examples.
  • each training example corresponds to a particular one of the tiles and includes image data from at least some image sets in the sequence of image sets 1602 of the particular one of the tiles.
  • the image data includes images in at least some image sets in the sequence of image sets 1602 of the particular one of the tiles.
  • the images can have a resolution of 1800 x 1800. In other implementations, it can be any resolution such as 100 x 100, 3000 x 3000, 10000 x 10000, and so on.
  • the image data includes at least one image patch from each of the images.
  • the image patch covers a portion of the particular one of the tiles.
  • the image patch can have a resolution of 20 x 20.
  • the image patch can have any resolution, such as 50 x 50, 70 x 70, 90 x 90, 100 x 100, 3000 x 3000, 10000 x 10000, and so on.
  • the image data includes an upsampled representation of the image patch.
  • the upsampled representation can have a resolution of 80 x 80, for example.
  • the upsampled representation can have any resolution, such as 50 x 50, 70 x 70, 90 x 90, 100 x 100, 3000 x 3000, 10000 x 10000, and so on.
  • multiple training examples correspond to a same particular one of the tiles and respectively include as image data different image patches from each image in each of at least some image sets in a sequence of image sets 1602 of the same particular one of the tiles. In such implementations, at least some of the different image patches overlap with each other.
  • a ground truth generator 1506 generates at least one ground truth data representation for each of the training examples.
  • the ground truth data representation identifies at least one of spatial distribution of clusters and their surrounding background on the particular one of the tiles whose intensity emissions are depicted by the image data, including at least one of cluster shapes, cluster sizes, and/or cluster boundaries, and/or centers of the clusters.
  • the ground truth data representation identifies the clusters as disjointed regions of contiguous subpixels, the centers of the clusters as centers of mass subpixels within respective ones of the disjointed regions, and their surrounding background as subpixels that do not belong to any of the disjointed regions.
  • the ground truth data representation has an upsampled resolution of 80 x 80.
  • the ground truth data representation can have any resolution, such as 50 x 50, 70 x 70, 90 x 90, 100 x 100, 3000 x 3000, 10000 x 10000, and so on.
  • the ground truth data representation identifies each subpixel as either being a cluster center or a non-center. In another implementation, the ground truth data representation identifies each subpixel as either being cluster interior, cluster center, or surrounding background.
  • the technology disclosed stores, in memory, the training examples in the training set 1504 and associated ground truth data 1508 as the training data 1504 for training the neural network-based template generator 1512 and the neural network-based base caller 1514.
  • the training is operationalized by trainer 1510.
  • the technology disclosed generates the training data for a variety of flow cells, sequencing instruments, sequencing protocols, sequencing chemistries, sequencing reagents, and cluster densities.
  • the technology disclosed uses peak detection and segmentation to determine cluster metadata.
  • the technology disclosed processes input image data 1702 derived from a series of image sets 1602 through a neural network 1706 to generate an alternative representation 1708 of the input image data 1702.
  • an image set can be for a particular sequencing cycle and include four images, one for each image channel A, C, T, and G. Then, for a sequencing run with fifty sequencing cycles, there will be fifty such image sets, i.e., a total of 200 images. When arranged temporally, fifty image sets with four images per image set would form the series of image sets 1602.
  • image patches of a certain size are extracted from each image in the fifty image sets, forming fifty image patch sets with four image patches per image patch set and, in one implementation, this is the input image data 1702.
  • the input image data 1702 comprises image patch sets with four image patches per image patch set for fewer than the fifty sequencing cycles, i.e., just one, two, three, fifteen, or twenty sequencing cycles.
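One plausible way to assemble such input image data is sketched below as a (cycles, channels, height, width) array; the exact tensor layout consumed by the template generator is not specified here, so this ordering and the function name are assumptions.

```python
import numpy as np

def build_input_image_data(image_patch_sets):
    """Stack per-cycle image patch sets into one input array.

    image_patch_sets: list over sequencing cycles; each entry is a list of per-channel
    patches (e.g., four patches for A, C, T, G), each of shape (H, W).
    Returns an array of shape (num_cycles, num_channels, H, W).
    """
    return np.stack([np.stack(channel_patches) for channel_patches in image_patch_sets])

# Fifty cycles, four channels, 20 x 20 patches -> shape (50, 4, 20, 20).
patches = [[np.random.rand(20, 20) for _ in range(4)] for _ in range(50)]
print(build_input_image_data(patches).shape)
```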
  • Figure 17 illustrates one implementation of processing input image data 1702 through the neural network-based template generator 1512 and generating an output value for each unit in an array.
  • the array is a decay map 1716.
  • the array is a ternary map 1718.
  • the array is a binary map 1720. The array may therefore represent one or more properties of each of a plurality of locations represented in the input image data 1702.
  • the decay map 1716, the ternary map 1718, and/or the binary map 1720 are generated by forward propagation of the trained neural network-based template generator 1512.
  • the forward propagation can be during training or during inference.
  • the decay map 1716, the ternary map 1718, and the binary map 1720 (i.e., cumulatively the output 1714)
  • the size of the image array analyzed during inference depends on the size of the input image data 1702 (e.g., it may be the same size or an upscaled or downscaled version), according to one implementation.
  • Each unit can represent a pixel, a subpixel, or a superpixel.
  • the unit-wise output values of an array can characterize/represent/denote the decay map 1716, the ternary map 1718, or the binary map 1720.
  • the input image data 1702 is also an array of units in the pixel, subpixel, or superpixel resolution.
  • the neural network-based template generator 1512 uses semantic segmentation techniques to produce an output value for each unit in the input array. Additional details about the input image data 1702 can be found in Figures 21b, 22, 23, and 24 and their discussion.
  • the neural network-based template generator 1512 is a fully convolutional network, such as the one described in J. Long, E. Shelhamer, and T. Darrell,“Fully convolutional networks for semantic segmentation,” in CVPR, (2015), which is incorporated herein by reference.
  • the neural network-based template generator 1512 is a U-Net network with skip connections between the decoder and the encoder, such as the one described in Ronneberger O, Fischer P, Brox T., “U-net: Convolutional networks for biomedical image segmentation,” Med. Image Comput. Comput. Assist. Interv.
  • the U-Net architecture resembles an autoencoder with two main sub-structures: (1) an encoder, which takes an input image and reduces its spatial resolution through multiple convolutional layers to create a representation encoding, and (2) a decoder, which takes the representation encoding and increases the spatial resolution back to produce a reconstructed image as output.
  • the U-Net introduces two innovations to this architecture: First, the objective function is set to reconstruct a segmentation mask using a loss function; and second, the convolutional layers of the encoder are connected to the corresponding layers of the same resolution in the decoder using skip connections.
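For orientation only, the sketch below shows a tiny U-Net-style encoder/decoder with a single skip connection in PyTorch; the disclosed template generator's depth, channel counts, and output heads are not reproduced here, and all hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

class TinySegmenter(nn.Module):
    """Minimal U-Net-style encoder/decoder sketch with one skip connection (illustrative)."""
    def __init__(self, in_channels=4, out_channels=3):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, out_channels, 1)      # per-unit output values

    def forward(self, x):
        e1 = self.enc1(x)                                # encoder, full resolution
        e2 = self.enc2(self.down(e1))                    # encoder, half resolution
        d = self.up(e2)                                  # decoder, back to full resolution
        d = self.dec(torch.cat([d, e1], dim=1))          # skip connection from the encoder
        return self.head(d)                              # e.g., ternary scores per subpixel

print(TinySegmenter()(torch.randn(1, 4, 80, 80)).shape)  # torch.Size([1, 3, 80, 80])
```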
  • the neural network-based template generator 1512 is a deep fully convolutional segmentation neural network with an encoder subnetwork and a corresponding decoder network.
  • the encoder subnetwork includes a hierarchy of encoders and the decoder subnetwork includes a hierarchy of decoders that map low resolution encoder feature maps to full input resolution feature maps. Additional details about segmentation networks can be found in Appendix entitled“Segmentation Networks”.
  • the neural network-based template generator 1512 is a convolutional neural network. In another implementation, the neural network-based template generator 1512 is a recurrent neural network. In yet another implementation, the neural network-based template generator 1512 is a residual neural network with residual blocks and residual connections. In a further implementation, the neural network-based template generator 1512 is a combination of a convolutional neural network and a recurrent neural network.
  • the neural network-based template generator 1512 i.e., the neural network 1706 and/or the output layer 1710 can use various padding and striding configurations. It can use different output functions (e.g., classification or regression) and may or may not include one or more fully -connected layers.
  • It can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transpose convolutions, depthwise separable convolutions, pointwise convolutions, 1 x 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatial separable convolutions, and deconvolutions.
  • It can use loss functions such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean-squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss.
  • It can include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (like an LSTM or GRU), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions like rectifying linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
  • each image in the sequence of image sets 1602 covers a tile and depicts intensity emissions of clusters on the tile and their surrounding background captured for a particular imaging channel at a particular one of a plurality of sequencing cycles of a sequencing run performed on a flow cell.
  • the input image data 1702 includes at least one image patch from each of the images in the sequence of image sets 1602.
  • the image patch covers a portion of the tile.
  • the image patch has a resolution of 20 x 20. In other cases, the resolution of the image patch can range from 20 x 20 to 10000 x 10000.
  • the input image data 1702 includes an upsampled, subpixel resolution representation of the image patch from each of the images in the sequence of image sets 1602.
  • the upsampled, subpixel representation has a resolution of 80 x 80. In other cases, the resolution of the upsampled, subpixel representation can range from 80 x 80 to 10000 x 10000.
  • the input image data 1702 has an array of units 1704 that depicts clusters and their surrounding background.
  • an image set can be for a particular sequencing cycle and include four images, one for each image channel A, C, T, and G. Then, for a sequencing run with fifty sequencing cycles, there will be fifty such image sets, i.e., a total of 200 images.
  • fifty image sets with four images per image set would form the series of image sets 1602.
  • image patches of a certain size are extracted from each image in the fifty image sets, forming fifty image patch sets with four image patches per image patch set and, in one implementation, this is the input image data 1702.
  • the input image data 1702 comprises image patch sets with four image patches per image patch set for fewer than the fifty sequencing cycles, i.e., just one, two, three, fifteen, or twenty sequencing cycles.
  • the alternative representation is a feature map.
  • the feature map can be a convolved feature or convolved representation when the neural network is a convolutional neural network.
  • the feature map can be a hidden state feature or hidden state representation when the neural network is a recurrent neural network.
  • the technology disclosed processes the alternative representation 1708 through an output layer 1710 to generate an output 1714 that has an output value 1712 for each unit in the array 1704.
  • the output layer can be a classification layer such as softmax or sigmoid that produces unit-wise output values.
  • the output layer is a ReLU layer or any other activation function layer that produces unit-wise output values.
  • the units in the input image data 1702 are pixels and therefore pixel-wise output values 1712 are produced in the output 1714.
  • the units in the input image data 1702 are subpixels and therefore subpixel-wise output values 1712 are produced in the output 1714.
  • the units in the input image data 1702 are superpixels and therefore superpixel- wise output values 1712 are produced in the output 1714.
  • Figure 18 shows one implementation of post-processing techniques that are applied to the decay map 1716, the ternary map 1718, or the binary map 1720 produced by the neural network-based template generator 1512 to derive cluster metadata, including cluster centers, cluster shapes, cluster sizes, cluster background, and/or cluster boundaries.
  • the post-processing techniques are applied by a post-processor 1814 that further comprises a thresholder 1802, a peak locator 1806, and a segmenter 1810.
  • the input to the thresholder 1802 is the decay map 1716, the ternary map 1718, or the binary map 1720 produced by template generator 1512, such as the disclosed neural network-based template generator.
  • the thresholder 1802 applies thresholding on the values in the decay map, the ternary map, or the binary map to identify background units 1804 (i.e., subpixels characterizing non-cluster background) and non-background units.
  • the thresholder 1802 thresholds output values of the units 1712 and classifies, or can reclassify, a first subset of the units 1712 as “background units” 1804 depicting the surrounding background of the clusters and “non-background units” depicting units that potentially belong to clusters.
  • the threshold value applied by the thresholder 1802 can be preset.
  • the input to the peak locator 1806 is also the decay map 1716, the ternary map 1718, or the binary map 1720 produced by the neural network-based template generator 1512.
  • the peak locator 1806 applies peak detection on the values in the decay map 1716, the ternary map 1718, or the binary map 1720 to identify center units 1808 (i.e., center subpixels characterizing cluster centers).
  • the peak locator 1806 processes the output values of the units 1712 in the output 1714 and classifies a second subset of the units 1712 as“center units” 1808 containing centers of the clusters.
  • the centers of the clusters detected by the peak locator 1806 are also the centers of mass of the clusters.
  • the center units 1808 are then provided to the segmenter 1810. Additional details about the peak locator 1806 can be found in the Appendix entitled“Peak Detection”.
  • the thresholding and the peak detection can be done in parallel or one after the other. That is, they are not dependent on each other.
  • the input to the segmenter 1810 is also the decay map 1716, the ternary map 1718, or the binary map 1720 produced by the neural network-based template generator 1512.
  • Additional supplemental input to the segmenter 1810 comprises the thresholded units (background, nonbackground) 1804 identified by the thresholder 1802 and the center units 1808 identified by the peak locator 1806.
  • the segmenter 1810 uses the background, non-background 1804 and the center units 1808 to identify disjointed regions 1812 (i.e., non-overlapping groups of contiguous cluster/cluster interior subpixels characterizing clusters).
  • the segmenter 1810 processes the output values of the units 1712 in the output 1714 and uses the background, non-background units 1804 and the center units 1808 to determine shapes 1812 of the clusters as nonoverlapping regions of contiguous units separated by the background units 1804 and centered at the center units 1808.
  • the output of the segmenter 1810 is cluster metadata 1812.
  • the cluster metadata 1812 identifies cluster centers, cluster shapes, cluster sizes, cluster background, and/or cluster boundaries.
  • the segmenter 1810 begins with the center units 1808 and determines, for each center unit, a group of successively contiguous units that depict a same cluster whose center of mass is contained in the center unit.
  • the segmenter 1810 uses a so-called“watershed” segmentation technique to subdivide contiguous clusters into multiple adjoining clusters at a valley in intensity. Additional details about the watershed segmentation technique and other segmentation techniques can be found in Appendix entitled “Watershed Segmentation”.
  • the output values of the units 1712 in the output 1714 are continuous values, such as the ones encoded in the ground truth decay map 1204.
  • the output values are softmax scores, such as the ones encoded in the ground truth ternary map 1304 and the ground truth binary map 1404.
  • the contiguous units in the respective ones of the non-overlapping regions have output values weighted according to distance of a contiguous unit from a center unit in a non-overlapping region to which the contiguous unit belongs.
  • the center units have highest output values within the respective ones of the non-overlapping regions.
  • during training, the decay map 1716, the ternary map 1718, and the binary map 1720, i.e., cumulatively the output 1714, progressively match or approach the ground truth decay map 1204, the ground truth ternary map 1304, and the ground truth binary map 1404, respectively.
  • cluster shapes determined by the technology disclosed can be used to extract intensity of the clusters. Since clusters typically have irregular shapes and contours, the technology disclosed can be used to identify which subpixels contribute to the irregularly shaped disjointed/non-overlapping regions that represent the cluster shapes.
  • Figure 19 depicts one implementation of extracting cluster intensity in the pixel domain.
  • “Template image” or “template” can refer to a data structure that contains or identifies the cluster metadata 1812 derived from the decay map 1716, the ternary map 1718, and/or the binary map 1720.
  • the cluster metadata 1812 identifies cluster centers, cluster shapes, cluster sizes, cluster background, and/or cluster boundaries.
  • the template image is in the upsampled, subpixel domain to distinguish the cluster boundaries at a fine-grained level.
  • the sequencing images 108, which contain the cluster and background intensity data, are typically in the pixel domain.
  • the technology disclosed proposes two approaches to use the cluster shape information encoded in the template image in the upsampled, subpixel resolution to extract intensities of the irregularly shaped clusters from the optical, pixel-resolution sequencing images.
  • in the first approach, the non-overlapping groups of contiguous subpixels identified in the template image are located in the pixel-resolution sequencing images and their intensities extracted via interpolation. Additional details about this intensity extraction technique can be found in Figure 33 and its discussion.
  • the cluster intensity 1912 of a given cluster is determined by an intensity extractor 1902 as follows.
  • a subpixel locator 1904 identifies subpixels that contribute to the cluster intensity of the given cluster based on a corresponding non-overlapping region of contiguous subpixels that identifies a shape of the given cluster.
  • the subpixel locator 1904 locates the identified subpixels in one or more optical, pixel-resolution images 1918 generated for one or more imaging channels at a current sequencing cycle.
  • the identified subpixels can have integer or non-integer coordinates (e.g., floating points).
  • in some implementations, the subpixels are located in the optical, pixel-resolution images after a downscaling based on a downscaling factor that matches the upsampling factor used to create the subpixel domain.
  • an interpolator and subpixel intensity combiner 1906 interpolates intensities of the identified subpixels in the images being processed, combines the interpolated intensities, and normalizes the combined interpolated intensities to produce a per-image cluster intensity for the given cluster in each of the images.
  • the normalization is performed by a normalizer 1908 and is based on a normalization factor.
  • the normalization factor is a number of the identified subpixels. This is done to normalize/account for different cluster sizes and uneven illuminations that clusters receive depending on their location on the flow cell.
  • a cross-channel subpixel intensity accumulator 1910 combines the per-image cluster intensity for each of the images to determine the cluster intensity 1912 of the given cluster at the current sequencing cycle.
  • the given cluster is base called based on the cluster intensity 1912 at the current sequencing cycle by any one of the base callers discussed in this application, yielding base calls 1916.
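One way to picture this first, pixel-domain extraction approach is the following Python sketch (assuming SciPy); the subpixel-to-pixel coordinate conversion, the summation across imaging channels, and the function and argument names are assumptions made only for illustration, not the disclosed implementation.

import numpy as np
from scipy.ndimage import map_coordinates

def cluster_intensity_pixel_domain(channel_images, subpixel_rows, subpixel_cols,
                                   upsampling_factor=4):
    # Map subpixel indices to fractional pixel coordinates (pixel-center convention assumed).
    rows = (np.asarray(subpixel_rows, dtype=float) + 0.5) / upsampling_factor - 0.5
    cols = (np.asarray(subpixel_cols, dtype=float) + 0.5) / upsampling_factor - 0.5
    per_image = []
    for image in channel_images:                                # one image per imaging channel
        values = map_coordinates(image, [rows, cols], order=1)  # bilinear interpolation
        per_image.append(values.sum() / len(rows))              # normalize by subpixel count
    return sum(per_image)                                       # combine across channels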
  • in some implementations, the output of the neural network-based template generator 1512, i.e., the decay map 1716, the ternary map 1718, and the binary map 1720, is in the optical, pixel domain. Accordingly, in such implementations, the template image is also in the optical, pixel domain.
  • Figure 20 depicts the second approach of extracting cluster intensity in the subpixel domain.
  • the sequencing images in the optical, pixel-resolution are upsampled into the subpixel resolution. This results in correspondence between the “cluster shape depicting subpixels” in the template image and the “cluster intensity depicting subpixels” in the upsampled sequencing images.
  • the cluster intensity is then extracted based on the correspondence. Additional details about this intensity extraction technique can be found in Figure 33 and its discussion.
  • the cluster intensity 2012 of a given cluster is determined by an intensity extractor 2002 as follows.
  • a subpixel locator 2004 identifies subpixels that contribute to the cluster intensity of the given cluster based on a corresponding non-overlapping region of contiguous subpixels that identifies a shape of the given cluster.
  • the subpixel locator 2004 locates the identified subpixels in one or more subpixel resolution images 2018 upsampled from corresponding optical, pixel-resolution images 1918 generated for one or more imaging channels at a current sequencing cycle.
  • the upsampling can be performed by nearest neighbor intensity extraction, Gaussian based intensity extraction, intensity extraction based on average of 2 x 2 subpixel area, intensity extraction based on brightest of 2 x 2 subpixel area, intensity extraction based on average of 3 x 3 subpixel area, bilinear intensity extraction, bicubic intensity extraction, and/or intensity extraction based on weighted area coverage.
  • the template image can, in some implementations, serve as a mask for intensity extraction.
  • a subpixel intensity combiner 2006 in each of the upsampled images, combines intensities of the identified subpixels and normalizes the combined intensities to produce a per-image cluster intensity for the given cluster in each of the upsampled images.
  • the normalization is performed by a normalizer 2008 and is based on a normalization factor.
  • the normalization factor is a number of the identified subpixels. This is done to normalize/account for different cluster sizes and uneven illuminations that clusters receive depending on their location on the flow cell.
  • a cross-channel, subpixel-intensity accumulator 2010 combines the per-image cluster intensity for each of the upsampled images to determine the cluster intensity 2012 of the given cluster at the current sequencing cycle.
  • the given cluster is base called based on the cluster intensity 2012 at the current sequencing cycle by any one of the base callers discussed in this application, yielding base calls 2016.
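This second, subpixel-domain approach can be pictured with the following Python sketch (assuming SciPy); the bilinear upsampling order, the boolean-mask representation of a cluster's non-overlapping region, and the summation across channels are illustrative assumptions.

import numpy as np
from scipy.ndimage import zoom

def cluster_intensity_subpixel_domain(channel_images, cluster_mask, upsampling_factor=4):
    # cluster_mask: boolean subpixel-resolution array marking one cluster's non-overlapping
    # region from the template image (same shape as the upsampled images).
    n = cluster_mask.sum()                                   # normalization factor: subpixel count
    cluster_intensity = 0.0
    for image in channel_images:                             # one image per imaging channel
        upsampled = zoom(image, upsampling_factor, order=1)      # upsample into the subpixel domain
        cluster_intensity += upsampled[cluster_mask].sum() / n   # per-image cluster intensity
    return cluster_intensity                                 # cluster intensity at this cycle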
  • The discussion now turns to details of three different implementations of the neural network-based template generator 1512. These are shown in Figure 21a and include: (1) the decay map-based template generator 2600 (also called the regression model), (2) the binary map-based template generator 4600 (also called the binary classification model), and (3) the ternary map-based template generator 5400 (also called the ternary classification model).
  • the regression model 2600 is a fully convolutional network. In another implementation, the regression model 2600 is a U-Net network with skip connections between the decoder and the encoder. In one implementation, the binary classification model 4600 is a fully convolutional network. In another implementation, the binary classification model 4600 is a U-Net network with skip connections between the decoder and the encoder. In one implementation, the ternary classification model 5400 is a fully convolutional network. In another implementation, the ternary classification model 5400 is a U-Net network with skip connections between the decoder and the encoder.
  • Figure 21b depicts one implementation of the input image data 1702 that is fed as input to the neural network-based template generator 1512.
  • the input image data 1702 comprises a series of image sets 2100 with the sequencing images 108 that are generated during a certain number of initial sequencing cycles of a sequencing run (e.g., the first 2 to 7 sequencing cycles).
  • intensities of the sequencing images 108 are corrected for background and/or aligned with each other using affine transformation.
  • the sequencing run utilizes four-channel chemistry and each image set has four images.
  • the sequencing run utilizes two-channel chemistry and each image set has two images.
  • the sequencing run utilizes one-channel chemistry and each image set has two images.
  • each image set has only one image.
  • Each image 2116 in the series of image sets 2100 covers a tile 2104 of a flow cell 2102 and depicts intensity emissions of clusters 2106 on the tile 2104 and their surrounding background captured for a particular image channel at a particular one of a plurality of sequencing cycles of the sequencing run.
  • the image set includes four images 2112A, 2112C, 2112T, and 2112G: one image for each base A, C, T, and G labeled with a corresponding fluorescent dye and imaged in a corresponding wavelength band (image/imaging channel).
  • Figure 21b depicts cluster intensity emissions as 2108 and background intensity emissions as 2110.
  • the image set also includes four images 2114A, 2114C, 2114T, and 2114G: one image for each base A, C, T, and G labeled with a corresponding fluorescent dye and imaged in a corresponding wavelength band (image/imaging channel).
  • in image 2114A, Figure 21b depicts cluster intensity emissions as 2118 and, in image 2114T, background intensity emissions as 2120.
  • the input image data 1702 is encoded using intensity channels (also called imaged channels). For each of the c images obtained from the sequencer for a particular sequencing cycle, a separate imaged channel is used to encode its intensity signal data.
  • the input data 2632 comprises (i) a first red imaged channel with w x h pixels that depict intensity emissions of the one or more clusters and their surrounding background captured in the red image and (ii) a second green imaged channel with w x h pixels that depict intensity emissions of the one or more clusters and their surrounding background captured in the green image.
  • image data is not used as input to the neural network-based template generator 1512 or the neural network- based base caller 1514.
  • the input to the neural network-based template generator 1512 and the neural network-based base caller 1514 is based on pH changes induced by the release of hydrogen ions during molecule extension. The pH changes are detected and converted to a voltage change that is proportional to the number of bases incorporated (e.g., in the case of Ion Torrent).
  • the Oxford Nanopore Technologies (ONT) sequencing is based on the following concept: pass a single strand of DNA (or RNA) through a membrane via a nanopore and apply a voltage difference across the membrane.
  • the nucleotides present in the pore will affect the pore's electrical resistance, so current measurements over time can indicate the sequence of DNA bases passing through the pore.
  • This electrical current signal (the‘squiggle’ due to its appearance when plotted) is the raw data gathered by an ONT sequencer.
  • this raw signal is stored as integer data acquisition (DAC) values.
  • the input data 2632 comprises normalized or scaled DAC values.
  • Figure 22 shows one implementation of extracting patches from the series of image sets 2100 in Figure 21b to produce a series of “down-sized” image sets that form the input image data 1702.
  • the sequencing images 108 in the series of image sets 2100 are of size L x L (e.g., 2000 x 2000).
  • L is any number ranging from 1 to 10,000.
  • a patch extractor 2202 extracts patches from the sequencing images 108 in the series of image sets 2100 and produces a series of down-sized image sets 2206, 2208, 2210, and 2212.
  • Each image in the series of down-sized image sets is a patch of size M x M (e.g., 20 x 20) that is extracted from a corresponding sequencing image in the series of image sets 2100.
  • the size of the patches can be preset.
  • M is any number ranging from 1 to 1000.
  • the first example series of down-sized image sets 2206 is extracted from coordinates 0,0 to 20,20 in the sequencing images 108 in the series of image sets 2100.
  • the second example series of down-sized image sets 2208 is extracted from coordinates 20,20 to 40,40 in the sequencing images 108 in the series of image sets 2100.
  • the third example series of down-sized image sets 2210 is extracted from coordinates 40,40 to 60,60 in the sequencing images 108 in the series of image sets 2100.
  • the fourth example series of down-sized image sets 2212 is extracted from coordinates 60,60 to 80,80 in the sequencing images 108 in the series of image sets 2100.
  • the series of down-sized image sets form the input image data 1702 that is fed as input to the neural network-based template generator 1512. Multiple series of down-sized image sets can be simultaneously fed as an input batch and a separate output can be produced for each series in the input batch.
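As an illustration of the patch extractor, the following Python sketch (plain NumPy-style slicing) extracts an M x M patch at the same coordinates from every image of every image set; the function and argument names are hypothetical.

def extract_patch_series(image_sets, top, left, m=20):
    # image_sets: list over sequencing cycles; each entry is a list of 2-D images,
    # one per imaging channel. Returns one series of down-sized image sets.
    return [
        [image[top:top + m, left:left + m] for image in image_set]
        for image_set in image_sets
    ]

# e.g., the series extracted from coordinates 0,0 to 20,20:
# series = extract_patch_series(image_sets, top=0, left=0, m=20)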
  • Figure 23 depicts one implementation of upsampling the series of image sets 2100 in Figure 21b to produce a series of“upsampled” image sets 2300 that forms the input image data 1702.
  • an upsampler 2302 uses interpolation (e.g., bicubic interpolation) to upsample the sequencing images 108 in the series of image sets 2100 by an upsampling factor (e.g., 4x) and produces the series of upsampled image sets 2300.
  • the sequencing images 108 in the series of image sets 2100 are of size L x L (e.g., 2000 x 2000) and are upsampled by an upsampling factor of four to produce upsampled images of size U x U (e.g., 8000 x 8000) in the series of upsampled image sets 2300.
  • the sequencing images 108 in the series of image sets 2100 are fed directly to the neural network-based template generator 1512 and the upsampling is performed by an initial layer of the neural network-based template generator 1512. That is, the upsampler 2302 is part of the neural network-based template generator 1512 and operates as its first layer that upsamples the sequencing images 108 in the series of image sets 2100 and produces the series of upsampled image sets 2300.
  • the series of upsampled image sets 2300 forms the input image data 1702 that is fed as input to the neural network-based template generator 1512.
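An upsampler of this kind could be sketched as follows in Python (assuming SciPy, where order=3 selects cubic-spline interpolation approximating bicubic interpolation); the factor and names are illustrative.

from scipy.ndimage import zoom

def upsample_image_sets(image_sets, factor=4):
    # e.g., 2000 x 2000 sequencing images become 8000 x 8000 upsampled images.
    return [
        [zoom(image, factor, order=3) for image in image_set]
        for image_set in image_sets
    ]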
  • Figure 24 shows one implementation of extracting patches from the series of upsampled image sets 2300 in Figure 23 to produce a series of“upsampled and down-sized” image sets 2406, 2408, 2410, and 2412 that form the input image data 1702.
  • the patch extractor 2202 extracts patches from the upsampled images in the series of upsampled image sets 2300 and produces series of upsampled and down-sized image sets 2406, 2408, 2410, and 2412.
  • Each upsampled image in the series of upsampled and down-sized image sets is a patch of size M x M (e.g., 80 x 80) that is extracted from a corresponding upsampled image in the series of upsampled image sets 2300.
  • the size of the patches can be preset. In other implementations, M is any number ranging from 1 to 1000.
  • the first example series of upsampled and down-sized image sets 2406 is extracted from coordinates 0,0 to 80,80 in the upsampled images in the series of upsampled image sets 2300.
  • the second example series of upsampled and down-sized image sets 2408 is extracted from coordinates 80,80 to 160,160 in the upsampled images in the series of upsampled image sets 2300.
  • the third example series of upsampled and down-sized image sets 2410 is extracted from coordinates 160,160 to 240,240 in the upsampled images in the series of upsampled image sets 2300.
  • the fourth example series of upsampled and down-sized image sets 2412 is extracted from coordinates 240,240 to 320,320 in the upsampled images in the series of upsampled image sets 2300.
  • the series of upsampled and down-sized image sets form the input image data 1702 that is fed as input to the neural network-based template generator 1512.
  • Multiple series of upsampled and down-sized image sets can be simultaneously fed as an input batch and a separate output can be produced for each series in the input batch.
  • the three models are trained to produce different outputs. This is achieved by using different types of ground truth data representations as training labels.
  • the regression model 2600 is trained to produce output that characterizes/represents/denotes a so-called“decay map” 1716.
  • the binary classification model 4600 is trained to produce output that characterizes/represents/denotes a so-called“binary map” 1720.
  • the ternary classification model 5400 is trained to produce output that characterizes/represents/denotes a so-called“ternary map” 1718.
  • the output 1714 of each type of model comprises an array of units 1712.
  • the units 1712 can be pixels, subpixels, or superpixels.
  • the output of each type of model includes unit-wise output values, such that the output values of an array of units together characterize the decay map 1716, the binary map 1720, or the ternary map 1718, respectively.
  • Figure 25 illustrates one implementation of an overall example process of generating ground truth data for training the neural network-based template generator 1512.
  • the ground truth data can be the decay map 1204.
  • the ground truth data can be the binary map 1404.
  • the ground truth data can be the ternary map 1304.
  • the ground truth data is generated from the cluster metadata.
  • the cluster metadata is generated by the cluster metadata generator 122.
  • the ground truth data is generated by the ground truth data generator 1506.
  • the ground truth data is generated for tile A that is on lane A of flow cell A.
  • the ground truth data is generated from the sequencing images 108 of tile A captured during sequencing run A.
  • the sequencing images 108 of tile A are in the pixel domain.
  • two hundred sequencing images 108 for fifty sequencing cycles are accessed.
  • Each of the two hundred sequencing images 108 depicts intensity emissions of clusters on tile A and their surrounding background captured in a particular image channel at a particular sequencing cycle.
  • the subpixel addresser 110 converts the sequencing images 108 into the subpixel domain (e.g., by dividing each pixel into a plurality of subpixels) and produces sequencing images 112 in the subpixel domain.
  • the base caller 114 (e.g., RTA) then processes the sequencing images 112 in the subpixel domain and produces a base call for each subpixel and for each of the fifty sequencing cycles. This is referred to herein as“subpixel base calling”.
  • the subpixel base calls 116 are then merged to produce, for each subpixel, a base call sequence across the fifty sequencing cycles.
  • Each subpixel's base call sequence has fifty base calls, i.e., one base call for each of the fifty sequencing cycles.
  • the searcher 118 evaluates base call sequences of contiguous subpixels on a pair-wise basis.
  • the search involves evaluating each subpixel to determine with which of its contiguous subpixels it shares a substantially matching base call sequence.
  • the base caller 114 also identifies preliminary center coordinates of the clusters. Subpixels that contain the preliminary center coordinates are referred to as center or origin subpixels. Some example preliminary center coordinates (604a-c) identified by the base caller 114 and corresponding origin subpixels (606a-c) are shown in Figure 6. However, identification of the origin subpixels (preliminary center coordinates of the clusters) is not needed, as explained below.
  • the searcher 118 uses a breadth-first search for identifying substantially matching base call sequences of the subpixels by beginning with the origin subpixels 606a-c and continuing with successively contiguous non-origin subpixels 702a-c. This again is optional, as explained below.
  • the search for substantially matching base call sequences of the subpixels does not need identification of the origin subpixels (preliminary center coordinates of the clusters) because the search can be done for all the subpixels and the search does not have to start from the origin subpixels and instead can start from any subpixel (e.g., 0,0 subpixel or any random subpixel).
  • the search since each subpixel is evaluated to determine whether it shares a substantially matching base call sequence with another contiguous subpixel, the search does not have to utilize the origin subpixels and can start with any subpixel.
  • whether origin subpixels are used or not, certain clusters are identified that do not contain the origin subpixels (preliminary center coordinates of the clusters) predicted by the base caller 114.
  • Some examples of clusters identified by the merging of the subpixel base calls and not containing an origin subpixel are clusters 812a, 812b, 812c, 812d, and 812e in Figure 8a. Therefore, use of the base caller 114 for identification of origin subpixels (preliminary center coordinates of the clusters) is optional and not essential for the search of substantially matching base call sequences of the subpixels.
  • the searcher 118 (1) identifies contiguous subpixels with substantially matching base call sequences as so-called“disjointed regions”, (2) further evaluates base call sequences of those subpixels that do not belong to any of the disjointed regions already identified at (1) to yield additional disjointed regions, and (3) then identifies background subpixels as those subpixels that do not belong to any of the disjointed regions already identified at (1) and (2).
  • Action (2) allows the technology disclosed to identify additional or extra clusters for which the centers are not identified by the base caller 114.
  • the results of the searcher 118 are encoded in a so-called “cluster map” of tile A and stored in the cluster map data store 120.
  • each of the clusters on tile A is identified by a respective disjointed region of contiguous subpixels, with background subpixels separating the disjointed regions to identify the surrounding background on tile A.
  • the center of mass (COM) calculator 1004 determines a center for each of the clusters on tile A by calculating a COM of each of the disjointed regions as an average of coordinates of respective contiguous subpixels forming the disjointed regions.
  • the centers of mass of the clusters are stored as COM data 2502.
  • a subpixel categorizer 2504 uses the cluster map and the COM data 2502 to produce subpixel categorizations 2506.
  • the subpixel categorizations 2506 classify subpixels in the cluster map as (1) background subpixels, (2) COM subpixels (one COM subpixel for each disjointed region, containing the COM of the respective disjointed region), and (3) cluster/cluster interior subpixels forming the respective disjointed regions. That is, each subpixel in the cluster map is assigned one of the three categories.
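The COM calculation and the three-way subpixel categorization can be illustrated with the following Python sketch (NumPy); the representation of the cluster map as an integer-labeled array and the numeric category codes are assumptions made only for illustration.

import numpy as np

def centers_of_mass(cluster_map):
    # cluster_map: subpixel-resolution array, 0 for background subpixels and a unique
    # positive cluster id for each disjointed region of contiguous subpixels.
    coms = {}
    for cluster_id in np.unique(cluster_map):
        if cluster_id == 0:
            continue                                    # skip background subpixels
        rows, cols = np.nonzero(cluster_map == cluster_id)
        coms[cluster_id] = (rows.mean(), cols.mean())   # average of subpixel coordinates
    return coms

def categorize_subpixels(cluster_map, coms):
    # 0 = background subpixel, 1 = cluster/cluster interior subpixel, 2 = COM subpixel.
    categories = np.where(cluster_map > 0, 1, 0)
    for row, col in coms.values():
        categories[int(round(row)), int(round(col))] = 2
    return categories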
  • the ground truth decay map 1204 is produced by the ground truth decay map generator 1202
  • the ground truth ternary map 1304 is produced by the ground truth ternary map generator 1302
  • the ground truth binary map 1404 is produced by the ground truth binary map generator 1402.
  • Figure 26 illustrates one implementation of the regression model 2600.
  • the regression model 2600 is a fully convolutional network 2602 that processes the input image data 1702 through an encoder subnetwork and a corresponding decoder subnetwork.
  • the encoder subnetwork includes a hierarchy of encoders.
  • the decoder subnetwork includes a hierarchy of decoders that map low resolution encoder feature maps to a full input resolution decay map 1716.
  • the regression model 2600 is a U-Net network 2604 with skip connections between the decoder and the encoder. Additional details about the segmentation networks can be found in the Appendix entitled“Segmentation Networks”.
  • Figure 27 depicts one implementation of generating a ground truth decay map 1204 from a cluster map 2702.
  • the ground truth decay map 1204 is used as ground truth data for training the regression model 2600.
  • the ground truth decay map generator 1202 assigns a weighted decay value to each contiguous subpixel in the disjointed regions based on a weighted decay factor.
  • the weighted decay value is proportional to Euclidean distance of a contiguous subpixel from a center of mass (COM) subpixel in a disjointed region to which the contiguous subpixel belongs, such that the weighted decay value is highest (e.g., 1 or 100) for the COM subpixel and decreases for subpixels further away from the COM subpixel.
  • the weighted decay value is multiplied by a preset factor, such as 100.
  • the ground truth decay map generator 1202 assigns all background subpixels a same predetermined value (e.g., a minimalist background value).
  • the ground truth decay map 1204 expresses the contiguous subpixels in the disjointed regions and the background subpixels based on the assigned values.
  • the ground truth decay map 1204 also stores the assigned values in an array of units, with each unit in the array representing a corresponding subpixel in the input.
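A possible rendering of the ground truth decay map generation in Python (NumPy) follows; the specific decay formula and the background value are illustrative stand-ins for the weighted decay factor described above, not the disclosed formula.

import numpy as np

def ground_truth_decay_map(cluster_map, coms, background_value=0.0, scale=100.0):
    # cluster_map: 0 for background subpixels, a unique positive id per disjointed region.
    # coms: {cluster_id: (row_com, col_com)} as produced by a COM calculator.
    decay = np.full(cluster_map.shape, background_value, dtype=float)
    for cluster_id, (com_row, com_col) in coms.items():
        rows, cols = np.nonzero(cluster_map == cluster_id)
        distance = np.hypot(rows - com_row, cols - com_col)   # Euclidean distance from the COM subpixel
        decay[rows, cols] = scale / (1.0 + distance)          # highest at the COM, decreasing with distance
    return decay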
  • Figure 28 is one implementation of training 2800 the regression model 2600 using a backpropagation-based gradient update technique that modifies parameters of the regression model 2600 until the decay map 1716 produced by the regression model 2600 as training output during the training 2800 progressively approaches or matches the ground truth decay map 1204.
  • the training 2800 includes iteratively optimizing a loss function that minimizes error 2806 between the decay map 1716 and the ground truth decay map 1204, and updating parameters of the regression model 2600 based on the error 2806.
  • the loss function is mean squared error and the error is minimized on a subpixel-by-subpixel basis between weighted decay values of corresponding subpixels in the decay map 1716 and the ground truth decay map 1204.
  • the training 2800 includes hundreds, thousands, and/or millions of iterations of forward propagation 2808 and backward propagation 2810, including parallelization techniques such as batching.
  • the training data 1504 includes, as the input image data 1702, a series of upsampled and down-sized image sets.
  • the training data 1504 is annotated with ground truth labels by an annotator 2806.
  • the training 2800 is operationalized by the trainer 1510 using a stochastic gradient update algorithm such as ADAM.
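The training procedure described above can be pictured as a standard supervised loop; the following Python sketch (assuming PyTorch) minimizes mean squared error with the Adam optimizer, and the data-loader interface and function name are hypothetical.

import torch
import torch.nn as nn

def train_regression_model(model, loader, epochs=1, learning_rate=1e-3):
    # loader yields (input_image_data, ground_truth_decay_map) pairs.
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    loss_fn = nn.MSELoss()                        # subpixel-wise mean squared error
    for _ in range(epochs):
        for inputs, gt_decay_map in loader:
            optimizer.zero_grad()
            decay_map = model(inputs)             # forward propagation
            loss = loss_fn(decay_map, gt_decay_map)
            loss.backward()                       # backward propagation
            optimizer.step()                      # gradient update of model parameters
    return model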
  • Figure 29 is one implementation of template generation by the regression model 2600 during inference 2900 in which the decay map 1716 is produced by the regression model 2600 as the inference output during the inference 2900.
  • One example of the decay map 1716 is disclosed in the Appendix titled “Regression_Model_Sample_Output”.
  • the Appendix includes unit-wise weighted decay output values 2910 that together represent the decay map 1716.
  • the inference 2900 includes hundreds, thousands, and/or millions of iterations of forward propagation 2904, including parallelization techniques such as batching.
  • the inference 2900 is performed on inference data 2908 that includes, as the input image data 1702, a series of upsampled and down-sized image sets.
  • the inference 2900 is operationalized by a tester 2906.
  • Figure 30 illustrates one implementation of subjecting the decay map 1716 to (i) thresholding to identify background subpixels characterizing cluster background and to (ii) peak detection to identify center subpixels characterizing cluster centers.
  • the thresholding is performed by the thresholder 1802, which uses a local threshold to produce binarized output.
  • the peak detection is performed by the peak locator 1806 to identify the cluster centers. Additional details about the peak locator can be found in the Appendix entitled“Peak Detection”.
  • Figure 31 depicts one implementation of a watershed segmentation technique that takes as input the background subpixels and the center subpixels respectively identified by the thresholder 1802 and the peak locator 1806, finds valleys in intensity between adjoining clusters, and outputs non-overlapping groups of contiguous cluster/cluster interior subpixels characterizing the clusters. Additional details about the watershed segmentation technique can be found in the Appendix entitled “Watershed Segmentation”.
  • a watershed segmenter 3102 takes as input (1) negativized output values 2910 in the decay map 1716, (2) binarized output of the thresholder 1802, and (3) cluster centers identified by the peak locator 1806. Then, based on the input, the watershed segmenter 3102 produces output 3104. In the output 3104, each cluster center is identified as a unique set/group of subpixels that belong to the cluster center (as long as the subpixels are “1” in the binary output, i.e., not background subpixels). Further, the clusters are filtered based on containing at least four subpixels.
  • the watershed segmenter 3102 can be part of the segmenter 1810, which in turn is part of the post-processor 1814.
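The thresholding, peak detection, and watershed steps applied to a predicted decay map can be pictured with the following Python sketch (assuming SciPy-style arrays and scikit-image); the threshold, the minimum peak distance, and the four-subpixel filter are illustrative parameters inspired by the description above, not the disclosed settings.

import numpy as np
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def post_process_decay_map(decay_map, background_threshold=0.5, min_cluster_size=4):
    foreground = decay_map > background_threshold                # thresholder: non-background units
    peaks = peak_local_max(decay_map, min_distance=1,
                           threshold_abs=background_threshold)   # peak locator: cluster centers
    markers = np.zeros(decay_map.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)       # one marker per cluster center
    labels = watershed(-decay_map, markers, mask=foreground)     # split adjoining clusters at valleys
    for region in range(1, int(labels.max()) + 1):               # drop clusters with fewer than 4 subpixels
        if (labels == region).sum() < min_cluster_size:
            labels[labels == region] = 0
    return labels, peaks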
  • Figure 32 is a table that shows an example U-Net architecture of the regression model 2600, along with details of the layers of the regression model 2600, dimensionality of the output of the layers, magnitude of the model parameters, and interconnections between the layers. Similar details are disclosed in the file titled“Regression_Model_Example_Architecture”, which is submitted as an appendix to this application.
  • Figure 33 illustrates different approaches of extracting cluster intensity using cluster shape information identified in a template image.
  • the template image identifies the cluster shape information in the upsampled, subpixel resolution.
  • the cluster intensity information is in the sequencing images 108, which are typically in the optical, pixel-resolution.
  • coordinates of the subpixels are located in the sequencing images 108 and their respective intensities extracted using bilinear interpolation and normalized based on a count of the subpixels that contribute to a cluster.
  • the second approach uses a weighted area coverage technique to modulate the intensity of a pixel according to a number of subpixels that contribute to the pixel.
  • the modulated pixel intensity is normalized by a subpixel count parameter.
  • the third approach upsamples the sequencing images into the subpixel domain using bicubic interpolation, sums the intensity of the upsampled pixels belonging to a cluster, and normalizes the summed intensity based on a count of the upsampled pixels that belong to the cluster.
  • Figure 34 shows different approaches of base calling using the outputs of the regression model 2600.
  • the cluster centers identified from the output of the neural network-based template generator 1512 in the template image are fed to a base caller (e.g., Illumina's Real-Time Analysis software, referred to herein as“RTA base caller”) for base calling.
  • Figure 35 illustrates the difference in base calling performance when the RTA base caller uses the ground truth center of mass (COM) location as the cluster center, as opposed to using a non-COM location as the cluster center. The results show that using the COM improves base calling.
  • Figure 36 shows, on the left, an example decay map 1716 produced by the regression model 2600. On the right, Figure 36 also shows an example ground truth decay map 1204 that the regression model 2600 approximates during the training.
  • Both the decay map 1716 and the ground truth decay map 1204 depict clusters as disjointed regions of contiguous subpixels, the centers of the clusters as center subpixels at centers of mass of the respective ones of the disjointed regions, and their surrounding background as background subpixels not belonging to any of the disjointed regions.
  • the contiguous subpixels in the respective ones of the disjointed regions have values weighted according to distance of a contiguous subpixel from a center subpixel in a disjointed region to which the contiguous subpixel belongs.
  • the center subpixels have the highest values within the respective ones of the disjointed regions.
  • the background subpixels all have a same minimalist background value within a decay map.
  • Figure 37 portrays one implementation of the peak locator 1806 identifying cluster centers in a decay map by detecting peaks 3702. Additional details about the peak locator can be found in the Appendix entitled“Peak Detection”.
  • Figure 38 compares peaks detected by the peak locator 1806 in the decay map 1716 produced by the regression model 2600 with peaks in a corresponding ground truth decay map 1204.
  • the red markers are peaks predicted by the regression model 2600 as cluster centers and the green markers are the ground truth centers of mass of the clusters.
  • Figure 39 illustrates performance of the regression model 2600 using precision and recall statistics.
  • the precision and recall statistics demonstrate that the regression model 2600 is good at recovering all identified cluster centers.
  • Figure 40 compares performance of the regression model 2600 with the RTA base caller for 20pM library concentration (normal run). Outperforming the RTA base caller, the regression model 2600 identifies 34,323 (4.46%) more clusters in a higher cluster density environment (i.e., 988,884 clusters).
  • Figure 40 also shows results for other sequencing metrics such as number of clusters that pass the chastity filter (“% PF” (pass-filter)), number of aligned reads (“% Aligned”), number of duplicate reads (“% Duplicate”), number of reads mismatching the reference sequence for all reads aligned to the reference sequence (“% Mismatch”), bases called with quality score 30 and above (“% Q30 bases”), and so on.
  • Figure 41 compares performance of the regression model 2600 with the RTA base caller for 30pM library concentration (dense run). Outperforming the RTA base caller, the regression model 2600 identifies 34,323 (6.27%) more clusters in a much higher cluster density environment (i.e., 1,351,588 clusters).
  • Figure 41 also shows results for other sequencing metrics such as number of clusters that pass the chastity filter (“% PF” (pass-filter)), number of aligned reads (“% Aligned”), number of duplicate reads (“% Duplicate”), number of reads mismatching the reference sequence for all reads aligned to the reference sequence (“% Mismatch”), bases called with quality score 30 and above (“% Q30 bases”), and so on.
  • Figure 42 compares number of non-duplicate (unique or deduplicated) proper read pairs, i.e., the number of paired reads that have both reads aligned inwards within a reasonable distance detected by the regression model 2600 versus the same detected by the RTA base caller. The comparison is made both for the 20pM normal run and the 30pM dense run.
  • Figure 42 shows that the disclosed neural network-based template generators are able to detect more clusters in fewer sequencing cycles of input to template generation than the RTA base caller.
  • the regression model 2600 identifies 11% more non-duplicate proper read pairs than the RTA base caller during the 20pM normal run and 33% more non-duplicate proper read pairs than the RTA base caller during the 30pM dense run.
  • the regression model 2600 identifies 4.5% more non-duplicate proper read pairs than the RTA base caller during the 20pM normal run and 6.3% more non-duplicate proper read pairs than the RTA base caller during the 30pM dense run.
  • Figure 43 shows, on the right, a first decay map produced by the regression model 2600.
  • the first decay map identifies clusters and their surrounding background imaged during the 20pM normal run, along with their spatial distribution depicting cluster shapes, cluster sizes, and cluster centers.
  • Figure 43 shows a second decay map produced by the regression model 2600.
  • the second decay map identifies clusters and their surrounding background imaged during the 30pM dense run, along with their spatial distribution depicting cluster shapes, cluster sizes, and cluster centers.
  • Figure 44 compares performance of the regression model 2600 with the RTA base caller for 40pM library concentration (highly dense run).
  • the regression model 2600 produced 89,441,688 more aligned bases than the RTA base caller in a much higher cluster density environment (i.e., 1,509,395 clusters).
  • Figure 44 also shows results for other sequencing metrics such as number of clusters that pass the chastity filter (“% PF” (pass-filter)), number of aligned reads (“% Aligned”), number of duplicate reads (“% Duplicate”), number of reads mismatching the reference sequence for all reads aligned to the reference sequence (“% Mismatch”), bases called with a quality score 30 and above (“% Q30 bases”), and so on.
  • Figure 45 shows, on the left, a first decay map produced by the regression model 2600.
  • the first decay map identifies clusters and their surrounding background imaged during the 40pM run, along with their spatial distribution depicting cluster shapes, cluster sizes, and cluster centers.
  • Figure 45 shows the results of the thresholding and the peak locating applied to the first decay map to distinguish the respective clusters from each other and from the background and to identify their respective cluster centers.
  • intensities of the respective clusters are identified and a chastity filter (or passing filter) applied to reduce the mismatch rate.
  • Figure 46 illustrates one implementation of the binary classification model 4600.
  • the binary classification model 4600 is a deep fully convolutional segmentation neural network that processes the input image data 1702 through an encoder subnetwork and a corresponding decoder subnetwork.
  • the encoder subnetwork includes a hierarchy of encoders.
  • the decoder subnetwork includes a hierarchy of decoders that map low resolution encoder feature maps to a full input resolution binary map 1720.
  • the binary classification model 4600 is a U-Net network with skip connections between the decoder and the encoder. Additional details about the segmentation networks can be found in the Appendix entitled“Segmentation Networks”.
  • the final output layer of the binary classification model 4600 is a unit-wise classification layer that produces a classification label for each unit in an output array.
  • the unit-wise classification layer is a subpixel-wise classification layer that produces a softmax classification score distribution for each subpixel in the binary map 1720 across two classes, namely, a cluster center class and a non-center class, and the classification label for a given subpixel is determined from the corresponding softmax classification score distribution.
  • the unit-wise classification layer is a subpixel-wise classification layer that produces a sigmoid classification score for each subpixel in the binary map 1720, such that the activation of a unit is interpreted as the probability that the unit belongs to the first class and, conversely, one minus the activation gives the probability that it belongs to the second class.
  • the binary map 1720 expresses each subpixel based on the predicted classification scores.
  • the binary map 1720 also stores the predicted value classification scores in an array of units, with each unit in the array representing a corresponding subpixel in the input.
  • Figure 47 is one implementation of training 4700 the binary classification model 4600 using a backpropagation-based gradient update technique that modifies parameters of the binary classification model 4600 until the binary map 1720 of the binary classification model 4600 progressively approaches or matches the ground truth binary map 1404.
  • the final output layer of the binary classification model 4600 is a softmax-based subpixel-wise classification layer.
  • the ground truth binary map generator 1402 assigns each ground truth subpixel either (i) a cluster center value pair (e.g., [1, 0]) or (ii) a non-center value pair (e.g., [0, 1]).
  • a first value [1] represents the cluster center class label and a second value [0] represents the noncenter class label.
  • a first value [0] represents the cluster center class label and a second value [1] represents the non-center class label.
  • the ground truth binary map 1404 expresses each subpixel based on the assigned value pair/value.
  • the ground truth binary map 1404 also stores the assigned value pairs/values in an array of units, with each unit in the array representing a corresponding subpixel in the input.
  • the training includes iteratively optimizing a loss function that minimizes error 4706 (e.g., softmax error) between the binary map 1720 and the ground truth binary map 1404, and updating parameters of the binary classification model 4600 based on the error 4706.
  • the loss function is a custom-weighted binary cross-entropy loss and the error 4706 is minimized on a subpixel-by-subpixel basis between predicted classification scores (e.g., softmax scores) and labelled class scores (e.g., softmax scores) of corresponding subpixels in the binary map 1720 and the ground truth binary map 1404, as shown in Figure 47.
  • the custom-weighted loss function gives more weight to the COM subpixels, such that the cross-entropy loss is multiplied by a corresponding reward (or penalty) weight specified in a reward (or penalty) matrix whenever a COM subpixel is misclassified. Additional details about the custom-weighted loss function can be found in the Appendix entitled“Custom-Weighted Loss Function”.
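A custom-weighted binary cross-entropy of the kind described above can be sketched as follows in Python (NumPy); the weight value and the boolean COM mask standing in for the reward/penalty matrix are assumptions made only for illustration.

import numpy as np

def weighted_binary_cross_entropy(predicted_center_prob, gt_center, com_mask,
                                  com_weight=10.0, eps=1e-7):
    # predicted_center_prob, gt_center: per-subpixel scores in [0, 1];
    # com_mask: True where the ground truth marks a COM subpixel.
    pred = np.clip(predicted_center_prob, eps, 1.0 - eps)
    cross_entropy = -(gt_center * np.log(pred) + (1.0 - gt_center) * np.log(1.0 - pred))
    weights = np.where(com_mask, com_weight, 1.0)   # heavier penalty for misclassified COM subpixels
    return (weights * cross_entropy).mean()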
  • the training 4700 includes hundreds, thousands, and/or millions of iterations of forward propagation 4708 and backward propagation 4710, including parallelization techniques such as batching.
  • the training data 1504 includes, as the input image data 1702, a series of upsampled and down-sized image sets.
  • the training data 1504 is annotated with ground truth labels by the annotator 2806.
  • the training 4700 is operationalized by the trainer 1510 using a stochastic gradient update algorithm such as ADAM.
  • Figure 48 is another implementation of training 4800 the binary classification model 4600, in which the final output layer of the binary classification model 4600 is a sigmoid-based subpixel-wise classification layer.
  • the ground truth binary map generator 1402 assigns each ground truth subpixel either (i) a cluster center value (e.g., [1]) or (ii) a non-center value (e.g., [0]).
  • the COM subpixels are assigned the cluster center value pair/value and all other subpixels are assigned the non-center value pair/value.
  • values above a threshold intermediate value between 0 and 1 represent the center class label.
  • values below a threshold intermediate value between 0 and 1 represent the noncenter class label.
  • the ground truth binary map 1404 expresses each subpixel based on the assigned value pair/value.
  • the ground truth binary map 1404 also stores the assigned value pairs/values in an array of units, with each unit in the array representing a corresponding subpixel in the input.
  • the training includes iteratively optimizing a loss function that minimizes error 4806 (e.g., sigmoid error) between the binary map 1720 and the ground truth binary map 1404, and updating parameters of the binary classification model 4600 based on the error 4806.
  • the loss function is a custom-weighted binary cross-entropy loss and the error 4806 is minimized on a subpixel-by-subpixel basis between predicted scores (e.g., sigmoid scores) and labelled scores (e.g., sigmoid scores) of corresponding subpixels in the binary map 1720 and the ground truth binary map 1404, as shown in Figure 48.
  • the custom-weighted loss function gives more weight to the COM subpixels, such that the cross-entropy loss is multiplied by a corresponding reward (or penalty) weight specified in a reward (or penalty) matrix whenever a COM subpixel is misclassified. Additional details about the custom-weighted loss function can be found in the Appendix entitled“Custom-Weighted Loss Function”.
  • the training 4800 includes hundreds, thousands, and/or millions of iterations of forward propagation 4808 and backward propagation 4810, including parallelization techniques such as batching.
  • the training data 1504 includes, as the input image data 1702, a series of upsampled and down-sized image sets.
  • the training data 1504 is annotated with ground truth labels by the annotator 2806.
  • the training 4800 is operationalized by the trainer 1510 using a stochastic gradient update algorithm such as ADAM.
  • Figure 49 illustrates another implementation of the input image data 1702 fed to the binary classification model 4600 and the corresponding class labels 4904 used to train the binary classification model 4600.
  • the input image data 1702 comprises a series of upsampled and down-sized image sets 4902.
  • the class labels 4904 comprise two classes: (1)“no cluster center” and (2)“cluster center”, which are distinguished using different output values. That is, (1) the light green units/subpixels 4906 represent subpixels that are predicted by the binary classification model 4600 to not contain the cluster centers and (2) the dark green subpixels 4908 represent units/subpixels that are predicted by the binary classification model 4600 to contain the cluster centers.
  • Figure 50 is one implementation of template generation by the binary classification model 4600 during inference 5000 in which the binary map 1720 is produced by the binary classification model 4600 as the inference output during the inference 5000.
  • the binary map 1720 includes unit-wise binary classification scores 5010 that together represent the binary map 1720.
  • the binary map 1720 has a first array 5002a of unit-wise classification scores for the non-center class and a second array 5002b of unit-wise classification scores for the cluster center class.
  • the inference 5000 includes hundreds, thousands, and/or millions of iterations of forward propagation 5004, including parallelization techniques such as batching.
  • the inference 5000 is performed on inference data 2908 that includes, as the input image data 1702, a series of upsampled and down-sized image sets.
  • the inference 5000 is operationalized by the tester 2906.
  • the binary map 1720 is subjected to post-processing techniques discussed above, such as thresholding, peak detection, and/or watershed segmentation to generate cluster metadata.
  • Figure 51 depicts one implementation of subjecting the binary map 1720 to peak detection to identify cluster centers.
  • the binary map 1720 is an array of units that classifies each subpixel based on the predicted classification scores, with each unit in the array representing a corresponding subpixel in the input.
  • the classification scores can be softmax scores or sigmoid scores.
  • the binary map 1720 includes two arrays: (1) a first array 5002a of unit-wise classification scores for the non-center class and (2) a second array 5002b of unit-wise classification scores for the cluster center class. In both the arrays, each unit represents a corresponding subpixel in the input.
  • the peak locator 1806 applies peak detection on the units in the binary map 1720.
  • the peak detection identifies those units that have classification scores (e.g., softmax/sigmoid scores) above a preset threshold.
  • the identified units are inferred as the cluster centers and their corresponding subpixels in the input are determined to contain the cluster centers and stored as cluster center subpixels in a subpixel classifications data store 5102. Additional details about the peak locator 1806 can be found in the Appendix entitled“Peak Detection”.
  • the remaining units and their corresponding subpixels in the input are determined to not contain the cluster centers and stored as noncenter subpixels in the subpixel classifications data store 5102.
  • those units that have classification scores below a certain background threshold are set to zero.
  • such units and their corresponding subpixels in the input are inferred to denote the background surrounding the clusters and stored as background subpixels in the subpixel classifications data store 5102. In other implementations, such units can be considered noise and ignored.
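As an illustration of the thresholding and peak detection described above, the following is a minimal sketch, assuming the binary map's cluster-center scores are available as a 2-D NumPy array; the threshold values and function names are hypothetical, not taken from the specification.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def locate_cluster_centers(center_scores, peak_threshold=0.5, background_threshold=0.3):
    """Threshold and peak-detect a binary map of cluster-center scores.

    center_scores: 2-D array of per-subpixel classification scores
    (e.g., softmax/sigmoid scores) for the cluster center class.
    """
    scores = center_scores.copy()
    # Units below the background threshold are treated as background/noise and set to zero.
    scores[scores < background_threshold] = 0.0
    # A unit is a peak if it is the local maximum of its 3x3 neighborhood
    # and its score exceeds the preset peak threshold.
    local_max = maximum_filter(scores, size=3) == scores
    peaks = local_max & (scores > peak_threshold)
    # (row, col) indices of subpixels inferred to contain cluster centers.
    return np.argwhere(peaks)
```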
  • Figure 52a shows, on the left, an example binary map produced by the binary classification model 4600. On the right, Figure 52a also shows an example ground truth binary map that the binary classification model 4600 approximates during the training.
  • the binary map has a plurality of subpixels and classifies each subpixel as either a cluster center or a non-center.
  • the ground truth binary map has a plurality of subpixels and classifies each subpixel as either a cluster center or a non-center.
  • Figure 52b illustrates performance of the binary classification model 4600 using recall and precision statistics. Applying these statistics, the binary classification model 4600 outperforms the RTA base caller.
  • Figure 53 is a table that shows an example architecture of the binary classification model 4600, along with details of the layers of the binary classification model 4600, dimensionality of the output of the layers, magnitude of the model parameters, and interconnections between the layers. Similar details are disclosed in the Appendix titled“Binary_Classification_Model_Example_Architecture”.
  • Figure 54 illustrates one implementation of the ternary classification model 5400.
  • the ternary classification model 5400 is a deep fully convolutional segmentation neural network that processes the input image data 1702 through an encoder subnetwork and a corresponding decoder subnetwork.
  • the encoder subnetwork includes a hierarchy of encoders.
  • the decoder subnetwork includes a hierarchy of decoders that map low resolution encoder feature maps to a full input resolution ternary map 1718.
  • the ternary classification model 5400 is a U-Net network with skip connections between the decoder and the encoder. Additional details about the segmentation networks can be found in the Appendix entitled“Segmentation Networks”.
  • the final output layer of the ternary classification model 5400 is a unit-wise classification layer that produces a classification label for each unit in an output array.
  • the unit-wise classification layer is a subpixel-wise classification layer that produces a softmax classification score distribution for each subpixel in the ternary map 1718 across three classes, namely, a background class, a cluster center class, and a cluster/cluster interior class, and the classification label for a given subpixel is determined from the corresponding softmax classification score distribution.
  • the ternary map 1718 expresses each subpixel based on the predicted classification scores.
  • the ternary map 1718 also stores the predicted classification scores in an array of units, with each unit in the array representing a corresponding subpixel in the input.

Training
  • Figure 55 is one implementation of training 5500 the ternary classification model 5400 using a backpropagation-based gradient update technique that modifies parameters of the ternary classification model 5400 until the ternary map 1718 of the ternary classification model 5400 progressively approaches or matches training ground truth ternary maps 1304.
  • the final output layer of the ternary classification model 5400 is a softmax-based subpixel-wise classification layer.
  • the ground truth ternary map generator 1402 assigns each ground truth subpixel either (i) a background value triplet (e.g., [1, 0, 0]), (ii) a cluster center value triplet (e.g., [0, 1, 0]), or (iii) a cluster/cluster interior value triplet (e.g., [0, 0, 1]).
  • the background subpixels are assigned the background value triplet.
  • the center of mass (COM) subpixels are assigned the cluster center value triplet.
  • the cluster/cluster interior subpixels are assigned the cluster/cluster interior value triplet.
  • in the background value triplet, the first value [1] represents the background class label, the second value [0] represents the cluster center class label, and the third value [0] represents the cluster/cluster interior class label.
  • in the cluster center value triplet, the first value [0] represents the background class label, the second value [1] represents the cluster center class label, and the third value [0] represents the cluster/cluster interior class label.
  • in the cluster/cluster interior value triplet, the first value [0] represents the background class label, the second value [0] represents the cluster center class label, and the third value [1] represents the cluster/cluster interior class label.
  • the ground truth ternary map 1304 expresses each subpixel based on the assigned value triplet.
  • the ground truth ternary map 1304 also stores the assigned triplets in an array of units, with each unit in the array representing a corresponding subpixel in the input.
  • the training includes iteratively optimizing a loss function that minimizes error 5506 (e.g., softmax error) between the ternary map 1718 and the ground truth ternary map 1304, and updating parameters of the ternary classification model 5400 based on the error 5506.
  • the loss function is a custom-weighted categorical cross-entropy loss and the error 5506 is minimized on a subpixel-by-subpixel basis between predicted classification scores (e.g., softmax scores) and labelled class scores (e.g., softmax scores) of corresponding subpixels in the ternary map 1718 and the ground truth ternary map 1304, as shown in Figure 54.
  • the custom-weighted loss function gives more weight to the COM subpixels, such that the cross-entropy loss is multiplied by a corresponding reward (or penalty) weight specified in a reward (or penalty) matrix whenever a COM subpixel is misclassified. Additional details about the custom-weighted loss function can be found in the Appendix entitled“Custom-Weighted Loss Function”.
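A minimal NumPy sketch of such a custom-weighted categorical cross-entropy follows, assuming the reward (or penalty) matrix reduces to one weight per class with the cluster center (COM) class up-weighted; the weight values and function names are illustrative only.

```python
import numpy as np

def weighted_cross_entropy(predicted, ground_truth, class_weights=(1.0, 10.0, 1.0)):
    """Custom-weighted categorical cross-entropy over a ternary map.

    predicted: (H, W, 3) softmax scores for (background, cluster center, cluster interior).
    ground_truth: (H, W, 3) one-hot value triplets from the ground truth ternary map.
    class_weights: per-class weights; the COM (cluster center) class gets a
    larger weight so that misclassifying a COM subpixel incurs a larger loss.
    """
    eps = 1e-7
    per_subpixel = -np.sum(ground_truth * np.log(predicted + eps), axis=-1)
    # Each subpixel's loss is scaled by the weight of its true class.
    weights = np.sum(ground_truth * np.asarray(class_weights), axis=-1)
    return np.mean(weights * per_subpixel)
```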
  • the training 5500 includes hundreds, thousands, and/or millions of iterations of forward propagation 5508 and backward propagation 5510, including parallelization techniques such as batching.
  • the training data 1504 includes, as the input image data 1702, a series of upsampled and down-sized image sets.
  • the training data 1504 is annotated with ground truth labels by the annotator 2806.
  • the training 5500 is operationalized by the trainer 1510 using a stochastic gradient update algorithm such as ADAM.
  • Figure 56 illustrates one implementation of input image data 1702 fed to the ternary classification model 5400 and the corresponding class labels used to train the ternary classification model 5400.
  • the input image data 1702 comprises a series of upsampled and down-sized image sets 5602.
  • the class labels 5604 comprise three classes: (1) "background class", (2) "cluster center class", and (3) "cluster interior class", which are distinguished using different output values.
  • some of these different output values can be visually represented as follows: (1) the grey units/subpixels 5606 represent subpixels that are predicted by the ternary classification model 5400 to be the background, (2) the dark green units/subpixels 5608 represent subpixels that are predicted by the ternary classification model 5400 to contain the cluster centers, and (3) the light green subpixels 5610 represent subpixels that are predicted by the ternary classification model 5400 to contain the interior of the clusters.
  • Figure 57 is a table that shows an example architecture of the ternary classification model 5400, along with details of the layers of the ternary classification model 5400, dimensionality of the output of the layers, magnitude of the model parameters, and interconnections between the layers. Similar details are disclosed in the Appendix titled "Ternary_Classification_Model_Example_Architecture".
  • Figure 58 is one implementation of template generation by the ternary classification model 5400 during inference 5800 in which the ternary map 1718 is produced by the ternary classification model 5400 as the inference output during the inference 5800.
  • One example of the ternary map 1718 is disclosed in the Appendix titled "Ternary_Classification_Model_Sample_Output".
  • the Appendix includes unit-wise classification scores 5810 that together represent the ternary map 1718.
  • the Appendix has a first array 5802a of unit-wise classification scores for the background class, a second array 5802b of unit-wise classification scores for the cluster center class, and a third array 5802c of unit-wise classification scores for the cluster/cluster interior class.
  • the inference 5800 includes hundreds, thousands, and/or millions of iterations of forward propagation 5804, including parallelization techniques such as batching.
  • the inference 5800 is performed on inference data 2908 that includes, as the input image data 1702, a series of upsampled and down-sized image sets.
  • the inference 5800 is operationalized by the tester 2906.
  • the ternary map 1718 produced by the ternary classification model 5400 is subjected to post-processing techniques discussed above, such as thresholding, peak detection, and/or watershed segmentation.
  • Figure 59 graphically portrays the ternary map 1718 produced by the ternary classification model 5400 in which each subpixel has a three-way softmax classification score distribution for the three corresponding classes, namely, the background class 5906, the cluster center class 5902, and the cluster/cluster interior class 5904.
  • Figure 60 depicts an array of units produced by the ternary classification model 5400, along with the unit-wise output values.
  • each unit has three output values for the three corresponding classes, namely, the background class 5906, the cluster center class 5902, and the cluster/cluster interior class 5904.
  • each unit is assigned the class that has the highest output value, as indicated by the class in parenthesis under each unit.
  • the output values 6002, 6004, and 6006 are analyzed for each of the respective classes 5906, 5902, and 5904 (row-wise).
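The per-unit class assignment of Figure 60 can be sketched in a few lines, assuming the unit-wise output values are held as an (H, W, 3) array ordered as background, cluster center, cluster/cluster interior; the function name is illustrative.

```python
import numpy as np

def assign_classes(output_values):
    """Assign each unit the class with the highest output value."""
    labels = np.array(["background", "cluster center", "cluster interior"])
    return labels[np.argmax(output_values, axis=-1)]  # (H, W) array of class names
```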
  • Figure 61 shows one implementation of subjecting the ternary map 1718 to post-processing to identify cluster centers, cluster background, and cluster interior.
  • the ternary map 1718 is an array of units that classifies each subpixel based on the predicted classification scores, with each unit in the array representing a corresponding subpixel in the input.
  • the classification scores can be softmax scores.
  • the ternary map 1718 includes three arrays: (1) a first array 5802a of unit-wise classification scores for the background class, (2) a second array 5802b of unit-wise classification scores for the cluster center class, and (3) a third array 5802c of unit-wise classification scores for the cluster interior class. In all three arrays, each unit represents a corresponding subpixel in the input.
  • the peak locator 1806 applies peak detection on softmax values in the ternary map 1718 for the cluster center class 5802b.
  • the peak detection identifies those units that have classification scores (e.g., softmax scores) above a preset threshold.
  • the identified units are inferred as the cluster centers and their corresponding subpixels in the input are determined to contain the cluster centers and stored as cluster center subpixels in a subpixel classifications and segmentations data store 6102. Additional details about the peak locator 1806 can be found in the Appendix entitled "Peak Detection".
  • those units that have classification scores below a certain noise threshold are set to zero. Such units can be considered noise and ignored.
  • units that have classification scores for the background class 5802a above a certain background threshold (e.g., equal to or greater than 0.5) are inferred to denote the background surrounding the clusters, and their corresponding subpixels are stored as background subpixels in the subpixel classifications and segmentations data store 6102.
  • the watershed segmentation algorithm, operationalized by the watershed segmenter 3102, is used to determine the shapes of the clusters.
  • the background units/subpixels are used as a mask by the watershed segmentation algorithm.
  • classification scores of the units/subpixels inferred as the cluster centers and the cluster interior are summed to produce so-called "cluster labels".
  • the cluster centers are used as watershed markers, for separation by intensity valleys by the watershed segmentation algorithm.
  • negativized cluster labels are provided as an input image to the watershed segmenter 3102 that performs segmentation and produces the cluster shapes as disjointed regions of contiguous cluster interior subpixels separated by the background subpixels. Furthermore, each disjointed region includes a corresponding cluster center subpixel. In some implementations, the corresponding cluster center subpixel is the center of the disjointed region to which it belongs. In other implementations, centers of mass (COM) of the disjointed regions are calculated based on the underlying location coordinates and stored as new centers of the clusters.
  • the outputs of the watershed segmenter 3102 are stored in the subpixel classifications and segmentations data store 6102. Additional details about the watershed segmentation algorithm and other segmentation algorithms can be found in Appendix entitled“Watershed Segmentation”.
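A minimal sketch of this post-processing chain, assuming scikit-image's watershed implementation and the peak-detected centers from above; the background threshold and function names are illustrative assumptions.

```python
import numpy as np
from skimage.segmentation import watershed

def segment_cluster_shapes(center_scores, interior_scores, background_scores,
                           center_peaks, background_threshold=0.5):
    """Derive disjointed cluster regions from ternary-map classification scores.

    center_peaks: (N, 2) array of (row, col) subpixels returned by peak detection.
    """
    # "Cluster labels": summed scores of the cluster center and cluster interior classes.
    cluster_labels = center_scores + interior_scores
    # Background subpixels form a mask that the segmentation may not cross.
    mask = background_scores < background_threshold
    # Cluster centers act as watershed markers.
    markers = np.zeros(cluster_labels.shape, dtype=int)
    markers[tuple(center_peaks.T)] = np.arange(1, len(center_peaks) + 1)
    # The negativized cluster labels serve as the input image whose intensity
    # valleys separate adjacent clusters into disjointed regions.
    return watershed(-cluster_labels, markers=markers, mask=mask)
```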
  • Example outputs of the peak locator 1806 and the watershed segmenter 3102 are shown in Figures 62a, 62b, 63, and 64.
  • Figure 62a shows example predictions of the ternary classification model 5400.
  • Figure 62a shows four maps and each map has an array of units.
  • the first map 6202 (left most) shows each unit's output values for the cluster center class 5802b.
  • the second map 6204 shows each unit's output values for the cluster/cluster interior class 5802c.
  • the third map 6206 (right most) shows each unit's output values for the background class 5802a.
  • the fourth map 6208 (bottom) is a binary mask of the ground truth ternary map 6008 that assigns each unit the class label that has the highest output value.
  • Figure 62b illustrates other example predictions of the ternary classification model 5400.
  • Figure 62b shows four maps and each map has an array of units.
  • the first map 6212 (bottom left most) shows each unit's output values for the cluster/cluster interior class.
  • the second map 6214 shows each unit's output values for the cluster center class.
  • the third map 6216 (bottom right most) shows each unit's output values for the background class.
  • the fourth map (top) 6210 is the ground truth ternary map that assigns each unit the class label that has the highest output value.
  • Figure 62c shows yet other example predictions of the ternary classification model 5400.
  • Figure 62c shows four maps and each map has an array of units.
  • the first map 6220 (bottom left most) shows each unit's output values for the cluster/cluster interior class.
  • the second map 6222 shows each unit's output values for the cluster center class.
  • the third map 6224 (bottom right most) shows each unit's output values for the background class.
  • the fourth map 6218 (top) is the ground truth ternary map that assigns each unit the class label that has the highest output value.
  • Figure 63 depicts one implementation of deriving the cluster centers and cluster shapes from the output of the ternary classification model 5400 in Figure 62a by subjecting the output to post-processing (e.g., peak locating, watershed segmentation).
  • Figure 64 compares performance of the binary classification model 4600, the regression model 2600, and the RTA base caller. The performance is evaluated using a variety of sequencing metrics. One metric is the total number of clusters detected ("# clusters"), which can be measured by the number of unique cluster centers that are detected. Another metric is the number of detected clusters that pass the chastity filter ("% PF" (pass-filter)). During cycles 1-25 of a sequencing run, the chastity filter removes the least reliable clusters from the image extraction results. Clusters "pass filter" if no more than one base call has a chastity value below 0.6 in the first 25 cycles.
  • Chastity is defined as the ratio of the brightest base intensity to the sum of the brightest and second brightest base intensities. This metric goes beyond the quantity of the detected clusters and also conveys their quality, i.e., how many of the detected clusters can be used for accurate base calling and downstream secondary and tertiary analysis such as variant calling and variant pathogenicity annotation.
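A short sketch of the chastity and pass-filter computation, assuming the per-cycle base intensities of one cluster are held as a (cycles, 4) array; the function name is illustrative.

```python
import numpy as np

def passes_chastity_filter(intensities, min_chastity=0.6, num_cycles=25):
    """Apply the pass-filter rule to a single cluster.

    intensities: (cycles, 4) array of base intensities (A, C, G, T) per cycle.
    A cluster passes if no more than one of its first 25 cycles has chastity
    below 0.6, where chastity is the brightest intensity divided by the sum
    of the brightest and second brightest intensities.
    """
    sorted_desc = np.sort(intensities[:num_cycles], axis=1)[:, ::-1]
    chastity = sorted_desc[:, 0] / (sorted_desc[:, 0] + sorted_desc[:, 1])
    return np.sum(chastity < min_chastity) <= 1
```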
  • Other metrics that measure how good the detected clusters are for downstream analysis include the number of aligned reads produced from the detected clusters (“% Aligned”), the number of duplicate reads produced from the detected clusters (“% Duplicate”), the number of reads produced from the detected clusters mismatching the reference sequence for all reads aligned to the reference sequence (“% Mismatch”), the number of reads produced from the detected clusters whose portions do not match well to the reference sequence on either side and thus are ignored for the alignment (“% soft clipped”), the number of bases called for the detected clusters with quality score 30 and above (“% Q30 bases”), the number of paired reads produced from the detected clusters that have both reads aligned inwards within a reasonable distance (“total proper read pairs”), and the number of unique or deduplicated proper read pairs produced from the detected clusters (“non-duplicate proper read pairs”).
  • both the binary classification model 4600 and the regression model 2600 outperform the RTA base caller at template generation on most of the metrics.
  • Figure 65 compares the performance of the ternary classification model 5400 with that of the RTA base caller under three contexts, five sequencing metrics, and two run densities.
  • the cluster centers are detected by the RTA base caller, the intensity extraction from the clusters is done by the RTA base caller, and the clusters are also base called using the RTA base caller.
  • the cluster centers are detected by the ternary classification model 5400; however, the intensity extraction from the clusters is done by the RTA base caller and the clusters are also base called using the RTA base caller.
  • the cluster centers are detected by the ternary classification model 5400 and the intensity extraction from the clusters is done using the cluster shape-based intensity extraction techniques disclosed herein (note that the cluster shape information is generated by the ternary classification model 5400); but the clusters are base called using the RTA base caller.
  • the performance is compared between the ternary classification model 5400 and the RTA base caller along five metrics: (1) the total number of clusters detected (“# clusters”), (2) the number of detected clusters that pass the chastity filter (“# PF”), (3) the number of unique or deduplicated proper read pairs produced from the detected clusters (“# nondup proper read pairs”), (4) the rate of mismatches between a sequence read produced from the detected clusters and a reference sequence after alignment (“%Mismatch rate”), and (5) bases called for the detected clusters with quality score 30 and above (“% Q30”).
  • the performance is compared between the ternary classification model 5400 and the RTA base caller under the three contexts and the five metrics for two types of sequencing runs: (1) a normal run with 20 pM library concentration and (2) a dense run with 30 pM library concentration.
  • As shown in Figure 65, the ternary classification model 5400 outperforms the RTA base caller on all the metrics.
  • Figure 67 focuses on the penultimate layer 6702 of the neural network-based template generator 1512.
  • Figure 68 visualizes what the penultimate layer 6702 of the neural network-based template generator 1512 has learned as a result of the backpropagation-based gradient update training.
  • the illustrated implementation visualizes twenty-four out of the thirty-two convolution filters of the penultimate layer 6702 overlaid on the ground truth cluster shapes.
  • the penultimate layer 6702 has learned the cluster metadata, including spatial distribution of the clusters such as cluster centers, cluster shapes, cluster sizes, cluster background, and cluster boundaries.
  • Figure 69 overlays cluster center predictions of the binary classification model 4600 (in blue) onto those of the RTA base caller (in pink). The predictions are made on sequencing image data from the Illumina NextSeq sequencer.
  • Figure 70 overlays cluster center predictions made by the RTA base caller (in pink) onto visualization of the trained convolution filters of the penultimate layer of the binary classification model 4600. These convolution filters are learned as a result of training on sequencing image data from the Illumina NextSeq sequencer.
  • Figure 71 illustrates one implementation of training data used to train the neural network-based template generator 1512.
  • the training data is obtained from dense flow cells that produce data with storm probe images.
  • the training data is obtained from dense flow cells that produce data with fewer bridge amplification cycles.
  • Figure 72 is one implementation of using beads for image registration based on cluster center predictions of the neural network-based template generator 1512.
  • Figure 73 illustrates one implementation of cluster statistics of clusters identified by the neural network-based template generator 1512.
  • the cluster statistics include cluster size based on number of contributive subpixels and GC-content.
  • Figure 74 shows how the neural network-based template generator 1512's ability to distinguish between adjacent clusters improves when the number of initial sequencing cycles for which the input image data 1702 is used increases from five to seven. For five sequencing cycles, a single cluster is identified by a single disjointed region of contiguous subpixels. For seven sequencing cycles, the single cluster is segmented into two adjacent clusters, each having their own disjointed regions of contiguous subpixels.
  • Figure 75 illustrates the difference in base calling performance when a RTA base caller uses ground truth center of mass (COM) location as the cluster center, as opposed to when a non-COM location is used as the cluster center.
  • Figure 76 portrays the performance of the neural network-based template generator 1512 on extra detected clusters.
  • Figure 77 shows different datasets used for training the neural network-based template generator 1512.
  • Figure 78 shows the processing stages used by the RTA base caller for base calling, according to one implementation.
  • Figure 78 also shows the processing stages used by the disclosed neural network-based base caller for base calling, according to two implementations.
  • the neural network-based base caller 1514 can streamline the base calling process by obviating many of the processing stages used by the RTA base caller. The streamlining improves base calling accuracy and scale.
  • in one implementation, the neural network-based base caller 1514 performs base calling using location/position information of cluster centers identified from the output of the neural network-based template generator 1512.
  • in another implementation, the neural network-based base caller 1514 does not use the location/position information of the cluster centers for base calling.
  • the second implementation is used when a patterned flow cell design is used for cluster generation.
  • the patterned flow cell contains nanowells that are precisely positioned relative to known fiducial locations and provide prearranged cluster distribution on the patterned flow cell.
  • the neural network-based base caller 1514 base calls clusters generated on random flow cells.
  • Figure 79 illustrates one implementation of base calling using the neural network 7906.
  • the main input to the neural network 7906 is image data 7902.
  • the image data 7902 is derived from the sequencing images 108 produced by the sequencer 102 during a sequencing run.
  • the image data 7902 comprises n x n image patches extracted from the sequencing images 108, where n is any number ranging from 1 to 10,000.
  • the sequencing run produces m image(s) per sequencing cycle for corresponding m image channels, and an image patch is extracted from each of the m image(s) to prepare the image data for a particular sequencing cycle.
  • m is 4 or 2.
  • m is 1, 3, or greater than 4.
  • the image data 7902 is in the optical, pixel domain in some implementations, and in the upsampled, subpixel domain in other implementations.
  • the image data 7902 comprises data for multiple sequencing cycles (e.g., a current sequencing cycle, one or more preceding sequencing cycles, and one or more successive sequencing cycles).
  • the image data 7902 comprises data for three sequencing cycles, such that data for a current (time t) sequencing cycle to be base called is accompanied with (i) data for a left flanking/context/previous/preceding/prior (time t-1) sequencing cycle and (ii) data for a right flanking/context/next/successive/subsequent (time t+1) sequencing cycle.
  • the image data 7902 comprises data for a single sequencing cycle.
  • the image data 7902 depicts intensity emissions of one or more clusters and their surrounding background.
  • the image patches are extracted from the sequencing images 108 in such a way that each image patch contains the center of the target cluster in its center pixel, a concept referred to herein as the“target cluster-centered patch extraction”.
  • the image data 7902 is encoded in the input data 7904 using intensity channels (also called image channels). For each of the m images obtained from the sequencer 102 for a particular sequencing cycle, a separate image channel is used to encode its intensity data.
  • the input data 7904 comprises (i) a first red image channel with n x n pixels that depict intensity emissions of the one or more clusters and their surrounding background captured in the red image and (ii) a second green image channel with n x n pixels that depict intensity emissions of the one or more clusters and their surrounding background captured in the green image.
  • the image data 7902 is accompanied with supplemental distance data (also called distance channels).
  • Distance channels supply additive bias that is incorporated in the feature maps generated from the image channels. This additive bias contributes to base calling accuracy because it is based on pixel center-to-cluster center(s) distances, which are pixel-wise encoded in the distance channels.
  • a supplemental distance channel identifies distances of its pixels’ centers from the center of a target cluster containing its center pixel and to be base called. The distance channel thereby indicates respective distances of pixels of an image patch from a center pixel of the image patch.
  • a supplemental distance channel identifies each pixel's center-to -center distance from a nearest one of the clusters selected based on center-to -center distances between the pixel and each of the clusters.
  • a supplemental distance channel identifies each cluster pixel's center-to -center distance from an assigned cluster selected based on classifying each cluster pixel to only one cluster.
  • the image data 7902 is accompanied with supplemental scaling data (also called scaling channel) that accounts for different cluster sizes and uneven illumination conditions.
  • Scaling channel also supplies additive bias that is incorporated in the feature maps generated from the image channels. This additive bias contributes to base calling accuracy because it is based on mean intensities of central cluster pixel(s), which are pixel-wise encoded in the scaling channel.
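A minimal sketch of how the per-cycle input channels could be assembled, assuming each channel is already available as an (n, n) array; the function and argument names are assumptions for illustration.

```python
import numpy as np

def build_cycle_input(red_patch, green_patch, red_distance, green_distance, scaling):
    """Stack the five per-cycle input channels into a single (n, n, 5) tensor.

    Arguments are (n, n) arrays: the red and green image channels, their
    respective pixel-wise distance channels, and the pixel-wise scaling channel.
    """
    return np.stack([red_patch, red_distance,
                     green_patch, green_distance,
                     scaling], axis=-1)
```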
  • the location/position information 7916 (e.g., x-y coordinates) of cluster center(s) identified from the output of the neural network-based template generator 1512 is fed as supplemental input to the neural network 7906.
  • the neural network 7906 receives, as supplemental input, cluster attribution information that classifies which pixels or subpixels are: background pixels or subpixels, cluster center pixels or subpixels, and cluster/cluster interior pixels or subpixels depicting/contributing to/belonging to a same cluster.
  • the decay map, the binary map, and/or the ternary map or a variation of those is fed as supplemental input to the neural network 7906.
  • the input data 7904 does not contain the distance channels, but instead the neural network 7906 receives, as input, modified image data that is modified based on the output of the neural network-based template generator 1512, i.e., the decay map, the binary map, and/or the ternary map.
  • the intensities of the image data 7902 are modified to account for the absence of the distance channels.
  • the image data 7902 is subjected to one or more lossless transformation operations (e.g., convolutions, deconvolutions, Fourier transforms) and the resulting modified image data is fed as input to the neural network 7906.
  • the neural network 7906 is also referred to herein as the“neural network-based base caller” 1514.
  • the neural network-based base caller 1514 is a multilayer perceptron (MLP).
  • the neural network-based base caller 1514 is a feedforward neural network.
  • the neural network-based base caller 1514 is a fully -connected neural network.
  • the neural network-based base caller 1514 is a fully convolutional neural network.
  • the neural network-based base caller 1514 is a semantic segmentation neural network.
  • the neural network-based base caller 1514 is a convolutional neural network (CNN) with a plurality of convolution layers.
  • in another implementation, the neural network-based base caller 1514 is a recurrent neural network (RNN), such as a long short-term memory network (LSTM), a bidirectional LSTM (Bi-LSTM), or a gated recurrent unit (GRU).
  • in yet another implementation, it includes both a CNN and an RNN.
  • the neural network-based base caller 1514 can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transpose convolutions, depthwise separable convolutions, pointwise convolutions, 1 x 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatial separable convolutions, and deconvolutions.
  • It can use one or more loss functions such as logistic regression/log loss, multiclass cross-entropy/softmax loss, binary cross-entropy loss, mean-squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss. It can use any parallelism, efficiency, and compression schemes such as TFRecords, compressed encoding (e.g., PNG), sharding, parallel calls for map transformation, batching, prefetching, model parallelism, data parallelism, and synchronous/asynchronous SGD.
  • It can include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (like an LSTM or GRU), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions like rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
  • the neural network-based base caller 1514 processes the input data 7904 and produces an alternative representation 7908 of the input data 7904.
  • the alternative representation 7908 is a convolved representation in some implementations and a hidden representation in other implementations.
  • the alternative representation 7908 is then processed by an output layer 7910 to produce an output 7912.
  • the output 7912 is used to produce the base call(s), as discussed below.
  • the neural network-based base caller 1514 outputs a base call for a single target cluster for a particular sequencing cycle. In another implementation, it outputs a base call for each target cluster in a plurality of target clusters for the particular sequencing cycle. In yet another implementation, it outputs a base call for each target cluster in a plurality of target clusters for each sequencing cycle in a plurality of sequencing cycles, thereby producing a base call sequence for each target cluster.
  • Figure 80 is one implementation of transforming, from subpixel domain to pixel domain, location/position information of cluster centers identified from the output of the neural network-based template generator 1512.
  • Cluster center location/position information is used for the neural network-based base calling at least (i) to construct the input data by extracting image patches from the sequencing images 108 that contain the centers of target clusters to be base called in their center pixels, (ii) to construct the distance channel that identifies distances of an image patch's pixels' centers from the center of a target cluster contained in its center pixel, and/or (iii) as supplemental input 7916 to the neural network-based base caller 1514.
  • the cluster center location/position information is identified from the output of the neural network-based template generator 1512 in the upsampled, subpixel resolution.
  • the neural network-based base caller 1514 operates on image data that is in optical, pixel-resolution. Therefore, in one implementation, the cluster center location/position information is transformed into the pixel domain by downscaling coordinates of the cluster centers by the same upsampling factor used to upsample image data fed as input to the neural network-based template generator 1512.
  • the image patches data fed as input to the neural network-based template generator 1512 are derived by upsampling sequencing images 108 from some initial sequencing cycles by an upsampling factor f.
  • the coordinates of the cluster centers 8002, produced from the output of the neural network-based template generator 1512 by the post-processor 1814 and stored in the template/template image 8004, are divided by f (the divisor).
  • These downscaled cluster center coordinates are referred to herein as the“reference cluster centers” 8008 and stored in the template/template image 8004.
  • the downscaling is performed by a downscaler 8006.
  • Figure 81 is one implementation of using cycle-specific and image channel-specific transformations to derive the so-called “transformed cluster centers” 8104 from the reference cluster centers 8008. The motivation for doing so is discussed first.
  • Sequencing images taken at different sequencing cycles are misaligned and have random translational offsets with respect to each other. This occurs due to the finite accuracy of the movements of the sensor's motion stage and also because images taken in different image/frequency channels have different optical paths and wavelengths. Consequently, an offset exists between the reference cluster centers and locations/positions of the cluster centers in the sequencing images. This offset varies between images captured at different sequencing cycles and within images captured at a same sequencing cycle in different image channels.
  • cycle-specific and image channel-specific transformations are applied to the reference cluster centers to produce respective transformed cluster centers for image patches of each sequencing cycle.
  • the cycle-specific and image channel-specific transformations are derived by an image registration process that uses image correlation to determine a full six-parameter affine transformation (e.g., translation, rotation, scaling, shear, right reflection, left reflection) or a Procrustes transformation (e.g., translation, rotation, scaling, optionally extended to aspect ratio), additional details of which can be found in Appendices 1, 2, 3, and 4.
  • in one implementation, the sequencing run uses 2-channel chemistry in which a red image and a green image are produced at each sequencing cycle. Then, for an example sequencing cycle 3, a cycle-specific and image channel-specific transformation is derived for the red image and another cycle-specific and image channel-specific transformation is derived for the green image.
  • the transformations are performed by a transformer 8102.
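A minimal sketch of the downscaling and cycle- and image channel-specific transformation of cluster centers, assuming the centers are held as an (N, 2) array of (x, y) coordinates and that each transformation is available as a 2x3 affine matrix produced by the image registration step; the function names are illustrative.

```python
import numpy as np

def to_reference_centers(subpixel_centers, upsampling_factor):
    """Downscale subpixel-domain cluster centers to the pixel domain."""
    return np.asarray(subpixel_centers, dtype=float) / upsampling_factor

def transform_centers(reference_centers, affine):
    """Apply a cycle-specific and image channel-specific affine transformation.

    reference_centers: (N, 2) array of (x, y) reference cluster centers.
    affine: 2x3 matrix [[a, b, tx], [c, d, ty]] from image registration.
    """
    homogeneous = np.hstack([reference_centers, np.ones((len(reference_centers), 1))])
    return homogeneous @ np.asarray(affine).T  # (N, 2) transformed cluster centers
```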
  • the transformed cluster centers 8104 are then stored in the template/template image 8004 and respectively used (i) to do the patch extraction from corresponding sequencing images 108 (e.g., by a patch extractor 8106) and (ii) in the distance formula that computes the distance channels.
  • in other implementations, a different distance formula can be used, such as distance squared, e^-distance, and e^-distance squared.
  • Figure 82 illustrates an image patch 8202 that is part of the input data fed to the neural network-based base caller 1514.
  • the input data includes a sequence of per-cycle image patch sets generated for a series of sequencing cycles of a sequencing run.
  • Each per-cycle image patch set in the sequence has an image patch for a respective one of one or more image channels.
  • the sequencing run uses the 2-channel chemistry which produces a red image and a green image at each sequencing cycle, and the input data comprises data spanning a series of three sequencing cycles of the sequencing run: a current (time t) sequencing cycle to be base called, a previous (time t-1) sequencing cycle, and a next (time t+1) sequencing cycle.
  • the input data comprises the following sequence of per-cycle image patch sets: a current cycle image patch set with a current red image patch and a current green image patch respectively extracted from the red and green sequencing images captured at the current sequencing cycle, a previous cycle image patch set with a previous red image patch and a previous green image patch respectively extracted from the red and green sequencing images captured at the previous sequencing cycle, and a next cycle image patch set with a next red image patch and a next green image patch respectively extracted from the red and green sequencing images captured at the next sequencing cycle.
  • each image patch can be n x n, where n can be any number ranging from 1 to 10,000.
  • Each image patch can be in the optical, pixel domain or in the upsampled, subpixel domain.
  • the extracted image patch 8202 has pixel intensity data for pixels that cover/depict a plurality of clusters 1-m and their surrounding background. Also, in the illustrated implementation, the image patch 8202 is extracted in such a way that it contains in its center pixel the center of a target cluster being base called.
  • in Figure 82, the pixel centers are depicted by a black rectangle and have integer location/position coordinates, and the cluster centers are depicted by a purple circle and have floating-point location/position coordinates.
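A minimal sketch of the target cluster-centered patch extraction, assuming pixel centers sit at integer coordinates as in Figure 82 and ignoring image borders; the function name and odd patch size are assumptions.

```python
import numpy as np

def extract_target_centered_patch(image, cluster_center, n=15):
    """Extract an n x n patch whose center pixel contains the target cluster's center.

    image: 2-D array for one image channel of one sequencing cycle.
    cluster_center: (x, y) floating-point transformed cluster center.
    """
    # The pixel whose integer center is nearest to the cluster center contains it.
    col = int(np.rint(cluster_center[0]))
    row = int(np.rint(cluster_center[1]))
    half = n // 2
    return image[row - half:row + half + 1, col - half:col + half + 1]
```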
  • Figure 83 depicts one implementation of determining distance values 8302 for a distance channel when a single target cluster is being base called by the neural network-based base caller 1514.
  • the center of the target cluster is contained in the center pixels of the image patches that are fed as input to the neural network-based base caller 1514.
  • the distance values are calculated on a pixel-by-pixel basis, such that, for each pixel, the distance between its center and the center of the target cluster is determined. Accordingly, a distance value is calculated for each pixel in each of the image patches that are part of the input data.
  • Figure 83 shows three distance values d1, dc, and dn for a particular image patch.
  • the distance values 8302 are calculated using the following distance formula, which operates on the transformed cluster centers 8104: d = sqrt((x_p - x_c)^2 + (y_p - y_c)^2), where (x_p, y_p) is a pixel's center and (x_c, y_c) is the target cluster's center.
  • a different distance formula can be used such as distance squared, e^-distance, and e^-distance squared.
  • the distance values 8302 are calculated in the subpixel domain.
  • the distance channels are calculated only with respect to the target cluster being base called.
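A minimal sketch of the single-target distance channel, assuming pixel centers at integer coordinates and the Euclidean distance formula above; names are illustrative.

```python
import numpy as np

def single_cluster_distance_channel(n, target_center):
    """Pixel-wise distances from each pixel center to the target cluster's center.

    n: patch size (the patch is n x n).
    target_center: (x, y) transformed cluster center in patch coordinates.
    """
    ys, xs = np.mgrid[0:n, 0:n]  # integer pixel-center coordinates
    return np.sqrt((xs - target_center[0]) ** 2 + (ys - target_center[1]) ** 2)
```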
  • Figure 84 shows one implementation of pixel-wise encoding 8402 the distance values 8302 that are calculated between the pixels and the target cluster.
  • the distance values 8302 in the input data supplement each corresponding image channel (image patch) as "pixel distance data".
  • the input data comprises a red distance channel and a green distance channel that supplement the red image channel and the green image channel as pixel distance data, respectively.
  • the distance channels are encoded on a subpixel-by-subpixel basis.
  • Figure 85a depicts one implementation of determining distance values 8502 for a distance channel when multiple target clusters 1-m are being simultaneously base called by the neural network-based base caller 1514.
  • the distance values are calculated on a pixel-by-pixel basis, such that, for each pixel, the distance between its center and respective centers of each of the multiple clusters 1-m is determined and the minimum distance value (in red) is assigned to the pixel.
  • the distance channel identifies each pixel's center-to-center distance from a nearest one of the clusters selected based on center-to -center distances between the pixel and each of the clusters.
  • Figure 85a shows pixel center-to-cluster center distances for two pixels and four cluster centers. Pixel 1 is nearest to cluster 1 and pixel n is nearest to cluster 3.
  • the distance values 8502 are calculated using the same distance formula, d = sqrt((x_p - x_c)^2 + (y_p - y_c)^2), which operates on the transformed cluster centers 8104, with the minimum value over the clusters assigned to each pixel.
  • in other implementations, a different distance formula can be used, such as distance squared, e^-distance, and e^-distance squared.
  • the distance values 8502 are calculated in the subpixel domain.
  • the distance channels are calculated with respect to the nearest cluster from among a plurality of clusters.
  • Figure 85b shows, for each of the target clusters 1-m, some nearest pixels determined based on the pixel center-to-nearest cluster center distances 8504 (d1, d2, d23, d29, d24, d32, dn, d13, d14, etc.).
  • Figure 86 shows one implementation of pixel-wise encoding 8602 the minimum distance values that are calculated between the pixels and the nearest one of the clusters.
  • the distance channels are encoded on a subpixel-by-subpixel basis.
  • Figure 87 illustrates one implementation using pixel-to-cluster classification/attribution/categorization 8702, referred to herein as “cluster shape data” or“cluster shape information”, to determine cluster distance values 8802 for a distance channel when multiple target clusters 1-m are being simultaneously base called by the neural network-based base caller 1514.
  • the output of the neural network-based template generator 1512 is used to classify the pixels as: background pixels, center pixels, and cluster/cluster interior pixels depicting/contributing to/belonging to a same cluster.
  • This pixel-to-cluster classification information is used to attribute each pixel to only one cluster, irrespective of the distances between the pixel centers and the cluster centers, and is stored as the cluster shape data.
  • background pixels are colored in grey
  • pixels belonging to cluster 1 are colored in yellow (cluster 1 pixels)
  • pixels belonging to cluster 2 are colored in green (cluster 2 pixels)
  • pixels belonging to cluster 3 are colored in red (cluster 3 pixels)
  • pixels belonging to cluster m are colored in blue (cluster m pixels).
  • Figure 88 shows one implementation of calculating the distance values 8802 using the cluster shape data.
  • the center-to -center distance value for a pixel is calculated with respect to the nearest cluster from among a plurality of clusters.
  • when a pixel that truly belongs to cluster A is nearer to the center of cluster B, the pixel is assigned a distance value that is calculated with respect to cluster B (to which it does not belong), instead of being assigned a distance value vis-a-vis cluster A (to which it truly belongs).
  • The“multi-cluster shape-based” base calling implementation avoids this by using the true pixel-to-cluster mapping, as defined in the raw image data and produced by the neural network-based template generator 1512.
  • the cluster pixels depict cluster intensities and the background pixels depict background intensities.
  • the cluster distance values identify each cluster pixel's center-to -center distance from an assigned one of the clusters selected based on classifying each cluster pixel to only one of the clusters.
  • the background pixels are assigned a predetermined background distance value, such as 0 or 0.1, or some other minimum value.
  • the cluster distance values 8802 are calculated using the distance formula d = sqrt((x_p - x_c)^2 + (y_p - y_c)^2), where (x_p, y_p) is a cluster pixel's center and (x_c, y_c) is the center of its assigned cluster, which operates on the transformed cluster centers 8104.
  • in other implementations, a different distance formula can be used, such as distance squared, e^-distance, and e^-distance squared.
  • the cluster distance values 8802 are calculated in the subpixel domain and the cluster and background attribution 8702 occurs on a subpixel-by-subpixel basis.
  • the distance channels are calculated with respect to an assigned cluster from among a plurality of clusters.
  • the assigned cluster is selected based on classifying each cluster pixel to only one of the clusters in accordance with the true pixel-to-cluster mapping defined in the raw image data.
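A minimal sketch of the cluster shape-based distance channel, assuming the pixel-to-cluster attribution is encoded as an integer array with -1 for background; the background distance value and names are illustrative assumptions.

```python
import numpy as np

def shape_based_distance_channel(attribution, cluster_centers, background_value=0.0):
    """Distance channel driven by cluster shape data.

    attribution: (H, W) integer array; -1 marks background pixels and k >= 0
    attributes a pixel to cluster k (the true pixel-to-cluster mapping).
    cluster_centers: (num_clusters, 2) array of (x, y) transformed cluster centers.
    """
    distances = np.full(attribution.shape, background_value, dtype=float)
    ys, xs = np.nonzero(attribution >= 0)          # cluster pixels only
    assigned = attribution[ys, xs]                 # each pixel's assigned cluster index
    cx = cluster_centers[assigned, 0]
    cy = cluster_centers[assigned, 1]
    distances[ys, xs] = np.sqrt((xs - cx) ** 2 + (ys - cy) ** 2)
    return distances
```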
  • Figure 89 shows one implementation of pixel-wise encoding the cluster distance values 8802 that are calculated between the pixels and the assigned clusters.
  • the distance channels are encoded on a subpixel-by-subpixel basis.
  • Deep learning is a powerful machine learning technique that uses many -layered neural networks.
  • One particularly successful network structure in computer vision and image processing domains is the convolutional neural network (CNN), where each layer performs a feed-forward convolutional transformation from an input tensor (an image-like, multi-dimensional dense array) to an output tensor of different shape.
  • CNNs are particularly suited for image-like input due to the spatial coherence of images and the advent of general purpose graphics processing units (GPUs) which make training fast on arrays up to 3- or 4-D. Exploiting these image-like properties leads to superior empirical performance compared to other learning methods such as support vector machines (SVMs) or multi-layer perceptrons (MLPs).
  • Figure 90 illustrates one implementation of the specialized architecture of the neural network-based base caller 1514 that is used to segregate processing of data for different sequencing cycles. The motivation for using the specialized architecture is described first.
  • the neural network-based base caller 1514 processes data for a current sequencing cycle, one or more preceding sequencing cycles, and one or more successive sequencing cycles. Data for additional sequencing cycles provides sequence-specific context. The neural network-based base caller 1514 learns the sequence-specific context during training and uses it for base calling. Furthermore, data for preceding and succeeding sequencing cycles provides second order contribution of pre-phasing and phasing signals to the current sequencing cycle.
  • the specialized architecture comprises spatial convolution layers that do not mix information between sequencing cycles and only mix information within a sequencing cycle.
  • Spatial convolution layers use so-called“segregated convolutions” that operationalize the segregation by independently processing data for each of a plurality of sequencing cycles through a“dedicated, non-shared” sequence of convolutions.
  • the segregated convolutions convolve over data and resulting feature maps of only a given sequencing cycle, i.e., intra-cycle, without convolving over data and resulting feature maps of any other sequencing cycle.
  • the input data comprises (i) current data for a current (time t) sequencing cycle to be base called, (ii) previous data for a previous (time t-1) sequencing cycle, and (iii) next data for a next (time t+1) sequencing cycle.
  • the specialized architecture then initiates three separate data processing pipelines (or convolution pipelines), namely, a current data processing pipeline, a previous data processing pipeline, and a next data processing pipeline.
  • the current data processing pipeline receives as input the current data for the current (time t ) sequencing cycle and independently processes it through a plurality of spatial convolution layers to produce a so-called“current spatially convolved representation” as the output of a final spatial convolution layer.
  • the previous data processing pipeline receives as input the previous data for the previous (time t-1) sequencing cycle and independently processes it through the plurality of spatial convolution layers to produce a so-called "previous spatially convolved representation" as the output of the final spatial convolution layer.
  • the next data processing pipeline receives as input the next data for the next (time t+1) sequencing cycle and independently processes it through the plurality of spatial convolution layers to produce a so-called "next spatially convolved representation" as the output of the final spatial convolution layer.
  • the current, previous, and next processing pipelines are executed in parallel.
  • the spatial convolution layers are part of a spatial convolutional network (or subnetwork) within the specialized architecture.
  • the neural network-based base caller 1514 further comprises temporal convolution layers that mix information between sequencing cycles, i.e., inter-cycles.
  • the temporal convolution layers receive their inputs from the spatial convolutional network and operate on the spatially convolved representations produced by the final spatial convolution layer for the respective data processing pipelines.
  • the inter-cycle operability freedom of the temporal convolution layers emanates from the fact that the misalignment property, which exists in the image data fed as input to the spatial convolutional network, is purged out from the spatially convolved representations by the cascade of segregated convolutions performed by the sequence of spatial convolution layers.
  • Temporal convolution layers use so-called“combinatory convolutions” that groupwise convolve over input channels in successive inputs on a sliding window basis.
  • the successive inputs are successive outputs produced by a previous spatial convolution layer or a previous temporal convolution layer.
  • the temporal convolution layers are part of a temporal convolutional network (or subnetwork) within the specialized architecture.
  • the temporal convolutional network receives its inputs from the spatial convolutional network.
  • a first temporal convolution layer of the temporal convolutional network groupwise combines the spatially convolved representations between the sequencing cycles.
  • subsequent temporal convolution layers of the temporal convolutional network combine successive outputs of previous temporal convolution layers.
  • the output of the final temporal convolution layer is fed to an output layer that produces an output.
  • the output is used to base call one or more clusters at one or more sequencing cycles.
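The two-stage flow above can be made concrete with a small sketch. The following Keras model is a minimal illustration only, not the disclosed architecture: the layer counts, filter sizes, sliding window of two cycles, and the choice to share the spatial convolution weights across cycles (one of the implementations mentioned below) are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_base_caller(num_cycles=3, patch=15, channels=5,
                      spatial_filters=64, temporal_filters=64):
    # One input per sequencing cycle (previous, current, next), each carrying
    # the per-cycle input channels (image, distance, and scaling channels).
    inputs = [layers.Input(shape=(patch, patch, channels)) for _ in range(num_cycles)]

    # Spatial (segregated) convolutions: the same stack is applied to every
    # cycle independently, so no information is mixed between cycles.
    spatial_stack = tf.keras.Sequential(
        [layers.Conv2D(spatial_filters, 3, activation='relu') for _ in range(3)])
    per_cycle = [spatial_stack(x) for x in inputs]

    # Temporal (combinatory) convolutions: adjacent cycles' spatially convolved
    # representations are grouped on a sliding window of size two and convolved
    # together, mixing information between cycles.
    window_pairs = [layers.Concatenate()([per_cycle[i], per_cycle[i + 1]])
                    for i in range(num_cycles - 1)]
    temporal = [layers.Conv2D(temporal_filters, 3, activation='relu')(p)
                for p in window_pairs]

    # A final combinatory convolution over the successive temporal outputs.
    merged = layers.Concatenate()(temporal)
    merged = layers.Conv2D(temporal_filters, 3, activation='relu')(merged)

    # Output layer: softmax scores over the four bases (A, C, G, T) for the
    # target cluster at the current sequencing cycle.
    outputs = layers.Dense(4, activation='softmax')(layers.Flatten()(merged))
    return tf.keras.Model(inputs=inputs, outputs=outputs)
```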
  • the specialized architecture processes information from a plurality of inputs in two stages.
  • segregation convolutions are used to prevent mixing of information between the inputs.
  • combinatory convolutions are used to mix information between the inputs.
  • the results from the second stage are used to make a single inference for the plurality of inputs.
  • the specialized architecture maps the plurality of inputs to the single inference.
  • the single inference can comprise more than one prediction, such as a classification score for each of the four bases (A, C, T, and G).
  • the inputs have temporal ordering such that each input is generated at a different time step and has a plurality of input channels.
  • the plurality of inputs can include the following three inputs: a current input generated by a current sequencing cycle at time step (t), a previous input generated by a previous sequencing cycle at time step (t-1), and a next input generated by a next sequencing cycle at time step (t+1).
  • each input is respectively derived from the current, previous, and next inputs by one or more previous convolution layers and includes k feature maps.
  • each input can include the following five input channels: a red image channel (in red), a red distance channel (in yellow), a green image channel (in green), a green distance channel (in purple), and a scaling channel (in blue).
  • each input can include k feature maps produced by a previous convolution layer and each feature map is treated as an input channel.
  • Figure 91 depicts one implementation of the segregated convolutions.
  • Segregated convolutions process the plurality of inputs at once by applying a convolution filter to each input in parallel.
  • the convolution filter combines input channels in a same input and does not combine input channels in different inputs.
  • a same convolution filter is applied to each input in parallel.
  • a different convolution filter is applied to each input in parallel.
  • each spatial convolution layer comprises a bank of k convolution filters, each of which applies to each input in parallel.
  • Combinatory convolutions mix information between different inputs by grouping corresponding input channels of the different inputs and applying a convolution filter to each group.
  • the grouping of the corresponding input channels and application of the convolution filter occurs on a sliding window basis.
  • a window spans two or more successive input channels representing, for instance, outputs for two successive sequencing cycles. Since the window is a sliding window, most input channels are used in two or more windows.
  • the different inputs originate from an output sequence produced by a preceding spatial or temporal convolution layer.
  • the different inputs are arranged as successive outputs and therefore viewed by a next temporal convolution layer as successive inputs.
  • the combinatory convolutions apply the convolution filter to groups of corresponding input channels in the successive inputs.
  • the successive inputs have temporal ordering such that a current input is generated by a current sequencing cycle at time step (t), a previous input is generated by a previous sequencing cycle at time step (t-1), and a next input is generated by a next sequencing cycle at time step (t+1).
  • each successive input is respectively derived from the current, previous, and next inputs by one or more previous convolution layers and includes k feature maps.
  • each input can include the following five input channels: a red image channel (in red), a red distance channel (in yellow), a green image channel (in green), a green distance channel (in purple), and a scaling channel (in blue).
  • each input can include k feature maps produced by a previous convolution layer and each feature map is treated as an input channel.
  • the depth B of the convolution filter is dependent upon the number of successive inputs whose corresponding input channels are groupwise convolved by the convolution filter on a sliding window basis. In other words, the depth B is equal to the number of successive inputs in each sliding window and the group size.
  • each temporal convolution layer comprises a bank of k convolution filters, each of which applies to the successive inputs on a sliding window basis.
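  • As a minimal illustrative sketch (in Python with PyTorch, assuming a 15 x 15 patch, k = 8 filters, m = 5 channels per cycle, and a window of two successive cycles; the names and sizes are not the specification's), the segregated and combinatory convolutions can be pictured as follows:

```python
# Illustrative sketch of segregated vs. combinatory convolutions (assumed sizes).
import torch
import torch.nn as nn

k = 8        # convolution filters per layer (illustrative)
m = 5        # input channels per cycle (2 image + 2 distance + 1 scaling)
cycles = 3   # previous, current, and next sequencing cycles

# Segregated (spatial) convolution: the same filter bank is applied to each
# cycle's input in parallel, so input channels of different cycles never mix.
spatial = nn.Conv2d(in_channels=m, out_channels=k, kernel_size=3)
per_cycle_inputs = [torch.randn(1, m, 15, 15) for _ in range(cycles)]
per_cycle_features = [spatial(x) for x in per_cycle_inputs]          # 3 x (1, k, 13, 13)

# Combinatory (temporal) convolution: corresponding feature maps of two
# successive cycles are grouped (depth B = 2) and convolved together on a
# sliding window basis, mixing information between the cycles.
temporal = nn.Conv2d(in_channels=2 * k, out_channels=k, kernel_size=3)
windows = zip(per_cycle_features[:-1], per_cycle_features[1:])
temporal_outputs = [temporal(torch.cat(pair, dim=1)) for pair in windows]
print([tuple(t.shape) for t in temporal_outputs])                    # two sliding windows
```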
  • Figure 93 shows one implementation of convolution layers of the neural network-based base caller 1514 in which each convolution layer has a bank of convolution filters.
  • each convolution layer has a bank of convolution filters.
  • five convolution layers are shown, each of which has a bank of 64 convolution filters.
  • each spatial convolution layer has a bank of k convolution filters, where k can be any number such as 1, 2, 8, 64, 128, 256, and so on.
  • each temporal convolution layer has a bank of k convolution filters, where k can be any number such as 1, 2, 8, 64, 128, 256, and so on.
  • Figure 94 depicts two configurations of the scaling channel that supplements the image channels.
  • the scaling channel is pixel-wise encoded in the input data that is fed to the neural network-based base caller 1514. Different cluster sizes and uneven illumination conditions result in a wide range of cluster intensities being extracted. The additive bias supplied by the scaling channel makes cluster intensities comparable across clusters.
  • the scaling channel is encoded on a subpixel-by-subpixel basis.
  • the scaling channel assigns a same scaling value to all the pixels.
  • the scaling channel assigns different scaling values to groups of pixels based on the cluster shape data.
  • Scaling channel 9410 has a same scaling value (s1) for all the pixels.
  • Scaling value (s1) is based on a mean intensity of the center pixel that contains the center of the target cluster.
  • the mean intensity is calculated by averaging intensity values of the center pixel observed during two or more preceding sequencing cycles that produced an A and a T base call for the target cluster.
  • Scaling channel 9408 has different scaling values (s1, s2, s3, ..., sm) for respective pixel groups attributed to corresponding clusters based on the cluster shape data.
  • Each pixel group includes a central cluster pixel that contains a center of the corresponding cluster.
  • Scaling value for a particular pixel group is based on the mean intensity of its central cluster pixel.
  • the mean intensity is calculated by averaging intensity values of the central cluster pixel observed during two or more preceding sequencing cycles that produced an A and a T base call for the corresponding cluster.
  • the background pixels are assigned a background scaling value (sb), which can be 0 or 0.1, or some other minimum value.
  • the scaling channels 9406 and their scaling values are determined by an intensity scaler 9404.
  • the intensity scaler 9404 uses cluster intensity data 9402 from preceding sequencing cycles to calculate the mean intensities.
  • the supplemental scaling channel can be provided as input in a different way, such as prior to or at the last layer of the neural network-based base caller 1514, prior to or at the one or more intermediate layers of the neural network-based base caller 1514, or as a single value instead of encoding it pixel-wise to match the image size.
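  • The following Python sketch illustrates, under assumed shapes and hypothetical function names, how an intensity scaler could build the two scaling-channel configurations discussed above (a single scaling value versus per-cluster scaling values based on cluster shape data); it is an example only, not the implementation of intensity scaler 9404.

```python
# Hypothetical sketch of building the scaling channel (names are illustrative).
import numpy as np

def single_value_scaling_channel(shape, center_pixel_mean_intensity):
    # One scaling value s1 for every pixel, based on the mean intensity of the
    # target cluster's center pixel over prior A- and T-calling cycles.
    return np.full(shape, center_pixel_mean_intensity, dtype=np.float32)

def per_cluster_scaling_channel(cluster_map, cluster_mean_intensities, sb=0.0):
    # cluster_map: integer array in which 0 marks background and i > 0 marks
    # pixels attributed to cluster i (cluster shape data).
    scaling = np.full(cluster_map.shape, sb, dtype=np.float32)  # background value sb
    for cluster_id, mean_intensity in cluster_mean_intensities.items():
        scaling[cluster_map == cluster_id] = mean_intensity
    return scaling

cluster_map = np.zeros((6, 6), dtype=int)
cluster_map[1:3, 1:3] = 1
cluster_map[4:6, 4:6] = 2
print(per_cluster_scaling_channel(cluster_map, {1: 0.8, 2: 1.2}))
```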
  • Input Data: Image Channels, Distance Channels, and Scaling Channel
  • Figure 95a illustrates one implementation of input data 9500 for a single sequencing cycle that produces a red image and a green image.
  • the input data 9500 comprises the following:
  • Red intensity data 9502 (in red) for pixels in an image patch extracted from the red image.
  • the red intensity data 9502 is encoded in a red image channel.
  • Red distance data 9504 (in yellow) that pixel-wise supplements the red intensity data 9502.
  • the red distance data 9504 is encoded in a red distance channel.
  • Green intensity data 9506 (in green) for pixels in an image patch extracted from the green image.
  • the green intensity data 9506 is encoded in a green image channel.
  • Green distance data 9508 (in purple) that pixel-wise supplements the green intensity data 9506.
  • the green distance data 9508 is encoded in a green distance channel.
  • Scaling data 9510 (in blue) that pixel-wise supplements the red intensity data 9502 and the green intensity data 9506.
  • the scaling data 9510 is encoded in a scaling channel.
  • the input data can include a fewer or greater number of image channels and supplemental distance channels.
  • the input data comprises four image channels for each sequencing cycle and four supplemental distance channels.
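  • As a simple illustration of the channel layout described above, the following Python snippet assembles a five-channel per-cycle input of the kind shown in Figure 95a; the 15 x 15 patch size and the random values are assumptions made only for the example.

```python
# Illustrative assembly of the five-channel per-cycle input (assumed shapes).
import numpy as np

patch = 15
red_image      = np.random.rand(patch, patch).astype(np.float32)   # red image channel
red_distance   = np.random.rand(patch, patch).astype(np.float32)   # red distance channel
green_image    = np.random.rand(patch, patch).astype(np.float32)   # green image channel
green_distance = np.random.rand(patch, patch).astype(np.float32)   # green distance channel
scaling        = np.full((patch, patch), 0.9, dtype=np.float32)    # scaling channel

# The channels are stacked along the last axis: (15, 15, 5) for one sequencing cycle.
per_cycle_input = np.stack(
    [red_image, red_distance, green_image, green_distance, scaling], axis=-1)
print(per_cycle_input.shape)
```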
  • Figure 95b illustrates one implementation of the distance channels supplying additive bias that is incorporated in the feature maps generated from the image channels.
  • This additive bias contributes to base calling accuracy because it is based on pixel center-to-cluster center(s) distances, which are pixel-wise encoded in the distance channels.
  • An image patch's pixels depict intensity emissions of a plurality of clusters (e.g., 10 to 200 clusters) and their surrounding background. Additional clusters incorporate information from a wider radius and contribute to base call prediction by discerning the underlying base whose intensity emissions are depicted in the image patch. In other words, intensity emissions from a group of clusters cumulatively create an intensity pattern that can be assigned to a discrete base (A, C, T, or G).
  • the distance channels convey to the convolution filters which pixels contain the cluster centers and which pixels are farther away from the cluster centers.
  • the convolution filters use this information to assign a sequencing signal to its proper source cluster by attending to (a) the central cluster pixels, their neighboring pixels, and feature maps derived from them more than (b) the perimeter cluster pixels, background pixels, and feature maps derived from them.
  • the distance channels supply positive additive biases that are incorporated in feature maps resulting from (a), but supply negative additive biases that are incorporated in feature maps resulting from (b).
  • the distance channels have the same dimensionality as the image channels. This allows the convolution filters to separately evaluate the image channels and the distance channels within a local receptive field and coherently combine the evaluations.
  • the distance channels identify only one central cluster pixel at the center of the image patches.
  • the distance channels identify multiple central cluster pixels distributed across the image patches.
  • A “single cluster” distance channel applies to an image patch that contains the center of a single target cluster to be base called in its center pixel.
  • the single cluster distance channel includes center-to-center distance of each pixel in the image patch to the single target cluster.
  • the image patch also includes additional clusters that are adjacent to the single target cluster, but the additional clusters are not base called.
  • A “multi-cluster” distance channel applies to an image patch that contains the centers of multiple target clusters to be base called in its respective central cluster pixels.
  • the multi-cluster distance channel includes center-to-center distance of each pixel in the image patch to the nearest cluster from among the multiple target clusters. This has the potential of measuring a center-to-center distance to the wrong cluster, but that potential is low.
  • A “multi-cluster shape-based” distance channel applies to an image patch that contains the centers of multiple target clusters to be base called in its respective central cluster pixels and for which pixel-to-cluster attribution information is known.
  • the multi-cluster distance channel includes center-to-center distance of each cluster pixel in the image patch to the cluster to which it belongs or is attributed to from among the multiple target clusters. Background pixels can be flagged as background, instead of given a calculated distance.
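  • The sketch below shows, under an assumed 15 x 15 patch size and hypothetical function names, how the “single cluster” and “multi-cluster” distance channels described above could be computed:

```python
# Hedged sketch of computing distance channels (assumed 15 x 15 patches).
import numpy as np

def single_cluster_distance_channel(size=15):
    # Center-to-center distance of every pixel to the single target cluster,
    # whose center lies in the center pixel of the patch.
    center = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size]
    return np.sqrt((ys - center) ** 2 + (xs - center) ** 2).astype(np.float32)

def multi_cluster_distance_channel(size, cluster_centers):
    # Center-to-center distance of every pixel to the nearest target cluster.
    ys, xs = np.mgrid[0:size, 0:size]
    dists = [np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2) for cy, cx in cluster_centers]
    return np.min(np.stack(dists), axis=0).astype(np.float32)

print(single_cluster_distance_channel()[7, 7])                       # 0.0 at the center pixel
print(multi_cluster_distance_channel(15, [(3, 3), (11, 10)]).shape)  # (15, 15)
```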
  • Figure 95b also illustrates one implementation of the scaling channel supplying additive bias that is incorporated in the feature maps generated from the image channels.
  • This additive bias contributes to base calling accuracy because it is based on mean intensities of central cluster pixel(s), which are pixel-wise encoded in the scaling channel.
  • the discussion about additive biasing in the context of the distance channels analogously applies to the scaling channel.
  • Figure 95b further shows an example of how the additive biases are derived from the distance and scaling channels and incorporated into the feature maps generated from the image channels.
  • convolution filter i 9514 evaluates a local receptive field 9512 (in magenta) across the two image channels 9502 and 9506, the two distance channels 9504 and 9508, and the scaling channel 9510. Because the distance and scaling channels are separately encoded, the additive biasing occurs when the intermediate outputs 9516a-e of each of the channel-specific convolution kernels (or feature detectors) 9516a-e (plus bias 9516f) are channel-wise accumulated 9518 as the final output/feature map element 9520 for the local receptive field 9512.
  • the additive biases supplied by the two distance channels 9504 and 9508 are the intermediate outputs 9516b and 9516d, respectively.
  • the additive bias supplied by the scaling channel 9510 is the intermediate output 9516e.
  • the additive biasing guides the feature map compilation process by putting greater emphasis on those features in the image channels that are considered more important and reliable for base calling, i.e., pixel intensities of central cluster pixels and their neighboring pixels.
  • backpropagation of gradients computed from comparison to the ground truth base calls updates weights of the convolution kernels to produce stronger activations for central cluster pixels and their neighboring pixels.
  • the scaling channel additive bias 9516e derived from the scaling channel 9510 can positively or negatively bias the convolved representation 9520 of the pixels.
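  • The additive biasing can be pictured with the following small numpy example, in which each of the five channels has its own kernel and the per-channel intermediate outputs plus a bias accumulate into a single feature-map element; the values and shapes are made up for illustration.

```python
# Minimal numpy sketch of channel-wise accumulation of a convolution output.
import numpy as np

rng = np.random.default_rng(0)
receptive_field = rng.random((3, 3, 5))   # 5 channels: 2 image, 2 distance, 1 scaling
kernels = rng.random((3, 3, 5))           # one 3 x 3 kernel per channel
bias = 0.1

# Per-channel intermediate outputs: indices 1 and 3 are the distance-channel
# additive biases, index 4 is the scaling-channel additive bias.
intermediate = [(receptive_field[:, :, c] * kernels[:, :, c]).sum() for c in range(5)]
feature_map_element = sum(intermediate) + bias
print(feature_map_element)
```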
  • Figure 95b shows application of a single convolution filter i 9514 on the input data 9500 for a single sequencing cycle. The discussion applies analogously to multiple convolution filters (e.g., a filter bank of k filters, where k can be 8, 16, 32, 64, 128, 256, and so on), multiple convolutional layers (e.g., multiple spatial and temporal convolution layers), and multiple sequencing cycles (e.g., t, t+1, t-1).
  • the distance and scaling channels, instead of being separately encoded, are directly applied to the image channels to generate modulated image channels (e.g., via elementwise pixel multiplication), since the distance and scaling channels and the image channels have the same dimensionality.
  • weights of the convolution kernels are determined based on the distance and image channels so as to detect most important features in the image channels during the elementwise multiplication.
  • the distance and scaling channels instead of being fed to a first layer, are provided as auxiliary input to downstream layers and/or networks (e.g., to a fully-connected network or a classification layer).
  • the distance and scaling channels are fed to the first layer and re-fed to the downstream layers and/or networks (e.g., via a residual connection).
  • volumetric input is a 4D tensor with dimensions k x l x w x h, with l being the additional dimension, length.
  • Each individual kernel is a 4D tensor swept in a 4D tensor, resulting in a 3D tensor (the channel dimension is collapsed because it is not swept across).
  • the distance and scaling channels are separately encoded on a subpixel-by-subpixel basis and the additive biasing occurs at the subpixel level.
  • Figures 96a, 96b, and 96c depict one implementation of base calling a single target cluster.
  • the specialized architecture processes the input data for three sequencing cycles, namely, a current (time t) sequencing cycle to be base called, a previous (time t-1) sequencing cycle, and a next (time t+1) sequencing cycle and produces a base call for the single target cluster at the current (time t) sequencing cycle.
  • Figures 96a and 96b show the spatial convolution layers.
  • Figure 96c shows the temporal convolution layers, along with some other non-convolution layers.
  • vertical dotted lines demarcate spatial convolution layers from the feature maps and horizontal dash-dotted lines demarcate the three convolution pipelines corresponding to the three sequencing cycles.
  • the input data includes a tensor of dimensionality n x n x m (e.g., the input tensor 9500 in Figure 95a), where n represents the width and height of a square tensor and m represents the number of input channels, making the dimensionality of the input data for the three cycles n x n x m x t.
  • each per-cycle tensor contains, in the center pixel of its image channels, a center of the single target cluster. It also depicts intensity emissions of the single target cluster, of some adjacent clusters, and of their surrounding background captured in each of the image channels at a particular sequencing cycle. In Figure 96a, two example image channels are depicted, namely, the red image channel and the green image channel.
  • Each per-cycle tensor also includes distance channels that supplement corresponding image channels (e.g., a red distance channel and a green distance channel). The distance channels identify center-to-center distance of each pixel in the corresponding image channels to the single target cluster.
  • Each per-cycle tensor further includes a scaling channel that pixel-wise scales intensity values in each of the image channels.
  • the specialized architecture has five spatial convolution layers and two temporal convolution layers.
  • Each spatial convolution layer applies segregated convolutions using a bank of k convolution filters of dimensionality j x j x d, where j represents the width and height of a square filter and d represents its depth.
  • Each temporal convolution layer applies combinatory convolutions using a bank of k convolution filters of dimensionality j x j x a, where j represents the width and height of a square filter and a represents its depth.
  • the specialized architecture has pre-classification layers (e.g., a flatten layer and a dense layer) and an output layer (e.g., a softmax classification layer).
  • the pre-classification layers prepare the input for the output layer.
  • the output layer produces the base call for the single target cluster at the current (time t) sequencing cycle.
  • Figures 96a, 96b, and 96c also show the resulting feature maps (convolved representations or intermediate convolved representations or convolved features or activation maps) produced by the convolution filters.
  • the spatial dimensionality of the resulting feature maps reduces by a constant step size from one convolution layer to the next, a concept referred to herein as the “consistently reducing spatial dimensionality”.
  • an example constant step size of two is used for the consistently reducing spatial dimensionality.
  • the consistently reducing spatial dimensionality causes the convolution filters to progressively narrow the focus of attention on the central cluster pixels and their neighboring pixels and generate feature maps with features that capture local dependencies among the central cluster pixels and their neighboring pixels. This in turn helps with accurately base calling the clusters whose centers are contained in the central cluster pixels.
  • the combinatory convolutions of the two temporal convolution layers mix information between the three sequencing cycles.
  • the first temporal convolution layer convolves over the next and current spatially convolved representations respectively produced for the next and current sequencing cycles by a final spatial convolution layer. This yields a first temporal output.
  • the first temporal convolution layer also convolves over the current and previous spatially convolved representations respectively produced for the current and previous sequencing cycles by the final spatial convolution layer. This yields a second temporal output.
  • the second temporal convolution layer convolves over the first and second temporal outputs and produces a final temporal output.
  • the final temporal output is fed to the flatten layer to produce a flattened output.
  • the flattened output is then fed to the dense layer to produce a dense output.
  • the dense output is processed by the output layer to produce the base call for the single target cluster at the current (time t) sequencing cycle.
  • the output layer produces likelihoods (classification scores) of a base incorporated in the single target cluster at the current sequencing cycle being A, C, T, and G, and classifies the base as A, C, T, or G based on the likelihoods (e.g., the base with the maximum likelihood is selected, such as the base A in Figure 96a).
  • the likelihoods are exponentially normalized scores produced by a softmax classification layer and sum to unity.
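  • A small Python example of such exponentially normalized scores is given below; the logits are invented for illustration and the softmax function is a standard formulation rather than a quotation of the base caller's output layer.

```python
# Illustrative softmax over four base classes (A, C, T, G).
import numpy as np

def softmax(logits):
    exp = np.exp(logits - np.max(logits))   # subtract the max for numerical stability
    return exp / exp.sum()

logits = np.array([2.1, 0.3, -0.5, 0.9])    # illustrative scores for A, C, T, G
scores = softmax(logits)
print(scores, scores.sum())                 # the classification scores sum to unity
print("ACTG"[int(np.argmax(scores))])       # the base with the maximum likelihood
```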
  • the output layer derives an output pair for the single target cluster.
  • the output pair identifies a class label of a base incorporated in the single target cluster at the current sequencing cycle being A, C, T, or G, and base calls the single target cluster based on the class label.
  • a class label of 1, 0 identifies an A base
  • a class label of 0, 1 identifies a C base
  • a class label of 1, 1 identifies a T base
  • a class label of 0, 0 identifies a G base.
  • a class label of 1, 1 identifies an A base
  • a class label of 0, 1 identifies a C base
  • a class label of 0.5, 0.5 identifies a T base
  • a class label of 0, 0 identifies a G base.
  • a class label of 1, 0 identifies an A base
  • a class label of 0, 1 identifies a C base
  • a class label of 0.5, 0.5 identifies a T base
  • a class label of 0, 0 identifies a G base.
  • a class label of 1, 2 identifies an A base
  • a class label of 0, 1 identifies a C base
  • a class label of 1, 1 identifies a T base
  • a class label of 0, 0 identifies a G base.
  • the output layer derives a class label for the single target cluster that identifies a base incorporated in the single target cluster at the current sequencing cycle being A, C, T, or G, and base calls the single target cluster based on the class label.
  • a class label of 0.33 identifies an A base
  • a class label of 0.66 identifies a C base
  • a class label of 1 identifies a T base
  • a class label of 0 identifies a G base.
  • a class label of 0.50 identifies an A base
  • a class label of 0.75 identifies a C base
  • a class label of 1 identifies a T base
  • a class label of 0.25 identifies a G base.
  • the output layer derives a single output value, compares the single output value against class value ranges corresponding to bases A, C, T, and G, based on the comparison, assigns the single output value to a particular class value range, and base calls the single target cluster based on the assignment.
  • the single output value is derived using a sigmoid function and the single output value ranges from 0 to 1.
  • a class value range of 0-0.25 represents an A base
  • a class value range of 0.25-0.50 represents a C base
  • a class value range of 0.50-0.75 represents a T base
  • a class value range of 0.75-1 represents a G base.
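  • The single-output-value variant can be illustrated with the short Python function below, which maps a sigmoid output to the class value ranges listed above; the thresholds follow that example and the function name is hypothetical.

```python
# Mapping a sigmoid output in [0, 1] to one of four class value ranges.
def base_from_sigmoid(value):
    if value < 0.25:
        return "A"   # class value range 0-0.25
    elif value < 0.50:
        return "C"   # class value range 0.25-0.50
    elif value < 0.75:
        return "T"   # class value range 0.50-0.75
    return "G"       # class value range 0.75-1

print(base_from_sigmoid(0.62))  # "T"
```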
  • the specialized architecture can process input data for fewer or greater number of sequencing cycles and can comprise fewer or greater number of spatial and temporal convolution layers. Also, the dimensionality of the input data, the per-cycle tensors in the input data, the convolution filters, the resulting feature maps, and the output can be different. Also, the number of convolution filters in a convolution layer can be different. It can use different padding and striding configurations.
  • It can use a different classification function (e.g., sigmoid or regression) and may or may not include a fully-connected layer. It can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transpose convolutions, depthwise separable convolutions, pointwise convolutions, 1 x 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatial separable convolutions, and deconvolutions.
  • It can use one or more loss functions such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean-squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss. It can use any parallelism, efficiency, and compression schemes such as TFRecords, compressed encoding (e.g., PNG), sharding, parallel calls for map transformation, batching, prefetching, model parallelism, data parallelism, and synchronous/asynchronous SGD.
  • It can include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (like an LSTM or GRU), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions like rectifying linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
  • a tile includes twenty thousand to three hundred thousand clusters.
  • Illumina's NovaSeq sequencer has up to four million clusters per tile. Therefore, a sequencing image of the tile (tile image) can depict intensity emissions from twenty thousand to three hundred thousand clusters and their surrounding background. So, in one implementation, using input data which includes the entire tile image results in three hundred thousand clusters being simultaneously base called on a per-input basis. In another implementation, using image patches of size 15 x 15 pixels in the input data results in fewer than one hundred clusters being simultaneously base called on a per-input basis.
  • these numbers can vary depending on the sequencing configuration, the parallelism strategy, the details of the architecture (e.g., based on optimal architecture hyperparameters), and available compute.
  • Figure 97 shows one implementation of simultaneously base calling multiple target clusters.
  • the input data has three tensors for the three sequencing cycles discussed above.
  • Each per-cycle tensor (e.g., the input tensor 9500 in Figure 95a) depicts intensity emissions of the multiple target clusters being base called and their surrounding background; some additional adjacent clusters, which are not base called, are also included for context.
  • each per-cycle tensor includes distance channels that supplement corresponding image channels (e.g., a red distance channel and a green distance channel).
  • the distance channels identify center-to-center distance of each pixel in the corresponding image channels to the nearest cluster from among the multiple target clusters.
  • each per-cycle tensor includes distance channels that supplement corresponding image channels (e.g., a red distance channel and a green distance channel).
  • the distance channels identify center-to -center distance of each cluster pixel in the corresponding image channels to the cluster to which it belongs or is attributed to from among the multiple target clusters.
  • Each per-cycle tensor further includes a scaling channel that pixel-wise scales intensity values in each of the image channels.
  • the spatial dimensionality of each per-cycle tensor is greater than that shown in Figure 96a. That is, in the single target cluster base calling implementation in Figure 96a, the spatial dimensionality of each per-cycle tensor is 15 x 15, whereas in the multiple cluster base calling implementation in Figure 97, the spatial dimensionality of each per-cycle tensor is 114 x 114. Having a greater amount of pixelated data that depicts intensity emissions of additional clusters improves the accuracy of base calls simultaneously predicted for the multiple clusters, according to some implementations.
  • Avoiding Redundant Convolutions
  • the image channels in each per-cycle tensor are obtained from the image patches extracted from the sequencing images.
  • there are overlapping pixels between extracted image patches that are spatially contiguous (e.g., left, right, top, and bottom contiguous). Accordingly, in one implementation, the overlapping pixels are not subjected to redundant convolutions and results from a prior convolution are reused in later instances when the overlapping pixels are part of the subsequent inputs.
  • a first image patch of size n x n pixels is extracted from a sequencing image and a second image patch of size m x m pixels is also extracted from the same sequencing image, such that the first and second image patches are spatially contiguous and share an overlapping region of o x o pixels.
  • the o x o pixels are convolved as part of the first image patch to produce a first convolved representation that is stored in memory. Then, when the second image patch is convolved, the o x o pixels are not convolved again and instead the first convolved representation is retrieved from memory and reused.
  • in some implementations, n = m. In other implementations, they are not equal.
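  • One way to picture the reuse of convolutions over the overlapping o x o region is the small cache sketch below; the cache key, the stand-in convolution, and the shapes are assumptions made only for the example.

```python
# Conceptual sketch of reusing a stored convolved representation for
# overlapping pixels shared by two contiguous image patches.
import numpy as np

convolution_cache = {}

def convolve_region(region, key):
    if key in convolution_cache:          # overlapping pixels already convolved
        return convolution_cache[key]
    result = region * 2.0                 # stand-in for an actual convolution
    convolution_cache[key] = result
    return result

overlap = np.ones((4, 4))                 # the o x o pixels shared by both patches
first = convolve_region(overlap, key="tile0_rows10-13_cols20-23")
second = convolve_region(overlap, key="tile0_rows10-13_cols20-23")
print(first is second)                    # True: the prior result is retrieved and reused
```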
  • the input data is then processed through the spatial and temporal convolution layers of the specialized architecture to produce a final temporal output of dimensionality w x w x k.
  • the spatial dimensionality is reduced by a constant step size of two at each convolution layer. That is, starting with an n x n spatial dimensionality of the input data, a w x w spatial dimensionality of the final temporal output is derived.
  • an output layer produces a base call for each unit in the w x w set of units.
  • the output layer is a softmax layer that produces four-way classification scores for the four bases (A, C, T, and G) on a unit-by-unit basis. That is, each unit in the w x w set of units is assigned a base call based on the maximum classification score in a corresponding softmax quadruple, as depicted in Figure 97.
  • the w x w set of units is derived as a result of processing the final temporal output through a flatten layer and a dense layer to produce a flattened output and a dense output, respectively.
  • the flattened output has w x w x k elements and the dense output has w x w elements that form the w x w set of units.
  • Base calls for the multiple target clusters are obtained by identifying which of the base called units in the w x w set of units coincide with or correspond to central cluster pixels, i.e., pixels in the input data that contain the respective centers of the multiple target clusters.
  • a given target cluster is assigned the base call of the unit that coincides with or corresponds to the pixel that contains the center of the given target cluster.
  • base calls of units that do not coincide with or correspond to the central cluster pixels are filtered out.
  • This functionality is operationalized by a base call filtering layer, which is part of the specialized architecture in some implementations, or implemented as a postprocessing module in other implementations.
  • base calls for the multiple target clusters are obtained by identifying which groups of base called units in the w x w set of units cover a same cluster, i.e., identifying pixel groups in the input data that depict a same cluster. Then, for each cluster and its corresponding pixel group, an average of classification scores (softmax probabilities) of the respective four base classes (A, C, T, and G) is calculated across pixels in the pixel group and the base class that has the highest average classification score is selected for base calling the cluster.
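  • The averaging variant can be illustrated as follows; the softmax quadruples are invented values for three pixels attributed to one cluster.

```python
# Averaging softmax quadruples across a cluster's pixel group and selecting
# the base class with the highest average classification score.
import numpy as np

pixel_scores = np.array([          # columns: A, C, T, G (illustrative values)
    [0.70, 0.10, 0.10, 0.10],
    [0.55, 0.20, 0.15, 0.10],
    [0.60, 0.15, 0.15, 0.10],
])
mean_scores = pixel_scores.mean(axis=0)
print("ACTG"[int(np.argmax(mean_scores))])   # "A" for this pixel group
```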
  • ground truth comparison and error computation occurs only for those units that coincide with or correspond to the central cluster pixels, such that their predicted base calls are evaluated against the correct base calls identified as ground truth labels.
  • Figure 98 shows one implementation of simultaneously base calling multiple target clusters at a plurality of successive sequencing cycles, thereby simultaneously producing a base call sequence for each of the multiple target clusters.
  • base call at one sequencing cycle is predicted using data for three sequencing cycles (the current (time t), the previous/left flanking (time t-1), and the next/right flanking (time t+1) sequencing cycles), where the right and left flanking sequencing cycles provide sequence-specific context for base triplet motifs and second order contribution of pre-phasing and phasing signals.
  • each per-cycle tensor includes image channels, corresponding distance channels, and a scaling channel, such as the input tensor 9500 in Figure 95a.
  • the input data with t per-cycle tensors is then processed through the spatial and temporal convolution layers of the specialized architecture to produce y final temporal outputs, each of which corresponds to a respective one of the y sequencing cycles being base called.
  • Each of the y final temporal outputs has a dimensionality of w x w x k.
  • the spatial dimensionality is reduced by a constant step size of two at each convolution layer. That is, starting with an n x n spatial dimensionality of the input data, a w x w spatial dimensionality of each of the y final temporal outputs is derived.
  • each of the y final temporal outputs is processed in parallel by an output layer.
  • the output layer produces a base call for each unit in the w x w set of units.
  • the output layer is a softmax layer that produces four-way classification scores for the four bases (A, C, T, and G) on a unit-by-unit basis. That is, each unit in the w x w set of units is assigned a base call based on the maximum classification score in a corresponding softmax quadruple, as depicted in Figure 97.
  • the w x w set of units is derived for each of the y final temporal outputs as a result of respectively processing the latter through a flatten layer and a dense layer to produce corresponding flattened outputs and dense outputs.
  • each flattened output has w x w x k elements and each dense output has w x w elements that form the w x w set of units.
  • base calls for the multiple target clusters are obtained by identifying which of the base called units in the corresponding w x w set of units coincide with or correspond to central cluster pixels, i.e., pixels in the input data that contain the respective centers of the multiple target clusters.
  • a given target cluster is assigned the base call of the unit that coincides with or corresponds to the pixel that contains the center of the given target cluster.
  • base calls of units that do not coincide with or correspond to the central cluster pixels are filtered out.
  • This functionality is operationalized by a base call filtering layer, which is part of the specialized architecture in some implementations, or implemented as a post-processing module in other implementations.
  • ground truth comparison and error computation occurs only for those units that coincide with or correspond to the central cluster pixels, such that their predicted base calls are evaluated against the correct base calls identified as ground truth labels.
  • rectangles represent data operators like spatial and temporal convolution layers and softmax classification layer, and rounded corner rectangles represent data (e.g., feature maps) produced by the data operators.
  • Figure 99 illustrates the dimensionality diagram 9900 for the single cluster base calling implementation.
  • the “cycle dimension” of the input is three and continues to be that for the resulting feature maps up until the first temporal convolution layer.
  • The cycle dimension of three represents the three sequencing cycles, and its continuity represents that feature maps for the three sequencing cycles are separately generated and convolved upon and no features are mixed between the three sequencing cycles.
  • the segregated convolution pipelines are effectuated by the depth-wise segregated convolution filters of the spatial convolution layers.
  • the “depth dimensionality” of the depth-wise segregated convolution filters of the spatial convolution layers is one.
  • depth dimensionality of the depth-wise combinatory convolution filters of the temporal convolution layers is two. This is what enables the depth-wise combinatory convolution filters to groupwise convolve over resulting feature maps from multiple sequencing cycles and mix features between the sequencing cycles.
  • a vector with four elements is exponentially normalized by the softmax layer to produce classification scores (i.e., confidence scores, probabilities, likelihoods, softmax scores) for the four bases (A, C, T, and G).
  • the base with the highest (maximum) softmax score is assigned to the single target cluster being base called at the current sequencing cycle.
  • Figure 100 illustrates the dimensionality diagram 10000 for the multiple clusters, single sequencing cycle base calling implementation. The above discussion about the cycle, depth, and spatial dimensionality with respect to the single cluster base calling applies to this implementation.
  • the softmax layer operates independently on each of the 10,000 units and produces a respective quadruple of softmax scores for each of the 10,000 units.
  • the quadruple corresponds to the four bases (A, C, T, and G).
  • the 10,000 units are derived from the transformation of 640,000 flattened units to 10,000 dense units.
  • those 2,500 units are selected which correspond to the 2,500 central cluster pixels containing respective centers of the 2,500 target clusters being simultaneously base called at the current sequencing cycle.
  • the bases assigned to the selected 2,500 units are in turn assigned to the corresponding ones of the 2,500 target clusters.
  • the illustrated dimensionalities can vary depending on the sequencing configuration, the parallelism strategy, the details of the architecture (e.g., based on optimal architecture hyperparameters), and available compute.
  • Figure 101 illustrates the dimensionality diagram 10100 for the multiple clusters, multiple sequencing cycles base calling implementation.
  • the softmax-based base call classification of the 2,500 target clusters occurs in parallel for each of the thirteen sequencing cycles base called, thereby simultaneously producing thirteen base calls for each of the 2,500 target clusters.
  • the illustrated dimensionalities can vary depending on the sequencing configuration, the parallelism strategy, the details of the architecture (e.g., based on optimal architecture hyperparameters), and available compute.
  • the first configuration is called “arrayed input” and the second configuration is called “stacked input”.
  • the arrayed input is shown in Figure 102a and is discussed above with respect to Figures 96a to 101.
  • the arrayed input encodes each sequencing cycle's input in a separate column/block because image patches in the per-cycle inputs are misaligned with respect to each other due to residual registration error.
  • the specialized architecture is used with the arrayed input to segregate processing of each of the separate columns/blocks. Also, the distance channels are calculated using the transformed cluster centers to account for the misalignments between image patches in a cycle and between image patches across cycles.
  • the stacked input, shown in Figure 102b, encodes the inputs from different sequencing cycles in a single column/block. In one implementation, this obviates the need of using the specialized architecture because the image patches in the stacked input are aligned with each other through affine transformation and intensity interpolation, which eliminate the inter-cycle and intra-cycle residual registration error. In some implementations, the stacked input has a common scaling channel for all the inputs.
  • intensity interpolation is used to reframe or shift the image patches such that the center of the center pixel of each image patch coincides with the center of the single target cluster being base called. This obviates the need of using the supplemental distance channels because all the non-center pixels are equidistant from the center of the single target cluster. Stacked input without the distance channels is referred to herein as the “reframed input” and is illustrated in Figure 104.
  • the reframing may not be feasible with base calling implementations involving multiple clusters because there the image patches contain multiple central cluster pixels that are base called.
  • Stacked input without the distance channels and without the reframing is referred to herein as the “aligned input” and is illustrated in Figures 105 and 106.
  • Aligned input may be used when calculation of the distance channels is not desired (e.g., due to compute limitations) and reframing is not feasible.
  • Figure 103a depicts one implementation of reframing 10300a pixels of an image patch 10302 to center a center of a target cluster being base called in a center pixel.
  • the center of the target cluster (in purple) falls within the center pixel of the image patch 10302, but is at an offset (in red) from the center pixel's center, as depicted at 10300a in Figure 103a.
  • a reframer 10304 shifts the image patch 10302 by interpolating intensity of the pixels to compensate for the reframing and produces a reframed/shifted image patch 10306. In the shifted image patch 10306, the center of the center pixel coincides with the center of the target cluster.
  • the non-center pixels are equidistant from the center of the target cluster.
  • the interpolation can be performed by nearest neighbor intensity extraction, Gaussian based intensity extraction, intensity extraction based on average of 2 x 2 subpixel area, intensity extraction based on brightest of 2 x 2 subpixel area, intensity extraction based on average of 3 x 3 subpixel area, bilinear intensity extraction, bicubic intensity extraction, and/or intensity extraction based on weighted area coverage.
  • Figure 103b depicts another example reframed/shifted image patch 10300b in which (i) the center of the center pixel coincides with the center of the target cluster and (ii) the non-center pixels are equidistant from the center of the target cluster. These two factors obviate the need of providing a supplemental distance channel because all the non-center pixels have the same degree of proximity to the center of the target cluster.
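  • As a rough illustration of the reframing, the snippet below shifts a patch by the negative of the cluster-center offset using bilinear interpolation from scipy (one of the interpolation options mentioned above); the offsets and the patch are invented, and scipy is used here only for convenience.

```python
# Hedged sketch of reframing a patch so the target cluster center coincides
# with the center of the center pixel (bilinear interpolation, order=1).
import numpy as np
from scipy.ndimage import shift

patch = np.random.rand(15, 15).astype(np.float32)
offset_rows, offset_cols = 0.3, -0.2   # sub-pixel offset of the cluster center (illustrative)

# Shifting by the negative offset moves the cluster center onto the pixel center.
reframed = shift(patch, shift=(-offset_rows, -offset_cols), order=1)
print(reframed.shape)                  # (15, 15)
```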
  • Figure 104 shows one implementation of base calling a single target cluster at a current sequencing cycle using a standard convolution neural network and the reframed input.
  • the reframed input includes a current image patch set for a current (t) sequencing cycle being base called, a previous image patch set for a previous (t-1) sequencing cycle, and a next image patch set for a next (t+1) sequencing cycle.
  • Each image patch set has an image patch for a respective one of one or more image channels.
  • Figure 104 depicts two image channels, a red channel and a green channel.
  • Each image patch has pixel intensity data for pixels covering a target cluster being base called, some adjacent clusters, and their surrounding background.
  • the reframed input also includes a common scaling channel.
  • the reframed input does not include any distance channels because the image patches are reframed or shifted to center the center of the target cluster in the center pixel, as explained above with respect to Figures 103a-b. Also, the image patches are aligned with each other to remove inter-cycle and intra-cycle residual registration error. In one implementation, this is done using affine transformation and intensity interpolation, additional details of which can be found in Appendices 1, 2, 3, and 4. These factors obviate the need of using the specialized architecture, and instead a standard convolutional neural network is used with the reframed input.
  • the standard convolutional neural network 10400 includes seven standard convolution layers that use standard convolution filters. This means that there are no segregated convolution pipelines to prevent mixing of data between the sequencing cycles (since the data is aligned and can be mixed).
  • the consistently reducing spatial dimensionality phenomenon is used to teach the standard convolution filters to attend to the central cluster center and its neighboring pixels more than to other pixels.
  • the reframed input is then processed through the standard convolution layers to produce a final convolved representation.
  • the base call for the target cluster at the current sequencing cycle is obtained in the similar fashion using flatten, dense, and classification layers as discussed above with respect to Figure 96c.
  • the process is iterated over a plurality of sequencing cycles to produce a sequence of base calls for the target cluster.
  • the process is iterated over a plurality of sequencing cycles for a plurality of target clusters to produce a sequence of base calls for each target cluster in the plurality of target clusters.
  • Aligned Input: Aligned Image Patches without the Distance Channels and the Reframing
  • Figure 105 shows one implementation of base calling multiple target clusters at the current sequencing cycle using the standard convolution neural network and the aligned input.
  • the reframing is not feasible here because the image patches contain multiple central cluster pixels that are being base called. As a result, the image patches in the aligned input are not reframed. Further, the supplemental distance channels are not included due to compute considerations, according to one implementation.
  • the aligned input is then processed through the standard convolution layers to produce a final convolved representation.
  • a base call for each of the target clusters is obtained at the current sequencing cycle in the similar fashion using flatten (optional), dense (optional), classification, and base call filtering layers as discussed above with respect to Figure 97.
  • Figure 106 shows one implementation of base calling multiple target clusters at a plurality of sequencing cycles using the standard convolution neural network and the aligned input.
  • the aligned input is processed through the standard convolution layers to produce a final convolved representation for each of the y sequencing cycles being base called.
  • a base call for each of the target clusters is obtained for each of the y sequencing cycles being base called in the similar fashion using flatten (optional), dense (optional), classification, and base call filtering layers as discussed above with respect to Figure 98.
  • the standard convolutional neural network can process reframed input for fewer or greater number of sequencing cycles and can comprise fewer or greater number of standard convolution layers. Also, the dimensionality of the reframed input, the per-cycle tensors in the reframed input, the convolution filters, the resulting feature maps, and the output can be different. Also, the number of convolution filters in a convolution layer can be different.
  • It can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transpose convolutions, depthwise separable convolutions, pointwise convolutions, 1 x 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatial separable convolutions, and deconvolutions.
  • It can use one or more loss functions such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean-squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss.
  • It can include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (like an LSTM or GRU), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions like rectifying linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
  • Figure 107 shows one implementation of training 10700 the neural network-based base caller 1514.
  • the neural network-based base caller 1514 is trained using a backpropagation-based gradient update technique that compares the predicted base calls 10704 against the correct base calls 10708 and computes an error 10706 based on the comparison.
  • the error 10706 is then used to calculate gradients, which are applied to the weights and parameters of the neural network-based base caller 1514 during backward propagation 10710.
  • the training 10700 is operationalized by the trainer 1510 using a stochastic gradient update algorithm such as ADAM.
  • the trainer 1510 uses training data 10702 (derived from the sequencing images 108) to train the neural network-based base caller 1514 over thousands and millions of iterations of the forward propagation 10712 that produces the predicted base calls 10704 and the backward propagation 10710 that updates the weights and parameters based on the error 10706. Additional details about the training 10700 can be found in the Appendix entitled “Deep Learning Tools”.
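  • A minimal PyTorch sketch of such a gradient-update training loop is shown below; the tiny stand-in model, the random batches, and the learning rate are assumptions made for illustration and do not reproduce the trainer 1510.

```python
# Illustrative backpropagation-based training loop with an ADAM optimizer.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(15 * 15 * 5, 4))  # stand-in base caller
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()                                 # softmax cross-entropy

for step in range(100):                                         # forward/backward iterations
    inputs = torch.randn(32, 5, 15, 15)                         # illustrative training batch
    correct_base_calls = torch.randint(0, 4, (32,))             # ground truth labels
    predicted_base_calls = model(inputs)                        # forward propagation
    error = loss_fn(predicted_base_calls, correct_base_calls)   # compare to ground truth
    optimizer.zero_grad()
    error.backward()                                            # backward propagation
    optimizer.step()                                            # gradient update
```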
  • Figure 108a depicts one implementation of a hybrid neural network 10800a that is used as the neural network-based base caller 1514.
  • the hybrid neural network 10800a comprises at least one convolution module 10804 (or convolutional neural network (CNN)) and at least one recurrent module 10808 (or recurrent neural network (RNN)).
  • the recurrent module 10808 uses and/or receives inputs from the convolution module 10804.
  • the convolution module 10804 processes input data 10802 through one or more convolution layers and produces convolution output 10806.
  • the input data 10802 includes only image channels or image data as the main input, as discussed above in the Section entitled“Input”.
  • the image data fed to the hybrid neural network 10800a can be the same as the image data 7902 described above.
  • the input data 10802 in addition to the image channels or the image data, also includes supplemental channels such as the distance channels, the scaling channel, the cluster center coordinates, and/or cluster attribution information, as discussed above in the Section entitled“Input”.
  • the image data (i.e., the input data 10802) depicts intensity emissions of one or more clusters and their surrounding background.
  • the convolution module 10804 processes the image data for a series of sequencing cycles of a sequencing run through the convolution layers and produces one or more convolved representations of the image data (i.e., the convolved output 10806).
  • the series of sequencing cycles can include image data for t sequencing cycles that are to be base called, where t is any number between 1 and 1000. We observe accurate base calling results when t is between fifteen and twenty-one.
  • the recurrent module 10808 convolves the convolved output 10806 and produces recurrent output 10810.
  • the recurrent module 10808 produces current hidden state representations (i.e., the recurrent output 10810) based on convolving the convolved representations and previous hidden state representations.
  • the recurrent module 10808 applies three-dimensional (3D) convolutions to the convolved representations and previous hidden state representations and produces the current hidden state representations, mathematically formulated as: h(t) = W1_3DCONV ⊛ x(t) + W2_3DCONV ⊛ h(t-1), where W1_3DCONV represents weights of a first 3D convolution filter applied to the current input x(t), h(t-1) represents a previous hidden state representation produced at a previous time step t-1, and W2_3DCONV represents weights of a second 3D convolution filter applied to h(t-1).
  • W1_3DCONV and W2_3DCONV are the same because the weights are shared.
  • An output module 10812 then produces base calls 10814 based on the recurrent output 10810.
  • the output module 10812 comprises one or more fully-connected layers and a classification layer (e.g., softmax).
  • the current hidden state representations are processed through the fully -connected layers and the outputs of the fully -connected layers are processed through the classification layer to produce the base calls 10814.
  • the base calls 10814 include a base call for at least one of the clusters and for at least one of the sequencing cycles. In some implementations, the base calls 10814 include a base call for each of the clusters and for each of the sequencing cycles. So, for example, when the input data 10802 includes image data for twenty-five clusters and for fifteen sequencing cycles, the base calls 10814 include a base call sequence of fifteen base calls for each of the twenty-five clusters.
  • Figure 108b shows one implementation of 3D convolutions 10800b used by the recurrent module 10808 of the hybrid neural network 10800a to produce the current hidden state representations.
  • a 3D convolution is a mathematical operation where each voxel present in the input volume is multiplied by a voxel in the equivalent position of the convolution kernel. At the end, the sum of the results is added to the output volume.
  • In Figure 108b, it is possible to observe the representation of the 3D convolution operation, where the voxels 10816a highlighted in the input 10816 are multiplied with their respective voxels in the kernel 10818. After these calculations, their sum 10820a is added to the output 10820.
  • 3D convolutions in addition to extracting spatial information from matrices like 2D convolutions, extract information present between consecutive matrices. This allows them to map both spatial information of 3D objects and temporal information of a set of sequential images.
  • Figure 109 illustrates one implementation of processing, through a cascade of convolution layers 10900 of the convolution module 10804, per-cycle input data 10902 for a single sequencing cycle among the series of t sequencing cycles to be base called.
  • the convolution module 10804 separately processes each per-cycle input data in a sequence of per-cycle input data through the cascade of convolution layers 10900.
  • the sequence of per-cycle input data is generated for a series of t sequencing cycles of a sequencing run that are to be base called, where t is any number between 1 and 1000. So, for example, when the series includes fifteen sequencing cycles, the sequence of per-cycle input data comprises fifteen different per-cycle input data.
  • each per-cycle input data includes only image channels (e.g., a red channel and a green channel) or image data (e.g., the image data 7902 described above).
  • the image channels or the image data depict intensity emissions of one or more clusters and their surrounding background captured at a respective sequencing cycle in the series.
  • each per-cycle input data in addition to the image channels or the image data, also includes supplemental channels such as the distance channels and the scaling channel (e.g., the input data 9500 described above).
  • the per-cycle input data 10902 includes two image channels, namely, a red channel and a green channel, for the single sequencing cycle among the series of t sequencing cycles to be base called.
  • Each image channel is encoded in an image patch of size 15 x 15.
  • the convolution module 10804 comprises five convolution layers.
  • Each convolution layer has a bank of twenty-five convolution filters of size 3 x 3.
  • the convolution filters use so-called SAME padding that preserves the height and width of the input images or tensors. With the SAME padding, a padding is added to the input features such that the output feature map has the same size as the input features. In contrast, so-called VALID padding means no padding.
  • the first convolution layer 10904 processes the per-cycle input data 10902 and produces a first convolved representation 10906 of size 15 x 15 x 25.
  • the second convolution layer 10908 processes the first convolved representation 10906 and produces a second convolved representation 10910 of size 15 x 15 x 25.
  • the third convolution layer 10912 processes the second convolved representation 10910 and produces a third convolved representation 10914 of size 15 x 15 x 25.
  • the fourth convolution layer 10916 processes the third convolved representation 10914 and produces a fourth convolved representation 10918 of size 15 x 15 x 25.
  • the fifth convolution layer 10920 processes the fourth convolved representation 10918 and produces a fifth convolved representation 10922 of size 15 x 15 x 25.
  • the SAME padding preserves the spatial dimensions of the resulting convolved representations (e.g., 15 x 15).
  • the number of convolution filters in the convolution layers is a power of two, such as 2, 4, 16, 32, 64, 128, 256, 512, and 1024.
  • Figure 110 depicts one implementation of mixing 11000 the single sequencing cycle's per-cycle input data 10902 with its corresponding convolved representations 10906, 10910, 10914, 10918, and 10922 produced by the cascade of convolution layers 10900 of the convolution module 10804.
  • the convolved representations 10906, 10910, 10914, 10918, and 10922 are concatenated to form a sequence of convolved representations 11004, which in turn is concatenated with the per-cycle input data 10902 to produce a mixed representation 11006.
  • summation is used instead of concatenation.
  • the mixing 11000 is operationalized by the mixer 11002.
  • a flattener 11008 then flattens the mixed representation 11006 and produces a per-cycle flattened mixed representation 11010.
  • the flattened mixed representation 11010 is a high dimensional vector or two-dimensional (2D) array that shares at least one dimension size with the per-cycle input data 10902 and the convolved representations 10906, 10910, 10914, 10918, and 10922 (e.g., 15 x 1905, i.e., same row-wise dimension). This induces symmetry in the data that facilitates feature extraction in downstream 3D convolutions.
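  • The mix-and-flatten step can be sketched as follows; the shapes follow the example above (two image channels plus five 25-deep convolved representations gives 15 x 15 x 127, flattened to 15 x 1905), while the random data stands in for real feature maps.

```python
# Hedged numpy sketch of mixing a per-cycle input with its convolved
# representations and flattening the result to a 2D array.
import numpy as np

per_cycle_input = np.random.rand(15, 15, 2)                     # two image channels
convolved = [np.random.rand(15, 15, 25) for _ in range(5)]      # five convolution layers' outputs
mixed = np.concatenate([per_cycle_input] + convolved, axis=-1)  # (15, 15, 127)
flattened_mixed = mixed.reshape(15, -1)                         # (15, 1905), keeps one dimension
print(flattened_mixed.shape)
```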
  • Figures 109 and 110 illustrate processing of the per-cycle image data 10902 for the single sequencing cycle among the series of t sequencing cycles to be base called.
  • the convolution module 10804 separately processes respective per-cycle image data for each of the t sequencing cycles and produces a respective per-cycle flattened mixed representation for each of the t sequencing cycles (see the mixing-and-stacking sketch after this list).
  • Figure 111 shows one implementation of arranging flattened mixed representations of successive sequencing cycles as a stack 11100.
  • fifteen flattened mixed representations 10904a to 10904o for fifteen sequencing cycles are stacked in the stack 11100.
  • Stack 11100 is a 3D input volume that makes available features from both spatial and temporal dimensions (i.e., multiple sequencing cycles) in the same receptive field of a 3D convolution filter.
  • the stacking is operationalized by the stacker 11102.
  • stack 11100 can be a tensor of any dimensionality (e.g., 1D, 2D, 4D, 5D, etc.).
  • a current hidden state representation at a current time step is a function of (i) the previous hidden state representation from a previous time step and (ii) the current input at the current time step.
  • the recurrent module 10808 subjects the stack 11100 to recurrent application of 3D convolutions (i.e., recurrent processing 11200) in forward and backward directions and produces base calls for each of the clusters at each of the t sequencing cycles in the series.
  • the 3D convolutions are used to extract spatio-temporal features from a subset of the flattened mixed representations in the stack 11100 on a sliding window basis.
  • Each sliding window (w) corresponds to a respective sequencing cycle and is highlighted in Figure 112a in orange.
  • w is parameterized to be 1, 2, 3, 5, 7, 9, 15, 21, etc., depending on the total number of sequencing cycles being simultaneously base called.
  • w is a fraction of the total number of sequencing cycles being simultaneously base called.
  • each sliding window contains three successive flattened mixed representations from the stack 11100 that comprises the fifteen flattened mixed representations 10904a to 10904o. Then, the first three flattened mixed representations 10904a to 10904c in the first sliding window correspond to the first sequencing cycle, the next three flattened mixed representations 10904b to 10904d in the second sliding window correspond to the second sequencing cycle, and so on. In some implementations, padding is used to encode an adequate number of flattened mixed representations in the final sliding window corresponding to the final sequencing cycle, starting with the final flattened mixed representation 10904o.
  • each current input x(t) is a 3D volume of a plurality of flattened mixed representations (e.g., 1, 2, 3, 5, 7, 9, 15, or 21 flattened mixed representations, depending on w).
  • each current input x(t), at each time step, is a 3D volume with dimensions 15 x 1905 x 7.
  • the recurrent module 10808 applies a first 3D convolution (with weights W1) to the current input x(t) and a second 3D convolution (with weights W2) to the previous hidden state representation h(t-1) to produce the current hidden state representation h(t).
  • W1 and W2 are the same because the weights are shared.
  • the recurrent module 10808 processes the current input x(t) and the previous hidden state representation h(t-1) through a gated network such as a long short-term memory (LSTM) network or a gated recurrent unit (GRU) network.
  • Figure 112b shows one implementation of processing 11200b the current input x(t) and the previous hidden state representation h(t-1) through an LSTM unit that applies 3D convolutions to the current input x(t) and the previous hidden state representation h(t-1) and produces the current hidden state representation h(t) as output (a ConvLSTM-style sketch of this processing appears after this list).
  • the weights of the input, activation, forget, and output gates apply 3D convolutions.
  • the gated units do not use the non-linearity/squashing functions like hyperbolic tangent and sigmoid.
  • the current input x(t), the previous hidden state representation h(t-1), and the current hidden state representation h(t) are all 3D volumes with the same dimensionality and are processed through, or produced by, the input, activation, forget, and output gates as 3D volumes.
  • the 3D convolutions of the recurrent module 10808 use a bank of twenty-five convolution filters of size 3 x 3, along with the SAME padding.
  • the size of the convolution filters is 5 x 5.
  • the number of convolution filters used by the recurrent module 10808 is a power of two, such as 2, 4, 16, 32, 64, 128, 256, 512, and 1024.
  • the recurrent module 10808 first processes the stack 11100 from the beginning to the end (top-down) on the sliding window basis and produces a sequence of current hidden state representations (vectors) for the forward traversal.
  • the recurrent module 10808 then processes the stack 11100 from the end to the beginning (bottom-up) on the sliding window basis and produces a sequence of current hidden state representations (vectors) for the backward/reverse traversal.
  • the processing uses the gates of an LSTM or a GRU. For example, at each time step, a forward current input x(t) is processed through the input, activation, forget, and output gates of an LSTM unit to produce a forward current hidden state representation, and a backward current input x(t) is processed through the input, activation, forget, and output gates of another LSTM unit to produce a backward current hidden state representation.
  • the recurrent module 10808 combines (concatenates, sums, or averages) the corresponding forward and backward current hidden state representations and produces a combined hidden state representation.
  • the combined hidden representation is then processed through one or more fully-connected networks to produce a dense representation from which the base calls are derived.
  • the hybrid architecture can process input data for a smaller or larger number of sequencing cycles and can comprise a smaller or larger number of convolution and recurrent layers. Also, the dimensionality of the input data, the current and previous hidden representations, the convolution filters, the resulting feature maps, and the output can be different.
  • the number of convolution filters in a convolution layer can be different. It can use different padding and striding configurations. It can use a different classification function (e.g., sigmoid or regression) and may or may not include a fully-connected layer. It can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transpose convolutions, depthwise separable convolutions, pointwise convolutions, 1 x 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatial separable convolutions, and deconvolutions.
  • It can use one or more loss functions such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean-squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss. It can use any parallelism, efficiency, and compression schemes such as TFRecords, compressed encoding (e.g., PNG), sharding, parallel calls for map transformation, batching, prefetching, model parallelism, data parallelism, and synchronous/asynchronous SGD.
  • It can include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (like an LSTM or GRU), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions like rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
  • Figure 113 shows one implementation of balancing trinucleotides (3-mers) in the training data used to train the neural network-based base caller 1514. Balancing results in very little learning of genome-specific statistics from the training data, which in turn improves generalization (a simple 3-mer balancing sketch appears after this list).
  • Heat map 11302 shows balanced 3-mers in the training data for a first organism called "A.baumanni".
  • Heat map 11304 shows balanced 3-mers in the training data for a second organism called "E.coli".
  • Figure 114 compares base calling accuracy of the RTA base caller against the neural network-based base caller 1514. As illustrated in Figure 114, the RTA base caller has a higher error percentage in two sequencing runs (Read: 1 and Read: 2). That is, the neural network-based base caller 1514 outperforms the RTA base caller in both the sequencing runs.
  • Figure 115 compares tile-to-tile generalization of the RTA base caller with that of the neural network-based base caller 1514 on a same tile. That is, with the neural network-based base caller 1514, the inference (testing) is performed on data for the same tile whose data is used in the training.
  • Figure 116 compares tile-to-tile generalization of the RTA base caller with that of the neural network-based base caller 1514 on a same tile and on different tiles. That is, the neural network-based base caller 1514 is trained on data for clusters on a first tile, but performs inference on data from clusters on a second tile. In the same tile implementation, the neural network-based base caller 1514 is trained on data from clusters on tile five and tested on data from clusters on tile five. In the different tile implementation, the neural network-based base caller 1514 is trained on data from clusters on tile ten and tested on data from clusters on tile five.
  • Figure 117 also compares tile-to-tile generalization of the RTA base caller with that of the neural network-based base caller 1514 on different tiles.
  • the neural network-based base caller 1514 is once trained on data from clusters on tile ten and tested on data from clusters on tile five, and then trained on data from clusters on tile twenty and tested on data from clusters on tile five.
  • Figure 118 shows how different sizes of the image patches fed as input to the neural network-based base caller 1514 affect the base calling accuracy.
  • the error percentage decreases as the patch size increases from 3 x 3 to 11 x 11. That is, the neural network-based base caller 1514 produces more accurate base calls with larger image patches.
  • base calling accuracy is balanced against compute efficiency by using image patches that are not larger than 100 x 100 pixels. In other implementations, image patches as large as 3000 x 3000 pixels (and larger) are used.
  • Figures 119, 120, 121, and 122 show lane-to-lane generalization of the neural network-based base caller 1514 on training data from A.baumanni and E.coli.
  • the neural network-based base caller 1514 is trained on E.coli data from clusters on a first lane of a flow cell and tested on A.baumanni data from clusters on both the first and second lanes of the flow cell. In another implementation, the neural network-based base caller 1514 is trained on A.baumanni data from clusters on the first lane and tested on the A.baumanni data from clusters on both the first and second lanes. In yet another implementation, the neural network-based base caller 1514 is trained on E.coli data from clusters on the second lane and tested on the A.baumanni data from clusters on both the first and second lanes. In yet a further implementation, the neural network-based base caller 1514 is trained on A.baumanni data from clusters on the second lane and tested on the A.baumanni data from clusters on both the first and second lanes.
  • the neural network-based base caller 1514 is trained on E.coli data from clusters on a first lane of a flow cell and tested on E.coli data from clusters on both the first and second lanes of the flow cell. In another implementation, the neural network-based base caller 1514 is trained on A.baumanni data from clusters on the first lane and tested on the E.coli data from clusters on both the first and second lanes. In yet another implementation, the neural network-based base caller 1514 is trained on E.coli data from clusters on the second lane and tested on the E.coli data from clusters on the first lane. In yet a further implementation, the neural network-based base caller 1514 is trained on A.baumanni data from clusters on the second lane and tested on the E.coli data from clusters on both the first and second lanes.
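
The bullets above describe the per-cycle convolution cascade in prose. Below is a minimal sketch of that cascade, assuming PyTorch, a two-channel 15 x 15 per-cycle input, and five layers of twenty-five 3 x 3 filters with SAME padding. The class and argument names are illustrative only, and activation functions, normalization, and other details of the actual implementation are omitted.

```python
import torch
import torch.nn as nn

class PerCycleConvCascade(nn.Module):
    """Sketch of the five-layer per-cycle convolution cascade (illustrative names)."""
    def __init__(self, in_channels=2, num_filters=25, num_layers=5):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv2d(in_channels if i == 0 else num_filters, num_filters,
                      kernel_size=3, padding=1)  # padding=1 emulates SAME padding for 3 x 3 filters
            for i in range(num_layers)
        ])

    def forward(self, x):
        # x: (batch, 2, 15, 15), the red and green image channels for one sequencing cycle
        outputs = []
        for conv in self.convs:
            x = conv(x)              # spatial size stays 15 x 15 because of SAME padding
            outputs.append(x)        # each convolved representation is (batch, 25, 15, 15)
        return outputs               # the five convolved representations
```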
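The mixing, flattening, and stacking steps can be sketched as follows, again assuming PyTorch and the cascade sketch above; the function names are illustrative. The channel arithmetic follows the text: 2 input channels plus 5 x 25 convolved channels gives 127 channels per cycle, which flatten to a 15 x 1905 representation, and t such representations are stacked into the 3D input volume consumed by the recurrent module.

```python
import torch

def mix_and_flatten(per_cycle_input, convolved_reps):
    # per_cycle_input: (batch, 2, 15, 15); convolved_reps: five tensors of shape (batch, 25, 15, 15)
    mixed = torch.cat([per_cycle_input] + convolved_reps, dim=1)   # (batch, 127, 15, 15)
    batch, channels, height, width = mixed.shape
    # keep the 15 rows and fold the channels into the columns: 15 x (15 * 127) = 15 x 1905
    return mixed.permute(0, 2, 3, 1).reshape(batch, height, width * channels)

def stack_cycles(per_cycle_flattened):
    # per_cycle_flattened: list of t tensors, each (batch, 15, 1905)
    return torch.stack(per_cycle_flattened, dim=1)                 # (batch, t, 15, 1905)
```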
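The recurrent processing can be sketched as a ConvLSTM-style cell whose gates apply 3D convolutions to the current input x(t) and the previous hidden state h(t-1), run over the sliding-window volumes in the forward and backward directions and combined by concatenation. This is a sketch under assumptions, not the implementation: the standard sigmoid/tanh gating is shown even though some implementations omit the squashing functions, separate input and hidden-state convolutions are used rather than shared weights, and all names are illustrative.

```python
import torch
import torch.nn as nn

class Conv3dLSTMCell(nn.Module):
    """ConvLSTM cell whose gates use 3D convolutions (a sketch)."""
    def __init__(self, in_channels=1, hidden_channels=25, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # one convolution per source, producing all four gates (input, forget, activation, output)
        self.x_conv = nn.Conv3d(in_channels, 4 * hidden_channels, kernel_size, padding=pad)
        self.h_conv = nn.Conv3d(hidden_channels, 4 * hidden_channels, kernel_size, padding=pad, bias=False)
        self.hidden_channels = hidden_channels

    def forward(self, x_t, h_prev, c_prev):
        # x_t: (batch, 1, w, 15, 1905) sliding-window volume for one sequencing cycle
        gates = self.x_conv(x_t) + self.h_conv(h_prev)
        i, f, g, o = torch.chunk(gates, 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c_t = f * c_prev + i * g                 # current cell state
        h_t = o * torch.tanh(c_t)                # current hidden state representation h(t)
        return h_t, c_t

def bidirectional_pass(cell, windows):
    # windows: list of t sliding-window volumes, each (batch, 1, w, 15, 1905)
    def run(sequence):
        b, _, depth, height, width = sequence[0].shape
        h_t = torch.zeros(b, cell.hidden_channels, depth, height, width)
        c_t = torch.zeros_like(h_t)
        states = []
        for x_t in sequence:
            h_t, c_t = cell(x_t, h_t, c_t)
            states.append(h_t)
        return states
    forward_states = run(windows)
    backward_states = list(reversed(run(list(reversed(windows)))))
    # combine corresponding forward and backward hidden states (concatenation shown; sum or average also possible)
    return [torch.cat([f, b], dim=1) for f, b in zip(forward_states, backward_states)]
```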
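Finally, the 3-mer balancing discussed for Figure 113 can be sketched as a greedy subsampling of training reads; capping how often each trinucleotide may appear relative to the rarest one is one simple balancing strategy, not necessarily the procedure used to produce the figures, and the function names are illustrative.

```python
from collections import Counter
import random

def count_3mers(reads):
    """Count every trinucleotide across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - 2):
            counts[read[i:i + 3]] += 1
    return counts

def balance_3mers(reads, slack=1.1, seed=0):
    """Greedily subsample reads so no 3-mer exceeds the rarest 3-mer's count by much."""
    rng = random.Random(seed)
    target = min(count_3mers(reads).values())      # frequency of the rarest 3-mer
    shuffled = list(reads)
    rng.shuffle(shuffled)
    kept, running = [], Counter()
    for read in shuffled:
        read_counts = count_3mers([read])
        if all(running[kmer] + n <= target * slack for kmer, n in read_counts.items()):
            kept.append(read)
            running.update(read_counts)
    return kept
```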

Abstract

The technology disclosed processes a first input through a first neural network and produces a first output. The first input comprises first image data derived from images of analytes and their surrounding background captured by a sequencing system for a sequencing run. The technology disclosed processes the first output through a post-processor and produces metadata about the analytes and their surrounding background. The technology disclosed processes a second input through a second neural network and produces a second output. The second input comprises third image data obtained by modifying second image data based on the metadata. The second image data is derived from the images of the analytes and their surrounding background. The second output identifies base calls of one or more of the analytes at one or more sequencing cycles of the sequencing run.
PCT/US2020/024092 2019-03-21 2020-03-22 Séquençage à base d'intelligence artificielle WO2020191391A2 (fr)

Priority Applications (10)

Application Number Priority Date Filing Date Title
EP20757979.8A EP3942074A2 (fr) 2019-03-21 2020-03-22 Séquençage à base d'intelligence artificielle
CN202080004529.4A CN112689875A (zh) 2019-03-21 2020-03-22 基于人工智能的测序
SG11202012463YA SG11202012463YA (en) 2019-03-21 2020-03-22 Artificial intelligence-based sequencing
CA3104951A CA3104951A1 (fr) 2019-03-21 2020-03-22 Sequencage a base d'intelligence artificielle
MX2020014302A MX2020014302A (es) 2019-03-21 2020-03-22 Secuenciacion basada en inteligencia artificial.
AU2020240141A AU2020240141A1 (en) 2019-03-21 2020-03-22 Artificial intelligence-based sequencing
JP2020572706A JP2022535306A (ja) 2019-03-21 2020-03-22 人工知能ベースの配列決定
KR1020217003270A KR20210145116A (ko) 2019-03-21 2020-03-22 인공 지능 기반 서열분석
BR112020026455-5A BR112020026455A2 (pt) 2019-03-21 2020-03-22 Sequenciamento baseado em inteligência artificial
IL279533A IL279533A (en) 2019-03-21 2020-12-17 Creation through artificial intelligence

Applications Claiming Priority (30)

Application Number Priority Date Filing Date Title
US201962821681P 2019-03-21 2019-03-21
US201962821602P 2019-03-21 2019-03-21
US201962821724P 2019-03-21 2019-03-21
US201962821766P 2019-03-21 2019-03-21
US201962821618P 2019-03-21 2019-03-21
US62/821,766 2019-03-21
US62/821,724 2019-03-21
US62/821,602 2019-03-21
US62/821,681 2019-03-21
US62/821,618 2019-03-21
NL2023316A NL2023316B1 (en) 2019-03-21 2019-06-14 Artificial intelligence-based sequencing
NL2023311 2019-06-14
NL2023311A NL2023311B9 (en) 2019-03-21 2019-06-14 Artificial intelligence-based generation of sequencing metadata
NL2023312 2019-06-14
NL2023314A NL2023314B1 (en) 2019-03-21 2019-06-14 Artificial intelligence-based quality scoring
NL2023310 2019-06-14
NL2023316 2019-06-14
NL2023310A NL2023310B1 (en) 2019-03-21 2019-06-14 Training data generation for artificial intelligence-based sequencing
NL2023312A NL2023312B1 (en) 2019-03-21 2019-06-14 Artificial intelligence-based base calling
NL2023314 2019-06-14
US16/826,126 2020-03-20
US16/826,134 2020-03-20
US16/825,987 US11347965B2 (en) 2019-03-21 2020-03-20 Training data generation for artificial intelligence-based sequencing
US16/825,991 2020-03-20
US16/826,134 US11676685B2 (en) 2019-03-21 2020-03-20 Artificial intelligence-based quality scoring
US16/826,126 US11783917B2 (en) 2019-03-21 2020-03-20 Artificial intelligence-based base calling
US16/825,991 US11210554B2 (en) 2019-03-21 2020-03-20 Artificial intelligence-based generation of sequencing metadata
US16/825,987 2020-03-20
US16/826,168 US11436429B2 (en) 2019-03-21 2020-03-21 Artificial intelligence-based sequencing
US16/826,168 2020-03-21

Publications (2)

Publication Number Publication Date
WO2020191391A2 true WO2020191391A2 (fr) 2020-09-24
WO2020191391A3 WO2020191391A3 (fr) 2020-12-03

Family

ID=72519388

Family Applications (5)

Application Number Title Priority Date Filing Date
PCT/US2020/024088 WO2020191387A1 (fr) 2019-03-21 2020-03-21 Appel de base à base d'intelligence artificielle
PCT/US2020/024087 WO2020205296A1 (fr) 2019-03-21 2020-03-21 Génération à base d'intelligence artificielle de métadonnées de séquençage
PCT/US2020/024091 WO2020191390A2 (fr) 2019-03-21 2020-03-21 Notation de qualité faisant appel à l'intelligence artificielle
PCT/US2020/024090 WO2020191389A1 (fr) 2019-03-21 2020-03-21 Génération de données d'apprentissage pour séquençage à base d'intelligence artificielle
PCT/US2020/024092 WO2020191391A2 (fr) 2019-03-21 2020-03-22 Séquençage à base d'intelligence artificielle

Family Applications Before (4)

Application Number Title Priority Date Filing Date
PCT/US2020/024088 WO2020191387A1 (fr) 2019-03-21 2020-03-21 Appel de base à base d'intelligence artificielle
PCT/US2020/024087 WO2020205296A1 (fr) 2019-03-21 2020-03-21 Génération à base d'intelligence artificielle de métadonnées de séquençage
PCT/US2020/024091 WO2020191390A2 (fr) 2019-03-21 2020-03-21 Notation de qualité faisant appel à l'intelligence artificielle
PCT/US2020/024090 WO2020191389A1 (fr) 2019-03-21 2020-03-21 Génération de données d'apprentissage pour séquençage à base d'intelligence artificielle

Country Status (1)

Country Link
WO (5) WO2020191387A1 (fr)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110398370B (zh) * 2019-08-20 2021-02-05 贵州大学 一种基于hts-cnn模型的轴承故障诊断方法
US11200446B1 (en) 2020-08-31 2021-12-14 Element Biosciences, Inc. Single-pass primary analysis
CN112598620B (zh) * 2020-11-25 2022-11-15 哈尔滨工程大学 尿沉渣中透明管型、病理管型以及粘液丝的识别方法
CN112629851B (zh) * 2020-12-11 2022-10-25 南方海上风电联合开发有限公司 基于数据增强方法与图像识别的海上风电机组齿轮箱故障诊断方法
CN112541576B (zh) * 2020-12-14 2024-02-20 四川翼飞视科技有限公司 Rgb单目图像的生物活体识别神经网络构建方法
CN112652356B (zh) * 2021-01-19 2024-01-26 深圳市儒瀚科技有限公司 一种dna甲基化表观修饰的识别方法、识别设备及存储介质
CN112418360B (zh) * 2021-01-21 2021-04-13 深圳市安软科技股份有限公司 卷积神经网络的训练方法、行人属性识别方法及相关设备
CN113034355B (zh) * 2021-04-20 2022-06-21 浙江大学 一种基于深度学习的肖像图像双下巴去除方法
WO2023010069A1 (fr) * 2021-07-29 2023-02-02 Ultima Genomics, Inc. Systèmes et procédés d'appel de base adaptatifs
WO2023049212A2 (fr) * 2021-09-22 2023-03-30 Illumina, Inc. Appel de base basé sur l'état
CN114399628B (zh) * 2021-12-21 2024-03-08 四川大学 复杂空间环境下的绝缘子高效检测系统
CN114092920B (zh) * 2022-01-18 2022-04-15 腾讯科技(深圳)有限公司 一种模型训练的方法、图像分类的方法、装置及存储介质
CN115277116B (zh) * 2022-07-06 2024-02-02 中能电力科技开发有限公司 网络隔离的方法、装置、存储介质及电子设备
CN115272136B (zh) * 2022-09-27 2023-05-05 广州卓腾科技有限公司 基于大数据的证件照眼镜反光消除方法、装置、介质及设备
CN116796196B (zh) * 2023-08-18 2023-11-21 武汉纺织大学 基于多模态联合嵌入的共语姿势生成方法

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2782858B2 (ja) 1989-10-31 1998-08-06 松下電器産業株式会社 スクロール気体圧縮機
WO1991006778A1 (fr) 1989-11-02 1991-05-16 Sundstrand Corporation Pompe regenerative et procede de refoulement du fluide sous pression
DE10320388A1 (de) 2003-05-06 2004-11-25 Basf Ag Polymere für die Wasserbehandlung
EP1819304B1 (fr) 2004-12-09 2023-01-25 Twelve, Inc. Reparation de valvule sigmoide aortique
US20060178901A1 (en) 2005-01-05 2006-08-10 Cooper Kelana L Home movies television (HMTV)
JP2006199187A (ja) 2005-01-21 2006-08-03 Kyowa Sangyo Kk 車両用サンバイザ
SE529136C2 (sv) 2005-01-24 2007-05-08 Volvo Lastvagnar Ab Styrväxelkylare
US7144195B1 (en) 2005-05-20 2006-12-05 Mccoskey William D Asphalt compaction device
US7293515B2 (en) 2005-06-10 2007-11-13 Janome Sewing Machine Co., Ltd. Embroidery sewing machine
GB2549554A (en) * 2016-04-21 2017-10-25 Ramot At Tel-Aviv Univ Ltd Method and system for detecting an object in an image
LT3566158T (lt) * 2017-01-06 2022-06-27 Illumina, Inc. Fazinė korekcija
KR102246285B1 (ko) * 2017-03-07 2021-04-29 일루미나, 인코포레이티드 단일 광원, 2-광학 채널 서열분석
NL2018852B1 (en) * 2017-05-05 2018-11-14 Illumina Inc Optical distortion correction for imaged samples
CN111094540A (zh) * 2017-09-15 2020-05-01 伊鲁米纳公司 序列检测系统的调整与校准特征
MX2019014689A (es) * 2017-10-16 2020-10-19 Illumina Inc Clasificacion de sitio de escision y empalme basado en aprendizaje profundo.
NZ759818A (en) * 2017-10-16 2022-04-29 Illumina Inc Semi-supervised learning for training an ensemble of deep convolutional neural networks
US11288576B2 (en) * 2018-01-05 2022-03-29 Illumina, Inc. Predicting quality of sequencing results using deep neural networks
WO2019136388A1 (fr) * 2018-01-08 2019-07-11 Illumina, Inc. Systèmes et dispositifs de séquençage à haut débit avec détection basée sur un semi-conducteur
SG11201911805VA (en) * 2018-01-15 2020-01-30 Illumina Inc Deep learning-based variant classifier
US20200251183A1 (en) * 2018-07-11 2020-08-06 Illumina, Inc. Deep Learning-Based Framework for Identifying Sequence Patterns that Cause Sequence-Specific Errors (SSEs)

Patent Citations (68)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1991006678A1 (fr) 1989-10-26 1991-05-16 Sri International Sequençage d'adn
US5641658A (en) 1994-08-03 1997-06-24 Mosaic Technologies, Inc. Method for performing amplification of nucleic acid with two primers bound to a single solid support
US5719391A (en) 1994-12-08 1998-02-17 Molecular Dynamics, Inc. Fluorescence imaging system employing a macro scanning objective
US5528050A (en) 1995-07-24 1996-06-18 Molecular Dynamics, Inc. Compact scan head with multiple scanning modalities
US6859570B2 (en) 1997-03-14 2005-02-22 Trustees Of Tufts College, Tufts University Target analyte sensors utilizing microspheres
US6266459B1 (en) 1997-03-14 2001-07-24 Trustees Of Tufts College Fiber optic sensor with encoded microspheres
US7622294B2 (en) 1997-03-14 2009-11-24 Trustees Of Tufts College Methods for detecting target analytes and enzymatic reactions
US20020055100A1 (en) 1997-04-01 2002-05-09 Kawashima Eric H. Method of nucleic acid sequencing
WO1998044151A1 (fr) 1997-04-01 1998-10-08 Glaxo Group Limited Methode d'amplification d'acide nucleique
WO2000018957A1 (fr) 1998-09-30 2000-04-06 Applied Research Systems Ars Holding N.V. Procedes d'amplification et de sequençage d'acide nucleique
US7115400B1 (en) 1998-09-30 2006-10-03 Solexa Ltd. Methods of nucleic acid amplification and sequencing
US6355431B1 (en) 1999-04-20 2002-03-12 Illumina, Inc. Detection of nucleic acid amplification reactions using bead arrays
US20050244870A1 (en) 1999-04-20 2005-11-03 Illumina, Inc. Nucleic acid sequencing using microsphere arrays
WO2000063437A2 (fr) 1999-04-20 2000-10-26 Illumina, Inc. Detection de reactions d'acide nucleique sur microsupports de billes en reseau
US6770441B2 (en) 2000-02-10 2004-08-03 Illumina, Inc. Array compositions and methods of making same
US7329492B2 (en) 2000-07-07 2008-02-12 Visigen Biotechnologies, Inc. Methods for real-time single molecule sequence determination
US7211414B2 (en) 2000-12-01 2007-05-01 Visigen Biotechnologies, Inc. Enzymatic nucleic acid synthesis: compositions and methods for altering monomer incorporation fidelity
US20040096853A1 (en) 2000-12-08 2004-05-20 Pascal Mayer Isothermal amplification of nucleic acids on a solid support
US20050064460A1 (en) 2001-11-16 2005-03-24 Medical Research Council Emulsion compositions
US7427673B2 (en) 2001-12-04 2008-09-23 Illumina Cambridge Limited Labelled nucleotides
US7566537B2 (en) 2001-12-04 2009-07-28 Illumina Cambridge Limited Labelled nucleotides
US7057026B2 (en) 2001-12-04 2006-06-06 Solexa Limited Labelled nucleotides
US20040002090A1 (en) 2002-03-05 2004-01-01 Pascal Mayer Methods for detecting genome-wide sequence variations associated with a phenotype
US20070166705A1 (en) 2002-08-23 2007-07-19 John Milton Modified nucleotides
WO2004018497A2 (fr) 2002-08-23 2004-03-04 Solexa Limited Nucleotides modifies
US7541444B2 (en) 2002-08-23 2009-06-02 Illumina Cambridge Limited Modified nucleotides
US20050130173A1 (en) 2003-01-29 2005-06-16 Leamon John H. Methods of amplifying and sequencing nucleic acids
WO2005010145A2 (fr) 2003-07-05 2005-02-03 The Johns Hopkins University Procede et compositions de detection et d'enumeration de variations genetiques
US20060240439A1 (en) 2003-09-11 2006-10-26 Smith Geoffrey P Modified polymerases for improved incorporation of nucleotide analogues
US20110059865A1 (en) 2004-01-07 2011-03-10 Mark Edward Brennan Smith Modified Molecular Arrays
US8563477B2 (en) 2004-01-07 2013-10-22 Illumina Cambridge Limited Modified molecular arrays
WO2005065814A1 (fr) 2004-01-07 2005-07-21 Solexa Limited Arrangements moleculaires modifies
US7315019B2 (en) 2004-09-17 2008-01-01 Pacific Biosciences Of California, Inc. Arrays of optical confinements and uses thereof
US20060114714A1 (en) 2004-11-29 2006-06-01 Yoshiharu Kanegae Magnetroresistive random access memory and method of manufacturing the same
WO2006064199A1 (fr) 2004-12-13 2006-06-22 Solexa Limited Procede ameliore de detection de nucleotides
US20080280773A1 (en) 2004-12-13 2008-11-13 Milan Fedurco Method of Nucleotide Detection
US20070099208A1 (en) 2005-06-15 2007-05-03 Radoje Drmanac Single molecule arrays for genetic and chemical analysis
WO2007010252A1 (fr) 2005-07-20 2007-01-25 Solexa Limited Procede de sequencage d'une matrice de polynucleotide
WO2007010251A2 (fr) 2005-07-20 2007-01-25 Solexa Limited Preparation de matrices pour sequencage d'acides nucleiques
US7592435B2 (en) 2005-08-19 2009-09-22 Illumina Cambridge Limited Modified nucleosides and nucleotides and uses thereof
US7405281B2 (en) 2005-09-29 2008-07-29 Pacific Biosciences Of California, Inc. Fluorescent nucleotide analogs and uses therefor
US20070128624A1 (en) 2005-11-01 2007-06-07 Gormley Niall A Method of preparing libraries of template polynucleotides
US8158926B2 (en) 2005-11-23 2012-04-17 Illumina, Inc. Confocal imaging methods and apparatus
US20080009420A1 (en) 2006-03-17 2008-01-10 Schroth Gary P Isothermal methods for creating clonal single molecule arrays
WO2007123744A2 (fr) 2006-03-31 2007-11-01 Solexa, Inc. Systèmes et procédés pour analyse de séquençage par synthèse
US8241573B2 (en) 2006-03-31 2012-08-14 Illumina, Inc. Systems and devices for sequence by synthesis analysis
US20090088327A1 (en) 2006-10-06 2009-04-02 Roberto Rigatti Method for sequencing a polynucleotide template
US7414716B2 (en) 2006-10-23 2008-08-19 Emhart Glass S.A. Machine for inspecting glass containers
US20080108082A1 (en) 2006-10-23 2008-05-08 Pacific Biosciences Of California, Inc. Polymerase enzymes and reagents for enhanced nucleic acid sequencing
US20130023422A1 (en) 2008-05-05 2013-01-24 Illumina, Inc. Compensator for multiple surface imaging
US9079148B2 (en) 2008-07-02 2015-07-14 Illumina Cambridge Limited Using populations of beads for the fabrication of arrays on surfaces
US20120020537A1 (en) 2010-01-13 2012-01-26 Francisco Garcia Data processing system and methods
US20120270305A1 (en) 2011-01-10 2012-10-25 Illumina Inc. Systems, methods, and apparatuses to image a sample for biological or chemical analysis
US20130296175A1 (en) 2011-01-13 2013-11-07 Illumina Inc. Genetic Variants as Markers for Use in Urinary Bladder Cancer Risk Assessment, Diagnosis, Prognosis and Treatment
US20120316086A1 (en) 2011-06-09 2012-12-13 Illumina, Inc. Patterned flow-cells useful for nucleic acid analysis
US8778848B2 (en) 2011-06-09 2014-07-15 Illumina, Inc. Patterned flow-cells useful for nucleic acid analysis
US20130116153A1 (en) 2011-10-28 2013-05-09 Illumina, Inc. Microarray fabrication system and method
US8778849B2 (en) 2011-10-28 2014-07-15 Illumina, Inc. Microarray fabrication system and method
US20140147014A1 (en) 2011-11-29 2014-05-29 Lucasfilm Entertainment Company Ltd. Geometry tracking
US20130184796A1 (en) 2012-01-16 2013-07-18 Greatbatch Ltd. Elevated Hermetic Feedthrough Insulator Adapted for Side Attachment of Electrical Conductors on the Body Fluid Side of an Active Implantable Medical Device
US20130260372A1 (en) 2012-04-03 2013-10-03 Illumina, Inc. Integrated optoelectronic read head and fluidic cartridge useful for nucleic acid sequencing
US20140243224A1 (en) 2013-02-26 2014-08-28 Illumina, Inc. Gel patterned surfaces
WO2014142831A1 (fr) 2013-03-13 2014-09-18 Illumina, Inc. Procédés et systèmes pour aligner des éléments d'adn répétitifs
WO2015002813A1 (fr) 2013-07-01 2015-01-08 Illumina, Inc. Greffage de polymère et fonctionnalisation de surface sans catalyseur
US20180274023A1 (en) 2013-12-03 2018-09-27 Illumina, Inc. Methods and systems for analyzing image data
WO2015106941A1 (fr) 2014-01-16 2015-07-23 Illumina Cambridge Limited Modification de polynucléotides sur support solide
US20160085910A1 (en) 2014-09-18 2016-03-24 Illumina, Inc. Methods and systems for analyzing nucleic acid sequencing data
WO2016066586A1 (fr) 2014-10-31 2016-05-06 Illumina Cambridge Limited Nouveaux polymères et revêtements de copolymères d'adn

Non-Patent Citations (46)

* Cited by examiner, † Cited by third party
Title
"3.3.9.11. Watershed and random walker for segmentation", SCIPY LECTURE NOTES, 13 November 2018 (2018-11-13), Retrieved from the Internet <URL:http://scipv-lectures.org/packages/scikit-image/autoexamples/plot_segmentations.htnd>
"skikit-image/peak.py at master", 16 November 2018, GITHUB
A. G. HOWARD, M. ZHU, B. CHEN, D. KALENICHENKO, W. WANG, T. WEYAND, M. ANDREETTO, H. ADAM: "Mobilenets: Efficient Convolutional Neural Networks for Mobile Vision Applications", ARXIV: 1704.04861, 2017
BENTLEY ET AL., NATURE, vol. 456, 2008, pages 53 - 59
BENTLEY ET AL., NATURE, vol. 456, pages 53 - 59
C. SZEGEDY, W. LIU, Y. JIA, P. SERMANET, S. REED, D. ANGUELOV, D. ERHAN, V. VANHOUCKE, A. RABINOVICH: "GOING DEEPER WITH CONVOLUTIONS", ARXIV: 1409.4842, 2014
DRESSMAN ET AL., PROC. NATL. ACAD. SCI. USA, vol. 100, 2003, pages 8817 - 8822
F. CHOLLET: "Xception: Deep Learning with Depthwise Separable Convolutions", PROC. OF CVPR, 2017
F. YU, V. KOLTUN: "MULTI-SCALE CONTEXT AGGREGATION BY DILATED CONVOLUTIONS", ARXIV: 1511.07122, 2016
G. HUANG, Z. LIU, L. VAN DER MAATEN, K. Q. WEINBERGER: "DENSELY CONNECTED CONVOLUTIONAL NETWORKS", ARXIV: 1608.06993, 2017
I. J. GOODFELLOW, D. WARDE-FARLEY, M. MIRZA, A. COURVILLE, Y. BENGIO: "AUTOREGRESSIVE MODEL BASED ON A DEEP CONVOLUTIONAL NEURAL NETWORK FOR AUDIO GENERATION", 2016, TAMPERE UNIVERSITY OF TECHNOLOGY, article "CONVOLUTIONAL NETWORKS"
J. GU, Z. WANG, J. KUEN, L. MA, A. SHAHROUDY, B. SHUAI, T. LIU, X. WANG, G. WANG: "RECENT ADVANCES IN CONVOLUTIONAL NEURAL NETWORKS", ARXIV:1512.07108, 2017
J. HUANG, V. RATHOD, C. SUN, M. ZHU, A. KORATTIKARA, A. FATHI, I. FISCHER, Z. WOJNA, Y. SONG, S. GUADARRAMA ET AL.: "Speed/accuracy trade-offs for modern convolutional object detectors", ARXIV PREPRINT ARXIV: 1611.10012, 2016
J. LONG, E. SHELHAMER, T. DARRELL: "Fully convolutional networks for semantic segmentation", CVPR, 2015
J. M. WOLTERINK, T. LEINER, M. A. VIERGEVER, I. ISGUM: "DILATED CONVOLUTIONAL NEURAL NETWORKS FOR CARDIOVASCULAR MR SEGMENTATION IN CONGENITAL HEART DISEASE", ARXIV:1704.03669, 2017
K. HE, X. ZHANG, S. REN, J. SUN: "DEEP RESIDUAL LEARNING FOR IMAGE RECOGNITION", ARXIV: 1512.03385, 2015
K. HE, X. ZHANG, S. REN, J. SUN: "Deep Residual Learning for Image Recognition", PROC. OF CVPR, 2016
KIM, S., SCHEFFLER, K., HALPERN, A.L., BEKRITSKY, M.A., NOH, E., KALLBERG, M., CHEN, X., BEYTER, D., KRUSCHE, P., SAUNDERS, C.T., STRELKA2: FAST AND ACCURATE VARIANT CALLING FOR CLINICAL SEQUENCING APPLICATIONS, vol. 595-595, 2017
L. SIFRE: "Rigid-motion Scattering for Image Classification", PH.D. THESIS, 2014
L. SIFRE, S. MALLAT: "Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination", PROC. OF CVPR, 2013
LIANG-CHIEH CHEN, GEORGE PAPANDREOU, FLORIAN SCHROFF, HARTWIG ADAM: "Rethinking atrous convolution for semantic image segmentation", CORR, 2017
LIU P, HEMANI A, PAUL K, WEIS C, JUNG M, WEHN N: "3D-Stacked Many-Core Architecture for Biological Sequence Analysis Problems", INT J PARALLEL PROG., vol. 45, no. 6, 2017, pages 1420 - 60, XP036325442, DOI: 10.1007/s10766-017-0495-0
LIZARDI ET AL., NAT. GENET., vol. 19, 1998, pages 225 - 232
LONG, JONATHAN: "Fully Convolutional Networks for Semantic Segmentation", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 39, no. 4, 1 April 2017 (2017-04-01)
M. LIN, Q. CHEN, S. YAN: "Network in Network", PROC. OF ICLR, 2014
M. SANDLER, A. HOWARD, M. ZHU, A. ZHMOGINOV, L. CHEN: "MobileNetV2: Inverted Residuals and Linear Bottlenecks", ARXIV:1801.04381V3, 2018
MORDVINTSEV, ALEXANDER, ABID K., IMAGE SEGMENTATION WITH WATERSHED ALGORITHM, 13 November 2018 (2018-11-13), Retrieved from the Internet <URL:httns://opencv-pvthon-tutroals.readthedocs.io/en/latest/nvtutorials/pvimgproc/pvwatershed/pvwatershed.html>
PRABHAKAR ET AL.: "Plasticine: A Reconfigurable Architecture for Parallel Patterns", ISCA '17, 24 June 2017 (2017-06-24)
R.K. SRIVASTAVA, K. GREFF, J. SCHMIDHUBER: "HIGHWAY NETWORKS", ARXIV: 1505.00387, 2015
RONNEBERGER O, FISCHER P, BROX T.: "U-net: Convolutional networks for biomedical image segmentation", MED. IMAGE COMPUT. COMPUT. ASSIST. INTERV., 2015, Retrieved from the Internet <URL:http://link.springer.com/chapter/10.1007/978-3-319-24574-4_28>
RONNEBERGER, OLAF: "U-net: Convolutional networks for biomedical image segmentation", INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION, 18 May 2015 (2015-05-18)
S. DIELEMAN, H. ZEN, K. SIMONYAN, O. VINYALS, A. GRAVES, N. KALCHBRENNER, A. SENIOR, K. KAVUKCUOGLU: "WAVENET: A GENERATIVE MODEL FOR RAW AUDIO", ARXIV: 1609.03499, 2016
S. IOFFE, C. SZEGEDY: "BATCH NORMALIZATION: ACCELERATING DEEP NETWORK TRAINING BY REDUCING INTERNAL COVARIATE SHIFT", ARXIV: 1502.03167, 2015
S. O. ARIK, M. CHRZANOWSKI, A. COATES, G. DIAMOS, A. GIBIANSKY, Y. KANG, X. LI, J. MILLER, A. NG, J. RAIMAN: "DEEP VOICE: REAL-TIME NEURAL TEXT-TO-SPEECH", ARXIV:1702.07825, 2017
S. XIE, R. GIRSHICK, P. DOLLAR, Z. TU, K. HE: "Aggregated Residual Transformations for Deep Neural Networks", PROC. OF CVPR, 2017
SHEVCHENKO, A., KERAS WEIGHTED CATEGORICAL_CROSSENTROPY, 15 January 2019 (2019-01-15), Retrieved from the Internet <URL:https://eist.eithub.com/skeeet/cad06d584548fb45eeceld4e28cfa98b>
STROMBERG, MICHAEL, ROY, RAJAT, LAJUGIE, JULIEN, JIANG, YU, LI, HAOCHEN, MARGULIES, ELLIOTT, NIRVANA: CLINICAL GRADE VARIANT ANNOTATOR, vol. 596-596, 2017
T SAUNDERS, CHRISTOPHER, WONG, WENDY, SWAMY, SAJANI, BECQ, JENNIFER, J MURRAY, LISA, CHEETHAM, KEIRA: "Strelka: Accurate somatic small-variant calling from sequenced tumor-normal sample pairs", BIOINFORMATICS (OXFORD, ENGLAND, vol. 28, 2012, pages 1811 - 7, XP055257165, DOI: 10.1093/bioinformatics/bts271
THAKUR, PRATIBHA: "A Survey of Image Segmentation Techniques", INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS, vol. 2, no. 4, April 2014 (2014-04-01), pages 158 - 165
VAN DEN ASSEM, D.C.F.: "Master of Science Thesis", 18 August 2017, DELFT UNIVERSITY OF TECHNOLOGY, article "Deep Learning for Pixelwise Classification of Hyperspectral Images", pages: 3 - 38
X. ZHANG, X. ZHOU, M. LIN, J. SUN: "ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices", ARXIV:1707.01083, 2017
XIE, W.: "Microscopy cell counting and detection with fully convolutional regression networks", COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING: IMAGING & VISUALIZATION, vol. 6, no. 3, 2018, pages 283 - 292, XP055551866, DOI: 10.1080/21681163.2016.1149104
XIE, YUANPU ET AL.: "Beyond classification: structured regression for robust cell detection using convolutional neural network", INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION, October 2015 (2015-10-01)
Z. QIN, Z. ZHANG, X. CHEN, Y. PENG: "FD-MobileNet: Improved MobileNet with a Fast Downsampling Strategy", ARXIV: 1802.03750, 2018
Z. WU, K. HAMMAD, E. GHAFAR-ZADEH, S. MAGIEROWSKI: "FPGA-Accelerated 3rd Generation DNA Sequencing", IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, vol. 14, no. 1, February 2020 (2020-02-01), pages 65 - 74, XP011771041, DOI: 10.1109/TBCAS.2019.2958049
Z. WU, K. HAMMAD, R. MITTMANN, S. MAGIEROWSKI, E. GHAFAR-ZADEH, X. ZHONG: "FPGA-Based DNA Basecalling Hardware Acceleration", PROC. IEEE 61ST INT. MIDWEST SYMP. CIRCUITS SYST., August 2018 (2018-08-01), pages 1098 - 1101, XP033508770, DOI: 10.1109/MWSCAS.2018.8623988

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111883203A (zh) * 2020-07-03 2020-11-03 上海厦维生物技术有限公司 用于预测pd-1疗效的模型的构建方法
CN111883203B (zh) * 2020-07-03 2023-12-29 上海厦维医学检验实验室有限公司 用于预测pd-1疗效的模型的构建方法
CN113506243A (zh) * 2021-06-04 2021-10-15 联合汽车电子有限公司 Pcb焊接缺陷检测方法、装置及存储介质
CN113658643A (zh) * 2021-07-22 2021-11-16 西安理工大学 一种基于注意力机制对lncRNA和mRNA的预测方法
CN113658643B (zh) * 2021-07-22 2024-02-13 西安理工大学 一种基于注意力机制对lncRNA和mRNA的预测方法
US11580641B1 (en) 2021-12-24 2023-02-14 GeneSense Technology Inc. Deep learning based methods and systems for nucleic acid sequencing
WO2023183937A1 (fr) * 2022-03-25 2023-09-28 Illumina, Inc. Appel de bases séquence par séquence
CN115630566A (zh) * 2022-09-28 2023-01-20 中国人民解放军国防科技大学 一种基于深度学习和动力约束的资料同化方法和系统
CN115630566B (zh) * 2022-09-28 2024-05-07 中国人民解放军国防科技大学 一种基于深度学习和动力约束的资料同化方法和系统

Also Published As

Publication number Publication date
WO2020191390A3 (fr) 2020-11-12
WO2020191391A3 (fr) 2020-12-03
WO2020205296A1 (fr) 2020-10-08
WO2020191387A1 (fr) 2020-09-24
WO2020191390A2 (fr) 2020-09-24
WO2020191389A1 (fr) 2020-09-24

Similar Documents

Publication Publication Date Title
US11436429B2 (en) Artificial intelligence-based sequencing
US20230004749A1 (en) Deep neural network-based sequencing
WO2020191391A2 (fr) Séquençage à base d'intelligence artificielle
NL2023316B1 (en) Artificial intelligence-based sequencing
NL2023312B1 (en) Artificial intelligence-based base calling
NL2023314B1 (en) Artificial intelligence-based quality scoring
NL2023311B9 (en) Artificial intelligence-based generation of sequencing metadata
NL2023310B1 (en) Training data generation for artificial intelligence-based sequencing

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 3104951

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2020572706

Country of ref document: JP

Kind code of ref document: A

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112020026455

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 2020240141

Country of ref document: AU

Date of ref document: 20200322

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20757979

Country of ref document: EP

Kind code of ref document: A2

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
ENP Entry into the national phase

Ref document number: 112020026455

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20201222

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2020757979

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2020757979

Country of ref document: EP

Effective date: 20211021