WO2022212180A1 - Artificial intelligence-based base caller with contextual awareness - Google Patents

Artificial intelligence-based base caller with contextual awareness

Info

Publication number
WO2022212180A1
WO2022212180A1 (PCT/US2022/021814)
Authority
WO
WIPO (PCT)
Prior art keywords
intensity
images
context data
neural network
sequencing
Prior art date
Application number
PCT/US2022/021814
Other languages
English (en)
Inventor
Amirali Kia
Original Assignee
Illumina, Inc.
Priority date
Filing date
Publication date
Priority claimed from US17/687,586 (US20220319639A1)
Application filed by Illumina, Inc.
Priority to CN202280005054.XA (CN115803816A)
Priority to CA3183578A (CA3183578A1)
Priority to AU2022248999A (AU2022248999A1)
Priority to EP22720805.5A (EP4315343A1)
Publication of WO2022212180A1


Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B 30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B 30/20 Sequence assembly
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B 40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B 40/20 Supervised data analysis

Definitions

  • the technology disclosed relates to artificial intelligence type computers and digital data processing systems and corresponding data processing methods and products for emulation of intelligence (i.e., knowledge based systems, reasoning systems, and knowledge acquisition systems); and including systems for reasoning with uncertainty (e.g., fuzzy logic systems), adaptive systems, machine learning systems, and artificial neural networks.
  • Convolutional neural networks are the current state-of-the-art machine learning algorithms for many tasks in computer vision, such as classification or segmentation. Training convolutional neural networks requires large amounts of computer memory, which increases exponentially with increasing image size. Computer memory becomes a limiting factor because the backpropagation algorithm for optimizing deep neural networks requires the storage of intermediate activations. Since the size of these intermediate activations in the convolutional neural networks increases proportionate to the input size, memory quickly fills up with large images.
  • Figure 1 is a simplified block diagram that shows various aspects of the technology disclosed.
  • Figure 2 illustrates one implementation of accessing a sequencing image on a patch- by-patch basis for base calling.
  • Figure 3 shows one implementation of generating intensity contextualized images.
  • Figure 4 shows one example of a full image from which a patch is accessed by the neural network such that the patch is centered at a target cluster to be base called.
  • Figure 5 depicts one implementation of the intensity contextualization unit having a plurality of convolution pipelines.
  • Figure 6 illustrates one implementation of the neural network processing an intensity contextualized patch and generating the base calls.
  • Figure 7 shows one implementation of the neural network processing previous, current, and successive intensity contextualized images for a plurality of sequencing cycles and generating the base calls.
  • Figures 8A and 8B demonstrate base calling superiority of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit over another neural network-based base caller (DeepRTA) and another non-neural network-based base caller (RTA).
  • Figure 9 shows the base calling error rates observed for various combinations (configurations) of filter sizes (or kernel sizes), strides, and filter bank sizes (K) of convolution filters of the disclosed neural network-based base caller.
  • Figure 10 compares base calling error rate of DeepRTA against base calling error rates of different filter bank size configurations (Ks) of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit (DeepRTA-K0-04; DeepRTA-K0-06; DeepRTA-K0-10; DeepRTA-K0-16; DeepRTA-K0-18; and DeepRTA-K0-20).
  • Figure 11 shows base calling error rates when the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit extracts intensity context data from an original input image of size 115x115 (fitted line with “o”) versus an original input image of size 160x160 (fitted line with “n”).
  • Figure 12 shows base calling accuracy (1 - base calling error rate) of the different configurations of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit, i.e., DeepRTA-K0-06, DeepRTA-349-K0-10-160p, DeepRTA-K0-16, DeepRTA-K0-16-Lanczos, DeepRTA-K0-18, and DeepRTA-K0-20, against DeepRTA over base calling homopolymers (e.g., GGGGG) and flanked-homopolymers (e.g., GGTGG).
  • Figure 13 compares base calling error rates of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit, trained on and performing inference on normalized sequencing images (“DeepRTA-V2:349”), against DeepRTA, RTA, and DeepRTA trained on and performing inference on normalized sequencing images (“DeepRTA-norm”).
  • Figures 14A and 14B depict one implementation of a sequencing system.
  • the sequencing system comprises a configurable processor.
  • Figure 14C is a simplified block diagram of a system for analysis of sensor data from the sequencing system, such as base call sensor outputs.
  • Figure 15 is a simplified diagram showing aspects of the base calling operation, including functions of a runtime program executed by a host processor.
  • Figure 16 is a simplified diagram of a configuration of a configurable processor such as the one depicted in Figure 14C.
  • Figure 17 is a computer system that can be used to implement the technology disclosed.
  • the functional blocks are not necessarily indicative of the division between hardware circuitry.
  • one or more of the functional blocks (e.g., modules, processors, or memories)
  • the programs may be stand-alone programs, may be incorporated as subroutines in an operating system, may be functions in an installed software package, and the like.
  • modules can be implemented in hardware or software, and need not be divided up in precisely the same blocks as shown in the figures. Some of the modules can also be implemented on different processors, computers, or servers, or spread among a number of different processors, computers, or servers. In addition, it will be appreciated that some of the modules can be combined, operated in parallel or in a different sequence than that shown in the figures without affecting the functions achieved.
  • the modules in the figures can also be thought of as flowchart steps in a method.
  • a module also need not necessarily have all its code disposed contiguously in memory; some parts of the code can be separated from other parts of the code with code from other modules or other functions disposed in between.
  • Figure 1 is a simplified block diagram that shows various aspects of the technology disclosed.
  • Figure 1 includes images 102, a data flow logic 104, an intensity contextualization unit 112 (also referred to herein as “patch processing unit (PPU)”), intensity context data 122, intensity contextualized images 114, a neural network 124 (or neural network- based base caller), and base calls 134.
  • the system can be formed by one or more programmed computers, with programming being stored on one or more machine readable media with code executed to carry out one or more steps of methods described herein.
  • the system includes the data flow logic 104 configured to output the intensity contextualized images 114 as digital image data, for example, image data that is representative of individual picture elements or pixels that, together, form an image of an array or other object.
  • Base calling is the process of determining the nucleotide composition of a sequence.
  • Base calling involves analyzing image data, i.e., sequencing images, produced during a sequencing run (or sequencing reaction) carried out by a sequencing instrument such as Illumina’s iSeq, HiSeqX, HiSeq 3000, HiSeq 4000, HiSeq 2500, NovaSeq 6000, NextSeq 550, NextSeq 1000, NextSeq 2000, NextSeqDx, MiSeq, and MiSeqDx.
  • Base calling decodes the intensity data encoded in the sequencing images into nucleotide sequences.
  • the Illumina sequencing platforms employ cyclic reversible termination (CRT) chemistry for base calling.
  • the process relies on growing nascent strands complementary to template strands with fluorescently-labeled nucleotides, while tracking the emitted signal of each newly added nucleotide.
  • the fluorescently-labeled nucleotides have a 3' removable block that anchors a fluorophore signal of the nucleotide type.
  • Sequencing occurs in repetitive cycles, each comprising three steps: (a) extension of a nascent strand by adding the fluorescently-labeled nucleotide; (b) excitation of the fluorophore using one or more lasers of an optical system of the sequencing instrument and imaging through different filters of the optical system, yielding the sequencing images; and (c) cleavage of the fluorophore and removal of the 3' block in preparation for the next sequencing cycle. Incorporation and imaging cycles are repeated up to a designated number of sequencing cycles, defining the read length. Using this approach, each cycle interrogates a new position along the template strands.
  • a cluster comprises approximately one thousand identical copies of a template strand, though clusters vary in size and shape.
  • the clusters are grown from the template strand, prior to the sequencing run, by bridge amplification or exclusion amplification of the input library.
  • the purpose of the amplification and cluster growth is to increase the intensity of the emitted signal since the imaging device cannot reliably sense fluorophore signal of a single strand.
  • the physical distance of the strands within a cluster is small, so the imaging device perceives the cluster of strands as a single spot.
  • Sequencing occurs in a flow cell (or biosensor) - a small glass slide that holds the input strands.
  • the flow cell is connected to the optical system, which comprises microscopic imaging, excitation lasers, and fluorescence filters.
  • the flow cell comprises multiple chambers called lanes. The lanes are physically separated from each other and may contain different tagged sequencing libraries, distinguishable without sample cross contamination.
  • the flow cell comprises a patterned surface.
  • a “patterned surface” refers to an arrangement of different regions in or on an exposed layer of a solid support.
  • the imaging device of the sequencing instrument (e.g., a solid-state imager such as a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) sensor) captures the sequencing images.
  • the output of the sequencing run is the sequencing images.
  • Sequencing images depict intensity emissions of the clusters and their surrounding background using a grid (or array) of pixelated units (e.g., pixels, superpixels, subpixels).
  • the intensity emissions are stored as intensity values of the pixelated units.
  • the sequencing images have dimensions w x h of the grid of pixelated units, where w (width) and h (height) are any numbers ranging from 1 to 100,000 (e.g., 115 x 115, 200 x 200, 1800 x 2000, 2200 x 25000, 2800 x 3600, 4000 x 400). In some implementations, w and h are the same. In other implementations, w and h are different.
  • the sequencing images depict intensity emissions generated as a result of nucleotide incorporation in the nucleotide sequences during the sequencing run. The intensity emissions are from associated clusters and their surrounding background.
  • Figure 2 illustrates one implementation of accessing a sequencing image 202 on a patch-by-patch basis 220 for base calling.
  • the data flow logic 104 provides the sequencing image 202 to the neural network 124 for base calling.
  • the neural network 124 accesses the sequencing image 202 on the patch-by-patch basis 220, for example, patches 202a, 202b, 202c, and 202d.
  • Each of the patches is a sub-grid (or sub-array) of pixelated units in the grid of pixelated units that forms the sequencing image 202.
  • the patches have dimensions q x r of the sub-grid of pixelated units, where q (width) and r (height) can be, for example, 1 x 1, 3 x 3, 5 x 5, 7 x 7, 10 x 10, 15 x 15, 25 x 25, and so on. In some implementations, q and r are the same. In other implementations, q and r are different. In some implementations, the patches are of the same size. In other implementations, the patches are of different sizes. In some implementations, the patches can have overlapping pixelated units (e.g., on the edges).
  • the sequencing image 202 depicts intensity emissions of a set of twenty-eight clusters 1-28.
  • the patches depict the intensity emissions for a subset of the clusters.
  • the patch 202a substantially depicts the intensity emissions for a first subset of seven clusters 1, 2, 3, 4, 5, 10, and 16;
  • the patch 202b substantially depicts the intensity emissions for a second subset of eight clusters 15, 16, 19, 20, 21, 22, 25, and 26;
  • the patch 202c substantially depicts the intensity emissions for a third subset of eight clusters 5, 6, 7, 8, 9, 12,
  • the patch 202d substantially depicts the intensity emissions for a fourth subset of nine clusters 13, 14, 17, 18, 22, 23, 24, 27, and 28.
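  • As an illustrative sketch only (not part of the original disclosure), the following Python/numpy snippet shows one way a patch centered at a target cluster could be accessed from a full sequencing image; the helper name extract_patch, the array shapes, and the absence of edge handling are assumptions made for illustration.

      import numpy as np

      def extract_patch(image, center_row, center_col, patch_size=15):
          # Return a patch of the sequencing image centered at a target cluster.
          # Clusters near the image border (where the patch would fall outside
          # the image) are not handled in this sketch.
          half = patch_size // 2
          top, left = center_row - half, center_col - half
          return image[top:top + patch_size, left:left + patch_size]

      # Hypothetical usage: a 115 x 115 image accessed on a patch-by-patch basis.
      image = np.random.rand(115, 115)
      patch = extract_patch(image, center_row=57, center_col=57)
      assert patch.shape == (15, 15)
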
  • each of the images 102 has one or more image (or intensity) channels (analogous to the red, green, blue (RGB) channels of a color image).
  • each image channel corresponds to one of a plurality of filter wavelength bands.
  • each image channel corresponds to one of a plurality of imaging events at a sequencing cycle.
  • each image channel corresponds to a combination of illumination with a specific laser and imaging through a specific optical filter. The patches are accessed from each of the m image channel(s) for a particular sequencing cycle.
  • m is 4 or 2. In other implementations, m is 1, 3, or greater than 4.
  • the sequencing uses two different image channels: a blue channel and a green channel. Then, at each sequencing cycle, the sequencing produces a blue image and a green image. This way, for a series of k sequencing cycles, a sequence with k pairs of blue and green images is produced as output and stored as the images 102. Accordingly, a sequence of per-cycle image patches is generated for a series of k sequencing cycles of a sequencing run.
  • the per-cycle image patches contain intensity data for associated clusters and their surrounding background in one or more image channels (e.g., a red channel and a green channel).
  • the per-cycle image patches are centered at a center pixel that contains intensity data for a target associated cluster and non-center pixels in the per-cycle image patches contain intensity data for associated clusters adjacent to the target associated cluster.
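  • As an illustrative sketch (not part of the original disclosure), the per-cycle pairing of blue and green images into a k-cycle input sequence described above can be pictured as follows; the helper name stack_per_cycle_images and the (k, H, W, 2) layout are assumptions for illustration.

      import numpy as np

      def stack_per_cycle_images(blue_images, green_images):
          # Pair each cycle's blue and green images into a two-channel image,
          # then stack the k cycles into a (k, H, W, 2) sequence.
          cycles = [np.stack([b, g], axis=-1) for b, g in zip(blue_images, green_images)]
          return np.stack(cycles, axis=0)

      k = 3
      blue = [np.random.rand(115, 115) for _ in range(k)]
      green = [np.random.rand(115, 115) for _ in range(k)]
      assert stack_per_cycle_images(blue, green).shape == (3, 115, 115, 2)
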
  • the patches have undiverse (indistinguishable) intensity patterns due to limited base diversity of clusters in the subset. Compared to a full image, a patch is smaller and has fewer clusters, which in turn reduces the base diversity.
  • the patch has scarce base variety because, compared to the full image, the patch depicts intensity patterns for a smaller number of different types of bases A, C, T, and G.
  • the patch can depict low-complexity base patterns in which some of the four bases A, C, T, and G are represented at a frequency of less than 15%, 10%, or 5% of all the nucleotides.
  • Low nucleotide diversity in the patches creates intensity patterns that lack signal diversity (contrast), i.e., undiverse intensity patterns.
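  • As an illustrative sketch (not part of the original disclosure), a low-complexity patch can be flagged by checking base frequencies against a threshold; the helper name is_low_complexity and the 15% cutoff mirror the example thresholds above and are assumptions for illustration.

      from collections import Counter

      def is_low_complexity(base_calls, min_fraction=0.15):
          # base_calls: string of called bases (A, C, G, T) for the clusters in a patch.
          # Returns True if any of the four bases is represented below min_fraction.
          counts = Counter(base_calls)
          total = len(base_calls)
          return any(counts.get(b, 0) / total < min_fraction for b in "ACGT")

      # e.g., a homopolymer-dominated patch lacks base diversity.
      print(is_low_complexity("GGGGGGGGGGGGGGGGAC"))  # True
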
  • Figure 3 shows one implementation of generating the intensity contextualized images 114.
  • the intensity contextualization unit 112 generates the intensity context data 122 from the images 102 and makes the intensity context data 122 available for incorporation into the patches.
  • the intensity contextualization unit 112 is configured with feature extraction logic that is applied on the intensity values in the images 102 to generate the intensity context data 122.
  • the feature extraction logic determines summary statistics of the intensity values in the images 102. Examples of the summary statistics include maximum value, minimum value, mean, mode, standard deviation, variance, skewness, kurtosis, percentiles, and entropy.
  • the feature extraction logic determines secondary statistics based on the summary statistics. Examples of the secondary statistics include deltas, sums, series of maximum values, series of minimum values, minimum of the maximum values in the series, and maximum of minimum values in the series.
  • the intensity context data 122 specifies summary statistics of the intensity values. In one implementation, the intensity context data 122 identifies a maximum value in the intensity values. In one implementation, the intensity context data 122 identifies a minimum value in the intensity values. In one implementation, the intensity context data 122 identifies a mean of the intensity values. In one implementation, the intensity context data 122 identifies a mode of the intensity values. In one implementation, the intensity context data 122 identifies a standard deviation of the intensity values. In one implementation, the intensity context data 122 identifies a variance of the intensity values. In one implementation, the intensity context data 122 identifies a skewness of the intensity values. In one implementation, the intensity context data 122 identifies a kurtosis of the intensity values. In one implementation, the intensity context data 122 identifies an entropy of the intensity values.
  • the intensity context data 122 identifies one or more percentiles of the intensity values. In one implementation, the intensity context data 122 identifies a delta between at least one of the maximum value and the minimum value, the maximum value and the mean, the mean and the minimum value, and a higher one of the percentiles and a lower one of the percentiles. In one implementation, the intensity context data 122 identifies a sum of the intensity values. In one implementation, the intensity contextualization unit 112 determines a plurality (or series) of maximum values by dividing the intensity values into groups and determining a maximum value for each of the groups. The intensity context data 122 identifies the smallest value in the plurality of maximum values.
  • the intensity contextualization unit 112 determines a plurality (or series) of minimum values by dividing the intensity values into groups and determining a minimum value for each of the groups.
  • the intensity context data 122 identifies the largest value in the plurality of minimum values.
  • the intensity contextualization unit 112 determines a plurality of sums by dividing the intensity values into groups and determining a sum of intensity values in each of the groups.
  • the intensity context data 122 identifies the smallest value in the plurality of sums.
  • the intensity context data 122 identifies the largest value in the plurality of sums.
  • the intensity context data 122 identifies a mean of the plurality of sums.
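  • As an illustrative sketch (not part of the original disclosure), the summary and secondary statistics listed above could be computed as follows; the grouping into four chunks and the helper name intensity_summary_stats are assumptions for illustration.

      import numpy as np

      def intensity_summary_stats(image, num_groups=4):
          values = image.ravel()
          stats = {
              "max": values.max(), "min": values.min(),
              "mean": values.mean(), "std": values.std(),
              "p90": np.percentile(values, 90),
              "delta_max_min": values.max() - values.min(),   # example delta
          }
          groups = np.array_split(values, num_groups)
          stats["min_of_group_max"] = min(g.max() for g in groups)   # smallest of the maxima
          stats["max_of_group_min"] = max(g.min() for g in groups)   # largest of the minima
          stats["mean_of_group_sums"] = float(np.mean([g.sum() for g in groups]))
          return stats
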
  • the intensity context data 122 comprises numerical values (e.g., floating-point numbers or integers) determined (or calculated) from the intensity values in the images 102.
  • the numerical values in the intensity context data 122 are features or feature maps generated as a result of applying convolution operations on the images 102.
  • the features in the intensity context data 122 can be stored as pixelated units (e.g., pixels, superpixels, subpixels) that contain the respective numerical values.
  • the intensity contextualization unit 112 is a multilayer perceptron (MLP). In another implementation, the intensity contextualization unit 112 is a feedforward neural network. In yet another implementation, the intensity contextualization unit 112 is a fully-connected neural network. In a further implementation, the intensity contextualization unit 112 is a fully convolutional neural network. In yet a further implementation, the intensity contextualization unit 112 is a semantic segmentation neural network. In yet another implementation, the intensity contextualization unit 112 is a generative adversarial network (GAN).
  • the intensity contextualization unit 112 is a convolutional neural network (CNN) with a plurality of convolution layers.
  • In another implementation, the intensity contextualization unit 112 is a recurrent neural network (RNN), such as a long short-term memory network (LSTM), a bi-directional LSTM (Bi-LSTM), or a gated recurrent unit (GRU). In yet another implementation, it includes both a CNN and an RNN.
  • the intensity contextualization unit 112 can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transpose convolutions, depthwise separable convolutions, pointwise convolutions, 1 x 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatial separable convolutions, and deconvolutions.
  • It can use one or more loss functions such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean-squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss. It can use any parallelism, efficiency, and compression schemes such as TFRecords, compressed encoding (e.g., PNG), sharding, parallel calls for map transformation, batching, prefetching, model parallelism, data parallelism, and synchronous/asynchronous stochastic gradient descent (SGD).
  • It can include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (like an LSTM or GRU), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions like rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid, and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
  • the intensity contextualization unit 112 is trained using backpropagation-based gradient update techniques.
  • Example gradient descent techniques that can be used for training the intensity contextualization unit 112 include stochastic gradient descent, batch gradient descent, and mini-batch gradient descent.
  • Some examples of gradient descent optimization algorithms that can be used to train the intensity contextualization unit 112 are Momentum, Nesterov accelerated gradient, Adagrad, Adadelta, RMSprop, Adam, AdaMax, Nadam, and AMSGrad.
  • an initial version of intensity context data, as generated by the intensity contextualization unit 112, has spatial dimensions that are different from the images 102 (e.g., the full image 202).
  • the initial version of intensity context data produced by the intensity contextualization unit 112 is subjected to further processing to generate the intensity context data 122 that is appendable to the full image 202.
  • the intensity context data 122 “being appendable” to the full image 202 means that the two have matching or similar spatial dimensions, i.e., width and height.
  • the initial version of intensity context data can be converted into the appendable intensity context data 122 by use of dimensionality augmentation techniques like upsampling, deconvolution, transpose convolution, dilated convolution, concatenation, and padding (e.g., when the spatial dimensions of the two are not exactly matching).
  • the initial version of intensity context data can be of size 1 x 1, 3 x 3, or 5 x 5, whereas the full image 202 is of size 115 x 115.
  • the initial version of intensity context data is duplicated (or cloned) such that the clones of the initial version of intensity context data are concatenated to form the intensity context data 122, which has spatial dimensions that match the full image 202.
  • the spatial dimensions of the initial version of intensity context data are 1 x 1, and the full image 202 is of size 115 x 115.
  • the intensity context data 122 comprises a plurality of context channels. Each context channel in the plurality of context channels is constructed using a respective feature from a plurality of features generated by the intensity contextualization unit 112.
  • the intensity contextualization unit 112 generates six 1 x 1 initial versions of the intensity context data.
  • six 115 x 115 context channels are generated using concatenation to constitute the intensity context data 122.
  • the data flow logic 104 appends the intensity context data 122 to the images 102 to generate the intensity contextualized images 114.
  • the intensity context data 122 comprises the plurality of context channels in which each context channel has the same spatial dimensions as the images 202. Consider a full image that has two image channels that form a first grid (or array) of pixelated units of size 115 x 115 and depth two. Further consider that the intensity context data 122 has six context channels that form a second grid of pixelated units of size 115 x 115 and depth six.
  • each of the intensity contextualized images has eight channels, two image channels from the full image and six context channels from the intensity context data 122.
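  • As an illustrative sketch (not part of the original disclosure), the cloning/concatenation of the 1 x 1 context features and the channel-wise append can be expressed as a broadcast followed by a concatenation; the (H, W, channel) layout and the helper name append_context_channels are assumptions for illustration.

      import numpy as np

      def append_context_channels(image, context_features):
          # image: (H, W, 2) array with two image channels.
          # context_features: six scalar features (the 1 x 1 initial versions of the
          # intensity context data); each feature is cloned across the H x W grid to
          # form a context channel and appended to the image channels.
          h, w, _ = image.shape
          context = np.broadcast_to(context_features, (h, w, len(context_features)))
          return np.concatenate([image, context], axis=-1)

      image = np.random.rand(115, 115, 2)
      features = np.random.rand(6)
      assert append_context_channels(image, features).shape == (115, 115, 8)
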
  • the data flow logic 104 provides the intensity contextualized images 114 to the neural network 124 as input, which accesses them on the patch-by-patch basis 220.
  • the input to the neural network 124 comprises intensity contextualized images for multiple sequencing cycles (e.g., a current sequencing cycle, one or more preceding sequencing cycles, and one or more successive sequencing cycles).
  • the input to the neural network 124 comprises intensity contextualized images for three sequencing cycles, such that intensity contextualized images for a current (time t) sequencing cycle to be base called are accompanied with (i) intensity contextualized images for a left flanking/context/previous/preceding/prior (time t-1) sequencing cycle and (ii) intensity contextualized images for a right flanking/context/next/successive/subsequent (time t+1) sequencing cycle.
  • the input to the neural network 124 comprises intensity contextualized images for five sequencing cycles, such that intensity contextualized images for a current (time t) sequencing cycle to be base called are accompanied with (i) data for a first left flanking/context/previous/preceding/prior (time t-1) sequencing cycle, (ii) intensity contextualized images for a second left flanking/context/previous/preceding/prior (time t-2) sequencing cycle, (iii) intensity contextualized images for a first right flanking/context/next/successive/subsequent (time t+1) sequencing cycle, and (iv) intensity contextualized images for a second right flanking/context/next/successive/subsequent (time t+2) sequencing cycle.
  • the input to the neural network 124 comprises intensity contextualized images for seven sequencing cycles, such that data for a current (time t) sequencing cycle to be base called is accompanied with (i) intensity contextualized images for a first left flanking/context/previous/preceding/prior (time t-1) sequencing cycle, (ii) intensity contextualized images for a second left flanking/context/previous/preceding/prior (time t-2) sequencing cycle, (iii) intensity contextualized images for a third left flanking/context/previous/preceding/prior (time t-3) sequencing cycle, (iv) intensity contextualized images for a first right flanking/context/next/successive/subsequent (time t+1) sequencing cycle, (v) intensity contextualized images for a second right flanking/context/next/successive/subsequent (time t+2) sequencing cycle, and (vi) intensity contextualized images for a third right flanking/context/next/successive/subsequent (time t+3) sequencing cycle.
  • the input to the neural network 124 comprises intensity contextualized images for a single sequencing cycle.
  • the input to the neural network 124 comprises intensity contextualized images for 58, 75, 92, 130, 168, 175, 209, 225, 230, 275, 318, 325, 330, 525, or 625 sequencing cycles.
  • the sequencing images from the current (time t) sequencing cycle are accompanied with the sequencing images from the preceding (time t-1) sequencing cycle and the sequencing images from the succeeding (time t+1) sequencing cycle.
  • the neural network-based base caller 124 processes the sequencing images through its convolution layers and produces an alternative representation, according to one implementation.
  • the alternative representation is then used by an output layer (e.g., a softmax layer) for generating a base call for either just the current (time t) sequencing cycle or each of the sequencing cycles, i.e., the current (time t) sequencing cycle, the preceding (time t-1) sequencing cycle, and the succeeding (time t+1) sequencing cycle.
  • the resulting base calls form the sequencing reads.
  • the neural network-based base caller 124 processes the intensity contextualized images 114 through its convolution layers and produces an alternative representation, according to one implementation.
  • the alternative representation is then used by an output layer (e.g., a softmax layer) for generating a base call for either just the current (time t) sequencing cycle or each of the sequencing cycles, i.e., the current (time t) sequencing cycle, the preceding (time t-1) sequencing cycle, and the succeeding (time t+1) sequencing cycle.
  • the resulting base calls form the sequencing reads and are stored as the base calls 134.
  • the neural network-based base caller 124 accesses the intensity contextualized images 114 on a patch-by-patch basis (or a tile-by-tile basis).
  • Each of the patches is a sub-grid (or sub-array) of pixelated units in the grid of pixelated units that forms the sequencing images.
  • the patches have dimensions q x r of the sub-grid of pixelated units, where q (width) and r (height) are any numbers ranging from 1 to 10,000 (e.g., 3 x 3, 5 x 5, 7 x 7, 10 x 10, 15 x 15, 25 x 25, 64 x 64, 78 x 78, 115 x 115). In some implementations, q and r are the same.
  • the neural network-based base caller 124 outputs a base call for a single target cluster for a particular sequencing cycle. In another implementation, it outputs a base call for each target cluster in a plurality of target clusters for the particular sequencing cycle. In yet another implementation, it outputs a base call for each target cluster in a plurality of target clusters for each sequencing cycle in a plurality of sequencing cycles, thereby producing a base call sequence for each target cluster.
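  • As an illustrative sketch (not part of the original disclosure), per-cluster, per-cycle outputs can be turned into base call sequences as shown below; the (clusters, cycles, 4) probability layout and the helper name to_base_call_sequences are assumptions for illustration.

      import numpy as np

      BASES = np.array(list("ACGT"))

      def to_base_call_sequences(probs):
          # probs: (num_clusters, num_cycles, 4) softmax outputs of the base caller.
          # The argmax over the last axis gives one called base per target cluster
          # per sequencing cycle; joining per cluster yields a base call sequence.
          calls = BASES[np.argmax(probs, axis=-1)]
          return ["".join(row) for row in calls]

      probs = np.random.rand(3, 5, 4)         # 3 target clusters, 5 sequencing cycles
      print(to_base_call_sequences(probs))    # e.g., three 5-base call sequences
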
  • the neural network-based base caller 124 is a multilayer perceptron (MLP). In another implementation, the neural network-based base caller 124 is a feedforward neural network. In yet another implementation, the neural network-based base caller 124 is a fully-connected neural network. In a further implementation, the neural network-based base caller 124 is a fully convolutional neural network. In yet a further implementation, the neural network-based base caller 124 is a semantic segmentation neural network. In yet another implementation, the neural network-based base caller 124 is a generative adversarial network (GAN).
  • the neural network-based base caller 124 is a convolutional neural network (CNN) with a plurality of convolution layers.
  • In another implementation, the neural network-based base caller 124 is a recurrent neural network (RNN), such as a long short-term memory network (LSTM), a bi-directional LSTM (Bi-LSTM), or a gated recurrent unit (GRU). In yet another implementation, it includes both a CNN and an RNN.
  • the neural network-based base caller 124 can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transpose convolutions, depthwise separable convolutions, pointwise convolutions, 1 x 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatial separable convolutions, and deconvolutions.
  • It can use one or more loss functions such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean-squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss. It can use any parallelism, efficiency, and compression schemes such as TFRecords, compressed encoding (e.g., PNG), sharding, parallel calls for map transformation, batching, prefetching, model parallelism, data parallelism, and synchronous/asynchronous stochastic gradient descent (SGD).
  • It can include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (like an LSTM or GRU), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions like rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid, and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
  • the neural network-based base caller 124 is trained using backpropagation-based gradient update techniques.
  • Example gradient descent techniques that can be used for training the neural network-based base caller 124 include stochastic gradient descent, batch gradient descent, and mini-batch gradient descent.
  • Some examples of gradient descent optimization algorithms that can be used to train the neural network-based base caller 124 are Momentum, Nesterov accelerated gradient, Adagrad, Adadelta, RMSprop, Adam, AdaMax, Nadam, and AMSGrad.
  • the intensity contextualization unit 112 contains an intensity extractor, discriminators, and approximators (e.g., convolution filters) whose kernel weights or coefficients can be learned (or trained) using backpropagation-based gradient update techniques.
  • the intensity contextualization unit 112 is trained “end-to-end” with the neural network 124, such that the error is calculated between the base call predictions of the neural network 124 and the ground truth base calls and the gradients determined from the error are used to update the weights of the neural network 124 and further update the weights of the intensity contextualization unit 112. This way, the intensity contextualization unit 112 learns to extract those intensity features and contexts from the images 102 that contribute to correct base call predictions by the neural network 124.
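  • As an illustrative sketch (not part of the original disclosure), the end-to-end training idea can be pictured with the following PyTorch snippet; the two small stand-in networks, their layer sizes, and the Adam/cross-entropy choices are assumptions rather than the disclosed configuration. The point being illustrated is a single backward pass whose gradients update both the base caller and the intensity contextualization unit.

      import torch
      import torch.nn as nn

      contextualizer = nn.Sequential(                        # stand-in intensity contextualization unit
          nn.Conv2d(2, 6, kernel_size=3, stride=3), nn.ReLU(),
          nn.AdaptiveAvgPool2d(1))                           # -> (N, 6, 1, 1)
      base_caller = nn.Sequential(                           # stand-in neural network-based base caller
          nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4))

      optimizer = torch.optim.Adam(
          list(contextualizer.parameters()) + list(base_caller.parameters()), lr=1e-3)
      loss_fn = nn.CrossEntropyLoss()

      full_image = torch.rand(1, 2, 115, 115)                # two image channels
      patch = full_image[:, :, 50:65, 50:65]                 # 15 x 15 patch around a target cluster
      ground_truth = torch.tensor([2])                       # e.g., index of base G

      context = contextualizer(full_image).expand(-1, -1, 15, 15)   # clone 1 x 1 features to patch size
      logits = base_caller(torch.cat([patch, context], dim=1))      # 8-channel intensity contextualized patch
      loss = loss_fn(logits, ground_truth)
      loss.backward()                                        # gradients flow into both networks
      optimizer.step()
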
  • Figure 4 shows one example of a full image 402 from which a patch 402a is accessed such that the patch 402a is centered at a target cluster 412 (in red) to be base called.
  • the size of the full image 402 is 115 x 115 pixels and the size of the patch 402a is 15 x 15 pixels.
  • Figure 5 depicts one implementation of the intensity contextualization unit 112 having a plurality of convolution pipelines.
  • Each of the convolution pipelines has a plurality of convolution filters.
  • Convolution filters in the plurality of convolution filters have varying filter sizes and varying filter strides.
  • Each of the convolution pipelines processes an image to generate a plurality of convolved representations of the image.
  • the input to the intensity contextualization unit 112 is a full image 502 of size 115 x 115 pixels and two image channels, i.e., blue and green image channels.
  • the intensity contextualization unit 112 has n convolution pipelines 502a, ..., 502n, where n can range from 1 to 100 (e.g., 4, 6, 10, 16, 18, 20).
  • a convolution pipeline has a series of convolution filters (e.g., 542).
  • convolution filters in the series of convolution filters of a particular convolution pipeline have different filter (or kernel) sizes.
  • convolution filters in the series of convolution filters of a particular convolution pipeline have same filter sizes.
  • the particular convolution pipeline can have three sets of filters, such that filters in a first set of filters are of size 3 x 3, filters in a second set of filters are of size 3 x 3, and filters in a third set of filters are of size 12 x 12.
  • the particular convolution pipeline can have three sets of filters, such that filters in a first set of filters are of size 3 x 3, filters in a second set of filters are of size 4 x 4, and filters in a third set of filters are of size 9 x 9.
  • the particular convolution pipeline can have four sets of filters, such that filters in a first set of filters are of size 3 x 3, filters in a second set of filters are of size 3 x 3, filters in a third set of filters are of size 4 x 4, and filters in a fourth set of filters are of size 9 x 9.
  • the particular convolution pipeline can have four sets of filters, such that filters in a first set of filters are of size 5 x 5, filters in a second set of filters are of size 3 x 3, filters in a third set of filters are of size 3 x 3, and filters in a fourth set of filters are of size 7 x 7.
  • the particular convolution pipeline can have four sets of filters, such that filters in a first set of filters are of size 5 x 5, filters in a second set of filters are of size 3 x 3, filters in a third set of filters are of size 3 x 3, and filters in a fourth set of filters are of size 7 x 7.
  • the particular convolution pipeline can have four sets of filters, such that filters in a first set of filters are of size 5 x 5, filters in a second set of filters are of size 4 x 4, filters in a third set of filters are of size 4 x 4, and filters in a fourth set of filters are of size 5 x 5.
  • the particular convolution pipeline can have four sets of filters, such that filters in a first set of filters are of size 5 x 5, filters in a second set of filters are of size 5 x 5, filters in a third set of filters are of size 5 x 5, and filters in a fourth set of filters are of size 3 x 3.
  • the particular convolution pipeline can have four sets of filters, such that filters in a first set of filters are of size 3 x 3, filters in a second set of filters are of size 3 x 3, filters in a third set of filters are of size 4 x 4, and filters in a fourth set of filters are of size 9 x 9.
  • convolution filters in the series of convolution filters of a particular convolution pipeline have different stride sizes.
  • the particular convolution pipeline can have three sets of filters, such that filters in a first set of filters use a stride size of 3, filters in a second set of filters use a stride size of 4, and filters in a third set of filters use a stride size of 1.
  • convolution filters in the series of convolution filters of a particular convolution pipeline have same stride sizes.
  • the image 502 is fed as input to each of the n convolution pipelines 502a, ... , 502n.
  • Each convolution pipeline processes the image 502, generates successive feature maps, and produces a final output (e.g., convolved representations 512a, 512n of size 1 x 1).
  • respective final outputs of the convolution pipelines are also different and therefore encode different intensity features or contexts determined from the intensity values in the image 502. This way, a plurality of intensity features and contexts is determined from the image 502 by using the plurality of convolution pipelines configured with varying convolution coefficients (or kernel weights).
  • Each of the final outputs is made up of one or more pixelated units (e.g., pixels, superpixels, subpixels).
  • the respective final outputs (e.g., 512a, 512n) of the n convolution pipelines 502a, ..., 502n are cloned and concatenated by a cloner 562 and a concatenator 572 so that their spatial dimensions match the spatial dimensions of the image 502, as discussed above.
  • the cloned and concatenated versions of the respective final outputs form respective context channels (e.g., 516a, 516n), which are arranged on a pixelated unit-by-pixelated unit basis to constitute the intensity context data 122 of size 115 x 115 x 6.
  • the intensity context data 122 is appended to the image 502 on the pixelated unit-by-pixelated unit basis to form intensity contextualized image 508 of size 115 x 115 x 8, of which the six channels are the context channels from the intensity context data 122 and the two channels are image channels from the image 502.
  • the data flow logic 104 provides the intensity contextualized image 508 as input to the neural network 124 for base calling, which accesses and base calls the intensity contextualized image 508 on the patch-by-patch basis 220.
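  • As an illustrative sketch (not part of the original disclosure), the pipeline structure of Figure 5 can be pictured with the following PyTorch module; it uses only two hypothetical pipelines (with kernel/stride choices inspired by the 3-3-12 and 3-4-9 combinations described herein) rather than the six of the example above, so the result has four channels instead of eight.

      import torch
      import torch.nn as nn

      class ContextPipelines(nn.Module):
          # Each pipeline reduces the two-channel 115 x 115 image to a 1 x 1 feature;
          # the features are cloned to full spatial size and appended as context channels.
          def __init__(self):
              super().__init__()
              self.pipelines = nn.ModuleList([
                  nn.Sequential(nn.Conv2d(2, 4, 3, stride=3), nn.ReLU(),
                                nn.Conv2d(4, 4, 3, stride=3), nn.ReLU(),
                                nn.Conv2d(4, 1, 12)),         # 115 -> 38 -> 12 -> 1
                  nn.Sequential(nn.Conv2d(2, 4, 3, stride=3), nn.ReLU(),
                                nn.Conv2d(4, 4, 4, stride=4), nn.ReLU(),
                                nn.Conv2d(4, 1, 9)),          # 115 -> 38 -> 9 -> 1
              ])

          def forward(self, image):                           # image: (N, 2, 115, 115)
              n, _, h, w = image.shape
              feats = [p(image).expand(n, 1, h, w) for p in self.pipelines]
              return torch.cat([image] + feats, dim=1)        # image channels + context channels

      out = ContextPipelines()(torch.rand(1, 2, 115, 115))
      assert out.shape == (1, 4, 115, 115)
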
  • Figure 6 illustrates one implementation of the neural network 124 processing an intensity contextualized patch 614 and generating the base calls 134.
  • a patch 602a of size 15 x 15 is accessed from a full image 602 of size 115 x 115.
  • the intensity context data 604 of size 15 x 15 is pixelwise appended to the patch 602a to form the intensity contextualized patch 614.
  • the neural network 124 comprises a plurality of convolution layers and filters 634 whose receptive fields 624 are smaller than the full image 602.
  • the receptive fields 624 of the convolution layers and filters 634 are confined to the spatial dimensions of the patch 602a and therefore do not take into account image portions of the full image 602 that are outside the patch 602a.
  • the intensity context data 604 provides intensity context from the distant regions of the image that are not covered by the patch 602a.
  • the intensity contextualized patch 614 is processed by the convolution layers and filters 634 of the neural network 124 to generate the base calls 134.
  • Figure 7 shows one implementation of the neural network 124 processing previous, current, and successive intensity contextualized images 764, 774, 784 for a plurality of sequencing cycles and generating the base calls 134.
  • Image 702 is generated at a previous sequencing cycle t-1 of a sequencing run.
  • Image 712 is generated at a current sequencing cycle t of the sequencing run.
  • Image 722 is generated at a successive sequencing cycle t+1 of the sequencing run.
  • Previous patch 702a is accessed from the previous image 702 and previous intensity context data 704 is determined from the intensity values in the previous image 702 and pixelwise appended to the previous patch 702a to form the previous intensity contextualized patch 764.
  • Current patch 712a is accessed from the current image 712 and current intensity context data 714 is determined from the intensity values in the current image 712 and pixelwise appended to the current patch 712a to form the current intensity contextualized patch 774.
  • Successive patch 722a is accessed from the successive image 722 and successive intensity context data 724 is determined from the intensity values in the successive image 722 and pixelwise appended to the successive patch 722a to form the successive intensity contextualized patch 784.
  • the neural network 124 uses a specialized architecture to segregate processing of data for different sequencing cycles. The motivation for using the specialized architecture is described first. As discussed above, the neural network 124 processes intensity contextualized images for a current sequencing cycle, one or more preceding sequencing cycles, and one or more successive sequencing cycles. Data for additional sequencing cycles provides sequence-specific context. The neural network-based base caller 124 learns the sequence-specific context during training and uses it when base calling. Furthermore, data for pre and post sequencing cycles provides a second-order contribution of pre-phasing and phasing signals to the current sequencing cycle.
  • the specialized architecture comprises spatial convolution layers that do not mix information between sequencing cycles and only mix information within a sequencing cycle.
  • Spatial convolution layers use so-called “segregated convolutions” that operationalize the segregation by independently processing data for each of a plurality of sequencing cycles through a “dedicated, non-shared” sequence of convolutions.
  • the segregated convolutions convolve over data and resulting feature maps of only a given sequencing cycle, i.e., intra-cycle, without convolving over data and resulting feature maps of any other sequencing cycle.
  • the input data comprises (i) current intensity contextualized patch for a current (time t) sequencing cycle to be base called, (ii) previous intensity contextualized patch for a previous (time t-1) sequencing cycle, and (iii) next intensity contextualized patch for a next (time t+1) sequencing cycle.
  • the specialized architecture then initiates three separate convolution pipelines, namely, a current convolution pipeline, a previous convolution pipeline, and a next convolution pipeline.
  • the current data processing pipeline receives as input the current intensity contextualized patch for the current (time t) sequencing cycle and independently processes it through a plurality of spatial convolution layers 784 to produce a so-called “current spatially convolved representation” as the output of a final spatial convolution layer.
  • the previous convolution pipeline receives as input the previous intensity contextualized patch for the previous (time t-1) sequencing cycle and independently processes it through the plurality of spatial convolution layers 784 to produce a so-called “previous spatially convolved representation” as the output of the final spatial convolution layer.
  • the next convolution pipeline receives as input the next intensity contextualized patch for the next (time t+1) sequencing cycle and independently processes it through the plurality of spatial convolution layers 784 to produce a so-called “next spatially convolved representation” as the output of the final spatial convolution layer.
  • the current, previous, and next convolution pipelines are executed in parallel.
  • the spatial convolution layers are part of a spatial convolutional network (or subnetwork) within the specialized architecture.
  • the neural network-based base caller 124 further comprises temporal convolution layers 794 that mix information between sequencing cycles, i.e., inter-cycles.
  • the temporal convolution layers 794 receive their inputs from the spatial convolutional network and operate on the spatially convolved representations produced by the final spatial convolution layer for the respective data processing pipelines.
  • Temporal convolution layers 794 use so-called “combinatory convolutions” that groupwise convolve over input channels in successive inputs on a sliding window basis.
  • the successive inputs are successive outputs produced by a previous spatial convolution layer or a previous temporal convolution layer.
  • the temporal convolution layers 794 are part of a temporal convolutional network (or subnetwork) within the specialized architecture.
  • the temporal convolutional network receives its inputs from the spatial convolutional network.
  • a first temporal convolution layer of the temporal convolutional network groupwise combines the spatially convolved representations between the sequencing cycles.
  • subsequent temporal convolution layers of the temporal convolutional network combine successive outputs of previous temporal convolution layers.
  • the output of the final temporal convolution layer is fed to an output layer that produces an output. The output is used to base call one or more clusters at one or more sequencing cycles.
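  • As an illustrative sketch (not part of the original disclosure), the segregated/combinatory idea can be pictured with the following PyTorch module; the layer sizes, the three-cycle window, and the use of 1D convolutions along the cycle axis as a stand-in for the combinatory temporal convolutions are assumptions for illustration.

      import torch
      import torch.nn as nn

      class SpecializedBaseCaller(nn.Module):
          def __init__(self, in_channels=8, num_cycles=3):
              super().__init__()
              # One dedicated, non-shared spatial stack per sequencing cycle (no cross-cycle mixing).
              self.spatial = nn.ModuleList([
                  nn.Sequential(nn.Conv2d(in_channels, 16, 3, stride=3), nn.ReLU(),
                                nn.Conv2d(16, 16, 5), nn.ReLU())      # 15 -> 5 -> 1
                  for _ in range(num_cycles)])
              # Temporal layers mix information between sequencing cycles.
              self.temporal = nn.Sequential(nn.Conv1d(16, 16, kernel_size=2), nn.ReLU(),
                                            nn.Conv1d(16, 16, kernel_size=2), nn.ReLU())
              self.output = nn.Linear(16, 4)                          # logits for A, C, G, T

          def forward(self, cycles):                                  # list of (N, 8, 15, 15) tensors
              per_cycle = [net(c).flatten(1) for net, c in zip(self.spatial, cycles)]
              seq = torch.stack(per_cycle, dim=2)                     # (N, 16, num_cycles)
              mixed = self.temporal(seq)                              # (N, 16, 1) for three cycles
              return torch.softmax(self.output(mixed.squeeze(2)), dim=1)

      cycles = [torch.rand(1, 8, 15, 15) for _ in range(3)]           # t-1, t, t+1 inputs
      assert SpecializedBaseCaller()(cycles).shape == (1, 4)
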
  • FIGS. 8A and 8B compare the base calling accuracy of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit (referred to herein as “DeepRTA-V2”) against a neural network-based base caller without the disclosed intensity contextualization unit (referred to herein as “DeepRTA”). Additional details about DeepRTA can be found in commonly owned U.S.
  • Figures 8A and 8B also compare the base calling accuracy of DeepRTA-V2 against a non-neural network-based base caller without the disclosed intensity contextualization unit (referred to herein as “RTA”). Additional details about RTA can be found in commonly owned U.S. Patent Application 13/006,206.
  • the model titled “DeepRTA- V2+lanczos” is the disclosed neural network-based base caller with the disclosed intensity contextualization unit combined with an additional non-linearity logic referred to herein as “lanczos.”
  • the y-axis has the base calling error rate (“Error%”).
  • the Error% is calculated over a multitude of base calls made for a multitude of clusters (e.g., hundreds of thousands or millions of base calls made for hundreds of thousands or millions of clusters).
  • the x-axis has the progression of sequencing cycles 20-140 of a sequencing run over which the multitude of base calls were made for reads 1 (Figure 8A) and 2 (Figure 8B).
  • Figure 9 shows the base calling error rates observed for various combinations (configurations) of filter sizes (or kernel sizes), strides, and filter bank sizes (K) of convolution filters of the disclosed neural network-based base caller 124.
  • R1C20 denotes sequencing cycle twenty during sequencing of read 1.
  • R1C20 denotes a multitude of base calls made during the sequencing cycle twenty for a multitude of clusters (e.g., hundreds of thousands or millions of base calls made for hundreds of thousands or millions of clusters).
  • R1C20 is used as a representative sequencing cycle for early sequencing cycles in a sequencing run.
  • R1C80 denotes sequencing cycle eighty during the sequencing of read 1.
  • R1C80 denotes a multitude of base calls made during the sequencing cycle eighty for the multitude of clusters (e.g., hundreds of thousands or millions of base calls made for hundreds of thousands or millions of clusters).
  • R1C80 is used as a representative sequencing cycle for middle sequencing cycles in the sequencing run.
  • R1C120 denotes sequencing cycle one hundred and twenty during the sequencing of read 1.
  • R1C120 denotes a multitude of base calls made during the sequencing cycle one hundred and twenty for the multitude of clusters (e.g., hundreds of thousands or millions of base calls made for hundreds of thousands or millions of clusters).
  • R1C120 is used as a representative sequencing cycle for later sequencing cycles in the sequencing run.
  • DeepRTA denotes a particular combination of filter sizes (or kernel sizes), strides, and filter bank sizes (K) of convolution filters of DeepRTA.
  • the “3-3-12” combination denotes three successive spatial convolution layers of the disclosed neural network-based base caller 124.
  • the three successive spatial convolution layers are arranged in a sequence, such that a patch is first processed by a first spatial convolution layer to produce a first intermediate output, then the first intermediate output is processed by a second convolution layer to produce a second intermediate output, and then the second intermediate output is processed by a third spatial convolution layer to produce a third intermediate output.
  • the first spatial convolution layer has convolution filters/kernels of size 3x3.
  • the second spatial convolution layer has convolution filters/kernels of size 3x3.
  • the third spatial convolution layer has convolution filters/kernels of size 12x12.
  • the first, second, and third spatial convolution layers can use different striding so that the third intermediate output has a target dimensionality (e.g., 1x1 or 2x2). In other implementations, the first, second, and third spatial convolution layers can use a same striding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2). In some implementations, the first, second, and third spatial convolution layers can use different padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2). In other implementations, the first, second, and third spatial convolution layers can use a same padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, and third spatial convolution layers can use different filter bank sizes so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2). In other implementations, the first, second, and third spatial convolution layers can use a same filter bank size (e.g., six or ten) so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
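  • As an illustrative sketch (not part of the original disclosure), the output-size arithmetic behind the "3-3-12" combination, and the "3-4-9" combination discussed next, can be checked with the standard convolution formula; the 115 x 115 input size and the stride choices shown are assumptions picked so that the final output is 1 x 1.

      def conv_out_size(size, kernel, stride=1, padding=0):
          # Output size of a convolution along one spatial dimension.
          return (size + 2 * padding - kernel) // stride + 1

      def stack_out_size(size, kernels, strides):
          # Chain the formula over successive spatial convolution layers.
          for k, s in zip(kernels, strides):
              size = conv_out_size(size, k, s)
          return size

      print(stack_out_size(115, kernels=[3, 3, 12], strides=[3, 3, 1]))   # "3-3-12" -> 1
      print(stack_out_size(115, kernels=[3, 4, 9],  strides=[3, 4, 1]))   # "3-4-9"  -> 1
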
  • the “3-4-9” combination denotes three successive spatial convolution layers of the disclosed neural network-based base caller 124.
  • the three successive spatial convolution layers are arranged in a sequence, such that a patch is first processed by a first spatial convolution layer to produce a first intermediate output, then the first intermediate output is processed by a second convolution layer to produce a second intermediate output, and then the second intermediate output is processed by a third spatial convolution layer to produce a third intermediate output.
  • the first spatial convolution layer has convolution filters/kernels of size 3x3.
  • the second spatial convolution layer has convolution filters/kernels of size 4x4.
  • the third spatial convolution layer has convolution filters/kemels of size 9x9.
  • the first, second, and third spatial convolution layers can use different striding so that the third intermediate output has a target dimensionality (e.g, lxl or 2x2). In other implementations, the first, second, and third spatial convolution layers can use a same striding so that the third intermediate output has the target dimensionality (e.g, lxl or 2x2). In some implementations, the first, second, and third spatial convolution layers can use different padding so that the third intermediate output has the target dimensionality (e.g, lxl or 2x2). In other implementations, the first, second, and third spatial convolution layers can use a same padding so that the third intermediate output has the target dimensionality (e.g, lxl or 2x2).
  • the first, second, and third spatial convolution layers can use different filter bank sizes so that the third intermediate output has the target dimensionality (e.g, lxl or 2x2). In other implementations, the first, second, and third spatial convolution layers can use a same filter bank size (e.g, six or ten) so that the third intermediate output has the target dimensionality (e.g, lxl or 2x2). [00114] In Figure 9, the “3 -3 -4-9” combination denotes four successive spatial convolution layers of the disclosed neural network-based base caller 124.
  • the four successive spatial convolution layers are arranged in a sequence, such that a patch is first processed by a first spatial convolution layer to produce a first intermediate output, then the first intermediate output is processed by a second convolution layer to produce a second intermediate output, then the second intermediate output is processed by a third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by a fourth spatial convolution layer to produce a fourth intermediate output.
  • the first spatial convolution layer has convolution filters/kernels of size 3x3.
  • the second spatial convolution layer has convolution filters/kernels of size 3x3.
  • the third spatial convolution layer has convolution filters/kernels of size 4x4.
  • the fourth spatial convolution layer has convolution filters/kernels of size 9x9.
  • the first, second, third, and fourth spatial convolution layers can use different striding so that the third intermediate output has a target dimensionality (e.g., 1x1 or 2x2). In other implementations, the first, second, third, and fourth spatial convolution layers can use a same striding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2). In some implementations, the first, second, third, and fourth spatial convolution layers can use different padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different filter bank sizes so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same filter bank size (e.g., six or ten) so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the “5-3-3-7” combination denotes four successive spatial convolution layers of the disclosed neural network-based base caller 124.
  • the four successive spatial convolution layers are arranged in a sequence, such that a patch is first processed by a first spatial convolution layer to produce a first intermediate output, then the first intermediate output is processed by a second convolution layer to produce a second intermediate output, then the second intermediate output is processed by a third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by a fourth spatial convolution layer to produce a fourth intermediate output.
  • the first spatial convolution layer has convolution filters/kernels of size 5x5.
  • the second spatial convolution layer has convolution filters/kernels of size 3x3.
  • the third spatial convolution layer has convolution filters/kernels of size 3x3.
  • the fourth spatial convolution layer has convolution filters/kernels of size 7x7.
  • the first, second, third, and fourth spatial convolution layers can use different striding so that the third intermediate output has a target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same striding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different filter bank sizes so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same filter bank size (e.g., six or ten) so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the “5-4-4-5” combination denotes four successive spatial convolution layers of the disclosed neural network-based base caller 124.
  • the four successive spatial convolution layers are arranged in a sequence, such that a patch is first processed by a first spatial convolution layer to produce a first intermediate output, then the first intermediate output is processed by a second convolution layer to produce a second intermediate output, then the second intermediate output is processed by a third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by a fourth spatial convolution layer to produce a fourth intermediate output.
  • the first spatial convolution layer has convolution filters/kernels of size 5x5.
  • the second spatial convolution layer has convolution filters/kernels of size 4x4.
  • the third spatial convolution layer has convolution filters/kernels of size 4x4.
  • the fourth spatial convolution layer has convolution filters/kernels of size 5x5.
  • the first, second, third, and fourth spatial convolution layers can use different striding so that the third intermediate output has a target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same striding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different filter bank sizes so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same filter bank size (e.g., six or ten) so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the “5-5-5-3” combination denotes four successive spatial convolution layers of the disclosed neural network-based base caller 124.
  • the four successive spatial convolution layers are arranged in a sequence, such that a patch is first processed by a first spatial convolution layer to produce a first intermediate output, then the first intermediate output is processed by a second convolution layer to produce a second intermediate output, then the second intermediate output is processed by a third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by a fourth spatial convolution layer to produce a fourth intermediate output.
  • the first spatial convolution layer has convolution filters/kernels of size 5x5.
  • the second spatial convolution layer has convolution filters/kernels of size 5x5.
  • the third spatial convolution layer has convolution filters/kernels of size 5x5.
  • the fourth spatial convolution layer has convolution filters/kernels of size 3x3.
  • the first, second, third, and fourth spatial convolution layers can use different striding so that the third intermediate output has a target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same striding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different filter bank sizes so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same filter bank size (e.g., six or ten) so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the “3-3-4-9_K0:3-6-8-10” combination denotes four successive spatial convolution layers of the disclosed neural network-based base caller 124.
  • the four successive spatial convolution layers are arranged in a sequence, such that a patch is first processed by a first spatial convolution layer to produce a first intermediate output, then the first intermediate output is processed by a second convolution layer to produce a second intermediate output, then the second intermediate output is processed by a third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by a fourth spatial convolution layer to produce a fourth intermediate output.
  • the first spatial convolution layer has convolution filters/kernels of size 3x3.
  • the second spatial convolution layer has convolution filters/kernels of size 3x3.
  • the third spatial convolution layer has convolution filters/kernels of size 4x4.
  • the fourth spatial convolution layer has convolution filters/kernels of size 9x9.
  • the first, second, third, and fourth spatial convolution layers can use different striding so that the third intermediate output has a target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same striding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use different padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2).
  • the first, second, third, and fourth spatial convolution layers can use a same padding so that the third intermediate output has the target dimensionality (e.g., 1x1 or 2x2). A sketch of the output-size arithmetic shared by these combinations is shown below.
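  • As a minimal sketch of the output-size arithmetic shared by the combinations above, the following Python helper computes the spatial size produced by a stack of square convolutions; it assumes "valid" convolutions (no padding), unit stride, and a hypothetical 16x16 patch, whereas the disclosed base caller may instead use striding, padding, or patch sizes chosen to reach the 1x1 or 2x2 target dimensionality.

        def output_size(input_size, kernel_sizes, stride=1, padding=0):
            """Spatial size after applying square convolutions of the given kernel sizes in sequence."""
            size = input_size
            for k in kernel_sizes:
                size = (size + 2 * padding - k) // stride + 1
            return size

        for combo in ([3, 3, 12], [3, 4, 9], [3, 3, 4, 9], [5, 3, 3, 7], [5, 4, 4, 5], [5, 5, 5, 3]):
            print(combo, output_size(16, combo))
        # Under these assumptions, [3, 3, 12] and [3, 3, 4, 9] reach 1x1 and the 5-x-x-x
        # combinations reach 2x2; other combinations would need different striding,
        # padding, or patch sizes to reach the 1x1 or 2x2 target dimensionality.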
  • the values in the table are the respective base calling error rates for the respective combinations (configurations).
  • the base calling error rates of many of the combinations of the disclosed neural network-based base caller 124 are lower than those of DeepRTA.
  • base calling error rates decrease when the filter/kernel size progressively increases between successive spatial convolution layers.
  • Figure 10 compares the base calling error rate of DeepRTA against the base calling error rates of different filter bank size configurations (K0s) of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit (DeepRTA-K0-04; DeepRTA-K0-06; DeepRTA-K0-10; DeepRTA-K0-16; DeepRTA-K0-18; and DeepRTA-K0-20).
  • the y-axis has the base calling error rate (“Error%”).
  • the Error% is calculated over a multitude of base calls made for a multitude of clusters (e.g., hundreds or millions of base calls made for hundreds or millions of clusters).
  • the x-axis has the progression of sequencing cycles 20-80 of a sequencing run over which the multitude of base calls were made for read 1.
  • the Error% of DeepRTA is depicted by a fitted line with “o”; the Error% of DeepRTA-K0-04 is depicted by a fitted line with “D”; the Error% of DeepRTA-K0-06 is depicted by a fitted line with “dl”; the Error% of DeepRTA-K0-10 is depicted by a fitted line with “n”; the Error% of DeepRTA-K0-16 is depicted by a fitted line with “V”; the Error% of DeepRTA-K0-18 is depicted by a fitted line with “O”; and the Error% of DeepRTA-K0-20 is depicted by a fitted line with “A”.
  • the different filter bank size configurations of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit (i.e., DeepRTA-K0-04, DeepRTA-K0-06, DeepRTA-K0-10, DeepRTA-K0-16, DeepRTA-K0-18, and DeepRTA-K0-20) have lower base calling error rates than DeepRTA. Furthermore, this is true consistently for the progression of the sequencing cycles 20-80 for read 1, as indicated by the fitted lines with “D”, “EH”, “p”, “V”, “O”, and “A” being consistently below the fitted line with “o” in Figure 10.
  • Figure 11 shows base calling error rates when the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit extracts intensity context data from an original input image of size 115x115 (fitted line with “o”) versus an original input image of size 160x160 (fitted line with “n”). As demonstrated by Figure 11, the base calling error rate is lower when intensity context data is gathered from a larger original input image of size 160x160.
  • Figure 12 shows base calling accuracy (1 - base calling error rate) of the different configurations of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit, i.e., DeepRTA-K0-06, DeepRTA-349-K0-10-160p, DeepRTA-K0-16, DeepRTA-K0-16-Lanczos, DeepRTA-K0-18, and DeepRTA-K0-20, against DeepRTA over base calling homopolymers (e.g., GGGGG) and flanked-homopolymers (e.g., GGTGG).
  • the neural network-based base caller 124 makes a base call for a current sequencing cycle by processing a window of sequencing images for a plurality of sequencing cycles, including the current sequencing cycle contextualized by right and left sequencing cycles. Since the base “G” is indicated by a dark or off state in the sequencing images, repeat patterns of the base “G” can lead to erroneous base calls, particularly when the current sequencing cycle is for a non-G base (e.g., base “T”), but right and left flanked by Gs.
  • the different configurations of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit have a high base calling accuracy for such homopolymers (e.g., GGGGG) and flanked-homopolymers (e.g., GGTGG).
  • the disclosed intensity contextualization unit extracts intensity context beyond a given patch to inform the neural network-based base caller 124 that even though the flanking sequencing cycles represent the base “G”, the center sequencing cycle is a non-G base.
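  • For illustration only, the following minimal Python sketch shows one way intensity context could be gathered from the image beyond a given patch and appended to the patch as extra channels; the hypothetical helper below is not the disclosed intensity contextualization unit's exact computation.

        import numpy as np

        def with_intensity_context(image, top, left, patch_size=16):
            """Crop a patch and append whole-image intensity statistics as extra channels."""
            patch = image[top:top + patch_size, left:left + patch_size, :]
            # Per-channel statistics computed over the whole image, i.e., intensity
            # context that lies beyond the boundaries of the patch itself.
            stats = np.concatenate([image.mean(axis=(0, 1)), image.std(axis=(0, 1))])
            context = np.broadcast_to(stats, (patch_size, patch_size, stats.size))
            return np.concatenate([patch, context], axis=-1)

        contextualized = with_intensity_context(np.random.rand(160, 160, 2), top=40, left=40)
        print(contextualized.shape)  # (16, 16, 6): 2 image channels plus 4 context channels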
  • Figure 13 compares base calling error rates of the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit (“DeepRTA-V2:349”) against DeepRTA, RTA, the disclosed neural network-based base caller configured with the disclosed intensity contextualization unit and trained on and performing inference on normalized sequencing images (“DeepRTA-V2:349-norm”), and DeepRTA trained on and performing inference on normalized sequencing images (“DeepRTA-norm”).
  • the normalized sequencing images are normalized to have a certain intensity distribution; for example, they have intensity values in a lower percentile and a higher percentile such that five percent of the normalized intensity values are below zero, another five percent are greater than one, and the remaining ninety percent are between zero and one. Additional details and examples of normalization can be found in commonly owned U.S. Patent Application 62/979,384. A minimal sketch of one such normalization is shown below.
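  • The following Python sketch maps the fifth percentile to zero and the ninety-fifth percentile to one, which yields the distribution described above; it is an illustration only, and the normalization actually used is detailed in the referenced application.

        import numpy as np

        def normalize_image(image, low_pct=5.0, high_pct=95.0):
            """Map the low percentile to 0 and the high percentile to 1, so roughly five
            percent of values fall below zero, five percent exceed one, and the remaining
            ninety percent lie between zero and one."""
            lo, hi = np.percentile(image, [low_pct, high_pct])
            return (image - lo) / (hi - lo)

        normalized = normalize_image(np.random.rand(160, 160))
        print((normalized < 0).mean(), (normalized > 1).mean())  # each roughly 0.05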
  • DeepRTA-V2:349 (fitted line with “n”) and DeepRTA-V2:349-norm (fitted line with “D”) outperform DeepRTA-norm (fitted line with “V”), DeepRTA (fitted line with “o”), and RTA (fitted line with “A”).
  • Figures 14A and 14B depict one implementation of a sequencing system 1400A.
  • the sequencing system 1400A comprises a configurable processor 1446.
  • the configurable processor 1446 implements the base calling techniques disclosed herein.
  • the sequencing system is also referred to as a “sequencer.”
  • the sequencing system 1400A can operate to obtain any information or data that relates to at least one of a biological or chemical substance.
  • the sequencing system 1400A is a workstation that may be similar to a bench-top device or desktop computer.
  • a majority (or all) of the systems and components for conducting the desired reactions can be within a common housing 1402.
  • the sequencing system 1400A is a nucleic acid sequencing system configured for various applications, including but not limited to de novo sequencing, resequencing of whole genomes or target genomic regions, and metagenomics.
  • the sequencer may also be used for DNA or RNA analysis.
  • the sequencing system 1400A may also be configured to generate reaction sites in a biosensor.
  • the sequencing system 1400A may be configured to receive a sample and generate surface-attached clusters of clonally amplified nucleic acids derived from the sample. Each cluster may constitute or be part of a reaction site in the biosensor.
  • the exemplary sequencing system 1400A may include a system receptacle or interface 1410 that is configured to interact with a biosensor 1412 to perform desired reactions within the biosensor 1412.
  • the biosensor 1412 is loaded into the system receptacle 1410.
  • a cartridge that includes the biosensor 1412 may be inserted into the system receptacle 1410, and in some cases the cartridge can be removed temporarily or permanently.
  • the cartridge may include, among other things, fluidic control and fluidic storage components.
  • the sequencing system 1400A is configured to perform a large number of parallel reactions within the biosensor 1412.
  • the biosensor 1412 includes one or more reaction sites where desired reactions can occur.
  • the reaction sites may be, for example, immobilized to a solid surface of the biosensor or immobilized to beads (or other movable substrates) that are located within corresponding reaction chambers of the biosensor.
  • the reaction sites can include, for example, clusters of clonally amplified nucleic acids.
  • the biosensor 1412 may include a solid-state imaging device (e.g., CCD or CMOS imager) and a flow cell mounted thereto.
  • the flow cell may include one or more flow channels that receive a solution from the sequencing system 1400A and direct the solution toward the reaction sites.
  • the biosensor 1412 can be configured to engage a thermal element for transferring thermal energy into or out of the flow channel.
  • the sequencing system 1400A may include various components, assemblies, and systems (or sub-systems) that interact with each other to perform a predetermined method or assay protocol for biological or chemical analysis.
  • the sequencing system 1400A includes a system controller 1406 that may communicate with the various components, assemblies, and sub-systems of the sequencing system 1400A and also the biosensor 1412.
  • the sequencing system 1400A may also include a fluidic control system 1408 to control the flow of fluid throughout a fluid network of the sequencing system 1400A and the biosensor 1412; a fluid storage system 1414 that is configured to hold all fluids (e.g., gas or liquids) that may be used by the bioassay system; a temperature control system 1404 that may regulate the temperature of the fluid in the fluid network, the fluid storage system 1414, and/or the biosensor 1412; and an illumination system 1416 that is configured to illuminate the biosensor 1412.
  • the cartridge may also include fluidic control and fluidic storage components.
  • the sequencing system 1400A may include a user interface 1418 that interacts with the user.
  • the user interface 1418 may include a display 1420 to display or request information from a user and a user input device 1422 to receive user inputs.
  • the display 1420 and the user input device 1422 are the same device.
  • the user interface 1418 may include a touch-sensitive display configured to detect the presence of an individual’s touch and also identify a location of the touch on the display.
  • other user input devices 1422 may be used, such as a mouse, touchpad, keyboard, keypad, handheld scanner, voice-recognition system, motion-recognition system, and the like.
  • the sequencing system 1400A may communicate with various components, including the biosensor 1412 (e.g, in the form of a cartridge), to perform the desired reactions.
  • the sequencing system 1400A may also be configured to analyze data obtained from the biosensor to provide a user with desired information.
  • the system controller 1406 may include any processor-based or microprocessor-based system, including systems using microcontrollers, reduced instruction set computers (RISC), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), coarse-grained reconfigurable architectures (CGRAs), logic circuits, and any other circuit or processor capable of executing functions described herein.
  • the system controller 1406 executes a set of instructions that are stored in one or more storage elements, memories, or modules in order to at least one of obtain and analyze detection data.
  • Detection data can include a plurality of sequences of pixel signals, such that a sequence of pixel signals from each of the millions of sensors (or pixels) can be detected over many base calling cycles.
  • Storage elements may be in the form of information sources or physical memory elements within the sequencing system 1400A.
  • the set of instructions may include various commands that instruct the sequencing system 1400A or biosensor 1412 to perform specific operations such as the methods and processes of the various implementations described herein.
  • the set of instructions may be in the form of a software program, which may form part of a tangible, non-transitory computer readable medium or media.
  • the terms “software” and “firmware” are interchangeable and include any computer program stored in memory for execution by a computer, including RAM memory, ROM memory, EPROM memory, EEPROM memory, and non-volatile RAM (NVRAM) memory.
  • the software may be in various forms such as system software or application software. Further, the software may be in the form of a collection of separate programs, or a program module within a larger program or a portion of a program module. The software also may include modular programming in the form of object-oriented programming. After obtaining the detection data, the detection data may be automatically processed by the sequencing system 1400A, processed in response to user inputs, or processed in response to a request made by another processing machine (e.g., a remote request through a communication link).
  • the system controller 1406 includes an analysis module 1444. In other implementations, the system controller 1406 does not include the analysis module 1444 and instead has access to the analysis module 1444 (e.g., the analysis module 1444 may be separately hosted on the cloud).
  • the system controller 1406 may be connected to the biosensor 1412 and the other components of the sequencing system 1400A via communication links.
  • the system controller 1406 may also be communicatively connected to off-site systems or servers.
  • the communication links may be hardwired, corded, or wireless.
  • the system controller 1406 may receive user inputs or commands, from the user interface 1418 and the user input device 1422.
  • the fluidic control system 1408 includes a fluid network and is configured to direct and regulate the flow of one or more fluids through the fluid network.
  • the fluid network may be in fluid communication with the biosensor 1412 and the fluid storage system 1414.
  • select fluids may be drawn from the fluid storage system 1414 and directed to the biosensor 1412 in a controlled manner, or the fluids may be drawn from the biosensor 1412 and directed toward, for example, a waste reservoir in the fluid storage system 1414.
  • the fluidic control system 1408 may include flow sensors that detect a flow rate or pressure of the fluids within the fluid network. The sensors may communicate with the system controller 1406.
  • the temperature control system 1404 is configured to regulate the temperature of fluids at different regions of the fluid network, the fluid storage system 1414, and/or the biosensor 1412.
  • the temperature control system 1404 may include a thermocycler that interfaces with the biosensor 1412 and controls the temperature of the fluid that flows along the reaction sites in the biosensor 1412.
  • the temperature control system 1404 may also regulate the temperature of solid elements or components of the sequencing system 1400A or the biosensor 1412.
  • the temperature control system 1404 may include sensors to detect the temperature of the fluid or other components. The sensors may communicate with the system controller 1406.
  • the fluid storage system 1414 is in fluid communication with the biosensor 1412 and may store various reaction components or reactants that are used to conduct the desired reactions therein.
  • the fluid storage system 1414 may also store fluids for washing or cleaning the fluid network and biosensor 1412 and for diluting the reactants.
  • the fluid storage system 1414 may include various reservoirs to store samples, reagents, enzymes, other biomolecules, buffer solutions, aqueous, and non-polar solutions, and the like.
  • the fluid storage system 1414 may also include waste reservoirs for receiving waste products from the biosensor 1412.
  • the cartridge may include one or more of a fluid storage system, fluidic control system or temperature control system.
  • a cartridge can have various reservoirs to store samples, reagents, enzymes, other biomolecules, buffer solutions, aqueous, and non-polar solutions, waste, and the like.
  • a fluid storage system, fluidic control system or temperature control system can be removably engaged with a bioassay system via a cartridge or other biosensor.
  • the illumination system 1416 may include a light source (e.g., one or more LEDs) and a plurality of optical components to illuminate the biosensor. Examples of light sources may include lasers, arc lamps, LEDs, or laser diodes.
  • the optical components may be, for example, reflectors, dichroics, beam splitters, collimators, lenses, filters, wedges, prisms, mirrors, detectors, and the like.
  • the illumination system 1416 may be configured to direct an excitation light to reaction sites.
  • fluorophores may be excited by green wavelengths of light; as such, the wavelength of the excitation light may be approximately 532 nm.
  • the illumination system 1416 is configured to produce illumination that is parallel to a surface normal of a surface of the biosensor 1412.
  • the illumination system 1416 is configured to produce illumination that is off-angle relative to the surface normal of the surface of the biosensor 1412.
  • the illumination system 1416 is configured to produce illumination that has plural angles, including some parallel illumination and some off-angle illumination.
  • the system receptacle or interface 1410 is configured to engage the biosensor 1412 in at least one of a mechanical, electrical, and fluidic manner.
  • the system receptacle 1410 may hold the biosensor 1412 in a desired orientation to facilitate the flow of fluid through the biosensor 1412.
  • the system receptacle 1410 may also include electrical contacts that are configured to engage the biosensor 1412 so that the sequencing system 1400A may communicate with the biosensor 1412 and/or provide power to the biosensor 1412.
  • the system receptacle 1410 may include fluidic ports (e.g., nozzles) that are configured to engage the biosensor 1412.
  • the biosensor 1412 is removably coupled to the system receptacle 1410 in a mechanical manner, in an electrical manner, and also in a fluidic manner.
  • the sequencing system 1400A may communicate remotely with other systems or networks or with other bioassay systems 1400A. Detection data obtained by the bioassay system(s) 1400A may be stored in a remote database.
  • Figure 14B is a block diagram of a system controller 1406 that can be used in the system of Figure 14A.
  • the system controller 1406 includes one or more processors or modules that can communicate with one another.
  • Each of the processors or modules may include an algorithm (e.g., instructions stored on a tangible and/or non-transitory computer readable storage medium) or sub-algorithms to perform particular processes.
  • the system controller 1406 is illustrated conceptually as a collection of modules, but may be implemented utilizing any combination of dedicated hardware boards, DSPs, processors, etc. Alternatively, the system controller 1406 may be implemented utilizing an off-the-shelf PC with a single processor or multiple processors, with the functional operations distributed between the processors.
  • a communication port 1450 may transmit information (e.g., commands) to or receive information (e.g., data) from the biosensor 1412 (Figure 14A) and/or the sub-systems 1408, 1414, 1404 (Figure 14A).
  • the communication port 1450 may output a plurality of sequences of pixel signals.
  • a communication link 1434 may receive user input from the user interface 1418 (Figure 14A) and transmit data or information to the user interface 1418.
  • Data from the biosensor 1412 or sub-systems 1408, 1414, 1404 may be processed by the system controller 1406 in real-time during a bioassay session. Additionally or alternatively, data may be stored temporarily in a system memory during a bioassay session and processed in slower than real-time or off-line operation.
  • the system controller 1406 may include a plurality of modules 1426-1448 that communicate with a main control module 1424, along with a central processing unit (CPU) 1452.
  • the main control module 1424 may communicate with the user interface 1418 (Figure 14A).
  • although the modules 1426-1448 are shown as communicating directly with the main control module 1424, the modules 1426-1448 may also communicate directly with each other, the user interface 1418, and the biosensor 1412. Also, the modules 1426-1448 may communicate with the main control module 1424 through the other modules.
  • the plurality of modules 1426-1448 include system modules 1428-1432, 1426 that communicate with the sub-systems 1408, 1414, 1404, and 1416, respectively.
  • the fluidic control module 1428 may communicate with the fluidic control system 1408 to control the valves and flow sensors of the fluid network for controlling the flow of one or more fluids through the fluid network.
  • the fluid storage module 1430 may notify the user when fluids are low or when the waste reservoir is at or near capacity.
  • the fluid storage module 1430 may also communicate with the temperature control module 1432 so that the fluids may be stored at a desired temperature.
  • the illumination module 1426 may communicate with the illumination system 1416 to illuminate the reaction sites at designated times during a protocol, such as after the desired reactions (e.g., binding events) have occurred. In some implementations, the illumination module 1426 may communicate with the illumination system 1416 to illuminate the reaction sites at designated angles.
  • the plurality of modules 1426-1448 may also include a device module 1436 that communicates with the biosensor 1412 and an identification module 1438 that determines identification information relating to the biosensor 1412.
  • the device module 1436 may, for example, communicate with the system receptacle 1410 to confirm that the biosensor has established an electrical and fluidic connection with the sequencing system 1400A.
  • the identification module 1438 may receive signals that identify the biosensor 1412.
  • the identification module 1438 may use the identity of the biosensor 1412 to provide other information to the user. For example, the identification module 1438 may determine and then display a lot number, a date of manufacture, or a protocol that is recommended to be run with the biosensor 1412.
  • the plurality of modules 1426-1448 also includes an analysis module 1444 (also called signal processing module or signal processor) that receives and analyzes the signal data (e.g., image data) from the biosensor 1412.
  • Analysis module 1444 includes memory (e.g., RAM or Flash) to store detection/image data.
  • Detection data can include a plurality of sequences of pixel signals, such that a sequence of pixel signals from each of the millions of sensors (or pixels) can be detected over many base calling cycles.
  • the signal data may be stored for subsequent analysis or may be transmitted to the user interface 1418 to display desired information to the user.
  • the signal data may be processed by the solid-state imager (e.g., CMOS image sensor) before the analysis module 1444 receives the signal data.
  • the analysis module 1444 is configured to obtain image data from the light detectors at each of a plurality of sequencing cycles.
  • the image data is derived from the emission signals detected by the light detectors; the analysis module 1444 processes the image data for each of the plurality of sequencing cycles through the neural network-based base caller 124 and produces a base call for at least some of the analytes at each of the plurality of sequencing cycles.
  • the light detectors can be part of one or more over-head cameras (e.g., Illumina’s GAIIx’s CCD camera taking images of the clusters on the biosensor 1412 from the top), or can be part of the biosensor 1412 itself (e.g., Illumina’s iSeq’s CMOS image sensors underlying the clusters on the biosensor 1412 and taking images of the clusters from the bottom).
  • the output of the light detectors is the sequencing images, each depicting intensity emissions of the clusters and their surrounding background.
  • the sequencing images depict intensity emissions generated as a result of nucleotide incorporation in the sequences during the sequencing.
  • the intensity emissions are from associated analytes and their surrounding background.
  • the sequencing images are stored in memory 1448.
  • Protocol modules 1440 and 1442 communicate with the main control module 1424 to control the operation of the sub-systems 1408, 1414, and 1404 when conducting predetermined assay protocols.
  • the protocol modules 1440 and 1442 may include sets of instructions for instructing the sequencing system 1400A to perform specific operations pursuant to predetermined protocols.
  • the protocol module may be a sequencing-by-synthesis (SBS) module 1440 that is configured to issue various commands for performing sequencing-by-synthesis processes.
  • extension of a nucleic acid primer along a nucleic acid template is monitored to determine the sequence of nucleotides in the template.
  • the underlying chemical process can be polymerization (e.g., as catalyzed by a polymerase enzyme) or ligation (e.g., catalyzed by a ligase enzyme).
  • fluorescently labeled nucleotides are added to a primer (thereby extending the primer) in a template dependent fashion such that detection of the order and type of nucleotides added to the primer can be used to determine the sequence of the template.
  • commands can be given to deliver one or more labeled nucleotides, DNA polymerase, etc., into/through a flow cell that houses an array of nucleic acid templates.
  • the nucleic acid templates may be located at corresponding reaction sites. Those reaction sites where primer extension causes a labeled nucleotide to be incorporated can be detected through an imaging event. During an imaging event, the illumination system 1416 may provide an excitation light to the reaction sites.
  • the nucleotides can further include a reversible termination property that terminates further primer extension once a nucleotide has been added to a primer. For example, a nucleotide analog having a reversible terminator moiety can be added to a primer such that subsequent extension cannot occur until a deblocking agent is delivered to remove the moiety.
  • a command can be given to deliver a deblocking reagent to the flow cell (before or after detection occurs).
  • One or more commands can be given to effect wash(es) between the various delivery steps.
  • the cycle can then be repeated n times to extend the primer by n nucleotides, thereby detecting a sequence of length n.
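  • The per-cycle command sequence described above can be sketched as follows in Python; the command labels are illustrative placeholders, not the actual command set issued by the SBS module 1440.

        def sbs_cycle_commands(n_cycles):
            """Yield the schematic command sequence for each base calling cycle."""
            for cycle in range(1, n_cycles + 1):
                yield cycle, [
                    "deliver labeled nucleotides and polymerase to the flow cell",
                    "illuminate reaction sites and detect fluorescent emissions",
                    "deliver deblocking reagent to remove the reversible terminator",
                    "wash between delivery steps",
                ]

        for cycle, commands in sbs_cycle_commands(3):
            print(cycle, commands)  # repeating the cycle n times detects a sequence of length n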
  • Exemplary sequencing techniques are described, for example, in Bentley et al., Nature 456:53-59 (2008); WO 04/015497; US 7,057,026; WO 91/06675; WO 07/123744; US 7,329,492; US 7,211,414; US 7,315,019; US 7,405,251, and US 2005/014705052, each of which is incorporated herein by reference.
  • In a nucleotide delivery step of an SBS cycle, either a single type of nucleotide can be delivered at a time, or multiple different nucleotide types (e.g., A, C, T and G together) can be delivered.
  • In a nucleotide delivery configuration where only a single type of nucleotide is present at a time, the different nucleotides need not have distinct labels since they can be distinguished based on temporal separation inherent in the individualized delivery.
  • a sequencing method or apparatus can use single color detection.
  • an excitation source need only provide excitation at a single wavelength or in a single range of wavelengths.
  • sites that incorporate different nucleotide types can be distinguished based on different fluorescent labels that are attached to respective nucleotide types in the mixture.
  • four different nucleotides can be used, each having one of four different fluorophores.
  • the four different fluorophores can be distinguished using excitation in four different regions of the spectrum.
  • four different excitation radiation sources can be used.
  • fewer than four different excitation sources can be used, but optical filtration of the excitation radiation from a single source can be used to produce different ranges of excitation radiation at the flow cell.
  • fewer than four different colors can be detected in a mixture having four different nucleotides.
  • pairs of nucleotides can be detected at the same wavelength, but distinguished based on a difference in intensity for one member of the pair compared to the other, or based on a change to one member of the pair (e.g., via chemical modification, photochemical modification or physical modification) that causes apparent signal to appear or disappear compared to the signal detected for the other member of the pair.
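  • As an illustration of detecting four nucleotides with fewer than four colors, the following Python sketch assumes a hypothetical two-channel encoding in which one base is dark in both channels and the others are distinguished by which channel is bright; the actual labels, channel assignments, and intensity rules may differ.

        # Hypothetical two-channel encoding, for illustration only.
        def call_base(channel1_bright, channel2_bright):
            return {
                (True, True): "A",    # bright in both channels
                (True, False): "C",   # bright in channel 1 only
                (False, True): "T",   # bright in channel 2 only
                (False, False): "G",  # dark in both channels (the "off" state)
            }[(channel1_bright, channel2_bright)]

        print(call_base(False, False))  # 'G' under this hypothetical encoding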
  • Exemplary apparatus and methods for distinguishing four different nucleotides using detection of fewer than four colors are described for example in US Pat. App. Ser. Nos. 61/535,294 and 61/619,575, which are incorporated herein by reference in their entireties.
  • U.S. Application No. 13/624,200 which was filed on September 21, 2012, is also incorporated by reference in its entirety.
  • the plurality of protocol modules may also include a sample-preparation (or generation) module 1442 that is configured to issue commands to the fluidic control system 1408 and the temperature control system 1404 for amplifying a product within the biosensor 1412.
  • the biosensor 1412 may be engaged to the sequencing system 1400A.
  • the amplification module 1442 may issue instructions to the fluidic control system 1408 to deliver necessary amplification components to reaction chambers within the biosensor 1412.
  • the reaction sites may already contain some components for amplification, such as the template DNA and/or primers.
  • the amplification module 1442 may instruct the temperature control system 1404 to cycle through different temperature stages according to known amplification protocols.
  • the amplification and/or nucleotide incorporation is performed isothermally.
  • the SBS module 1440 may issue commands to perform bridge PCR where clusters of clonal amplicons are formed on localized areas within a channel of a flow cell. After generating the amplicons through bridge PCR, the amplicons may be “linearized” to make single stranded template DNA, or sstDNA, and a sequencing primer may be hybridized to a universal sequence that flanks a region of interest. For example, a reversible terminator-based sequencing by synthesis method can be used as set forth above or as follows.
  • each base calling or sequencing cycle can extend an sstDNA by a single base, which can be accomplished, for example, by using a modified DNA polymerase and a mixture of four types of nucleotides.
  • each nucleotide can further have a reversible terminator that allows only a single-base incorporation to occur in each cycle.
  • excitation light may be incident upon the reaction sites and fluorescent emissions may be detected.
  • the fluorescent label and the terminator may be chemically cleaved from the sstDNA.
  • Another similar base calling or sequencing cycle may follow. In such a sequencing protocol, the SBS module 1440 may instruct the fluidic control system 1408 to direct a flow of reagent and enzyme solutions through the biosensor 1412.
  • Exemplary reversible terminator-based SBS methods which can be utilized with the apparatus and methods set forth herein are described in US Patent Application Publication No. 2007/0166705 Al, US Patent Application Publication No. 2006/0156*3901 Al, US Patent No. 7,057,026, US Patent Application Publication No. 2006/0240439 Al, US Patent Application Publication No. 2006/02514714709 Al, PCT Publication No. WO 05/065514, US Patent Application Publication No. 2005/014700900 Al, PCT Publication No. WO 06/05B199 and PCT Publication No. WO 07/01470251, each of which is incorporated herein by reference in its entirety.
  • the amplification and SBS modules may operate in a single assay protocol where, for example, template nucleic acid is amplified and subsequently sequenced within the same cartridge.
  • the sequencing system 1400A may also allow the user to reconfigure an assay protocol.
  • the sequencing system 1400A may offer options to the user through the user interface 1418 for modifying the determined protocol. For example, if it is determined that the biosensor 1412 is to be used for amplification, the sequencing system 1400A may request a temperature for the annealing cycle. Furthermore, the sequencing system 1400A may issue warnings to a user if a user has provided user inputs that are generally not acceptable for the selected assay protocol.
  • the biosensor 1412 includes millions of sensors (or pixels), each of which generates a plurality of sequences of pixel signals over successive base calling cycles.
  • the analysis module 1444 detects the plurality of sequences of pixel signals and attributes them to corresponding sensors (or pixels) in accordance with the row-wise and/or column-wise location of the sensors on an array of sensors.
  • Figure 14C is a simplified block diagram of a system for analysis of sensor data from the sequencing system 1400A, such as base call sensor outputs.
  • the system includes the configurable processor 1446.
  • the configurable processor 1446 can execute a base caller (e.g., the neural network-based base caller 124) in coordination with a runtime program executed by the central processing unit (CPU) 1452 (i.e., a host processor).
  • the sequencing system 1400A comprises the biosensor 1412 and flow cells.
  • the flow cells can comprise one or more tiles in which clusters of genetic material are exposed to a sequence of analyte flows used to cause reactions in the clusters to identify the bases in the genetic material.
  • the sensors sense the reactions for each cycle of the sequence in each tile of the flow cell to provide tile data.
  • Genetic sequencing is a data intensive operation, which translates base call sensor data into sequences of base calls for each cluster of genetic material sensed during a base call operation.
  • the system in this example includes the CPU 1452, which executes a runtime program to coordinate the base call operations, memory 1448B to store sequences of arrays of tile data, base call reads produced by the base calling operation, and other information used in the base call operations. Also, in this illustration the system includes memory 1448A to store a configuration file (or files), such as FPGA bit files, and model parameters for the neural networks used to configure and reconfigure the configurable processor 1446, and execute the neural networks.
  • the sequencing system 1400A can include a program for configuring a configurable processor and in some implementations a reconfigurable processor to execute the neural networks.
  • the sequencing system 1400A is coupled by a bus 1489 to the configurable processor 1446.
  • the bus 1489 can be implemented using a high throughput technology, such as in one example bus technology compatible with the PCIe standards (Peripheral Component Interconnect Express) currently maintained and developed by the PCI-SIG (PCI Special Interest Group).
  • a memory 1448A is coupled to the configurable processor 1446 by bus 1493.
  • the memory 1448A can be on-board memory, disposed on a circuit board with the configurable processor 1446.
  • the memory 1448A is used for high speed access by the configurable processor 1446 of working data used in the base call operation.
  • the bus 1493 can also be implemented using a high throughput technology, such as bus technology compatible with the PCIe standards.
  • Configurable processors, including field programmable gate arrays (FPGAs), coarse-grained reconfigurable arrays (CGRAs), and other configurable and reconfigurable devices, can be configured to implement a variety of functions more efficiently or faster than might be achieved using a general purpose processor executing a computer program.
  • Configuration of configurable processors involves compiling a functional description to produce a configuration file, referred to sometimes as a bitstream or bit file, and distributing the configuration file to the configurable elements on the processor.
  • the configuration file defines the logic functions to be executed by the configurable processor, by configuring the circuit to set data flow patterns, use of distributed memory and other on-chip memory resources, lookup table contents, operations of configurable logic blocks and configurable execution units like multiply-and-accumulate units, configurable interconnects and other elements of the configurable array.
  • a configurable processor is reconfigurable if the configuration file may be changed in the field, by changing the loaded configuration file.
  • the configuration file may be stored in volatile SRAM elements, in non-volatile read-write memory elements, and in combinations of the same, distributed among the array of configurable elements on the configurable or reconfigurable processor.
  • a variety of commercially available configurable processors are suitable for use in a base calling operation as described herein.
  • Examples include Google’s Tensor Processing Unit (TPU)™, rackmount solutions like GX4 Rackmount Series™, GX9 Rackmount Series™, NVIDIA DGX-1™, Microsoft’s Stratix V FPGA™, Graphcore’s Intelligent Processor Unit (IPU)™, Qualcomm’s Zeroth Platform™ with Snapdragon processors™, NVIDIA’s Volta™, NVIDIA’s DRIVE PX™, NVIDIA’s JETSON TX1/TX2 MODULE™, Intel’s Nirvana™, Movidius VPU™, Fujitsu DPI™, ARM’s DynamicIQ™, IBM TrueNorth™, Lambda GPU Server with Tesla V100s™, Xilinx Alveo™ U200, Xilinx Alveo™ U250, Xilinx Alveo™ U280, Intel/Altera Stratix™ GX2800, Intel/Altera Stratix™ GX2800, and Intel Stratix™ GX10M.
  • Implementations described herein implement the neural network-based base caller 124 using the configurable processor 1446.
  • the configuration file for the configurable processor 1446 can be implemented by specifying the logic functions to be executed using a high level description language (HDL) or a register transfer level (RTL) language specification.
  • the specification can be compiled using the resources designed for the selected configurable processor to generate the configuration file.
  • the same or similar specification can be compiled for the purposes of generating a design for an application-specific integrated circuit which may not be a configurable processor.
  • References to a configurable processor, such as the configurable processor 1446, in all implementations described herein therefore include a configured processor comprising an application specific ASIC or special purpose integrated circuit or set of integrated circuits, or a system-on-a-chip (SOC) device, or a graphics processing unit (GPU) processor or a coarse-grained reconfigurable architecture (CGRA) processor, configured to execute a neural network-based base call operation as described herein.
  • In general, configurable processors and configured processors described herein, as configured to execute runs of a neural network, are referred to herein as neural network processors.
  • the configurable processor 1446 is configured in this example by a configuration file loaded using a program executed by the CPU 1452, or by other sources, which configures the array of configurable elements 1491 (e.g., configurable logic blocks (CLBs) such as look up tables (LUTs), flip-flops, compute processing units (PMUs), and compute memory units (CMUs), configurable I/O blocks, programmable interconnects), on the configurable processor to execute the base call function.
  • the configuration includes data flow logic 104 which is coupled to the buses 1489 and 1493 and executes functions for distributing data and control parameters among the elements used in the base call operation.
  • the configurable processor 1446 is configured with data flow logic 104 to execute the neural network-based base caller 124.
  • the logic 104 comprises multi-cycle execution clusters (e.g., 1479) which, in this example, includes execution cluster 1 through execution cluster X.
  • the number of multi-cycle execution clusters can be selected according to a trade-off involving the desired throughput of the operation, and the available resources on the configurable processor 1446.
  • the multi-cycle execution clusters are coupled to the data flow logic 104 by data flow paths 1499 implemented using configurable interconnect and memory resources on the configurable processor 1446. Also, the multi-cycle execution clusters are coupled to the data flow logic 104 by control paths 1495 implemented using configurable interconnect and memory resources for example on the configurable processor 1446, which provide control signals indicating available execution clusters, readiness to provide input units for execution of a run of the neural network-based base caller 124 to the available execution clusters, readiness to provide trained parameters for the neural network-based base caller 124, readiness to provide output patches of base call classification data, and other control data used for execution of the neural network-based base caller 124.
  • the configurable processor 1446 is configured to execute runs of the neural network- based base caller 124 using trained parameters to produce classification data for the sensing cycles of the base calling operation.
  • a run of the neural network-based base caller 124 is executed to produce classification data for a subject sensing cycle of the base calling operation.
  • a run of the neural network-based base caller 124 operates on a sequence including a number N of arrays of tile data from respective sensing cycles of N sensing cycles, where the N sensing cycles provide sensor data for different base call operations for one base position per operation in time sequence in the examples described herein.
  • some of the N sensing cycles can be out of sequence if needed according to a particular neural network model being executed.
  • the number N can be any number greater than one.
  • sensing cycles of the N sensing cycles represent a set of sensing cycles for at least one sensing cycle preceding the subject sensing cycle and at least one sensing cycle following the subject cycle in time sequence. Examples are described herein in which the number N is an integer equal to or greater than five.
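  • A minimal Python sketch of assembling the N arrays around a subject sensing cycle follows, assuming tile data is simply a list indexed by cycle; the actual input units moved by the data flow logic may be organized differently.

        def cycle_window(tile_data, subject_cycle, n=5):
            """Return N arrays of tile data centered on the subject sensing cycle, i.e.,
            at least one preceding and at least one following cycle."""
            half = n // 2
            window = tile_data[subject_cycle - half: subject_cycle + half + 1]
            assert len(window) == n, "subject cycle is too close to the start or end of the run"
            return window

        cycles = [f"cycle_{i}" for i in range(10)]
        print(cycle_window(cycles, subject_cycle=5))  # cycles 3..7 around subject cycle 5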
  • the data flow logic 104 is configured to move tile data and at least some trained parameters of the model parameters from the memory 1448A to the configurable processor 1446 for runs of the neural network-based base caller 124, using input units for a given run including tile data for spatially aligned patches of the N arrays.
  • the input units can be moved by direct memory access operations in one DMA operation, or in smaller units moved during available time slots in coordination with the execution of the neural network deployed.
  • Tile data for a sensing cycle as described herein can comprise an array of sensor data having one or more features.
  • the sensor data can comprise two images which are analyzed to identify one of four bases at a base position in a genetic sequence of DNA, RNA, or other genetic material.
  • the tile data can also include metadata about the images and the sensors.
  • the tile data can comprise information about alignment of the images with the clusters such as distance from center information indicating the distance of each pixel in the array of sensor data from the center of a cluster of genetic material on the tile.
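  • For illustration, the following Python sketch computes the per-pixel distance-from-center information described above for a single hypothetical cluster center; the actual tile data may encode this metadata differently.

        import numpy as np

        def distance_from_center(height, width, center_row, center_col):
            """Per-pixel Euclidean distance from a cluster center, usable as an extra
            feature channel alongside the array of sensor data."""
            rows, cols = np.mgrid[0:height, 0:width]
            return np.sqrt((rows - center_row) ** 2 + (cols - center_col) ** 2)

        print(distance_from_center(5, 5, center_row=2, center_col=2))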
  • tile data can also include data produced during execution of the neural network-based base caller 124, referred to as intermediate data, which can be reused rather than recomputed during a run of the neural network-based base caller 124.
  • the data flow logic 104 can write intermediate data to the memory 1448 A in place of the sensor data for a given patch of an array of tile data. Implementations like this are described in more detail below.
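  • as an illustration of one possible layout of such tile data (a sketch only, using NumPy, a small hypothetical tile size, and an assumed nearest-cluster definition of the distance-from-center values; the actual encoding may differ), the two image channels and a distance-from-center channel could be stacked into a single per-cycle array as follows:

```python
import numpy as np

def distance_from_center_channel(height, width, cluster_centers):
    """Per-pixel distance-from-center (DFC) channel.

    cluster_centers: (K, 2) array of (row, col) cluster coordinates on the tile.
    Each pixel stores its distance to the nearest cluster center (an assumed
    definition of the DFC metadata described above).
    """
    rows, cols = np.mgrid[0:height, 0:width]
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(np.float32)
    diffs = pixels[:, None, :] - cluster_centers[None, :, :]
    dfc = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)   # nearest-cluster distance
    return dfc.reshape(height, width)

def build_tile_array(image_a, image_b, cluster_centers):
    """Stack the two intensity-channel images and the DFC channel into one
    (height, width, features) array of tile data for a single sensing cycle."""
    height, width = image_a.shape
    dfc = distance_from_center_channel(height, width, cluster_centers)
    return np.stack([image_a, image_b, dfc], axis=-1)

# Toy usage with a small hypothetical 64 x 64 tile and two cluster centers.
img_a = np.random.rand(64, 64).astype(np.float32)
img_b = np.random.rand(64, 64).astype(np.float32)
centers = np.array([[10.0, 12.0], [40.0, 50.0]], dtype=np.float32)
tile = build_tile_array(img_a, img_b, centers)   # shape (64, 64, 3)
```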
  • a system for analysis of base call sensor output, comprising memory (e.g., 1448A) accessible by the runtime program storing tile data including sensor data for a tile from sensing cycles of a base calling operation. Also, the system includes a neural network processor, such as configurable processor 1446 having access to the memory.
  • the neural network processor is configured to execute runs of a neural network using trained parameters to produce classification data for sensing cycles. As described herein, a run of the neural network is operating on a sequence of N arrays of tile data from respective sensing cycles of N sensing cycles, including a subject cycle, to produce the classification data for the subject cycle.
  • the data flow logic 908 is provided to move tile data and the trained parameters from the memory to the neural network processor for runs of the neural network using input units including data for spatially aligned patches of the N arrays from respective sensing cycles of N sensing cycles.
  • the neural network processor has access to the memory, and includes a plurality of execution clusters, the execution clusters in the plurality of execution clusters configured to execute a neural network.
  • the data flow logic 104 has access to the memory and to execution clusters in the plurality of execution clusters, to provide input units of tile data to available execution clusters in the plurality of execution clusters, the input units including a number N of spatially aligned patches of arrays of tile data from respective sensing cycles, including a subject sensing cycle, and to cause the execution clusters to apply the N spatially aligned patches to the neural network to produce output patches of classification data for the spatially aligned patch of the subject sensing cycle, where N is greater than 1.
  • FIG. 15 is a simplified diagram showing aspects of the base calling operation, including functions of a runtime program executed by a host processor.
  • the outputs of image sensors from a flow cell are provided on lines 1500 to image processing threads 1501, which can perform processes on the images, such as alignment and arrangement in an array of sensor data for the individual tiles and resampling of the images, and which can be used by processes that calculate a tile cluster mask for each tile in the flow cell, identifying pixels in the array of sensor data that correspond to clusters of genetic material on the corresponding tile of the flow cell.
  • the outputs of the image processing threads 1501 are provided on lines 1502 to a dispatch logic 1510 in the CPU which routes the arrays of tile data to a data cache 1504 (e.g., SSD storage) on a high-speed bus 1503, or on high-speed bus 1505 to the neural network processor hardware 1520, such as the configurable processor 1446 of Figure 14C, according to the state of the base calling operation.
  • the processed and transformed images can be stored on the data cache 1504 for sensing cycles that were previously used.
  • the hardware 1520 returns classification data output by the neural network to the dispatch logic 1515, which passes the information to the data cache 1504, or on lines 1511 to threads 1502 that perform base call and quality score computations using the classification data, and can arrange the data in standard formats for base call reads.
  • the host can include threads (not shown) that perform final processing of the output of the hardware 1520 in support of the neural network.
  • the hardware 1520 can provide outputs of classification data from a final layer of the multi-cluster neural network.
  • the host processor can execute an output activation function, such as a softmax function, over the classification data to configure the data for use by the base call and quality score threads 1502.
  • the host processor can execute input operations (not shown), such as batch normalization of the tile data prior to input to the hardware 1520.
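  • as a minimal, illustrative sketch of the softmax output activation described above (not the actual runtime program), the classification data can be converted to per-base probabilities as follows:

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis of the classification data."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# Hypothetical classification data: one score per base (A, C, G, T) per cluster.
scores = np.array([[2.1, 0.3, -1.0, 0.5],
                   [0.1, 3.2,  0.0, 0.2]])
probs = softmax(scores)              # each row sums to 1.0
called_bases = probs.argmax(axis=1)  # index of the most likely base per cluster
```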
  • Figure 16 is a simplified diagram of a configuration of a configurable processor 1446 such as that of Figure 14C.
  • the configurable processor 1446 comprises an FPGA with a plurality of high-speed PCIe interfaces.
  • the FPGA is configured with a wrapper 1690 which comprises the data flow logic 104 described with reference to Figure 14C.
  • the wrapper 1690 manages the interface and coordination with a runtime program in the CPU across the CPU communication link 1677 and manages communication with the on-board DRAM 1699 (e.g., memory 1448A) via DRAM communication link 1697.
  • the data flow logic 104 in the wrapper 1690 provides patch data retrieved by traversing the arrays of tile data on the on-board DRAM 1699 for the number N cycles to a cluster 1685, and retrieves process data 1687 from the cluster 1685 for delivery back to the on-board DRAM 1699.
  • the wrapper 1690 also manages transfer of data between the on-board DRAM 1699 and host memory, for both the input arrays of tile data, and for the output patches of classification data.
  • the wrapper transfers patch data on line 1683 to the allocated cluster 1685.
  • the wrapper provides trained parameters, such as weights and biases on line 1681 to the cluster 1685 retrieved from the on-board DRAM 1699.
  • the wrapper provides configuration and control data on line 1679 to the cluster 1685 provided from, or generated in response to, the runtime program on the host via the CPU communication link 1677.
  • the cluster can also provide status signals on line 1689 to the wrapper 1690, which are used in cooperation with control signals from the host to manage traversal of the arrays of tile data to provide spatially aligned patch data, and to execute the multi-cycle neural network over the patch data using the resources of the cluster 1685.
  • each cluster can be configured to provide classification data for base calls in a subject sensing cycle using the tile data of multiple sensing cycles described herein.
  • model data including kernel data like filter weights and biases can be sent from the host CPU to the configurable processor, so that the model can be updated as a function of cycle number.
  • a base calling operation can comprise, for a representative example, on the order of hundreds of sensing cycles. A base calling operation can include paired-end reads in some embodiments.
  • the model trained parameters may be updated once every 20 cycles (or other number of cycles), or according to update patterns implemented for particular systems and neural network models.
  • the trained parameters can be updated on the transition from the first part of the run to the second part (e.g., between the two reads of a paired-end read).
  • image data for multiple cycles of sensing data for a tile can be sent from the CPU to the wrapper 1690.
  • the wrapper 1690 can optionally do some pre-processing and transformation of the sensing data and write the information to the on-board DRAM 1699.
  • the input tile data for each sensing cycle can include arrays of sensor data including on the order of 4000 x 3000 pixels per sensing cycle per tile or more, with two features representing colors of two images of the tile, and one or two bytes per feature per pixel.
  • the array of tile data for each run of the multi-cycle neural network can consume on the order of hundreds of megabytes per tile.
  • the tile data also includes an array of DFC (distance-from-center) data, stored once per tile, or other types of metadata about the sensor data and the tiles.
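  • to make the sizes above concrete, the following back-of-envelope sketch (illustrative only; it assumes the 4000 x 3000 pixel example, two features, two bytes per feature per pixel, and N = 5 cycles per run) reproduces the rough figures quoted above:

```python
# Rough memory footprint of the input tile data, using the example numbers above.
pixels_per_tile = 4000 * 3000          # sensor array per tile per sensing cycle
features = 2                           # two image channels (colors)
bytes_per_feature = 2                  # one or two bytes; two assumed here
n_cycles = 5                           # example N for the multi-cycle neural network

bytes_per_cycle = pixels_per_tile * features * bytes_per_feature
bytes_per_run = bytes_per_cycle * n_cycles

print(f"per cycle: {bytes_per_cycle / 1e6:.0f} MB")   # ~48 MB
print(f"per run:   {bytes_per_run / 1e6:.0f} MB")     # ~240 MB -> hundreds of MB per tile
```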
  • the wrapper allocates a patch to the cluster.
  • the wrapper fetches a next patch of tile data in the traversal of the tile and sends it to the allocated cluster along with appropriate control and configuration information.
  • the cluster can be configured with enough memory on the configurable processor to hold, in place, the patch of data currently being worked on (including, in some systems, patches from multiple cycles) and the patch of data that is to be worked on when processing of the current patch is finished, using a ping-pong buffer technique or raster scanning technique in various embodiments.
  • when an allocated cluster completes its run of the neural network for the current patch and produces an output patch, it will signal the wrapper.
  • the wrapper will read the output patch from the allocated cluster, or alternatively the allocated cluster will push the data out to the wrapper. Then the wrapper will assemble output patches for the processed tile in the DRAM 1699.
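  • the ping-pong buffering mentioned above can be pictured with the following toy dispatch loop (a sketch only; on the actual hardware the prefetch of the next patch overlaps with execution on the cluster rather than running sequentially):

```python
def dispatch_patches(patches, run_neural_network):
    """Toy ping-pong dispatch loop: while one patch buffer is being processed,
    the other is filled with the next patch, and the two buffers swap roles."""
    buffers = [None, None]
    active = 0

    patch_iter = iter(patches)
    buffers[active] = next(patch_iter, None)

    outputs = []
    while buffers[active] is not None:
        # Prefetch the next patch into the inactive buffer (sequential here,
        # overlapped with cluster execution on the real hardware).
        buffers[1 - active] = next(patch_iter, None)
        outputs.append(run_neural_network(buffers[active]))
        active = 1 - active          # the prefetched buffer becomes active
    return outputs

# Hypothetical usage: call with a list of patches and any callable model, e.g.
# results = dispatch_patches(list_of_patches, lambda patch: patch.mean())
```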
  • the wrapper sends the processed output array for the tile back to the host/CPU in a specified format.
  • the on-board DRAM 1699 is managed by memory management logic in the wrapper 1690.
  • the runtime program can control the sequencing operations to complete analysis of all the arrays of tile data for all the cycles in the run in a continuous flow to provide real time analysis.
  • Figure 17 shows a computer system 1700 that can be used by the sequencing system 500A to implement the base calling techniques disclosed herein.
  • Computer system 1700 includes at least one central processing unit (CPU) 1772 that communicates with a number of peripheral devices via bus subsystem 1755.
  • peripheral devices can include a storage subsystem 858 including, for example, memory devices and a file storage subsystem 1736, user interface input devices 1738, user interface output devices 1776, and a network interface subsystem 1774.
  • the input and output devices allow user interaction with computer system 1700.
  • Network interface subsystem 1774 provides an interface to outside networks, including an interface to corresponding interface devices in other computer systems.
  • system controller 1406 is communicably linked to the storage subsystem 1710 and the user interface input devices 1738.
  • User interface input devices 1738 can include a keyboard; pointing devices such as a mouse, trackball, touchpad, or graphics tablet; a scanner; a touch screen incorporated into the display; audio input devices such as voice recognition systems and microphones; and other types of input devices.
  • pointing devices such as a mouse, trackball, touchpad, or graphics tablet
  • audio input devices such as voice recognition systems and microphones
  • use of the term “input device” is intended to include all possible types of devices and ways to input information into computer system 1700.
  • User interface output devices 1776 can include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices.
  • the display subsystem can include an LED display, a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image.
  • the display subsystem can also provide a non-visual display such as audio output devices.
  • use of the term “output device” is intended to include all possible types of devices and ways to output information from computer system 1700 to the user or to another machine or computer system.
  • Storage subsystem 858 stores programming and data constructs that provide the functionality of some or all of the modules and methods described herein. These software modules are generally executed by deep learning processors 1778.
  • Deep learning processors 1778 can be graphics processing units (GPUs), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and/or coarse-grained reconfigurable architectures (CGRAs). Deep learning processors 1778 can be hosted by a deep learning cloud platform such as Google Cloud Platform™, Xilinx™, and Cirrascale™.
  • Examples of deep learning processors 1778 include Google's Tensor Processing Unit (TPU)™, rackmount solutions like GX4 Rackmount Series™ and GX17 Rackmount Series™, NVIDIA DGX-1™, Microsoft's Stratix V FPGA™, Graphcore's Intelligent Processor Unit (IPU)™, Qualcomm's Zeroth Platform™ with Snapdragon processors™, NVIDIA's Volta™, NVIDIA's DRIVE PX™, NVIDIA's JETSON TX1/TX2 MODULE™, Intel's Nirvana™, Movidius VPU™, Fujitsu DPI™, ARM's DynamicIQ™, IBM TrueNorth™, Lambda GPU
  • Memory subsystem 1722 used in the storage subsystem 858 can include a number of memories including a main random access memory (RAM) 1732 for storage of instructions and data during program execution and a read only memory (ROM) 1734 in which fixed instructions are stored.
  • a file storage subsystem 1736 can provide persistent storage for program and data files, and can include a hard disk drive, a floppy disk drive along with associated removable media, a CD-ROM drive, an optical drive, or removable media cartridges.
  • the modules implementing the functionality of certain implementations can be stored by file storage subsystem 1736 in the storage subsystem 858, or in other machines accessible by the processor.
  • Bus subsystem 1755 provides a mechanism for letting the various components and subsystems of computer system 1700 communicate with each other as intended. Although bus subsystem 1755 is shown schematically as a single bus, alternative implementations of the bus subsystem can use multiple busses.
  • Computer system 1700 itself can be of varying types including a personal computer, a portable computer, a workstation, a computer terminal, a network computer, a television, a mainframe, a server farm, a widely distributed set of loosely networked computers, or any other data processing system or user device. Due to the ever-changing nature of computers and networks, the description of computer system 1700 depicted in Figure 17 is intended only as a specific example for purposes of illustrating the preferred implementations of the present invention. Many other configurations of computer system 1700 are possible having more or fewer components than the computer system depicted in Figure 17.
  • the technology disclosed provides an artificial intelligence-based base caller with contextual awareness.
  • the technology disclosed can be practiced as a system, method, or article of manufacture.
  • One or more features of an implementation can be combined with the base implementation. Implementations that are not mutually exclusive are taught to be combinable.
  • One or more features of an implementation can be combined with other implementations. This disclosure periodically reminds the user of these options. Omission from some implementations of recitations that repeat these options should not be taken as limiting the combinations taught in the preceding sections - these recitations are hereby incorporated forward by reference into each of the following implementations.
  • Various processes and steps of the methods set forth herein can be carried out using a computer.
  • the computer can include a processor that is part of a detection device, networked with a detection device used to obtain the data that is processed by the computer or separate from the detection device.
  • information may be transmitted between components of a system disclosed herein directly or via a computer network.
  • a local area network (LAN) or wide area network (WAN) may be a corporate computing network, including access to the Internet, to which computers and computing devices comprising the system are connected.
  • the LAN conforms to the transmission control protocol/internet protocol (TCP/IP) industry standard.
  • the information (e.g., image data) can be received via an input device (e.g., disk drive, compact disk player, USB port, etc.), for example by loading the information from a storage device such as a disk or flash drive.
  • a processor that is used to run an algorithm or other process set forth herein may comprise a microprocessor.
  • the microprocessor may be any conventional general purpose single- or multi-chip microprocessor such as a Pentium™ processor made by Intel Corporation.
  • a particularly useful computer can utilize an Intel Ivy Bridge dual 12-core processor, an LSI RAID controller, 128 GB of RAM, and a 2 TB solid-state disk drive.
  • the processor may comprise any conventional special purpose processor such as a digital signal processor or a graphics processor.
  • the processor typically has conventional address lines, conventional data lines, and one or more conventional control lines.
  • implementations disclosed herein may be implemented as a method, apparatus, system or article of manufacture using standard programming or engineering techniques to produce software, firmware, hardware, or any combination thereof.
  • the term “article of manufacture” refers to code or logic implemented in hardware or computer readable media such as optical storage devices, and volatile or non-volatile memory devices.
  • Such hardware may include, but is not limited to, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), complex programmable logic devices (CPLDs), programmable logic arrays (PLAs), microprocessors, or other similar processing devices.
  • information or algorithms set forth herein are present in non-transient storage media.
  • a computer-implemented method set forth herein can occur in real time while multiple images of an object are being obtained.
  • Such real time analysis is particularly useful for nucleic acid sequencing applications wherein an array of nucleic acids is subjected to repeated cycles of fluidic and detection steps.
  • Analysis of the sequencing data can often be computationally intensive such that it can be beneficial to perform the methods set forth herein in real time or in the background while other data acquisition or analysis algorithms are in process.
  • Example real time analysis methods that can be used with the present methods are those used for the MiSeq and HiSeq sequencing devices commercially available from Illumina, Inc. (San Diego, Calif.) and/or described in US Pat. App. Pub. No. 2012/0020537 Al, which is incorporated herein by reference.
  • the system includes memory, a data flow logic, a neural network, and an intensity contextualization unit.
  • the memory stores images that depict intensity emissions of a set of analytes.
  • the intensity emissions are generated by analytes in the set of analytes during sequencing cycles of a sequencing run.
  • the images have the intensity values for one or more intensity channels.
  • the data flow logic has access to the memory and is configured to provide a neural network access to the images on a patch-by-patch basis.
  • the patches in an image depict the intensity emissions for a subset of the analytes.
  • the patches have undiverse intensity patterns due to limited base diversity of analytes in the subset.
  • the neural network has a plurality of convolution filters.
  • Convolution filters in the plurality of convolution filters have receptive fields confined to the patches.
  • the convolution filters are configured to detect intensity patterns in the patches with losses in detection due to the undiverse intensity patterns and the confined receptive fields.
  • the intensity contextualization unit is configured to determine intensity context data based on intensity values in the images and store the intensity context data in the memory.
  • the data flow logic is configured to append the intensity context data to the patches to generate intensity contextualized images and provide the intensity contextualized images to the neural network.
  • the neural network is configured to apply the convolution filters on the intensity contextualized images and generate base call classifications.
  • the intensity context data in the intensity contextualized images compensates for the losses in detection.
  • the intensity context data specifies summary statistics of the intensity values.
  • the intensity context data identifies a maximum value in the intensity values.
  • the intensity context data identifies a minimum value in the intensity values.
  • the intensity context data identifies a mean of the intensity values.
  • the intensity context data identifies a mode of the intensity values.
  • the intensity context data identifies a standard deviation of the intensity values.
  • the intensity context data identifies a variance of the intensity values.
  • the intensity context data identifies a skewness of the intensity values. In one implementation, the intensity context data identifies a kurtosis of the intensity values. In one implementation, the intensity context data identifies an entropy of the intensity values.
  • the intensity context data identifies one or more percentiles of the intensity values. In one implementation, the intensity context data identifies a delta between at least one of the maximum value and the minimum value, the maximum value and the mean, the mean and the minimum value, and a higher one of the percentiles and a lower one of the percentiles. In one implementation, the intensity context data identifies a sum of the intensity values.
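  • a minimal sketch of computing the kinds of per-image summary statistics listed above (using NumPy and SciPy; the exact set of statistics and their encoding in the intensity context data are implementation choices) is:

```python
import numpy as np
from scipy import stats

def intensity_summary_stats(image):
    """Summary statistics over the intensity values of one image (flattened)."""
    v = np.asarray(image, dtype=np.float64).ravel()
    vals, counts = np.unique(v, return_counts=True)
    hist, _ = np.histogram(v, bins=256)
    p25, p75 = np.percentile(v, [25, 75])
    return {
        "max": v.max(),
        "min": v.min(),
        "mean": v.mean(),
        "mode": vals[counts.argmax()],
        "std": v.std(),
        "variance": v.var(),
        "skewness": stats.skew(v),
        "kurtosis": stats.kurtosis(v),
        "entropy": stats.entropy(hist),   # entropy of the intensity histogram
        "p25": p25,
        "p75": p75,
        "delta_max_min": v.max() - v.min(),
        "delta_p75_p25": p75 - p25,
        "sum": v.sum(),
    }
```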
  • the intensity contextualization unit determines a plurality of maximum values by dividing the intensity values into groups and determining a maximum value for each of the groups.
  • the intensity context data identifies the smallest value in the plurality of maximum values.
  • the intensity contextualization unit determines a plurality of minimum values by dividing the intensity values into groups and determining a minimum value for each of the groups.
  • the intensity context data identifies the largest value in the plurality of minimum values.
  • the intensity contextualization unit determines a plurality of sums by dividing the intensity values into groups and determining a sum of intensity values in each of the groups.
  • the intensity context data identifies the smallest value in the plurality of sums.
  • the intensity context data identifies the largest value in the plurality of sums.
  • the intensity context data identifies a mean of the plurality of sums.
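  • the group-wise variants above can be sketched as follows, with the grouping assumed here, purely for illustration, to be non-overlapping square blocks of the image:

```python
import numpy as np

def groupwise_context(image, block=32):
    """Split the image into non-overlapping block x block groups (an assumed
    grouping) and derive context values from per-group statistics."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    h_trim, w_trim = h - h % block, w - w % block
    blocks = (img[:h_trim, :w_trim]
              .reshape(h_trim // block, block, w_trim // block, block)
              .swapaxes(1, 2)
              .reshape(-1, block, block))

    group_max = blocks.max(axis=(1, 2))
    group_min = blocks.min(axis=(1, 2))
    group_sum = blocks.sum(axis=(1, 2))

    return {
        "min_of_group_maxima": group_max.min(),
        "max_of_group_minima": group_min.max(),
        "min_of_group_sums": group_sum.min(),
        "max_of_group_sums": group_sum.max(),
        "mean_of_group_sums": group_sum.mean(),
    }
```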
  • the intensity contextualization unit has a plurality of convolution pipelines.
  • Each of the convolution pipelines has a plurality of convolution filters.
  • Convolution filters in the plurality of convolution filters have varying filter sizes.
  • the convolution filters have varying filter strides.
  • each of the convolution pipelines processes an image to generate a plurality of convolved representations of the image.
  • the intensity context data has a context channel for each convolved representation in the plurality of convolved representations.
  • the context channel has as many concatenated copies of a respective one of the convolved representations as required to match a size of the image.
  • each convolved representation is of size 1 x 1.
  • the concatenated copies are pixelwise appended to the image.
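  • one way to picture the pipeline above, as a sketch that assumes each convolution pipeline has already reduced the image to a single 1 x 1 value (for instance via progressively strided convolutions or pooling), is to tile each value across the image and append it as an extra channel:

```python
import numpy as np

def append_context_channels(image, convolved_values):
    """Append one context channel per 1x1 convolved representation.

    image: (H, W, C) intensity image (or patch).
    convolved_values: iterable of scalars, each the 1x1 output of one
    convolution pipeline (how these values are produced is abstracted away).
    Each scalar is broadcast to an (H, W, 1) channel, i.e. as many concatenated
    copies as required to match the size of the image.
    """
    h, w, _ = image.shape
    channels = [np.full((h, w, 1), v, dtype=image.dtype) for v in convolved_values]
    return np.concatenate([image] + channels, axis=-1)

# Hypothetical usage: a 2-channel image plus three pipeline outputs
# becomes a 5-channel intensity contextualized image.
img = np.random.rand(64, 64, 2).astype(np.float32)
ctx = append_context_channels(img, [0.91, 0.07, 0.42])
assert ctx.shape == (64, 64, 5)
```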
  • the method includes accessing images that depict intensity emissions of a set of analytes.
  • the intensity emissions are generated by analytes in the set of analytes during sequencing cycles of a sequencing run.
  • the method includes processing the images on a patch-by-patch basis, and thereby generating patches.
  • the patches depict the intensity emissions for a subset of the analytes.
  • the method includes determining intensity context data based on intensity values in the images.
  • the method includes appending the intensity context data to the patches and generating intensity contextualized images.
  • the method includes processing the intensity contextualized images and generating base call classifications.
  • implementations of the method described in this section can include a non-transitory computer readable storage medium storing instructions executable by a processor to perform any of the methods described above.
  • implementations of the method described in this section can include a system including memory and one or more processors operable to execute instructions, stored in the memory, to perform any of the methods described above.
  • a self-normalizing neural network comprises a normalization layer (e.g., the intensity contextualization unit 112).
  • the normalization layer is configured to determine one or more normalization parameters from an input on an input-by-input basis.
  • the normalization layer is further configured to append context data characterizing the normalization parameters to patches accessed from the input.
  • consider, for example, two inputs, such as two images.
  • the normalization layer determines a first set of normalization parameters for the first image and a second set of normalization parameters for the second image. This is different from other normalization techniques like batch normalization, which learns a fixed set of normalization parameters and uses them for a whole batch of inputs.
  • the normalization parameters determined by the disclosed normalization layer are specific to a given input, and determined at runtime (e.g., at inference).
  • the normalization layer is trained to generate normalization parameters that are specific to a subject input.
  • the self-normalizing neural network further comprises runtime logic.
  • the runtime logic is configured to process the patches appended with the context data through the self- normalizing neural network to generate an output.
  • the normalization layer is further configured to determine respective normalization parameters for respective inputs at runtime.
  • the normalization parameters are summary statistics about intensity values in the input.
  • the context data includes the summary statistics in a pixel- wise encoding.
  • the context data is pixel-wise encoded to the patches.
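  • to contrast this with batch normalization, the following sketch (illustrative only; the mean and standard deviation are assumed example parameters, not necessarily the ones the normalization layer is trained to produce) computes normalization parameters from each input individually at runtime and pixel-wise appends them to the patches as context channels:

```python
import numpy as np

def self_normalizing_context(input_image, patches):
    """Per-input normalization: statistics are computed from this particular
    input at runtime (mean and standard deviation used as assumed example
    parameters) and pixel-wise appended to each patch as context channels,
    rather than using one fixed set of parameters for a whole batch."""
    params = {"mean": float(input_image.mean()), "std": float(input_image.std())}
    contextualized = []
    for patch in patches:                        # each patch: (H, W, C)
        h, w, _ = patch.shape
        ctx = [np.full((h, w, 1), value, dtype=patch.dtype) for value in params.values()]
        contextualized.append(np.concatenate([patch] + ctx, axis=-1))
    return params, contextualized

# Two different inputs yield two different parameter sets, unlike batch
# normalization, which shares one learned set across the whole batch.
img1 = np.random.rand(128, 128, 2).astype(np.float32)
img2 = (10.0 * np.random.rand(128, 128, 2)).astype(np.float32)
params1, _ = self_normalizing_context(img1, [img1[:64, :64]])
params2, _ = self_normalizing_context(img2, [img2[:64, :64]])
```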

Abstract

Disclosed is a neural network that processes sequencing images patch by patch for base calling. The sequencing images depict intensity emissions of a set of analytes. The patches depict the intensity emissions for a subset of the analytes and have undiverse intensity patterns due to limited base diversity. The neural network includes convolution filters that have receptive fields confined to the patches. The convolution filters detect intensity patterns in the patches with losses in detection due to the undiverse intensity patterns and the confined receptive fields. An intensity contextualization unit determines intensity context data based on intensity values in the images. Data flow logic appends the intensity context data to the sequencing images to generate intensity contextualized images. The neural network applies the convolution filters to the intensity contextualized images and generates base call classifications. The intensity context data in the intensity contextualized images compensates for the losses in detection.
PCT/US2022/021814 2021-03-31 2022-03-24 Appelant de base à base d'intelligence artificielle avec reconnaissance contextuelle WO2022212180A1 (fr)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202280005054.XA CN115803816A (zh) 2021-03-31 2022-03-24 具有情境感知的基于人工智能的碱基检出器
CA3183578A CA3183578A1 (fr) 2021-03-31 2022-03-24 Appelant de base a base d'intelligence artificielle avec reconnaissance contextuelle
AU2022248999A AU2022248999A1 (en) 2021-03-31 2022-03-24 Artificial intelligence-based base caller with contextual awareness
EP22720805.5A EP4315343A1 (fr) 2021-03-31 2022-03-24 Appelant de base à base d'intelligence artificielle avec reconnaissance contextuelle

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163169163P 2021-03-31 2021-03-31
US63/169,163 2021-03-31
US17/687,586 US20220319639A1 (en) 2021-03-31 2022-03-04 Artificial intelligence-based base caller with contextual awareness
US17/687,586 2022-03-04

Publications (1)

Publication Number Publication Date
WO2022212180A1 true WO2022212180A1 (fr) 2022-10-06

Family

ID=81579662

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/021814 WO2022212180A1 (fr) 2021-03-31 2022-03-24 Appelant de base à base d'intelligence artificielle avec reconnaissance contextuelle

Country Status (5)

Country Link
EP (1) EP4315343A1 (fr)
CN (1) CN115803816A (fr)
AU (1) AU2022248999A1 (fr)
CA (1) CA3183578A1 (fr)
WO (1) WO2022212180A1 (fr)

Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1991006675A1 (fr) 1989-11-02 1991-05-16 Hoechst Japan Limited Nouvelle sonde a base d'oligonucleotides
WO2004015497A1 (fr) 2002-08-07 2004-02-19 Mitsubishi Chemical Corporation Materiau de formation d'image comportant une couche de materiau de reserve photosensible au laser violet bleuatre et procede de formation d'image de reserve correspondant
US20050147009A1 (en) 2003-12-01 2005-07-07 Lg Electronics Inc. Method for successively recording data in hybrid digital recorder
US20050147050A1 (en) 2002-04-29 2005-07-07 Joachim Klink Method for examining the connectivity of links in mpls networks
WO2005065514A1 (fr) 2004-01-12 2005-07-21 Djibril Soumah Lunette de wc
WO2006058199A1 (fr) 2004-11-23 2006-06-01 Fazix Corporation Procedes de modulation de niveaux de cholesterol a lipoproteines de haute densite et formulations pharmaceutiques a cet effet
US7057026B2 (en) 2001-12-04 2006-06-06 Solexa Limited Labelled nucleotides
US20060240439A1 (en) 2003-09-11 2006-10-26 Smith Geoffrey P Modified polymerases for improved incorporation of nucleotide analogues
US20060251471A1 (en) 2005-05-06 2006-11-09 Wei-Gen Chen Manual adjustment device for headlamps
WO2007014702A1 (fr) 2005-07-29 2007-02-08 Cairos Technologies Ag Dispositif mobile et dispositif recepteur pour detection de contacts avec le dispositif mobile
US7211414B2 (en) 2000-12-01 2007-05-01 Visigen Biotechnologies, Inc. Enzymatic nucleic acid synthesis: compositions and methods for altering monomer incorporation fidelity
US20070166705A1 (en) 2002-08-23 2007-07-19 John Milton Modified nucleotides
WO2007123744A2 (fr) 2006-03-31 2007-11-01 Solexa, Inc. Systèmes et procédés pour analyse de séquençage par synthèse
WO2007145365A1 (fr) 2006-06-14 2007-12-21 Jichi Medical University Agent thérapeutique destiné au traitement du cancer et procédé destiné à cribler ledit agent
US7315019B2 (en) 2004-09-17 2008-01-01 Pacific Biosciences Of California, Inc. Arrays of optical confinements and uses thereof
US7329492B2 (en) 2000-07-07 2008-02-12 Visigen Biotechnologies, Inc. Methods for real-time single molecule sequence determination
US7405251B2 (en) 2002-05-16 2008-07-29 Dow Corning Corporation Flame retardant compositions
US7414716B2 (en) 2006-10-23 2008-08-19 Emhart Glass S.A. Machine for inspecting glass containers
US7592435B2 (en) 2005-08-19 2009-09-22 Illumina Cambridge Limited Modified nucleosides and nucleotides and uses thereof
US20120020537A1 (en) 2010-01-13 2012-01-26 Francisco Garcia Data processing system and methods
US20200302223A1 (en) * 2019-03-21 2020-09-24 Illumina, Inc. Artificial Intelligence-Based Generation of Sequencing Metadata
US20200302297A1 (en) * 2019-03-21 2020-09-24 Illumina, Inc. Artificial Intelligence-Based Base Calling

Patent Citations (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1991006675A1 (fr) 1989-11-02 1991-05-16 Hoechst Japan Limited Nouvelle sonde a base d'oligonucleotides
US7329492B2 (en) 2000-07-07 2008-02-12 Visigen Biotechnologies, Inc. Methods for real-time single molecule sequence determination
US7211414B2 (en) 2000-12-01 2007-05-01 Visigen Biotechnologies, Inc. Enzymatic nucleic acid synthesis: compositions and methods for altering monomer incorporation fidelity
US7427673B2 (en) 2001-12-04 2008-09-23 Illumina Cambridge Limited Labelled nucleotides
US7566537B2 (en) 2001-12-04 2009-07-28 Illumina Cambridge Limited Labelled nucleotides
US7057026B2 (en) 2001-12-04 2006-06-06 Solexa Limited Labelled nucleotides
US20050147050A1 (en) 2002-04-29 2005-07-07 Joachim Klink Method for examining the connectivity of links in mpls networks
US7405251B2 (en) 2002-05-16 2008-07-29 Dow Corning Corporation Flame retardant compositions
WO2004015497A1 (fr) 2002-08-07 2004-02-19 Mitsubishi Chemical Corporation Materiau de formation d'image comportant une couche de materiau de reserve photosensible au laser violet bleuatre et procede de formation d'image de reserve correspondant
US7541444B2 (en) 2002-08-23 2009-06-02 Illumina Cambridge Limited Modified nucleotides
US20070166705A1 (en) 2002-08-23 2007-07-19 John Milton Modified nucleotides
US20060240439A1 (en) 2003-09-11 2006-10-26 Smith Geoffrey P Modified polymerases for improved incorporation of nucleotide analogues
US20050147009A1 (en) 2003-12-01 2005-07-07 Lg Electronics Inc. Method for successively recording data in hybrid digital recorder
WO2005065514A1 (fr) 2004-01-12 2005-07-21 Djibril Soumah Lunette de wc
US7315019B2 (en) 2004-09-17 2008-01-01 Pacific Biosciences Of California, Inc. Arrays of optical confinements and uses thereof
WO2006058199A1 (fr) 2004-11-23 2006-06-01 Fazix Corporation Procedes de modulation de niveaux de cholesterol a lipoproteines de haute densite et formulations pharmaceutiques a cet effet
US20060251471A1 (en) 2005-05-06 2006-11-09 Wei-Gen Chen Manual adjustment device for headlamps
WO2007014702A1 (fr) 2005-07-29 2007-02-08 Cairos Technologies Ag Dispositif mobile et dispositif recepteur pour detection de contacts avec le dispositif mobile
US7592435B2 (en) 2005-08-19 2009-09-22 Illumina Cambridge Limited Modified nucleosides and nucleotides and uses thereof
WO2007123744A2 (fr) 2006-03-31 2007-11-01 Solexa, Inc. Systèmes et procédés pour analyse de séquençage par synthèse
WO2007145365A1 (fr) 2006-06-14 2007-12-21 Jichi Medical University Agent thérapeutique destiné au traitement du cancer et procédé destiné à cribler ledit agent
US7414716B2 (en) 2006-10-23 2008-08-19 Emhart Glass S.A. Machine for inspecting glass containers
US20120020537A1 (en) 2010-01-13 2012-01-26 Francisco Garcia Data processing system and methods
US20200302223A1 (en) * 2019-03-21 2020-09-24 Illumina, Inc. Artificial Intelligence-Based Generation of Sequencing Metadata
US20200302297A1 (en) * 2019-03-21 2020-09-24 Illumina, Inc. Artificial Intelligence-Based Base Calling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BENTLEY ET AL., NATURE, vol. 456, 2008, pages 53-59

Also Published As

Publication number Publication date
CA3183578A1 (fr) 2022-10-06
AU2022248999A1 (en) 2023-02-02
CN115803816A (zh) 2023-03-14
EP4315343A1 (fr) 2024-02-07

Similar Documents

Publication Publication Date Title
US20210264267A1 (en) Bus Network for Artificial Intelligence-Based Base Caller
US11749380B2 (en) Artificial intelligence-based many-to-many base calling
US20210265015A1 (en) Hardware Execution and Acceleration of Artificial Intelligence-Based Base Caller
US20220067489A1 (en) Detecting and Filtering Clusters Based on Artificial Intelligence-Predicted Base Calls
US20220319639A1 (en) Artificial intelligence-based base caller with contextual awareness
EP4315343A1 (fr) Appelant de base à base d'intelligence artificielle avec reconnaissance contextuelle
US20230005253A1 (en) Efficient artificial intelligence-based base calling of index sequences

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22720805

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 3183578

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2022248999

Country of ref document: AU

Date of ref document: 20220324

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2022720805

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022720805

Country of ref document: EP

Effective date: 20231031

NENP Non-entry into the national phase

Ref country code: DE