CN115803816A - Artificial intelligence-based base detector with context awareness - Google Patents


Info

Publication number
CN115803816A
Authority
CN
China
Prior art keywords
intensity
image
context data
sequencing
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280005054.XA
Other languages
Chinese (zh)
Inventor
A. Kia
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inmair Ltd
Original Assignee
Inmair Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 17/687,586 (published as US 2022/0319639 A1)
Application filed by Inmair Ltd
Publication of CN115803816A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 - Sequence assembly
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B - BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 - Supervised data analysis

Abstract

A neural network processes sequencing images on a block-by-block basis for base detection. The sequencing images depict the intensity emissions of a set of analytes. The blocks depict the intensity emissions of a subset of these analytes and, due to limited base diversity, have a single (indistinguishable) intensity pattern. The neural network has convolution filters whose receptive field is confined to the blocks. The convolution filters detect the intensity patterns in the blocks, with a detection loss caused by the single intensity patterns and the localized receptive field. An intensity contextualization unit determines intensity context data based on intensity values in the images. Data flow logic appends the intensity context data to the sequencing images to generate intensity-contextualized images. The neural network applies the convolution filters to the intensity-contextualized images and generates a base call classification. The intensity context data in the intensity-contextualized images compensates for the detection loss.

Description

Artificial intelligence-based base detector with context awareness
Priority application
The present application claims priority to and the benefit of U.S. non-provisional patent application Ser. No. 17/687,586, entitled "Artificial Intelligence-Based Base Caller with Contextual Awareness", filed March 4, 2022 (attorney docket No. ILLM 1033-2/IP-2007-US), and U.S. provisional patent application Ser. No. 63/169,163, entitled "Artificial Intelligence-Based Base Caller with Contextual Awareness", filed March 31, 2021 (attorney docket No. ILLM 1033-1/IP-2007-PRV).
Technical Field
The technology disclosed herein relates to artificial intelligence type computers and digital data processing systems, and to corresponding data processing methods and products for the emulation of intelligence (i.e., knowledge-based systems, inference systems, and knowledge acquisition systems), including systems for reasoning with uncertainty (e.g., fuzzy logic systems), adaptive systems, machine learning systems, and artificial neural networks. In particular, the disclosed technology relates to the use of neural networks, such as convolutional neural networks, for analyzing data.
Incorporation of documents
The following documents are incorporated by reference, i.e., as if fully set forth herein, for all purposes:
U.S. patent application No. 62/979,384, entitled "ARTIFICIAL INTELLIGENCE-BASED BASE CALLING OF INDEX SEQUENCES", filed February 20, 2020 (attorney docket No. ILLM 1015-1/IP-1857-PRV);
U.S. patent application No. 62/979,414, entitled "ARTIFICIAL INTELLIGENCE-BASED MANY-TO-MANY BASE CALLING", filed February 20, 2020 (attorney docket No. ILLM 1016-1/IP-1858-PRV);
U.S. patent application No. 62/979,385, entitled "KNOWLEDGE DISTILLATION-BASED COMPRESSION OF ARTIFICIAL INTELLIGENCE-BASED BASE CALLER", filed February 20, 2020 (attorney docket No. ILLM 1017-1/IP-1859-PRV);
U.S. patent application No. 63/072,032, entitled "DETECTING AND FILTERING CLUSTERS BASED ON ARTIFICIAL INTELLIGENCE-PREDICTED BASE CALLS", filed August 28, 2020 (attorney docket No. ILLM 1018-1/IP-1860-PRV);
U.S. patent application No. 62/979,412, entitled "MULTI-CYCLE CLUSTER BASED REAL TIME ANALYSIS SYSTEM", filed February 20, 2020 (attorney docket No. ILLM 1020-1/IP-1866-PRV);
U.S. patent application No. 62/979,411, entitled "DATA COMPRESSION FOR ARTIFICIAL INTELLIGENCE-BASED BASE CALLING", filed February 20, 2020 (attorney docket No. ILLM 1029-1/IP-1964-PRV);
U.S. patent application No. 17/179,395, entitled "DATA COMPRESSION FOR ARTIFICIAL INTELLIGENCE-BASED BASE CALLING", filed February 18, 2021 (attorney docket No. ILLM 1029-2/IP-1964-US);
U.S. patent application No. 62/979,399, entitled "SQUEEZING LAYER FOR ARTIFICIAL INTELLIGENCE-BASED BASE CALLING", filed February 20, 2020 (attorney docket No. ILLM 1030-1/IP-1982-PRV);
U.S. patent application No. 17/180,480, entitled "SPLIT ARCHITECTURE FOR ARTIFICIAL INTELLIGENCE-BASED BASE CALLER", filed February 19, 2021 (attorney docket No. ILLM 1030-2/IP-1982-US);
U.S. patent application No. 17/180,513, entitled "BUS NETWORK FOR ARTIFICIAL INTELLIGENCE-BASED BASE CALLER", filed February 19, 2021 (attorney docket No. ILLM 1031-2/IP-1965-US);
U.S. patent application No. 16/825,987, entitled "TRAINING DATA GENERATION FOR ARTIFICIAL INTELLIGENCE-BASED SEQUENCING", filed March 20, 2020 (attorney docket No. ILLM 1008-16/IP-1693-US);
U.S. patent application No. 16/825,991, entitled "ARTIFICIAL INTELLIGENCE-BASED GENERATION OF SEQUENCING METADATA", filed March 20, 2020 (attorney docket No. ILLM 1008-17/IP-1741-US);
U.S. patent application No. 16/826,126, entitled "ARTIFICIAL INTELLIGENCE-BASED BASE CALLING", filed March 20, 2020 (attorney docket No. ILLM 1008-18/IP-1744-US);
U.S. patent application No. 16/826,134, entitled "ARTIFICIAL INTELLIGENCE-BASED QUALITY SCORING", filed March 20, 2020 (attorney docket No. ILLM 1008-19/IP-1747-US); and
U.S. patent application No. 16/826,168, entitled "ARTIFICIAL INTELLIGENCE-BASED SEQUENCING", filed March 21, 2020 (attorney docket No. ILLM 1008-20/IP-1752-PRV-US).
Background
The subject matter discussed in this section should not be admitted to be prior art merely by virtue of its mention in this section. Similarly, the problems mentioned in this section or associated with the subject matter provided as background should not be considered as having been previously recognized in the prior art. The subject matter in this section is merely representative of various approaches that may themselves correspond to implementations of the claimed technology.
Convolutional neural networks are currently the most advanced machine learning algorithms for many tasks in computer vision, such as classification or segmentation. Training convolutional neural networks requires a large amount of computer memory, which grows exponentially as image size increases. Computer memory becomes a limiting factor because the back propagation algorithm used to optimize the deep neural network requires storage of intermediate activations. Since the size of these intermediate activations in the convolutional neural network increases in proportion to the input size, the memory is quickly filled with large images.
The problem of large images is circumvented by downsampling the original image or by processing it on a block-by-block basis. Both approaches have significant disadvantages: the former results in a loss of local detail, while the latter results in a loss of global context information. The receptive field of a convolution filter of the convolutional neural network is at most the size of a block. The convolution filters ignore the spatial relationships between blocks, which limits the incorporation of context information from outside the subject block.
Thus, there is an opportunity to improve base detection by analyzing both local detail in the block and the global context outside the block. More accurate base detection may result in reduced error rates.
Drawings
In the drawings, like reference characters generally refer to like parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the disclosed technology. In the following description, various embodiments of the disclosed technology are described with reference to the following drawings, in which:
FIG. 1 is a simplified block diagram illustrating aspects of the disclosed technology.
Figure 2 illustrates one implementation of accessing sequencing images for base detection on a block-by-block basis.
FIG. 3 illustrates one implementation of generating an intensity contextualized image.
FIG. 4 shows one example of a global image from which the neural network accesses blocks such that the blocks are centered on a target cluster to be base detected.
FIG. 5 depicts one implementation of an intensity contextualization unit with multiple convolution pipelines.
FIG. 6 illustrates one implementation of a neural network that processes intensity contextualization blocks and generates base detections.
Figure 7 illustrates one implementation of a neural network processing previous, current, and subsequent intensity contextualized images of multiple sequencing cycles and generating base detections.
Fig. 8A and Fig. 8B show that, in base detection, the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit outperforms another neural network-based base detector (DeepRTA) and a non-neural-network base detector (RTA).
Fig. 9 illustrates the base detection error rates observed for various combinations (configurations) of filter size (or kernel size), step size, and filter bank size (K) of the convolution filters of the disclosed neural network-based base detector.
FIG. 10 compares the base detection error rates of DeepRTA with the base detection error rates of different filter bank size configurations (K0) of the disclosed neural network-based base detectors (DeepRTA-K0-04, DeepRTA-K0-06, DeepRTA-K0-10, DeepRTA-K0-16, DeepRTA-K0-18, and DeepRTA-K0-20) configured with the disclosed intensity contextualization units.
FIG. 11 shows the base detection error rate (fitted line with "∘") when the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit extracts intensity context data from a raw input image of size 115 × 115, versus the base detection error rate (fitted line with "□") when the intensity context data is extracted from a raw input image of size 160 × 160.
FIG. 12 compares the base detection accuracy (1 - base detection error rate) of different configurations of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit (i.e., DeepRTA-K0-06, DeepRTA-349-K0-10-160p, DeepRTA-K0-16-Lanczos, DeepRTA-K0-18, and DeepRTA-K0-20) against that of DeepRTA for homopolymers (e.g., GGGGG) and flanking homopolymers (e.g., GGTGG).
FIG. 13 compares the base detection error rate of the disclosed neural network-based base detector ("DeepRTA-V2:349"), configured with the disclosed intensity contextualization unit and trained on normalized sequencing images, against DeepRTA, RTA, a version of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit that is trained on normalized sequencing images and performs inference on normalized sequencing images, and a DeepRTA ("DeepRTA-norm") that is trained on normalized sequencing images and performs inference on normalized sequencing images.
Fig. 14A and 14B depict one implementation of a sequencing system. The sequencing system includes a configurable processor.
Figure 14C is a simplified block diagram of a system for analyzing sensor data (such as base call sensor output) from a sequencing system.
FIG. 15 is a simplified diagram illustrating aspects of the base detection operation, including the functionality of a runtime program executed by a host processor.
Fig. 16 is a simplified diagram of a configuration of a configurable processor, such as the configurable processor depicted in fig. 14C.
FIG. 17 is a computer system that may be used to implement the disclosed technology.
Detailed Description
The following discussion is presented to enable any person skilled in the art to make and use the disclosed techniques, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed implementations will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other implementations and applications without departing from the spirit and scope of the disclosed technology. Thus, the disclosed technology is not intended to be limited to the particular implementations shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
The detailed description of various embodiments will be better understood when read in conjunction with the appended drawings. To the extent that the figures illustrate diagrams of the functional blocks of various implementations, the functional blocks are not necessarily indicative of the division between hardware circuitry. Thus, for example, one or more of the functional blocks (e.g., modules, processors or memories) may be implemented in a single piece of hardware (e.g., a general purpose signal processor or a block of random access memory, hard disk, or the like) or multiple pieces of hardware. Similarly, the programs may be stand alone programs, may be incorporated as subroutines in an operating system, may be functions in an installed software package, and the like. It should be understood that the various implementations are not limited to the arrangements and instrumentality shown in the drawings.
The processing engines and databases designated as modules in the figures may be implemented in hardware or software and need not be partitioned into exactly the same blocks as shown in the figures. Some of these modules may also be implemented on different processors, computers, or servers, or distributed among multiple different processors, computers, or servers. Further, it should be understood that some of the modules may be combined, operated in synchronization, or operated in a different sequence than shown in the figures without affecting the functionality achieved. The blocks in the figures may also be considered as steps of a flow chart in a method. A module also does not necessarily need to have all its code placed in memory in succession; some portions of code may be separated from other portions of code with code from other modules or other functions disposed between them.
The disclosed technology provides an artificial intelligence-based base detector with context awareness. FIG. 1 is a simplified block diagram illustrating aspects of the disclosed technology. FIG. 1 includes an image 102, data flow logic 104, an intensity contextualization unit 112 (also referred to herein as a "patch processing unit (PPU)"), intensity context data 122, an intensity contextualized image 114, a neural network 124 (or neural network-based base detector), and a base detection 134. The system may be formed of one or more programmed computers, where the programming is stored on one or more machine-readable media and the code is executed to perform one or more steps of the methods described herein. In the illustrated implementation, for example, the system includes data flow logic 104 configured to output the intensity contextualized image 114 as digital image data, such as image data representing individual picture elements or pixels that together form an image of an array or other object.
Sequencing images
Base calling is the process of determining the nucleotide composition of a sequence. Base calling involves analyzing image data, i.e., sequencing images, generated during sequencing runs (or sequencing reactions) performed by sequencing instruments such as Illumina's iSeq, HiSeqX, HiSeq 3000, HiSeq 4000, HiSeq 2500, NovaSeq 6000, NextSeq 550, NextSeq 1000, NextSeq 2000, NextSeqDx, MiSeq, and MiSeqDx.
The following discussion outlines a method of generating a sequencing image and what is depicted therein according to one implementation.
Base calling decodes the intensity data encoded in the sequencing images into nucleotide sequences. In one implementation, the Illumina sequencing platforms employ cyclic reversible termination (CRT) chemistry for base calling. This process relies on growing a nascent strand complementary to the template strand with fluorescently labeled nucleotides while tracking the emission signal of each newly added nucleotide. The fluorescently labeled nucleotides have a 3' removable block that anchors the fluorophore signal of the nucleotide type.
Sequencing is performed in repeated cycles, each cycle comprising three steps: (a) Extending the nascent strand by adding a fluorescently labeled nucleotide; (b) Exciting fluorophores using one or more lasers of an optical system of the sequencing instrument and imaging through different filters of the optical system, thereby generating a sequencing image; and (c) cleaving the fluorophore and removing the 3' block in preparation for the next sequencing cycle. The incorporation and imaging cycles are repeated until a specified number of sequencing cycles are reached, thereby defining the read length. Using this method, each cycle interrogates a new location along the template chain.
The tremendous power of Illumina sequencers stems from their ability to simultaneously perform and sense millions or even billions of clusters (also referred to as "analytes") that undergo CRT reactions. A cluster includes about one thousand identical copies of the template strand, but clusters vary in size and shape. Clusters are grown from the template strand by bridge or exclusion amplification of the input library prior to the sequencing run. The purpose of amplification and cluster growth is to increase the intensity of the emitted signal, since the imaging device cannot reliably sense the fluorophore signal of a single strand. However, the strands within a cluster are physically close together, so the imaging device perceives the cluster of strands as a single spot.
Sequencing occurs in a flow cell (or biosensor), i.e., a small slide that holds the input strands. The flow cell is connected to an optical system comprising microscope imaging, excitation lasers, and fluorescence filters. A flow cell comprises a plurality of chambers called lanes. The lanes are physically separated from each other and can contain different tagged sequencing libraries that can be distinguished without sample cross-contamination. In some implementations, the flow cell includes a patterned surface. A "patterned surface" refers to the arrangement of different regions in or on an exposed layer of a solid support.
An imaging device of the sequencing instrument (e.g., a solid-state imaging device such as a charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) sensor) takes snapshots at multiple locations along a lane in a series of non-overlapping regions referred to as tiles. For example, there may be sixty-four or ninety-six tiles per lane. A tile holds hundreds of thousands to millions of clusters.
The output of the sequencing run is the sequencing images. Sequencing images use a grid (or array) of pixelated units (e.g., pixels, superpixels, subpixels) to depict the intensity emissions of the clusters and their surrounding background. The intensity emissions are stored as intensity values of the pixelated units. The sequencing image has dimensions w × h of the grid of pixelated units, where w (width) and h (height) are any number in the range of 1 to 100,000 (e.g., 115 × 115, 200 × 200, 1800 × 2000, 2200 × 25000, 2800 × 3600, 4000 × 400). In some implementations, w and h are the same. In other implementations, w and h are different. The sequencing images depict the intensity emissions generated as a result of nucleotide incorporation into the nucleotide sequences during the sequencing run. The intensity emissions come from the associated clusters and their surrounding background.
FIG. 2 illustrates one implementation of accessing the sequencing image 202 for base detection on a block-by-block basis 220. In the illustrated example, the data flow logic 104 provides the sequencing image 202 to the neural network 124 for base detection. The neural network 124 accesses the sequencing image 202, e.g., blocks 202a, 202b, 202c, and 202d, on a block-by-block basis 220. Each of the blocks is a sub-grid (or sub-array) of pixelated units in the grid of pixelated units that forms the sequencing image 202. A block has a size q × r of a sub-grid of pixelated units, where q (width) and r (height) may be, for example, 1 × 1, 3 × 3, 5 × 5, 7 × 7, 10 × 10, 15 × 15, 25 × 25, etc. In some implementations, q and r are the same. In other implementations, q and r are different. In some implementations, the blocks are the same size. In other implementations, the blocks have different sizes. In some implementations, blocks may have overlapping pixelated units (e.g., on the edges).
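By way of illustration only, and not as part of the disclosed system, the following minimal Python/NumPy sketch shows what block-by-block access of a sequencing image can look like when the image is held as a 2D array; the 15 × 15 block size and the skipping of partial edge blocks are assumptions made for brevity.

```python
import numpy as np

def iter_blocks(image, q=15, r=15):
    """Yield (row, col, block) for non-overlapping q x r sub-grids of a 2D image.

    image: 2D array of per-pixel intensity values (one image channel).
    Partial blocks at the right/bottom edges are skipped here for brevity.
    """
    h, w = image.shape
    for top in range(0, h - r + 1, r):
        for left in range(0, w - q + 1, q):
            yield top, left, image[top:top + r, left:left + q]

# Example: a 115 x 115 single-channel sequencing image split into 15 x 15 blocks.
sequencing_image = np.random.rand(115, 115).astype(np.float32)
blocks = list(iter_blocks(sequencing_image))
print(len(blocks), blocks[0][2].shape)  # 49 blocks, each of shape (15, 15)
```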
In the illustrated example, the sequencing image 202 depicts the intensity emissions of a set of twenty-eight clusters 1-28. The blocks depict intensity emissions of a subset of the clusters. For example, block 202a depicts substantially the intensity emissions of a first subset of seven clusters 1, 2, 3, 4, 5, 10, and 16; block 202b depicts substantially the intensity emissions of a second subset of eight clusters 15, 16, 19, 20, 21, 22, 25, and 26; block 202c depicts substantially intensity emissions of a third subset of eight clusters 5, 6, 7, 8, 9, 12, 13, and 14; and block 202d depicts substantially the intensity emissions of a fourth subset of the nine clusters 13, 14, 17, 18, 22, 23, 24, 27, and 28.
Sequencing generates m sequencing images per sequencing cycle for the corresponding m image channels. That is, each of the images 102 has one or more image (or intensity) channels (similar to the red, green, blue (RGB) channels of a color image). In one implementation, each image channel corresponds to one of a plurality of filter wavelength bands. In another implementation, each image channel corresponds to one imaging event of a plurality of imaging events in a sequencing cycle. In yet another implementation, each image channel corresponds to a combination of illumination with a particular laser and imaging through a particular optical filter. The block is accessed from each of the m image channels for a particular sequencing cycle. In different implementations, such as four-channel chemistry, two-channel chemistry, and single-channel chemistry, m is 4 or 2. In other implementations, m is 1, 3, or greater than 4.
For example, consider sequencing using two different image channels: a blue channel and a green channel. Then, at each sequencing cycle, the sequencing produces a blue image and a green image. Thus, for a series of k sequencing cycles, a sequence of k pairs of blue and green images is produced as output and stored as the images 102. A sequence of k sequencing cycles of a sequencing run therefore generates a sequence of per-cycle image patches. The per-cycle image patches contain intensity data for the associated clusters and their surrounding background in one or more image channels (e.g., red and green channels). In one implementation, when a single target cluster is to be base detected, the per-cycle image patches are centered on a center pixel that contains intensity data of the target associated cluster, and the non-center pixels in the per-cycle image patches contain intensity data of associated clusters adjacent to the target associated cluster.
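As an illustration of the per-cycle image patch described above, the following sketch extracts a patch centered on the pixel of a hypothetical target cluster from a two-channel (e.g., blue and green) cycle image; the array layout, coordinates, and patch size are assumptions, not taken from the disclosure.

```python
import numpy as np

def centered_patch(channels, center_row, center_col, size=15):
    """Extract a size x size patch centered on the target cluster's pixel.

    channels: array of shape (num_channels, H, W), e.g. the blue and green
              images of one sequencing cycle.
    Returns an array of shape (num_channels, size, size).
    """
    half = size // 2
    return channels[:,
                    center_row - half:center_row + half + 1,
                    center_col - half:center_col + half + 1]

cycle_images = np.random.rand(2, 115, 115).astype(np.float32)  # blue + green
patch = centered_patch(cycle_images, center_row=57, center_col=57, size=15)
print(patch.shape)  # (2, 15, 15)
```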
Due to the limited base diversity of the clusters in the subset, the blocks have a single (indistinguishable) intensity pattern. Compared to the full image, a block is smaller and contains fewer clusters, which in turn reduces base diversity. The blocks exhibit low base diversity because, compared to the full image, they depict fewer of the distinct intensity patterns of the different base types A, C, T, and G. A block may depict a low-complexity base pattern in which some of the four bases A, C, T, and G are represented at a frequency of less than 15%, 10%, or 5% of all nucleotides. The low diversity of the oligonucleotides in the block produces an intensity pattern that lacks signal diversity (contrast), i.e., a single intensity pattern.
Intensity contextualization unit (patch processing unit)
FIG. 3 illustrates one implementation of generating an intensity contextualized image 114. To compensate for the lack of intensity diversity in the block, the intensity contextualization unit 112 generates intensity context data 122 from the image 102 and makes the intensity context data 122 available for incorporation into the block.
The intensity contextualization unit 112 is configured with feature extraction logic that is applied to the intensity values in the image 102 to generate the intensity context data 122. The feature extraction logic determines summary statistics of the intensity values in the image 102. Examples of summary statistics include maximum, minimum, mean, mode, standard deviation, variance, skewness, kurtosis, percentiles, and entropy. In other implementations, the feature extraction logic determines intermediate statistics based on the summary statistics. Examples of intermediate statistics include increments, sums, a series of maxima, a series of minima, the minimum of the maxima in the series, and the maximum of the minima in the series.
The intensity context data 122 specifies summary statistics of intensity values. In one implementation, the intensity context data 122 identifies the maximum of the intensity values. In one implementation, the intensity context data 122 identifies a minimum value among the intensity values. In one implementation, the intensity context data 122 identifies an average of the intensity values. In one implementation, the intensity context data 122 identifies a pattern of intensity values. In one implementation, the intensity context data 122 identifies a standard deviation of the intensity values. In one implementation, the intensity context data 122 identifies a variance of the intensity values. In one implementation, the intensity context data 122 identifies skewness of intensity values. In one implementation, the intensity context data 122 identifies kurtosis of intensity values. In one implementation, the intensity context data 122 identifies the entropy of the intensity values.
In one implementation, the intensity context data 122 identifies one or more percentiles of the intensity values. In one implementation, the intensity context data 122 identifies an increment between at least one of a maximum and a minimum, a maximum and a mean, a mean and a minimum, and a higher one of the percentiles and a lower one of the percentiles. In one implementation, the intensity context data 122 identifies a sum of intensity values. In one implementation, the intensity contextualization unit 112 determines a plurality (or series) of maximum values by dividing the intensity values into a plurality of groups and determining a maximum value for each of the groups. The intensity context data 122 identifies a minimum value of a plurality of maximum values.
In one implementation, the intensity contextualization unit 112 determines a plurality (or series) of minimum values by dividing the intensity values into a plurality of groups and determining a minimum value for each of the groups. The intensity context data 122 identifies the maximum of a plurality of minima. In one implementation, the intensity contextualization unit 112 determines the plurality of sums by dividing the intensity values into a plurality of groups and determining a sum of the intensity values in each of the groups. The intensity context data 122 identifies the minimum of the plurality of sums. In other implementations, the intensity context data 122 identifies a maximum of a plurality of sums. In yet other implementations, the intensity context data 122 identifies an average of a plurality of sums.
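The summary and intermediate statistics listed above can be illustrated with a short NumPy sketch; the particular selection and ordering of statistics below is an arbitrary example, not the feature set actually learned or used by the intensity contextualization unit 112.

```python
import numpy as np

def intensity_context_features(image, num_groups=4, percentiles=(10, 90)):
    """Compute example summary statistics over an image's intensity values.

    image: 2D array of intensities. Returns a 1D feature vector that could
    serve as a hand-crafted stand-in for intensity context data.
    """
    values = image.ravel()
    lo_pct, hi_pct = np.percentile(values, percentiles)
    groups = np.array_split(values, num_groups)    # divide intensities into groups
    group_maxima = [g.max() for g in groups]       # a series of maxima
    group_minima = [g.min() for g in groups]       # a series of minima
    return np.array([
        values.max(), values.min(), values.mean(), values.std(),
        values.sum(),
        values.max() - values.min(),               # increment (max - min)
        hi_pct - lo_pct,                           # increment between percentiles
        min(group_maxima),                         # minimum of the maxima
        max(group_minima),                         # maximum of the minima
    ], dtype=np.float32)

features = intensity_context_features(np.random.rand(115, 115))
print(features.shape)  # (9,)
```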
The intensity context data 122 includes numerical values (e.g., floating point numbers or integers) determined (or calculated) from intensity values in the image 102. In one implementation, the values in the intensity context data 122 are features or feature maps generated as a result of applying a convolution operation to the image 102. Features in the intensity context data 122 may be stored as pixelated units (e.g., pixels, super-pixels, sub-pixels) that include respective values.
In one implementation, the intensity contextualization unit 112 is a multilayer perceptron (MLP). In another implementation, the intensity contextualization unit 112 is a feed-forward neural network. In another implementation, the intensity contextualization unit 112 is a fully connected neural network. In further implementations, the intensity contextualization unit 112 is a fully convolutional neural network. In yet further implementations, the intensity contextualization unit 112 is a semantic segmentation neural network. In yet another implementation, the intensity contextualization unit 112 is a generative adversarial network (GAN).
In one implementation, the intensity contextualization unit 112 is a convolutional neural network (CNN) having a plurality of convolution layers. In another implementation, it is a recurrent neural network (RNN), such as a long short-term memory network (LSTM), a bidirectional LSTM (Bi-LSTM), or a gated recurrent unit (GRU) network. In yet another implementation, it includes both a CNN and an RNN.
In other implementations, the intensity contextualization unit 112 may use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transposed convolutions, depthwise separable convolutions, pointwise convolutions, 1 × 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatially separable convolutions, and deconvolutions. It may use one or more loss functions, such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss. It may use any parallelism, efficiency, and compression scheme, such as TFRecords, compressed encoding (e.g., PNG), sharding, parallel calls for map transformation, batching, prefetching, model parallelism, data parallelism, and synchronous/asynchronous stochastic gradient descent (SGD). It may include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (e.g., LSTMs or GRUs), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions such as rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid, and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout layers, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
The intensity contextualization unit 112 is trained using backpropagation-based gradient update techniques. Exemplary gradient descent techniques that may be used to train the intensity contextualization unit 112 include stochastic gradient descent, batch gradient descent, and mini-batch gradient descent. Some examples of gradient descent optimization algorithms that may be used to train the intensity contextualization unit 112 are Momentum, Nesterov accelerated gradient, Adagrad, Adadelta, RMSprop, Adam, AdaMax, Nadam, and AMSGrad.
In one implementation, the initial version of the intensity context data generated by the intensity contextualization unit 112 has different spatial dimensions than the image 102 (e.g., the full image 202). In this case, the initial version of the intensity context data produced by the intensity contextualization unit 112 is further processed to generate intensity context data 122 that can be appended to the full image 202. In one implementation, the intensity context data 122 being "appendable" to the full image 202 means that the two have matching or similar spatial dimensions, i.e., width and height. The initial version of the intensity context data may be converted into appendable intensity context data 122 by using size-enhancement techniques such as upsampling, deconvolution, transposed convolution, dilated convolution, concatenation, and padding (e.g., when the spatial sizes of the two do not completely match).
For example, the size of the initial version of the intensity context data may be 1 × 1, 3 × 3, or 5 × 5, while the size of the full image 202 is 115 × 115. In this case, the initial version of the intensity context data is copied (or cloned) such that the clones are concatenated to form intensity context data 122 with a spatial size that matches the full image 202. Consider, for example, the case where the spatial dimension of the initial version of the intensity context data is 1 × 1 and the size of the full image 202 is 115 × 115. Then additional clones of the initial version of the intensity context data are generated and concatenated with each other and with the 1 × 1 initial version to form a 115 × 115 grid whose spatial dimensions match the 115 × 115 full image 202. This 115 × 115 grid constitutes the intensity context data 122.
In some implementations, the intensity context data 122 includes a plurality of context channels. Each context channel of the plurality of context channels is constructed from a respective feature of the plurality of features generated by the intensity contextualization unit 112. Consider, for example, the case where the intensity contextualization unit 112 generates six 1 × 1 initial versions of the intensity context data. Then six 115 × 115 context channels are generated using concatenation to construct the intensity context data 122.
The data flow logic 104 appends the intensity context data 122 to the image 102 to generate the intensity contextualized image 114. In one implementation, the intensity context data 122 includes a plurality of context channels, where each context channel has the same spatial dimensions as the image 202. Consider a full image with two image channels forming a first grid (or array) of pixelated units of size 115 × 115 and depth 2. Consider further the case where the intensity context data 122 has six context channels forming a second grid of pixelated units of size 115 × 115 and depth 6. The first and second grids of pixelated units are then appended (or attached) on a per-pixelated-unit basis to form a single grid of pixelated units of size 115 × 115 and depth 8, referred to herein as the intensity contextualized image 114. Thus, each of the intensity contextualized images has eight channels: two image channels from the full image and six context channels from the intensity context data 122.
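A minimal sketch of this appending step, assuming six scalar (1 × 1) context features and a two-channel 115 × 115 image: broadcasting stands in for the cloning/concatenation, and the result is an 8-channel intensity contextualized image. Names and shapes are illustrative only.

```python
import numpy as np

def contextualize(image, context_features):
    """Append tiled intensity context channels to a multi-channel image.

    image:            array of shape (C_img, H, W), e.g. (2, 115, 115).
    context_features: 1D array of K scalar features (initial 1 x 1 versions).
    Returns an intensity-contextualized image of shape (C_img + K, H, W).
    """
    _, h, w = image.shape
    # Clone each 1 x 1 feature across the full spatial grid.
    context_channels = np.broadcast_to(
        context_features[:, None, None], (context_features.size, h, w))
    return np.concatenate([image, context_channels], axis=0)

image = np.random.rand(2, 115, 115).astype(np.float32)   # blue + green channels
features = np.random.rand(6).astype(np.float32)          # six 1 x 1 context outputs
contextualized = contextualize(image, features)
print(contextualized.shape)  # (8, 115, 115)
```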
The data flow logic 104 provides the intensity contextualized images 114 as input to the neural network 124, which accesses them on a block-by-block basis 220. The input to the neural network 124 includes intensity contextualized images of a plurality of sequencing cycles (e.g., a current sequencing cycle, one or more previous sequencing cycles, and one or more subsequent sequencing cycles). In one implementation, the input to the neural network 124 comprises intensity contextualized images of three sequencing cycles, such that the intensity contextualized image of the current (time t) sequencing cycle to be base detected is accompanied by: (i) the intensity contextualized image of the left flanking/context/previous (time t-1) sequencing cycle and (ii) the intensity contextualized image of the right flanking/context/next/subsequent (time t+1) sequencing cycle. In another implementation, the input to the neural network 124 comprises intensity contextualized images of five sequencing cycles, such that the intensity contextualized image of the current (time t) sequencing cycle to be base detected is accompanied by: (i) the intensity contextualized image of the first left flanking/context/previous (time t-1) sequencing cycle; (ii) the intensity contextualized image of the second left flanking/context/previous (time t-2) sequencing cycle; (iii) the intensity contextualized image of the first right flanking/context/next/subsequent (time t+1) sequencing cycle; and (iv) the intensity contextualized image of the second right flanking/context/next/subsequent (time t+2) sequencing cycle. In yet another implementation, the input to the neural network 124 comprises intensity contextualized images of seven sequencing cycles, such that the intensity contextualized image of the current (time t) sequencing cycle to be base detected is accompanied by: (i) the intensity contextualized image of the first left flanking/context/previous (time t-1) sequencing cycle; (ii) the intensity contextualized image of the second left flanking/context/previous (time t-2) sequencing cycle; (iii) the intensity contextualized image of the third left flanking/context/previous (time t-3) sequencing cycle; (iv) the intensity contextualized image of the first right flanking/context/next/subsequent (time t+1) sequencing cycle; (v) the intensity contextualized image of the second right flanking/context/next/subsequent (time t+2) sequencing cycle; and (vi) the intensity contextualized image of the third right flanking/context/next/subsequent (time t+3) sequencing cycle. In other implementations, the input to the neural network 124 includes the intensity contextualized image of a single sequencing cycle. In yet other implementations, the input to the neural network 124 includes intensity contextualized images of 58, 75, 92, 130, 168, 175, 209, 225, 230, 275, 318, 325, 330, 525, or 625 sequencing cycles.
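For illustration, a hypothetical helper that stacks the per-cycle intensity contextualized images into a single multi-cycle input (here the three-cycle case, t-1, t, t+1); the stacking axis and array shapes are assumptions.

```python
import numpy as np

def multi_cycle_input(per_cycle_images, t, flank=1):
    """Stack intensity-contextualized images for cycles t-flank .. t+flank.

    per_cycle_images: list indexed by cycle, each of shape (C, H, W).
    Returns an array of shape (2*flank + 1, C, H, W).
    """
    window = per_cycle_images[t - flank:t + flank + 1]
    return np.stack(window, axis=0)

cycles = [np.random.rand(8, 115, 115).astype(np.float32) for _ in range(5)]
net_input = multi_cycle_input(cycles, t=2, flank=1)   # cycles t-1, t, t+1
print(net_input.shape)  # (3, 8, 115, 115)
```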
In another implementation, the sequencing image from the current (time t) sequencing cycle is accompanied by the sequencing image from the previous (time t-1) sequencing cycle and the sequencing image from the subsequent (time t+1) sequencing cycle. According to one implementation, the neural network-based base detector 124 processes the sequencing images through its convolution layers and generates an alternative representation. The output layer (e.g., a softmax layer) then uses the alternative representation to generate base detections for only the current (time t) sequencing cycle or for each of the sequencing cycles (i.e., the current (time t) sequencing cycle, the previous (time t-1) sequencing cycle, and the subsequent (time t+1) sequencing cycle). The resulting base detections form sequencing reads.
Neural network based base detection
According to one implementation, the neural network-based base detector 124 processes the intensity contextualized images 114 through its convolution layers and generates an alternative representation. The output layer (e.g., a softmax layer) then uses the alternative representation to generate base detections for only the current (time t) sequencing cycle or for each of the sequencing cycles (i.e., the current (time t) sequencing cycle, the previous (time t-1) sequencing cycle, and the subsequent (time t+1) sequencing cycle). The resulting base detections form sequencing reads and are stored as base detections 134.
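As a toy illustration of how per-cycle softmax outputs could be decoded into a sequencing read for one target cluster (the class ordering A, C, G, T is an assumption, not taken from the disclosure):

```python
import numpy as np

BASES = np.array(["A", "C", "G", "T"])  # assumed class ordering

def decode_base_calls(softmax_per_cycle):
    """softmax_per_cycle: array (num_cycles, 4) of per-cycle class probabilities
    for one target cluster. Returns the called read as a string."""
    calls = BASES[np.argmax(softmax_per_cycle, axis=1)]
    return "".join(calls)

probs = np.random.dirichlet(np.ones(4), size=10)  # 10 sequencing cycles
print(decode_base_calls(probs))                   # e.g. a 10-base read
```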
The neural network-based base detector 124 accesses the intensity contextualized images 114 on a block-by-block basis. Each of the blocks is a sub-grid (or sub-array) of pixelated units in the grid of pixelated units that forms a sequencing image. A block has a size q × r of a sub-grid of pixelated units, where q (width) and r (height) are any number in the range of 1 to 10,000 (e.g., 3 × 3, 5 × 5, 7 × 7, 10 × 10, 15 × 15, 25 × 25, 64 × 64, 78 × 78, 115 × 115). In some implementations, q and r are the same. In other implementations, q and r are different. In some implementations, the blocks extracted from a sequencing image are of the same size. In other implementations, the blocks have different sizes. In some implementations, blocks may have overlapping pixelated units (e.g., on the edges).
In one implementation, the neural network-based base detector 124 outputs a base detection for a single target cluster for a particular sequencing cycle. In another implementation, the neural network-based base detector 124 outputs a base detection for each of a plurality of target clusters for a particular sequencing cycle. In yet another implementation, the neural network-based base detector 124 outputs a base detection for each of a plurality of target clusters for each of a plurality of sequencing cycles, thereby generating a base detection sequence for each target cluster.
In one implementation, the neural network-based base detector 124 is a multilayer perceptron (MLP). In another implementation, the neural network-based base detector 124 is a feed-forward neural network. In yet another implementation, the neural network-based base detector 124 is a fully connected neural network. In another implementation, the neural network-based base detector 124 is a fully convolutional neural network. In yet another implementation, the neural network-based base detector 124 is a semantic segmentation neural network. In yet another implementation, the neural network-based base detector 124 is a generative adversarial network (GAN).
In one implementation, the neural network-based base detector 124 is a convolutional neural network (CNN) having a plurality of convolution layers. In another implementation, the neural network-based base detector 124 is a recurrent neural network (RNN), such as a long short-term memory network (LSTM), a bidirectional LSTM (Bi-LSTM), or a gated recurrent unit (GRU) network. In yet another implementation, the neural network-based base detector 124 includes both a CNN and an RNN.
In other implementations, the neural network-based base detector 124 can use 1D convolutions, 2D convolutions, 3D convolutions, 4D convolutions, 5D convolutions, dilated or atrous convolutions, transposed convolutions, depthwise separable convolutions, pointwise convolutions, 1 × 1 convolutions, group convolutions, flattened convolutions, spatial and cross-channel convolutions, shuffled grouped convolutions, spatially separable convolutions, and deconvolutions. It may use one or more loss functions, such as logistic regression/log loss, multi-class cross-entropy/softmax loss, binary cross-entropy loss, mean squared error loss, L1 loss, L2 loss, smooth L1 loss, and Huber loss. It may use any parallelism, efficiency, and compression scheme, such as TFRecords, compressed encoding (e.g., PNG), sharding, parallel calls for map transformation, batching, prefetching, model parallelism, data parallelism, and synchronous/asynchronous stochastic gradient descent (SGD). It may include upsampling layers, downsampling layers, recurrent connections, gates and gated memory units (e.g., LSTMs or GRUs), residual blocks, residual connections, highway connections, skip connections, peephole connections, activation functions (e.g., non-linear transformation functions such as rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), sigmoid, and hyperbolic tangent (tanh)), batch normalization layers, regularization layers, dropout layers, pooling layers (e.g., max or average pooling), global average pooling layers, and attention mechanisms.
The neural network-based base detector 124 is trained using backpropagation-based gradient update techniques. Exemplary gradient descent techniques that may be used to train the neural network-based base detector 124 include stochastic gradient descent, batch gradient descent, and mini-batch gradient descent. Some examples of gradient descent optimization algorithms that may be used to train the neural network-based base detector 124 are Momentum, Nesterov accelerated gradient, Adagrad, Adadelta, RMSprop, Adam, AdaMax, Nadam, and AMSGrad.
More details regarding the neural network-based base detector 124 may be found in U.S. provisional patent application No. 62/821,766, entitled "ARTIFICIAL INTELLIGENCE-BASED SEQUENCING", filed March 21, 2019 (attorney docket No. ILLM 1008-9/IP-1752-PRV), which is incorporated herein by reference.
In some implementations, the intensity contextualization unit 112 includes an intensity extractor, discriminator, and approximator (e.g., a convolution filter), whose kernel weights or coefficients can be learned (or trained) using a back-propagation-based gradient update technique. In such implementations, the intensity contextualization unit 112 is trained "end-to-end" with the neural network 124, such that the error between the base detection prediction of the neural network 124 and the ground truth base detection is calculated, and the gradient determined from the error is used to update the weights of the neural network 124 and further update the weights of the intensity contextualization unit 112. In this way, the intensity contextualization unit 112 learns to extract from the image 102 those intensity features and contexts that contribute to the correct base detection prediction of the neural network 124.
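The end-to-end training described above can be sketched as follows in PyTorch; the two nn.Sequential modules are placeholders standing in for the intensity contextualization unit 112 and the neural network 124 (the real architectures differ), and the point is only that one loss and one optimizer update the weights of both modules.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the intensity contextualization unit (112)
# and the neural network-based base detector (124); the real architectures differ.
contextualizer = nn.Sequential(nn.Conv2d(2, 6, kernel_size=115), nn.Flatten())  # -> 6 features
base_caller = nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
                            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4))

optimizer = torch.optim.Adam(
    list(contextualizer.parameters()) + list(base_caller.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

images = torch.rand(4, 2, 115, 115)          # batch of two-channel full images
blocks = images[:, :, 50:65, 50:65]          # 15 x 15 blocks around target clusters
labels = torch.randint(0, 4, (4,))           # ground-truth bases (A/C/G/T classes)

optimizer.zero_grad()
context = contextualizer(images)                                  # (4, 6)
context_maps = context[:, :, None, None].expand(-1, -1, 15, 15)   # tile to block size
contextualized_blocks = torch.cat([blocks, context_maps], dim=1)  # (4, 8, 15, 15)

loss = loss_fn(base_caller(contextualized_blocks), labels)
loss.backward()          # gradients flow into both modules ("end to end")
optimizer.step()
```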
Fig. 4 shows one example of a full image 402 from which a block 402a is accessed such that the block 402a is centered on a target cluster 412 (red) to be base-detected. The size of the full image 402 is 115 × 115 pixels, and the size of the block 402a is 15 × 15 pixels.
FIG. 5 depicts one implementation of intensity contextualization unit 112 with multiple convolution pipelines. Each of the convolution pipelines has a plurality of convolution filters. The convolution filters in the plurality of convolution filters have different filter sizes and different filter step sizes. Each of the convolution pipelines processes the image to generate a plurality of convolved representations of the image.
In the illustrated example, the input to the intensity contextualization unit 112 is a full image 502 of size 115 × 115 pixels having two image channels (i.e., a blue image channel and a green image channel). The intensity contextualization unit 112 has n convolution pipelines 502a, ..., 502n. Each convolution pipeline has a series of convolution filters (e.g., 542). In some implementations, the convolution filters in the series of convolution filters of a particular convolution pipeline have different filter (or kernel) sizes. In other implementations, the convolution filters in the series of convolution filters of a particular convolution pipeline have the same filter size. For example, a particular convolution pipeline may have three sets of filters with respective filter sizes of 3 × 3, 3 × 3, and 12 × 12, or of 3 × 3, 4 × 4, and 9 × 9. In other examples, a particular convolution pipeline may have four sets of filters with respective filter sizes of 3 × 3, 3 × 3, 4 × 4, and 9 × 9; of 5 × 5, 3 × 3, 3 × 3, and 7 × 7; of 5 × 5, 4 × 4, 4 × 4, and 5 × 5; or of 5 × 5, 5 × 5, 5 × 5, and 3 × 3.
In some implementations, the convolution filters in a series of convolution filters of a particular convolution pipeline have different step sizes. In one example, a particular convolution pipeline may have three sets of filters such that a filter in a first set of filters uses a step size of 3, a filter in a second set of filters uses a step size of 4, and a filter in a third set of filters uses a step size of 1. In other implementations, the convolution filters in a series of convolution filters of a particular convolution pipeline have the same step size.
The image 502 is fed as input to each of the n convolution pipelines 502a, ..., 502n. Each convolution pipeline processes the image 502, generates successive feature maps, and produces a final output (e.g., convolved representations 512a, ..., 512n of size 1 × 1). Because the kernel weights or coefficients of the convolution filters differ across the convolution pipelines, the corresponding final outputs of the convolution pipelines also differ and thus encode different intensity features or contexts determined from the intensity values in the image 502. In this way, a plurality of intensity features and contexts is determined from the image 502 by using a plurality of convolution pipelines configured with different convolution coefficients (or kernel weights). Each of the final outputs is composed of one or more pixelated units (e.g., pixels, superpixels, subpixels).
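A hedged PyTorch sketch of such parallel convolution pipelines: each pipeline uses kernel sizes 3, 4, and 9 with strides 3, 4, and 1 (one of the filter-size/stride combinations mentioned above) to reduce a 115 × 115 two-channel image to a 1 × 1 final output; the number of pipelines and hidden channels are assumptions. The cloner and concatenator described next would then expand these 1 × 1 outputs to full spatial size.

```python
import torch
import torch.nn as nn

def make_pipeline(hidden=8):
    # Kernel sizes 3, 4, 9 with strides 3, 4, 1 reduce a 115 x 115 input
    # to a single 1 x 1 output: 115 -> 38 -> 9 -> 1.
    return nn.Sequential(
        nn.Conv2d(2, hidden, kernel_size=3, stride=3), nn.ReLU(),
        nn.Conv2d(hidden, hidden, kernel_size=4, stride=4), nn.ReLU(),
        nn.Conv2d(hidden, 1, kernel_size=9, stride=1),
    )

pipelines = nn.ModuleList([make_pipeline() for _ in range(6)])  # n = 6 pipelines

image = torch.rand(1, 2, 115, 115)          # blue + green channels of a full image
finals = [p(image) for p in pipelines]      # six final outputs of shape (1, 1, 1, 1)
context = torch.cat(finals, dim=1)          # (1, 6, 1, 1), one value per pipeline
print(context.shape)
```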
In some implementations, a cloner 562 and a concatenator 572 clone and concatenate the respective final outputs (e.g., 512a, ..., 512n) of the n convolution pipelines 502a, ..., 502n to expand their spatial dimensions. The cloned and concatenated versions of the respective final outputs then form respective context channels (e.g., 516a, ..., 516n) that are arranged on a pixelated-unit-by-pixelated-unit basis to form intensity context data 122 of size 115 × 115 × 6. The intensity context data 122 is then appended, pixelated unit by pixelated unit, to the image 502 to form an intensity contextualized image 508 of size 115 × 115 × 8, six channels of which are context channels from the intensity context data 122 and two of which are image channels from the image 502.
The data flow logic 104 provides the intensity contextualized image 508 as input to the neural network 124 for base detection; the neural network 124 accesses and base detects the intensity contextualized image 508 on a block-by-block basis 220.
FIG. 6 illustrates one implementation of the neural network 124 processing the intensity contextualized block 614 and generating base detections 134. In the illustrated example, a block 602a of size 15 × 15 is accessed from a full image 602 of size 115 × 115. Intensity context data 604 of size 15 × 15 is then appended pixel by pixel to the block 602a to form the intensity contextualized block 614. The neural network 124 includes a plurality of convolution layers and filters 634 whose receptive field 624 is smaller than the full image 602. Thus, without the intensity context data 604 determined from the full image 602, when the convolution layers and filters 634 analyze the block 602a, their receptive field 624 is limited to the spatial size of the block 602a and therefore does not take into account the portions of the full image 602 that lie outside the block 602a. To compensate for the limited receptive field 624, the intensity context data 604 provides intensity context from distant regions of the image not covered by the block 602a. The convolution layers and filters 634 of the neural network 124 process the intensity contextualized block 614 to generate the base detections 134.
Figure 7 illustrates one implementation of the neural network 124 processing the previous intensity contextualized image 764, the current intensity contextualized image 774, and the subsequent intensity contextualized image 784 of multiple sequencing cycles and generating base detections 134. Image 702 is generated at the previous sequencing cycle t-1 of the sequencing run. Image 712 is generated at the current sequencing cycle t of the sequencing run. Image 722 is generated at the subsequent sequencing cycle t+1 of the sequencing run. A previous block 702a is accessed from the previous image 702, and previous intensity context data 704 is determined from the intensity values in the previous image 702 and appended pixel by pixel to the previous block 702a to form a previous intensity contextualized block 764. A current block 712a is accessed from the current image 712, and current intensity context data 704 is determined from the intensity values in the current image 712 and appended pixel by pixel to the current block 712a to form a current intensity contextualized block 774. A subsequent block 722a is accessed from the subsequent image 722, and subsequent intensity context data 714 is determined from the intensity values in the subsequent image 722 and appended pixel by pixel to the subsequent block 722a to form a subsequent intensity contextualized block 784.
The neural network 124 uses a specialized architecture to segregate the processing of data from different sequencing cycles. The motivation for using the specialized architecture is described first. As discussed above, the neural network 124 processes intensity contextualized images of a current sequencing cycle, one or more previous sequencing cycles, and one or more subsequent sequencing cycles. Data from the additional sequencing cycles provides sequence-specific context. The neural network-based base detector 124 learns the sequence-specific context during training and uses it for base detection. In addition, the data from the previous and subsequent sequencing cycles provides second-order contributions of the pre-phasing and phasing signals to the current sequencing cycle.
However, images captured at different sequencing cycles and in different image channels are misaligned with respect to each other and have residual registration errors. In view of this misalignment, the specialized architecture includes spatial convolution layers that do not mix information between sequencing cycles and only mix information within a sequencing cycle.
The spatial convolution layers use so-called "isolated convolutions" that achieve the isolation by processing the data for each of the plurality of sequencing cycles independently, via a dedicated, unshared sequence of convolutions per cycle. An isolated convolution convolves the data, and the resulting feature maps, of only a given sequencing cycle (i.e., intra-cycle), and does not convolve the data or feature maps of any other sequencing cycle.
Consider, for example, that the input data includes: (i) a current intensity contextualization block of the current (time t) sequencing cycle to be base called; (ii) a previous intensity contextualization block of the previous (time t-1) sequencing cycle; and (iii) a subsequent intensity contextualization block of the subsequent (time t+1) sequencing cycle. The specialized architecture then initiates three separate convolution pipelines, namely a current convolution pipeline, a previous convolution pipeline, and a subsequent convolution pipeline. The current convolution pipeline receives the current intensity contextualization block of the current (time t) sequencing cycle as input and processes it independently through the plurality of spatial convolution layers 784 to produce a so-called "current spatial convolution representation" as the output of its final spatial convolution layer. The previous convolution pipeline receives the previous intensity contextualization block of the previous (time t-1) sequencing cycle as input and processes it independently through the plurality of spatial convolution layers 784 to produce a so-called "previous spatial convolution representation" as the output of its final spatial convolution layer. The subsequent convolution pipeline receives the subsequent intensity contextualization block of the subsequent (time t+1) sequencing cycle as input and processes it independently through the plurality of spatial convolution layers 784 to produce a so-called "subsequent spatial convolution representation" as the output of its final spatial convolution layer.
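A minimal PyTorch sketch of such per-cycle, unshared ("isolated") spatial convolution pipelines is given below. The class name, layer widths, kernel sizes, and the use of PyTorch are assumptions for illustration; the kernel sizes (3, 3, 11) are chosen only so that a 15 × 15 block reduces to a 1 × 1 spatial output under valid convolutions.

```python
import torch
import torch.nn as nn

class IsolatedSpatialConvs(nn.Module):
    """One dedicated, unshared stack of spatial convolutions per sequencing cycle,
    so no information is mixed across cycles (intra-cycle processing only).
    All sizes here are illustrative assumptions."""

    def __init__(self, in_channels, num_cycles=3, kernel_sizes=(3, 3, 11), width=16):
        super().__init__()

        def make_stack():
            layers, channels = [], in_channels
            for k in kernel_sizes:
                layers += [nn.Conv2d(channels, width, kernel_size=k), nn.ReLU()]
                channels = width
            return nn.Sequential(*layers)

        # A separate pipeline per cycle: previous (t-1), current (t), subsequent (t+1).
        self.pipelines = nn.ModuleList([make_stack() for _ in range(num_cycles)])

    def forward(self, per_cycle_blocks):
        # per_cycle_blocks: list of tensors, each (N, in_channels, H, W), one per cycle.
        # Each block is processed only by its own pipeline (isolated convolutions).
        return [pipe(block) for pipe, block in zip(self.pipelines, per_cycle_blocks)]
```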
In some implementations, the current convolution pipeline, the previous convolution pipeline, and the subsequent convolution pipeline are executed in parallel. In some implementations, the spatial convolution layers are part of a spatial convolution network (or sub-network) within the specialized architecture.
The neural network-based base detector 124 also includes temporal convolution layers 794 that mix information between sequencing cycles (i.e., inter-cycle). The temporal convolution layers 794 receive their inputs from the spatial convolution network and operate on the spatial convolution representations produced by the final spatial convolution layers of the corresponding data processing pipelines.
The temporal convolution layers are free to operate across cycles because the misalignment present in the image data fed as input to the spatial convolution network has been purged from the spatial convolution representations by the stack, or cascade, of isolated convolutions performed by the sequence of spatial convolution layers.
The temporal convolution layers 794 use so-called "combined convolutions" that convolve over the input channels of successive inputs on a sliding-window, group-by-group basis. In one implementation, these successive inputs are successive outputs produced by preceding spatial convolution layers or preceding temporal convolution layers.
In some implementations, the temporal convolution layers 794 are part of a temporal convolution network (or sub-network) within the specialized architecture. The temporal convolution network receives its input from the spatial convolution network. In one implementation, a first temporal convolution layer of the temporal convolution network combines, group by group, the spatial convolution representations across sequencing cycles. In another implementation, subsequent temporal convolution layers of the temporal convolution network combine successive outputs of preceding temporal convolution layers. The output of the final temporal convolution layer is fed to an output layer that produces the output used to base call one or more clusters at one or more sequencing cycles.
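As a rough illustration of the "combined convolution" idea, the PyTorch sketch below mixes the per-cycle spatial convolution representations group by group over a sliding window of cycles; the window size, the 1 × 1 kernel, and all names are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class CombinedTemporalConv(nn.Module):
    """Sliding-window, group-by-group mixing of per-cycle spatial representations.
    Consecutive cycles' representations are concatenated along the channel axis
    and convolved together (inter-cycle mixing). Sizes are illustrative."""

    def __init__(self, channels_per_cycle, out_channels, window=2):
        super().__init__()
        self.window = window
        self.conv = nn.Conv2d(window * channels_per_cycle, out_channels, kernel_size=1)

    def forward(self, cycle_reps):
        # cycle_reps: list of tensors, each (N, channels_per_cycle, H, W), ordered by cycle.
        fused = []
        for i in range(len(cycle_reps) - self.window + 1):
            group = torch.cat(cycle_reps[i:i + self.window], dim=1)   # mix a group of cycles
            fused.append(torch.relu(self.conv(group)))
        return fused   # one fused representation per sliding-window position
```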
Performance results as an objective index of creativity and non-obviousness
FIGS. 8A and 8B compare the base detection accuracy of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit (referred to herein as "DeepRTA-V2") with a neural network-based base detector without the disclosed intensity contextualization unit (referred to herein as "DeepRTA"). Additional details regarding DeepRTA can be found in commonly owned U.S. patent application Ser. Nos. 16/825,987, 16/825,991, 16/826,126, 16/826,134, 16/826,168, 62/979,412, 62/979,411, 17/179,395, 62/979,399, 17/180,480, 17/180,513, 62/979,414, 62/979,385, and 63/072,032.
FIGS. 8A and 8B also compare the base detection accuracy of DeepRTA-V2 with a non-neural-network-based base detector without the disclosed intensity contextualization unit (referred to herein as "RTA"). Additional details regarding RTA can be found in commonly owned U.S. patent application Ser. No. 13/006,206. In FIGS. 8A and 8B, the model entitled "DeepRTA-V2 lanczos" is the disclosed neural network-based base detector with the disclosed intensity contextualization unit combined with additional non-linear logic referred to herein as "lanczos".
Thus, in FIGS. 8A and 8B, base detection performance of DeepRTA-V2 is provided in comparison with that of another neural network-based base detector (DeepRTA) and a non-neural network-based base detector (RTA). As shown in fig. 8A and 8B, the excellent base detection performance of DeepRTA-V2 relative to these benchmark models is an objective indicator of the inventive and unobvious nature of the disclosed intensity contextualization units.
In FIGS. 8A and 8B, the y-axis represents the base detection error rate ("error%"). The% error is calculated for a large number of base detections for a large number of clusters (e.g., hundreds or millions of base detections for hundreds or millions of clusters). In addition, in FIGS. 8A and 8B, the x-axis represents the progression of sequencing cycles 20-140 of the sequencing run in which a large number of base detections were made for reads 1 (FIG. 8A) and 2 (FIG. 8B).
In FIGS. 8A and 8B, the % error of RTA is depicted by the fitted line with "O"; the % error of DeepRTA is depicted by the fitted line with "□"; the % error of DeepRTA-V2 is depicted by the fitted line with "Δ"; and the % error of DeepRTA-V2+lanczos is depicted by a fitted line with a fourth marker shown in the figures. As shown in FIGS. 8A and 8B, the base detection error rate of DeepRTA-V2 is lower than that of DeepRTA and RTA. Furthermore, this holds true throughout the progression of sequencing cycles 20-140 for both reads 1 and 2, as the "Δ" and DeepRTA-V2+lanczos fitted lines in FIGS. 8A and 8B remain below the "O" and "□" fitted lines.
FIG. 9 illustrates the base detection error rates observed for various combinations (configurations) of the filter size (or kernel size), stride, and filter bank size (K) of the convolution filters of the disclosed neural network-based base detector 124.
In FIG. 9, "R1C 20" represents the sequencing cycle twenty during sequencing of read 1. In particular, R1C20 represents a large number of base detections for a large number of clusters (e.g., hundreds or millions of base detections for hundreds or millions of clusters) during the twenty sequencing cycles. In fig. 9, R1C20 was used as a representative sequence cycle for early sequencing cycles in the sequencing run.
In FIG. 9, "R1C 80" represents eighty sequencing cycles during sequencing of read 1. In particular, R1C80 represents a large number of base detections for a large number of clusters (e.g., hundreds or millions of base detections for hundreds or millions of clusters) during an eighty cycle of sequencing. In fig. 9, R1C80 was used as a representative sequence cycle for the middle sequencing cycle in the sequencing run.
In FIG. 9, "R1C 120" represents one hundred twenty sequencing cycles during sequencing of read 1. In particular, R1C120 represents a large number of base detections for a large number of clusters (e.g., hundreds or millions of base detections for hundreds or millions of clusters) during one hundred twenty sequencing cycles. In fig. 9, R1C120 was used as a representative sequence cycle for the middle and late sequencing cycles of the sequencing run.
In fig. 9, "deep rta" denotes a specific combination of the filter size (or kernel size), step size, and filter bank size (K) of the convolution filter deep rta.
In FIG. 9, the "3-3-12" combination represents three consecutive spatial convolution layers of the disclosed neural network-based base detector 124. The three successive spatial convolution layers are arranged in a sequence such that first the first intermediate output is processed by the first spatial convolution layer processing block, then the first intermediate output is processed by the second convolution layer to produce a second intermediate output, and then the second intermediate output is processed by the third spatial convolution layer to produce a third intermediate output. The first spatial convolution layer has a convolution filter/kernel size of 3 x 3. The second spatial convolution layer has a convolution filter/kernel size of 3 x 3. The third spatial convolution layer has a convolution filter/kernel size of 12 x 12. In some implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use different steps such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, and third spatial convolution layers may use the same stride such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use different fills such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use the same padding such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use different filter bank sizes such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, and third spatial convolution layers may use the same filter bank size (e.g., 6 or 10) such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2).
In FIG. 9, the combination "3-4-9" represents three consecutive spatial convolution layers of the disclosed neural network-based base detector 124. The three successive spatial convolution layers are arranged in a sequence such that first the first intermediate output is processed by the first spatial convolution layer processing block, then the first intermediate output is processed by the second convolution layer to produce a second intermediate output, and then the second intermediate output is processed by the third spatial convolution layer to produce a third intermediate output. The first spatial convolution layer has a convolution filter/kernel size of 3 x 3. The second spatial convolution layer has a convolution filter/kernel size of 4 x 4. The third spatial convolution layer has a convolution filter/kernel size of 9 x 9. In some implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use different steps such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, and third spatial convolution layers may use the same stride such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use different fills such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, and third spatial convolution layers may use the same padding such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use different filter bank sizes such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first spatial convolution layer, the second spatial convolution layer, and the third spatial convolution layer may use the same filter bank size (e.g., 6 or 10) such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2).
In FIG. 9, the "3-3-4-9" combination represents four consecutive spatial convolution layers of the disclosed neural network-based base detector 124. The four successive spatial convolution layers are arranged in a sequence such that first the first intermediate output is processed by the first spatial convolution layer processing block to produce a first intermediate output, then the first intermediate output is processed by the second convolution layer to produce a second intermediate output, then the second intermediate output is processed by the third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by the fourth spatial convolution layer to produce a fourth intermediate output. The first spatial convolution layer has a convolution filter/kernel size of 3 x 3. The second spatial convolution layer has a convolution filter/kernel size of 3 x 3. The third spatial convolution layer has a convolution filter/kernel size of 4 x 4. The fourth spatial convolution layer has convolution filters/kernels of size 9 x 9. In some implementations, the first, second, third, and fourth spatial convolution layers may use different strides such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same stride such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, the third spatial convolution layer, and the fourth spatial convolution layer may use different fills such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same padding such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first, second, third, and fourth spatial convolution layers may use different filter bank sizes such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same filter bank size (e.g., 6 or 10) such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2).
In FIG. 9, the "5-3-3-7" combination represents four consecutive spatially convolutional layers of the disclosed neural network-based base picker 124. The four successive spatial convolution layers are arranged in a sequence such that first the first intermediate output is processed by the first spatial convolution layer processing block to produce a first intermediate output, then the first intermediate output is processed by the second convolution layer to produce a second intermediate output, then the second intermediate output is processed by the third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by the fourth spatial convolution layer to produce a fourth intermediate output. The first spatial convolution layer has a convolution filter/kernel size of 5 x 5. The second spatial convolution layer has convolution filters/kernels of size 3 x 3. The third spatial convolution layer has a convolution filter/kernel size of 3 x 3. The fourth spatial convolution layer has a convolution filter/kernel size of 7 x 7. In some implementations, the first, second, third, and fourth spatial convolution layers may use different strides such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same stride such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, the third spatial convolution layer, and the fourth spatial convolution layer may use different fills such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same padding such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first, second, third, and fourth spatial convolution layers may use different filter bank sizes such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same filter bank size (e.g., 6 or 10) such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2).
In FIG. 9, the "combination of 5-4-4-5" represents four consecutive spatially convolutional layers of the disclosed neural network-based base picker 124. The four successive spatial convolution layers are arranged in a sequence such that first the first intermediate output is processed by the first spatial convolution layer processing block to produce a first intermediate output, then the first intermediate output is processed by the second convolution layer to produce a second intermediate output, then the second intermediate output is processed by the third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by the fourth spatial convolution layer to produce a fourth intermediate output. The first spatial convolution layer has a convolution filter/kernel size of 5 x 5. The second spatial convolution layer has a convolution filter/kernel size of 4 x 4. The third spatial convolution layer has a convolution filter/kernel size of 4 x 4. The fourth spatial convolution layer has a convolution filter/kernel size of 5 x 5. In some implementations, the first, second, third, and fourth spatial convolution layers may use different strides such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same stride such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, the third spatial convolution layer, and the fourth spatial convolution layer may use different fills such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same padding such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first, second, third, and fourth spatial convolution layers may use different filter bank sizes such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same filter bank size (e.g., 6 or 10) such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2).
In FIG. 9, the "5-5-5-3" combination represents four consecutive spatial convolution layers of the disclosed neural network-based base detector 124. The four successive spatial convolution layers are arranged in sequence such that first the first intermediate output is processed by the first spatial convolution layer processing block to produce a first intermediate output, then the first intermediate output is processed by the second convolution layer to produce a second intermediate output, then the second intermediate output is processed by the third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by the fourth spatial convolution layer to produce a fourth intermediate output. The first spatial convolution layer has a convolution filter/kernel size of 5 x 5. The second spatial convolution layer has a convolution filter/kernel size of 5 x 5. The third spatial convolution layer has a convolution filter/kernel size of 5 x 5. The fourth spatial convolution layer has a convolution filter/kernel size of 3 x 3. In some implementations, the first, second, third, and fourth spatial convolution layers may use different strides such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same stride such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, the third spatial convolution layer, and the fourth spatial convolution layer may use different fills such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same padding such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first, second, third, and fourth spatial convolution layers may use different filter bank sizes such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same filter bank size (e.g., 6 or 10) such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2).
In FIG. 9, "3-3-4-9_K0: the 3-6-8-10 "combination represents four consecutive spatial convolution layers of the disclosed neural network-based base picker 124. The four successive spatial convolution layers are arranged in a sequence such that first the first intermediate output is processed by the first spatial convolution layer processing block to produce a first intermediate output, then the first intermediate output is processed by the second convolution layer to produce a second intermediate output, then the second intermediate output is processed by the third spatial convolution layer to produce a third intermediate output, and then the third intermediate output is processed by the fourth spatial convolution layer to produce a fourth intermediate output. The first spatial convolution layer has a convolution filter/kernel size of 3 x 3. The second spatial convolution layer has a convolution filter/kernel size of 3 x 3. The third spatial convolution layer has a convolution filter/kernel size of 4 x 4. The fourth spatial convolution layer has a convolution filter/kernel size of 9 x 9. In some implementations, the first, second, third, and fourth spatial convolution layers may use different strides such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same stride such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In some implementations, the first spatial convolution layer, the second spatial convolution layer, the third spatial convolution layer, and the fourth spatial convolution layer may use different fills such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). In other implementations, the first, second, third, and fourth spatial convolution layers may use the same padding such that the third intermediate output has a target size (e.g., 1 × 1 or 2 × 2). "3-3-4-9 _K0: the 3-6-8-10 "combination has a filter bank size of three in the first spatial convolution layer (i.e., K0= 3). "3-3-4-9 _K0: the 3-6-8-10 "combination has a filter bank size of six in the second spatial convolution layer (i.e., K0= 6). "3-3-4-9 _K0: the 3-6-8-10 "combination has a filter bank size of eight in the third spatial convolution layer (i.e., K0= 8). "3-3-4-9 _K0: the 3-6-8-10 "combination has a filter bank size of ten in the fourth spatial convolution layer (i.e., K0= 10).
In FIG. 9, the values in the table are the respective base detection error rates of the respective combinations (configurations). As the figure shows, many combinations of the disclosed neural network-based base detector 124 have base detection error rates lower than DeepRTA. Moreover, across the different combinations (configurations) of the disclosed neural network-based base detector 124, the base detection error rate decreases as the filter/kernel size gradually increases across successive spatial convolution layers.
FIG. 10 compares the base detection error rate of DeepRTA with the base detection error rates of different filter bank size configurations (K0) of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit (DeepRTA-K0-04, DeepRTA-K0-06, DeepRTA-K0-10, DeepRTA-K0-16, DeepRTA-K0-18, and DeepRTA-K0-20).
DeepRTA-K0-04 represents the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit and having four convolution filters in each of the n spatial convolution layers (i.e., a filter bank size of four/K0 = 4). DeepRTA-K0-06 represents the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit and having six convolution filters in each of the n spatial convolution layers (i.e., a filter bank size of six/K0 = 6). DeepRTA-K0-10 represents the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit and having ten convolution filters in each of the n spatial convolution layers (i.e., a filter bank size of ten/K0 = 10). DeepRTA-K0-16 represents the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit and having sixteen convolution filters in each of the n spatial convolution layers (i.e., a filter bank size of sixteen/K0 = 16). DeepRTA-K0-18 represents the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit and having eighteen convolution filters in each of the n spatial convolution layers (i.e., a filter bank size of eighteen/K0 = 18). DeepRTA-K0-20 represents the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit and having twenty convolution filters in each of the n spatial convolution layers (i.e., a filter bank size of twenty/K0 = 20).
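One practical consequence of the filter bank size K0 is the parameter budget of the spatial convolution stack. The Python sketch below (the stack shape, the assumed two input channels, and the function name are illustrative assumptions) gives a rough per-configuration parameter count so the K0 = 4 through K0 = 20 variants can be compared.

```python
def conv_stack_param_count(in_channels, kernel_sizes, k0):
    """Rough parameter count for a stack of 2D convolution layers that each use
    k0 filters (weights plus biases); purely illustrative."""
    params, channels = 0, in_channels
    for k in kernel_sizes:
        params += k0 * (channels * k * k + 1)   # weights + biases for this layer
        channels = k0
    return params

# Example (assumed 2 input channels and a hypothetical 3-3-4-9 stack):
for k0 in (4, 6, 10, 16, 18, 20):
    print(k0, conv_stack_param_count(in_channels=2, kernel_sizes=(3, 3, 4, 9), k0=k0))
```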
In FIG. 10, the y-axis represents the base detection error rate ("error%"). The% error is calculated for a large number of base detections for a large number of clusters (e.g., hundreds or millions of base detections for hundreds or millions of clusters). Furthermore, in FIG. 10, the x-axis represents the progression of sequencing cycles 20-80 of the sequencing run, in which a large number of bases were detected for read 1.
In FIG. 10, the % error of DeepRTA is depicted by the fitted line with "O"; the % error of DeepRTA-K0-04 is depicted by the fitted line with "Δ"; the % error of DeepRTA-K0-10 is depicted by the fitted line with "□"; the % error of DeepRTA-K0-20 is depicted by the fitted line with a four-pointed star; and the % errors of DeepRTA-K0-06, DeepRTA-K0-16, and DeepRTA-K0-18 are depicted by fitted lines with additional markers shown in the figure.
As shown in FIG. 10, the base detection error rates of the different filter bank size configurations of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit (i.e., DeepRTA-K0-04, DeepRTA-K0-06, DeepRTA-K0-10, DeepRTA-K0-16, DeepRTA-K0-18, and DeepRTA-K0-20) are lower than that of DeepRTA. Furthermore, this holds true throughout the progression of sequencing cycles 20-80 for read 1, as the fitted lines for these configurations in FIG. 10 remain below the fitted line with "O".
As shown in FIG. 10, the superior base detection performance of the different filter bank size configurations of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit (i.e., DeepRTA-K0-04, DeepRTA-K0-06, DeepRTA-K0-10, DeepRTA-K0-16, DeepRTA-K0-18, and DeepRTA-K0-20) relative to DeepRTA is an objective indicator of the inventive and non-obvious nature of the disclosed intensity contextualization unit.
FIG. 11 compares the base detection error rate when the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit extracts intensity context data from a raw input image of size 115 × 115 (fitted line with "O") against the base detection error rate when the intensity context data is extracted from a raw input image of size 160 × 160 (fitted line with "□"). As shown in FIG. 11, the base detection error rate is lower when the intensity context data is collected from the larger 160 × 160 raw input image.
FIG. 12 compares the base detection accuracy (1 - base detection error rate) of different configurations of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit (i.e., DeepRTA-K0-06, DeepRTA-349-K0-10-160p, DeepRTA-K0-16-Lanczos, DeepRTA-K0-18, and DeepRTA-K0-20) against that of DeepRTA for homopolymers (e.g., GGGGG) and flanking homopolymers (e.g., GGTGG).
As discussed above, in some implementations, the neural network-based base detector 124 performs base detection for a current sequencing cycle by processing a window of sequencing images spanning multiple sequencing cycles (the current sequencing cycle contextualized by right and left flanking sequencing cycles). Because the base "G" is indicated by a dark or off state in the sequencing images, a repeating pattern of "G" bases can lead to erroneous base detections, particularly when the current sequencing cycle is for a non-G base (e.g., base "T") that is flanked by G bases on the left and right.
As shown in FIG. 12, the different configurations of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit have high base detection accuracy for such homopolymers (e.g., GGGGG) and flanking homopolymers (e.g., GGTGG). One reason is that the disclosed intensity contextualization unit extracts intensity context from beyond a given block, informing the neural network-based base detector 124 that the central sequencing cycle is for a non-G base even though the flanking sequencing cycles represent the base "G".
FIG. 13 compares the base detection error rates of the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit and trained on normalized sequencing images ("DeepRTA-V2:349"), the disclosed neural network-based base detector configured with the disclosed intensity contextualization unit that is both trained and run for inference on normalized sequencing images ("DeepRTA-V2:349-norm"), DeepRTA trained and run for inference on normalized sequencing images ("DeepRTA-norm"), DeepRTA, and RTA.
Normalized sequencing images are normalized to a target intensity distribution defined by lower and upper percentile intensity values (e.g., five percent of the normalized intensity values fall below zero, another five percent exceed one, and the remaining ninety percent lie between zero and one).
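A minimal Python/NumPy sketch of such percentile-based normalization is shown below; the choice of the 5th and 95th percentiles as anchor points, and the function name, are assumptions consistent with the example distribution described above rather than the disclosed implementation.

```python
import numpy as np

def percentile_normalize(image, low_pct=5.0, high_pct=95.0):
    """Map the low percentile to 0 and the high percentile to 1, so that roughly
    five percent of values fall below zero, five percent exceed one, and the
    remaining ninety percent land in [0, 1]."""
    lo = np.percentile(image, low_pct)
    hi = np.percentile(image, high_pct)
    return (image - lo) / max(hi - lo, 1e-12)
```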
As shown in FIG. 13, DeepRTA-V2:349 (fitted line with "□") and DeepRTA-V2:349-norm (fitted line with "Δ") outperform DeepRTA-norm (fitted line with a marker shown in the figure), DeepRTA (fitted line with "O"), and RTA (fitted line with a four-pointed star).
Sequencing system
Fig. 14A and 14B depict one implementation of a sequencing system 1400A. The sequencing system 1400A includes a configurable processor 1446. Processor 1446 may be configured to implement the base detection techniques disclosed herein. The sequencing system is also referred to as a "sequencer".
The sequencing system 1400A may operate to obtain any information or data related to at least one of a biological substance or a chemical substance. In some implementations, the sequencing system 1400A is a workstation that may be similar to a desktop device or a desktop computer. For example, most (or all) of the systems and components used to perform the desired reaction may be located within a common housing 1402.
In a particular implementation, the sequencing system 1400A is a nucleic acid sequencing system configured for various applications including, but not limited to, de novo sequencing, re-sequencing of whole or target genomic regions, and metagenomics. Sequencers can also be used for DNA or RNA analysis. In some implementations, the sequencing system 1400A can also be configured to generate reaction sites in a biosensor. For example, the sequencing system 1400A can be configured to receive a sample and generate surface-attached clusters of clonally amplified nucleic acids derived from the sample. Each cluster may constitute or be part of a reaction site in the biosensor.
The example sequencing system 1400A may include a system socket or interface 1410 configured to interact with the biosensor 1412 to perform a desired reaction within the biosensor 1412. In the description below with respect to fig. 14A, the biosensor 1412 is loaded into the system socket 1410. However, it should be understood that a cartridge including the biosensor 1412 may be inserted into the system receptacle 1410, and in some states, the cartridge may be temporarily or permanently removed. As noted above, the cartridge can include, among other things, fluid control components and fluid storage components.
In a particular implementation, the sequencing system 1400A is configured to perform a number of parallel reactions within the biosensor 1412. Biosensor 1412 includes one or more reaction sites where a desired reaction may occur. The reaction sites may for example be immobilized to a solid surface of the biosensor or to beads (or other movable substrates) located within corresponding reaction chambers of the biosensor. The reaction site can include, for example, a cluster of clonally amplified nucleic acids. The biosensor 1412 may include a solid-state imaging device (e.g., a CCD or CMOS imaging device) and a flow cell mounted thereto. The flow cell may include one or more flow channels that receive the solution from the sequencing system 1400A and direct the solution to the reaction site. Optionally, the biosensor 1412 may be configured to engage a thermal element for transferring thermal energy into or out of the flow channel.
The sequencing system 1400A may include various components, assemblies, and systems (or subsystems) that interact with one another to perform predetermined methods or assay protocols for biological or chemical analysis. For example, the sequencing system 1400A includes a system controller 1406, which may communicate with various components, and subsystems of the sequencing system 1400A and the biosensor 1412. For example, in addition to the system socket 1410, the sequencing system 1400A may include: a fluid control system 1408 for controlling fluid flow through the entire fluid network and biosensors 1412 of the sequencing system 1400A; a fluid storage system 1414 configured to hold all fluids (e.g., gases or liquids) usable by the biometric system; a temperature control system 1404 that can regulate the temperature of the fluid in the fluid network, the fluid storage system 1414, and/or the biosensor 1412; and a lighting system 1416 configured to illuminate the biosensor 1412. As described above, if a cartridge with a biosensor 1412 is loaded into system receptacle 1410, the cartridge may also include fluid control components and fluid storage components.
As also shown, the sequencing system 1400A may include a user interface 1418 that interacts with a user. For example, the user interface 1418 may include a display 1420 for displaying or requesting information from a user and a user input device 1422 for receiving user input. In some implementations, the display 1420 and the user input device 1422 are the same device. For example, user interface 1418 may include a touch-sensitive display configured to detect the presence of an individual touch and also identify the location of the touch on the display. However, other user input devices 1422 may be used, such as a mouse, touch pad, keyboard, keypad, handheld scanner, voice recognition system, motion recognition system, or the like. As will be discussed in more detail below, the sequencing system 1400A can communicate with various components including a biosensor 1412 (e.g., in the form of a cartridge) to perform a desired reaction. The sequencing system 1400A may also be configured to analyze data obtained from the biosensor to provide the user with the desired information.
System controller 1406 may comprise any processor-based or microprocessor-based system, including systems using microcontrollers, Reduced Instruction Set Computers (RISC), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Coarse-Grained Reconfigurable Architectures (CGRAs), logic circuits, and any other circuit or processor capable of executing the functions described herein. The above examples are exemplary only and are thus not intended to limit in any way the definition and/or meaning of the term system controller. In an exemplary implementation, the system controller 1406 executes a set of instructions stored in one or more storage elements, memories, or modules in order to at least one of obtain and analyze detection data. The detection data may include a plurality of pixel signal sequences, such that a pixel signal sequence from each of millions of sensors (or pixels) can be detected over many base detection cycles. The storage element may be in the form of an information source or a physical memory element within the sequencing system 1400A.
The set of instructions may include various commands that instruct the sequencing system 1400A or the biosensor 1412 to perform specific operations, such as the various embodied methods and processes described herein. The set of instructions may be in the form of a software program, which may form part of one or more tangible, non-transitory computer-readable media. As used herein, the terms "software" and "firmware" are interchangeable, and include any computer program stored in memory for execution by a computer, including RAM memory, ROM memory, EPROM memory, EEPROM memory, and non-volatile RAM (NVRAM) memory. The above memory types are exemplary only, and are thus not limiting as to the types of memory usable for storage of a computer program.
The software may be in various forms, such as system software or application software. Further, the software may be in the form of a collection of separate programs or in the form of program modules or portions of program modules within a larger program. The software may also include modular programming in the form of object-oriented programming. After obtaining the detection data, the detection data may be automatically processed by the sequencing system 1400A, processed in response to a user input, or processed in response to a request made by another processing machine (e.g., a remote request over a communication link). In the illustrated implementation, the system controller 1406 includes an analysis module 1444. In other implementations, system controller 1406 does not include analysis module 1444, but rather may access analysis module 1444 (e.g., analysis module 1444 may be hosted on the cloud separately).
The system controller 1406 can be connected to the biosensor 1412 and other components of the sequencing system 1400A via communication links. The system controller 1406 is also communicatively coupled to an off-site system or server. The communication link may be hardwired, wired, or wireless. The system controller 1406 may receive user inputs or commands from the user interface 1418 and the user input device 1422.
The fluid control system 1408 includes a fluid network and is configured to direct and regulate the flow of one or more fluids through the fluid network. The fluid network may be in fluid communication with the biosensor 1412 and the fluid storage system 1414. For example, the selected fluid may be drawn from the fluid storage system 1414 and directed to the biosensor 1412 in a controlled manner, or the fluid may be drawn from the biosensor 1412 and directed toward a waste reservoir, for example, in the fluid storage system 1414. Although not shown, the fluid control system 1408 may include a flow sensor that detects the flow rate or pressure of fluid within the fluid network. The sensors may be in communication with a system controller 1406.
Temperature control system 1404 is configured to regulate the temperature of fluid at different regions of the fluid network, fluid storage system 1414, and/or biosensor 1412. For example, temperature control system 1404 may include a thermal cycler that interfaces with biosensor 1412 and controls the temperature of fluid flowing along the reaction sites in biosensor 1412. The temperature control system 1404 may also regulate the temperature of solid elements or components of the sequencing system 1400A or the biosensor 1412. Although not shown, temperature control system 1404 may include sensors for detecting the temperature of the fluid or other components. The sensors may be in communication with a system controller 1406.
The fluid storage system 1414 is in fluid communication with the biosensor 1412 and may store various reaction components or reactants for performing a desired reaction therein. The fluid storage system 1414 may also store fluids for washing or cleaning the fluid network and the biosensor 1412, as well as for diluting the reactants. For example, the fluid storage system 1414 may include various reservoirs to store samples, reagents, enzymes, other biomolecules, buffer solutions, aqueous and non-polar solutions, and the like. In addition, the fluid storage system 1414 may also include a waste reservoir for receiving waste from the biosensor 1412. In implementations that include a cartridge, the cartridge may include one or more of a fluid storage system, a fluid control system, or a temperature control system. Accordingly, one or more of the components described herein in connection with those systems may be housed within a cartridge housing. For example, the cartridge may have various reservoirs to store samples, reagents, enzymes, other biomolecules, buffer solutions, aqueous and non-polar solutions, waste, and the like. Thus, one or more of the fluid storage system, the fluid control system, or the temperature control system may be removably engaged with the biometric system via a cartridge or other biosensor.
The illumination system 1416 may include a light source (e.g., one or more LEDs) and a plurality of optical components for illuminating the biosensor. Examples of light sources may include lasers, arc lamps, LEDs, or laser diodes. The optical components may be, for example, reflectors, dichroic mirrors, beam splitters, collimators, lenses, filters, wedges, prisms, mirrors, detectors, and the like. In implementations using an illumination system, the illumination system 1416 can be configured to direct excitation light to the reaction sites. As one example, the fluorophores may be excited by green-wavelength light, and thus the wavelength of the excitation light may be about 532 nm. In one implementation, the illumination system 1416 is configured to produce illumination parallel to a surface normal of the surface of the biosensor 1412. In another implementation, the illumination system 1416 is configured to produce illumination that is off-angle relative to the surface normal of the surface of the biosensor 1412. In yet another implementation, the illumination system 1416 is configured to produce illumination having a plurality of angles, including some parallel illumination and some off-angle illumination.
The system receptacle or interface 1410 is configured to at least one of mechanically, electrically, and fluidically engage the biosensor 1412. The system receptacle 1410 may maintain the biosensor 1412 in a desired orientation to facilitate fluid flow through the biosensor 1412. The system receptacle 1410 may also include electrical contacts configured to engage the biosensor 1412 such that the sequencing system 1400A may communicate with the biosensor 1412 and/or provide power to the biosensor 1412. In addition, the system receptacle 1410 may include a fluid port (e.g., a nozzle) configured to engage the biosensor 1412. In some implementations, the biosensor 1412 is removably coupled to the system receptacle 1410 mechanically, electrically, and fluidically.
Further, the sequencing system 1400A may be in remote communication with other systems or networks or with other sequencing systems 1400A. The detection data obtained by the sequencing system 1400A may be stored in a remote database.
FIG. 14B is a block diagram of a system controller 1406 that may be used in the system of FIG. 14A. In one implementation, the system controller 1406 includes one or more processors or modules that may communicate with each other. Each of the processors or modules may include an algorithm (e.g., instructions stored on a tangible and/or non-transitory computer-readable storage medium) or sub-algorithm for performing a particular process. The system controller 1406 is conceptually illustrated as a collection of modules, but may be implemented using any combination of dedicated hardware boards, DSPs, processors, etc. Alternatively, the system controller 1406 may be implemented using an off-the-shelf PC having a single processor or multiple processors, with functional operations distributed among the processors. As a further alternative, the modules described below may be implemented using a hybrid configuration, where some modular functions are performed using dedicated hardware, while the remaining modular functions are performed using an off-the-shelf PC or the like. Modules may also be implemented as software modules within a processing unit.
During operation, the communication port 1450 may transmit information (e.g., commands) to or receive information (e.g., data) from the biosensor 1412 (FIG. 14A) and/or the subsystems 1408, 1414, 1404 (FIG. 14A). In a particular implementation, the communication port 1450 may output a plurality of pixel signal sequences. Communication link 1434 may receive user input from the user interface 1418 (FIG. 14A) and transmit data or information to the user interface 1418. Data from the biosensor 1412 or the subsystems 1408, 1414, 1404 may be processed by the system controller 1406 in real time during a bioassay session. Additionally or alternatively, data may be temporarily stored in system memory during a bioassay session and processed at slower-than-real-time speed or offline.
As shown in FIG. 14B, the system controller 1406 may include a plurality of modules 1426-1448 that communicate with a main control module 1424 and a Central Processing Unit (CPU) 1452. The main control module 1424 may communicate with the user interface 1418 (FIG. 14A). Although the modules 1426-1448 are shown in direct communication with the main control module 1424, the modules 1426-1448 may also communicate directly with each other, with the user interface 1418, and with the biosensor 1412. Additionally, the modules 1426-1448 may communicate with the main control module 1424 through other modules.
The plurality of modules 1426-1448 includes system modules 1428-1432 and 1426 that communicate with the subsystems 1408, 1414, 1404, and 1416, respectively. The fluid control module 1428 may communicate with the fluid control system 1408 to control the valves and flow sensors of the fluid network in order to control the flow of one or more fluids through the fluid network. The fluid storage module 1430 may notify the user when the volume of fluid is low or when the waste reservoir is at or near capacity. The fluid storage module 1430 may also communicate with the temperature control module 1432 so that the fluid may be stored at a desired temperature. The lighting module 1426 may communicate with the lighting system 1416 to illuminate the reaction sites at specified times during a protocol, such as after a desired reaction (e.g., a binding event) has occurred. In some implementations, the lighting module 1426 can communicate with the lighting system 1416 to illuminate the reaction sites at a specified angle.
The plurality of modules 1426-1448 may also include a device module 1436 that communicates with the biosensor 1412 and an identification module 1438 that determines identification information related to the biosensor 1412. The device module 1436 can communicate with, for example, the system socket 1410 to confirm that the biosensor has established electrical and fluidic connections with the sequencing system 1400A. The identification module 1438 may receive a signal that identifies the biosensor 1412. The identification module 1438 may use the identity of the biosensor 1412 to provide other information to the user. For example, the identification module 1438 may determine and then display a lot number, a date of manufacture, or a protocol suggested to be run with the biosensor 1412.
The plurality of modules 1426-1448 also includes an analysis module 1444 (also referred to as a signal processing module or signal processor) that receives and analyzes signal data (e.g., image data) from the biosensor 1412. The analysis module 1444 includes memory (e.g., RAM or flash memory) for storing the detection/image data. The detection data may include a plurality of pixel signal sequences, such that a pixel signal sequence from each of millions of sensors (or pixels) can be detected over many base detection cycles. The signal data may be stored for later analysis or may be transmitted to the user interface 1418 to display the desired information to the user. In some implementations, the signal data can be processed by a solid-state imaging device (e.g., a CMOS image sensor) before the signal data is received by the analysis module 1444.
The analysis module 1444 is configured to obtain image data from the photodetectors at each of a plurality of sequencing cycles. The image data is derived from the emission signals detected by the photodetectors, and the image data for each sequencing cycle of the plurality of sequencing cycles is processed by the neural network-based base detector 124, which generates base calls for at least some of the analytes at each sequencing cycle of the plurality of sequencing cycles. The photodetectors may be part of one or more overhead cameras (e.g., a CCD camera of Illumina GAIIx that images the clusters on the biosensor 1412 from the top), or may be part of the biosensor 1412 itself (e.g., a CMOS image sensor of iSeq that lies below the clusters on the biosensor 1412 and images the clusters from the bottom).
The output of the photodetectors are sequencing images, each depicting the intensity emission of the cluster and its surrounding background. Sequencing images depict the intensity emission due to incorporation of nucleotides into the sequence during sequencing. The intensity emission is from the associated analyte and its surrounding background. The sequencing image is stored in memory 1448.
The protocol module 1440 and the protocol module 1442 communicate with the main control module 1424 to control the operation of the subsystems 1408, 1414, and 1404 when performing a predetermined assay protocol. The protocol module 1440 and the protocol module 1442 may include sets of instructions for instructing the sequencing system 1400A to perform specific operations according to a predetermined protocol. As shown, the protocol module may be a sequencing-by-synthesis (SBS) module 1440 configured to issue various commands for performing sequencing-by-synthesis processes. In SBS, extension of a nucleic acid primer along a nucleic acid template is monitored to determine the sequence of nucleotides in the template. The underlying chemical process may be polymerization (e.g., catalyzed by a polymerase) or ligation (e.g., catalyzed by a ligase). In certain polymerase-based SBS implementations, fluorescently labeled nucleotides are added to the primer (thereby extending the primer) in a template-dependent manner, such that detection of the order and type of nucleotides added to the primer can be used to determine the sequence of the template. For example, to initiate the first SBS cycle, a command can be issued to deliver one or more labeled nucleotides, DNA polymerase, etc., to/through a flow cell containing an array of nucleic acid templates. The nucleic acid templates may be located at corresponding reaction sites. Those reaction sites where primer extension results in incorporation of a labeled nucleotide can be detected by an imaging event. During an imaging event, the illumination system 1416 can provide excitation light to the reaction sites. Optionally, the nucleotides may further include a reversible termination property that terminates further primer extension once a nucleotide has been added to the primer. For example, a nucleotide analog having a reversible terminator moiety can be added to the primer such that subsequent extension does not occur until a deblocking agent is delivered to remove the moiety. Thus, for implementations using reversible termination, a command can be issued to deliver the deblocking agent to the flow cell (before or after detection occurs). One or more commands can be issued to effect washing between the various delivery steps. The cycle can then be repeated n times to extend the primer by n nucleotides, thereby detecting a sequence of length n. Exemplary sequencing techniques are described in, for example, Bentley et al., Nature 456:53-59 (2008), WO 04/015497, US 7,057,026, WO 91/06675, WO 07/123744, US 7,329,492, US 7,211,414, US 7,315,019, US 7,405,251, and US 2005/014705052, each of which is incorporated herein by reference.
For the nucleotide delivery step of the SBS cycle, a single type of nucleotide can be delivered at a time, or multiple different nucleotide types can be delivered (e.g., a, C, T, and G together). For nucleotide delivery configurations where only a single type of nucleotide is present at a time, different nucleotides need not have different labels, as they can be distinguished based on the time interval inherent in individualized delivery. Thus, a sequencing method or device may use monochromatic detection. For example, the excitation source need only provide excitation at a single wavelength or within a single wavelength range. For nucleotide delivery configurations in which delivery results in multiple different nucleotides being present in the flow cell simultaneously, sites for incorporation of different nucleotide types can be distinguished based on different fluorescent labels attached to the corresponding nucleotide types in the mixture. For example, four different nucleotides can be used, each nucleotide having one of four different fluorophores. In one implementation, excitation in four different regions of the spectrum can be used to distinguish between four different fluorophores. For example, four different excitation radiation sources may be used. Alternatively, less than four different excitation sources may be used, but optical filtering of excitation radiation from a single source may be used to produce different ranges of excitation radiation at the flow cell.
In some implementations, less than four different colors can be detected in a mixture with four different nucleotides. For example, nucleotide pairs may be detected at the same wavelength, but distinguished based on differences in intensity of one member of the pair relative to the other member, or based on changes in one member of the pair that result in the appearance or disappearance of a distinct signal compared to the signal detected for the other member of the pair (e.g., by chemical, photochemical, or physical modification). Exemplary devices and methods for distinguishing four different nucleotides using detection of less than four colors are described, for example, in U.S. patent application Ser. Nos. 61/535,294 and 61/619,575, which are incorporated herein by reference in their entirety. U.S. application 13/624,200, filed on 21/9/2012, is also incorporated by reference in its entirety.
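To make the fewer-colors idea concrete, the following Python sketch decodes a base from two intensity channels by simple thresholding. The channel-to-base mapping, the threshold, and the function name are illustrative assumptions, not the labeling chemistry of any particular instrument.

```python
# Illustrative only: decode a base from two intensity channels by thresholding.
# The mapping below (both bright -> A, channel 1 only -> C, channel 2 only -> T,
# dark in both -> G) is a hypothetical example of "fewer colors than bases".
import numpy as np

def decode_two_channel(ch1: float, ch2: float, threshold: float = 0.5) -> str:
    on1, on2 = ch1 > threshold, ch2 > threshold
    if on1 and on2:
        return "A"   # signal in both channels
    if on1:
        return "C"   # signal in channel 1 only
    if on2:
        return "T"   # signal in channel 2 only
    return "G"       # dark in both channels

intensities = np.array([[0.9, 0.8], [0.9, 0.1], [0.05, 0.7], [0.02, 0.03]])
print([decode_two_channel(c1, c2) for c1, c2 in intensities])  # ['A', 'C', 'T', 'G']
```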
The plurality of protocol modules may also include a sample preparation (or generation) module 1442 configured to issue commands to the fluid control system 1408 and the temperature control system 1404 to amplify the products within the biosensor 1412. For example, the biosensor 1412 may be coupled to the sequencing system 1400A. Amplification module 1442 may issue instructions to fluid control system 1408 to deliver the necessary amplification components to the reaction chambers within biosensor 1412. In other implementations, the reaction site may already contain some components for amplification, such as template DNA and/or primers. After delivering amplification components to the reaction chamber, amplification module 1442 can instruct temperature control system 1404 to cycle through different temperature phases according to known amplification protocols. In some implementations, amplification and/or nucleotide incorporation is performed isothermally.
The SBS module 1440 may issue commands to perform bridge PCR, where clusters of clonal amplicons are formed on local regions within the channels of the flow cell. After generation of amplicons by bridge PCR, the amplicons can be "linearized" to prepare single-stranded template DNA or sstDNA, and sequencing primers can be hybridized to the universal sequences flanking the region of interest. For example, a sequencing-by-synthesis approach based on reversible terminators may be used as described above or as follows.
Each base detection or sequencing cycle can be achieved by single base extension of sstDNA, which can be accomplished, for example, by using a modified DNA polymerase and a mixture of four types of nucleotides. Different types of nucleotides may have unique fluorescent labels, and each nucleotide may also have a reversible terminator that only allows single base incorporation to occur in each cycle. After adding a single base to the sstDNA, excitation light can be incident on the reaction site and fluorescence emission can be detected. After detection, the fluorescent label and terminator can be chemically cleaved from the sstDNA. This may be followed by another similar cycle of base detection or sequencing. In such a sequencing protocol, SBS module 1440 may instruct fluid control system 1408 to direct reagent and enzyme solutions to flow through biosensor 1412. Exemplary SBS methods based on reversible terminators that may be used with the devices and methods described herein are described in U.S. patent application publication No. 2007/0166705 A1, U.S. patent application publication No. 2006/0156 x 3901 A1, U.S. patent No. 7,057,026, U.S. patent application publication No. 2006/0240439 A1, U.S. patent application publication No. 2006/02514714709 A1, PCT publication No. WO 05/065514, U.S. patent application publication No. 2005/014700900 A1, PCT publication No. WO 06/05B199, and PCT publication No. WO 07/01470251, each of which is incorporated herein by reference in its entirety. Exemplary reagents for SBS based on reversible terminators are described in US 7,541,444, US 7,057,026, US 7,414,14716, US 7,427,673, US 7,566,537, US 7,592,435, and WO 07/14535365, each of which is incorporated herein by reference in its entirety.
In some implementations, the amplification module and SBS module can operate in a single assay protocol in which, for example, a template nucleic acid is amplified and then sequenced within the same cassette.
The sequencing system 1400A may also allow a user to reconfigure the assay protocol. For example, the sequencing system 1400A may provide the user with an option to modify the assay protocol through the user interface 1418. For example, if it is determined that the biosensor 1412 is to be used for amplification, the sequencing system 1400A may request a temperature for the annealing cycle. Further, the sequencing system 1400A may issue a warning to the user if the user has provided user input that is generally unacceptable for the selected assay protocol.
In a specific implementation, the biosensor 1412 includes millions of sensors (or pixels), each of which generates multiple pixel signal sequences during subsequent base detection cycles. The analysis module 1444 detects and attributes a plurality of pixel signal sequences to corresponding sensors (or pixels) according to the row-by-row and/or column-by-column positions of the sensors on the sensor array.
Configurable processor
Fig. 14C is a simplified block diagram of a system for analyzing sensor data (such as base call sensor output) from the sequencing system 1400A. In the example of fig. 14C, the system includes a configurable processor 1446. The configurable processor 1446 may execute a base caller (e.g., the neural network-based base caller 124) in coordination with a runtime program executed by a Central Processing Unit (CPU) 1452 (i.e., a host processor). The sequencing system 1400A includes a biosensor 1412 and a flow cell. The flow cell may comprise one or more blocks in which clusters of genetic material are exposed to a sequence of analyte flows used to cause reactions in the clusters to identify bases in the genetic material. The sensors sense the reactions for each cycle of the sequence in each block of the flow cell to provide block data. Genetic sequencing is a data-intensive operation that converts base call sensor data into sequences of base calls for each cluster of genetic material sensed during a base call operation.
The system in this example includes a CPU 1452 that executes a runtime program to coordinate base-call operations, a memory 1448B for storing sequences of block data arrays, base-call reads resulting from base-call operations, and other information used in base-call operations. Additionally, in this illustration, the system includes a memory 1448A to store a configuration file (or files), such as an FPGA bit file, and model parameters for configuring and reconfiguring the neural network of the configurable processor 1446, and to execute the neural network. The sequencing system 1400A may include a program to configure a configurable processor, and in some implementations a reconfigurable processor, to execute a neural network.
The sequencing system 1400A is coupled to the configurable processor 1446 by a bus 1489. In one example, the bus 1489 may be implemented using high-throughput technology, such as bus technology compatible with the PCIe standard (Peripheral Component Interconnect Express) currently maintained and developed by the PCI-SIG (PCI Special Interest Group). Also in this example, a memory 1448A is coupled to the configurable processor 1446 by a bus 1493. The memory 1448A may be an on-board memory disposed on a circuit board with the configurable processor 1446. The memory 1448A is used for high-speed access by the configurable processor 1446 to working data used in base calling operations. The bus 1493 may also be implemented using high-throughput technology, such as bus technology compatible with the PCIe standard.
Configurable processors, including field programmable gate arrays (FPGAs), coarse-grained reconfigurable arrays (CGRAs), and other configurable and reconfigurable devices, may be configured to implement various functions more efficiently or faster than is possible using a general-purpose processor executing a computer program. Configuration of a configurable processor involves compiling a functional description to produce a configuration file, sometimes referred to as a bitstream or bit file, and distributing the configuration file to the configurable elements on the processor. The configuration file defines the logic functions to be performed by the configurable processor by configuring the circuit to set data flow patterns, use of distributed memory and other on-chip memory resources, lookup table contents, configurable logic blocks, and configurable execution units (such as multiply-accumulate units, configurable interconnects, and other elements of the configurable array). A configurable processor is reconfigurable if the configuration file can be changed in the field by changing the loaded configuration file. For example, the configuration file may be stored in volatile SRAM elements, non-volatile read-write memory elements, and combinations thereof, distributed in an array of configurable elements on the configurable or reconfigurable processor. A variety of commercially available configurable processors are suitable for use in the base calling procedure described herein. Examples include Google's Tensor Processing Unit (TPU)™, rackmount solutions (e.g., GX4 Rackmount Series™, GX9 Rackmount Series™), NVIDIA DGX-1™, Microsoft's Stratix V FPGA™, Graphcore's Intelligent Processor Unit (IPU)™, Qualcomm's Zeroth Platform™ with Snapdragon processors™, NVIDIA's Volta™, NVIDIA's DRIVE PX™, NVIDIA's JETSON TX1/TX2 MODULE™, Intel's Nirvana™, Movidius VPU™, Fujitsu DPI™, ARM's DynamIQ™, IBM TrueNorth™, Lambda GPU Server with Tesla V100s™, Xilinx Alveo™ U200, Xilinx Alveo™ U250, Xilinx Alveo™ U280, Intel/Altera Stratix™ GX2800, and Intel Stratix™ GX10M. In some examples, the host CPU may be implemented on the same integrated circuit as the configurable processor.
Implementations described herein use the configurable processor 1446 to implement the neural network-based base caller 124. The configuration file for the configurable processor 1446 may be implemented by specifying the logic functions to be performed using a high-level description language (HDL) or a register transfer level (RTL) language specification. The specification may be compiled using resources designed for the selected configurable processor to generate the configuration file. The same or similar specifications may be compiled in order to generate a design for an application-specific integrated circuit that may not be a configurable processor.
Thus, in all implementations described herein, alternatives to the configurable processor 1446 include a configured processor comprising an application-specific integrated circuit (ASIC) or a set of integrated circuits, or a system-on-chip (SOC) device, or a Graphics Processing Unit (GPU) processor or Coarse-Grained Reconfigurable Architecture (CGRA) processor, configured to perform neural network-based base call operations as described herein.
In general, a configurable processor and a configured processor as described herein that is configured to perform the operations of a neural network are referred to herein as a neural network processor.
In this example, the configurable processor 1446 is configured by a configuration file loaded by a program executed using the CPU 1452, or by other sources, that configures an array of configurable elements 1491 (e.g., configurable logic blocks (CLBs), such as look-up tables (LUTs), flip-flops, compute processing units (PMUs) and compute memory units (CMUs), configurable I/O blocks, and programmable interconnects) on the configurable processor to perform base calling functions. In this example, the configuration includes the data flow logic 104, which is coupled to the bus 1489 and the bus 1493 and performs functions for distributing data and control parameters between the elements used in the base call operation.
Further, the configurable processor 1446 is configured with the data flow logic 104 to execute the neural network-based base caller 124. The logic 104 includes multi-cycle execution clusters (e.g., 1479), which in this example include execution cluster 1 through execution cluster X. The number of multi-cycle execution clusters may be selected based on a tradeoff involving the required throughput of the operation and the available resources on the configurable processor 1446.
The multi-cycle execution clusters are coupled to the data flow logic 104 through data flow paths 1499 implemented using configurable interconnects and memory resources on the configurable processor 1446. In addition, the multi-cycle execution clusters are coupled to the data flow logic 104 by a control path 1495, implemented using, for example, configurable interconnects and memory resources on the configurable processor 1446, which provides control signals indicating available execution clusters, readiness to provide input units to the available execution clusters for execution of the neural network-based base caller 124, readiness to provide trained parameters to the neural network-based base caller 124, readiness to provide output patches of base call classification data, and other control data used for execution of the neural network-based base caller 124.
The configurable processor 1446 is configured to perform runs of the neural network-based base caller 124 using the trained parameters to generate classification data for sensing cycles of the base call operation. A run of the neural network-based base caller 124 is performed to generate classification data for a subject sensing cycle of the base call operation. A run of the neural network-based base caller 124 operates on a sequence comprising a number N of arrays of block data from respective ones of N sensing cycles, where, in the example described herein, the N sensing cycles provide sensor data for different base call operations, one base position per operation, in a time sequence. Optionally, some of the N sensing cycles may be out of sequence if needed, depending on the particular neural network model being executed. The number N may be any number greater than one. In some examples described herein, the N sensing cycles represent a set comprising at least one sensing cycle preceding the subject sensing cycle and at least one sensing cycle following the subject sensing cycle in the time sequence. Examples are described herein in which the number N is an integer equal to or greater than five.
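The sketch below illustrates, under assumptions about array shapes and window handling, how the input for a subject sensing cycle can be assembled from N = 5 cycles (the subject cycle plus two flanking cycles on each side); it is an aid to reading the preceding paragraph, not the disclosed data path.

```python
# A minimal sketch of assembling the per-cycle input for the neural network:
# for each subject cycle, gather the block-data arrays of the subject cycle
# plus flanking cycles. Array shapes and the edge handling are assumptions.
import numpy as np

def cycle_windows(block_data: np.ndarray, half_window: int = 2):
    """block_data: (num_cycles, height, width, channels)."""
    num_cycles = block_data.shape[0]
    for subject in range(half_window, num_cycles - half_window):
        window = block_data[subject - half_window: subject + half_window + 1]
        yield subject, window  # window shape: (2*half_window + 1, H, W, C)

blocks = np.random.rand(10, 64, 64, 2)  # 10 cycles, 64x64 block, 2 intensity channels
for subject_cycle, window in cycle_windows(blocks):
    assert window.shape[0] == 5  # N = 5 cycles per run
```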
The data flow logic 104 is configured to move block data and at least some of the trained parameters of the model from the memory 1448A to the configurable processor 1446 for runs of the neural network-based base caller 124, using input units for a given run that include block data for spatially aligned blocks of the N arrays. The input units may be moved by direct memory access in one DMA operation, or in smaller units moved during available time slots in coordination with the execution of the deployed neural network.
Block data for a sensing cycle as described herein may include an array of sensor data having one or more features. For example, the sensor data may include two images that are analyzed to identify one of four bases at a base position in a genetic sequence of DNA, RNA, or other genetic material. The block data may also include metadata about the images and the sensors. For example, in implementations of the base call operation, the block data may include information about the alignment of the images with the clusters, such as distance-from-center information indicating the distance of each pixel in the array of sensor data from the center of a cluster of genetic material on the block.
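As an illustration of block data that combines image features with alignment metadata, the sketch below appends a per-pixel distance-from-center channel to a two-channel block. The single cluster center, the normalization, and the shapes are assumptions made for brevity.

```python
# Illustrative sketch: augment a block's two image channels with a per-pixel
# "distance from center" channel for one cluster center.
import numpy as np

def add_distance_channel(images: np.ndarray, center: tuple) -> np.ndarray:
    """images: (H, W, 2) intensity channels; returns (H, W, 3)."""
    h, w, _ = images.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dfc = np.sqrt((yy - center[0]) ** 2 + (xx - center[1]) ** 2)
    dfc /= dfc.max()  # normalize so the metadata channel is in [0, 1]
    return np.concatenate([images, dfc[..., None]], axis=-1)

block = add_distance_channel(np.random.rand(64, 64, 2), center=(31.5, 31.5))
print(block.shape)  # (64, 64, 3)
```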
During execution of the neural network-based base caller 124 as described below, the block data may also include data generated during that execution, referred to as intermediate data, which can be reused rather than recomputed during a run of the neural network-based base caller 124. For example, during execution of the neural network-based base caller 124, the data flow logic 104 may write intermediate data to the memory 1448A in place of the sensor data for a given block of the block data array. Implementations similar to this are described in more detail below.
As shown, a system for analyzing base call sensor output is described that includes a memory (e.g., 1448A) accessible by a runtime program that stores block data including sensor data from a block of a sensing cycle of a base call operation. In addition, the system includes a neural network processor, such as a configurable processor 1446 that has access to memory. The neural network processor is configured to perform operation of the neural network using the trained parameters to generate classification data for the sensing cycle. As described herein, the operation of the neural network operates on a sequence of N arrays of block data from respective ones of N sensing cycles (including the subject cycle) to generate classification data for the subject cycle. Data flow logic 908 is provided to move the block data and trained parameters from the memory to the neural network processor using the input units (including data from the spatially aligned patches of the N arrays of respective ones of the N sensing cycles) for operation of the neural network.
Additionally, a system is described in which a neural network processor has access to the memory and includes a plurality of execution clusters, an execution cluster of the plurality of execution clusters being configured to execute a neural network. The data flow logic 104 has access to the memory and to the execution clusters of the plurality of execution clusters to provide input units of block data to an available execution cluster of the plurality of execution clusters, the input units including a number N of spatially aligned patches of the block data arrays from respective sensing cycles (including a subject sensing cycle), and to cause the execution cluster to apply the N spatially aligned patches to the neural network to generate an output patch of classification data for the spatially aligned patch of the subject sensing cycle, where N is greater than 1.
FIG. 15 is a simplified diagram illustrating aspects of the base call operation, including the functions of a runtime program executed by the host processor. In this figure, the output from the image sensor of the flow cell is provided on line 1500 to image processing threads 1501, which can perform processing on the images, such as alignment and arrangement in the array of sensor data for the individual blocks and resampling of the images, and which can be used by a process that computes a block cluster mask for each block in the flow cell, the mask identifying the pixels in the array of sensor data that correspond to clusters of genetic material on the corresponding block of the flow cell. Depending on the state of the base call operation, the output of the image processing threads 1501 is provided on line 1502 to scheduling logic 1510 in the CPU, which routes the arrays of block data on a high-speed bus 1503 to a data cache 1504 (e.g., SSD storage) or on a high-speed bus 1505 to the neural network processor hardware 1520, such as the configurable processor 1446 of fig. 14C. The processed and transformed images may be stored on the data cache 1504 for previously used sensing cycles. The hardware 1520 returns the classification data output by the neural network to scheduling logic 1515, which passes the information to the data cache 1504 or on line 1511 to the thread 1502 that performs base calling and quality score calculations using the classification data; the data for the base call reads can be arranged in a standard format. The output of the thread 1502 that performs base calling and quality score calculations is provided on line 1512 to the thread 1503, which aggregates the base call reads, performs other operations such as data compression, and writes the resulting base call output to a designated destination for utilization by the customer.
In some implementations, the host may include threads (not shown) that perform final processing of the output of the hardware 1520 in support of the neural network. For example, the hardware 1520 may provide outputs of classification data from a final layer of the multi-cycle neural network. The host processor may perform an output activation function, such as a softmax function, on the classification data to configure the data for use by the base call and quality score thread 1502. In addition, the host processor may perform input operations (not shown), such as batch normalization of the block data prior to input to the hardware 1520.
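A minimal sketch of such host-side post-processing is shown below: a softmax output activation applied to per-cluster logits, followed by a Phred-style quality value. The four-class layout and the quality formula are assumptions made for illustration, not the system's prescribed scoring.

```python
# Host-side output activation sketch: softmax over per-cluster logits to get
# base probabilities, then a Phred-like quality score from the top probability.
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

logits = np.array([[4.1, 0.2, -1.3, 0.5]])               # one cluster, classes A/C/G/T
probs = softmax(logits)
call = "ACGT"[int(probs.argmax())]
qual = int(-10 * np.log10(max(1 - probs.max(), 1e-6)))   # Phred-style score
print(call, qual, probs.round(3))
```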
Fig. 16 is a simplified diagram of a configuration of a configurable processor 1446, such as the configurable processor of fig. 14C. In fig. 16, the configurable processor 1446 includes an FPGA with multiple high-speed PCIe interfaces. The FPGA is configured with a wrapper 1690 that includes the data flow logic 104 described with reference to fig. 14C. The wrapper 1690 manages the interface and coordination with the runtime program in the CPU through a CPU communication link 1677, and manages communication with the on-board DRAM 1699 (e.g., memory 1448A) via a DRAM communication link 1697. The data flow logic 104 in the wrapper 1690 provides block data, retrieved by traversing the arrays of block data for the number N of cycles on the on-board DRAM 1699, to the cluster 1685, and retrieves processed data 1687 from the cluster 1685 for delivery back to the on-board DRAM 1699. The wrapper 1690 also manages the transfer of data between the on-board DRAM 1699 and host memory, for both the input arrays of block data and the output blocks of classification data. The wrapper transfers block data on line 1683 to the allocated cluster 1685. The wrapper provides trained parameters, such as weights and biases retrieved from the on-board DRAM 1699, on line 1681 to the cluster 1685. The wrapper provides configuration and control data on line 1679 to the cluster 1685, which is provided from, or generated in response to, the runtime program on the host via the CPU communication link 1677. The cluster may also provide status signals on line 1689 to the wrapper 1690, which are used in cooperation with control signals from the host to manage traversal of the arrays of block data to provide spatially aligned block data, and to execute the multi-cycle neural network on the block data using the resources of the cluster 1685.
As described above, there may be multiple clusters on a single configurable processor managed by the wrapper 1690, each configured to execute on corresponding ones of multiple patches of the block data. Each cluster may be configured to provide classification data for base calls in a subject sensing cycle using the block data of multiple sensing cycles, as described herein.
In an example of the system, model data, including kernel data such as filter weights and biases, may be sent from the host CPU to the configurable processor so that the model can be updated as a function of cycle number. As one representative example, a base call operation may comprise on the order of hundreds of sensing cycles. In some implementations, the base call operation can include paired-end reads. For example, the trained parameters of the model may be updated every 20 cycles (or some other number of cycles), or according to an update pattern implemented for a particular system and neural network model. In some implementations that include paired-end reads, in which the sequence for a given string of genetic material in a cluster on a block includes a first portion extending from a first end down (or up) the string and a second portion extending from a second end up (or down) the string, the trained parameters may be updated in the transition from the first portion to the second portion.
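The sketch below shows one way such an update schedule could be expressed in code; the cycle counts and the reload rule are assumptions chosen for illustration, not values prescribed by the disclosure.

```python
# Illustrative scheduling sketch: decide when to reload trained parameters,
# e.g. every 20 cycles or at the transition between the two reads of a
# paired-end run. read1_cycles and update_every are hypothetical values.
def should_reload_parameters(cycle: int, read1_cycles: int = 150,
                             update_every: int = 20) -> bool:
    at_read_transition = cycle == read1_cycles        # read 1 -> read 2 boundary
    at_periodic_update = cycle > 0 and cycle % update_every == 0
    return at_read_transition or at_periodic_update

reload_cycles = [c for c in range(300) if should_reload_parameters(c)]
print(reload_cycles[:5])  # first few cycles at which parameters would be reloaded
```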
In some examples, image data for multiple cycles of sensing data for a block may be sent from the CPU to the wrapper 1690. The wrapper 1690 may optionally perform some preprocessing and transformation of the sensing data and write the information to the on-board DRAM 1699. The input block data for each sensing cycle may comprise arrays of sensor data of about 4000 x 3000 pixels or more per block per sensing cycle, with two features representing the colors of two images of the block and one or two bytes per feature per pixel. For an implementation in which the number N is three sensing cycles to be used in each run of the multi-cycle neural network, the arrays of block data for each run of the multi-cycle neural network may consume on the order of hundreds of megabytes per block. In some implementations of the system, the block data also includes an array of distance-from-center (DFC) data, stored once per block, or other types of metadata about the sensor data and the blocks.
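The memory estimate can be checked with a short back-of-the-envelope calculation using the figures quoted above; the specific byte width and cycle count below are taken from the stated ranges and are only one point in those ranges.

```python
# Rough sizing consistent with the text: ~4000 x 3000 pixels per block per
# cycle, 2 features, up to 2 bytes per value, and N = 3 cycles per run.
pixels = 4000 * 3000
features = 2
bytes_per_value = 2
cycles_per_run = 3
bytes_per_run = pixels * features * bytes_per_value * cycles_per_run
print(f"{bytes_per_run / 1e6:.0f} MB per block per run")
# ~144 MB here; with larger blocks or more cycles this reaches hundreds of MB.
```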
In operation, when a multi-cycle cluster is available, the wrapper allocates patches of block data to the cluster. The wrapper fetches the next patch of block data in its traversal of the block and sends it to the allocated cluster along with appropriate control and configuration information. The cluster may be configured with enough memory on the configurable processor to hold a patch of data, which in some systems includes patches from multiple cycles and is being processed in place, as well as a patch of data that is to be processed when processing of the current patch is completed, using ping-pong buffer techniques or raster scan techniques in various implementations.
When the allocated cluster completes its run of the neural network for the current patch and produces an output patch, it signals the wrapper. The wrapper reads the output patch from the allocated cluster, or alternatively the allocated cluster pushes the data out to the wrapper. The wrapper then assembles output blocks for the processed blocks in the DRAM 1699. When the processing of the entire block has been completed and the output patches of data have been transferred to the DRAM, the wrapper sends the processed output array for the block back to the host/CPU in a specified format. In some implementations, the on-board DRAM 1699 is managed by memory management logic in the wrapper 1690. The runtime program can control the sequencing operations to complete the analysis of all arrays of block data for all cycles in the run in a continuous flow, providing real-time analysis.
Computer system
Fig. 17 is a computer system 1700 that can be used by the sequencing system 500A to implement the base detection techniques disclosed herein. Computer system 1700 includes at least one Central Processing Unit (CPU) 1772 that communicates with a number of peripheral devices via a bus subsystem 1755. These peripheral devices may include a storage subsystem 858 (including, for example, a memory device and file storage subsystem 1736), a user interface input device 1738, a user interface output device 1776, and a network interface subsystem 1774. Input devices and output devices allow user interaction with computer system 1700. The network interface subsystem 1774 provides an interface to an external network, including an interface to a corresponding interface device in other computer systems.
In one implementation, the system controller 1406 is communicatively linked to the storage subsystem 1710 and the user interface input device 1738.
User interface input devices 1738 may include: a keyboard; a pointing device such as a mouse, trackball, touchpad, or tablet; a scanner; a touch screen incorporated into the display; audio input devices such as voice recognition systems and microphones; and other types of input devices. In general, use of the term "input device" is intended to include all possible types of devices and ways to input information into computer system 1700.
User interface output devices 1776 may include a display subsystem, a printer, a facsimile machine, or a non-visual display (such as an audio output device). The display subsystem may include an LED display, a Cathode Ray Tube (CRT), a flat panel device such as a Liquid Crystal Display (LCD), a projection device, or some other mechanism for producing a visible image. The display subsystem may also provide a non-visual display, such as an audio output device. In general, use of the term "output device" is intended to include all possible types of devices and ways to output information from computer system 1700 to a user or to another machine or computer system.
The storage subsystem 858 stores programming structures and data structures that provide the functionality of some or all of the modules and methods described herein. These software modules are typically executed by the deep learning processor 1778.
The deep learning processors 1778 may be Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), and/or Coarse-Grained Reconfigurable Architectures (CGRAs). The deep learning processors 1778 may be hosted by a deep learning cloud platform such as Google Cloud Platform™, Xilinx™, and Cirrascale™. Examples of deep learning processors 1778 include Google's Tensor Processing Unit (TPU)™, rackmount solutions (e.g., GX4 Rackmount Series™, GX9 Rackmount Series™), NVIDIA DGX-1™, Microsoft's Stratix V FPGA™, Graphcore's Intelligent Processor Unit (IPU)™, Qualcomm's Zeroth Platform™ with Snapdragon processors™, NVIDIA's Volta™, NVIDIA's DRIVE PX™, NVIDIA's JETSON TX1/TX2 MODULE™, Intel Nirvana™, Movidius VPU™, Fujitsu DPI™, ARM's DynamIQ™, IBM TrueNorth™, Lambda GPU Server with Tesla V100s™, and others.
The memory subsystem 1722 used in the storage subsystem 858 may include a number of memories, including a main random access memory (RAM) 1732 for storing instructions and data during program execution and a read-only memory (ROM) 1734 in which fixed instructions are stored. The file storage subsystem 1736 may provide persistent storage for program files and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a CD-ROM drive, an optical drive, or removable media cartridges. Modules implementing the functionality of certain implementations may be stored by the file storage subsystem 1736 in the storage subsystem 858, or in other machines accessible to the processor.
Bus subsystem 1755 provides a mechanism for various components and subsystems of computer system 1700 to communicate with each other as desired. Although bus subsystem 1755 is shown schematically as a single bus, alternative embodiments of the bus subsystem may use multiple buses.
Computer system 1700 itself can be of different types, including a personal computer, a portable computer, a workstation, a computer terminal, a network computer, a television, a mainframe, a server farm, a widely distributed group of loosely networked computers, or any other data processing system or user device. Due to the ever-changing nature of computers and networks, the description of computer system 1700 depicted in FIG. 17 is intended only as a specific example for purposes of illustrating the preferred embodiments of the present invention. Many other configurations of computer system 1700 are possible having more or fewer components than the computer system depicted in fig. 17.
Specific embodiments
The disclosed technology provides an artificial intelligence-based base detector with context awareness. The disclosed technology can be practiced as a system, method, or article of manufacture. One or more features of an implementation can be combined with a base implementation. Implementations that are not mutually exclusive are taught to be combinable. One or more features of an implementation can be combined with other implementations. This disclosure periodically reminds the user of these options. Omission from some implementations of recitations that repeat these options should not be taken as limiting the combinations taught in the preceding sections; those recitations are hereby incorporated by reference into each of the following implementations.
The various processes and steps of the methods set forth herein may be performed using a computer. The computer may include a processor that is part of the detection device, networked with the detection device for obtaining data processed by the computer, or separate from the detection device. In some implementations, information (e.g., image data) may be transmitted between components of the systems disclosed herein directly or via a computer network. A Local Area Network (LAN) or Wide Area Network (WAN) may be an enterprise computing network that includes access to the internet to which the computers and computing devices that make up the system are connected. In one implementation, the LAN conforms to the Transmission control protocol/Internet protocol (TCP/IP) industry standard. In some cases, information (e.g., image data) is input to the system disclosed herein via an input device (e.g., a disk drive, an optical disk player, a USB port, etc.). In some cases, the information is received by, for example, loading the information from a storage device, such as a disk or flash drive.
The processor for executing the algorithms or other processes set forth herein may comprise a microprocessor. The microprocessor may be any conventional general-purpose single- or multi-chip microprocessor, such as a Pentium™ processor manufactured by Intel Corporation. A particularly useful computer may utilize an Intel Ivy Bridge dual 12-core processor and an LSI RAID controller, with 128 GB of RAM and a 2 TB solid-state disk drive. Further, the processor may comprise any conventional special-purpose processor, such as a digital signal processor or a graphics processor. The processor typically has conventional address lines, conventional data lines, and one or more conventional control lines.
Implementations disclosed herein may be implemented as a method, apparatus, system, or article of manufacture using standard programming or engineering techniques to produce software, firmware, hardware, or any combination thereof. The term "article of manufacture" as used herein refers to code or logic implemented in hardware or a computer readable medium, such as optical storage devices and volatile or non-volatile memory devices. Such hardware may include, but is not limited to, field Programmable Gate Arrays (FPGAs), application Specific Integrated Circuits (ASICs), complex Programmable Logic Devices (CPLDs), programmable Logic Arrays (PLAs), microprocessors, or other similar processing devices. In particular implementations, the information or algorithms set forth herein reside in a non-transitory storage medium.
In particular implementations, the computer-implemented methods set forth herein can occur in real time as multiple images of the object are being obtained. Such real-time analysis is particularly useful for nucleic acid sequencing applications in which an array of nucleic acids is subjected to repeated cycles of fluidic and detection steps. Analysis of the sequencing data can often be computationally intensive, such that it can be beneficial to perform the methods set forth herein in real time or in the background while other data acquisition or analysis algorithms are in process. Exemplary real-time analysis methods that can be used with the present methods are those used for the MiSeq and HiSeq sequencing devices commercially available from Illumina, Inc. (San Diego, Calif.) and/or described in U.S. patent application publication 2012/0020537 A1, which is incorporated herein by reference.
A system for base detection is disclosed. The system includes a memory, data flow logic, a neural network, and an intensity contextualization unit.
The memory stores an image depicting intensity emissions of a set of analytes. The intensity emission is generated from an analyte in a set of analytes during a sequencing cycle of a sequencing run. The image has intensity values for one or more intensity channels.
The data flow logic has access to the memory and is configured to provide the neural network access to the image on a block-by-block basis. The blocks in the image depict the intensity emissions of a subset of the analytes. The blocks have a single intensity pattern due to the limited base diversity of the analytes in the subset.
The neural network has a plurality of convolution filters. A convolution filter of the plurality of convolution filters has a receive domain that is limited to blocks. The convolution filter is configured to detect intensity patterns in the block with a loss of detection due to the single intensity pattern and the localized receive domain.
The intensity contextualization unit is configured to determine intensity context data based on intensity values in the image and store the intensity context data in the memory.
The data flow logic is configured to append the intensity context data to the block to generate an intensity contextualized image and provide the intensity contextualized image to the neural network.
The neural network is configured to apply a convolution filter to the intensity-contextualized image and generate a base call classification. The intensity context data in these intensity-contextualized images compensates for this detection loss.
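The following condensed sketch, assuming placeholder statistics and a toy network, shows the data path just described: per-image intensity context is computed, tiled into extra channels appended to a block, and passed through a small convolutional classifier. It is an illustration of the idea, not the disclosed architecture or its trained weights.

```python
# Sketch of the data path: compute intensity context from the full image,
# append it to a block as extra channels, and run a small conv classifier.
# The statistics, shapes, and network are placeholders.
import torch
import torch.nn as nn

def intensity_context(image: torch.Tensor) -> torch.Tensor:
    """image: (C, H, W). Returns per-image summary statistics."""
    return torch.stack([image.amax(), image.amin(), image.mean(), image.std()])

def contextualize_block(block: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
    """block: (C, h, w); tile each scalar statistic into a full channel."""
    _, h, w = block.shape
    context_maps = context.view(-1, 1, 1).expand(-1, h, w)
    return torch.cat([block, context_maps], dim=0)

image = torch.rand(2, 2048, 2048)              # two intensity channels
block = image[:, :64, :64]                     # one block of the image
x = contextualize_block(block, intensity_context(image)).unsqueeze(0)
net = nn.Sequential(nn.Conv2d(6, 8, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 4))
base_call_logits = net(x)                      # four-way base call classification
```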
The systems described in this and other portions of the disclosed technology can include one or more of the following features and/or features described in connection with additional systems disclosed. For the sake of brevity, combinations of features disclosed in this application are not individually enumerated and are not repeated with each basic feature set. The reader will understand how features identified in this system can be readily combined with the basic feature sets identified as implementations in other sections of this application.
The intensity context data specifies summary statistics of intensity values. In one implementation, the intensity context data identifies a maximum of the intensity values. In one implementation, the intensity context data identifies a minimum value among the intensity values.
In one implementation, the intensity context data identifies an average of the intensity values. In one implementation, the intensity context data identifies a pattern of intensity values. In one implementation, the intensity context data identifies a standard deviation of the intensity values. In one implementation, the intensity context data identifies a variance of the intensity values.
In one implementation, the intensity context data identifies skewness of intensity values. In one implementation, the intensity context data identifies kurtosis of the intensity values. In one implementation, the intensity context data identifies the entropy of the intensity value.
In one implementation, the intensity context data identifies one or more percentiles of the intensity values. In one implementation, the intensity context data identifies an increment between at least one of a maximum and a minimum, a maximum and a mean, a mean and a minimum, and a higher one of the percentiles and a lower one of the percentiles. In one implementation, the intensity context data identifies a sum of intensity values.
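One way to compute such summary statistics is sketched below with NumPy/SciPy; which statistics are used, the histogram binning for the entropy, and the percentile choices are implementation assumptions rather than requirements of the disclosure.

```python
# Possible summary statistics matching the list above. The 64-bin histogram
# used for the entropy and the 5th/95th percentiles are arbitrary choices.
import numpy as np
from scipy import stats

def summary_statistics(intensity_values: np.ndarray) -> dict:
    v = intensity_values.ravel()
    hist, _ = np.histogram(v, bins=64)
    p = hist / hist.sum()
    return {
        "max": v.max(), "min": v.min(), "mean": v.mean(),
        "std": v.std(), "var": v.var(),
        "skewness": stats.skew(v), "kurtosis": stats.kurtosis(v),
        "entropy": stats.entropy(p[p > 0]),
        "p05": np.percentile(v, 5), "p95": np.percentile(v, 95),
        "delta_max_min": v.max() - v.min(),
        "sum": v.sum(),
    }
```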
In one implementation, the intensity contextualization unit determines the plurality of maximum values by dividing the intensity values into a plurality of groups and determining a maximum value for each of the groups. The intensity context data identifies a minimum value of the plurality of maximum values.
In one implementation, the intensity contextualization unit determines the plurality of minimum values by dividing the intensity values into a plurality of groups and determining a minimum value for each of the groups. The intensity context data identifies a maximum value of a plurality of minimum values.
In one implementation, the intensity contextualization unit determines the plurality of sums by dividing the intensity values into a plurality of groups and determining a sum of the intensity values in each of the groups. The intensity context data identifies a minimum value of the plurality of sums. In other implementations, the intensity context data identifies a maximum of the plurality of sums. In yet other implementations, the intensity context data identifies an average of a plurality of sums.
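A sketch of the group-wise variants follows; the number of groups and the flattened (rather than region-based) grouping are assumptions made for brevity.

```python
# Group-wise context sketch: split the intensity values into groups, reduce
# within each group, then reduce across groups.
import numpy as np

def grouped_context(intensity_values: np.ndarray, num_groups: int = 16) -> dict:
    groups = np.array_split(intensity_values.ravel(), num_groups)
    group_max = np.array([g.max() for g in groups])
    group_min = np.array([g.min() for g in groups])
    group_sum = np.array([g.sum() for g in groups])
    return {
        "min_of_maxes": group_max.min(),   # less sensitive to outliers than a global max
        "max_of_mins": group_min.max(),
        "min_of_sums": group_sum.min(),
        "max_of_sums": group_sum.max(),
        "mean_of_sums": group_sum.mean(),
    }
```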
In one implementation, the intensity contextualization unit has a plurality of convolution pipelines. Each of the convolution pipelines has a plurality of convolution filters. Convolution filters of the plurality of convolution filters have different filter sizes. The convolution filters have different filter strides.
In one implementation, each convolution pipeline in the convolution pipeline processes the image to generate multiple convolution representations of the image.
In one implementation, the intensity context data has a context channel for each of a plurality of convolution representations. The context channel has as many concatenated copies of the respective ones of the convolutional representations as needed to match the size of the image. In some implementations, the size of each convolution representation is 1 × 1. The cascaded copies are appended to the image pixel by pixel.
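The sketch below illustrates these context channels under stated assumptions: each pipeline reduces the image to a 1 x 1 representation (here via a convolution followed by adaptive average pooling), which is then tiled to the image size and appended pixel by pixel. The filter sizes, strides, and pooling step are placeholders.

```python
# Context-channel sketch: each convolution pipeline yields a 1x1 representation,
# tiled (i.e., concatenated copies) to the image size and appended as a channel.
import torch
import torch.nn as nn

class ContextPipeline(nn.Module):
    def __init__(self, in_ch: int, kernel: int, stride: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 1, kernel_size=kernel, stride=stride)
        self.pool = nn.AdaptiveAvgPool2d(1)               # collapse to 1x1

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.pool(self.conv(image))                 # (B, 1, 1, 1)

def append_context_channels(image, pipelines):
    b, _, h, w = image.shape
    channels = [p(image).expand(b, 1, h, w) for p in pipelines]  # tile to H x W
    return torch.cat([image] + channels, dim=1)

image = torch.rand(1, 2, 64, 64)
pipes = [ContextPipeline(2, k, s) for k, s in [(3, 1), (5, 2), (7, 3)]]
out = append_context_channels(image, pipes)
print(out.shape)  # torch.Size([1, 5, 64, 64])
```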
A computer-implemented method of base detection is disclosed. The method comprises the following steps: an image depicting intensity emissions of a set of analytes is accessed. The intensity emission is generated from an analyte in a set of analytes during a sequencing cycle of a sequencing run. The method comprises the following steps: the image is processed on a block-by-block basis, and blocks are generated therefrom. The blocks depict intensity emissions of a subset of the analytes. The method comprises the following steps: intensity context data is determined based on intensity values in the image. The method comprises the following steps: intensity context data is appended to the block and an intensity contextualized image is generated. The method comprises the following steps: the intensity contextualized image is processed and a base detected classification is generated.
Other implementations of the methods described in this section may include a non-transitory computer-readable storage medium storing instructions executable by a processor to perform any of the methods described above. Yet another implementation of the methods described in this section can include a system comprising a memory and one or more processors operable to execute instructions stored in the memory to perform any of the methods described above.
In one implementation, a self-normalizing neural network is disclosed. The self-normalized neural network includes a normalization layer (e.g., intensity contextualization unit 112). The normalization layer is configured to determine one or more normalization parameters from the inputs on an input-by-input basis. The normalization layer is further configured to append contextual data characterizing the normalization parameters to the blocks accessed from the input. Consider, for example, two inputs, such as two images. The normalization layer then determines a first set of normalization parameters for the first image and a second set of normalization parameters for the second image. This is different from other normalization techniques like batch normalization, which learns a fixed set of normalization parameters and uses them for the bulk input. In contrast, the normalization parameters determined by the disclosed normalization layer are specific to a given input and are determined at runtime (e.g., at inference). During training, the normalization layer is trained to generate normalization parameters specific to the subject input.
The self-normalizing neural network also includes runtime logic. The runtime logic is configured to process the block with the context data appended thereto by the self-normalizing neural network to generate an output.
In one implementation, the normalization layer is further configured to determine, at runtime, respective normalization parameters for respective inputs. In another implementation, the normalization parameter is a summary statistic about intensity values in the input.
In one implementation, the context data includes summary statistics for pixel-by-pixel encoding. In one implementation, context data is encoded into blocks pixel by pixel.
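A minimal sketch of the per-input normalization idea follows, contrasting with batch normalization by computing statistics from the subject input at run time and encoding them pixel by pixel as context channels. The choice of mean and standard deviation, and the way the context is appended, are assumptions for illustration.

```python
# Per-input normalization sketch: statistics are derived from each input itself
# (one set per input in the batch) and appended as pixel-wise context channels,
# rather than learned from a training batch as in batch normalization.
import torch
import torch.nn as nn

class InputNormalizationLayer(nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W); one set of normalization parameters per input
        mean = x.mean(dim=(1, 2, 3), keepdim=True)
        std = x.std(dim=(1, 2, 3), keepdim=True).clamp_min(1e-6)
        context = torch.cat([mean, std], dim=1).expand(-1, 2, x.shape[2], x.shape[3])
        return torch.cat([x, context], dim=1)   # context encoded pixel by pixel

x = torch.rand(2, 2, 64, 64)                    # two inputs, each with its own stats
print(InputNormalizationLayer()(x).shape)       # torch.Size([2, 4, 64, 64])
```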
While the present invention is disclosed by reference to the preferred embodiments and examples detailed above, it is to be understood that these examples are intended in an illustrative rather than in a limiting sense. It is contemplated that modifications and combinations will readily occur to those skilled in the art, which modifications and combinations will be within the spirit of the invention and the scope of the following claims.

Claims (32)

1. A system for base detection, the system comprising:
a memory storing an image depicting intensity emissions of a set of analytes generated by analytes in the set of analytes during a sequencing cycle of a sequencing run;
data flow logic accessible to the memory and configured to provide neural network access to the image on a block-by-block basis, a block in an image depicting the intensity emissions of a subset of the analytes, and the block having a single intensity pattern due to limited base diversity of analytes in the subset;
a neural network having a plurality of convolution filters, a convolution filter of the plurality of convolution filters having a receive domain confined to the block and the convolution filter being configured to detect an intensity pattern in the block with a detection penalty due to the single intensity pattern and the confined receive domain;
an intensity contextualization unit configured to determine intensity context data based on intensity values in the image and to store the intensity context data in the memory;
the data flow logic is configured to append the intensity context data to the block to generate an intensity contextualized image and provide the intensity contextualized image to the neural network; and
the neural network is configured to apply the convolution filter to the intensity-contextualized image and generate a base-detect classification, the intensity context data in the intensity-contextualized image compensating for the detection loss.
2. The system of claim 1, wherein the image has the intensity values of one or more intensity channels.
3. The system of claim 1 or 2, wherein the intensity context data specifies summary statistics of the intensity values.
4. The system of any of claims 1 to 3, wherein the intensity context data identifies a maximum of the intensity values.
5. The system of any of claims 1 to 4, wherein the intensity context data identifies a minimum of the intensity values.
6. The system of any of claims 1 to 5, wherein the intensity context data identifies an average of the intensity values.
7. The system of any of claims 1 to 6, wherein the intensity context data identifies a pattern of the intensity values.
8. The system of any of claims 1 to 7, wherein the intensity context data identifies a standard deviation of the intensity values.
9. The system of any of claims 1 to 8, wherein the intensity context data identifies a variance of the intensity values.
10. The system of any of claims 1 to 9, wherein the intensity context data identifies skewness of the intensity values.
11. The system of any of claims 1 to 10, wherein the intensity context data identifies kurtosis of the intensity values.
12. The system of any of claims 1 to 11, wherein the intensity context data identifies an entropy of the intensity values.
13. The system of any of claims 1 to 12, wherein the intensity context data identifies one or more percentiles of the intensity values.
14. The system of any of claims 4 to 13, wherein the intensity context data identifies an increment between at least one of the maximum and minimum values, the maximum and average values, the average and minimum values, and a higher one of the percentiles and a lower one of the percentiles.
15. The system of any of claims 1 to 14, wherein the intensity context data identifies a sum of the intensity values.
16. The system according to any one of claims 1 to 15, wherein the intensity contextualization unit determines a plurality of maximum values by dividing the intensity values into a plurality of groups and determining a maximum value for each of the groups, and
wherein the intensity context data identifies a minimum value of the plurality of maximum values.
17. The system of any of claims 1 to 16, wherein the intensity contextualization unit determines a plurality of minima by dividing the intensity values into a plurality of groups and determining a minimum value for each of the groups, wherein the intensity context data identifies a maximum value of the plurality of minima.
18. The system of any of claims 1 to 17, wherein the intensity contextualization unit determines a plurality of sums by dividing the intensity values into a plurality of groups and determining a sum of intensity values in each of the groups, wherein the intensity context data identifies a minimum value of the plurality of sums.
19. The system of claim 18, wherein the intensity context data identifies a maximum value of the plurality of sums.
20. The system of claim 18 or 19, wherein the intensity context data identifies an average of the plurality of sums.
21. The system of any of claims 1-20, wherein the intensity contextualization unit has a plurality of convolution pipelines, wherein each of the convolution pipelines has a plurality of convolution filters, wherein convolution filters of the plurality of convolution filters have different filter sizes, and wherein the convolution filters have different filter steps.
22. The system of claim 21, wherein each of the convolution pipelines processes an image to generate a plurality of convolution representations of the image.
23. The system of claim 21 or 22, wherein the intensity context data has a context channel for each of the plurality of convolutional representations, wherein the context channel has as many concatenated copies of the respective one of the convolutional representations as needed to match the size of the image.
24. The system of claim 22 or 23, wherein each convolution representation has a size of 1 x1, wherein the concatenated copy is appended to the image pixel by pixel.
25. A computer-implemented method of base detection, the method comprising:
accessing an image depicting intensity emissions of a set of analytes generated by analytes in the set of analytes during a sequencing cycle of a sequencing run;
processing the image on a block-by-block basis to generate blocks depicting the intensity emissions for a subset of the analytes;
determining intensity context data based on intensity values in the image;
appending the intensity context data to the block and generating an intensity contextualized image; and
processing the intensity contextualized image and generating a base call classification.
26. A system comprising one or more processors coupled with memory loaded with computer instructions to perform base detection, the instructions when executed on the processors performing a plurality of acts comprising:
accessing an image depicting intensity emissions of a set of analytes generated by analytes in the set of analytes during a sequencing cycle of a sequencing run;
processing the image on a block-by-block basis to generate blocks depicting the intensity emissions of a subset of the analytes;
determining intensity context data based on intensity values in the image;
appending the intensity context data to the block and generating an intensity contextualized image; and
processing the intensity contextualized image and generating a base call classification.
27. A non-transitory computer readable storage medium imprinted with computer program instructions for base detection, the instructions when executed on a processor implementing a method comprising:
accessing an image depicting intensity emissions of a set of analytes generated by analytes in the set of analytes during a sequencing cycle of a sequencing run;
processing the image on a block-by-block basis to generate blocks depicting the intensity emissions for a subset of the analytes;
determining intensity context data based on intensity values in the image;
appending the intensity context data to the block and generating an intensity contextualized image; and
processing the intensity contextualized image and generating a base call classification.
28. A self-normalizing neural network, the self-normalizing neural network comprising:
a normalization layer configured to determine one or more normalization parameters from an input on an input-by-input basis and to append context data characterizing the normalization parameters to blocks accessed from the input; and
runtime logic configured to process the block with the context data appended thereto by the self-normalizing neural network to generate an output.
29. The self-normalizing neural network of claim 28, wherein the normalization layer is further configured to determine, at run-time, respective normalization parameters for respective inputs.
30. The self-normalizing neural network of claim 28 or 29, wherein the normalization parameter is a summary statistic about intensity values in the input.
31. The self-normalizing neural network of any one of claims 28-30, wherein the context data comprises summarized statistics of pixel-by-pixel encoding.
32. The self-normalizing neural network of any one of claims 28-31, wherein the context data is encoded to the block on a pixel-by-pixel basis.
CN202280005054.XA 2021-03-31 2022-03-24 Artificial intelligence-based base detector with context awareness Pending CN115803816A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US202163169163P 2021-03-31 2021-03-31
US63/169163 2021-03-31
US17/687586 2022-03-04
US17/687,586 US20220319639A1 (en) 2021-03-31 2022-03-04 Artificial intelligence-based base caller with contextual awareness
PCT/US2022/021814 WO2022212180A1 (en) 2021-03-31 2022-03-24 Artificial intelligence-based base caller with contextual awareness

Publications (1)

Publication Number Publication Date
CN115803816A true CN115803816A (en) 2023-03-14

Family

ID=81579662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280005054.XA Pending CN115803816A (en) 2021-03-31 2022-03-24 Artificial intelligence-based base detector with context awareness

Country Status (5)

Country Link
EP (1) EP4315343A1 (en)
CN (1) CN115803816A (en)
AU (1) AU2022248999A1 (en)
CA (1) CA3183578A1 (en)
WO (1) WO2022212180A1 (en)

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03147799A (en) 1989-11-02 1991-06-24 Hoechst Japan Ltd Novel oligonucleotide probe
CN100462433C (en) 2000-07-07 2009-02-18 维西根生物技术公司 Real-time sequence determination
US7211414B2 (en) 2000-12-01 2007-05-01 Visigen Biotechnologies, Inc. Enzymatic nucleic acid synthesis: compositions and methods for altering monomer incorporation fidelity
US7057026B2 (en) 2001-12-04 2006-06-06 Solexa Limited Labelled nucleotides
DE10219153A1 (en) 2002-04-29 2003-11-20 Siemens Ag Procedure for checking the continuity of connections in MPLS networks
KR100968736B1 (en) 2002-05-16 2010-07-08 다우 코닝 코포레이션 Flame retardant compositions
KR101003323B1 (en) 2002-08-07 2010-12-22 미쓰비시 가가꾸 가부시키가이샤 Image forming material having bluish-violet laser-photosensitive resist material layer and resist image forming method therefor
SI3587433T1 (en) 2002-08-23 2020-08-31 Illumina Cambridge Limited Modified nucleotides
GB0321306D0 (en) 2003-09-11 2003-10-15 Solexa Ltd Modified polymerases for improved incorporation of nucleotide analogues
KR100955023B1 (en) 2003-12-01 2010-04-27 엘지전자 주식회사 Method for recording data continuous in digital complex recorder
WO2005065514A1 (en) 2004-01-12 2005-07-21 Djibril Soumah Toilet seat
CA2579150C (en) 2004-09-17 2014-11-25 Pacific Biosciences Of California, Inc. Apparatus and method for analysis of molecules
US20060111346A1 (en) 2004-11-23 2006-05-25 Fazix Corporation. Methods of modulating high-density lipoprotein cholesterol levels and pharmaceutical formulations for the same
US20060251471A1 (en) 2005-05-06 2006-11-09 Wei-Gen Chen Manual adjustment device for headlamps
DE102005036355A1 (en) 2005-07-29 2007-02-01 Cairos Technologies Ag Method for measuring the power and moving ratios on a ball comprises using an electronic device arranged in the ball for measuring the physical forces acting on the ball and an emitter for transferring the forces to an evaluation unit
GB0517097D0 (en) 2005-08-19 2005-09-28 Solexa Ltd Modified nucleosides and nucleotides and uses thereof
EP4105644A3 (en) 2006-03-31 2022-12-28 Illumina, Inc. Systems and devices for sequence by synthesis analysis
JPWO2007145365A1 (en) 2006-06-14 2009-11-12 学校法人自治医科大学 Cancer therapeutic agent and screening method thereof
US7414716B2 (en) 2006-10-23 2008-08-19 Emhart Glass S.A. Machine for inspecting glass containers
US8965076B2 (en) 2010-01-13 2015-02-24 Illumina, Inc. Data processing system and methods
US11210554B2 (en) * 2019-03-21 2021-12-28 Illumina, Inc. Artificial intelligence-based generation of sequencing metadata
US11783917B2 (en) * 2019-03-21 2023-10-10 Illumina, Inc. Artificial intelligence-based base calling

Also Published As

Publication number Publication date
EP4315343A1 (en) 2024-02-07
CA3183578A1 (en) 2022-10-06
WO2022212180A1 (en) 2022-10-06
AU2022248999A1 (en) 2023-02-02

Similar Documents

Publication Publication Date Title
US20210265016A1 (en) Data Compression for Artificial Intelligence-Based Base Calling
US11749380B2 (en) Artificial intelligence-based many-to-many base calling
US20210265015A1 (en) Hardware Execution and Acceleration of Artificial Intelligence-Based Base Caller
US20220067489A1 (en) Detecting and Filtering Clusters Based on Artificial Intelligence-Predicted Base Calls
CN115803816A (en) Artificial intelligence-based base detector with context awareness
US20220319639A1 (en) Artificial intelligence-based base caller with contextual awareness
US20230005253A1 (en) Efficient artificial intelligence-based base calling of index sequences
CN117501373A (en) Efficient base detection of index sequences based on artificial intelligence
WO2023009758A1 (en) Quality score calibration of basecalling systems
CN115699019A (en) Neural network parameter quantification for base detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination