CN115409174A - Base sequence filtering method and device based on DRAM memory calculation - Google Patents

Info

Publication number
CN115409174A
Authority
CN
China
Prior art keywords
data
row
base sequence
calculation
line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211354686.5A
Other languages
Chinese (zh)
Other versions
CN115409174B (en
Inventor
杨弢
毛旷
汤昭荣
潘秋红
叶茂伟
黄智华
王京
王颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202211354686.5A priority Critical patent/CN115409174B/en
Publication of CN115409174A publication Critical patent/CN115409174A/en
Application granted granted Critical
Publication of CN115409174B publication Critical patent/CN115409174B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/123 DNA computing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a base sequence filtering method and device based on DRAM memory calculation. The method comprises the following steps: step one, according to the row width of the DRAM storage array and the start address of the target base sequence to be screened, the target base sequence is screened out and then rearranged and combined; step two, the rearranged and combined target base sequence is marked for the bases A (adenine), G (guanine), C (cytosine) and T (thymine) respectively, to obtain a marker row for each base; step three, the marker row data are shifted and the number of positions with value 1 in each marker row is counted, to obtain the count for the corresponding base; step four, the counts of the reference base sequence are compared with the counts of the target base sequence, and the screened target base sequence is filtered accordingly. The invention performs position-matching screening inside the memory subarray, which reduces the transfer of large amounts of data between the CPU and the memory, multiplies computational efficiency, and reduces power consumption.

Description

Base sequence filtering method and device based on DRAM memory calculation
Technical Field
The invention relates to the field of in-memory computing, and in particular to a base sequence filtering method and device based on DRAM memory calculation.
Background
Genes are functional fragments of DNA (deoxyribonucleic acid) molecules that carry genetic information, and they underpin the basic structure and properties of life. The prior art already has a mature processing pipeline for DNA samples, which generally consists of three steps: DNA sequencing, DNA sequence ordering, and gene mapping and mutation detection. In DNA sequencing, a DNA sequencer extracts the DNA of a biological sample and converts it into data sequences (reads) that a computer can recognize. Generally, the base sequence formed by the four linked bases A, C, T and G (A: adenine, G: guanine, C: cytosine, T: thymine) is identified chemically and then converted into a character string composed of the four characters A, C, T and G. One read is a DNA fragment of fixed length and is the basic unit for subsequent DNA sequence processing. For example, referring to FIG. 1, if the read length is 10 bp (base pairs), the data sequence TCCTAATCTG is one read. DNA sequencing produces a large pile of DNA reads, but the order of these reads is unknown. DNA sequence ordering compares these unordered reads with a putative DNA reference sequence to obtain the best-match position of each read in the reference sequence.
Because the data volume of DNA is very large, sequence fragments are usually screened and filtered before ordering. Screening and filtering are well suited to parallel computation, and in-memory computing provides a good platform for it: it reduces the repeated movement of large amounts of data and can effectively improve system performance.
In modern computer systems, the movement of data between compute units and memory accounts for a significant share of system power consumption and program run time. With the advent of multi-core processors, more and more cores are integrated into the same chip, but the total memory bandwidth does not increase proportionally. This creates a mismatch between computing power and data transfer and leads to the so-called "memory wall" problem. Meanwhile, although computing resources have increased, the communication latency between the computing resources and dynamic random access memory (hereinafter "DRAM") has not improved, so data movement has become one of the system bottlenecks.
To address these challenges, many new computing approaches have been proposed in succession, including near-memory computing, processor-in-memory, and in-memory computing. In-memory computing is one of the key technologies for solving the memory wall problem. As the name suggests, it performs operations inside the memory, and it can significantly reduce the computation latency and power consumption caused by data exchange. Various in-memory computing technologies are emerging based on different storage media, including RRAM, PCM, STT-MRAM and DRAM. A common DRAM memory microarchitecture is shown in FIG. 2.
However, because the amount of base sequence data is large and there are many candidate matching positions, the computational load remains huge when conventional methods are used.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a base sequence filtering method and device based on DRAM memory calculation. Position-matching screening is carried out inside the memory subarray, that is, by DRAM memory calculation, based on the principle that the charging and discharging of DRAM capacitors can complete basic logic operations. The numbers of A, G, C and T bases in a segment of a gene sequence, namely the reference sequence, are counted and compared with the base counts of the target sequence; if the difference exceeds a certain threshold, that segment is considered not to match the target sequence. This achieves the purpose of screening and positioning and supports the ordering calculation of the base sequence. The specific technical scheme is as follows:
a base sequence filtering method based on DRAM memory calculation comprises the following steps:
step one, according to the row width of the DRAM storage array and the start address of the target base sequence to be screened, screening out the target base sequence and then rearranging and combining it;
step two, marking the rearranged and combined target base sequence for the bases A (adenine), G (guanine), C (cytosine) and T (thymine) respectively, to obtain a marker row for each base;
step three, performing a shift operation on the marker row data and then counting the number of positions with value 1 in each marker row, to obtain the counts of A (adenine), G (guanine), C (cytosine) and T (thymine);
step four, comparing the counts of the reference base sequence with the counts of the target base sequence, and filtering the screened target base sequence.
Further, step one is specifically:
recording the number of invalid information data and the positions of the invalid information data according to the length of the target base sequence and the column width of the storage array;
setting start-segment mask data and tail-segment mask data according to the screening start address and writing them into the storage array;
performing a bitwise AND of the target row data sequence of the previous row occupied by the target base sequence with the start-segment mask data to obtain valid start-segment target row data; performing a bitwise AND of the target row data sequence of the next row occupied by the target base sequence with the tail-segment mask data to obtain valid tail-segment target row data;
performing a bitwise OR of the valid start-segment target row data and the valid tail-segment target row data, merging the valid parts of the two rows of data into one row to obtain complete valid target base sequence data, wherein the start part and the tail part of the valid target base sequence data are in the same row and do not overlap;
performing a row-column conversion operation on the valid target base sequence data in the row storage format to obtain first high-bit data and first low-bit data arranged in columns in the storage array;
generating a column-type GCT mask and an A mask according to the number and positions of the invalid information data, wherein the GCT mask is ANDed with the first high-bit data and the first low-bit data respectively, setting all irrelevant data to 0, to generate column-type GCT data, which are stored as second high-bit data and second low-bit data and are then inverted bitwise and stored as third high-bit data and third low-bit data; the A mask is ORed with the first high-bit data and the first low-bit data respectively, setting all irrelevant data to 1, to generate column-type A data, which are stored as fourth high-bit data and fourth low-bit data.
Further, the start-segment mask data and the tail-segment mask data are composed of 0 and M, where M consists of two binary 1 bits; when 0 is ANDed with the target row sequence data, the irrelevant data are set to 0, and when 1 is ANDed with the target row sequence data, the valid data are retained.
Further, the specific method for marking the base A (adenine) in the target base sequence in step two comprises the following steps:
copying the fourth high-bit data and the fourth low-bit data, in which the column-type A data are stored, to the first row and the second row of the calculation area of the storage array respectively;
performing a bitwise OR of the data in the first row and the second row to obtain a result R1;
performing an inversion operation on the result R1;
copying the inverted result to the A (adenine) marker row in the storage array, in which positions containing A have value 1 and all other positions have value 0.
Further, the specific method for marking the base C (cytosine) in the target base sequence in step two comprises the following steps:
copying the second high-bit data and the second low-bit data of the target base sequence to the first row and the second row of the calculation area of the storage array respectively;
performing a bitwise AND of the data in the first row and the second row of the calculation area to obtain a result R2;
copying the result R2 to the C (cytosine) marker row in the storage array, in which positions containing C have value 1 and all other positions have value 0.
Further, the specific method for marking the base G (guanine) in the target base sequence in step two comprises the following steps:
copying the third low-bit data to the first row of the calculation area of the storage array;
copying the second high-bit data to the second row of the calculation area;
performing a bitwise AND of the data in the first row and the second row of the calculation area to obtain a result R3;
copying the result R3 to the G (guanine) marker row in the storage array, in which positions containing G have value 1 and all other positions have value 0.
Further, the specific method for marking the base T (thymine) in the target base sequence in step two comprises the following steps:
copying the third high-bit data to the first row of the calculation area of the storage array;
copying the second low-bit data to the second row of the calculation area;
performing a bitwise AND of the data in the first row and the second row to obtain a result R4;
copying the result R4 to the T (thymine) marker row in the storage array, in which positions containing T have value 1 and all other positions have value 0.
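The four marking procedures above can be summarized as per-column Boolean expressions. The symbols are introduced here only for illustration: $H$ and $L$ denote the second high-bit and low-bit data (after the GCT mask), and $H_A$ and $L_A$ denote the fourth high-bit and low-bit data (after the A mask).

```latex
\[
\begin{aligned}
R_A &= \overline{H_A \lor L_A} && \text{(A encoded as 00)}\\
R_C &= H \land L               && \text{(C encoded as 11)}\\
R_G &= \overline{L} \land H    && \text{(G encoded as 10)}\\
R_T &= \overline{H} \land L    && \text{(T encoded as 01)}
\end{aligned}
\]
```

Because the A mask forces invalid columns to 1 and the GCT mask forces them to 0, none of the four expressions evaluates to 1 in an invalid column.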
Further, the counting in step three comprises the following three steps:
step 1, a column counter and a shift counter are used; first, it is judged whether the value n of the current column counter is 1; if n is not 1, the marker row is read and the read result is shifted left by 2 to the power i bits, where i is the value of the shift counter; after the shift operation is completed, the value i of the shift counter is incremented by 1 and the shifted result is written back to the DRAM subarray where the marker row is located; the original marker row is denoted row a and the shifted result is denoted row a_s; if n is 1, the calculation ends and step 3 is entered;
step 2, the data of row a and row a_s are copied to the first row and the second row of the calculation area of the storage array, and the data in the same column are summed, that is, an exclusive-OR of the first row and the second row gives the sum s, and an AND of the first row and the second row gives the carry term c of the sum; the summation result is written back to the temporary storage area of the storage array; the value n of the current column counter is divided by 2 and it is judged whether the result is 1; if the result is 1, the calculation ends; if the result is not 1, a new round of shifting and summation is performed on the summation result in the temporary storage area, that is, the operation of step 1 is performed again, the calculation result is shifted, and the shifted result is accumulated column by column; each time a summation is completed, the calculation result grows by one row;
step 3, when the value n of the column counter is finally judged to be 1, the final result is obtained in the first row of the calculation result and is stored.
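One way to read steps 1 to 3 is as a doubling recurrence. The notation is an assumption introduced for illustration: $a_j$ is the bit in column $j$ of the marker row, $S_k(j)$ is the multi-bit partial sum held in column $j$ after $k$ rounds, columns beyond $N-1$ are taken as 0, and the left shift is assumed to move column $j+2^{k}$ into column $j$.

```latex
\[
S_0(j) = a_j, \qquad
S_{k+1}(j) = S_k(j) + S_k\!\left(j + 2^{k}\right)
\;\Longrightarrow\;
S_k(j) = \sum_{t=0}^{2^{k}-1} a_{j+t}, \qquad
S_{\log_2 N}(0) = \sum_{j=0}^{N-1} a_j .
\]
```

Under these assumptions the count is complete after $\log_2 N$ rounds, which is consistent with the column counter n being halved from N down to 1.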
Further, step four specifically comprises: placing the complement values of the obtained counts of A (adenine), G (guanine), C (cytosine) and T (thymine) of the target base sequence in the same row in column form, placing the complement values of the counts of the reference base sequence in the corresponding columns, calculating the differences, and finally summing the four differences and comparing the sum with a threshold; if the sum of the differences is greater than the threshold, the target base sequence is excluded; if it is less than the threshold, the sequence at the screening position is marked as the target base sequence.
A base sequence filtering device based on DRAM memory calculation comprises:
a storage array, composed of DRAM subarrays and used for storing the target base sequence, in which the base information is represented in binary, specifically: A (adenine) is 00, G (guanine) is 10, C (cytosine) is 11 and T (thymine) is 01, so that each base consists of 2 bits of data;
the DRAM subarray has a width of N, that is, each row has N columns of storage cells; storing base sequence information of length N requires two rows of storage cells, denoted the H row, which stores the high bits, and the L row, which stores the low bits, so that the high bit and the low bit of the same base are stored in the same column;
the storage array is provided with a calculation area for data calculation, an original data storage area for storing the original base sequence data, a column-type data area for storing data converted into the column format, and a temporary storage area for temporarily storing intermediate results generated during calculation;
a control module, which receives external addresses, data and commands, decodes them, sends the resulting control signals to the word line controller, the bit line controller, the shift and inversion module and the buffer, converts the base sequence data from the same-row format to the same-column format, writes the data into the DRAM subarray, and controls the calculation process; the word line controller controls the row signals of the storage array, and the bit line controller controls the column signals of the storage array; the buffer is used for buffering data; the shift and inversion module comprises a shift module and an inversion module and can perform shift and inversion operations on a row of data as the calculation requires;
a counting module, which contains a group of counters including a shift counter and a column counter; when the reference sequence is written, it counts the different types of bases respectively, records the final result, and writes the A, G, C and T counts of the reference sequence to a fixed address in the target array.
Beneficial effects:
the invention performs position-matching screening inside the memory subarray, which reduces the transfer of large amounts of data between the CPU and the memory, multiplies computational efficiency, and reduces power consumption.
Drawings
FIG. 1 is a schematic diagram of the base sequence read in a DNA sequence fragment;
FIG. 2 is a schematic diagram of a generic DRAM memory microarchitecture;
FIG. 3 is a schematic block diagram of a DRAM memory-based base sequence filtering apparatus according to the present invention;
FIG. 4 is a schematic flow chart of a base sequence filtering method based on DRAM memory calculation according to the present invention;
FIG. 5 is a flow chart showing details of step one of the method of the present invention;
FIG. 6 is a schematic diagram showing a manner of storing data when a target base sequence is rearranged and combined in a memory array of the apparatus of the present invention;
FIGS. 7 to 10 are schematic views showing the manner of storing data when labeling the base A adenine, the base C cytosine, the base G guanine, and the base T thymine in the target base sequence according to the present invention;
FIG. 11 is a schematic diagram of a piece of base sequence information stored in a memory array according to an embodiment of the present invention;
FIG. 12 is a schematic diagram showing a manner of storing data when a target base sequence is rearranged and combined in a memory array according to an embodiment of the present invention;
FIG. 13 is a diagram illustrating a portion of data of a binary representation after shifting according to an embodiment of the present invention;
FIG. 14 is a schematic diagram showing a data storage method in the memory array according to the embodiment of the present invention when AND operation is performed on a base GCT mask;
FIG. 15 is a schematic diagram illustrating a storage manner of data when an OR operation is performed on a base A mask in a memory array according to an embodiment of the present invention;
FIGS. 16 to 19 are schematic views showing the manner of storing data when labeling the base A adenine, the base C cytosine, the base G guanine, and the base T thymine in the target base sequence according to the example of the present invention;
FIGS. 20 to 26 are data diagrams illustrating the shifting and column-wise summation used to count the positions with value 1 in the marker row of each base type, according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples.
As shown in FIG. 3, the present invention provides a base sequence filtering device based on DRAM memory calculation, comprising:
the storage array 404, which is composed of DRAM subarrays and is used for storing the target base sequence; the base information is represented in binary, specifically: A (adenine) is 00, G (guanine) is 10, C (cytosine) is 11 and T (thymine) is 01, so that each base consists of 2 bits of data;
the DRAM subarray has a width of N, that is, each row has N columns of storage cells; storing base sequence information of length N requires two rows of storage cells, denoted the H row, which stores the high bits, and the L row, which stores the low bits, so that the high bit and the low bit of the same base are stored in the same column;
the storage array 404 is provided with a calculation area 501 for data calculation, an original data storage area for storing the original base sequence data, a column-type data area for storing data converted into the column format, and a temporary storage area 4010 for temporarily storing intermediate results generated during calculation.
The control module 403 receives external addresses, data and commands, decodes them, sends the resulting control signals to the word line controller 402, the bit line controller 401, the shift and inversion module 406 and the buffer 405, converts the base sequence data from the same-row format to the same-column format, writes the data into the DRAM subarray, and controls the calculation process. The word line controller 402 controls the row signals of the storage array 404, and the bit line controller 401 controls the column signals of the storage array 404. The buffer 405 is used for buffering data. The shift and inversion module 406 comprises a shift module and an inversion module and can perform shift and inversion operations on a row of data as the calculation requires.
The counting module 407 contains a group of counters. When the reference sequence is written, it counts the different types of bases respectively, records the final result, and writes the A, G, C and T counts of the reference sequence to a fixed address in the target array target_array, denoted target_A, target_G, target_C and target_T.
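The 2-bit encoding and the H row / L row layout described above can be illustrated with a minimal Python sketch. It models each DRAM row as a list of bits; the names `BASE_CODE` and `encode_columns` are hypothetical and are not taken from the patent.

```python
# 2-bit codes as stated in the text: A=00, G=10, C=11, T=01 (high bit, low bit).
BASE_CODE = {"A": (0, 0), "G": (1, 0), "C": (1, 1), "T": (0, 1)}


def encode_columns(seq):
    """Return (H row, L row) as bit lists; column j holds both bits of base j."""
    h_row = [BASE_CODE[base][0] for base in seq]  # high bits
    l_row = [BASE_CODE[base][1] for base in seq]  # low bits
    return h_row, l_row


# The read from FIG. 1 of the text, used only as sample input.
h_row, l_row = encode_columns("TCCTAATCTG")
print("H:", h_row)
print("L:", l_row)
```

In this layout a bitwise operation on the H row and the L row acts on all bases of the sequence at once, which is what the later marking steps rely on.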
Based on the above base sequence filtering device, the base sequence filtering method based on DRAM memory calculation adopted by the invention is shown in FIG. 4 and specifically comprises the following steps:
Step one, in the statistical data preparation stage, according to the screening start address set by the system, a target base sequence with a length less than N is screened out of the storage array 404 and is rearranged and combined.
As shown in FIG. 5 and FIG. 6, the control module 403 records the number of invalid information data and the invalid information data positions 4006 according to the length of the target base sequence and the column width of the storage array 404. Start-segment mask data 4003 and tail-segment mask data 4004 are set according to the screening start address and written into the storage array 404. The mask data are composed of 0 and M, where M consists of two binary 1 bits: when 0 is ANDed with the sequence data, the irrelevant data are set to 0, and when 1 is ANDed with the sequence data, the valid data are retained. A bitwise AND of the target row data sequence 4001, in the previous row occupied by the target base sequence, with the start-segment mask data 4003 gives the valid start-segment target row data 4001_1. A bitwise AND of the target row data sequence 4002, in the next row, with the tail-segment mask data 4004 gives the valid tail-segment target row data 4002_1. A bitwise OR of the valid start-segment and tail-segment target row data 4001_1 and 4002_1 merges the valid parts of the two rows into one row, giving complete valid target base sequence data and ensuring that the start part and the tail part of the sequence are in the same row and do not overlap.
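The mask-and-merge step can be sketched as follows. This is an illustration under stated assumptions, not the patented circuit: each row is modeled as a list of 2-bit base codes, the sequence is assumed to start at column `start` of the first row and spill into the beginning of the next row, and the function name `merge_split_sequence` is hypothetical.

```python
M = 0b11  # mask unit made of two binary 1 bits, as described in the text


def merge_split_sequence(row_first, row_next, start, length):
    """Merge a sequence that spans two rows into a single row.

    Returns the merged row and the list of invalid column positions.
    """
    width = len(row_first)
    tail_len = start + length - width  # bases that spill into the next row
    if not (0 <= start < width and 0 < length <= width and tail_len >= 0):
        raise ValueError("sequence must span the end of row_first")

    # Start-segment mask keeps columns start..width-1 of the first row.
    start_mask = [0] * start + [M] * (width - start)
    # Tail-segment mask keeps columns 0..tail_len-1 of the next row.
    tail_mask = [M] * tail_len + [0] * (width - tail_len)

    valid_start = [d & m for d, m in zip(row_first, start_mask)]
    valid_tail = [d & m for d, m in zip(row_next, tail_mask)]

    # The two valid parts never share a column, so OR merges them losslessly.
    merged = [a | b for a, b in zip(valid_start, valid_tail)]

    # Zeroed columns look like base A (00), so their positions must be recorded.
    invalid = list(range(tail_len, start))
    return merged, invalid


merged, invalid = merge_split_sequence(
    [1, 2, 3, 1, 2, 3, 1, 2], [3, 3, 1, 1, 2, 2, 0, 0], start=5, length=6
)
print(merged, invalid)
```

Since the sequence length does not exceed the row width, the tail columns always end before the start column, which is why the merged halves cannot collide.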
After screening, a row-column conversion operation is performed on the data in the row storage format, giving first high-bit data 4005_h and first low-bit data 4005_l, which are arranged in columns and stored in two rows of storage cells in the storage array respectively. Data 4005_h and 4005_l are written back to the storage array 404 for further calculation.
The control module 403 generates a column-type GCT mask (a 10 alternating bit string) and an A mask (a combination of a 01 alternating bit string and an all-1 bit string) according to the number and positions of the invalid information data. The GCT mask is ANDed with data 4005_l and 4005_h respectively, setting all irrelevant data to 0, and the resulting column-type GCT data are stored in rows 4007_l and 4007_h respectively; at the same time, the data are inverted bitwise by the inversion module and the inverted data are stored in rows 4009_l and 4009_h. The A mask is ORed with data 4005_l and 4005_h respectively, setting all irrelevant data to 1, and the resulting column-type A data are stored in rows 4008_l and 4008_h respectively.
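The effect of the two masks can be sketched in Python. The sketch assumes simple per-column masks derived from the invalid positions (the exact bit patterns of the masks in the patent are not reproduced here), models rows as bit lists, and reuses the row labels 4007, 4008 and 4009 from the text only as dictionary keys; the function name `apply_column_masks` is hypothetical.

```python
def apply_column_masks(h_row, l_row, invalid):
    """Derive the GCT rows, their inverses, and the A rows from the H and L rows."""
    bad = set(invalid)
    n = len(h_row)
    gct_mask = [0 if j in bad else 1 for j in range(n)]  # 0 at invalid columns
    a_mask = [1 - m for m in gct_mask]                   # 1 at invalid columns

    # AND with the GCT mask forces invalid columns to 0.
    gct_h = [x & m for x, m in zip(h_row, gct_mask)]
    gct_l = [x & m for x, m in zip(l_row, gct_mask)]
    # OR with the A mask forces invalid columns to 1.
    a_h = [x | m for x, m in zip(h_row, a_mask)]
    a_l = [x | m for x, m in zip(l_row, a_mask)]

    return {
        "4007_h": gct_h,
        "4007_l": gct_l,
        "4009_h": [1 - x for x in gct_h],  # bitwise inversion of 4007_h
        "4009_l": [1 - x for x in gct_l],  # bitwise inversion of 4007_l
        "4008_h": a_h,
        "4008_l": a_l,
    }


rows = apply_column_masks([0, 1, 1, 1], [1, 1, 0, 1], invalid=[3])
for name, bits in rows.items():
    print(name, bits)
```

Forcing invalid columns to 1 in the A rows matters because A is encoded as 00: without it, a zeroed invalid column would later be counted as an adenine.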
Step two, the control module 403 marks the rearranged and combined target base sequence stored in the DRAM subarray for the bases A (adenine), G (guanine), C (cytosine) and T (thymine) respectively, to obtain a marker row for each base.
As shown in FIG. 7, the specific method for marking the base A (adenine) in the target base sequence comprises the following steps:
copying the data of rows 4008_h and 4008_l of the target base sequence to the first row and the second row of the calculation area 501 respectively;
performing a bitwise OR of the data in the first row and the second row to obtain a result R1;
sending the result R1 to the inversion module for an inversion operation;
copying the inverted result to the A (adenine) marker row 502, in which positions containing A have value 1 and all other positions have value 0.
As shown in FIG. 8, the specific method for marking the base C (cytosine) in the target base sequence comprises the following steps:
copying the data of rows 4007_h and 4007_l of the target base sequence to the first row and the second row of the calculation area 501 respectively;
performing a bitwise AND of the data in the first row and the second row of the calculation area 501 to obtain a result R2;
copying the result R2 to the C (cytosine) marker row 503, in which positions containing C have value 1 and all other positions have value 0.
As shown in FIG. 9, the specific method for marking the base G (guanine) in the target base sequence comprises the following steps:
taking the bitwise inverse 4009_l of the row 4007_l data of the target base sequence and copying it to the first row of the calculation area 501;
copying the row 4007_h data to the second row of the calculation area 501;
performing a bitwise AND of the data in the first row and the second row of the calculation area 501 to obtain a result R3;
copying the result R3 to the G (guanine) marker row 504, in which positions containing G have value 1 and all other positions have value 0.
As shown in FIG. 10, the specific method for marking the base T (thymine) in the target base sequence comprises the following steps:
copying the inverse 4009_h of the row 4007_h data of the target base sequence to the first row of the calculation area 501;
copying the row 4007_l data of the target base sequence to the second row of the calculation area 501;
performing a bitwise AND of the data in the first row and the second row to obtain a result R4;
copying the result R4 to the T (thymine) marker row 505, in which positions containing T have value 1 and all other positions have value 0.
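The four marking procedures reduce to one or two bitwise operations each. The sketch below models rows as Python bit lists; `marker_rows` and its parameter names are hypothetical, with `gct_h`/`gct_l` standing for rows 4007_h/4007_l and `a_h`/`a_l` for rows 4008_h/4008_l.

```python
def marker_rows(gct_h, gct_l, a_h, a_l):
    """Return one marker row per base, with 1 where that base occurs."""
    not_h = [1 - x for x in gct_h]  # corresponds to row 4009_h
    not_l = [1 - x for x in gct_l]  # corresponds to row 4009_l
    return {
        "A": [1 - (h | l) for h, l in zip(a_h, a_l)],     # NOT(H OR L): code 00
        "C": [h & l for h, l in zip(gct_h, gct_l)],       # H AND L: code 11
        "G": [nl & h for nl, h in zip(not_l, gct_h)],     # NOT L AND H: code 10
        "T": [nh & l for nh, l in zip(not_h, gct_l)],     # NOT H AND L: code 01
    }


# Sequence A, G, C, T followed by one invalid column (0 in GCT rows, 1 in A rows).
marks = marker_rows(
    gct_h=[0, 1, 1, 0, 0],
    gct_l=[0, 0, 1, 1, 0],
    a_h=[0, 1, 1, 0, 1],
    a_l=[0, 0, 1, 1, 1],
)
for base, row in marks.items():
    print(base, row)
```

Each valid column is marked in exactly one of the four rows, and the invalid column is marked in none of them.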
And thirdly, carrying out displacement operation on the marking line data, and then counting the number of the marking lines with the position value of 1 to obtain the counting results of A adenine, G guanine, C cytosine and T thymine.
Specifically, let the column width of the DRAM subarray be N, i.e., each row has N columns of memory cells, N is an integer power of 2, each marker row occupies a row of memory space, and the number of the marker rows having a position value of 1 is counted.
A shift counter and a row counter are arranged in the counting module 407, the shift counter is used for counting the current shift times, the initial value is 0, and the counting is cleared after the counting is finished; the initial value of the column counter is N, and the initial value is restored to N after the calculation is completed.
The specific statistical method comprises the following three steps:
step 1, the bit line controller 401 determines whether the value n of the column counter is 1. If it is not, the marker row is read and the read result is sent to the shift module for a left shift by 2^i bits, where i is the value of the shift counter. After the shift operation is completed, the value i of the shift counter is incremented by 1 and the result is written back to the subarray where the marker row is located. The original marker row is denoted row a and the shifted result row a_s. If n is 1, the calculation ends and the method proceeds to step 3;
step 2, the data of row a and row a_s are copied to the first row and the second row of the calculation region 501, and the data in the same column are summed: an exclusive-OR of the first row and the second row gives the sum term s, and an AND of the first row and the second row gives the carry term c. The summation result is written back to the temporary storage region 4010 of the storage array 404. The bit line controller 401 divides the value n of the column counter by 2 and determines whether the result is 1. If the result is 1, the calculation ends. If the result is not 1, a new round of shift-and-sum is performed on the summation result held in the temporary storage region 4010 (each completed summation adds one row to the calculation result): the operation of step 1 is repeated, each row of the calculation result is copied to the shift module and shifted, and the shifted results are accumulated column by column;
step 3, when the value n of the column counter is finally determined to be 1, the final result is available in the first row of the calculation result, and that result is stored at a specified position.
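The three counting steps above can be sketched in software as follows. This is a minimal illustration rather than the patented circuit: the function name and the list-of-bit-planes representation are assumptions, the carry is rippled through a full adder built from XOR, AND and OR, and the count is read from the leftmost column of the bit planes, which is one reading of where the final result ends up.

```python
def count_ones_by_shift_add(marker_row):
    """Count the 1s in a marker row using only shifts, XOR, AND and OR.

    marker_row is a list of 0/1 values, leftmost column first, whose length
    is a power of two. planes[k][j] holds bit k (least significant first)
    of the partial count accumulated in column j.
    """
    width = len(marker_row)
    if width == 0 or width & (width - 1):
        raise ValueError("row width must be a power of two")
    planes = [list(marker_row)]
    n = width  # column counter, starts at N
    i = 0      # shift counter, starts at 0
    while n != 1:
        shift = 2 ** i
        # Left shift every bit plane by `shift` columns, filling with zeros.
        shifted = [p[shift:] + [0] * shift for p in planes]
        carry = [0] * width
        summed = []
        for a, b in zip(planes, shifted):
            s = [x ^ y ^ c for x, y, c in zip(a, b, carry)]
            carry = [(x & y) | (c & (x ^ y)) for x, y, c in zip(a, b, carry)]
            summed.append(s)
        summed.append(carry)  # each round adds one row to the result
        planes = summed
        i += 1
        n //= 2
    # Reassemble the binary count stored in the leftmost column.
    return sum(plane[0] << k for k, plane in enumerate(planes))


c_marker = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(count_ones_by_shift_add(c_marker))
```

For a row of width N the loop runs log2(N) times, so a 16-column marker row needs four shift-and-sum rounds regardless of how many 1s it contains.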
Step four, comparing the statistical result of the reference base sequence with the statistical result of the target base sequence, and filtering the screened target base sequence.
Specifically, the two's-complement values of the counts of A adenine, G guanine, C cytosine and T thymine obtained for the target base sequence are placed in the same row, one count per column; the two's-complement values of the counts of the reference base sequence are placed in the corresponding columns; the differences are calculated; and finally the four differences are summed and compared with a threshold. If the sum of the differences is greater than the threshold, the target base sequence is excluded; if it is less than the threshold, the sequence at the screening position is marked as a target base sequence and subsequent processing can proceed.
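The comparison in step four can be illustrated with a short sketch. Several points are assumptions that the text does not fix: the 8-bit register width, the use of absolute differences (signed differences of two equal-length sequences would always sum to zero), the treatment of a sum exactly equal to the threshold as excluded, and the reference counts in the usage lines, which are hypothetical.

```python
WIDTH = 8                      # assumed register width in bits
MASK = (1 << WIDTH) - 1


def twos_complement(value):
    """Two's-complement negation of value within WIDTH bits."""
    return (~value + 1) & MASK


def to_signed(value):
    """Interpret a WIDTH-bit pattern as a signed integer."""
    return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value


def difference_sum(target_counts, reference_counts):
    """Sum of absolute per-base count differences (assumed metric)."""
    total = 0
    for base in "AGCT":
        # Subtraction performed as addition of the two's complement.
        diff = (target_counts[base] + twos_complement(reference_counts[base])) & MASK
        total += abs(to_signed(diff))
    return total


def keep_sequence(target_counts, reference_counts, threshold):
    """True if the target passes the filter (sum below the threshold)."""
    return difference_sum(target_counts, reference_counts) < threshold


target = {"A": 1, "G": 2, "C": 3, "T": 4}        # counts for AGTTTCTCCG
reference = {"A": 2, "G": 2, "C": 3, "T": 3}     # hypothetical reference
print(difference_sum(target, reference), keep_sequence(target, reference, 3))
```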
The embodiment is as follows:
Assume that a piece of base sequence information is stored in the current storage region, and that the sequence to be screened has length 10 and is AGTTTCTCCG, as shown in FIG. 11.
According to the screening start address, which begins at the arrow in FIG. 11, the start segment mask is set to binary 000000000000111111111111 and the end segment mask to binary 111111111111110000000000000000, and both masks are written into the memory array. The previous row, TCTTTGAAAGTTC, is bitwise ANDed with the start segment mask to obtain the valid start segment data 00000000AGTTTC (where each 0 stands for two 0 bits). The next row, TCCGAGGATGTGGT, is bitwise ANDed with the end segment mask to obtain the valid end segment data TCCG0000000000 (where each 0 stands for two 0 bits). A bitwise OR of the valid start segment data 00000000AGTTTC and the valid end segment data TCCG0000000000 merges the valid parts of the two rows into one row, giving the complete target sequence TCCG0000AGTTTC, as shown in FIG. 12.
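The screening and merging just described can be sketched as follows, using the 2-bit encoding A = 00, G = 10, C = 11, T = 01 given in claim 10. The helper names are hypothetical, the rows are assumed to hold 14 bases (28 bits), the previous row is written as an assumed 14-base string ending in AGTTTC, and the masks are built per base (M = 11 for kept positions) instead of being copied from the printed literals.

```python
CODE = {"A": 0b00, "G": 0b10, "C": 0b11, "T": 0b01}


def encode(sequence):
    """Pack a base string into an integer, two bits per base, first base highest."""
    value = 0
    for base in sequence:
        value = (value << 2) | CODE[base]
    return value


def base_mask(keep_flags):
    """Build a mask from per-base flags: '1' gives M (binary 11), '0' gives 00."""
    value = 0
    for flag in keep_flags:
        value = (value << 2) | (0b11 if flag == "1" else 0b00)
    return value


prev_row = "TCTTTGAAAGTTTC"   # assumed 14-base row whose last six bases are AGTTTC
next_row = "TCCGAGGATGTGGT"   # row holding the tail TCCG in its first four bases

start_mask = base_mask("0" * 8 + "1" * 6)   # keep the last six bases
end_mask = base_mask("1" * 4 + "0" * 10)    # keep the first four bases

start_part = encode(prev_row) & start_mask  # 00000000AGTTTC
end_part = encode(next_row) & end_mask      # TCCG0000000000
merged = start_part | end_part              # TCCG0000AGTTTC

print(format(merged, "028b"))
```

Because A is encoded as 00, the zeroed padding in the merged row cannot be told apart from adenine by value alone, which is consistent with the separate A mask that later sets the irrelevant positions to 1.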
After screening, the data in row storage format undergo a column-conversion storage operation to obtain the column-arranged data 4005_h and 4005_l, and the result is written back to the storage array for the next calculation. The mask data 4003 consist of 0 and M, where M stands for the two bits 11. Where a 0 is ANDed with the sequence data, the irrelevant data are set to 0; where a 1 is ANDed with the sequence data, the valid data are retained.
Take the partial data TCCG0000 from the above example, in which each 0 is invalid data. Its binary expression, whose lower 8 bits (all 0) are the invalid data, is: 0111111000000000;
after the shift module shifts it left by one bit, as shown in FIG. 13, the valid data arranged in columns lie within the dotted line;
the control module generates the column-type GCT mask and A mask according to the number and positions of the invalid data, where the GCT mask is a bit string of alternating 1 and 0 over the valid positions followed by 0 over the invalid positions: 1010101000000000;
the GCT mask is ANDed with 4005_l and 4005_h respectively, setting the irrelevant data in both to 0, and the resulting column-type GCT data are stored in 4007_l and 4007_h respectively, as shown in FIG. 14;
the A mask combines an alternating 01 bit string with an all-1 bit string: 0101010111111111. It is ORed with 4005_l and 4005_h respectively, setting the irrelevant data in both to 1, and the resulting column-type A data are stored in 4008_l and 4008_h respectively, as shown in FIG. 15.
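The column conversion and the two masks can be checked numerically with the sketch below. It rests on one reading of the example: 4005_h is taken to be the 16-bit row as stored, 4005_l the same row shifted left by one bit, and the GCT mask is assumed to be 1010101000000000, the bitwise complement of the A mask 0101010111111111. The variable names are illustrative only.

```python
WIDTH = 16
FULL = (1 << WIDTH) - 1

row = 0b0111111000000000        # TCCG followed by four invalid positions

data_h = row                    # high bit of each base sits on the even columns
data_l = (row << 1) & FULL      # after a one-bit left shift the low bits do

gct_mask = 0b1010101000000000   # assumed value: complement of the A mask
a_mask = ~gct_mask & FULL       # 0101010111111111

gct_h = data_h & gct_mask       # irrelevant data forced to 0
gct_l = data_l & gct_mask
a_h = data_h | a_mask           # irrelevant data forced to 1
a_l = data_l | a_mask

for name, value in [("gct_h", gct_h), ("gct_l", gct_l), ("a_h", a_h), ("a_l", a_l)]:
    print(name, format(value, "016b"))
```

Under these assumptions the four results agree with the 16-bit operand values used in the base-marking steps of the embodiment.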
Then base marking is performed for A adenine, G guanine, C cytosine and T thymine respectively.
As shown in FIG. 16, adenine A is marked as follows:
row H (0111111111111111) and row L (1111110111111111) of the target sequence are copied to the first and second rows of the calculation region 501, respectively;
a bitwise OR operation is performed between the first row and the second row to obtain a result R1 (1111111111111111);
R1 is inverted to obtain (0000000000000000);
the result is copied to the A adenine marker row, in which positions containing A adenine have the value 1 and the remaining positions the value 0; in the current example all adenine marks are 0.
As shown in FIG. 17, cytosine C is marked as follows:
row 4007_h (0010101000000000) and row 4007_l (1010100000000000) of the target sequence are copied to the first and second rows of the calculation region 501, respectively;
a bitwise AND operation is performed on the first row and the second row to obtain a result R2;
the result R2 is copied to the C cytosine marker row 503 (0010100000000000), in which positions containing C cytosine have the value 1 and the remaining positions the value 0.
As shown in FIG. 18, guanine G is marked as follows:
the bitwise inverted value 4009_l (0101011111111111) of row 4007_l of the target sequence is copied to one row of the calculation region 501;
row 4007_h (0010101000000000) is copied to another row of the calculation region 501;
a bitwise AND operation is performed on the two rows in the calculation region to obtain a result R3 (0000001000000000);
the result R3 is copied to the G guanine marker row 504, in which positions containing G guanine have the value 1 and the remaining positions the value 0.
As shown in FIG. 19, thymine T is marked as follows:
the inverted value 4009_h of row 4007_h of the target sequence is copied to one row of the calculation region;
row 4007_l of the target sequence is copied to another row of the calculation region, the two rows being the first row and the second row;
a bitwise AND operation is performed on the first row and the second row to obtain a result R4;
the result R4 is copied to the T thymine marker row 505, in which positions containing T thymine have the value 1 and the remaining positions the value 0.
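The four marking operations reduce to a handful of bitwise expressions on the high and low rows. The sketch below replays them on the TCCG0000 example; the variable names are illustrative, the operand values are the 16-bit rows of the example, and the T result is computed here since the text does not print it.

```python
WIDTH = 16
FULL = (1 << WIDTH) - 1


def invert(row):
    """Bitwise inversion within the 16-column row width."""
    return ~row & FULL


gct_h = 0b0010101000000000   # 4007_h: high bits, irrelevant data set to 0
gct_l = 0b1010100000000000   # 4007_l: low bits, irrelevant data set to 0
a_h = 0b0111111111111111     # row H of the A data, irrelevant data set to 1
a_l = 0b1111110111111111     # row L of the A data, irrelevant data set to 1

mark_a = invert(a_h | a_l)        # A = 00: neither bit set
mark_c = gct_h & gct_l            # C = 11: both bits set
mark_g = invert(gct_l) & gct_h    # G = 10: high set, low clear
mark_t = invert(gct_h) & gct_l    # T = 01: high clear, low set

for name, value in [("A", mark_a), ("C", mark_c), ("G", mark_g), ("T", mark_t)]:
    print(name, format(value, "016b"))
```

Each valid base position is set in exactly one of the four marker rows, and the padding positions are set in none of them, which is the effect of forcing irrelevant data to 1 for the A test and to 0 for the other three.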
Then, for the marker row of each base type, the number of 1 values is counted as follows. Assume the column width of the current subarray is 16 (N is typically an integer power of 2), i.e., each row has 16 columns of memory cells, and each marker row occupies one row of memory space. Take one marker row and count its 1s, for example the C marker row (0010100000000000). A shift counter in the counting module records the current number of shifts; its initial value is 0 and it is cleared when counting finishes.
The specific statistical method comprises the following steps:
the controller determines whether the current n is 1. Since n = 16, the marker row is read out and the read result is sent to the shift module for a left shift by 2^0 bits. After the shift operation is completed, the value i of the shift counter is incremented to 1 and the result is written back to the subarray where the marker row is located. The original marker row is denoted row a and the shifted result row a_s.
Row a and row a_s are copied to the first row and the second row of the calculation area, and the data in the same column are summed to obtain a1 and a0, as shown in FIG. 20.
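The first summation round can be reproduced with two bitwise operations. In this sketch a0 is assumed to be the sum bit (XOR) and a1 the carry bit (AND), which follows the sum and carry terms defined in the statistical method; the figure itself is not reproduced here, so the printed rows are computed values, not a transcription.

```python
WIDTH = 16
FULL = (1 << WIDTH) - 1

a = 0b0010100000000000        # C marker row
a_s = (a << 1) & FULL         # row a shifted left by 2**0 = 1 bit

a0 = a ^ a_s                  # sum bit of each column
a1 = a & a_s                  # carry bit of each column

print(format(a_s, "016b"))
print(format(a1, "016b"))
print(format(a0, "016b"))
```

After this round each column j holds, as the two-bit number (a1, a0), the number of 1s found in columns j and j+1 of the original marker row.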
The controller divides the current n by 2 to obtain the new value n = 8. Since this is not 1, the calculation results are each copied to the shift module and shifted by 2^1 bits; the shifted results are written back to the subarray and continue to take part in the operation.
a1 and a0 are shifted, and after the shift operation is completed the value i of the shift counter is incremented to 2, as shown in FIG. 21;
the shifted rows are summed column by column, as shown in FIG. 22;
the controller divides the current n by 2 to obtain the new value n = 4; since this is not 1, a2, a1 and a0 are shifted, and after the shift is completed the value i of the shift counter is incremented to 3, as shown in FIG. 23;
the shifted rows are summed column by column, as shown in FIG. 24;
the controller divides the current n by 2 to obtain the new value n = 2; since this is not 1, a3, a2, a1 and a0 are shifted, and after the shift is completed the value i of the shift counter is incremented to 4, as shown in FIG. 25;
the shifted rows are summed column by column, as shown in FIG. 26;
when the value of the column counter is finally determined to be 1, the final result is available in the first column of the calculation result. The two's-complement values of the counts of A adenine, G guanine, C cytosine and T thymine obtained for the target base sequence are then placed in the same row, one count per column; the two's-complement values of the counts of the reference base sequence are placed in the corresponding columns; the differences are calculated; and finally the four differences are summed and compared with the threshold. If the sum of the differences is greater than the threshold, the target base sequence is excluded; if it is less than the threshold, the sequence at the screening position is marked as a target base sequence.
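The counter bookkeeping in the walk-through above follows a fixed schedule that depends only on the row width. A small sketch, with a hypothetical function name, lists for each round the column counter value n before the round, the shift counter value i, and the resulting shift distance 2^i.

```python
def shift_schedule(width):
    """Return (n, i, shift) for every shift-and-sum round of a row of this width."""
    n = width   # column counter, starts at N
    i = 0       # shift counter, starts at 0
    rounds = []
    while n != 1:
        rounds.append((n, i, 2 ** i))
        i += 1      # incremented after each shift
        n //= 2     # halved after each summation
    return rounds


for n, i, shift in shift_schedule(16):
    print(f"n={n:2d}  i={i}  shift by {shift}")
```

For a width of 16 this gives four rounds with shift distances 1, 2, 4 and 8, after which n reaches 1 and i has been incremented to 4, as in the example.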
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way. Although the foregoing has described in detail the practice of the invention, it will be appreciated by those skilled in the art that variations may be applied to the embodiments described in the foregoing examples, or equivalents may be substituted for elements thereof. All changes, equivalents and modifications which come within the spirit and scope of the invention are desired to be protected.

Claims (10)

1. A base sequence filtering method based on DRAM memory calculation is characterized by comprising the following steps:
step one, according to the row width of the storage array of the DRAM and the start address of the target base sequence to be screened, screening out the target base sequence and then rearranging and combining it;
step two, marking the bases A adenine, G guanine, C cytosine and T thymine in the rearranged and combined target base sequence respectively, to obtain a marker row for each base;
step three, shifting the marker row data and then counting the number of positions with the value 1 in each marker row, to obtain the counts of A adenine, G guanine, C cytosine and T thymine;
step four, comparing the statistical result of a reference base sequence with the statistical result of the target base sequence, and filtering the screened target base sequence.
2. The method for filtering base sequences based on DRAM memory calculation according to claim 1, wherein the first step is specifically:
recording the number of invalid information data and the position of the invalid information data according to the length of the target base sequence and the column width of the storage array;
setting initial segment mask data and tail segment mask data according to the screening starting address and writing the initial segment mask data and the tail segment mask data into a storage array;
selecting the target row data in which the preceding row of the target base sequence is located and bitwise ANDing it with the initial segment mask data to obtain effective initial segment target row data; selecting the target row data in which the following row of the target base sequence is located and bitwise ANDing it with the tail segment mask data to obtain effective tail segment target row data;
performing a bitwise OR of the effective initial segment target row data and the effective tail segment target row data, so that the effective parts of the two rows of data are merged into one row, to obtain complete effective target base sequence data, wherein the head part and the tail part of the effective target base sequence data are in the same row and do not overlap;
performing a column conversion operation on the effective target base sequence data in the line storage format to obtain first high bit data and first low bit data arranged in columns in the storage array;
generating a column-type GCT mask and an A mask according to the number of invalid information data and the positions of the invalid information data, wherein the GCT mask is ANDed with the first high bit data and the first low bit data respectively, setting the irrelevant data to 0, to generate column-type GCT data stored as second high bit data and second low bit data, which are then bitwise inverted and stored as third high bit data and third low bit data; and the A mask is ORed with the first high bit data and the first low bit data respectively, setting all the irrelevant data to 1, to generate column-type A data stored as fourth high bit data and fourth low bit data.
3. The method as claimed in claim 2, wherein the start segment mask data and the end segment mask data are composed of 0 and M, M being composed of two binary 1 bits; where a 0 is ANDed with the target row sequence data, the irrelevant data is set to 0, and where a 1 is ANDed with the target row sequence data, the valid data is retained.
4. The method for filtering a base sequence based on DRAM memory calculation as claimed in claim 2, wherein the specific method steps for labeling the base A adenine in the target base sequence in the second step are as follows:
copying the fourth high bit data and the fourth low bit data stored by the column A data to a first row and a second row of a calculation area of the memory array, respectively;
performing OR operation on the data between the first line and the second line according to bits to obtain a result R1;
carrying out negation operation on the result R1;
and copying the result after the inversion operation to a mark line of the A adenine in the memory array, wherein the position value containing the A adenine is 1, and the rest position values are 0.
5. The method for filtering base sequences based on DRAM memory calculation as claimed in claim 2, wherein the specific method for labeling the base C cytosine in the target base sequence in the second step is as follows:
copying the second high bit data and the second low bit data of the target base sequence to a first row and a second row of a calculation area of the memory array respectively;
carrying out bitwise AND operation on the data of the first row and the data of the second row in the calculation area to obtain a result R2;
the result R2 is copied to a tag row of C-cytosine in the memory array, where the position value of the C-cytosine is 1 and the remaining position values are 0.
6. The method for filtering base sequences based on DRAM memory calculation as claimed in claim 2, wherein the specific method steps for labeling the base G guanine in the target base sequence in the second step are as follows:
copying the third low bit data to a first row in a compute region of the memory array;
copying the second high bit data to a second row in the calculation area;
carrying out bitwise AND operation on the data of the first line and the second line in the calculation area to obtain a result R3;
the result R3 is copied to a mark row of G guanine in the memory array, wherein the position value containing G guanine is 1, and the rest position values are 0.
7. The method for filtering base sequences based on DRAM memory calculation as claimed in claim 2, wherein the specific method for labeling the base T thymine in the target base sequence in the second step is as follows:
copying the third high-order bit data to a first row of a calculation area of the storage array;
copying the second low bit data to a second row of the calculation region;
carrying out bitwise AND operation on the data of the first line and the second line to obtain a result R4;
the result R4 is copied to a tag row of T thymine in the memory array, where the position value containing T thymine is 1 and the remaining position values are 0.
8. The method for filtering base sequences based on DRAM memory calculation as claimed in claim 1, wherein the specific method of statistics in the third step comprises the following three steps:
step 1, using a column counter and a shift counter, first determining whether the value n of the column counter is 1; if not, reading the marker row and left-shifting the read result by 2^i bits, where i is the value of the shift counter; after the shift operation is completed, adding 1 to the value i of the shift counter and writing the result back to the DRAM subarray where the marker row is located; denoting the original marker row as row a and the shifted result as row a_s; if n is 1, ending the calculation and proceeding to step 3;
step 2, copying the data of row a and row a_s to a first row and a second row of a calculation area of the storage array, and summing the data in the same column, namely performing an exclusive-OR operation on the first row and the second row to obtain the sum term s, and performing an AND operation on the first row and the second row to obtain the carry term c, and writing the summation result back to a temporary storage area of the storage array; dividing the value n of the column counter by 2 and determining whether the result is 1; if the result is 1, ending the calculation; if the result is not 1, performing a new round of shift-and-sum on the summation result in the temporary storage area, wherein each completed summation adds one row to the calculation result, namely repeating the operation of step 1, shifting the calculation result, and accumulating the shifted results column by column;
step 3, when the value n of the column counter is finally determined to be 1, the final result is available in the first row of the calculation result and is stored.
9. The method for filtering base sequences based on DRAM memory calculation according to claim 1, wherein the fourth step is specifically: putting the complement values of the statistical results of the obtained A adenine, G guanine, C cytosine and T thymine of the target base sequence in the same row in a column form, putting the complement values of the statistical results of the reference base sequence in the corresponding column, calculating difference values, and finally, summing the four difference values and comparing the sum values with a threshold value; if the sum of the differences is greater than the threshold, excluding the target base sequence; if the value is less than the threshold value, marking the sequence of the screening position as the target base sequence.
10. A base sequence filter device based on DRAM memory calculation is characterized by comprising:
a memory array, composed of DRAM subarrays and used for storing the target base sequence, in which base information is expressed in binary as follows: A adenine corresponds to 00, G guanine to 10, C cytosine to 11 and T thymine to 01, each base being represented by 2 bits of data;
the DRAM subarray has width N, i.e., each row has N columns of storage units; storing base sequence information of length N requires two rows of storage units, denoted row H for the high bits and row L for the low bits, so that the high bit and the low bit of the same base are stored in the same column;
the storage array is provided with a calculation area for data calculation, an original data storage area for storing original base sequence data, a column-type data area for storing data converted into the column format, and a temporary storage area for temporarily storing intermediate results generated during calculation;
a control module, which receives external addresses, data and commands, decodes them, and sends control signals to the word line controller, the bit line controller, the shift and inversion module and the buffer; converts the base sequence data from the same-row format to the same-column format and writes it into the DRAM subarray; and controls the calculation process; wherein the word line controller 402 controls the row signals of the memory array and the bit line controller controls the column signals of the memory array; the buffer buffers data; and the shift and inversion module comprises a shift module and an inversion module, which perform shift and inversion operations on a row of data as the calculation requires;
and a counting module, containing a group of counters including a shift counter and a column counter, which counts the bases of each type when the reference sequence is written, records the final result, and writes the A, G, C and T counts of the reference sequence to a fixed address of the target array.
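The storage layout recited for the memory array can be illustrated with a short sketch that splits a base string into an H row and an L row, so that the two bits of each base share a column. The function name is hypothetical, and padding unused columns with 0 is an assumption made for the illustration only.

```python
# (high bit, low bit) for each base: A = 00, G = 10, C = 11, T = 01
CODE = {"A": (0, 0), "G": (1, 0), "C": (1, 1), "T": (0, 1)}


def to_h_l_rows(sequence, width):
    """Split a base sequence into a high-bit row and a low-bit row of `width` columns."""
    if len(sequence) > width:
        raise ValueError("sequence does not fit in one row")
    pad = [0] * (width - len(sequence))   # assumed padding value
    h_row = [CODE[base][0] for base in sequence] + pad
    l_row = [CODE[base][1] for base in sequence] + pad
    return h_row, l_row


h_row, l_row = to_h_l_rows("TCCG", 8)
print("H:", h_row)
print("L:", l_row)
```

With the bits of one base aligned in a column, a single row-wide AND or OR between row H and row L tests every base position at once, which is what the marking steps of the method rely on.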
CN202211354686.5A 2022-11-01 2022-11-01 Base sequence filtering method and device based on DRAM memory calculation Active CN115409174B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211354686.5A CN115409174B (en) 2022-11-01 2022-11-01 Base sequence filtering method and device based on DRAM memory calculation

Publications (2)

Publication Number Publication Date
CN115409174A true CN115409174A (en) 2022-11-29
CN115409174B CN115409174B (en) 2023-03-31

Family

ID=84169305

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211354686.5A Active CN115409174B (en) 2022-11-01 2022-11-01 Base sequence filtering method and device based on DRAM memory calculation

Country Status (1)

Country Link
CN (1) CN115409174B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665772A (en) * 2023-05-30 2023-08-29 之江实验室 Genome map analysis method, device and medium based on memory calculation
CN116665772B (en) * 2023-05-30 2024-02-13 之江实验室 Genome map analysis method, device and medium based on memory calculation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant