CN113012760B - FPGA-based gene sequence assembly algorithm calculation acceleration method - Google Patents

FPGA-based gene sequence assembly algorithm calculation acceleration method

Info

Publication number
CN113012760B
Authority
CN
China
Prior art keywords
fpga
seeds
cpu
backtracking
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011484784.1A
Other languages
Chinese (zh)
Other versions
CN113012760A (en)
Inventor
柳星
张敏杰
蔡晨冉
叶晓艺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN202011484784.1A
Publication of CN113012760A
Application granted
Publication of CN113012760B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an FPGA-based method for accelerating the computation of a gene sequence assembly algorithm on a heterogeneous computing platform comprising a CPU and an FPGA. The method comprises the following steps: 1) in the filtering stage, the query sequence is converted into k-mers on the CPU host to obtain a series of seeds, and the matching positions of all seeds on the reference sequence intervals, i.e. the seed hits, are looked up in turn; the number of bases overlapped by the seeds in each hit interval is then counted, and positions whose count exceeds a threshold are selected as candidate positions; 2) in the extension stage, extension starts from each candidate position using an optimized Smith-Waterman algorithm, with the CPU controlling the partitioning of the matrix into blocks and the FPGA computing each block matrix to obtain a partial backtracking path; 3) the CPU splices the partial backtracking paths in sequence to obtain the complete backtracking path. Using the optimized Smith-Waterman algorithm greatly improves the running speed of sequence alignment.

Description

FPGA-based gene sequence assembly algorithm calculation acceleration method
Technical Field
The invention relates to gene sequence alignment computing technology, and in particular to a method for accelerating the computation of a gene sequence assembly algorithm based on an FPGA (field-programmable gate array).
Background
In recent years, with the rapid development of sequencing technology, genome data has grown far faster than Moore's law, so that existing computing resources cannot keep up with the demand for processing this massive data. Genome assembly is the first step in processing these data, and how to optimize or accelerate the assembly process is currently a hot topic. Sequence alignment is one of the key steps of genome assembly and plays an important role in the field of precision medicine.
Most existing sequence alignment algorithms are based on a seed-and-extend strategy. Compared with the original alignment algorithms, regions where alignment results are likely to appear are screened out by filtering before the alignment computation, and the alignment is then computed only within those regions, which avoids the large waste of time and space caused by computing over the whole region. Under this strategy, the main research directions at present are: optimization of the filtering technique, optimization of the seed indexing technique, optimization of the alignment algorithm, and hardware acceleration of the alignment algorithm.
Despite the huge computational burden of gene assembly, most current large gene assembly tools are still developed for traditional CPU platforms. However, the CPU is a general-purpose processor whose hardware structure is not designed specifically for gene computation algorithms, so executing the assembly algorithm on the CPU becomes a bottleneck when facing the massive genome data of the big data era.
Compared with traditional CPU parallelism or GPU hardware acceleration, hardware acceleration implemented on an FPGA can better reduce computation time and consumes less energy. The invention accelerates sequence alignment by designing a CPU + FPGA heterogeneous computing platform, and is funded under project 202010497040 of the National College Students' Innovation and Entrepreneurship Training Program.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the defects of the prior art, an FPGA (field-programmable gate array) based method for accelerating the computation of a gene sequence assembly algorithm.
The technical scheme adopted by the invention to solve this problem is as follows: an FPGA-based method for accelerating the computation of a gene sequence assembly algorithm, based on a heterogeneous computing platform comprising a CPU and an FPGA, comprising the following steps:
1) in the filtering stage, the query sequence is converted into k-mers on the CPU host to obtain a series of seeds, and the matching positions of all seeds on the reference sequence intervals, i.e. the seed hits, are looked up in turn; the number of bases overlapped by the seeds in each hit interval is then counted, and positions whose count exceeds a threshold are selected as candidate positions;
2) in the extension stage, extension starts from each candidate position using an optimized Smith-Waterman algorithm, and the CPU controls the partitioning of the matrix into blocks; the FPGA computes each block matrix to obtain a partial backtracking path;
3) the CPU splices the partial backtracking paths in sequence to obtain the complete backtracking path.
According to the scheme, searching for the matching positions of all seeds on the reference sequence intervals in step 1) specifically comprises:
1.1) obtaining a series of query sequence seeds using a sliding window of size K, and partitioning the reference sequence into intervals of a fixed size;
1.2) finding the hit positions of the seeds on the diagonal bands of the reference sequence intervals.
According to the scheme, step 1.2) is carried out using a hash-index-based data structure consisting of a seed pointer table and a seed position table, as follows:
reference sequence seeds are obtained using the same sliding window of size K, and their positions are recorded in order in the seed position table, while the seed pointer table records where the entries for each kind of seed begin within the seed position table;
the matching positions of each query sequence seed on the reference sequence are then found through the seed pointer table and seed position table, which are constructed and filled in advance.
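The two-table index described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the 2-bit base encoding, the function names `encode_kmer`, `build_seed_tables` and `lookup_seed`, and the counting-sort construction are all hypothetical choices, and the alphabet is assumed to be limited to A, C, G and T.

```python
from typing import List, Tuple

BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit encoding


def encode_kmer(kmer: str) -> int:
    """Pack a k-mer over {A, C, G, T} into an integer, 2 bits per base."""
    code = 0
    for base in kmer:
        code = (code << 2) | BASE_CODE[base]
    return code


def build_seed_tables(reference: str, k: int) -> Tuple[List[int], List[int]]:
    """Return (pointer_table, position_table) for all k-mers of the reference.

    position_table holds reference offsets grouped by seed code.
    pointer_table[c] is the index in position_table where seed code c starts,
    and pointer_table[c + 1] is where it ends.
    """
    n_seeds = max(0, len(reference) - k + 1)
    codes = [encode_kmer(reference[i:i + k]) for i in range(n_seeds)]

    # Count occurrences of each seed code, then prefix-sum into start offsets.
    pointer_table = [0] * (4 ** k + 1)
    for c in codes:
        pointer_table[c + 1] += 1
    for c in range(4 ** k):
        pointer_table[c + 1] += pointer_table[c]

    # Scatter positions into their group, preserving reference order.
    next_slot = pointer_table[:-1].copy()
    position_table = [0] * n_seeds
    for pos, c in enumerate(codes):
        position_table[next_slot[c]] = pos
        next_slot[c] += 1
    return pointer_table, position_table


def lookup_seed(seed: str, pointer_table: List[int],
                position_table: List[int]) -> List[int]:
    """Return every reference position at which the given seed occurs."""
    c = encode_kmer(seed)
    return position_table[pointer_table[c]:pointer_table[c + 1]]
```

With this layout a query seed lookup costs one encoding step plus a slice of the position table, which is the property the hash index is meant to provide. The pointer table has 4^K + 1 entries, so the sketch is only practical for small K.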
According to the scheme, the optimized Smith-Waterman algorithm in step 2) is as follows:
2.1) taking the candidate position obtained in step 1) as the starting point, left extension toward the upper left is performed first;
2.2) the CPU defines a matrix of size T × T as the block matrix to limit the range of each computation, and transmits the size of the current block matrix and its position within the initial matrix to the FPGA, so that the FPGA performs Smith-Waterman scoring and backtracking on the block and returns the backtracking result to the CPU; the CPU then moves the block matrix according to the received backtracking result to determine the region of the next computation;
2.3) if the backtracking path obtained from the current block matrix has length 0, or the edge of the initial matrix is reached, left extension ends, and the CPU splices the backtracking paths of the block matrices in sequence to obtain the left half of the final backtracking path;
2.4) starting from the same starting point, extension is performed toward the lower right, following a process similar to step 2.2); when right extension ends, the right half of the final backtracking path is obtained, and finally the left and right halves are spliced to obtain the final complete backtracking path;
the FPGA uses a hardware parallel algorithm based on a systolic array to compute Smith-Waterman in parallel.
The invention has the following beneficial effects:
1. For the filtering stage, a Filter algorithm based on the number of overlapped bases of seeds on the diagonal is used, and a hash index technique is introduced to speed up the search for seed matching positions.
2. For the extension stage, the existing alignment algorithm has quadratic space complexity, so an FPGA with limited on-chip memory cannot process long sequences. An optimized Smith-Waterman algorithm is therefore adopted in which blocking fixes the space complexity at a constant level, making it better suited to hardware acceleration. The alignment step is then implemented on the FPGA, and the running speed of sequence alignment is greatly improved through hardware acceleration.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the FPGA-based acceleration of the gene sequence assembly algorithm computation according to the embodiment of the present invention;
FIG. 3 is a schematic diagram of the Extend algorithm during left extension according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the hardware framework for parallel acceleration of the Smith-Waterman algorithm on an FPGA according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in FIG. 1 and FIG. 2, an FPGA-based method for accelerating the computation of a gene sequence assembly algorithm, based on a heterogeneous computing platform comprising a CPU and an FPGA, includes the following steps:
1) Filter stage: the query sequence is converted into k-mers on the CPU host to obtain a series of seeds, and the matching positions of all seeds on the reference sequence intervals, i.e. the seed hits, are looked up in turn; the number of bases overlapped by the seeds in each hit interval is then counted, and positions whose count exceeds a threshold are selected as candidate positions;
step 1) specifically comprises the following steps:
1.1) obtaining a series of query sequence seeds using a sliding window of size K, and partitioning the reference sequence into intervals of a fixed size of B bases each;
1.2) constructing and filling a "seed pointer table" and a "seed position table": reference sequence seeds are obtained using the same sliding window and their positions are recorded in order in the seed position table, while the seed pointer table records where the entries for each kind of seed begin within the seed position table;
1.3) finding the matching positions of each query sequence seed on the reference sequence; this step is accelerated by the seed pointer table and seed position table constructed and filled in advance;
1.4) counting the overlapped bases of the seeds in each reference sequence interval, and screening against a threshold to obtain the candidate positions.
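Steps 1.1) to 1.4) might be sketched as follows. Several details are assumptions made only for illustration and are not stated in the text: hits are binned by diagonal (reference position minus query position) into bands of `band_width` diagonals, "overlapped bases" is interpreted as the number of distinct query bases covered by hitting seeds, the earliest hit in a passing band is reported as its candidate position, and a plain dictionary stands in for the seed pointer and seed position tables. The function name `find_candidates` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_candidates(query: str, reference: str, k: int, band_width: int,
                    threshold: int) -> List[Tuple[int, int]]:
    """Return one candidate (query_pos, ref_pos) per diagonal band that passes the filter."""
    # Stand-in for the seed pointer/position tables: seed -> reference positions.
    index: Dict[str, List[int]] = defaultdict(list)
    for r in range(len(reference) - k + 1):
        index[reference[r:r + k]].append(r)

    # Collect seed hits per diagonal band; adding len(query) keeps band ids non-negative.
    hits_by_band: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for q in range(len(query) - k + 1):
        for r in index.get(query[q:q + k], []):
            band = (r - q + len(query)) // band_width
            hits_by_band[band].append((q, r))

    candidates = []
    for band, hits in sorted(hits_by_band.items()):
        covered = set()
        for q, _ in hits:
            covered.update(range(q, q + k))  # overlapping seeds count each base once
        if len(covered) > threshold:
            candidates.append(min(hits))  # earliest hit in the band (assumed choice)
    return candidates
```

For example, with `reference = "TTTTACGTACGGTTTT"`, `query = "ACGTACGG"`, `k = 4`, `band_width = 4` and `threshold = 6`, only the band containing the full-length match passes the filter, and the call returns `[(0, 4)]`.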
2) Extend stage: extension starts from each candidate position using an optimized Smith-Waterman algorithm, and the CPU (central processing unit) controls the partitioning of the matrix into blocks; the FPGA computes each block matrix to obtain a partial backtracking path;
2.1) taking the candidate position obtained in step 1) as the starting point, the overall matrix range needed to compute the complete backtracking path is determined;
2.2) the CPU defines a matrix of size T × T as the block matrix to limit the range of each computation, controls the FPGA to compute the block and return the backtracking result to the CPU, and then moves the block matrix according to the backtracking result to determine the region of the next computation;
the FPGA uses a hardware parallel algorithm based on a systolic array to accelerate the parallel computation of Smith-Waterman.
In the extension stage of the present application, the optimized Smith-Waterman algorithm is called the Extend algorithm. The Smith-Waterman algorithm computes the final score and fills the table mainly by means of a substitution matrix W, a score matrix F and a gap penalty, and the whole computation follows the state transition equation below:
F(0,0)=0
[Equation image GDA0003064241110000071: state transition equation for F(i, j); not reproduced in this text]
Because the option 0 is included in the state transition equation, any negative score that arises is replaced with 0. The backtracking phase then starts from the matrix element with the highest score and continues until an element with a score of zero is encountered. This process yields the highest-scoring local alignment.
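The state transition equation itself is published only as an image. For orientation, the textbook Smith-Waterman recurrence with a linear gap penalty is given below; it is consistent with F(0,0)=0 and with the zero option described here, but it is an assumption that the patent's figure uses this exact form. The symbols a_i and b_j (the i-th query base and j-th reference base) and d (a positive gap penalty) are introduced here only for illustration.

```latex
F(i,0) = F(0,j) = 0, \qquad
F(i,j) = \max
\begin{cases}
0 \\
F(i-1,\,j-1) + W(a_i,\, b_j) \\
F(i-1,\,j) - d \\
F(i,\,j-1) - d
\end{cases}
```

Each cell depends only on its upper-left, upper and left neighbours, which is what allows all cells on one anti-diagonal to be computed in parallel.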
The scoring and backtracking operations of the Extend algorithm are similar to those of the existing Smith-Waterman algorithm. The optimization lies mainly in the fact that the Extend algorithm uses a small square block of the matrix as the basic unit of scoring and backtracking, and covers the whole matrix by repeatedly moving the block matrix. Starting from the candidate position obtained in the Filter stage, Extend performs left extension and right extension respectively, looking for as long and complete an alignment path as possible, and multiple candidate positions can be extended simultaneously in batch mode.
FIG. 3 is a schematic diagram of the Extend algorithm during left extension. Dividing the dynamic programming matrix into several small matrices reduces the space complexity. Taking the candidate position (i*, j*) as the starting point, small matrices are repeatedly constructed and moved according to the given side length T, a partial backtracking path is solved in each using the Smith-Waterman algorithm, and finally the backtracking paths of the small matrices are spliced to obtain the final backtracking path.
When constructing and moving the block matrix, the starting position of the upper-left corner of the first block matrix can be determined from (i*, j*), i.e. the starting position (i_curr, j_curr) of the lower-right corner of the current block matrix, together with T and O: (i_start, j_start) = (max(0, i_curr - T), max(0, j_curr - T)).
Because of the side length T of the block matrix, in actual operation a square block matrix cannot always be constructed from the starting position at the lower-right corner, namely when i* - T or j* - T is negative. Therefore i* - T and j* - T are compared with 0, and if a result is less than 0, a rectangular block matrix is constructed instead. While backtracking through the first block matrix, the offsets of the backtracking pointer in the horizontal and vertical directions are obtained, denoted (i_off, j_off). The next block matrix is then computed: after the backtracking path of the current block matrix has been calculated, the offset (i_off, j_off) is subtracted from the lower-right position (i_curr, j_curr) of the current matrix to give the starting position of the lower-right corner of the next block matrix. The operation (i_start, j_start) = (max(0, i_curr - T), max(0, j_curr - T)) is then applied again and, together with the overlap threshold, gives the starting position of the upper-left corner of the next matrix, which completes the construction of the next block matrix. Backtracking continues in this way until the backtracking process yields (i_off, j_off) = (0, 0), at which point the whole process ends.
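The block-moving procedure for left extension can be sketched in pure Python as below. This is a simplified software illustration rather than the patented hardware/software split, and it makes several assumptions: the scoring values are arbitrary, the gap penalty is linear, each block's backtracking starts at the block's lower-right cell, index arguments are exclusive end positions, and the overlap threshold between consecutive blocks is not modelled. The names `traceback_block` and `extend_left` are hypothetical.

```python
from typing import List, Tuple

MATCH, MISMATCH, GAP_PENALTY = 2, -1, 2  # assumed scoring values


def traceback_block(q: str, r: str) -> Tuple[List[Tuple[int, int]], int, int]:
    """Score one block with Smith-Waterman and trace back from its lower-right cell.

    Returns (aligned pairs inside the block, lower-right first, i_off, j_off),
    where i_off / j_off are the rows / columns consumed by the backtracking pointer.
    """
    rows, cols = len(q), len(r)
    F = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            s = MATCH if q[i - 1] == r[j - 1] else MISMATCH
            F[i][j] = max(0, F[i - 1][j - 1] + s,
                          F[i - 1][j] - GAP_PENALTY, F[i][j - 1] - GAP_PENALTY)

    pairs = []
    i, j = rows, cols
    while i > 0 and j > 0 and F[i][j] > 0:
        s = MATCH if q[i - 1] == r[j - 1] else MISMATCH
        if F[i][j] == F[i - 1][j - 1] + s:      # diagonal move: aligned pair
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif F[i][j] == F[i - 1][j] - GAP_PENALTY:  # gap in the reference
            i -= 1
        else:                                    # gap in the query
            j -= 1
    return pairs, rows - i, cols - j


def extend_left(query: str, reference: str, i_star: int, j_star: int,
                T: int) -> List[Tuple[int, int]]:
    """Extend toward the upper left from (i_star, j_star) using blocks of side at most T."""
    i_curr, j_curr = i_star, j_star
    path: List[Tuple[int, int]] = []
    while i_curr > 0 and j_curr > 0:  # stop at the edge of the initial matrix
        # Clamping at 0 yields a rectangular block near the matrix edge.
        i_start, j_start = max(0, i_curr - T), max(0, j_curr - T)
        pairs, i_off, j_off = traceback_block(query[i_start:i_curr],
                                              reference[j_start:j_curr])
        if (i_off, j_off) == (0, 0):  # empty backtracking path: extension ends
            break
        path.extend((i_start + bi, j_start + bj) for bi, bj in pairs)
        i_curr, j_curr = i_curr - i_off, j_curr - j_off
    path.reverse()  # blocks were visited from lower right to upper left
    return path
```

Only one block of at most (T + 1) × (T + 1) scores is held at any time, so memory use is independent of the sequence lengths, which is the point of the blocking scheme. Because scores are not carried across block boundaries in this sketch, the spliced path is not guaranteed to match a full-matrix Smith-Waterman result in every case.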
The parallel acceleration of the Smith-Waterman algorithm is implemented on the FPGA, with the hardware framework shown in FIG. 4. First, the input reference sequence and query sequence are stored in two separate BRAMs. Each PE in the systolic array is responsible for the scoring computation of one row of the matrix. In every clock cycle a base of the reference sequence is sent to the leftmost PE, and the previous base is passed on to the next PE, so that multiple rows are computed in parallel. During the computation, each PE stores the backtracking pointers it generates in SRAM. After all rows have been computed, the backtracking logic module performs the backtracking operation according to the backtracking pointers in SRAM and outputs the final backtracking path of each block matrix.
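The timing of such a systolic array can be illustrated with a small software simulation. The sketch below assumes one PE per query base (matrix row), a linear gap penalty and arbitrary scoring values; it models only the score computation, not the BRAM/SRAM storage or the backtracking logic, and the name `systolic_sw_scores` is hypothetical.

```python
from typing import List


def systolic_sw_scores(query: str, reference: str, match: int = 2,
                       mismatch: int = -1, gap: int = 2) -> List[List[int]]:
    """Fill the Smith-Waterman score matrix in systolic (anti-diagonal) order.

    PE number `pe` owns matrix row pe + 1. In clock cycle `cycle` it processes
    the reference base at column index cycle - pe, i.e. the base that entered
    the leftmost PE `pe` cycles earlier.
    """
    rows, cols = len(query), len(reference)
    F = [[0] * (cols + 1) for _ in range(rows + 1)]
    for cycle in range(rows + cols - 1):
        # Every PE active in this cycle reads only values written in earlier
        # cycles, so in hardware all of these updates happen simultaneously.
        for pe in range(rows):
            col = cycle - pe
            if 0 <= col < cols:
                i, j = pe + 1, col + 1
                s = match if query[pe] == reference[col] else mismatch
                F[i][j] = max(0, F[i - 1][j - 1] + s,
                              F[i - 1][j] - gap, F[i][j - 1] - gap)
    return F
```

A T × T block finishes scoring in 2T - 1 cycles with T PEs, compared with T² sequential cell updates, which is where the parallel speed-up of the array comes from in this model.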
3) The CPU splices the backtracking paths of the block matrices in sequence to obtain the complete backtracking path.
The invention uses the OpenCL cross-platform computing framework standard to build a complete CPU + FPGA heterogeneous parallel computing platform. The CPU is responsible for executing the filtering algorithm and controlling the Smith-Waterman blocking, the FPGA provides hardware acceleration for the computation-intensive steps of the algorithm, and a PCIe interface is used for data exchange between the CPU and the Smith-Waterman accelerator on the FPGA. The final results are tested and the acceleration performance is evaluated.
The invention adds a Filter stage, which uses a filtering algorithm based on counting the seed-hit bases within a diagonal band and accelerates filtering with a hash index structure consisting of a seed position table and a seed pointer table. The extension stage is then implemented with the Extend algorithm, which optimizes the original Smith-Waterman algorithm to reduce its space complexity, making it better suited to running on an FPGA.
The applicant implemented a pure software version of the Filter algorithm plus the Extend algorithm on the CPU in C++ and compared its results with those of the original Smith-Waterman algorithm on the same test data. The results were identical, which verifies the correctness of the scheme.
Using the Xilinx SDAccel platform, the host program was modified to call the OpenCL API to interact with the FPGA, and the hardware code was compiled into a binary file and programmed onto the FPGA board, thereby building the CPU + FPGA heterogeneous computing platform. Tests were run in the Software-Emulation mode (pure CPU) and the System mode (CPU + FPGA) provided by the SDAccel platform, and the final acceleration performance was evaluated. As shown in Table 1, five runs were performed on the same 16384 test sets (each set consisting of a reference sequence of length 256 and a query sequence of length 128) and the elapsed time was recorded. The results show that the scheme does accelerate the gene alignment algorithm, with a significant speed-up.
Table 1. Comparison of run results
[Table 1 is provided as images (GDA0003064241110000101, GDA0003064241110000111) in the original publication; its values are not reproduced in this text]
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (3)

1. An FPGA-based method for accelerating the computation of a gene sequence assembly algorithm, based on a heterogeneous computing platform comprising a CPU and an FPGA, characterized by comprising the following steps:
1) in the filtering stage, the query sequence is converted into k-mers on the CPU host to obtain a series of seeds, and the matching positions of all seeds on the reference sequence intervals, i.e. the seed hits, are looked up in turn; the number of bases overlapped by the seeds in each hit interval is then counted, and positions whose count exceeds a threshold are selected as candidate positions;
2) in the extension stage, extension starts from each candidate position using an optimized Smith-Waterman algorithm, and the CPU controls the partitioning of the matrix into blocks; the FPGA computes each block matrix to obtain a partial backtracking path;
the optimized Smith-Waterman algorithm in step 2) is as follows:
2.1) taking the candidate position obtained in step 1) as the starting point, left extension toward the upper left is performed first;
2.2) the CPU defines a matrix of size T × T as the block matrix to limit the range of each computation, and transmits the size of the current block matrix and its position within the initial matrix to the FPGA, so that the FPGA performs Smith-Waterman scoring and backtracking on the block and returns the backtracking result to the CPU; the CPU then moves the block matrix according to the received backtracking result to determine the region of the next computation;
2.3) if the backtracking path obtained from the current block matrix has length 0, or the edge of the initial matrix is reached, left extension ends, and the CPU splices the backtracking paths of the block matrices in sequence to obtain the left half of the final backtracking path;
2.4) starting from the same starting point, extension is performed toward the lower right, following the same process as step 2.2); when right extension ends, the right half of the final backtracking path is obtained, and finally the left and right halves are spliced to obtain the final complete backtracking path;
the FPGA uses a hardware parallel algorithm based on a systolic array to compute Smith-Waterman in parallel;
3) the CPU splices the partial backtracking paths in sequence to obtain the complete backtracking path.
2. The FPGA-based method for accelerating the computation of a gene sequence assembly algorithm according to claim 1, wherein searching for the matching positions of all seeds on the reference sequence intervals in step 1) is carried out as follows:
1.1) obtaining a series of query sequence seeds using a sliding window of size K, and partitioning the reference sequence into intervals of a fixed size;
1.2) finding the hit positions of the seeds on the diagonal bands of the reference sequence intervals.
3. The FPGA-based method for accelerating the computation of a gene sequence assembly algorithm according to claim 2, wherein step 1.2) is carried out using a hash-index-based data structure consisting of a seed pointer table and a seed position table, specifically as follows:
reference sequence seeds are obtained using the same sliding window of size K, and their positions are recorded in order in the seed position table, while the seed pointer table records where the entries for each kind of seed begin within the seed position table;
the matching positions of each query sequence seed on the reference sequence are then found through the seed pointer table and seed position table, which are constructed and filled in advance.
CN202011484784.1A 2020-12-16 2020-12-16 FPGA-based gene sequence assembly algorithm calculation acceleration method Active CN113012760B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011484784.1A CN113012760B (en) 2020-12-16 2020-12-16 FPGA-based gene sequence assembly algorithm calculation acceleration method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011484784.1A CN113012760B (en) 2020-12-16 2020-12-16 FPGA-based gene sequence assembly algorithm calculation acceleration method

Publications (2)

Publication Number Publication Date
CN113012760A CN113012760A (en) 2021-06-22
CN113012760B true CN113012760B (en) 2022-07-05

Family

ID=76383489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011484784.1A Active CN113012760B (en) 2020-12-16 2020-12-16 FPGA-based gene sequence assembly algorithm calculation acceleration method

Country Status (1)

Country Link
CN (1) CN113012760B (en)

Also Published As

Publication number Publication date
CN113012760A (en) 2021-06-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant