CN106682258B - Multi-operand addition optimization method and system in high-level comprehensive tool - Google Patents

Publication number
CN106682258B
Authority
CN
China
Prior art keywords
operand
addition
compression tree
optimization
gpc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611009866.4A
Other languages
Chinese (zh)
Other versions
CN106682258A (en)
Inventor
王自鑫
陈弟虎
衣杨
张晓强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G06F30/327 Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses a method and a system for optimizing multi-operand addition in a high-level synthesis tool. The method comprises the following steps: acquiring a high-level functional description of a circuit design, and from it the operations and operands contained in the circuit design; analyzing the operations to judge whether three or more operands are added consecutively, and if so continuing to the next step, otherwise ending; reading an optimization target from a user configuration file, building a compression tree according to the optimization target, and saving the compression tree information; and generating synthesizable compression tree HDL code from the compression tree information. The invention performs design space optimization of multi-operand addition at the high-level synthesis stage according to the optimization target in the user configuration file, which helps to generate a multi-operand addition circuit with better performance and improves the performance of the high-level synthesis tool. The method and system can be widely applied in the field of computer and circuit design.

Description

Multi-operand addition optimization method and system in high-level comprehensive tool
Technical Field
The invention relates to the field of computer and circuit design, in particular to a multi-operand addition optimization method and system in a high-level synthesis tool.
Background
In digital circuit design, multi-operand addition is widely used in digital signal processing, image and video processing, high-performance computing and other areas, and its speed and resource overhead often have an important influence on the quality of the circuit design.
High-level synthesis technology converts a high-level language directly into a hardware description language through compilation, scheduling, resource allocation and related steps, which effectively improves design efficiency and saves design time. Efficient algorithms and hardware circuit design methods both help to improve the performance of a high-level synthesis tool. The hardware implementation of multi-operand addition can take a variety of architectures. However, conventional high-level synthesis systems usually implement multi-operand addition with full adders, half adders or conventional adder trees, and do not consider design space exploration and related optimization of multi-operand addition in depth. On the one hand, this causes a large carry propagation delay; on the other hand, it does not adapt well to the logic structure of the target platform, especially when the target platform is a Field Programmable Gate Array (FPGA). Therefore, when a hardware design automatically generated by a conventional high-level synthesis system contains large-scale multi-operand addition, the design often has a long delay and occupies a large amount of hardware resources, which degrades the overall quality of the hardware design.
Disclosure of Invention
To solve the above technical problems, one object of the invention is to provide a high-performance multi-operand addition optimization method, based on generalized parallel counters, in a high-level synthesis tool.
To solve the above technical problems, another object of the invention is to provide a high-performance multi-operand addition optimization system, based on generalized parallel counters, in a high-level synthesis tool.
The technical scheme adopted by the invention is as follows: a multi-operand addition optimization method in a high-level synthesis tool comprises the following steps:
A. acquiring a high-level functional description of a circuit design, and from it the operations and operands contained in the circuit design;
B. judging whether the operations obtained in step A include a consecutive addition of 3 or more operands; if so, loading an addition optimization processing unit and entering step C to execute it; otherwise, ending;
C. reading optimization target data from a user configuration file, building a compression tree according to the optimization target data, and saving the compression tree information;
D. generating synthesizable compression tree HDL code according to the compression tree information saved in step C.
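The check in step B can be illustrated with a minimal sketch. It assumes the high-level description is available as an expression string and uses Python's own `ast` module as a stand-in parser; a real high-level synthesis tool would work on its compiler's intermediate representation instead. The function names `addition_operands` and `needs_compression_tree` are hypothetical.

```python
import ast


def addition_operands(expr: str) -> list:
    """Flatten a chain of '+' operations into the list of its operands."""
    def flatten(node):
        # A '+' node contributes the operands of both of its sides.
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            return flatten(node.left) + flatten(node.right)
        # Anything else (a name, a product, ...) is a single operand.
        return [ast.unparse(node)]

    return flatten(ast.parse(expr, mode="eval").body)


def needs_compression_tree(expr: str) -> bool:
    """Step B: true when 3 or more operands are added consecutively."""
    return len(addition_operands(expr)) >= 3


operands = addition_operands("a + b + c + d + e")
print(len(operands), "operands,", len(operands) - 1, "additions")
print(needs_compression_tree("a + b"))
```

For five operands this reports 5 operands and 4 additions, matching the counts used in the embodiment, while a two-operand addition is left to the ordinary adder path.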
Further, the step C specifically includes:
C1. reading the user configuration file to obtain the optimization target data, and prioritizing the generalized parallel counters according to the optimization target;
C2. processing the operands with the prioritized generalized parallel counters, generating a compression tree, and saving the compression tree information.
Further, in the step B, the operand is represented by a two-dimensional dot matrix diagram.
Further, in step C2, the compression tree sums a plurality of numbers and outputs the sum, and the saved compression tree information includes the number of stages of the compression tree, the type and number of generalized parallel counters used at each stage, and the input and output information of the final adder.
Further, in the step C, the input of the compression tree is an operand of the multi-operand addition, the output of the compression tree is a sum of the operands of the multi-operand addition, and the function of the compression tree is the same as the addition function of the multi-operand addition.
The other technical scheme adopted by the invention is as follows: a system for multi-operand addition optimization in a high-level synthesis tool, the system comprising:
an acquisition unit, used for acquiring a high-level functional description of a circuit design, and from it the operations and operands contained in the circuit design;
a judging unit, used for judging whether the operations obtained by the acquisition unit include a consecutive addition of 3 or more operands; if so, the addition optimization processing unit is loaded and executed; otherwise, processing ends;
an addition optimization processing unit, used for reading the optimization target data from the user configuration file, building a compression tree according to the optimization target data, and saving the compression tree information;
a code generating unit, used for generating synthesizable compression tree HDL code according to the compression tree information saved by the addition optimization processing unit.
Further, the addition optimization processing unit includes:
the sequencing module is used for reading the user configuration file, obtaining design optimization target data and carrying out priority sequencing on the generalized parallel counter according to the optimization target data;
and the generating module is used for processing the operands by using the generalized parallel counter subjected to priority sequencing in the sequencing module, generating a compression tree and storing compression tree information.
Furthermore, in the judging unit, the operand is represented by a two-dimensional dot matrix diagram.
Further, in the generating module, the compression tree sums a plurality of numbers and outputs the sum, and the saved compression tree information includes the number of stages of the compression tree, the type and number of generalized parallel counters used at each stage, and the input and output information of the final adder.
Further, in the addition optimization processing unit, the input of the compression tree is the operand of the multi-operand addition, the output of the compression tree is the sum of the operands of the multi-operand addition, and the function of the compression tree is the same as the addition function of the multi-operand addition.
The invention has the beneficial effects that: by using the method, the design space optimization of the multi-operand addition can be carried out according to the optimization target in the user configuration file in the high-level synthesis stage, the generation of a multi-operand addition circuit with better performance is facilitated, and the improvement of the performance of a high-level synthesis tool is facilitated.
The invention has another beneficial effect that: by using the system of the invention, the design space optimization of the multi-operand addition can be carried out in high-level synthesis according to the optimization target in the user configuration file, which is beneficial to generating a multi-operand addition circuit with better performance and is beneficial to improving the performance of a high-level synthesis tool.
Drawings
The following further describes embodiments of the present invention with reference to the accompanying drawings:
FIG. 1 is a flow chart of the steps of the method of the present invention;
FIG. 2 is a flow chart of the steps of a particular embodiment of the method of the present invention;
FIG. 3 is a schematic diagram of an addition in an embodiment of the method of the present invention;
FIG. 4 is a two-dimensional lattice diagram of an embodiment of the method of the present invention;
FIG. 5 is a schematic illustration of a portion of a GPC bitmap of the method of the present invention;
FIG. 6 is a flow diagram of compressed tree generation in an embodiment of the method of the present invention;
FIG. 7 is a block diagram of the architecture of the system of the present invention;
FIG. 8 is a block diagram of the architecture of an embodiment of the system of the present invention.
Detailed Description
The following further describes embodiments of the present invention with reference to the accompanying drawings:
Referring to FIG. 1, a method for optimizing multi-operand addition in a high-level synthesis tool includes the following steps:
A. acquiring high-level function description of a circuit design, and further acquiring operation and operand contained in the circuit design;
In this embodiment, five 4-bit unsigned numbers are added, so the result obtained is 4 addition operations and 5 operands.
B. Judging whether the operation obtained in the step A has 3 or more than 3 operands for continuous addition, if so, loading an addition optimization processing unit, and entering the step C to execute the processing unit, otherwise, ending the operation;
In this embodiment, 5 operands are detected to be added consecutively, so step C is performed. The addition of the five unsigned 4-bit numbers is shown in FIG. 3, where $a_{ij}$ denotes bit j of the i-th operand and $s_k$ denotes the k-th bit of the addition result.
C. Reading optimized target data in a user configuration file, establishing a compression tree according to the optimized target data, and storing compression tree information;
D. generating synthesizable compressed tree HDL code according to the compressed tree information saved in step C.
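Step D can be pictured with a small generator that emits Verilog text for a single GPC. This is only a behavioural sketch under stated assumptions: the module naming scheme, the port names `x0`, `x1`, ... and `y`, and the function name `gpc_verilog` are all hypothetical, and the source does not show the HDL its tool actually produces. A behavioural sum like this leaves the LUT mapping to the downstream synthesis tool.

```python
def gpc_verilog(columns: tuple, n_out: int) -> str:
    """Emit a behavioural Verilog module for GPC(columns; n_out).

    `columns` is (m_{k-1}, ..., m_0): the number of inputs per weight,
    most significant weight first, as in the GPC notation.
    """
    k = len(columns)
    name = "gpc_" + "_".join(str(m) for m in columns) + f"_{n_out}"
    ports, terms = [], []
    for pos, m in enumerate(columns):
        weight = k - 1 - pos
        if m == 0:
            continue  # no inputs at this weight, so no port
        ports.append(f"    input  wire [{m - 1}:0] x{weight}")
        # Each input bit contributes its value shifted by its weight.
        terms.extend(f"(x{weight}[{b}] << {weight})" for b in range(m))
    ports.append(f"    output wire [{n_out - 1}:0] y")
    return (
        f"module {name} (\n"
        + ",\n".join(ports)
        + "\n);\n"
        + f"    assign y = {' + '.join(terms)};\n"
        + "endmodule\n"
    )


print(gpc_verilog((3,), 2))
```

A full code generator would instantiate one such module per GPC recorded in the saved compression tree information, wire the stages together, and add the final adder.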
Referring to fig. 2, as a further preferred embodiment, the step C specifically includes:
C1. reading the user configuration file to obtain the optimization target data, and prioritizing the Generalized Parallel Counters (GPCs) according to the optimization target;
As an example of a GPC's input-output relationship, GPC(1,4,1,5;5) has 5 inputs of weight 0, 1 input of weight 1, 4 inputs of weight 2, and 1 input of weight 3, and its output is a 5-bit unsigned number R. When all inputs are 1:
$$R = 5\times 2^{0} + 1\times 2^{1} + 4\times 2^{2} + 1\times 2^{3} = (11111)_2 = (31)_{10}$$
C2. processing the operands with the prioritized generalized parallel counters, generating a compression tree, and saving the compression tree information.
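The GPC(1,4,1,5;5) example can be checked numerically. The sketch below models a GPC purely as arithmetic: it sums the input bits of each weight and scales by the corresponding power of two. The function name `gpc_output` and the dictionary layout of the inputs are assumptions made for illustration.

```python
def gpc_output(columns, inputs):
    """Value produced by GPC(columns; n) for the given input bits.

    columns: (m_{k-1}, ..., m_0), inputs per weight, highest weight first.
    inputs:  dict mapping weight i to the list of its m_i input bits.
    """
    k = len(columns)
    total = 0
    for weight in range(k):
        expected = columns[k - 1 - weight]
        bits = inputs.get(weight, [])
        if len(bits) != expected:
            raise ValueError(f"weight {weight} needs {expected} input bits")
        # Each 1 at this weight stands for 2**weight ones.
        total += sum(bits) << weight
    return total


all_ones = {0: [1] * 5, 1: [1], 2: [1] * 4, 3: [1]}
r = gpc_output((1, 4, 1, 5), all_ones)
print(r, format(r, "b"))
```

With every input set to 1 the result is 31, which is 11111 in binary and therefore just fits in the 5 output bits.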
Further as a preferred embodiment, the design optimization objective includes area optimization, timing optimization or timing area product optimization.
Hardware resources occupied by different GPCs in the FPGA and time delay from input to output of the GPCs are different, and the GPCs are prioritized according to different optimization targets by using different comparison criteria.
For example, in a Xilinx FPGA, GPC(2,6;4) uses 3 LUTs, has a maximum input-to-output delay of 0.316 ns, and reduces the number of signals by 2+6-4 = 4. GPC(6;3) uses 2 LUTs, has a maximum input-to-output delay of 0.293 ns, and reduces the number of signals by 6-3 = 3.
If the optimization target is timing optimization, the ratio of the difference between the number of GPC inputs and outputs to the maximum input-to-output delay (denoted PD) is used as the ranking criterion. GPC(6;3) has a PD value of 3/0.293 = 10.239 and GPC(2,6;4) has a PD value of 4/0.316 = 12.658; since 12.658 > 10.239, GPC(2,6;4) has higher priority than GPC(6;3).
If the optimization target is area optimization, the ratio of the difference between the number of GPC inputs and outputs to the resources the GPC uses (usually LUTs) (denoted AD) is used as the ranking criterion. GPC(6;3) has an AD value of 3/2 = 1.5 and GPC(2,6;4) has an AD value of 4/3 = 1.333; since 1.5 > 1.333, GPC(6;3) has higher priority than GPC(2,6;4).
If the optimization target is timing-area product optimization, the product of PD and AD (denoted APD) is used as the ranking criterion. For example, the APD of GPC(6;3) is 10.239 × 1.5 = 15.3585 and the APD of GPC(2,6;4) is 12.658 × 1.333 = 16.8731; since 16.8731 > 15.3585, GPC(2,6;4) has higher priority than GPC(6;3).
In the embodiment of the invention, the design optimization target is area optimization, and the ratio E of the difference between the number of GPC inputs and outputs to the resources used is the ranking criterion; the larger the ratio, the more inputs the GPC can compress with fewer resources. The GPCs used in this example are GPC(1,4,1,5;5), GPC(4;3) and GPC(3;2), which occupy 4, 2 and 1 LUTs respectively and have input-output differences of 6, 1 and 1 respectively. Their E values are 6/4 = 1.5, 1/2 = 0.5 and 1/1 = 1, and since 1.5 > 1 > 0.5, the priority order from high to low is GPC(1,4,1,5;5), GPC(3;2), GPC(4;3).
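The three ranking criteria can be written down compactly. The following is a minimal sketch using the LUT counts and delays quoted above; the class name `Gpc`, the function names, and the target strings `"timing"`, `"area"` and `"timing_area"` are hypothetical, since the source does not specify the format of the user configuration file.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gpc:
    columns: Tuple[int, ...]           # (m_{k-1}, ..., m_0)
    n_out: int                         # number of output bits
    luts: int                          # LUTs occupied
    delay_ns: Optional[float] = None   # max input-to-output delay, if known

    @property
    def reduction(self) -> int:
        """Difference between the number of inputs and outputs."""
        return sum(self.columns) - self.n_out


def score(gpc: Gpc, target: str) -> float:
    ad = gpc.reduction / gpc.luts               # AD (called E in the embodiment)
    if target == "area":
        return ad
    if target not in ("timing", "timing_area"):
        raise ValueError(f"unknown optimization target: {target}")
    if gpc.delay_ns is None:
        raise ValueError("timing targets need a delay figure")
    pd = gpc.reduction / gpc.delay_ns           # PD
    return pd if target == "timing" else pd * ad  # APD = PD * AD


def prioritize(gpcs, target):
    """Highest score first."""
    return sorted(gpcs, key=lambda g: score(g, target), reverse=True)


g63 = Gpc((6,), 3, luts=2, delay_ns=0.293)
g264 = Gpc((2, 6), 4, luts=3, delay_ns=0.316)
embodiment = [Gpc((1, 4, 1, 5), 5, luts=4), Gpc((4,), 3, luts=2), Gpc((3,), 2, luts=1)]

for target in ("timing", "area", "timing_area"):
    print(target, [g.columns for g in prioritize([g63, g264], target)])
print([g.columns for g in prioritize(embodiment, "area")])
```

The delay field is optional because the embodiment gives only LUT counts for its three GPCs, which is sufficient for the area target.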
In a further preferred embodiment, in the step B, the operands are represented by a two-dimensional dot matrix, as shown in fig. 4.
Fig. 4 is a two-dimensional lattice diagram corresponding to fig. 3, which abstracts the operands participating in the operation into a two-dimensional lattice, where each row represents an operand, each point represents a certain bit (value is 0 or 1) of the operand, the leftmost point is the most significant bit of the operand in the row, the rightmost point is the least significant bit of the operand in the row, and all points in any column represent the same weight.
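The dot matrix abstraction can be sketched as a list of rows, one per operand, with the most significant bit on the left. The helper names `dot_matrix` and `weighted_sum` and the sample operand values are assumptions for illustration; the source does not give concrete operand values.

```python
def dot_matrix(operands, width):
    """One row per operand; column 0 is the MSB, the last column the LSB."""
    for x in operands:
        if not 0 <= x < (1 << width):
            raise ValueError(f"{x} does not fit in {width} unsigned bits")
    return [[(x >> (width - 1 - col)) & 1 for col in range(width)]
            for x in operands]


def weighted_sum(matrix):
    """Sum every dot times the weight of its column."""
    width = len(matrix[0])
    return sum(bit << (width - 1 - col)
               for row in matrix
               for col, bit in enumerate(row))


ops = [3, 9, 15, 0, 6]            # five 4-bit unsigned operands
matrix = dot_matrix(ops, 4)
for row in matrix:
    print(" ".join(str(bit) for bit in row))
print(weighted_sum(matrix) == sum(ops))
```

Because every dot in a column carries the same weight, any rearrangement or compression that preserves the weighted sum of the dots preserves the result of the addition.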
Figure 5 lists several different GPC dot-matrix representations. In this embodiment, the output bitmap of the GPC network has at most 2 points per column, i.e., the output of the GPC network can be composed into two new operands for input to subsequent adders.
Further preferably, in step C1, a GPC is a circuit structure with an M-bit input and an n-bit output; its function is to sum the number of 1s represented by all of its inputs and to output the result as an n-bit unsigned number. Each input has a weight that determines how many 1s the input actually represents: if the value of an input is A (which can only be 0 or 1) and its weight is W, the input represents $A \times 2^{W}$ ones. A GPC is written as $(m_{k-1}, m_{k-2}, \ldots, m_1, m_0; n)$, where $m_{k-1} > 0$, $i$ denotes the weight of an input, $m_i$ denotes the number of inputs with weight $i$, $k$ denotes the number of input weight positions, and $n$ denotes the number of output bits, and having:
[Two equation images (BDA0001154470450000081 and BDA0001154470450000082) appear here in the source and are not reproduced in this text.]
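From the definition of the GPC notation, the largest value a GPC can produce is the weighted count with every input set to 1, so n must be large enough to hold that value. The sketch below checks this for the GPCs named in this description. It is an illustration derived from the definition, not a reproduction of the two formula images, and the function names are hypothetical.

```python
def max_output(columns):
    """Largest value of GPC(columns; n): every input bit set to 1.

    columns is (m_{k-1}, ..., m_0), highest weight first.
    """
    k = len(columns)
    return sum(m << (k - 1 - pos) for pos, m in enumerate(columns))


def min_output_bits(columns):
    """Fewest output bits that can represent max_output(columns)."""
    return max_output(columns).bit_length()


named_gpcs = {
    (1, 4, 1, 5): 5,
    (2, 6): 4,
    (6,): 3,
    (4,): 3,
    (3,): 2,
}
for columns, n in named_gpcs.items():
    print(columns, "inputs:", sum(columns),
          "max value:", max_output(columns),
          "needs", min_output_bits(columns), "bits, has", n)
```

For each of these GPCs the stated output width equals the minimum width needed, so no output bit is wasted.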
A GPC network can repeatedly compress the two-dimensional dot matrix abstracted from a plurality of operands until the required number of operands is reached. Because GPCs differ in how many inputs they eliminate, the hardware resources they use, and their input-to-output delay, they can be prioritized according to the design optimization target, and compression uses the highest-priority GPC wherever possible.
Further as a preferred embodiment, in the step C2, the compression tree is used to sum a plurality of numbers and output the sum, and the stored compression tree information includes the number of stages of the compression tree, the type and number of the generalized parallel counters used at each stage, and the input and output information of the final adder.
The compression tree in step C2 is a structure that sums a plurality of numbers and outputs the sum, and it consists of two parts: a GPC network and an adder. The GPC network is divided into multiple stages (assumed to be N stages), and each stage can pick different GPCs, according to the algorithm policy, to compress the input of that stage. The input of stage 1 is the original input composed of the operands; for every other stage, the input consists of the remaining outputs of all preceding stages together with the part of the original input not yet consumed. The N-stage GPC network thus compresses the original dot matrix into one with no more than the required number of dots per column. The output dot matrix of the GPC network is then fed to the adder, which produces the sum of the operands.
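The staged compression described above can be sketched as a greedy loop. This is one possible policy under stated assumptions, not the algorithm of the invention: in each stage the GPCs are tried in priority order, from the least significant column upward, and applied wherever they fit; leftover dots pass to the next stage. The names `compress`, `to_columns` and `weighted_sum` are hypothetical, and the library must contain a GPC such as (3;2) that always fits a column with more than two dots.

```python
def to_columns(operands, width):
    """columns[w] = list of the bits of weight w, one per operand."""
    return [[(x >> w) & 1 for x in operands] for w in range(width)]


def weighted_sum(columns):
    return sum(sum(col) << w for w, col in enumerate(columns))


def compress(columns, library, limit=2):
    """Compress until every column holds at most `limit` dots.

    library: [(shape, n_out), ...] in priority order, where shape is
    (m_{k-1}, ..., m_0). Returns (final columns, GPCs used per stage).
    """
    cols = [list(c) for c in columns]        # leave the caller's data intact
    max_out = max(n_out for _, n_out in library)
    stages = []
    while max(len(c) for c in cols) > limit:
        nxt = [[] for _ in range(len(cols) + max_out)]
        used = []
        for shape, n_out in library:
            need = shape[::-1]               # need[i] = dots taken at base + i
            for base in range(len(cols) - len(need) + 1):
                while all(len(cols[base + i]) >= m for i, m in enumerate(need)):
                    value = 0
                    for i, m in enumerate(need):
                        taken = [cols[base + i].pop() for _ in range(m)]
                        value += sum(taken) << i
                    for j in range(n_out):   # output bits become new dots
                        nxt[base + j].append((value >> j) & 1)
                    used.append((shape, base))
        if not used:
            raise ValueError("no GPC in the library fits the remaining dots")
        for w, col in enumerate(cols):       # uncompressed dots pass through
            nxt[w].extend(col)
        while len(nxt) > 1 and not nxt[-1]:
            nxt.pop()
        cols = nxt
        stages.append(used)
    return cols, stages


library = [((1, 4, 1, 5), 5), ((3,), 2), ((4,), 3)]   # area priority order
ops = [9, 14, 7, 0, 15]
final, stages = compress(to_columns(ops, 4), library)
print([len(c) for c in final], weighted_sum(final) == sum(ops))
```

Each GPC replaces its input dots with fewer output dots of the same total weighted value, so the sum is preserved at every stage and the loop terminates once no column holds more than two dots. The two remaining rows are then added by an ordinary adder. The GPC placements chosen by this greedy policy need not match those shown in FIG. 6.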
Referring to FIG. 6, take the GPC network generated in this embodiment as an example. A rectangle with a solid border represents GPC(1,4,1,5;5), and the dots joined by a solid line represent its output; a rectangle with a dotted border represents GPC(4;3), and the dots joined by a dotted line represent its output; a rectangle with a dash-dot border represents GPC(3;2), and the dots joined by a dash-dot line represent its output. The GPC network in this example has 3 stages in total: the first stage uses 1 GPC(1,4,1,5;5) and 2 GPC(4;3), the second stage uses 2 GPC(3;2), and the third stage uses 1 GPC(3;2), as shown in FIG. 6. The output of the third stage is the input of the adder, whose result is the result of the multi-operand addition.
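The saved compression tree information for this embodiment could be held in a small record such as the one sketched below. The class and field names are hypothetical. The per-stage GPC counts are taken from the description of FIG. 6; the adder input width is left unset because the text does not state it, and the 7-bit output width follows from the largest possible sum of five 4-bit unsigned numbers (5 × 15 = 75).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StageInfo:
    gpc_counts: Dict[str, int]          # GPC type -> number used in this stage


@dataclass
class CompressionTreeInfo:
    stages: List[StageInfo]
    adder_output_bits: int
    adder_input_bits: Optional[int] = None   # not stated in the source text

    @property
    def num_stages(self) -> int:
        return len(self.stages)

    def total_gpcs(self) -> int:
        return sum(sum(s.gpc_counts.values()) for s in self.stages)


info = CompressionTreeInfo(
    stages=[
        StageInfo({"GPC(1,4,1,5;5)": 1, "GPC(4;3)": 2}),
        StageInfo({"GPC(3;2)": 2}),
        StageInfo({"GPC(3;2)": 1}),
    ],
    adder_output_bits=(5 * 15).bit_length(),
)
print(info.num_stages, info.total_gpcs(), info.adder_output_bits)
```

A record of this kind carries everything the code generation step needs: how many stages to emit, which GPC modules to instantiate in each, and how wide the final adder must be.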
In a further preferred embodiment, in the step C, the input of the compression tree is an operand of a multi-operand addition, the output of the compression tree is a sum of operands of the multi-operand addition, and the function of the compression tree is the same as the addition function of the multi-operand addition.
Referring to FIG. 7, a system for multi-operand addition optimization in a high-level synthesis tool comprises:
an acquisition unit, used for acquiring a high-level functional description of a circuit design, and from it the operations and operands contained in the circuit design;
a judging unit, used for judging whether the operations obtained by the acquisition unit include a consecutive addition of 3 or more operands; if so, the addition optimization processing unit is loaded and executed; otherwise, processing ends;
an addition optimization processing unit, used for reading the optimization target data from the user configuration file, building a compression tree according to the optimization target data, and saving the compression tree information;
a code generating unit, used for generating synthesizable compression tree HDL code according to the compression tree information saved by the addition optimization processing unit.
Referring to FIG. 8, further as a preferred embodiment, the addition optimization processing unit includes:
a sorting module, used for reading the user configuration file, obtaining the design optimization target data, and prioritizing the generalized parallel counters according to the optimization target data;
a generating module, used for processing the operands with the generalized parallel counters prioritized by the sorting module, generating a compression tree, and saving the compression tree information.
In a further preferred embodiment, the judgment unit represents the operand in a two-dimensional dot matrix.
Further preferably, in the generating module, the compression tree is configured to sum a plurality of numbers and output the sum, and the stored compression tree information includes the number of stages of the compression tree, the type and number of the generalized parallel counters used at each stage, and input and output information of the final adder.
In a further preferred embodiment, in the addition optimization processing unit, the input of the compression tree is an operand of multi-operand addition, the output of the compression tree is a sum of operands of multi-operand addition, and the function of the compression tree is the same as the addition function of the multi-operand addition.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A multi-operand addition optimization method in a high-level synthesis tool, characterized in that the method comprises the following steps:
A. acquiring high-level function description of a circuit design, and further acquiring operation and operand contained in the circuit design;
B. judging whether the operation obtained in the step A has more than 3 operands for continuous addition, if so, loading an addition optimization processing unit, and entering a step C to execute the processing unit, otherwise, ending the operation;
C. reading optimized target data in a user configuration file, establishing a compression tree according to the optimized target data, and storing compression tree information;
D. generating synthesizable compression tree HDL code according to the compression tree information saved in step C;
the compression tree is used for summing a plurality of numbers and taking the sum as output, and comprises a GPC network and an adder, wherein the GPC network is divided into a plurality of stages, each stage can select different GPCs according to algorithm strategies to compress the input of the stage, and the output of the GPC network is taken as the input of the adder to be summed to obtain the sum of a plurality of operands, and the GPC is a generalized parallel counter.
2. The method of claim 1, wherein the method comprises the steps of: the step C specifically comprises the following steps:
c1, reading the user configuration file and obtaining optimization target data, and performing priority sequencing on the generalized parallel counters according to the optimization target;
and C2, processing the operands by using the generalized parallel counter subjected to the priority sorting, generating a compression tree and storing the compression tree information.
3. The method of claim 1 or 2, wherein the method comprises the steps of: in step B, the operands are represented by a two-dimensional dot matrix diagram.
4. The method of claim 2, wherein the method comprises the steps of: in step C2, the saved compression tree information includes the number of stages of the compression tree, the type and number of generalized parallel counters used at each stage, and the input and output information of the final adder.
5. The method of claim 1 or 2, wherein the method comprises the steps of: in the step C, the input of the compression tree is an operand of the multi-operand addition, the output of the compression tree is the sum of the operands of the multi-operand addition, and the function of the compression tree is the same as the addition function of the multi-operand addition.
6. A multi-operand addition optimization system in a high-level synthesis tool, characterized in that the system comprises: an acquisition unit, used for acquiring a high-level functional description of a circuit design, and from it the operations and operands contained in the circuit design;
the judging unit is used for judging whether the operation obtained by the obtaining unit has more than 3 operands for continuous addition, if so, the addition optimization processing unit is loaded and the processing unit is executed, otherwise, the processing unit is ended;
the addition optimization processing unit is used for reading the optimization target data in the user configuration file, establishing a compression tree according to the optimization target data and storing the compression tree information;
a code generating unit for generating a synthesizable compressed tree HDL code based on the compressed tree information held by the addition optimization processing unit;
the compression tree is used for summing a plurality of numbers and taking the sum as output, and comprises a GPC network and an adder, wherein the GPC network is divided into a plurality of stages, each stage can select different GPCs according to algorithm strategies to compress the input of the stage, and the output of the GPC network is taken as the input of the adder to be summed to obtain the sum of a plurality of operands, and the GPC is a generalized parallel counter.
7. The system of claim 6, wherein the system comprises: the addition optimization processing unit includes:
the sequencing module is used for reading the user configuration file, obtaining design optimization target data and carrying out priority sequencing on the generalized parallel counter according to the optimization target data;
and the generating module is used for processing the operands by using the generalized parallel counter subjected to priority sequencing in the sequencing module, generating a compression tree and storing compression tree information.
8. The system for multi-operand addition optimization in a high-level synthesis tool according to claim 6 or 7, wherein: and the judgment unit represents the operand by a two-dimensional dot matrix diagram.
9. The system for multi-operand addition optimization in a high-level synthesis tool according to claim 6 or 7, wherein: in the generating module, the stored compression tree information comprises the stage number of the compression tree, the type and the use number of the generalized parallel counter used at each stage, and the input and output information of the final adder.
10. The system for multi-operand addition optimization in a high-level synthesis tool according to claim 6 or 7, wherein: in the addition optimization processing unit, the input of the compression tree is the operand of the multi-operand addition, the output of the compression tree is the sum of the operands of the multi-operand addition, and the function of the compression tree is the same as the addition function of the multi-operand addition.
CN201611009866.4A 2016-11-16 2016-11-16 Multi-operand addition optimization method and system in high-level comprehensive tool Active CN106682258B (en)

Publications (2)

Publication Number Publication Date
CN106682258A CN106682258A (en) 2017-05-17
CN106682258B true CN106682258B (en) 2020-04-24

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009348A (en) * 2017-11-30 2018-05-08 上海安路信息科技有限公司 Plus/minus musical instruments used in a Buddhist or Taoist mass optimization method based on input bit time delay
CN109583360B (en) * 2018-11-26 2023-01-10 中山大学 Video human body behavior identification method based on spatio-temporal information and hierarchical representation
CN113658623B (en) * 2021-08-20 2024-03-01 湘潭大学 Ferroelectric memory array capable of realizing multi-operand memory calculation
CN115438614A (en) * 2022-09-22 2022-12-06 中山大学 Rapid linear programming method for high-level synthesis

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104360906A (en) * 2014-10-31 2015-02-18 中山大学 High-level synthesis scheduling method based on difference constraint system and iterative model
CN104408232A (en) * 2014-10-30 2015-03-11 中山大学 Combinational logic optimization method and system in high-level synthesis
CN105005638A (en) * 2015-06-04 2015-10-28 广东顺德中山大学卡内基梅隆大学国际联合研究院 High-level synthesis scheduling method based on linear delay model

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408232A (en) * 2014-10-30 2015-03-11 中山大学 Combinational logic optimization method and system in high-level synthesis
CN104360906A (en) * 2014-10-31 2015-02-18 中山大学 High-level synthesis scheduling method based on difference constraint system and iterative model
CN105005638A (en) * 2015-06-04 2015-10-28 广东顺德中山大学卡内基梅隆大学国际联合研究院 High-level synthesis scheduling method based on linear delay model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A high-speed FPGA-based implementation of the SHA-512 algorithm; Guang Yan et al.; Journal of Information Engineering University; 20080331; pp. 94-96 *

Also Published As

Publication number Publication date
CN106682258A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
CN106682258B (en) Multi-operand addition optimization method and system in high-level synthesis tool
CN107729989B (en) Device and method for executing artificial neural network forward operation
US20210349692A1 (en) Multiplier and multiplication method
CN109543830B (en) Splitting accumulator for convolutional neural network accelerator
KR20190055447A (en) Apparatus and method for generating and using neural network model applying accelerated computation
US6601077B1 (en) DSP unit for multi-level global accumulation
CN108012156B (en) Video processing method and control platform
US20100312802A1 (en) Shared-memory multiprocessor system and method for processing information
CN110780923B (en) Hardware accelerator applied to binary convolution neural network and data processing method thereof
CN111240746B (en) Floating point data inverse quantization and quantization method and equipment
CN117813585A (en) Systolic array with efficient input reduced and extended array performance
US20070180015A1 (en) High speed low power fixed-point multiplier and method thereof
CN110716751B (en) High-parallelism computing platform, system and computing implementation method
CN111914987A (en) Data processing method and device based on neural network, equipment and readable medium
CN111694648B (en) Task scheduling method and device and electronic equipment
CN110659014B (en) Multiplier and neural network computing platform
WO2023124371A1 (en) Data processing apparatus and method, and chip, computer device and storage medium
CN112667241B (en) Machine learning instruction conversion method and device, board card, main board and electronic equipment
KR102227437B1 (en) Apparatus and method for generating and using neural network model applying accelerated computation
US6731820B2 (en) Image filter circuit and image filtering method
CN113986194A (en) Neural network approximate multiplier implementation method and device based on preprocessing
CN109255771B (en) Image filtering method and device
CN110648287A (en) Parallel efficient calculation method for box type filter
CN111224674A (en) Decoding method, device and decoder of multi-system LDPC code
US20230004788A1 (en) Hardware architecture for processing tensors with activation sparsity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant