CN109474268B - Circuit structure, circuit board and super computing device - Google Patents

Publication number
CN109474268B
Authority
CN
China
Prior art keywords
adder, input end, register, exclusive, circuit
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811556850.4A
Other languages
Chinese (zh)
Other versions
CN109474268A (en)
Inventor
李文彬
范靖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bitmain Technologies Inc
Original Assignee
Bitmain Technologies Inc
Application filed by Bitmain Technologies Inc
Priority to CN201811556850.4A
Publication of CN109474268A
Application granted
Publication of CN109474268B

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03KPULSE TECHNIQUE
    • H03K19/00Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
    • H03K19/0008Arrangements for reducing power consumption
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50Adding; Subtracting
    • G06F7/501Half or full adders, i.e. basic adder cells for one denomination
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03KPULSE TECHNIQUE
    • H03K19/00Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
    • H03K19/20Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits characterised by logic function, e.g. AND, OR, NOR, NOT circuits
    • H03K19/21EXCLUSIVE-OR circuits, i.e. giving output if input signal exists at only one input; COINCIDENCE circuits, i.e. giving output only if all input signals are identical

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Logic Circuits (AREA)

Abstract

The application provides a circuit structure, a circuit board and a super computing device. The circuit structure includes at least two stages of operation circuit units, wherein adjacent operation circuit units are connected, each operation circuit unit is connected with an output unit for outputting parameters to be calculated, and the operation circuit unit is the smallest unit of a circuit that implements the BLAKE algorithm; a sequential logic element is arranged between the adder and the exclusive-OR gate of each operation circuit unit in the circuit structure, and/or a sequential logic element is arranged at the input end of the adder in the circuit structure. The BLAKE algorithm can thus be implemented by the circuit structure. In addition, the sequential logic element isolates the addition operation from the exclusive-OR operation, and/or removes glitches from the signal input to the adder. As a result, the toggle rate in the circuit structure is reduced, glitch-induced toggling is prevented from propagating, and the dynamic power consumption of the whole circuit structure is reduced.

Description

Circuit structure, circuit board and super computing device
Technical Field
The present application relates to the field of super computing devices, for example, to a circuit structure, a circuit board and a super computing device.
Background
The BLAKE algorithm is an efficient cryptographic hash algorithm that is often used in digital currency algorithms; it can be implemented with a circuit structure.
In the prior art, a circuit structure for implementing the BLAKE algorithm includes a plurality of operation circuit units, each of which is provided with an adder and/or an exclusive-or gate; four operation paths are provided on each operation circuit unit, and a sequential logic element, for example, a register, is provided on an input end of each operation path. The BLAKE algorithm can be realized by the above circuit structure.
However, because the prior-art circuit structure for implementing the BLAKE algorithm contains a plurality of adders and a plurality of exclusive-or gates, cascading these devices produces glitches in the signal output by each operation circuit unit. These glitches increase the toggle rate of the operation circuit unit that receives them and, in turn, the toggle rate of the whole circuit structure, resulting in higher dynamic power consumption of the whole circuit structure.
Disclosure of Invention
The application provides a circuit structure, a circuit board and a super computing device, which are used for solving the problem that, in the prior art, the toggle rate of a circuit structure implementing the BLAKE algorithm is increased and its dynamic power consumption is therefore higher.
In a first aspect, the present application provides a circuit structure for implementing a BLAKE algorithm, including:
at least two stages of operation circuit units, wherein adjacent operation circuit units are connected, each operation circuit unit is connected with an output unit for outputting parameters to be calculated, and the operation circuit unit is the smallest unit applied to a circuit of the BLAKE algorithm;
a sequential logic element is arranged between the adder and the exclusive-or gate of each operation circuit unit on the circuit structure, and/or the input end of the adder on the circuit structure is provided with a sequential logic element.
Further, the arithmetic circuit unit includes a first arithmetic path, a second arithmetic path, a third arithmetic path, and a fourth arithmetic path;
the first operation path is provided with a first adder and a second adder, the second operation path is provided with a first exclusive-OR gate and a first shifter, the third operation path is provided with a third adder, and the fourth operation path is provided with a second exclusive-OR gate and a second shifter;
the input end of the first operation path and the input end of the second operation path are respectively connected with the input end of the first adder, the output end of the first adder is connected with the input end of the second adder, the input end of the second adder is connected with the output end of the output unit, and the output end of the second adder is connected with the input end of the second exclusive-OR gate on the fourth operation path;
The input end of the second operation path is connected with the input end of the first exclusive-OR gate, and the output end of the first exclusive-OR gate is connected with the input end of the first shifter;
the input end of the third operation path is connected with the input end of the third adder, and the output end of the third adder is connected with the input end of the first exclusive-OR gate;
the input end of the fourth operation path is connected with the input end of the second exclusive-OR gate, the output end of the second exclusive-OR gate is connected with the input end of the second shifter, and the output end of the second shifter is connected with the input end of the third adder.
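The connections listed above can be sketched in software as a purely combinational function. This is a minimal model under stated assumptions, not the circuit itself: the words are assumed to be 32 bits wide and the shifters are assumed to be right rotations, as in the G function of BLAKE-256, whereas the text above only says "shifter". The names `rotr32`, `operation_unit`, `param`, `shift_b` and `shift_d` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit words, as in BLAKE-256


def rotr32(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def operation_unit(a, b, c, d, param, shift_b, shift_d):
    """One operation circuit unit, modeled without any registers.

    a, b, c, d: inputs of the first to fourth operation paths.
    param: word supplied by the output unit to the second adder.
    shift_b, shift_d: amounts of the first and second shifters.
    """
    s = (a + b) & MASK32                # first adder: paths 1 and 2
    a_out = (s + param) & MASK32        # second adder: adds the parameter
    d_out = rotr32(d ^ a_out, shift_d)  # second XOR gate, second shifter
    c_out = (c + d_out) & MASK32        # third adder
    b_out = rotr32(b ^ c_out, shift_b)  # first XOR gate, first shifter
    return a_out, b_out, c_out, d_out


result = operation_unit(1, 2, 3, 4, 5, shift_b=12, shift_d=16)
```

Each output depends on the one computed before it, so without registers the three adders and two exclusive-or gates form one long combinational chain.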
Further, the sequential logic elements are a first register, a second register, a third register and a fourth register, respectively.
Further, when a sequential logic element is arranged between the adder and the exclusive-or gate of each operation circuit unit on the circuit structure, the output end of the second adder is connected with the input end of the first register, and the output end of the first register is respectively connected with the input end of the second exclusive-or gate and the input end of the first operation path of the next operation circuit unit;
The output end of the first shifter is connected with the input end of the second register, and the output end of the second register is connected with the input end of a second operation path of the next stage operation circuit unit;
the output end of the third adder is connected with the input end of the third register, and the output end of the third register is respectively connected with the input end of the first exclusive-OR gate and the input end of a third operation path of the next-stage operation circuit unit;
the output end of the second shifter is connected with the input end of the fourth register, and the output end of the fourth register is respectively connected with the input end of the third adder and the input end of a fourth operation path of the next-stage operation circuit unit.
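The register placement described above can be illustrated with a small clocked model. It is only a sketch under assumptions: 32-bit words, shifters modeled as right rotations, all four registers updating on the same clock edge, and the unit's inputs held constant until the result has passed through all four registers. How the actual circuit aligns its inputs in time is not modeled here. The class and function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit words


def rotr32(x, n):
    """Rotate a 32-bit word right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def combinational_unit(a, b, c, d, param, shift_b, shift_d):
    """Reference result of one unit with no registers at all."""
    a_out = (a + b + param) & MASK32
    d_out = rotr32(d ^ a_out, shift_d)
    c_out = (c + d_out) & MASK32
    b_out = rotr32(b ^ c_out, shift_b)
    return a_out, b_out, c_out, d_out


class RegisteredUnit:
    """Unit with a register after each adder/shifter output (r1..r4)."""

    def __init__(self, shift_b, shift_d):
        self.shift_b, self.shift_d = shift_b, shift_d
        self.r1 = self.r2 = self.r3 = self.r4 = 0

    def clock(self, a, b, c, d, param):
        # Next values are computed from the CURRENT register contents,
        # then all registers update together, like a clock edge.
        n1 = (a + b + param) & MASK32            # second adder -> r1
        n4 = rotr32(d ^ self.r1, self.shift_d)   # second XOR reads r1 -> r4
        n3 = (c + self.r4) & MASK32              # third adder reads r4 -> r3
        n2 = rotr32(b ^ self.r3, self.shift_b)   # first XOR reads r3 -> r2
        self.r1, self.r2, self.r3, self.r4 = n1, n2, n3, n4
        return self.r1, self.r2, self.r3, self.r4


unit = RegisteredUnit(shift_b=12, shift_d=16)
for _ in range(4):  # four edges: one per register on the dependency chain
    registered = unit.clock(1, 2, 3, 4, 5)
reference = combinational_unit(1, 2, 3, 4, 5, 12, 16)
```

In this model each exclusive-or gate only ever sees a registered adder output, so intermediate transitions inside an adder are not visible to the logic that follows it.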
Further, when a sequential logic element is arranged on the input end of the adder on the circuit structure, the input end of the first operation path is connected with the input end of the first register, and the output end of the first register is connected with the input end of the first adder;
the input end of the second operation path is connected with the input end of the second register, and the output end of the second register is respectively connected with the input end of the first adder and the input end of the first exclusive-OR gate;
The input end of the third operation path is connected with the input end of the third register, and the output end of the third register is connected with the input end of the third adder;
the output end of the second shifter is connected with the input end of the fourth register, and the output end of the fourth register is respectively connected with the input end of the third adder and the input end of a fourth operation path of the next-stage operation circuit unit.
Further, the sequential logic element is any one or more of the following: flip-flops, counters, registers.
Further, in two adjacent stages of operation circuit units, the first shifter in the upper-stage operation circuit unit shifts right by 12 bits, and the second shifter in the upper-stage operation circuit unit shifts right by 16 bits;
the first shifter in the next-stage operation circuit unit shifts right by 7 bits, and the second shifter in the next-stage operation circuit unit shifts right by 8 bits.
Further, the output end of the first operation path of the current operation circuit unit is connected with the input end of the first operation path of the next operation circuit unit; the output end of the second operation path of the current operation circuit unit is connected with the input end of the second operation path of the next stage operation circuit unit; the output end of the third operation path of the current operation circuit unit is connected with the input end of the third operation path of the next stage operation circuit unit; the output end of the fourth operation path of the current operation circuit unit is connected with the input end of the fourth operation path of the next operation circuit unit.
Further, the output unit comprises a first parameter output device, a second parameter output device and a third exclusive-or gate;
the output end of the first parameter output device and the output end of the second parameter output device are respectively connected with the input end of the third exclusive-OR gate, and the output end of the third exclusive-OR gate is connected with the input end of the second adder.
In a second aspect, the present application provides a circuit board provided with a circuit structure as defined in any one of the above.
In a third aspect, the present application provides a super computing device comprising at least one circuit board as described above.
In the above aspects, a circuit structure constituted by at least two stages of operation circuit units is provided, wherein adjacent operation circuit units are connected, each operation circuit unit is connected with an output unit for outputting a parameter to be calculated, and the operation circuit unit is the smallest unit of a circuit that implements the BLAKE algorithm; a sequential logic element is arranged between the adder and the exclusive-OR gate of each operation circuit unit in the circuit structure, and/or a sequential logic element is arranged at the input end of the adder in the circuit structure. The BLAKE algorithm can thus be implemented by the circuit structure. In addition, the sequential logic element isolates the addition operation from the exclusive-OR operation, and/or removes glitches from the signal input to the adder. As a result, the toggle rate in the circuit structure is reduced, glitch-induced toggling is prevented from propagating, and the dynamic power consumption of the whole circuit structure is reduced.
Drawings
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like reference numerals refer to similar elements, and in which:
fig. 1 is a schematic structural diagram of a circuit structure according to an embodiment of the present application;
fig. 2 is a second schematic structural diagram of a circuit structure according to an embodiment of the present application;
fig. 3 is a third schematic structural diagram of a circuit structure according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a circuit structure according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of another circuit structure according to an embodiment of the present application;
fig. 6 is a second schematic structural diagram of another circuit structure according to an embodiment of the present application;
fig. 7 is a third schematic structural diagram of another circuit structure according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of still another circuit structure according to an embodiment of the present application;
fig. 9 is a clock sequence diagram provided in an embodiment of the present application;
fig. 10 is a schematic structural diagram of still another circuit structure according to an embodiment of the present application;
fig. 11 is a clock sequence diagram provided in an embodiment of the present application;
fig. 12 is a schematic structural diagram of a circuit board according to an embodiment of the present application;
fig. 13 is a schematic structural diagram of a super computing device according to an embodiment of the present application.
Reference numerals:
1: arithmetic circuit unit
2: output unit
3: sequential logic element
4: first operation path
5: second operation path
6: third operation path
7: fourth operation path
8: first adder
9: second adder
10: first exclusive-OR gate
11: first shifter
12: third adder
13: second exclusive-OR gate
14: second shifter
15: first parameter output device
16: second parameter output device
17: third exclusive-OR gate
18: first register
19: second register
20: third register
21: fourth register
22: circuit board body
161: circuit board
Detailed Description
For a more complete understanding of the features and technical content of the embodiments of the present application, reference should be made to the following detailed description of the embodiments of the present application, taken in conjunction with the accompanying drawings, which are for purposes of illustration only and not intended to limit the embodiments of the present application. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may still be practiced without these details. In other instances, well-known structures and devices may be shown simplified in order to simplify the drawing.
The embodiments of the application are applied to super computing devices. It should be noted that, when the solution of the embodiments of the present application is applied to an existing circuit structure or a circuit structure that may appear in the future, to an existing circuit board or a circuit board that may appear in the future, or to an existing super computing device or a super computing device that may appear in the future, the names of the respective structures may change, but this does not affect implementation of the solution of the embodiments of the present application.
First, terms appearing in the present application are explained.
1) BLAKE algorithm: a cryptographic hash algorithm that was a candidate for the third-generation Secure Hash Algorithm (SHA-3); it is an efficient algorithm commonly used in digital currency algorithms. The BLAKE algorithm can typically be implemented with a circuit structure.
2) Phase: for a wave, the position of a particular moment within the wave's cycle.
It should be noted that the terms and explanations in the embodiments of the present application may be referred to one another and are not repeated.
In the prior art, a circuit structure for implementing the BLAKE algorithm includes a plurality of operation circuit units, each of which is provided with an adder and/or an exclusive-or gate; four operation paths are provided on each operation circuit unit, and a sequential logic element, for example a register, is provided at the input end of each operation path. The BLAKE algorithm can be implemented by this circuit structure. However, because the circuit structure contains a plurality of adders and a plurality of exclusive-or gates, cascading these devices produces glitches in the signal output by each operation circuit unit. These glitches increase the toggle rate of the operation circuit unit that receives them and, in turn, the toggle rate of the whole circuit structure, resulting in higher dynamic power consumption of the whole circuit structure.
The circuit structure, the circuit board and the super computing equipment provided by the application aim to solve the technical problems in the prior art.
The following describes the technical solutions of the present application and how the technical solutions of the present application solve the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a circuit structure provided in an embodiment of the present application, where the circuit structure is applied to implementation of a BLAKE algorithm, as shown in fig. 1, and includes:
at least two stages of operation circuit units 1, adjacent operation circuit units 1 are connected, each operation circuit unit 1 is connected with an output unit 2 for outputting a parameter to be calculated, and the operation circuit unit 1 is a minimum unit applied to a circuit of the BLAKE algorithm.
A sequential logic element 3 is arranged between the adder and the exclusive-or gate of each arithmetic circuit unit 1 in the circuit structure, and/or a sequential logic element 3 is arranged on the input end of the adder in the circuit structure.
Optionally, sequential logic element 3 is any one or more of: flip-flops, counters, registers.
Illustratively, the circuit structure provided by the embodiment of the application is used for realizing the BLAKE algorithm and further completing the encryption operation.
The circuit structure provided by the embodiment of the application comprises N stages of operation circuit units 1, where N is a positive integer greater than or equal to 2. Adjacent arithmetic circuit units 1 are connected; each arithmetic circuit unit 1 is the smallest unit of the prior-art circuit that implements the BLAKE algorithm, and is provided with an adder and an exclusive-or gate. Each arithmetic circuit unit 1 is connected to a corresponding output unit 2, which inputs the parameter to be calculated, as a signal, into the arithmetic circuit unit 1.
Because each arithmetic circuit unit 1 is provided with an adder and an exclusive-or gate, the circuit structure formed by the multi-stage arithmetic circuit units 1 contains adders and exclusive-or gates that together form combinational logic. The signal output by the previous-stage arithmetic circuit unit 1 therefore carries glitches, so the signal received by the next-stage arithmetic circuit unit 1 carries glitches as well, which increases the toggle rate (the frequency at which its signals switch) of the next-stage arithmetic circuit unit 1.
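How an adder produces glitches can be illustrated with a toy unit-delay simulation. The sketch below is an illustrative assumption, not the adder used in the circuit: it models a 4-bit ripple-carry adder in which every full adder updates one gate delay after its inputs, and it counts how often the sum bits toggle while the carries settle. The function names are hypothetical.

```python
def step(a, b, carries, width):
    """Advance every full adder by one gate delay, using the old carries."""
    new_sums, new_carries = [0] * width, [0] * (width + 1)
    for i in range(width):
        ai, bi, ci = (a >> i) & 1, (b >> i) & 1, carries[i]
        new_sums[i] = ai ^ bi ^ ci
        new_carries[i + 1] = (ai & bi) | (ai & ci) | (bi & ci)
    return new_sums, new_carries


def settle(a, b, sums, carries, width=4):
    """Run until stable; return final sums, carries and sum-bit toggles."""
    toggles = 0
    while True:
        new_sums, new_carries = step(a, b, carries, width)
        toggles += sum(old != new for old, new in zip(sums, new_sums))
        if (new_sums, new_carries) == (sums, carries):
            return sums, carries, toggles
        sums, carries = new_sums, new_carries


def to_int(bits):
    """Convert a little-endian bit list to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


# Let the adder settle on 3 + 1, then switch its inputs to 2 + 2.
sums, carries, _ = settle(0b0011, 0b0001, [0] * 4, [0] * 5)
before = to_int(sums)
sums_after, _, glitch_toggles = settle(0b0010, 0b0010, sums, carries)
after = to_int(sums_after)

# A register sampling only the settled value passes on just the net change.
registered_toggles = bin(before ^ after).count("1")
```

Both input pairs sum to 4, so a register that samples only the settled output sees no change, while the unregistered sum output toggles twice before returning to the same value. Those extra transitions are what downstream combinational logic would otherwise receive.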
To address the above issues, the present application provides several implementations.
The first implementation: fig. 2 is a second schematic structural diagram of the circuit structure provided in the embodiment of the present application. As shown in fig. 2, a sequential logic element 3 is arranged between the adder and the exclusive-or gate of each operation circuit unit 1 in the circuit structure. For example, on the operation circuit unit 1, one sequential logic element 3 is provided between each pair of mutually connected adder and exclusive-or gate; alternatively, a plurality of sequential logic elements 3 are provided between each such pair; alternatively, one sequential logic element 3 is provided between only one or some of the pairs; alternatively, a plurality of sequential logic elements 3 are provided between only one or some of the pairs. In this way, the sequential logic element 3 isolates the adder from the exclusive-or gate in the circuit structure, that is, it isolates the addition operation from the exclusive-or operation; the sequential logic element 3 also removes glitches from the signal, thereby reducing the propagation of glitch-induced toggling.
The second implementation: fig. 3 is a third schematic structural diagram of the circuit structure provided in the embodiment of the present application. As shown in fig. 3, a sequential logic element 3 is arranged at the input end of the adder in the circuit structure. For example, one sequential logic element 3 is connected before the input end of the first adder on each operation path in the circuit structure; alternatively, a plurality of sequential logic elements 3 are connected before the input end of the first adder on each operation path; alternatively, one sequential logic element 3 is connected before the input end of each adder; alternatively, a plurality of sequential logic elements 3 are connected before the input end of each adder; alternatively, one sequential logic element 3 is connected before the input end of only one or some of the adders; alternatively, a plurality of sequential logic elements 3 are connected before the input end of only one or some of the adders. Because the adder accounts for a large share of the logic, arranging the sequential logic element 3 at the input end of the adder allows it to remove glitches from the signal input to the adder, thereby reducing the propagation of glitch-induced toggling.
The third implementation: fig. 4 is a schematic structural diagram of the circuit structure provided in the embodiment of the present application. As shown in fig. 4, a sequential logic element 3 is arranged between the adder and the exclusive-or gate of each operation circuit unit 1 in the circuit structure, and a sequential logic element 3 is also arranged at the input end of the adder in the circuit structure. For example, on the operation circuit unit 1, one sequential logic element 3 is provided between each pair of mutually connected adder and exclusive-or gate, and one sequential logic element 3 is connected before the input end of the first adder on each operation path; alternatively, a plurality of sequential logic elements 3 are provided between each such pair, and a plurality of sequential logic elements 3 are connected before the input end of the first adder on each operation path; alternatively, one sequential logic element 3 is provided between only one or some of the pairs, and one sequential logic element 3 is connected before the input end of only one or some of the adders; alternatively, a plurality of sequential logic elements 3 are provided between only one or some of the pairs, and a plurality of sequential logic elements 3 are connected before the input end of only one or some of the adders. In this way, the sequential logic element 3 isolates the adder from the exclusive-or gate, that is, it isolates the addition operation from the exclusive-or operation; meanwhile, the sequential logic element 3 at the input end of the adder removes glitches from the signal input to the adder, thereby reducing the propagation of glitch-induced toggling.
The sequential logic element 3 may be one or more of a flip-flop, a counter and a register. For example, one or more registers are provided in the circuit structure provided in the embodiment of the present application; alternatively, one or more counters are provided; alternatively, one or more registers and one or more counters are provided.
In the present embodiment, a circuit structure constituted by at least two stages of arithmetic circuit units 1 is provided, wherein adjacent arithmetic circuit units 1 are connected, each arithmetic circuit unit 1 is connected to an output unit 2 for outputting a parameter to be calculated, and the arithmetic circuit unit 1 is the smallest unit of a circuit that implements the BLAKE algorithm; a sequential logic element 3 is arranged between the adder and the exclusive-or gate of each arithmetic circuit unit 1 in the circuit structure, and/or a sequential logic element 3 is arranged at the input end of the adder in the circuit structure. The BLAKE algorithm can thus be implemented by the circuit structure. In addition, the sequential logic element 3 isolates the addition operation from the exclusive-or operation, and/or removes glitches from the signal input to the adder. As a result, the toggle rate in the circuit structure is reduced, glitch-induced toggling is prevented from propagating, and the dynamic power consumption of the whole circuit structure is reduced.
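The link between toggle rate and dynamic power can be made concrete with the usual first-order CMOS switching-power estimate, in which power is proportional to the switching activity. The sketch below uses that textbook approximation; it is not a formula given in this text. The numeric values are hypothetical and chosen only to show the proportionality, and the estimate ignores the power drawn by the added sequential logic elements themselves.

```python
def dynamic_power(activity, capacitance, voltage, frequency):
    """First-order switching power in watts: activity * C * V^2 * f.

    activity: average fraction of nodes that toggle per clock cycle.
    capacitance: switched capacitance in farads.
    voltage: supply voltage in volts.
    frequency: clock frequency in hertz.
    """
    return activity * capacitance * voltage ** 2 * frequency


# Hypothetical values: halving the toggle activity, all else being equal.
glitchy = dynamic_power(activity=0.4, capacitance=1e-9,
                        voltage=0.8, frequency=5e8)
registered = dynamic_power(activity=0.2, capacitance=1e-9,
                           voltage=0.8, frequency=5e8)
ratio = registered / glitchy
```

With capacitance, voltage and clock frequency unchanged, halving the toggle activity halves the estimated dynamic power, which is the effect the embodiment aims for.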
Fig. 5 is a schematic structural diagram of another circuit structure provided in the embodiment of the present application, which is applied to implementation of the BLAKE algorithm. On the basis of the embodiment shown in fig. 1, as shown in fig. 5, the arithmetic circuit unit 1 includes a first arithmetic path 4, a second arithmetic path 5, a third arithmetic path 6, and a fourth arithmetic path 7.
The first arithmetic path 4 is provided with a first adder 8 and a second adder 9, the second arithmetic path 5 is provided with a first exclusive-or gate 10 and a first shifter 11, the third arithmetic path 6 is provided with a third adder 12, and the fourth arithmetic path 7 is provided with a second exclusive-or gate 13 and a second shifter 14.
The input end of the first operation path 4 and the input end of the second operation path 5 are respectively connected with the input end of the first adder 8, the output end of the first adder 8 is connected with the input end of the second adder 9, the input end of the second adder 9 is connected with the output end of the output unit 2, and the output end of the second adder 9 is connected with the input end of the second exclusive-or gate 13 on the fourth operation path 7.
The input end of the second operation path 5 is connected to the input end of the first exclusive-or gate 10, and the output end of the first exclusive-or gate 10 is connected to the input end of the first shifter 11.
An input of the third arithmetic path 6 is connected to an input of the third adder 12, and an output of the third adder 12 is connected to an input of the first exclusive or gate 10.
The input end of the fourth operation path 7 is connected to the input end of the second exclusive-or gate 13, the output end of the second exclusive-or gate 13 is connected to the input end of the second shifter 14, and the output end of the second shifter 14 is connected to the input end of the third adder 12.
Optionally, in two adjacent stages of arithmetic circuit units 1, the first shifter 11 in the upper-stage arithmetic circuit unit 1 shifts right by 12 bits, and the second shifter 14 in the upper-stage arithmetic circuit unit 1 shifts right by 16 bits; the first shifter 11 in the next-stage arithmetic circuit unit 1 shifts right by 7 bits, and the second shifter 14 in the next-stage arithmetic circuit unit 1 shifts right by 8 bits.
Optionally, the output end of the first operation path 4 of the current operation circuit unit 1 is connected with the input end of the first operation path 4 of the next operation circuit unit 1; the output end of the second operation path 5 of the current operation circuit unit 1 is connected with the input end of the second operation path 5 of the next operation circuit unit 1; the output end of the third operation path 6 of the current operation circuit unit 1 is connected with the input end of the third operation path 6 of the next operation circuit unit 1; the output end of the fourth operation path 7 of the current operation circuit unit 1 is connected to the input end of the fourth operation path 7 of the next operation circuit unit 1.
Optionally, the output unit 2 comprises a first parameter output device 15, a second parameter output device 16 and a third exclusive or gate 17; the output end of the first parameter output device 15 and the output end of the second parameter output device 16 are respectively connected with the input end of the third exclusive-OR gate 17, and the output end of the third exclusive-OR gate 17 is connected with the input end of the second adder 9.
Illustratively, the circuit structure provided by the embodiment of the application is used for realizing the BLAKE algorithm and further completing the encryption operation.
On the basis of the circuit structure provided in the above embodiment, each of the arithmetic circuit units 1 provided in the embodiment of the present application includes four arithmetic paths, which are the first arithmetic path 4, the second arithmetic path 5, the third arithmetic path 6, and the fourth arithmetic path 7, respectively.
The first arithmetic path 4 is provided with a first adder 8 and a second adder 9 which are connected to each other, the second arithmetic path 5 is provided with a first exclusive-or gate 10 and a first shifter 11 which are connected to each other, the third arithmetic path 6 is provided with a third adder 12, and the fourth arithmetic path 7 is provided with a second exclusive-or gate 13 and a second shifter 14 which are connected to each other. Each of the first arithmetic path 4, the second arithmetic path 5, the third arithmetic path 6, and the fourth arithmetic path 7 has an input end and an output end.
As shown in fig. 5, in the first arithmetic path 4, the input terminal of the first arithmetic path 4 is connected to the input terminal of the first adder 8, the input terminal of the second arithmetic path 5 is connected to the input terminal of the first adder 8, the output terminal of the first adder 8 is connected to the input terminal of the second adder 9, the input terminal of the second adder 9 is connected to the output terminal of the output unit 2, and the output terminal of the second adder 9 is connected to the input terminal of the second exclusive or gate 13 in the fourth arithmetic path 7. The first adder 8 and the second adder 9 may constitute one combinational logic element.
In the second arithmetic path 5, the input terminal of the second arithmetic path 5 is connected to the input terminal of the first exclusive or gate 10, and the output terminal of the first exclusive or gate 10 is connected to the input terminal of the first shifter 11. The first exclusive-or gate 10 and the first shifter 11 may constitute a combinational logic element.
In the third arithmetic path 6, an input terminal of the third arithmetic path 6 is connected to an input terminal of the third adder 12, and an output terminal of the third adder 12 is connected to an input terminal of the first exclusive or gate 10. The third adder 12 may constitute a combinational logic element.
In the fourth arithmetic path 7, the input terminal of the fourth arithmetic path 7 is connected to the input terminal of the second exclusive or gate 13, the output terminal of the second exclusive or gate 13 is connected to the input terminal of the second shifter 14, and the output terminal of the second shifter 14 is connected to the input terminal of the third adder 12. The second xor gate 13 and the second shifter 14 may constitute one combinational logic element.
Each output unit 2 is constituted by a first parameter output device 15, a second parameter output device 16, and a third exclusive or gate 17; the output end of the first parameter output device 15 and the output end of the second parameter output device 16 are respectively connected with the input end of the third exclusive-or gate 17, and the output end of the third exclusive-or gate 17 is connected with the input end of the second adder 9 on the arithmetic circuit unit 1 corresponding to the output unit 2.
In the circuit structure provided in the embodiment of the present application, as shown in fig. 5, two adjacent stages of arithmetic circuit units 1 consist of an upper-stage arithmetic circuit unit 1 and a next-stage arithmetic circuit unit 1. In the upper-stage arithmetic circuit unit 1, the first shifter 11 on the second arithmetic path 5 is a shifter that shifts right by 12 bits, and the second shifter 14 on the fourth arithmetic path 7 is a shifter that shifts right by 16 bits; in the next-stage arithmetic circuit unit 1, the first shifter 11 on the second arithmetic path 5 is a shifter that shifts right by 7 bits, and the second shifter 14 on the fourth arithmetic path 7 is a shifter that shifts right by 8 bits.
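The connection relationship described above can be sketched in software as a purely combinational data flow. The following Python sketch is illustrative only and is not taken from the application: the function names (`rotr32`, `stage`, `two_stages`) are hypothetical, the 32-bit word width is an assumption, and the shifters 11 and 14 are modelled as right rotations, as in the published BLAKE-256 round function, although the text above only calls them right shifters.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit words, as in BLAKE-256


def rotr32(x, n):
    """Rotate a 32-bit word right by n bits (assumed behaviour of shifters 11 and 14)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def stage(a, b, c, d, p1, p2, shift_d, shift_b):
    """One arithmetic circuit unit 1, evaluated without any sequential logic element."""
    param = p1 ^ p2                 # third exclusive-or gate 17 in output unit 2
    a = (a + b) & MASK32            # first adder 8: inputs of paths 4 and 5
    a = (a + param) & MASK32        # second adder 9: adds the output-unit value
    d = rotr32(d ^ a, shift_d)      # second exclusive-or gate 13, second shifter 14
    c = (c + d) & MASK32            # third adder 12
    b = rotr32(b ^ c, shift_b)      # first exclusive-or gate 10, first shifter 11
    return a, b, c, d


def two_stages(a, b, c, d, params):
    """Upper stage uses shifts (16, 12); the next stage uses (8, 7)."""
    (p1, p2), (p3, p4) = params
    a, b, c, d = stage(a, b, c, d, p1, p2, 16, 12)
    return stage(a, b, c, d, p3, p4, 8, 7)
```

Each path output of the upper stage feeds the same-numbered path input of the next stage, which is why `two_stages` simply passes the four values through in order.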
In addition to the above connection relationship, sequential logic elements 3 are provided on the respective operation paths of the operation circuit unit 1.
Several implementations are provided.
The first implementation mode: as shown in fig. 5, in each arithmetic circuit unit 1 of the circuit structure, each adder is connected to an exclusive-or gate. One sequential logic element 3, or a plurality of sequential logic elements 3, may therefore be provided between at least one pair of interconnected adder and exclusive-or gate, so that the addition operation and the exclusive-or operation are isolated from each other. Because the sequential logic element 3 is arranged between at least one pair of adder and exclusive-or gate, it can filter the glitches in the signals output by the previous-stage arithmetic circuit unit 1, so that the signals received by the next-stage arithmetic circuit unit 1 are free of glitches. This reduces the timing frequency of the next-stage arithmetic circuit unit 1, reduces the propagation of the timing frequency, and reduces the dynamic power consumption of the circuit structure.
The second implementation mode: fig. 6 is a second schematic structural diagram of the circuit structure provided in the embodiment of the present application. As shown in fig. 6, one sequential logic element 3, or a plurality of sequential logic elements 3, is provided at the input end of at least one adder in the circuit structure. Since the adders account for a large proportion of the logic, providing the sequential logic element 3 at the input end of the adder allows the sequential logic element 3 to remove glitches from the signals input into the adder. This reduces the timing frequency of each stage of arithmetic circuit unit 1, reduces the propagation of the timing frequency, and reduces the dynamic power consumption of the circuit structure.
The third implementation mode: fig. 7 is a third schematic structural diagram of the circuit structure provided in the embodiment of the present application. As shown in fig. 7, on the arithmetic circuit unit 1, one sequential logic element 3 may be provided between at least one pair of interconnected adder and exclusive-or gate, and one sequential logic element 3 may be provided at the input end of at least one adder in the circuit structure; alternatively, a plurality of sequential logic elements 3 are provided between at least one pair of interconnected adder and exclusive-or gate, and a plurality of sequential logic elements 3 are provided at the input end of at least one adder in the circuit structure. With sequential logic elements 3 placed both between the interconnected adder and exclusive-or gate and at the input end of the adder, the sequential logic elements 3 filter the glitches in the signals output by the previous-stage arithmetic circuit unit 1 and also filter the glitches in the signals input into each stage of arithmetic circuit unit 1. This reduces the timing frequency of each stage of arithmetic circuit unit 1, reduces the propagation of the timing frequency, and reduces the dynamic power consumption of the circuit structure.
In the above-described circuit configuration, each arithmetic circuit unit 1 of the circuit configuration has four arithmetic paths, and the output terminal of each arithmetic path is the output terminal of the element at the end of the arithmetic path. For example, in fig. 5, the output end of the operation path where the sequential logic element 3 is set is the output end of the sequential logic element 3 on the operation path, and the output end of the operation path where the sequential logic element 3 is not set is the output end of the element on the end of the operation path; for example, if the sequential logic element 3 is not provided on the third operation path 6, the output of the third adder 12 on the third operation path 6 is the output of the third operation path 6.
In the above-described circuit configuration, for the adjacent two-stage arithmetic circuit units 1, the adjacent two-stage arithmetic circuit units 1 include the upper-stage arithmetic circuit unit 1 and the lower-stage arithmetic circuit unit 1; the output end of the first operation path 4 of the previous stage operation circuit unit 1 needs to be connected to the input end of the first operation path 4 of the next stage operation circuit unit 1, the output end of the second operation path 5 of the previous stage operation circuit unit 1 needs to be connected to the input end of the second operation path 5 of the next stage operation circuit unit 1, the output end of the third operation path 6 of the previous stage operation circuit unit 1 needs to be connected to the input end of the third operation path 6 of the next stage operation circuit unit 1, and the output end of the fourth operation path 7 of the previous stage operation circuit unit 1 needs to be connected to the input end of the fourth operation path 7 of the next stage operation circuit unit 1. By analogy, an N-stage arithmetic circuit unit 1 is obtained, and the N-stage arithmetic circuit unit 1 constitutes the circuit structure provided in the embodiment of the present application.
In the present application, a circuit structure formed by at least two stages of arithmetic circuit units 1 is provided, in which adjacent arithmetic circuit units 1 are connected, each arithmetic circuit unit 1 is connected with an output unit 2 for outputting parameters to be calculated, and the arithmetic circuit unit 1 is the smallest unit of a circuit applied to the BLAKE algorithm; a sequential logic element 3 is arranged between the adder and the exclusive-or gate of each arithmetic circuit unit 1 in the circuit structure, and/or a sequential logic element 3 is arranged at the input end of the adder in the circuit structure. The BLAKE algorithm can thus be realized through the circuit structure. Because the sequential logic element 3 is arranged between at least one pair of interconnected adder and exclusive-or gate, it can filter the glitches in the signals output by the previous-stage arithmetic circuit unit 1, so that the signals received by the next-stage arithmetic circuit unit 1 are free of glitches and the timing frequency of the next-stage arithmetic circuit unit 1 is reduced. Because the sequential logic element 3 is arranged at the input end of the adder, it can remove the glitches of the signals input into the adder, which further reduces the timing frequency of each stage of arithmetic circuit unit 1. In this way, the propagation of the timing frequency in the circuit structure is reduced, and the dynamic power consumption of the circuit structure is reduced.
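The glitch-filtering effect attributed to the sequential logic element 3 can be illustrated with a toy model. The sketch below is not a circuit simulator and is not part of the application: the waveform values are hypothetical, and a register is modelled simply as a device that passes on only the settled value of each clock cycle. Counting bit flips on the downstream bus then gives a rough proxy for the switching activity that drives dynamic power consumption.

```python
def toggles(values, width=32):
    """Count bit flips between consecutive values observed on a bus."""
    mask = (1 << width) - 1
    return sum(bin((x ^ y) & mask).count("1") for x, y in zip(values, values[1:]))


def unregistered(cycles):
    """Without a register, every transient value propagates to the next stage."""
    return [value for cycle in cycles for value in cycle]


def registered(cycles):
    """With a register, the next stage sees only the settled (last) value per cycle."""
    return [cycle[-1] for cycle in cycles]


# Hypothetical waveform: each inner list holds the values an adder output
# passes through during one clock cycle before settling.
cycles = [[0b0000, 0b0101, 0b0011], [0b0011, 0b1111, 0b0110, 0b0010]]

raw = toggles(unregistered(cycles), width=4)    # flips seen without a register
clean = toggles(registered(cycles), width=4)    # flips seen behind a register
```

In this toy waveform the registered bus toggles far less often than the unregistered one, which is the qualitative behaviour the embodiment relies on.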
Fig. 8 is a schematic structural diagram of another circuit structure provided in the embodiment of the present application, which is applied to implementation of the BLAKE algorithm. Based on the circuit structure provided in the embodiments of figs. 1 to 7, as shown in fig. 8, the sequential logic elements 3 in this circuit structure are a first register 18, a second register 19, a third register 20, and a fourth register 21.
When the sequential logic element 3 is disposed between the adder and the exclusive-or gate of each arithmetic circuit unit 1 in the circuit structure, the output end of the second adder 9 is connected to the input end of the first register 18, and the output end of the first register 18 is connected to the input end of the second exclusive-or gate 13 and the input end of the first arithmetic path 4 of the next arithmetic circuit unit 1.
The output end of the first shifter 11 is connected to the input end of the second register 19, and the output end of the second register 19 is connected to the input end of the second operation path 5 of the next stage operation circuit unit 1.
The output terminal of the third adder 12 is connected to the input terminal of the third register 20, and the output terminal of the third register 20 is connected to the input terminal of the first exclusive-or gate 10 and the input terminal of the third operation path 6 of the next stage operation circuit unit 1, respectively.
The output terminal of the second shifter 14 is connected to the input terminal of the fourth register 21, and the output terminal of the fourth register 21 is connected to the input terminal of the third adder 12 and the input terminal of the fourth operation path 7 of the next stage operation circuit unit 1, respectively.
Illustratively, sequential logic element 3 includes a first register 18, a second register 19, a third register 20, and a fourth register 21. One register may be provided on each operation path of each operation circuit unit 1.
When the sequential logic element 3 is provided between the adder and the exclusive-or gate of each arithmetic circuit unit 1 in the circuit configuration, a register is provided between each pair of the adder and the exclusive-or gate.
For each arithmetic circuit unit 1, a first register 18 is provided between the second adder 9 on the first arithmetic path 4 and the second exclusive-or gate 13 on the fourth arithmetic path 7 of the arithmetic circuit unit 1. As shown in fig. 8, the output of the second adder 9 is connected to the input of the first register 18; the output of the first register 18 is coupled to the input of the second exclusive-or gate 13; the output of the first register 18 is connected to the input of the first arithmetic path 4 of the next arithmetic circuit unit 1, whereby the output of the first register 18 is connected to the input of the first adder 8 on the first arithmetic path 4 of the next arithmetic circuit unit 1.
A second register 19 is provided between the first exclusive or gate 10 on the second operation path 5 and the first adder 8 on the first operation path 4 of the next stage operation circuit unit 1. As shown in fig. 8, the output terminal of the first shifter 11 on the second operation path 5 is connected to the input terminal of the second register 19; the output of the second register 19 is connected to the input of the second operation path 5 of the next stage operation circuit unit 1, so that the output of the second register 19 is connected to the input of the first adder 8 on the first operation path 4 of the next stage operation circuit unit 1 and the input of the first exclusive-or gate 10 on the second operation path 5 of the next stage operation circuit unit 1, respectively.
A third register 20 is provided between the third adder 12 on the third operation path 6 and the first exclusive-or gate 10 on the second operation path 5. As shown in fig. 8, the output of the third adder 12 is connected to the input of the third register 20; the output of the third register 20 is connected to the input of the first exclusive-or gate 10; the output of the third register 20 is connected to the input of the third operation path 6 of the next-stage operation circuit unit 1, whereby the output of the third register 20 is connected to the input of the third adder 12 on the third operation path 6 of the next-stage operation circuit unit 1.
A fourth register 21 is provided between the second exclusive-or gate 13 on the fourth operation path 7 and the third adder 12 on the third operation path 6. As shown in fig. 8, the output of the second shifter 14 is connected to the input of the fourth register 21; the output of the fourth register 21 is connected to the input of the third adder 12; the output of the fourth register 21 is connected to the input of the fourth operation path 7 of the next-stage operation circuit unit 1, whereby the output of the fourth register 21 is connected to the second exclusive-or gate 13 on the fourth operation path 7 of the next-stage operation circuit unit 1.
By the above connection, the following procedure is performed for each stage of the arithmetic circuit unit 1.
The third exclusive-or gate 17 receives the signal output from the first parameter output device 15 and the signal output from the second parameter output device 16, performs the exclusive-or operation on the received signals, and inputs the resulting signal to the second adder 9. The first adder 8 receives signals through the input end of the first operation path 4 and the input end of the second operation path 5, adds the two signals, and inputs the signal obtained by the addition to the second adder 9; the second adder 9 thus obtains two signals. The second adder 9 adds the two received signals and inputs the signal obtained by the addition to the first register 18. The first register 18 filters the signal to remove glitches in the signal, and then inputs the filtered signal into the second exclusive-or gate 13 and into the first adder 8 of the next-stage arithmetic circuit unit 1, so that the signal obtained by the first operation path 4 of the operation circuit unit 1 next to the current operation circuit unit 1 is glitch-free.
The first exclusive-or gate 10 receives the signal through the input end of the second operation path 5; the first exclusive-or gate 10 performs exclusive-or operation on the received signal, and then inputs the signal obtained by the exclusive-or operation into the first shifter 11; the first shifter 11 performs displacement processing on the signals to obtain signals after the displacement processing; the first shifter 11 inputs the signal after the shift processing to the second register 19; the second register 19 filters the signal to remove glitches in the signal; then, the second register 19 inputs the filtered signal into the first adder 8 of the next-stage arithmetic circuit unit 1 and the first exclusive-or gate 10 of the next-stage arithmetic circuit unit 1; so that the signal obtained by the first operation path 4 of the next-stage operation circuit unit 1 is free from glitches, and the signal obtained by the second operation path 5 of the next-stage operation circuit unit 1 is free from glitches.
The third adder 12 receives the signal through the input of the third operation path 6 and the signal output by the fourth register 21; the third adder 12 adds the received signals, and then inputs the added signal into the third register 20. The third register 20 filters the signal to remove glitches in the signal, and then inputs the filtered signal into the first exclusive-or gate 10 and into the third adder 12 of the next-stage arithmetic circuit unit 1, so that the signal obtained by the third operation path 6 of the operation circuit unit 1 next to the current operation circuit unit 1 is glitch-free.
The second exclusive-or gate 13 receives the signal through the input terminal of the fourth operation path 7 and receives the signal output by the first register 18; the second exclusive-or gate 13 performs the exclusive-or operation on the two received signals, and then inputs the resulting signal into the second shifter 14. The second shifter 14 shifts the signal and inputs the shifted signal to the fourth register 21. The fourth register 21 filters the signal to remove glitches in the signal, and then inputs the filtered signal into the third adder 12 and into the second exclusive-or gate 13 of the next-stage arithmetic circuit unit 1, so that the signal obtained by the fourth operation path 7 of the operation circuit unit 1 next to the current operation circuit unit 1 is glitch-free.
It can be seen that, in the above manner, glitches can be removed from the signal output from each arithmetic circuit unit 1 to the next-stage arithmetic circuit unit 1. Since the signal received by the next-stage operation circuit unit 1 is free of glitches, the timing frequency of the next-stage operation circuit unit 1 is reduced, the propagation of the timing frequency is reduced, and the dynamic power consumption of the circuit structure is reduced.
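The procedure above implies a dependency order among the four registers of one arithmetic circuit unit 1: the first register 18 must settle before the fourth register 21, which must settle before the third register 20, which must settle before the second register 19. The Python sketch below walks through one unit in that order. It is a hedged illustration rather than the applicant's implementation: the function names are hypothetical, the 32-bit word width and the modelling of the shifters as right rotations are assumptions, and `param` stands for the value produced by the third exclusive-or gate 17.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit words


def rotr32(x, n):
    """Right-rotate a 32-bit word (assumed behaviour of shifters 11 and 14)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def stage_with_registers(a, b, c, d, param, shift_d, shift_b):
    """Return the four register values of one unit and the order they settle in."""
    order = []
    r18 = (a + b + param) & MASK32   # adders 8 and 9 feed the first register 18
    order.append(18)
    r21 = rotr32(d ^ r18, shift_d)   # XOR gate 13 and shifter 14 feed the fourth register 21
    order.append(21)
    r20 = (c + r21) & MASK32         # adder 12 feeds the third register 20
    order.append(20)
    r19 = rotr32(b ^ r20, shift_b)   # XOR gate 10 and shifter 11 feed the second register 19
    order.append(19)
    return (r18, r19, r20, r21), order


# Upper-stage unit: second shifter 14 shifts by 16, first shifter 11 by 12.
outputs, order = stage_with_registers(1, 2, 3, 4, 0, 16, 12)
```

The tuple `outputs` lists the values presented to paths 4, 5, 6 and 7 of the next-stage unit; each is read only after the corresponding register has captured a settled value.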
Through the connection mode, the existing register applied to the circuit structure of the BLAKE algorithm is moved to the space between the adder and the exclusive OR gate; furthermore, the addition operation and the exclusive-or operation are isolated through the register, so that redundant hardware cost is not increased.
Fig. 9 is a clock sequence diagram provided in the embodiment of the present application, as shown in fig. 9, the operation process of two operation circuit units 1 is one algorithm period, and fig. 9 shows the algorithm period in the prior art and the timing sequence in the prior art; by adopting the circuit configuration shown in fig. 8, the register can be divided into 4 phases, namely phase1, phase2, phase3 and phase4, and fig. 9 shows the clock sequence diagrams of each of phase1, phase2, phase3 and phase 4; as can be seen from fig. 9, each phase occupies only one eighth of the algorithm period, and the algorithm period of the addition of the phases of the two arithmetic circuit units 1 is still the same as the algorithm period in the prior art; furthermore, the computing power of the circuit structure provided by the embodiment of the application is not changed in one algorithm period, and the circuit structure provided by the embodiment of the application can keep the performance of the BLAKE algorithm.
In the present embodiment, a circuit structure constituted by at least two stages of arithmetic circuit units 1 is provided, in which adjacent arithmetic circuit units 1 are connected, each arithmetic circuit unit 1 is connected to an output unit 2 for outputting a parameter to be calculated, and the arithmetic circuit unit 1 is the smallest unit of a circuit applied to the BLAKE algorithm; a register is provided between an adder and an exclusive-or gate in each arithmetic circuit unit 1 in the circuit structure. Thus, glitches can be removed from the signal output from each arithmetic circuit unit 1 to the next-stage arithmetic circuit unit 1; since the signal received by the next-stage arithmetic circuit unit 1 is free of glitches, the timing frequency of the next-stage arithmetic circuit unit 1 is reduced, the propagation of the timing frequency is reduced, and the dynamic power consumption of the circuit structure is reduced. In addition, the existing register of the circuit structure applied to the BLAKE algorithm is moved to the position between the adder and the exclusive-or gate, so that no redundant hardware cost is added; the computing capability of the circuit structure provided by the embodiment of the application is unchanged, and the circuit structure can maintain the performance of the BLAKE algorithm.
Fig. 10 is a schematic structural diagram of still another circuit structure provided in the embodiment of the present application, which is applied to implementation of the BLAKE algorithm, and is based on the circuit structure provided in the embodiment of fig. 1 to 7, as shown in fig. 10, in the circuit structure, the sequential logic element 3 is a first register 18, a second register 19, a third register 20, and a fourth register 21, respectively.
When the sequential logic element 3 is provided at the input of the adder in the circuit configuration, the input of the first arithmetic path 4 is connected to the input of the first register 18, and the output of the first register 18 is connected to the input of the first adder 8.
The input end of the second operation path 5 is connected to the input end of the second register 19, and the output end of the second register 19 is connected to the input end of the first adder 8 and the input end of the first exclusive-or gate 10, respectively.
An input of the third operation path 6 is connected to an input of the third register 20, and an output of the third register 20 is connected to an input of the third adder 12.
The output terminal of the second shifter 14 is connected to the input terminal of the fourth register 21, and the output terminal of the fourth register 21 is connected to the input terminal of the third adder 12 and the input terminal of the fourth operation path 7 of the next stage operation circuit unit 1, respectively.
Illustratively, sequential logic element 3 includes a first register 18, a second register 19, a third register 20, and a fourth register 21. One register may be provided on each operation path of each operation circuit unit 1.
When the sequential logic element 3 is provided at the input terminal of the adder of each arithmetic circuit unit 1 in the circuit configuration, a register is provided at the input terminal of the first adder of each arithmetic circuit unit 1.
For each arithmetic circuit unit 1, a first register 18 and a second register 19 are provided at the input of the first adder 8 on the first arithmetic path 4 of the arithmetic circuit unit 1. As shown in fig. 10, since the input terminal of the first adder 8 is connected to the input terminal of the first arithmetic path 4 and the input terminal of the second arithmetic path 5, respectively, the input terminal of the first arithmetic path 4 needs to be connected to the input terminal of the first register 18, and the output terminal of the first register 18 is connected to the input terminal of the first adder 8; the input terminal of the second operation path 5 is connected to the input terminal of the second register 19, and the output terminal of the second register 19 is connected to the input terminal of the first adder 8 and the input terminal of the first exclusive-or gate 10, respectively. Thus, the first register 18 is provided on the first operation path 4, the second register 19 is provided on the second operation path 5, and the first register 18 and the second register 19 are each connected to the input terminal of the first adder 8.
A third register 20 and a fourth register 21 are provided at the inputs of the third adder 12 on the third operation path 6 of the operation circuit unit 1. As shown in fig. 10, since the input terminal of the third adder 12 is connected to the input terminal of the third operation path 6 and the output terminal of the second shifter 14 on the fourth operation path 7, respectively, it is necessary to connect the input terminal of the third operation path 6 to the input terminal of the third register 20 and connect the output terminal of the third register 20 to the input terminal of the third adder 12; the output of the second shifter 14 is connected to the input of the fourth register 21, and the output of the fourth register 21 is connected to the input of the third adder 12 and the input of the fourth operation path 7 of the next-stage operation circuit unit 1, respectively. Thus, the third register 20 is provided on the third operation path 6, the fourth register 21 is provided on the fourth operation path 7, and the third register 20 and the fourth register 21 are each connected to the input terminal of the third adder 12.
By the above connection, the following procedure is performed for each stage of the arithmetic circuit unit 1.
The first register 18 receives signals through the input end of the first operation path 4, and the first register 18 filters the received signals to remove burrs in the signals; then, the first register 18 inputs the filtered signal into the first adder 8; the second register 19 receives signals through the input end of the second operation path 5, and the second register 19 filters the received signals to remove burrs in the signals; then, the second register 19 inputs the filtered signal into the first adder 8; thus, the first adder 8 receives two signals, and the first adder 8 performs addition operation on the two signals; then, the first adder 8 inputs the signal obtained by the addition to the second adder 9; the third exclusive-or gate 17 receives the signal output by the first parameter output device 15 and the signal output by the second parameter output device 16, and after the third exclusive-or gate 17 performs an exclusive-or operation on the received signal, the signal after the exclusive-or operation is input to the second adder 9; thus, the second adder 9 obtains two signals. The second adder 9 adds the two signals received, and then inputs the signal obtained by the addition to the second exclusive or gate 13 and the first register 18 of the next stage arithmetic circuit unit 1. So that the first and second arithmetic paths 4, 5 of the current arithmetic circuit unit 1 can first remove burrs from the received signal, the signals processed by the other elements in the first and second arithmetic paths 4, 5 of the current arithmetic circuit unit 1 are burr-free.
The first exclusive-or gate 10 receives the signal output from the first register 18; the first exclusive-or gate 10 performs exclusive-or operation on the received signal, and then inputs the signal obtained by the exclusive-or operation into the first shifter 11; the first shifter 11 performs displacement processing on the signals to obtain signals after the displacement processing; the first shifter 11 inputs the signal after the shift processing to the first adder 8 of the next-stage arithmetic circuit unit 1 and the second register 19 of the next-stage arithmetic circuit unit 1; so that the signal obtained by the second operation path 5 of the current operation circuit unit 1 is burr-free.
The third adder 12 receives the signal output from the third register 20; the third adder 12 adds the received signals, and then inputs the added signals to the first exclusive-or gate 10 and the third register 20 of the next-stage arithmetic circuit unit 1. The second exclusive-or gate 13 receives the signal through the input terminal of the fourth operation path 7 and receives the signal output from the second adder 9; the second exclusive-or gate 13 performs exclusive-or operation on the received two paths of signals, and then inputs the signals obtained by the exclusive-or operation into the second shifter 14; the second shifter 14 carries out displacement processing on the signals to obtain signals after displacement processing; the second shifter 14 inputs the shift-processed signal to the fourth register 21; the fourth register 21 filters the signal to remove burrs in the signal; then, the fourth register 21 inputs the filtered signal into the third adder 12 and the second exclusive-or gate 13 of the next-stage arithmetic circuit unit 1. So that the signal obtained by the third adder 12 on the third path unit of the present arithmetic circuit unit 1 is free from burrs, and the signal obtained by the fourth arithmetic path 7 of the next arithmetic circuit unit 1 of the present arithmetic circuit unit 1 is free from burrs; due to the connection mode of the multi-stage operation circuit unit 1 adopted in the present embodiment, it is further known that the signal obtained by the fourth operation path 7 of the current operation circuit unit 1 is also free of burrs.
It follows that, in the above manner, glitches can be removed from the signal obtained by each operation circuit unit 1, and the timing frequency of the adder can be reduced. Consequently, the timing frequency of the next-stage operation circuit unit 1 is reduced, the propagation of the timing frequency is limited, and the dynamic power consumption of the circuit structure is reduced.
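The link between signal transitions and dynamic power can be made explicit with the usual first-order CMOS switching-power model. This formula does not appear in the application; the symbols $\alpha$ (average switching activity per clock cycle), $C$ (switched capacitance), $V_{DD}$ (supply voltage) and $f$ (clock frequency) are introduced here purely as illustrative assumptions.

\[
P_{\mathrm{dyn}} \approx \alpha \, C \, V_{DD}^{2} \, f
\]

Under this model, and assuming the "timing frequency" above corresponds to the switching activity $\alpha$, spurious transitions at an adder input increase $\alpha$ for the adder and for everything downstream of it. Removing them with a register lowers $\alpha$, and hence $P_{\mathrm{dyn}}$, while $C$, $V_{DD}$ and $f$ stay the same.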
With this connection mode, the registers already present in a circuit structure for the BLAKE algorithm are moved to the input ends of the adders, so no additional hardware cost is incurred.
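For orientation, the arithmetic performed by one operation circuit unit, as wired in claim 1, can be sketched in Python. This is a minimal combinational model, not the patented circuit: it leaves out the registers (which are assumed to affect timing only, not the computed values), it assumes 32-bit words, and it assumes the shifters act as right rotations, as in the published BLAKE-256 round function. The names `rotr32`, `operation_unit`, `two_stage` and `param` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit words


def rotr32(x, n):
    """Rotate a 32-bit word right by n bits (assumed behaviour of the shifters)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def operation_unit(a, b, c, d, param, shift_first, shift_second):
    """Model one operation circuit unit; a..d are the inputs of paths 1..4.

    param stands for the value delivered by the output unit.
    """
    a = (a + b) & MASK32               # first adder: inputs of paths 1 and 2
    a = (a + param) & MASK32           # second adder: adds the output-unit value
    d = rotr32(d ^ a, shift_second)    # second XOR gate, then second shifter
    c = (c + d) & MASK32               # third adder: path 3 plus shifted path 4
    b = rotr32(b ^ c, shift_first)     # first XOR gate, then first shifter
    return a, b, c, d


def two_stage(a, b, c, d, param0, param1):
    """Chain two adjacent units with the shift amounts given in claim 1."""
    a, b, c, d = operation_unit(a, b, c, d, param0, 12, 16)  # upper stage
    return operation_unit(a, b, c, d, param1, 7, 8)          # next stage
```

Calling `two_stage(a, b, c, d, p0, p1)` returns the four path outputs after both stages. A registered variant would produce the same values, only spread over more clock phases.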
Fig. 11 is a clock sequence diagram provided in an embodiment of the present application. As shown in fig. 11, the operation process of two operation circuit units 1 constitutes one algorithm period; fig. 11 shows the algorithm period and the timing sequence of the prior art. With the circuit structure shown in fig. 10, the registers can be divided into two phases, phase1 and phase2, and fig. 11 shows the clock sequence of each phase. As can be seen from fig. 11, each phase occupies only one quarter of the algorithm period, and the phases of the two operation circuit units 1 together add up to the same algorithm period as in the prior art. Therefore, the computing power of the circuit structure provided by the embodiment of the application is unchanged within one algorithm period, and the circuit structure maintains the performance of the BLAKE algorithm.
In the present embodiment, a circuit structure constituted by at least two stages of operation circuit units 1 is provided. Adjacent operation circuit units 1 are connected, each operation circuit unit 1 is connected to an output unit 2 for outputting a parameter to be calculated, and the operation circuit unit 1 is the minimum unit of a circuit applied to the BLAKE algorithm. Registers are provided at the input terminals of the adders of the respective operation circuit units 1 in the circuit structure. Thus, glitches can be removed from the signal obtained by each operation circuit unit 1; in turn, the timing frequency of the adder is reduced, the timing frequency of each operation circuit unit 1 is reduced, the propagation of the timing frequency is limited, and the dynamic power consumption of the circuit structure is reduced. In addition, the registers already present in a circuit structure for the BLAKE algorithm are moved to the input ends of the adders, so no additional hardware cost is incurred. The computing power of the circuit structure provided by the embodiment of the application is unchanged, and the circuit structure maintains the performance of the BLAKE algorithm.
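The glitch-filtering effect of a register can be illustrated with a toy Python model. This is a hedged sketch under simple assumptions: each clock cycle is represented by a list of intra-cycle samples of one bit, the last sample is taken to be the settled value, and an edge-triggered register captures only that settled value. The waveform values and the names `count_transitions` and `register_output` are hypothetical and are not taken from the application.

```python
def count_transitions(samples):
    """Count value changes in a sequence of signal samples."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def register_output(cycles, initial=0):
    """Model an edge-triggered register: keep only the settled value of each cycle."""
    return [initial] + [cycle[-1] for cycle in cycles]


# Hypothetical intra-cycle samples of one bit at an adder input.
cycles = [
    [0, 1, 0, 1],  # glitches, then settles at 1
    [1, 0, 1, 1],  # glitches, then settles at 1
    [1, 0, 1, 0],  # glitches, then settles at 0
]

raw_samples = [s for cycle in cycles for s in cycle]
raw_toggles = count_transitions(raw_samples)                     # seen without a register
registered_toggles = count_transitions(register_output(cycles))  # seen behind a register

print(raw_toggles, registered_toggles)
```

In this toy waveform the unregistered signal toggles far more often than the registered one, which is the effect the embodiment relies on to reduce switching in the adder and in the stages that follow it.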
Fig. 12 is a schematic structural diagram of a circuit board according to an embodiment of the present application. As shown in fig. 12, the circuit board includes a circuit board body 22, and the circuit structure provided in any of the embodiments above is arranged on the circuit board body 22.
For example, the circuit board consists of one circuit board body 22.
The shape of the circuit board body 22 may be rectangular, square, trapezoid, other regular shape, or other irregular shape; the shape of the circuit board body 22 is not limited in this application.
The material of the circuit board body 22 is not limited in this application.
The circuit board body 22 may be a single-sided board, or a double-sided board, or a multi-layer board, without limitation.
The circuit structure provided in any of the embodiments above is provided at an arbitrary position of the circuit board body 22.
For the structure and principle of the circuit structure, reference may be made to the above embodiments; they are not described again here.
In this embodiment, a circuit structure is arranged on a circuit board. The circuit structure is formed by at least two stages of operation circuit units; adjacent operation circuit units are connected, each operation circuit unit is connected with an output unit for outputting parameters to be calculated, and the operation circuit unit is the minimum unit of a circuit applied to the BLAKE algorithm. A sequential logic element is arranged between the adder and the exclusive-OR gate of each operation circuit unit in the circuit structure, and/or a sequential logic element is arranged at the input end of the adder in the circuit structure. The BLAKE algorithm can thus be implemented by the circuit structure. Moreover, the sequential logic element isolates the addition operation from the exclusive-or operation, and/or removes glitches from the signal input into the adder. Therefore, the timing frequency in the circuit structure can be reduced, its propagation can be prevented, and the dynamic power consumption of the whole circuit structure is reduced.
Fig. 13 is a schematic structural diagram of a super computing device according to an embodiment of the present application, and as shown in fig. 13, the super computing device includes at least one circuit board 161 provided in the foregoing embodiment.
For example, one or more circuit boards 161 are provided in the super computing device, and each circuit board 161 is a circuit board as provided in the above embodiments. For the structure and function of the circuit board 161, reference may be made to the description of the above embodiment; they are not repeated here.
In this embodiment, a plurality of circuit boards 161 may be connected in parallel, and the parallel-connected circuit boards 161 may then be arranged in the super computing device. In one implementation, the super computing device may be a super computing server.
The circuit board 161 may be connected to the super computing device by a fixed connection or a sliding connection. For example, one or more slide rails may be provided on the chassis of the super computing device, and the circuit board 161 is placed in the slide rails so that it can slide along them.
Where multiple circuit boards 161 are provided in the super computing device, the circuit boards 161 may have the same structure or different structures. For example, S circuit boards 161 are provided in the super computing device, where S is a positive integer greater than or equal to 2; the circuit structure shown in fig. 8 is provided on some of the S circuit boards 161, and the circuit structure shown in fig. 10 is provided on the remaining circuit boards 161.
In this embodiment, a circuit board is provided in the super computing device, and a circuit structure formed by at least two stages of operation circuit units is provided on the circuit board. Adjacent operation circuit units are connected, each operation circuit unit is connected with an output unit for outputting a parameter to be calculated, and the operation circuit unit is the minimum unit of a circuit applied to the BLAKE algorithm. A sequential logic element is arranged between the adder and the exclusive-OR gate of each operation circuit unit in the circuit structure, and/or a sequential logic element is arranged at the input end of the adder in the circuit structure. The BLAKE algorithm can thus be implemented by the circuit structure. Moreover, the sequential logic element isolates the addition operation from the exclusive-or operation, and/or removes glitches from the signal input into the adder. Therefore, the timing frequency in the circuit structure can be reduced, its propagation can be prevented, and the dynamic power consumption of the whole circuit structure is reduced.
Although the terms "first," "second," etc. may be used in this application to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without changing the meaning of the description, so long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently. The first element and the second element are both elements, but may not be the same element.
The words used in this application are merely for describing embodiments and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, when used in this application, the terms "comprises," "comprising," and/or "includes," and variations thereof, mean that the stated features, integers, steps, operations, elements, and/or components are present, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The above technical description may refer to the accompanying drawings, which form a part of the present application and which show, by way of illustration, implementations in accordance with the described embodiments. While these embodiments are described in sufficient detail to enable those skilled in the art to practice them, they are non-limiting; other embodiments may be used, and changes may be made without departing from the scope of the described embodiments. For example, the order of operations described in the flowcharts is non-limiting, and thus the order of two or more operations illustrated in and described with reference to the flowcharts may be changed in accordance with several embodiments. As another example, in several embodiments, one or more operations illustrated in and described with reference to the flowcharts are optional or may be deleted. In addition, certain steps or functions may be added to the disclosed embodiments, or two or more of the steps may be replaced. All such variations are considered to be encompassed by the disclosed embodiments and the claims.
Additionally, terminology is used in the above technical description to provide a thorough understanding of the described embodiments. However, overly specific details are not required to implement the described embodiments. Accordingly, the foregoing description of the embodiments has been presented for purposes of illustration and description. The embodiments presented in the foregoing description, and the examples disclosed in accordance with these embodiments, are provided solely to add context and aid in the understanding of the described embodiments. The foregoing description is not intended to be exhaustive or to limit the described embodiments to the precise form disclosed. Several modifications, alternative adaptations and variations are possible in light of the above teachings. In some instances, well-known process steps have not been described in detail in order to avoid unnecessarily obscuring the described embodiments.
The principles and embodiments of the present application are described herein with reference to specific examples, and the description of these examples is only intended to aid in understanding the methods of the present application and their core ideas. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application; in view of the above, the content of this description should not be construed as limiting the present application.
The foregoing is merely specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any changes or substitutions that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application are intended to be covered by the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (8)

1. A circuit structure for use in the implementation of the BLAKE algorithm, comprising:
at least two stages of operation circuit units, wherein adjacent operation circuit units are connected, each operation circuit unit is connected with an output unit for outputting parameters to be calculated, and the operation circuit unit is the smallest unit applied to a circuit of the BLAKE algorithm;
a sequential logic element is arranged between the adder and the exclusive-OR gate of each operation circuit unit on the circuit structure, and/or the input end of the adder on the circuit structure is provided with a sequential logic element;
the operation circuit unit comprises a first operation path, a second operation path, a third operation path and a fourth operation path;
the first operation path is provided with a first adder and a second adder, the second operation path is provided with a first exclusive-OR gate and a first shifter, the third operation path is provided with a third adder, and the fourth operation path is provided with a second exclusive-OR gate and a second shifter;
the input end of the first operation path and the input end of the second operation path are respectively connected with the input end of the first adder, the output end of the first adder is connected with the input end of the second adder, the input end of the second adder is connected with the output end of the output unit, and the output end of the second adder is connected with the input end of the second exclusive-OR gate on the fourth operation path;
the input end of the second operation path is connected with the input end of the first exclusive-OR gate, and the output end of the first exclusive-OR gate is connected with the input end of the first shifter;
the input end of the third operation path is connected with the input end of the third adder, and the output end of the third adder is connected with the input end of the first exclusive-OR gate;
the input end of the fourth operation path is connected with the input end of the second exclusive-OR gate, the output end of the second exclusive-OR gate is connected with the input end of the second shifter, and the output end of the second shifter is connected with the input end of the third adder;
the sequential logic element is any one or more of the following: a flip-flop, a counter and a register;
the first shifter in the upper-stage operation circuit unit of two adjacent stages of operation circuit units is a shifter shifting 12 bits to the right, and the second shifter in the upper-stage operation circuit unit is a shifter shifting 16 bits to the right;
the first shifter in the next-stage operation circuit unit of the two adjacent stages of operation circuit units is a shifter shifting 7 bits to the right, and the second shifter in the next-stage operation circuit unit is a shifter shifting 8 bits to the right.
2. The circuit structure of claim 1, wherein the sequential logic elements are a first register, a second register, a third register, and a fourth register, respectively.
3. The circuit structure according to claim 2, wherein when a sequential logic element is provided between the adder and the exclusive-OR gate of each of the operation circuit units on the circuit structure, the output end of the second adder is connected with the input end of the first register, and the output end of the first register is respectively connected with the input end of the second exclusive-OR gate and the input end of a first operation path of the next-stage operation circuit unit;
the output end of the first shifter is connected with the input end of the second register, and the output end of the second register is connected with the input end of a second operation path of the next stage operation circuit unit;
the output end of the third adder is connected with the input end of the third register, and the output end of the third register is respectively connected with the input end of the first exclusive-OR gate and the input end of a third operation path of the next-stage operation circuit unit;
the output end of the second shifter is connected with the input end of the fourth register, and the output end of the fourth register is respectively connected with the input end of the third adder and the input end of a fourth operation path of the next-stage operation circuit unit.
4. The circuit structure according to claim 2, wherein when a sequential logic element is provided at the input end of the adder on the circuit structure, the input end of the first operation path is connected with the input end of the first register, and the output end of the first register is connected with the input end of the first adder;
the input end of the second operation path is connected with the input end of the second register, and the output end of the second register is respectively connected with the input end of the first adder and the input end of the first exclusive-OR gate;
the input end of the third operation path is connected with the input end of the third register, and the output end of the third register is connected with the input end of the third adder;
the output end of the second shifter is connected with the input end of the fourth register, and the output end of the fourth register is respectively connected with the input end of the third adder and the input end of a fourth operation path of the next-stage operation circuit unit.
5. The circuit structure according to any one of claims 1 to 4, wherein the output end of the first operation path of the current operation circuit unit is connected with the input end of the first operation path of the next-stage operation circuit unit; the output end of the second operation path of the current operation circuit unit is connected with the input end of the second operation path of the next-stage operation circuit unit; the output end of the third operation path of the current operation circuit unit is connected with the input end of the third operation path of the next-stage operation circuit unit; and the output end of the fourth operation path of the current operation circuit unit is connected with the input end of the fourth operation path of the next-stage operation circuit unit.
6. The circuit structure according to any one of claims 1 to 4, wherein the output unit includes a first parameter output device, a second parameter output device, and a third exclusive-or gate;
the output end of the first parameter output device and the output end of the second parameter output device are respectively connected with the input end of the third exclusive-OR gate, and the output end of the third exclusive-OR gate is connected with the input end of the second adder.
7. A circuit board, characterized in that the circuit board is provided with a circuit structure as claimed in any one of claims 1-6.
8. A super computing device comprising at least one circuit board as claimed in claim 7.
CN201811556850.4A 2018-12-19 2018-12-19 Circuit structure, circuit board and super computing device Active CN109474268B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811556850.4A CN109474268B (en) 2018-12-19 2018-12-19 Circuit structure, circuit board and super computing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811556850.4A CN109474268B (en) 2018-12-19 2018-12-19 Circuit structure, circuit board and super computing device

Publications (2)

Publication Number Publication Date
CN109474268A CN109474268A (en) 2019-03-15
CN109474268B true CN109474268B (en) 2024-02-06

Family

ID=65675317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811556850.4A Active CN109474268B (en) 2018-12-19 2018-12-19 Circuit structure, circuit board and super computing device

Country Status (1)

Country Link
CN (1) CN109474268B (en)

Also Published As

Publication number Publication date
CN109474268A (en) 2019-03-15


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant