CN110515589A - Multiplier, data processing method, chip and electronic equipment - Google Patents
Multiplier, data processing method, chip and electronic equipment
- Publication number: CN110515589A (CN201910819020.4A)
- Authority: CN (China)
- Prior art keywords: circuit, data, sub, multiplier, obtains
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F7/4824—Methods or arrangements for performing computations using exclusively denominational number representation, using non-contact-making devices or unspecified devices, using signed-digit representation
- G06F7/5318—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel, with column wise addition of partial products, e.g. using Wallace tree, Dadda counters
- G06F7/5332—Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even, by skipping over strings of zeroes or ones, e.g. using the Booth Algorithm
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
Abstract
The application provides a multiplier, a data processing method, a chip and an electronic device. The multiplier includes a multiplication operation circuit, a register control circuit, a register circuit, a state control circuit and a selection circuit. The multiplication operation circuit includes a canonical signed number encoding sub-circuit and an accumulation sub-circuit. The output terminal of the canonical signed number encoding sub-circuit is connected to the input terminal of the accumulation sub-circuit; the output terminal of the accumulation sub-circuit is connected to the first input terminal of the register control circuit; the output terminal of the register control circuit is connected to the input terminal of the register circuit; the output terminal of the register circuit is connected to the first input terminal of the selection circuit; the first output terminal of the state control circuit is connected to the second input terminal of the register control circuit; and the second output terminal of the state control circuit is connected to the second input terminal of the selection circuit. The multiplier can apply canonical signed number encoding to the data it receives, which yields fewer effective partial products and thereby reduces the complexity of implementing the multiplication.
Description
Technical field
This application relates to the field of computer technology, and in particular to a multiplier, a data processing method, a chip and an electronic device.
Background
With the continuous development of digital electronic technology, the rapid development of all kinds of artificial intelligence (AI) chips places ever higher demands on high-performance digital multipliers. Neural network algorithms are among the algorithms most widely used by intelligent chips, and multiplication performed by a multiplier is a common operation in them.
At present, a multiplier encodes every three bits of the multiplier operand as one group, obtains partial products according to the multiplicand, and compresses all the partial products with a Wallace tree to obtain the target operation result. In this conventional technique, however, the encoding contains many non-zero digits, so many corresponding partial products are generated, which makes the multiplication more complex for the multiplier to implement.
Summary of the invention
Based on this, it is necessary, in view of the above technical problem, to provide a multiplier, a data processing method, a chip and an electronic device that reduce the number of effective partial products obtained during multiplication and thereby reduce the complexity of the multiplication.
An embodiment of this application provides a multiplier, including a multiplication operation circuit, a register control circuit, a register circuit, a state control circuit and a selection circuit. The multiplication operation circuit includes a canonical signed number encoding sub-circuit and an accumulation sub-circuit. The output terminal of the canonical signed number encoding sub-circuit is connected to the input terminal of the accumulation sub-circuit; the output terminal of the accumulation sub-circuit is connected to the first input terminal of the register control circuit; the output terminal of the register control circuit is connected to the input terminal of the register circuit; the output terminal of the register circuit is connected to the first input terminal of the selection circuit; the first output terminal of the state control circuit is connected to the second input terminal of the register control circuit; and the second output terminal of the state control circuit is connected to the second input terminal of the selection circuit.
In one of the embodiments, the canonical signed number encoding sub-circuit includes a canonical signed number encoding unit and a partial product acquiring unit. The canonical signed number encoding unit is used to receive first data and perform canonical signed number encoding on the first data to obtain a target code. The partial product acquiring unit is used to receive second data, obtain initial partial products according to the target code and the second data, and obtain the partial products of the target code according to the initial partial products. The accumulation sub-circuit is used to accumulate the partial products of the target code to obtain a multiplication result. The state control circuit is used to obtain a storage indication signal and a read indication signal. The register control circuit is used to determine, according to the storage indication signal input by the state control circuit, the register circuit in which the multiplication result is stored. The register circuit is used to store the multiplication result. The selection circuit is used to read, according to the received read indication signal, data from the multiplication result stored in the register circuit as the target operation result.
In one of the embodiments, the canonical signed number encoding unit may include a data input port and a target code output port. The data input port is used to receive the first data on which canonical signed number encoding is to be performed, and the target code output port is used to output the target code obtained by performing canonical signed number encoding on the first data.
In one of the embodiments, the partial product acquiring unit is specifically used to convert the target code into initial partial products, perform sign-bit extension on the initial partial products to obtain sign-extended partial products, and obtain the partial products of the target code from the sign-extended partial products.
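The conversion and sign-bit extension steps can be illustrated with a short Python sketch. It is only a software model under stated assumptions, not the circuit of this application: the function names, the (width + 1)-bit size of the initial partial product, and the 2 x width bit size of the extended partial product are all illustrative choices, and the target code is passed in as a hand-written list of digits (-1, 0 or 1, least significant first).

```python
def sign_extend(pattern, from_bits, to_bits):
    """Sign-extend a from_bits two's-complement bit pattern to to_bits."""
    if (pattern >> (from_bits - 1)) & 1:
        # Fill every bit above from_bits with the sign bit.
        pattern |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return pattern & ((1 << to_bits) - 1)


def partial_products(digits, multiplicand, width):
    """Return one sign-extended, shifted partial product per target-code digit."""
    out_bits = 2 * width                  # assumed width of the full product
    out_mask = (1 << out_bits) - 1
    init_bits = width + 1                 # assumed: one extra bit so that -X always fits
    products = []
    for position, digit in enumerate(digits):
        # Initial partial product: -X, 0 or +X, selected by the digit.
        initial = (digit * multiplicand) & ((1 << init_bits) - 1)
        extended = sign_extend(initial, init_bits, out_bits)
        # Weight the partial product by the position of its digit.
        products.append((extended << position) & out_mask)
    return products


# 7 = 8 - 1, so its signed-digit code (least significant first) has two non-zero digits.
digits_for_7 = [-1, 0, 0, 1, 0, 0, 0, 0, 0]
rows = partial_products(digits_for_7, -5, 8)
product_bits = sum(rows) & 0xFFFF         # two's-complement pattern of 7 * (-5)
```

Only the two non-zero digits contribute non-zero rows here, which is the reduction in effective partial products that the text describes.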
In one of the embodiments, the partial product acquiring unit includes a target code input port, a second data input port and a partial product output port. The target code input port is used to receive the target code, the second data input port is used to receive the second data, and the partial product output port is used to output the partial products of the target code.
In one of the embodiments, the accumulation sub-circuit includes a Wallace tree group unit and an accumulation unit, where the output terminal of the Wallace tree group unit is connected to the input terminal of the accumulation unit. The Wallace tree group unit is used to accumulate the partial products of the target code to obtain an accumulation operation result, and the accumulation unit is used to accumulate the accumulation operation result.
In one of the embodiments, the Wallace tree group unit includes Wallace tree sub-units, and the Wallace tree sub-units are used to accumulate the values of each column in the partial products of all the target codes.
In one of the embodiments, the accumulation unit includes an adder, and the adder is used to perform an addition operation on the received accumulation correction result.
In one of the embodiments, the adder includes a carry signal input port, a sum-bit signal input port and a result output port. The carry signal input port is used to receive the carry signal, the sum-bit signal input port is used to receive the sum-bit signal, and the result output port is used to output the result of accumulating the carry signal and the sum-bit signal.
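As a rough illustration of how a Wallace tree stage followed by a final adder behaves, the sketch below reduces a list of partial-product rows with 3:2 carry-save compressors until two rows remain, then adds them. It is a generic carry-save model written for this explanation; the function names and the row grouping are assumptions, and the actual tree layout of this application (see Fig. 4) may differ.

```python
def compress_3_to_2(a, b, c, mask):
    """Full-adder column compression: three rows in, a sum row and a carry row out."""
    sum_bits = a ^ b ^ c
    carry_bits = ((a & b) | (a & c) | (b & c)) << 1   # carries move one column left
    return sum_bits & mask, carry_bits & mask


def wallace_reduce(rows, bits):
    """Reduce any number of rows to two rows whose sum is unchanged modulo 2**bits."""
    mask = (1 << bits) - 1
    rows = [r & mask for r in rows]
    while len(rows) > 2:
        next_rows = []
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(compress_3_to_2(rows[i], rows[i + 1], rows[i + 2], mask))
        # Rows that did not fill a group of three pass through to the next level.
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows
    rows += [0] * (2 - len(rows))
    return rows[0], rows[1]


def final_add(sum_signal, carry_signal, bits):
    """Model of the adder that combines the sum-bit and carry signals."""
    return (sum_signal + carry_signal) & ((1 << bits) - 1)


partial_rows = [5, 0, 0, (-40) & 0xFFFF]              # rows for 7 * (-5), 16-bit
row_a, row_b = wallace_reduce(partial_rows, 16)
result = final_add(row_a, row_b, 16)
```

The point of the carry-save form is that no carry propagates across columns until the single final addition.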
In one of the embodiments, the register circuit includes register sub-circuits, and the register sub-circuits are used to store the multiplication results corresponding to different storage indication signals.
An embodiment of this application provides a multiplier that includes a multiplication operation circuit and a conversion circuit. The multiplication operation circuit includes a canonical signed number encoding sub-circuit and an accumulation sub-circuit. The output terminal of the canonical signed number encoding sub-circuit is connected to the input terminal of the accumulation sub-circuit, and the output terminal of the accumulation sub-circuit is connected to the input terminal of the conversion circuit. The conversion circuit includes a first conversion sub-circuit and a second conversion sub-circuit;
The canonical signed number encoding sub-circuit is used to perform canonical signed number encoding on the received data to obtain a target code and to obtain the partial products of the target code according to the target code. The accumulation sub-circuit is used to perform corrected accumulation on the partial products of the target code to obtain a multiplication result. The first conversion sub-circuit and the second conversion sub-circuit are each used to convert the multiplication result to obtain a target operation result.
In one of the embodiments, the conversion circuit includes an input port for receiving a data conversion signal; the data conversion signal is used to determine the type of data conversion that the conversion circuit performs.
In one of the embodiments, the first conversion sub-circuit is specifically used to convert the multiplication result into a target operation result of floating-point type, and the second conversion sub-circuit is specifically used to convert the multiplication result into a target operation result of fixed-point type.
In the multiplier provided by this embodiment, the canonical signed number encoding sub-circuit performs canonical signed number encoding on the received data, which yields fewer effective partial products and thereby reduces the complexity of implementing the multiplication.
An embodiment of this application provides a data processing method, the method comprising:
receiving data to be processed;
performing canonical signed number encoding on the data to be processed to obtain the partial products of a target code;
accumulating the partial products of the target code to obtain a multiplication result;
obtaining a storage indication signal and a read indication signal;
storing multiple multiplication results in different register sub-circuits according to the storage indication signal;
reading, according to the read indication signal, part of the data in the corresponding multiplication result stored in the different register sub-circuits to obtain a target operation result.
In one of the embodiments, performing canonical signed number encoding on the data to be processed to obtain the partial products of the target code comprises:
performing canonical signed number encoding on the data to be processed to obtain initial partial products;
performing sign-bit extension on the initial partial products to obtain the partial products of the target code.
In one of the embodiments, performing canonical signed number encoding on the data to be processed to obtain the initial partial products comprises:
performing canonical signed number encoding on the data to be processed to obtain a target code;
performing conversion according to the data to be processed and the target code to obtain the initial partial products.
In one of the embodiments, performing sign-bit extension on the initial partial products to obtain the partial products of the target code comprises: performing bit padding on the initial partial products to obtain the partial products of the target code.
In one of the embodiments, storing multiple multiplication results in different register sub-circuits according to the storage indication signal comprises:
storing the first multiplication result, corresponding to the first storage indication signal, in the first register sub-circuit;
storing the second multiplication result, corresponding to the second storage indication signal, in the second register sub-circuit.
In one of the embodiments, reading, according to the read indication signal, part of the data in the corresponding multiplication result stored in the different register sub-circuits to obtain the target operation result comprises:
reading, according to the first read indication signal, the first part of the data in the first multiplication result stored in the first register sub-circuit to obtain a first operation result;
reading, according to the second read indication signal, the second part of the data in the first multiplication result stored in the first register sub-circuit to obtain a second operation result;
reading, according to the third read indication signal, the first part of the data in the second multiplication result stored in the second register sub-circuit to obtain a third operation result;
reading, according to the fourth read indication signal, the second part of the data in the second multiplication result stored in the second register sub-circuit to obtain a fourth operation result.
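The storage and read steps above can be modelled in a few lines of Python. This is a behavioural sketch only: the class name, the numbering of signals from 1, and in particular the choice that the "first part" is the low half of a result and the "second part" is the high half are assumptions made for illustration, since the text here does not fix which bits each part contains.

```python
class ResultRegisters:
    """Toy model of register sub-circuits addressed by indication signals."""

    def __init__(self, result_bits):
        self.result_bits = result_bits
        self.half_bits = result_bits // 2
        self.sub_registers = {}

    def store(self, storage_signal, result):
        # Storage indication signal k selects the k-th register sub-circuit.
        self.sub_registers[storage_signal] = result & ((1 << self.result_bits) - 1)

    def read(self, read_signal):
        # Read signals 1 and 2 address sub-circuit 1; signals 3 and 4 address sub-circuit 2.
        register_index = (read_signal + 1) // 2
        part = (read_signal - 1) % 2          # 0: first part (assumed low half), 1: second part
        value = self.sub_registers[register_index]
        return (value >> (part * self.half_bits)) & ((1 << self.half_bits) - 1)


registers = ResultRegisters(result_bits=16)
registers.store(1, 0xABCD)                    # first multiplication result
registers.store(2, 0x1234)                    # second multiplication result
first, second = registers.read(1), registers.read(2)
spliced = (second << registers.half_bits) | first   # both parts joined back together
```

Splitting each result into two reads is what allows an output port narrower than the full product width.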
An embodiment of this application provides a data processing method, the method comprising:
receiving a data conversion signal and data to be processed;
performing canonical signed number encoding on the data to be processed to obtain the partial products of a target code;
accumulating the partial products of the target code to obtain a multiplication result;
converting the multiplication result according to the data conversion signal to obtain a target operation result, where the data conversion signal indicates the data type that the multiplier is required to produce for the target operation result.
With the data processing method provided by this embodiment, canonical signed number encoding can be performed on the received data to be processed, which reduces the number of effective partial products in the multiplication and thereby reduces the complexity of the multiplication.
An embodiment of this application provides a machine learning arithmetic device that includes one or more of the multipliers. The machine learning arithmetic device is used to obtain data to be operated on and control information from other processing devices, execute the specified machine learning operation, and pass the execution result to the other processing devices through an I/O interface;
when the machine learning arithmetic device includes multiple multipliers, the multipliers can be connected to one another through a preset specific structure and transmit data;
the multiple multipliers are interconnected and transmit data through a PCIe bus to support larger-scale machine learning operations; the multiple multipliers share one control system or each have their own control system; the multiple multipliers share memory or each have their own memory; and the multiple multipliers can be interconnected in any interconnection topology.
An embodiment of this application provides a combined processing device that includes the machine learning arithmetic device described above, a universal interconnect interface and other processing devices. The machine learning arithmetic device interacts with the other processing devices to jointly complete the operation specified by the user. The combined processing device may also include a storage device, which is connected to the machine learning arithmetic device and to the other processing devices respectively and is used to save the data of the machine learning arithmetic device and the other processing devices.
An embodiment of this application provides a neural network chip that includes the multiplier described above, the machine learning arithmetic device described above, or the combined processing device described above.
An embodiment of this application provides a neural network chip package structure that includes the neural network chip described above.
An embodiment of this application provides a board card that includes the neural network chip package structure described above.
An embodiment of this application provides an electronic device that includes the neural network chip described above or the board card described above.
An embodiment of this application provides a chip that includes at least one multiplier as described in any of the above.
An embodiment of this application provides electronic equipment that includes the chip described above.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the multiplier provided by one embodiment;
Fig. 2 is a structural schematic diagram of the multiplier provided by another embodiment;
Fig. 3 is a schematic diagram of the distribution pattern of the partial products of 9 target codes, provided by another embodiment;
Fig. 4 is a specific circuit diagram of the accumulation circuit for an 8-bit data operation, provided by another embodiment;
Fig. 5 is a flow diagram of a data processing method provided by one embodiment;
Fig. 6 is a flow diagram of another data processing method provided by another embodiment;
Fig. 7 is a structure diagram of a combined processing device provided by one embodiment;
Fig. 8 is a structure diagram of another combined processing device provided by one embodiment;
Fig. 9 is a structural schematic diagram of a board card provided by one embodiment.
Detailed description
To make the objects, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and are not intended to limit it.
The multiplier provided by this application can be applied to an AI chip, a field-programmable gate array (FPGA) chip, or other hardware circuit devices to perform operation processing. Its specific structure is shown in Fig. 1 and Fig. 2.
Fig. 1 shows a structural schematic diagram of the multiplier provided by one embodiment. The multiplier includes a multiplication operation circuit 11, a register control circuit 12, a register circuit 13, a state control circuit 14 and a selection circuit 15. The multiplication operation circuit 11 includes a canonical signed number encoding sub-circuit 111 and an accumulation sub-circuit 112. The output terminal of the canonical signed number encoding sub-circuit 111 is connected to the input terminal of the accumulation sub-circuit 112; the output terminal of the accumulation sub-circuit 112 is connected to the first input terminal of the register control circuit 12; the output terminal of the register control circuit 12 is connected to the input terminal of the register circuit 13; the output terminal of the register circuit 13 is connected to the first input terminal of the selection circuit 15; the first output terminal of the state control circuit 14 is connected to the second input terminal of the register control circuit 12; and the second output terminal of the state control circuit 14 is connected to the second input terminal of the selection circuit 15.
The canonical signed number encoding sub-circuit 111 includes a canonical signed number encoding unit 1111 and a partial product acquiring unit 1112. The canonical signed number encoding unit 1111 is used to receive first data and perform canonical signed number encoding on the first data to obtain a target code. The partial product acquiring unit 1112 is used to receive second data, obtain initial partial products according to the target code and the second data, and obtain the partial products of the target code according to the initial partial products. The accumulation sub-circuit 112 is used to accumulate the partial products of the target code to obtain a multiplication result. The state control circuit 14 is used to obtain a storage indication signal and a read indication signal. The register control circuit 12 is used to determine, according to the storage indication signal input by the state control circuit 14, the register circuit 13 in which the multiplication result is stored. The register circuit 13 is used to store the multiplication result. The selection circuit 15 is used to read, according to the received read indication signal, data from the multiplication result stored in the register circuit 13 as the target operation result.
Specifically, the canonical signed number encoding sub-circuit 111 can, through the canonical signed number encoding unit 1111, perform canonical signed number encoding on the received first data to obtain the target code; the first data can be the multiplier operand of the multiplication. Optionally, the partial product acquiring unit 1112 can obtain initial partial products according to the received second data and the target code, and obtain the partial products of the target code according to the initial partial products; the second data can be the multiplicand of the multiplication. The multiplier operand and the multiplicand can be fixed-point numbers of a given bit width. Optionally, the register circuit 13 may include multiple storage units. Optionally, the bit width of the multiplication result can be equal to 2 times the bit width of the data received by the canonical signed number encoding sub-circuit 111. Optionally, the canonical signed number encoding sub-circuit 111 can process data of a fixed bit width, and the bit width of the data it receives can be equal to the bit width of the multiplier's input port; in addition, in this embodiment, the bit width of the multiplier's output port can be less than 2 times the bit width of the input port. Optionally, the selection circuit 15 can have multiple input ports, each of which may have a different function, and one output port. Optionally, the bit width of the target operation result can be equal to 1/2 of the bit width of the multiplication result; this embodiment places no restriction on this. In this embodiment, it can also be understood that the bit width of the target operation result can be less than 2 times the bit width of the multiplication result. Optionally, the number of target code digits can be equal to the number of partial products of the target code, and the target code may contain three kinds of values: -1, 0 and 1.
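To make the target code concrete, the following Python sketch encodes a signed integer into digits drawn from -1, 0 and 1 using the standard non-adjacent-form construction, which is commonly called canonical signed digit (CSD) encoding. Whether the encoding unit of this application uses exactly this construction is an assumption; the function names and the choice of emitting width + 1 digits are also illustrative.

```python
def csd_encode(value, width):
    """Encode a width-bit two's-complement integer as width + 1 digits in {-1, 0, 1}.

    Digits are returned least significant first, and no two adjacent
    digits are both non-zero.
    """
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit in the given width")
    digits = []
    remaining = value
    for _ in range(width + 1):
        if remaining % 2 == 0:
            digit = 0
        else:
            # Pick +1 or -1 so that what remains is divisible by 4,
            # which forces the next digit to be 0.
            digit = 2 - (remaining % 4)
        digits.append(digit)
        remaining = (remaining - digit) // 2
    assert remaining == 0
    return digits


def csd_decode(digits):
    """Recover the integer value from least-significant-first signed digits."""
    return sum(digit * (1 << position) for position, digit in enumerate(digits))


def nonzero_count(digits):
    """Number of digits that would generate a non-zero partial product."""
    return sum(1 for digit in digits if digit != 0)


code_for_7 = csd_encode(7, 8)     # 7 = 8 - 1
```

For instance, 7 is 0111 in binary, with three non-zero bits, while its signed-digit form 8 - 1 has only two non-zero digits, so fewer non-zero partial products are generated.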
It should be noted that the state control circuit 14 can automatically obtain the corresponding storage indication signal each time the accumulation sub-circuit 112 obtains a multiplication result. For example, when the accumulation sub-circuit 112 obtains the first multiplication result, the storage indication signal obtained by the state control circuit 14 can be 1; when the accumulation sub-circuit 112 obtains the second multiplication result, the storage indication signal can be 2; and so on. Each time the accumulation sub-circuit 112 obtains a multiplication result, the value of the storage indication signal obtained by the state control circuit 14 can be the value corresponding to the previous multiplication result plus 1. Optionally, when a multiplication result is present in the register circuit 13, the state control circuit 14 can also automatically obtain the read indication signal corresponding to the current clock cycle number; the state control circuit 14 can obtain the current clock cycle number automatically or receive it from an external device. For example, if the first multiplication result is stored in the register circuit 13 in the first clock cycle, the corresponding read indication signal obtained by the state control circuit 14 can be 1, and the selection circuit 15 can then read part of the data stored in the register circuit 13. In the second clock cycle, the corresponding read indication signal can be 2, and the selection circuit 15 can then read the remaining data of the first multiplication result stored in the register circuit 13. In other words, the multiplier can output one multiplication result every two clock cycles. However, if five clock cycles are needed after the first multiplication result is obtained before the second multiplication result becomes available, the register circuit 13 can store the second multiplication result only in the sixth clock cycle, and the corresponding read indication signal obtained by the state control circuit 14 can then be 3. In effect, the value of the read indication signal can be determined by the number of data items stored in the register circuit 13.
In addition, the multiplication result obtained by the accumulation sub-circuit 112 is not the target operation result of the multiplier. The target operation result can be obtained by splicing the two operation results that the multiplier outputs in two passes: the operation result output by the selection circuit 15 the first time is spliced with the operation result output the second time to obtain the target operation result. By analogy, for each multiplication, splicing the two operation results output by the selection circuit 15 yields the target operation result of that multiplication. In addition, the multiplication operation circuit 11 may also output one target operation result over a corresponding plurality of clock cycles.
It should be noted that the multiplier can receive, through the register control circuit 12, the multiplication result output by the accumulation sub-circuit 112 for each multiplication, and determine, according to the received storage indication signal, the storage unit in which each multiplication result is stored. Optionally, the selection circuit 15 can determine, according to the different reading indication signals received, which data to read from the multiplication results stored in the register circuit 13. Optionally, if the bit width of the input port of the multiplier is N and the bit width of the received data is also N, the bit width M of the output port of the multiplier may be equal to 2N/t+deta, where (2N/t+deta) < 2N and deta (deta ≥ 0) is a constant. Under normal circumstances, the multiplication operation circuit 11 completes one multiplication in t (t > 1) clock cycles, obtains one multiplication result, and stores the multiplication result obtained by the accumulation sub-circuit 112 into the register circuit 13. There is also a low-probability case in which the multiplier completes one multiplication in m (m < t, and m≤1) clock cycles, obtains one multiplication result, and stores the multiplication result obtained by the accumulation sub-circuit 112 into the register circuit 13. Optionally, the selection circuit 15 can read data from a multiplication result stored in the register circuit 13 in two passes, where the bit width of the multiplication result may be equal to 2N and the bit width of the data read each time may be equal to N. In the two passes, the selection circuit 15 can read the high N bits and the low N bits of the same multiplication result as two operation results, and splice the two operation results to obtain the target operation result of the multiplication.
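The two-pass read and the splicing step can be sketched as follows. This is only a behavioural illustration under assumptions: the function names `read_halves` and `splice` are hypothetical, and reading the high half before the low half is an assumed order that the text does not fix.

```python
def read_halves(product: int, n: int) -> tuple:
    """Return the (high, low) n-bit halves of a 2n-bit multiplication result."""
    mask = (1 << n) - 1
    return (product >> n) & mask, product & mask


def splice(high: int, low: int, n: int) -> int:
    """Concatenate two n-bit operation results into one 2n-bit target result."""
    return (high << n) | low


# Assumed example: N = 8, so the stored multiplication result is 16 bits wide.
high, low = read_halves(0xABCD, 8)
print(hex(high), hex(low), hex(splice(high, low, 8)))
```

Each call to `read_halves` models the two reads of the selection circuit over an N-bit output port; `splice` models how the two outputs form the 2N-bit target operation result.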
In addition, in this embodiment, it can be understood that the partial product acquiring unit 1112 can obtain the sign-extended partial products from the original partial products, and obtain the partial products of the target code from the sign-extended partial products. Optionally, the bit width of a sign-extended partial product may be equal to 2 times the data bit width N received by the multiplier, and the bit width of an original partial product may be equal to the data bit width N received by the multiplier. Optionally, each of the high N bits of a sign-extended partial product may be equal to the highest bit of the original partial product, i.e. its sign bit. That is, the high (N+1) bits of the sign-extended partial product are all equal to the sign bit of the original partial product, and the low (N-1) bits may be equal to the low (N-1) bits of the original partial product.
Illustratively, if the multiplier is currently processing an 8-bit × 8-bit fixed-point multiplication and an original partial product obtained by the partial product acquiring unit 1112 is "p7p6p5p4p3p2p1p0", sign-bit extension of this original partial product yields a sign-extended partial product that can be expressed as "p7p7p7p7p7p7p7p7p7p6p5p4p3p2p1p0".
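The sign-bit extension in this example can be sketched in a few lines. The function name `sign_extend` is hypothetical, and the partial product is assumed to be held as an N-bit two's-complement pattern in an integer.

```python
def sign_extend(partial_product: int, n: int) -> int:
    """Sign-extend an n-bit pattern to 2n bits by replicating its top bit."""
    mask = (1 << n) - 1
    low = partial_product & mask          # the original n bits, p(n-1)..p0
    sign = (low >> (n - 1)) & 1           # the sign bit p(n-1)
    high = mask if sign else 0            # n copies of the sign bit
    return (high << n) | low


# 8-bit pattern 10010110 becomes the 16-bit pattern 1111111110010110.
print(format(sign_extend(0b10010110, 8), "016b"))
```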
It can also be understood that, in the distribution of the partial products of all target codes, the partial product of each target code has a corresponding sign-extended partial product, and the partial product of the first target code may be the first sign-extended partial product. Starting from the partial product of the second target code, the corresponding sign-extended partial product is shifted left by one bit relative to the partial product of the previous target code, and the highest bit of the partial product of each target code is located in the same column as the highest bit of the partial product of the first target code. In other words, starting from the partial product of the second target code, after each sign-extended partial product is shifted left, the bits shifted beyond the highest column do not take part in the addition.
In the multiplier provided in this embodiment, the canonical signed number encoding sub-circuit performs canonical signed number encoding on the received data to obtain the target code and obtains the partial products of the target code according to the target code; the accumulation sub-circuit accumulates the sign-extended partial products to obtain the multiplication result; the state control circuit obtains the storage indication signal and the reading indication signal; the register control circuit determines, according to the storage indication signal, the register circuit in which the multiplication result is stored; the register circuit stores the multiplication result; and the selection circuit reads, according to the reading indication signal, data from the multiplication result stored in the register circuit to obtain the target operation result. Because the multiplier uses the canonical signed number encoding sub-circuit to encode the received data, the number of valid partial products obtained during multiplication is reduced, which reduces the complexity of the multiplication. At the same time, the multiplier can improve the operation efficiency of multiplication and effectively reduce the power consumption of the multiplier.
One embodiment provides a schematic diagram of the specific structure of a multiplier. The multiplier includes a multiplication operation circuit 21 and a conversion circuit 22. The multiplication operation circuit 21 includes a canonical signed number encoding sub-circuit 211 and an accumulation sub-circuit 212; the output end of the canonical signed number encoding sub-circuit 211 is connected to the input end of the accumulation sub-circuit 212, and the output end of the accumulation sub-circuit 212 is connected to the input end of the conversion circuit 22. The conversion circuit 22 includes a first conversion sub-circuit 221 and a second conversion sub-circuit 222. The canonical signed number encoding sub-circuit 211 is used to perform canonical signed number encoding on the received data to obtain a target code, and to obtain the partial products of the target code according to the target code; the accumulation sub-circuit 212 is used to accumulate the partial products of the target code to obtain a multiplication result; and the first conversion sub-circuit 221 and the second conversion sub-circuit 222 are each used to perform number conversion on the multiplication result to obtain a target operation result.
Optionally, the canonical signed number encoding sub-circuit 211 includes a canonical signed number encoding unit 2111 and a partial product acquiring unit 2112. The canonical signed number encoding unit 2111 is used to receive first data and perform the canonical signed number encoding on the first data to obtain the target code. The partial product acquiring unit 2112 is used to receive second data, obtain original partial products according to the target code and the second data, and obtain the partial products of the target code according to the original partial products.
Specifically, the canonical signed number encoding sub-circuit 211 can perform canonical signed number encoding on the received data, which may be the multiplier operand and the multiplicand of the multiplication; the multiplier operand and the multiplicand may be fixed-point numbers of the same bit width. Optionally, the canonical signed number encoding sub-circuit 211 may include a plurality of data processing sub-circuits with different functions. Each of these data processing sub-circuits may have one or more input ports, and the functions of the input ports of a data processing sub-circuit may differ; each may likewise have one or more output ports, and the functions of the output ports of a data processing sub-circuit may differ; the circuit structures of data processing sub-circuits with different functions may also differ. Optionally, the conversion circuit 22 can convert the multiplication result output by the accumulation sub-circuit 212 into data in a target format, i.e. the target operation result. The multiplication result may be a fixed-point number, and the data in the target format may be a fixed-point number or a floating-point number; in addition, the data bit width of the target format may be less than 2 times the bit width of the multiplication result. Optionally, the target operation result may be part of the data in the multiplication result. Optionally, the bit width of the target operation result may be equal to 1/2 of the bit width of the multiplication result, or to 1/4 of it; this embodiment imposes no limitation on this. In this embodiment, it can also be understood that the bit width of the target operation result is less than 2 times the bit width of the multiplication result. In addition, the multiplication result obtained by the accumulation sub-circuit 212 is not the target operation result of the multiplication, but only part of the data in the target operation result. Optionally, the number of target codes may be equal to the number of partial products of the target code, and the target code may include three kinds of values, namely -1, 0 and 1.
It should be noted that the canonical signed number encoding sub-circuit 211 can perform multiplication on data of a fixed bit width, and the bit width of the data it receives may be equal to the bit width of the input port of the multiplier. In addition, in this embodiment, the bit width of the output port of the multiplier may be less than 2 times the bit width of the input port.
Optionally, the conversion circuit 22 includes an input port for receiving a data conversion signal. Optionally, the data conversion signal is used to determine the type of data conversion performed by the conversion circuit 22.
Optionally, there may be several data conversion signals, and each data conversion signal determines the target format into which the conversion circuit 22 converts the received data. Optionally, the data conversion types may include fixed-point-to-fixed-point conversion and fixed-point-to-floating-point conversion. Illustratively, if the bit widths of the input port and the output port of the multiplier are both N, the multiplier can obtain a multiplication result with a bit width of 2N bits and, through the conversion circuit 22, convert that 2N-bit multiplication result into a target operation result with a bit width of N bits, which may be a floating-point number; alternatively, the multiplier can use the conversion circuit 22 to convert the 2N-bit multiplication result into an N-bit fixed-point number as the target operation result. In this embodiment, the circuit structure and function of the canonical signed number encoding sub-circuit 211 are the same as those of the canonical signed number encoding sub-circuit 111, so the specific structure of the canonical signed number encoding sub-circuit 211 is not described again here.
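The text does not specify how the conversion circuit 22 narrows a 2N-bit result, so the following is only a hypothetical sketch for N = 16. Everything here is an assumption made for illustration: the function names, the string values of the data conversion signal, the position of the binary point (`frac_bits`), the choice of simply slicing bits for the fixed-point case, and the use of the IEEE 754 half-precision format for the floating-point case.

```python
import struct


def to_fixed(product: int, n: int, drop_bits: int) -> int:
    """Select an n-bit slice of the product, discarding drop_bits low-order bits."""
    return (product >> drop_bits) & ((1 << n) - 1)


def to_half_float_bits(product: int, frac_bits: int) -> int:
    """Convert a 32-bit signed fixed-point pattern to 16-bit half-precision bits."""
    if product >= 1 << 31:                 # reinterpret the pattern as signed
        product -= 1 << 32
    value = product / (1 << frac_bits)     # apply the assumed binary point
    return struct.unpack("<H", struct.pack("<e", value))[0]


def convert(product: int, signal: str, frac_bits: int = 8) -> int:
    """Pick the conversion type from a (hypothetical) data conversion signal."""
    if signal == "fixed_to_float":
        return to_half_float_bits(product, frac_bits)
    return to_fixed(product, 16, frac_bits)


print(hex(convert(3 << 8, "fixed_to_float")))      # 3.0 as half-precision bits
print(hex(convert(0x00012345, "fixed_to_fixed")))  # 16-bit slice of the product
```

A real conversion circuit would also have to define rounding and overflow behaviour, which this sketch leaves out.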
The multiplier provided in this embodiment can use the canonical signed number encoding sub-circuit to perform canonical signed number encoding on the received data, which reduces the number of valid partial products obtained during multiplication and thus reduces the complexity of the multiplication. At the same time, the multiplier can improve the operation efficiency of multiplication and effectively reduce the power consumption of the multiplier.
In one embodiment, the canonical signed number encoding unit 1111 may include a first data input port 1111a and a target code output port 1111b. The first data input port 1111a is used to receive the first data on which canonical signed number encoding is to be performed, and the target code output port 1111b is used to output the target code obtained by performing canonical signed number encoding on the first data.
Specifically, the first data received by the first data input port 1111a of the canonical signed number encoding unit 1111 may be the multiplier operand of the multiplication, which may be a fixed-point number. Optionally, the second data received by the partial product acquiring unit 1112 may be the multiplicand of the multiplication, which may be a fixed-point number; the multiplier operand and the multiplicand may have the same bit width. Optionally, the number of target codes may be equal to both the number of original partial products and the number of partial products of the target code.
It should be noted that the canonical signed number encoding can be characterized as follows. An N-bit multiplier operand is processed from the low-order bits to the high-order bits. If there are l (l ≥ 2) consecutive bits with value 1, these l consecutive 1s can be converted into the data "1(0)^(l-1)(-1)", i.e. a 1 followed by (l-1) zeros and a -1, and the remaining (N-l) bits are combined with the (l+1) converted bits to obtain new data. The new data is then used as the initial data of the next stage of conversion, until the new data obtained after conversion contains no run of l (l ≥ 2) consecutive 1s. The bit width of the target code obtained by performing canonical signed number encoding on an N-bit multiplier operand may be equal to (N+1). Further, during canonical signed number encoding, the data 11 can be converted into (100 - 001), i.e. 11 is equivalent to 10(-1); the data 111 can be converted into (1000 - 0001), i.e. 111 is equivalent to 100(-1); and runs of l (l ≥ 2) consecutive 1s of other lengths are converted in the same way.
For example, suppose the multiplier operand received by the canonical signed number encoding unit 1111 is "001010101101110". The first new data obtained after the first stage of conversion is "0010101011100(-1)0"; the second new data obtained after the second stage of conversion is "0010101100(-1)00(-1)0"; the third new data obtained after the third stage of conversion is "0010110(-1)00(-1)00(-1)0"; the fourth new data obtained after the fourth stage of conversion is "00110(-1)0(-1)00(-1)00(-1)0"; and the fifth new data obtained after the fifth stage of conversion is "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l ≥ 2) consecutive 1s, so it can be called the intermediate code; after bit padding is applied to the intermediate code, the canonical signed number encoding is complete. The bit width of the intermediate code may be equal to the bit width of the multiplier operand. Optionally, after the canonical signed number encoding unit 1111 has converted the multiplier operand, if the highest and second-highest bits of the resulting new data (i.e. the intermediate code) are "10" or "01", the canonical signed number encoding unit 1111 can pad one 0 at the position one bit above the highest bit of the intermediate code, so that the high three bits of the corresponding target code are "010" or "001" respectively. Optionally, the bit width of the intermediate code may be equal to the bit width of the target code minus 1.
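The stage-by-stage conversion described above can be sketched in software as follows. This is a minimal behavioural model, not the circuit itself: the names `csd_encode` and `csd_value` are hypothetical, the multiplier operand is assumed to be given as an unsigned binary string, and the padded high-order 0 is added before conversion rather than after, which produces the same (N+1)-digit target code.

```python
def csd_encode(bits: str) -> list:
    """Encode a binary string into digits -1/0/1, MSB first, with N+1 digits."""
    d = [int(b) for b in reversed(bits)] + [0]   # LSB first, plus a padded 0
    while True:
        i = 0
        converted = False
        while i < len(d):
            if d[i] != 1:
                i += 1
                continue
            j = i
            while j < len(d) and d[j] == 1:
                j += 1                           # d[i:j] is a run of 1s
            if j - i >= 2:
                d[i] = -1                        # lowest bit of the run: -1
                d[i + 1:j] = [0] * (j - i - 1)   # middle of the run: zeros
                d[j] = 1                         # bit above the run: 1
                converted = True
                break                            # one stage done, rescan
            i = j
        if not converted:
            return d[::-1]


def csd_value(digits: list) -> int:
    """Evaluate MSB-first signed digits back to an integer."""
    return sum(v * (1 << k) for k, v in enumerate(reversed(digits)))


code = csd_encode("001010101101110")
print(code, csd_value(code) == int("001010101101110", 2))
```

For the example operand, the returned digits are a 0 followed by the fifth new data from the text, and evaluating them recovers the original value, which is a convenient check that each stage preserves the operand.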
It should be noted that the canonical signed number encoding unit 1111 can output the target code through the target code output port 1111b. Optionally, the bit width of the target code may be equal to the bit width of the data received by the canonical signed number encoding unit 1111, and the target code may include three kinds of values, namely -1, 0 and 1. It can also be understood that the number of values contained in the target code may be equal to the bit width of the target code.
In the multiplier provided in this embodiment, the canonical signed number encoding unit in the multiplication operation circuit performs canonical signed number encoding on the received data to obtain the target code; the partial product acquiring unit obtains an original partial product for each target code and obtains the partial products of the target code from the original partial products; and the accumulation sub-circuit accumulates the partial products of the target code to obtain the multiplication result. The state control circuit obtains the storage indication signal and the reading indication signal; the register control circuit determines, according to the storage indication signal, the register circuit in which the multiplication result is stored; the register circuit stores the multiplication result; and the selection circuit reads, according to the reading indication signal, data from the multiplication result stored in the register circuit to obtain the target operation result. Because the multiplier uses the canonical signed number encoding unit to encode the received data, the number of valid partial products obtained during multiplication is reduced, which reduces the complexity of the multiplication. At the same time, the multiplier can improve the operation efficiency of multiplication and effectively reduce the power consumption of the multiplier.
In one embodiment, the partial product acquiring unit 1112 is specifically used to convert the target code into original partial products, perform sign-bit extension on the original partial products to obtain sign-extended partial products, and obtain the partial products of the target code from the sign-extended partial products.
Specifically, the conversion can be characterized as converting each value in the target code into an original partial product based on the multiplicand (denoted X) of the multiplication. Optionally, each value in the target code has a corresponding original partial product: if the value is -1, the corresponding original partial product may be -X; if the value is 1, it may be X; and if the value is 0, it may be 0. Optionally, the original partial product is a partial product without sign-bit extension, and its bit width may be the same as the bit width of the data currently processed by the multiplication operation circuit 11. Optionally, the bit width of the sign-extended partial product may be equal to 2 times the data bit width N processed by the multiplier, in which case the bit width of the original partial product may be equal to N. Optionally, the low N bits of the sign-extended partial product may be equal to the N bits of the original partial product, and each of the high N bits of the sign-extended partial product may be equal to the highest bit of the original partial product, i.e. its sign bit.
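The mapping from target code values to original partial products can be sketched as a small lookup. The function name `original_partial_products` is hypothetical; the multiplicand X is assumed to be a signed integer whose negation also fits in N bits (so X = -2^(N-1) is excluded), and each result is returned as an N-bit two's-complement pattern.

```python
def original_partial_products(target_code: list, x: int, n: int) -> list:
    """Map each target code value (-1, 0 or 1) to -X, 0 or X as an n-bit pattern."""
    mask = (1 << n) - 1
    table = {-1: -x, 0: 0, 1: x}           # selection described in the text
    return [table[d] & mask for d in target_code]


# Target code 1, 0, -1 with X = 5 and N = 8 gives X, 0 and -X (0xFB).
print([hex(p) for p in original_partial_products([1, 0, -1], 5, 8)])
```

Because the target code only contains -1, 0 and 1, no multiple of X larger than X itself is ever needed, which is why N bits are enough for each original partial product under the stated assumption.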
In addition, the partial product acquiring unit 1112 can obtain the partial products of the target code from all of the sign-extended partial products. In the distribution of the partial products of all target codes, the partial product of the first target code may be equal to the first sign-extended partial product. Starting from the partial product of the second target code, the highest bit of the partial product of each target code is located in the same column as the highest bit of the partial product of the first target code, and the bit width of the partial product of each target code may be equal to the bit width of the partial product of the previous target code minus 1, or equivalently to the bit width 2N of the corresponding sign-extended partial product minus (i-1), where i denotes the number of the partial product of the target code, counted from 1. The resulting distribution of the partial products of 9 target codes may be as shown in Figure 3.
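The arrangement of the rows can be checked numerically with a short sketch. It assumes the target code is supplied least-significant value first, models "bits shifted beyond the highest column are dropped" as masking to 2N bits, and uses a hypothetical function name `accumulate_rows`; Python integers stand in for the sign-extended patterns.

```python
def accumulate_rows(target_code_lsb_first: list, x: int, n: int) -> tuple:
    """Build the shifted, truncated rows and add them modulo 2**(2n)."""
    mask = (1 << (2 * n)) - 1
    rows = []
    for i, d in enumerate(target_code_lsb_first):
        # d * x is the signed partial product; shifting left by i aligns it
        # with column i, and masking drops the bits beyond column 2n-1.
        rows.append(((d * x) << i) & mask)
    return sum(rows) & mask, rows


# Multiplier operand 7 has target code 0 0 0 0 0 1 0 0 -1 (MSB first, 9 values).
code = [-1, 0, 0, 1, 0, 0, 0, 0, 0]        # same code, least-significant first
total, rows = accumulate_rows(code, -3, 8)
print(hex(total), total == (-3 * 7) & 0xFFFF)
```

Row i keeps only 2N minus i meaningful bits (counting i from 0 here), which corresponds to the shrinking row widths in the distribution described above.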
Optionally, the partial product acquiring unit 1112 includes a target code input port 1112a, a second data input port 1112b and a partial product output port 1112c. The target code input port 1112a is used to receive the target code, the second data input port 1112b is used to receive the second data, and the partial product output port 1112c is used to output the partial products of the target code.
In this embodiment, the partial product acquiring unit 1112 can receive, through the target code input port 1112a, the target code obtained by the canonical signed number encoding unit 1111, and receive the second data through the second data input port 1112b. It then performs conversion and shifting according to the target code and the second data to obtain the partial products of the target code, and outputs the partial products of the target code through the partial product output port 1112c.
The multiplier provided in this embodiment obtains fewer valid partial products, which reduces the complexity of the multiplication. At the same time, the multiplier can improve the operation efficiency of multiplication and effectively reduce the power consumption of the multiplier.
In the multiplier provided by another embodiment, the multiplier includes the accumulation sub-circuit 112, and the accumulation sub-circuit 112 includes a Wallace tree group unit 1121 and an accumulation unit 1122. The output end of the Wallace tree group unit 1121 is connected to the input end of the accumulation unit 1122. The Wallace tree group unit 1121 is used to accumulate the partial products of the target code to obtain an accumulation result, and the accumulation unit 1122 is used to accumulate the accumulation result.
Specifically, the Wallace tree group unit 1121 can accumulate the values in the partial products of all target codes obtained by the partial product acquiring unit 1112 to obtain an accumulation result, and the accumulation unit 1122 accumulates the accumulation result obtained by the Wallace tree group unit 1121 to obtain the target operation result.
The multiplier provided in this embodiment can accumulate the partial products of the target code through the Wallace tree group unit, accumulate the accumulation result through the accumulation unit to obtain the multiplication result, and obtain the target operation result from the multiplication result. This keeps the number of valid partial products obtained by the multiplier small and reduces the complexity of the multiplication. At the same time, the multiplier can improve the operation efficiency of multiplication and effectively reduce the power consumption of the multiplier.
In the multiplier provided by another embodiment, the Wallace tree group unit 1121 includes Wallace tree subunits 1121_1 to 1121_n, and the Wallace tree subunits 1121_1 to 1121_n are used to accumulate the values in each column of the partial products of all target codes.
Specifically, the circuit structure of the Wallace tree subunits 1121_1 to 1121_n can be implemented by a combination of full adders and half adders. A Wallace tree subunit can also be understood as a circuit that processes a multi-bit input signal and adds the input bits to obtain two output signals. Optionally, the number n of Wallace tree subunits included in the Wallace tree group unit 1121 may be equal to 2 times the bit width of the data currently processed by the multiplication operation circuit 11; the n Wallace tree subunits can process the partial products of the target code in parallel, although they may be connected in series. Optionally, each Wallace tree subunit in the Wallace tree group unit 1121 can add the values in one column of the partial products of all target codes and output two signals, namely a carry signal Carry_i and a sum signal Sum_i, where i denotes the number of the Wallace tree subunit and the first Wallace tree subunit is numbered 1. Optionally, the number of input signals received by each Wallace tree subunit may be equal to the number of target codes or the number of sign-extended partial products.
In addition, the signals of each Wallace tree subunit in the Wallace tree group unit 1121 may include a carry input signal Cin_i, partial product input signals and a carry output signal Cout_i. Optionally, the partial product input signals received by each Wallace tree subunit may be the values in one column of the partial products of all target codes, and the number of bits of the carry output signal Cout_i output by each Wallace tree subunit may be N_Cout = floor((N_I + N_Cin) / 2) - 1, where N_I denotes the number of partial product input signals of the Wallace tree subunit, N_Cin denotes the number of its carry input signals, N_Cout denotes the minimum number of its carry output signals, and floor(·) denotes rounding down. Optionally, the carry input signal received by each Wallace tree subunit in the Wallace tree group unit 1121 may be the carry output signal output by the previous Wallace tree subunit, and the carry input signal received by the first Wallace tree subunit may be 0. The number of carry signal input ports of the first Wallace tree subunit may be the same as that of the other Wallace tree subunits.
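One way to read the carry-count formula is through a behavioural model of a single column. The sketch below reduces the bits of one column with full adders and at most one half adder; the function names are hypothetical, and the split of the generated carries into one Carry_i plus N_Cout carry outputs is an interpretation assumed for this illustration, not something the text states explicitly.

```python
def n_cout(n_i: int, n_cin: int) -> int:
    """Carry-output count from the text: floor((N_I + N_Cin) / 2) - 1."""
    return (n_i + n_cin) // 2 - 1


def compress_column(bits: list) -> tuple:
    """Reduce one column of bits to a single sum bit plus a list of carries."""
    bits = list(bits)
    carries = []
    while len(bits) >= 3:                  # full adder: 3 bits in, sum and carry out
        a, b, c = bits.pop(), bits.pop(), bits.pop()
        bits.append(a ^ b ^ c)
        carries.append((a & b) | (a & c) | (b & c))
    if len(bits) == 2:                     # half adder for a leftover pair
        a, b = bits.pop(), bits.pop()
        bits.append(a ^ b)
        carries.append(a & b)
    return (bits[0] if bits else 0), carries


# A column holding 9 partial product bits and no carry inputs.
sum_bit, carries = compress_column([1] * 9)
print(sum_bit, len(carries), n_cout(9, 0))
```

In this model a column with M input bits always produces floor(M / 2) carries of the next-higher weight; taking one of them as Carry_i leaves floor(M / 2) - 1 to be passed on, which matches the formula. In an actual Wallace tree the carry inputs arrive from the neighbouring column at different stages, so this sketch only reproduces the counts and the arithmetic, not the wiring.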
Illustratively, if the multiplication operation circuit 11 is currently processing an 8-bit × 8-bit multiplication, the sign-extended partial products obtained by the partial product acquiring unit 1112 are "pi9pi9pi9pi9pi9pi9pi9pi9pi8pi7pi6pi5pi4pi3pi2pi1" (i = 1, ..., n = 9), where i denotes the i-th sign-extended partial product. The partial products of 9 target codes are obtained from the 9 sign-extended partial products and are then accumulated. Optionally, the distribution of the partial products of the 9 target codes is shown in Figure 3, where each dot represents one bit of a sign-extended partial product. The partial product of the first target code may be the first sign-extended partial product. Starting from the partial product of the second target code, each corresponding sign-extended partial product is shifted left by one bit relative to the partial product of the previous target code, and the highest bit of the partial product of each target code is located in the same column as the highest bit of the partial product of the first target code; in other words, the bits shifted beyond the highest column do not take part in the addition. Counting from the rightmost column to the leftmost column, a total of 16 Wallace tree subunits are needed to accumulate the partial products of the 9 target codes. The connection circuit of the 16 Wallace tree subunits is shown in Figure 4, where Wallace_i denotes a Wallace tree subunit and i is its number, counted from 1. A solid line between two adjacent Wallace tree subunits indicates that the higher-numbered Wallace tree subunit has a carry output signal, and a dashed line indicates that the higher-numbered Wallace tree subunit has no carry output signal.
The multiplier provided in this embodiment obtains fewer valid partial products, which reduces the complexity of the multiplication. At the same time, the multiplier can improve the operation efficiency of multiplication and effectively reduce the power consumption of the multiplier.
In one embodiment, the accumulation unit 1122 in the multiplier includes an adder, and the adder is used to add the corrected accumulation results it receives.
Specifically, the adder may be an adder of any of several bit widths, and it may be a carry-lookahead adder. Optionally, the adder can receive the two signals output by the modified Wallace tree group unit 1121, add the two output signals, and output the multiplication result.
Optionally, the adder includes: carry signal input port and position signal input port and result output end
Mouthful;The carry signal input port is for receiving carry signal, and described and position signal input port is used to receive and position signal,
The result output port is used to export the carry signal and described and position signal carries out the result of accumulation process.
Specifically, adder can receive what amendment Wallace tree group unit 1121 exported by carry signal input port
Carry signal Carry, by receiving amendment Wallace tree group unit 1121 exports and position signal with position signal input port
Sum, and by carry signal Carry with and position signal Sum progress accumulated result, exported by result output port.
It should be noted that, during a multiplication, the multiplication circuit 11 may use adders of different bit widths to add the carry output signal Carry and the sum-bit output signal Sum produced by the correction Wallace tree group unit 1121, where the bit width the adder can process may equal 2 times the bit width N of the data the multiplier is currently processing. Optionally, each Wallace tree subunit in the correction Wallace tree group unit 1121 outputs one carry output signal Carry_i and one sum-bit output signal Sum_i (i = 0, ..., 2N-1, where i is the index of each Wallace tree subunit, numbered from 0). Optionally, the adder receives Carry = {[Carry_0 : Carry_(2N-2)], 0}. That is, the carry output signal Carry received by the adder is 2N bits wide: its first 2N-1 bits are the carry output signals of the first 2N-1 Wallace tree subunits in the correction Wallace tree group unit 1121, and its last bit may be replaced by the value 0. Optionally, the sum-bit output signal Sum received by the adder is 2N bits wide, and its bits may equal the sum-bit output signals of the individual Wallace tree subunits in the correction Wallace tree group unit 1121.
For example, if the multiplication circuit 11 is currently processing an 8-bit by 8-bit multiplication, the adder may be a 16-bit carry-lookahead adder. Continuing with Fig. 4, the correction Wallace tree group unit 1121 outputs the sum-bit output signals Sum and the carry output signals Carry of the 16 Wallace tree subunits. The sum-bit signal received by the 16-bit carry-lookahead adder is the complete sum-bit signal Sum output by the correction Wallace tree group unit 1121, whereas the carry signal it receives is the signal Carry formed by combining the value 0 with all carry output signals of the correction Wallace tree group unit 1121 except the one output by the last Wallace tree subunit.
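The hand-off from the accumulation stage to the final adder can be sketched in Python. This is a minimal sketch under stated assumptions: it uses a sequential word-level carry-save accumulator in place of the column-wise Wallace tree wiring of Fig. 4, and the names `carry_save_reduce` and `final_add` are hypothetical. It only illustrates that the carry word reaches the adder shifted up by one position, with a 0 in its lowest bit and the carry of the last column dropped, as in Carry = {[Carry_0 : Carry_(2N-2)], 0}.

```python
def carry_save_reduce(rows, width):
    """Reduce `width`-bit rows to (sum_word, carry_cols).

    Bit i of sum_word stands for Sum_i; bit i of carry_cols stands for
    Carry_i, the carry produced in column i (not yet shifted).
    """
    mask = (1 << width) - 1
    sum_word, carry_cols = 0, 0
    for row in rows:
        a = sum_word
        b = (carry_cols << 1) & mask       # carries feed the next higher column
        c = row & mask
        sum_word = a ^ b ^ c               # 3:2 compression, sum bits
        carry_cols = (a & b) | (a & c) | (b & c)  # 3:2 compression, carry bits
    return sum_word, carry_cols


def final_add(sum_word, carry_cols, width):
    """Final adder: Sum + {Carry_0 .. Carry_(width-2), 0}, kept to `width` bits."""
    mask = (1 << width) - 1
    carry_word = (carry_cols << 1) & mask  # last column's carry is dropped
    return (sum_word + carry_word) & mask


# Rows of 13 * 11 (11 = 0b1011) in a 16-bit field.
s, c = carry_save_reduce([13, 13 << 1, 13 << 3], 16)
print(final_add(s, c, 16))
```

With N = 8 the field is 16 bits wide, matching the 16-bit adder of the example above.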
In the multiplier provided in this embodiment, fewer valid partial products are obtained, which reduces the complexity of the multiplication, improves its efficiency, and effectively reduces the power consumption of the multiplier.
In one embodiment, the multiplier includes the register circuit 13, which includes register sub-circuits 131 for storing the multiplication results that correspond to different storage indication signals.
Specifically, the register circuit 13 may include two or more register sub-circuits 131. Put differently, the number of register sub-circuits 131 in the register circuit 13 may equal 2N_in/N_out, where N_in is the bit width of the data the multiplier receives and N_out (N_out < 2N_in) is the bit width of the data the multiplier outputs. Optionally, the bit width of the data stored in a register sub-circuit 131 may equal 2 times the bit width of the multiplier input port. Optionally, the bit width of the data the multiplier receives may equal the bit width of the multiplier input port, and the bit width of the data the multiplier outputs may equal the bit width of the input port or may be less than 2 times that bit width. For example, if the input port and the output port of the multiplier are both N bits wide, the register circuit 13 consists of two register sub-circuits 131; if the input port is N bits wide and the output port is N/2 bits wide, the register circuit 13 consists of four register sub-circuits 131. Optionally, according to the storage indication signal, the multiplier stores the result of each multiplication into the corresponding one of the 2N_in/N_out register sub-circuits 131, where different storage indication signals correspond to different register sub-circuits 131. Optionally, each multiplication result can be stored only in the register sub-circuit 131 that corresponds to the storage indication signal, and cannot be stored in any other register sub-circuit 131.
For example, if the register circuit 13 has n register sub-circuits 131, numbered 1, 2, 3, ..., n, the first multiplication result obtained by the multiplier may be stored in register sub-circuit No. 1, in which case the storage indication signal may have the value 1; the second multiplication result may be stored in register sub-circuit No. 2, in which case the storage indication signal may have the value 2. Equivalently, when the value of the storage indication signal is odd, the number of the register sub-circuit 131 that stores the multiplication result is also odd, and when the value is even, the number is also even. The value of the storage indication signal may equal the number of the register sub-circuit 131 that stores the multiplication result.
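The sizing rule and the numbered storage described above can be sketched in Python. This is a minimal sketch, not the circuit itself: the class name `RegisterBank` and its methods are hypothetical, and it assumes that N_out divides 2N_in and that the storage indication signal directly carries the sub-circuit number, starting from 1.

```python
class RegisterBank:
    """Hypothetical software model of register circuit 13."""

    def __init__(self, n_in, n_out):
        if not 0 < n_out < 2 * n_in or (2 * n_in) % n_out != 0:
            raise ValueError("sketch assumes N_out < 2*N_in and N_out divides 2*N_in")
        self.result_width = 2 * n_in                 # each slot holds a full product
        self.slots = [None] * (2 * n_in // n_out)    # 2*N_in/N_out sub-circuits

    def store(self, indicator, result):
        """Store one multiplication result in the slot named by the indicator."""
        if not 1 <= indicator <= len(self.slots):
            raise ValueError("no register sub-circuit with that number")
        self.slots[indicator - 1] = result & ((1 << self.result_width) - 1)


bank = RegisterBank(n_in=8, n_out=4)   # four sub-circuits
bank.store(1, 143)                     # first result, indicator 1
bank.store(2, 0xFCBF)                  # second result, indicator 2
```

Slots that have not been written stay `None`, which mirrors the idle register sub-circuits the text allows.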
In the multiplier provided in this embodiment, the register sub-circuits store the result of each multiplication in a different register sub-circuit according to the storage indication signal, and part of the data of the multiplication result stored in the corresponding register sub-circuit is then output according to the read indication signal. The target operation result can thus be output by a multiplier whose output port is narrower than 2 times its input port. In addition, the multiplier obtains fewer valid partial products, which reduces the complexity of the multiplication.
In a multiplier provided by another embodiment, the multiplier includes the accumulation sub-circuit 212, which includes a Wallace tree group unit 2121 and an accumulation unit 2122. The output of the Wallace tree group unit 2121 is connected to the input of the accumulation unit 2122. The Wallace tree group unit 2121 accumulates the partial products of the target encoding to obtain an accumulation result, and the accumulation unit 2122 accumulates that accumulation result to obtain the target operation result.
Specifically, the Wallace tree group unit 2121 may accumulate the values in the partial products of all target encodings obtained by the partial product acquisition unit 2112 to obtain an accumulation result, and the accumulation unit 2122 then accumulates the accumulation result obtained by the Wallace tree group unit 2121 to obtain the target operation result.
Optionally, the multiplier includes the Wallace tree group unit 2121, which includes Wallace tree subunits 2121_1 to 2121_n. The Wallace tree subunits 2121_1 to 2121_n accumulate the values in each column of the partial products of all target encodings.
In this embodiment, the circuit structure and function of the Wallace tree group unit 2121 may be the same as those of the Wallace tree group unit 1121, so the specific structure of the Wallace tree group unit 2121 is not described again here.
The multiplier provided in this embodiment accumulates the partial products of the target encoding with the Wallace tree group unit, accumulates the result with the accumulation unit to obtain the multiplication result, and obtains the target operation result from the multiplication result. This keeps the number of valid partial products small and reduces the complexity of the multiplication. The multiplier can also improve the efficiency of the multiplication and effectively reduce its own power consumption.
In one embodiment, the multiplier includes the accumulation unit 2122, which includes an adder that performs an addition on the accumulation result.
Specifically, the adder may have any of several bit widths and may be a carry-lookahead adder. Optionally, the adder receives the two signals output by the Wallace tree group unit 2121, adds them, and outputs the multiplication result.
The multiplier provided in this embodiment accumulates the two signals output by the Wallace tree group unit with the accumulation unit, outputs the multiplication result, and obtains the target operation result from the multiplication result. This keeps the number of valid partial products small and reduces the complexity of the multiplication. The multiplier can also improve the efficiency of the multiplication and effectively reduce its own power consumption.
In one embodiment, the multiplier includes the adder, which includes a carry signal input port, a sum-bit signal input port, and a result output port. The carry signal input port receives the carry signal, the sum-bit signal input port receives the sum-bit signal, and the result output port outputs the multiplication result obtained by accumulating the carry signal and the sum-bit signal.
Specifically, the adder receives the carry signal Carry output by the Wallace tree group unit 2121 through the carry signal input port, receives the sum-bit signal Sum output by the Wallace tree group unit 2121 through the sum-bit signal input port, accumulates Carry and Sum, and outputs the resulting multiplication result through the result output port.
It should be noted that, during a multiplication, the multiplication circuit 21 may use adders of different bit widths to add the carry output signal Carry and the sum-bit output signal Sum produced by the Wallace tree group unit 2121, where the bit width the adder can process may equal 2 times the bit width N of the data the multiplier is currently processing. Optionally, each Wallace tree subunit in the Wallace tree group unit 2121 outputs one carry output signal Carry_i and one sum-bit output signal Sum_i (i = 0, ..., 2N-1, where i is the index of each Wallace tree subunit, numbered from 0). Optionally, the adder receives Carry = {[Carry_0 : Carry_(2N-2)], 0}. That is, the carry output signal Carry received by the adder is 2N bits wide: its first 2N-1 bits are the carry output signals of the first 2N-1 Wallace tree subunits in the Wallace tree group unit 2121, and its last bit may be replaced by 0. Optionally, the sum-bit output signal Sum received by the adder is 2N bits wide, and its bits may equal the sum-bit output signals of the individual Wallace tree subunits in the Wallace tree group unit 2121.
For example, if the multiplication circuit 21 is currently processing an 8-bit by 8-bit multiplication, the adder may be a 16-bit carry-lookahead adder. Continuing with Fig. 4, the Wallace tree group unit 2121 outputs the sum-bit output signals Sum and the carry output signals Carry of the 16 Wallace tree subunits. The sum-bit signal received by the 16-bit carry-lookahead adder is the complete sum-bit signal Sum output by the Wallace tree group unit 2121, whereas the carry signal it receives is the signal Carry formed by combining 0 with all carry output signals of the Wallace tree group unit 2121 except the one output by the last Wallace tree subunit.
The multiplier provided in this embodiment accumulates the two signals output by the Wallace tree group unit with the accumulation unit, outputs the multiplication result, and obtains the target operation result from the multiplication result. This keeps the number of valid partial products small and reduces the complexity of the multiplication. The multiplier can also improve the efficiency of the multiplication and effectively reduce its own power consumption.
In a multiplier provided by another embodiment, the multiplier includes the first conversion sub-circuit 221 and the second conversion sub-circuit 222. The first conversion sub-circuit 221 converts the multiplication result into a target operation result of floating-point type, and the second conversion sub-circuit 222 converts the multiplication result into a target operation result of fixed-point type.
Specifically, the bit width of the multiplication result may equal 2 times the bit width of the data the multiplier receives; the bit widths of the floating-point operation result and of the fixed-point operation result may equal the bit width of the multiplier output port; and, within the conversion circuit 22, the bit width of the floating-point operation result may equal that of the fixed-point operation result.
It should be noted that the first conversion sub-circuit 221 and the second conversion sub-circuit 222 in the conversion circuit 22 are not connected to each other and are mutually independent. For each multiplication, the conversion circuit 22 uses only one of them to convert the data and obtain the target operation result. Optionally, the conversion circuit 22 determines from the received data conversion signal whether the current multiplication is to be converted by the first conversion sub-circuit 221 or by the second conversion sub-circuit 222.
Optionally, the data conversion signal may be one of two signals, represented by the binary values 00 and 01. The signal 00 may indicate that the data received by the conversion circuit 22 is a 2N-bit fixed-point number that needs to be converted into an N-bit fixed-point number, together with the position of the decimal point of the converted fixed-point number, where the position of the decimal point of the 2N-bit fixed-point number is known before conversion. The signal 01 may indicate that the multiplication result received by the conversion circuit 22 is a 2N-bit fixed-point number that needs to be converted into an N-bit floating-point number. Optionally, according to which of the two data conversion signals it receives, the conversion circuit 22 applies a different conversion to the received multiplication result through the first conversion sub-circuit 221 or the second conversion sub-circuit 222, as follows:
(1) If the data conversion signal received by the conversion circuit 22 is 00, the conversion circuit 22 converts the 2N-bit fixed-point number into an N-bit fixed-point number, using the second conversion sub-circuit 222. Specifically, during conversion the position of the decimal point of the target N-bit fixed-point number is aligned with the position of the decimal point of the 2N-bit fixed-point number before conversion, and a total of N bits around the decimal point of the 2N-bit fixed-point number are then extracted to give the converted N-bit fixed-point number. The extraction falls into three cases:
Case a: when all N bits to be extracted lie within the 2N-bit fixed-point number before conversion, the second conversion sub-circuit 222 can directly extract the N bits around the decimal point of the 2N-bit fixed-point number;
Case b: when only part of the N bits to be extracted lies within the 2N-bit fixed-point number before conversion, and the high-order part of the N bits has no corresponding bits in the 2N-bit fixed-point number, the second conversion sub-circuit 222 can pad each of the missing bits with the sign bit of the 2N-bit fixed-point number and then extract the N bits from the padded fixed-point number;
Case c: when only part of the N bits to be extracted lies within the 2N-bit fixed-point number before conversion, and the low-order part of the N bits has no corresponding bits in the 2N-bit fixed-point number, the second conversion sub-circuit 222 can pad each of the missing bits according to the sign of the 2N-bit fixed-point number, with the value 0 if the number is positive and with the value 1 otherwise, and then extract the N bits from the padded fixed-point number;
(2) If the data conversion signal received by the conversion circuit 22 is 01, the conversion circuit 22 converts the 2N-bit fixed-point number into an N-bit floating-point number, using the first conversion sub-circuit 221. Specifically, during conversion the highest bit of the fixed-point number (the sign bit) is used as the sign bit of the converted floating-point number. If the 2N-bit fixed-point number before conversion is positive, the sign bit is set aside and the remaining 2N-1 bits are searched from the highest bit toward the lowest; when the value 1 is found, the number m of bits that follow it is counted. The exponent field of the converted floating-point number may then equal m plus the exponent bias i, minus the position of the decimal point of the 2N-bit fixed-point number. If the 2N-bit fixed-point number before conversion is negative, the sign bit is set aside and the remaining 2N-1 bits are searched from the highest bit toward the lowest; when the value 0 is found, the number m of bits that follow it is counted. In addition, the high n bits of these m bits are extracted as the mantissa of the converted floating-point number: if m >= n, the n bits can be extracted directly as the mantissa; if m < n, (n-m) copies of the highest bit (the sign bit) can be appended after the 2N-bit fixed-point number.
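Cases a, b and c of conversion (1) above can be sketched in Python as a single bit-window extraction. This is a minimal sketch under stated assumptions: the function name `narrow_fixed_point` and the parameters `frac_in` and `frac_out` (the numbers of fractional bits before and after conversion) are hypothetical, and the decimal point position is modelled as a count of fractional bits. The padding rule follows the text as written, which makes the pad value in cases b and c equal to the sign bit.

```python
def narrow_fixed_point(value, n, frac_in, frac_out):
    """Extract an n-bit window from a 2n-bit two's-complement bit pattern."""
    wide = 2 * n
    value &= (1 << wide) - 1
    sign = (value >> (wide - 1)) & 1
    low = frac_in - frac_out        # input bit index that becomes output bit 0
    out = 0
    for k in range(n):
        src = low + k
        if src >= wide:             # case b: missing high bits, pad with the sign bit
            bit = sign
        elif src < 0:               # case c: missing low bits, 0 if positive, else 1
            bit = sign
        else:                       # case a: the bit exists and is copied
            bit = (value >> src) & 1
        out |= bit << k
    return out


print(hex(narrow_fixed_point(0x0123, 8, frac_in=8, frac_out=4)))   # case a
print(hex(narrow_fixed_point(0x8400, 8, frac_in=12, frac_out=2)))  # case b
print(hex(narrow_fixed_point(0x0005, 8, frac_in=2, frac_out=4)))   # case c
```

Aligning the decimal points fixes `low`; everything else is a per-bit choice between copying and padding.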
For example, to convert the 2N-bit fixed-point number into a 16-bit floating-point number, i may equal 16 and n may equal 10; to convert it into a 32-bit floating-point number, i may equal 127 and n may equal 23; to convert it into a 64-bit floating-point number, i may equal 1023 and n may equal 52.
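The positive-number branch of conversion (2) can be sketched in Python for the 32-bit case (i = 127, n = 23). This is a minimal sketch under stated assumptions: the function names are hypothetical, the decimal point position is modelled as a count of fractional bits, low-order bits beyond the mantissa are truncated, exponent overflow and underflow are not checked, and the negative-number branch is not reproduced.

```python
import struct


def positive_fixed_to_float32_bits(value, wide, frac_bits, man_bits=23, bias=127):
    """Build the float32 bit pattern of a positive `wide`-bit fixed-point number."""
    value &= (1 << (wide - 1)) - 1          # set the sign bit aside (assumed 0 here)
    if value == 0:
        return 0
    m = value.bit_length() - 1              # number of bits after the leading 1
    exponent = m + bias - frac_bits         # m + i - decimal point position
    tail = value & ((1 << m) - 1)           # the m bits after the leading 1
    if m >= man_bits:
        mantissa = tail >> (m - man_bits)   # keep the high n bits
    else:
        mantissa = tail << (man_bits - m)   # pad with the sign bit, which is 0 here
    return (exponent << man_bits) | mantissa


def bits_to_float32(bits):
    """Reinterpret a 32-bit pattern as a float, for checking only."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


bits = positive_fixed_to_float32_bits(0x0123, wide=16, frac_bits=8)
print(hex(bits), bits_to_float32(bits))
```

The input 0x0123 with 8 fractional bits is 291/256, which float32 represents exactly, so the check needs no rounding tolerance.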
The multiplier provided in this embodiment uses the conversion circuit to convert the multiplication result into data whose bit width equals that of the multiplier output port, and then outputs the target operation result. The bit width of the target operation result can therefore be less than 2 times the bit width of the data input to the multiplier, which effectively relaxes the bit width required of the input and output ports. In addition, the multiplier obtains fewer valid partial products, which reduces the complexity of the multiplication.
Fig. 5 is a flow diagram of the data processing method provided by an embodiment. The method can be carried out by the multiplier shown in Fig. 1, and this embodiment concerns the process of performing an operation on data. As shown in Fig. 5, the method includes:
S101, receive the data to be processed.
Specifically, the canonical signed digit (CSD) encoding sub-circuit in the multiplier may receive two data items to be processed. Optionally, the CSD encoding sub-circuit processes two data items of a fixed bit width, which may equal the bit width of the multiplier input port. Optionally, the data to be processed received by the CSD encoding sub-circuit may be fixed-point numbers whose bit width equals the bit width of the multiplier input port.
S102, perform canonical signed digit (CSD) encoding on the data to be processed to obtain the partial products of the target encoding.
Specifically, the CSD encoding can be described as follows. An N-bit multiplier is processed from its low-order bits toward its high-order bits. If there is a run of l (l >= 2) consecutive bits of value 1, the l consecutive 1 bits are converted into the data "1 (0)_(l-1) (-1)", that is, a 1 followed by l-1 zeros followed by -1, and the remaining (N-l) bits are combined with the (l+1) converted digits to give new data. The new data is then used as the input of the next conversion stage, and this is repeated until the new data obtained after conversion contains no run of l (l >= 2) consecutive bits of value 1. Applying CSD encoding to an N-bit multiplier gives a target encoding whose width may equal (N+1). It should be noted that the number of partial products of the target encoding may equal the bit width N of the data the multiplier receives plus 1.
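The run-replacement rule above can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `csd_encode` is hypothetical, the N-bit multiplier is treated as an unsigned bit pattern, and the repeated conversion stages of the text are folded into a single low-to-high scan that rescans from the position where a new 1 is written.

```python
def csd_encode(multiplier, n):
    """Return n+1 signed digits (least significant first), each -1, 0 or 1."""
    digits = [(multiplier >> k) & 1 for k in range(n)] + [0]
    i = 0
    while i < n:
        if digits[i] != 1:
            i += 1
            continue
        j = i
        while digits[j] == 1:          # find the end of the run of 1s
            j += 1                     # digits[n] is still 0 here, so j <= n
        if j - i >= 2:                 # run of l >= 2 ones becomes 1 (0)^(l-1) (-1)
            digits[i] = -1
            for k in range(i + 1, j):
                digits[k] = 0
            digits[j] = 1              # may start a new run, so rescan from j
        i = j
    return digits


digits = csd_encode(0b01110111, 8)
print(digits)
print(sum(d << k for k, d in enumerate(digits)))  # value represented by the digits
```

Each digit later selects -X, 0 or X as a partial product, so fewer non-zero digits mean fewer valid partial products.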
S103, accumulate the partial products of the target encoding to obtain the multiplication result.
Specifically, the accumulation sub-circuit may accumulate the values in each column of the partial products of all target encodings to obtain the multiplication result. Optionally, the bit width of the multiplication result may equal 2 times the bit width of the data the multiplier receives, or 2 times the bit width of the multiplier input port.
S104, obtain the storage indication signal and the read indication signal.
Specifically, the multiplier may obtain the storage indication signal and the read indication signal automatically through the state control circuit.
S105, store multiple multiplication results in different register sub-circuits according to the storage indication signal.
Specifically, the state control circuit in the multiplier may pass the storage indication signal it obtains to the register control circuit, and the register control circuit determines from the received storage indication signal which register sub-circuit the result of the current multiplication is stored in.
It should be noted that a register sub-circuit can store at most one multiplication result, and that some of the register sub-circuits may be idle.
S106, according to the read indication signal, read part of the data of the corresponding multiplication results stored in the different register sub-circuits to obtain the target operation result.
Specifically, the selection circuit in the multiplier may, according to the received read indication signal, read part of the data of the multiplication result stored in the corresponding register sub-circuit as the target operation result. Optionally, a single read result is not by itself the target operation result; the target operation result of the multiplication may be obtained by splicing the results of two reads, or of more than two reads. In other words, the bit width of the partial data may equal 1/2 of the bit width of the multiplication result or be less than 1/2 of it. Optionally, the bit width of the target operation result may be less than or equal to the bit width of the multiplier input port.
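Reading a stored result in pieces and splicing the pieces back together can be sketched in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and it assumes that the read indication signal selects a slice by index with index 0 denoting the lowest-order slice, which the text does not specify.

```python
def read_part(result, read_indicator, n_out):
    """Return the n_out-bit slice of a stored result chosen by the read indicator."""
    return (result >> (read_indicator * n_out)) & ((1 << n_out) - 1)


def splice(parts, n_out):
    """Rebuild the full result from slices listed lowest-order first."""
    full = 0
    for index, part in enumerate(parts):
        full |= part << (index * n_out)
    return full


stored = 0xBEEF                                      # a 16-bit multiplication result
halves = [read_part(stored, k, 8) for k in range(2)]  # two 8-bit reads
print([hex(h) for h in halves], hex(splice(halves, 8)))
```

With an output port half as wide again (4 bits), the same result takes four reads, which matches the 1/2-or-less slice width described above.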
In the data processing method provided in this embodiment, CSD encoding is applied to the received data to obtain the partial products of the target encoding, the partial products are accumulated to obtain the multiplication result, and the high-order data and the low-order data of the multiplication result are read separately as the target operation result. The bit width of the target operation result can therefore be less than 2 times the bit width of the data input to the multiplier, which effectively relaxes the bit width required of the input and output ports. The method also uses the CSD encoding circuit to encode the received data, which reduces the number of valid partial products obtained during the multiplication and thus its complexity, and it improves the efficiency of the multiplication.
In one embodiment, the step in S102 of performing CSD encoding on the data to be processed to obtain the partial products of the target encoding may include:
S1021, perform CSD encoding on the data to be processed to obtain the original partial products.
Optionally, the step in S1021 of performing CSD encoding on the data to be processed to obtain the original partial products may include:
S1021a, perform CSD encoding on the data to be processed to obtain the target encoding.
Specifically, the multiplier may use the CSD encoding unit to encode the received multiplier operand and obtain the target encoding. The bit width of the target encoding may equal the bit width N of the multiplier operand plus 1.
Optionally, the step in S1021a of performing CSD encoding on the data to be processed to obtain the target encoding may include: converting l consecutive bits of value 1 in the data to be processed into (l+1) digits whose highest digit is 1, whose lowest digit is -1, and whose remaining digits are 0, to obtain the target encoding, where l is greater than or equal to 2.
It should be noted that the method for above-mentioned canonical signed number coded treatment can characterize in the following manner: for N
For the multiplier of position, handled from low level numerical value to high-order numerical value, it, then can be by continuous n if it exists when continuous l (l >=2) bit value 1
Bit value 1 is converted to data " 1 (0)l-1(- 1) ", and remaining is corresponded into position (l+1) after (N-l) bit value and conversion
Numerical value is combined to obtain a new data;Then using the new data as the primary data of next stage conversion process, until
There is no until continuous l (l >=2) bit value 1 in the new data obtained after conversion process;Wherein, canonical is carried out to N multipliers
The bit wide of signed number coded treatment, obtained target code can be equal to (N+1).
S1022b, conversion process is carried out according to the pending data and the target code, obtains the initial protion
Product.
It should be noted that the number of above-mentioned initial protion product can be equal to the bit wide of target code.
For example, if the partial product acquisition unit receives an 8-bit multiplicand "x7x6x5x4x3x2x1x0" (that is, X), it can obtain the corresponding original partial products directly from the multiplicand X and the three possible digit values -1, 0 and 1 of the target encoding: when a digit of the target encoding is -1, the original partial product may be -X; when the digit is 0, the original partial product may be 0; when the digit is 1, the original partial product may be X. Optionally, the conversion processing can be described as converting the digits of the target encoding into original partial products based on the multiplicand of the multiplication.
S1022, performing sign extension processing on the original partial product to obtain the partial product of the target code.
Optionally, the step in S1022 of performing sign extension processing on the original partial product to obtain the partial product of the target code may specifically include: performing bit-padding processing on the original partial product to obtain the partial product of the target code.
Specifically, the bit width of the sign-extended partial product may be equal to 2 times the data bit width N currently processed by the multiplier, the bit width of the original partial product may be equal to N, and the number of sign extension bits may be equal to N. Optionally, the sign extension processing can be understood as padding the sign extension bits with the value of the sign bit of the original partial product: the padding value may be the sign bit value of the original partial product, which may be the value of its highest bit, so that a sign-extended partial product with a bit width of 2N bits is obtained. Optionally, the number of padded bits may be equal to N. Optionally, in the arrangement of all sign-extended partial products, the highest bits of all sign-extended partial products may be located in the same column, the lowest bits may also be located in the same column, and the other corresponding bits may likewise be located in the same columns.
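The bit-padding described above can be illustrated with the following minimal Python sketch, which extends an N-bit original partial product to 2N bits by copying its highest bit into the N added bits. The function name `sign_extend` and its parameters are hypothetical and serve only this illustration.

```python
def sign_extend(partial_product: int, n: int) -> int:
    """Extend an N-bit pattern to 2N bits by replicating its highest bit."""
    mask_n = (1 << n) - 1
    partial_product &= mask_n                     # keep only the original N bits
    sign_bit = (partial_product >> (n - 1)) & 1   # highest bit of the N-bit value
    if sign_bit:
        partial_product |= mask_n << n            # fill the N extension bits with 1s
    return partial_product                        # extension bits stay 0 otherwise


print(format(sign_extend(0xFB, 8), "016b"))  # negative pattern: upper 8 bits become 1
print(format(sign_extend(0x05, 8), "016b"))  # non-negative pattern: upper 8 bits stay 0
```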
In the data processing method provided in this embodiment, canonical signed digit encoding is performed on the data to be processed to obtain original partial products; sign extension processing is performed on the original partial products to obtain the partial products of the target code; the partial products of the target code are accumulated to obtain a multiplication result; and the high-order data and the low-order data of the multiplication result are then read separately as the target operation result. The bit width of the target operation result obtained can therefore be less than 2 times the data bit width input to the multiplier, which effectively reduces the multiplier's requirement on the bit width of the input/output ports. Meanwhile, the method obtains fewer valid partial products, which reduces the complexity of the multiplication, and it can improve the operation efficiency of the multiplication.
In the data processing method provided by another embodiment, the step in S105 of storing the multiple multiplication results into different register sub-circuits according to the storage indication signal may specifically include:
S1051, storing the first multiplication result corresponding to the first storage indication signal into the first register sub-circuit.
Specifically, the number of storage indication signals may be equal to the number of multiplications performed by the multiplier: each time the multiplier performs one multiplication, one multiplication result is obtained, and the state control circuit obtains one corresponding storage indication signal. If the multiplier performs the first multiplication and obtains the first multiplication result, the state control circuit automatically obtains the first storage indication signal; the register control circuit determines, according to the first storage indication signal input by the state control circuit, the first register sub-circuit in which the first multiplication result is to be stored, and inputs the first multiplication result to the first register sub-circuit for storage.
S1052, storing the second multiplication result corresponding to the second storage indication signal into the second register sub-circuit.
It should be noted that if the multiplier performs the second multiplication and obtains the second multiplication result, the state control circuit automatically obtains the second storage indication signal; the register control circuit determines, according to the second storage indication signal input by the state control circuit, the second register sub-circuit in which the second multiplication result is to be stored, and inputs the second multiplication result to the second register sub-circuit for storage. By analogy, the multiplier can store the multiplication result obtained by each multiplication into a different register sub-circuit, and can store the multiplication results in the numbering order of the register sub-circuits; that is, the results of two consecutive multiplications can be stored in two adjacent register sub-circuits.
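The storage behaviour described in S1051 and S1052 can be modelled, purely for illustration, by the Python sketch below. The class name `RegisterBank`, the use of a running index as the storage indication signal, and the wrap-around to the first sub-circuit once all sub-circuits have been used are assumptions of this sketch; the embodiments only state that consecutive results go to adjacent register sub-circuits.

```python
class RegisterBank:
    """Toy model of a register circuit made of several register sub-circuits."""

    def __init__(self, count: int) -> None:
        self.slots = [None] * count   # one entry per register sub-circuit
        self.next_index = 0           # stands in for the storage indication signal

    def store(self, result: int) -> int:
        """Store one multiplication result and return the sub-circuit index used."""
        index = self.next_index
        self.slots[index] = result
        # Assumption: reuse sub-circuits cyclically after the last one.
        self.next_index = (index + 1) % len(self.slots)
        return index


bank = RegisterBank(count=2)
first = bank.store(0x1111)    # first multiplication result
second = bank.store(0x2222)   # second multiplication result, adjacent sub-circuit
print(first, second, bank.slots)
```

Keeping each result in its own sub-circuit is what prevents a later multiplication result from overwriting one that has not yet been fully read out.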
In the data processing method provided in this embodiment, the first multiplication result corresponding to the first storage indication signal is stored in the first register sub-circuit, and the second multiplication result corresponding to the second storage indication signal is stored in the second register sub-circuit, which avoids one multiplication result overwriting another. In addition, the method allows the bit width of the target operation result obtained to be less than 2 times the data bit width input to the multiplier, which effectively reduces the multiplier's requirement on the bit width of the input/output ports; meanwhile, the method obtains fewer valid partial products, which reduces the complexity of the multiplication.
As one embodiment, the step in S106 of reading, according to the read indication signal, part of the data of the corresponding multiplication result stored in different register sub-circuits to obtain the target operation result may specifically be implemented as follows:
S1061, according to the first read indication signal, reading the first part of the data of the first multiplication result stored in the first register sub-circuit to obtain the first operation result.
S1062, according to the second read indication signal, reading the second part of the data of the first multiplication result stored in the first register sub-circuit to obtain the second operation result.
Specifically, the number of read indication signals obtained by the state control circuit in the multiplier may be equal to the number of times the multiplier reads operation results, which is equivalent to 2 times the number of multiplication results. Optionally, a multiplication result may include two parts of data, namely the first part of the data and the second part of the data. Illustratively, if the bit width of the multiplication result is equal to 2N, the multiplication result can be divided into two parts, the high N bits and the low N bits, where the first part of the data may be the high N bits or the low N bits, and the second part of the data is then correspondingly the low N bits or the high N bits.
S1063, according to the third read indication signal, reading the first part of the data of the second multiplication result stored in the second register sub-circuit to obtain the third operation result.
Optionally, each read indication signal may correspond to the first part of the data or the second part of the data of a multiplication result.
S1064, according to the fourth read indication signal, reading the second part of the data of the second multiplication result stored in the second register sub-circuit to obtain the fourth operation result.
Specifically, the multiplier can perform multiplication on multiple groups of data to be processed and obtain multiple multiplication results; therefore, after reading the fourth operation result, the multiplier can read part of the data of the next multiplication result according to the next read indication signal.
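Steps S1061 to S1064 amount to reading a 2N-bit multiplication result in two N-bit pieces. A minimal Python sketch of that split, and of splicing the two pieces back together, is given below; the function names `split_result` and `splice` are hypothetical, and the choice of the high N bits as the first part is only one of the two options the embodiment allows.

```python
def split_result(result: int, n: int) -> tuple[int, int]:
    """Split a 2N-bit multiplication result into (high N bits, low N bits)."""
    mask = (1 << n) - 1
    high = (result >> n) & mask   # first part of the data (assumed: high N bits)
    low = result & mask           # second part of the data (assumed: low N bits)
    return high, low


def splice(high: int, low: int, n: int) -> int:
    """Rebuild the 2N-bit value from its two N-bit parts."""
    return (high << n) | low


m0 = 0x0123456789ABCDEF            # a 64-bit result, so N = 32
high, low = split_result(m0, 32)
print(hex(high), hex(low), hex(splice(high, low, 32)))
```

Because each read transfers only N bits, the output port never has to carry the full 2N-bit result in a single cycle.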
Illustratively, suppose the bit width of the input port of the multiplier is 32 bits, the bit width of the output port is 64/t+deta bits (in general, the multiplier can complete one multiplication in t clock cycles and obtain one multiplication result, with t > 1 and deta >= 0), the bit width of the data received by the multiplier is also 32 bits, and the multiplier needs to perform multiplication on multiple groups of data to be processed. In this case, the register circuit 13 includes (64/(64/t+deta)) register sub-circuits 131 (i.e. register sub-circuits A1, A2, ..., Ai, where i may be equal to (64/(64/t+deta))), and the process of obtaining the target operation result may be as follows:
If the multiplier obtains the first multiplication result M_0 after t clock cycles (t may be greater than or equal to 0), the register control circuit can store M_0 (64 bits wide) into register sub-circuit A1 according to the first storage indication signal; at this point, the selection circuit can read the high 32 bits of M_0 from register sub-circuit A1 according to the first read indication signal, as the first operation result obtained by the first multiplication;
Meanwhile, when the multiplier reaches the (t+1)-th clock cycle, the selection circuit can read the low 32 bits of M_0 from register sub-circuit A1 according to the second read indication signal, as the second operation result obtained by the first multiplication; in this embodiment, the multiplier splices the first operation result and the second operation result to obtain the target operation result of the data to be processed;
If the multiplier obtains the second multiplication result M_1 at the 2t-th clock cycle, the register control circuit can store M_1 into register sub-circuit A2 according to the second storage indication signal; at this point, the selection circuit can read the high 32 bits of M_1 from register sub-circuit A2 according to the third read indication signal, as the third operation result obtained by the second multiplication;
Meanwhile, when the multiplier reaches the (2t+1)-th clock cycle, the selection circuit can read the low 32 bits of M_1 from register sub-circuit A2 according to the fourth read indication signal, as the fourth operation result obtained by the second multiplication; in this embodiment, the third operation result is merged with the fourth operation result to obtain the target operation result of the data to be processed;
By analogy, the multiplication results obtained can be stored into different corresponding register sub-circuits according to different storage indication signals, and part of the data of the multiplication results stored in different register sub-circuits can be read according to different read indication signals to obtain the target operation result.
In addition, if one group of data among the multiple groups of data to be processed contains a zero value, the multiplier can obtain the multiplication result corresponding to that group after m (m < t) clock cycles. The multiplier can store that multiplication result into the corresponding register sub-circuit according to the storage indication signal; in the current clock cycle, the multiplier can read part of the data of the multiplication results stored in different register sub-circuits according to the read indication signal, and in the next clock cycle the multiplier can output the remaining data of the multiplication result. If the next group of data to be processed also contains a zero value and one multiplication can be completed in 1 clock cycle to obtain a multiplication result, the multiplier can store that multiplication result into the adjacent next register sub-circuit.
In the data processing method provided in this embodiment, the multiplier reads, according to the read indication signal, part of the data of the corresponding multiplication result stored in different register sub-circuits to obtain the target operation result. The method can read the high-order data and the low-order data of the multiplication result separately as the target operation result, so that the bit width of the target operation result obtained can be less than 2 times the data bit width input to the multiplier, which effectively reduces the multiplier's requirement on the bit width of the input/output ports; meanwhile, the method obtains fewer valid partial products, which reduces the complexity of the multiplication.
Fig. 6 is a flow diagram of the data processing method provided by one embodiment. The method can be carried out by the multiplier shown in Fig. 2, and this embodiment relates to the process of performing multiplication on data. As shown in Fig. 6, the method includes:
S201, receiving a data conversion signal and data to be processed.
Specifically, the multiplication circuit in the multiplier can receive two pieces of data to be processed and a data conversion signal. Optionally, the bit width of the data to be processed may be equal to the bit width of the input port of the multiplier. Optionally, if the conversion circuit receives different data conversion signals, the conversion circuit can convert the received data into data of the format corresponding to the data conversion signal.
S202, performing canonical signed digit encoding on the data to be processed to obtain the partial products of the target code.
Specifically, the principle of the canonical signed digit encoding can be characterized as follows: for an N-bit multiplier operand, the values are processed from the low-order bits to the high-order bits; if a run of l consecutive 1s (l >= 2) exists, the l consecutive 1s can be converted into the data "1(0)^(l-1)(-1)", that is, a 1 followed by l-1 zeros followed by -1, and the remaining N-l bits are combined with the converted l+1 digits to obtain new data; the new data then serves as the initial data of the next stage of conversion, and the conversion is repeated until the new data obtained contains no run of l (l >= 2) consecutive 1s. The bit width of the target code obtained by performing canonical signed digit encoding on an N-bit multiplier operand may be equal to N+1. It should be noted that the number of partial products of the target code may be equal to the data bit width N received by the multiplier plus 1.
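As a non-limiting illustration, the run-replacement principle described above can be sketched in Python as follows. The function name `csd_encode`, the least-significant-digit-first list layout, and the treatment of the input as an unsigned N-bit value are assumptions made for this sketch and are not taken from the embodiments.

```python
def csd_encode(value: int, n: int) -> list[int]:
    """Encode an unsigned N-bit value as N+1 digits in {-1, 0, 1}, LSB first.

    Every run of l >= 2 consecutive 1s is replaced by 1, l-1 zeros, -1,
    scanning from the low-order end towards the high-order end.
    """
    if not 0 <= value < (1 << n):
        raise ValueError("value must fit in n bits")
    digits = [(value >> i) & 1 for i in range(n)] + [0]  # one extra high digit
    i = 0
    while i < n:
        if digits[i] != 1:
            i += 1
            continue
        j = i
        while j < n and digits[j] == 1:   # find the end of the run of 1s
            j += 1
        if j - i >= 2:
            digits[i] = -1                # lowest position of the run becomes -1
            for k in range(i + 1, j):
                digits[k] = 0             # the rest of the run becomes 0
            digits[j] = 1                 # carry a 1 into the position above the run
        i = j                             # position j may start a new run
    return digits


code = csd_encode(0b101101, 6)            # 45
print(code, sum(d * (1 << i) for i, d in enumerate(code)))
```

The returned list always has N+1 entries, matching the statement that the target code may be N+1 digits wide, and the weighted sum of the digits equals the original value.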
S203, accumulating the partial products of the target code to obtain the multiplication result.
Specifically, the accumulation sub-circuit can accumulate the values in each column of the partial products of all target codes to obtain the multiplication result. Optionally, the bit width of the multiplication result may be equal to 2 times the bit width of the data received by the multiplier (i.e. the data to be processed), and may also be equal to 2 times the bit width of the input port of the multiplier.
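To make the accumulation step concrete, the following Python sketch sums the partial products selected by a list of target-code digits and keeps the result within 2N bits. The function name `accumulate_partial_products` is hypothetical, and weighting each partial product by the position of its digit (a conventional shift-and-add arrangement) is an assumption of this sketch rather than a description of the accumulation sub-circuit, which the embodiments describe in terms of column-wise accumulation.

```python
def accumulate_partial_products(digits: list[int], x: int, n: int) -> int:
    """Sum digit * x * 2**position over all digits, modulo 2**(2N).

    digits: target-code digits in {-1, 0, 1}, least significant first.
    x: the multiplicand; n: the operand bit width N.
    """
    mask = (1 << (2 * n)) - 1           # the result is kept to 2N bits
    total = 0
    for position, digit in enumerate(digits):
        partial = (digit * x) << position   # -X, 0 or X, weighted by position
        total = (total + partial) & mask    # wrap-around keeps two's-complement form
    return total


# Digits of 45 in signed-digit form, least significant first: 1 - 4 - 16 + 64
digits_45 = [1, 0, -1, 0, -1, 0, 1]
print(accumulate_partial_products(digits_45, 7, 6))   # 45 * 7
```

Only the non-zero digits contribute a partial product, which is why a signed-digit code with few non-zero digits reduces the amount of accumulation work.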
S204, performing conversion processing on the multiplication result according to the data conversion signal to obtain the target operation result, where the data conversion signal indicates the data type into which the multiplier needs to convert the result to meet the requirement of the target operation result.
Specifically, the conversion circuit determines, according to the data conversion signal received, whether the multiplication result is to be converted into an operation result of the fixed-point type or an operation result of the floating-point type. Illustratively, suppose the conversion circuit can receive two kinds of data conversion signals, denoted 00 and 01, and the bit widths of the input port and the output port of the multiplier are both N bits; then 00 indicates that the conversion circuit can convert the received 2N-bit multiplication result into an N-bit fixed-point operation result, and 01 indicates that the conversion circuit can convert the received 2N-bit multiplication result into an N-bit floating-point operation result, where the function that the conversion circuit implements for each data conversion signal can be set flexibly. Optionally, each data conversion signal can indicate one required data type into which the multiplier needs to convert the multiplication result.
In the data processing method provided in this embodiment, a data conversion signal and data to be processed are received, multiplication is performed on the data to be processed to obtain a multiplication result, and conversion processing is performed on the multiplication result according to the data conversion signal to obtain the target operation result. The method allows the bit width of the target operation result obtained to be less than 2 times the data bit width input to the multiplier, which effectively reduces the multiplier's requirement on the bit width of the input/output ports; meanwhile, the method obtains fewer valid partial products, which reduces the complexity of the multiplication.
An embodiment of the present application also provides a machine learning operation device, which includes one or more of the multipliers mentioned in this application and is used to obtain data to be operated on and control information from other processing devices, execute specified machine learning operations, and pass the execution results to peripheral devices through an I/O interface. Peripheral devices include, for example, cameras, displays, mice, keyboards, network cards, wifi interfaces and servers. When more than one multiplier is included, the multipliers can be linked and transmit data through a specific structure, for example interconnected through a Peripheral Component Interconnect Express (PCIe) bus, to support larger-scale machine learning operations. In this case, the multipliers may share the same control system or have their own independent control systems; they may share memory, or each accelerator may have its own memory. In addition, the interconnection may use any interconnection topology.
The machine learning operation device has high compatibility and can be connected to various types of servers through a PCIe interface.
An embodiment of the present application also provides a combined processing device, which includes the above machine learning operation device, a general interconnection interface, and other processing devices. The machine learning operation device interacts with the other processing devices to jointly complete the operation specified by the user. Fig. 7 is a schematic diagram of the combined processing device.
The other processing devices include one or more types of general-purpose/special-purpose processors such as a central processing unit (CPU), a graphics processing unit (GPU) and a neural network processor. The number of processors included in the other processing devices is not limited. The other processing devices serve as the interface between the machine learning operation device and external data and control, including data transfer, and perform basic control of the machine learning operation device such as starting and stopping it; the other processing devices can also cooperate with the machine learning operation device to jointly complete operation tasks.
The general interconnection interface is used to transmit data and control instructions between the machine learning operation device and the other processing devices. The machine learning operation device obtains the required input data from the other processing devices and writes it to the on-chip storage device of the machine learning operation device; it can obtain control instructions from the other processing devices and write them to the on-chip control cache of the machine learning operation device; it can also read the data in the storage module of the machine learning operation device and transmit it to the other processing devices.
Optionally, as shown in Fig. 8, the structure may further include a storage device, which is connected to the machine learning operation device and to the other processing devices respectively. The storage device is used to store data of the machine learning operation device and of the other processing devices, and is particularly suitable for data required for operation that cannot be held entirely in the internal storage of the machine learning operation device or of the other processing devices.
The combined processing device can serve as the system on chip (SoC) of devices such as mobile phones, robots, unmanned aerial vehicles and video surveillance equipment, which effectively reduces the die area of the control part, increases the processing speed and reduces the overall power consumption. In this case, the general interconnection interface of the combined processing device is connected to certain components of the device, for example cameras, displays, mice, keyboards, network cards and wifi interfaces.
In some embodiments, a chip is also claimed, which includes the above machine learning operation device or combined processing device.
In some embodiments, a chip package structure is claimed, which includes the above chip.
In some embodiments, a board card is claimed, which includes the above chip package structure. As shown in Fig. 9, Fig. 9 provides a board card which, in addition to the above chip 389, may also include other supporting components, including but not limited to: a storage device 390, an interface device 391 and a control device 392;
The storage device 390 is connected to the chip in the chip package structure through a bus and is used to store data. The storage device may include multiple groups of storage units 393. Each group of storage units is connected to the chip through a bus. It can be understood that each group of storage units may be DDR SDRAM (Double Data Rate SDRAM, double data rate synchronous dynamic random access memory).
DDR can double the speed of SDRAM without increasing the clock frequency. DDR allows data to be read on both the rising edge and the falling edge of the clock pulse, so the speed of DDR is twice that of standard SDRAM. In one embodiment, the storage device may include 4 groups of storage units. Each group of storage units may include multiple DDR4 chips. In one embodiment, the chip may internally include four 72-bit DDR4 controllers, in which 64 bits of each 72-bit DDR4 controller are used for data transmission and 8 bits are used for ECC checking. It can be understood that when DDR4-3200 chips are used in each group of storage units, the theoretical bandwidth of data transmission can reach 25600MB/s.
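The 25600MB/s figure follows from the transfer rate and the data width: DDR4-3200 performs 3200 million transfers per second, and each transfer carries the 64 data bits of the 72-bit controller (the 8 ECC bits carry no payload). The short Python sketch below reproduces the arithmetic; the function name `ddr_bandwidth_mb_s` is hypothetical.

```python
def ddr_bandwidth_mb_s(transfers_mt_s: int, data_bits: int) -> int:
    """Theoretical bandwidth in MB/s: transfers per second times bytes per transfer."""
    return transfers_mt_s * data_bits // 8


# DDR4-3200 with a 64-bit data path (ECC bits excluded)
print(ddr_bandwidth_mb_s(3200, 64))
```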
In one embodiment, each group of storage units includes multiple double data rate synchronous dynamic random access memories arranged in parallel. DDR can transmit data twice within one clock cycle. A controller for controlling the DDR is provided in the chip and is used to control the data transmission and data storage of each storage unit.
The interface device is electrically connected to the chip in the chip package structure. The interface device is used to implement data transmission between the chip and an external device (such as a server or a computer). For example, in one embodiment, the interface device may be a standard PCIe interface: the data to be processed is transferred by the server to the chip through the standard PCIe interface, thereby implementing the data transfer. Preferably, when a PCIe 3.0 x16 interface is used for transmission, the theoretical bandwidth can reach 16000MB/s. In another embodiment, the interface device may also be another interface; the present application does not limit the specific form of such other interfaces, as long as the interface unit can implement the transfer function. In addition, the calculation result of the chip is likewise sent back to the external device (such as a server) by the interface device.
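The 16000MB/s figure corresponds to the raw signalling rate of a PCIe 3.0 x16 link: 8 GT/s per lane across 16 lanes, one bit per transfer per lane. The Python sketch below reproduces that arithmetic and, as an additional note not made in the embodiments, also shows the slightly lower figure that results once the 128b/130b line encoding of PCIe 3.0 is taken into account. The function name `pcie_bandwidth_mb_s` and its parameters are hypothetical.

```python
def pcie_bandwidth_mb_s(rate_gt_s: float, lanes: int,
                        payload_bits: int = 128, line_bits: int = 130):
    """Return (raw, effective) one-direction bandwidth in MB/s.

    raw: signalling rate only; effective: after line-encoding overhead.
    """
    raw = rate_gt_s * 1000 * lanes / 8          # megabits per second to MB/s
    effective = raw * payload_bits / line_bits  # 128b/130b encoding for PCIe 3.0
    return raw, effective


raw, effective = pcie_bandwidth_mb_s(8.0, 16)
print(raw, round(effective))
```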
The control device is electrically connected to the chip. The control device is used to monitor the state of the chip. Specifically, the chip and the control device may be electrically connected through an SPI interface. The control device may include a microcontroller (Micro Controller Unit, MCU). Since the chip may include multiple processing chips, multiple processing cores or multiple processing circuits, it can drive multiple loads. Therefore, the chip may be in different working states such as multi-load and light-load. The control device can regulate the working states of the multiple processing chips, multiple processing cores and/or multiple processing circuits in the chip.
In some embodiments, an electronic device is claimed, which includes the above board card.
The electronic device may be a multiplier, a robot, a computer, a printer, a scanner, a tablet computer, an intelligent terminal, a mobile phone, a dashboard camera, a navigator, a sensor, a webcam, a server, a cloud server, a camera, a video camera, a projector, a watch, earphones, mobile storage, a wearable device, a vehicle, a household appliance, and/or a medical device.
The vehicle includes an aircraft, a ship and/or a road vehicle; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, a rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove and a range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound instrument and/or an electrocardiograph.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of circuit combinations, but those skilled in the art should understand that the present application is not limited by the described circuit combinations, because according to the present application certain circuits can be implemented in other ways or with other structures. In addition, those skilled in the art should also understand that the embodiments described in this specification are all optional embodiments, and the devices and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts that are not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (27)
1. A multiplier, characterized in that the multiplier comprises: a multiplication circuit, a register control circuit, a register circuit, a state control circuit and a selection circuit, wherein the multiplication circuit comprises a canonical signed digit encoding sub-circuit and an accumulation sub-circuit, the output of the canonical signed digit encoding sub-circuit is connected to the input of the accumulation sub-circuit, the output of the accumulation sub-circuit is connected to the first input of the register control circuit, the output of the register control circuit is connected to the input of the register circuit, the output of the register circuit is connected to the first input of the selection circuit, the first output of the state control circuit is connected to the second input of the register control circuit, and the second output of the state control circuit is connected to the second input of the selection circuit.
2. The multiplier according to claim 1, characterized in that the canonical signed digit encoding sub-circuit comprises a canonical signed digit encoding unit and a partial product acquiring unit; the canonical signed digit encoding unit is used to receive first data and perform the canonical signed digit encoding on the first data to obtain the target code; the partial product acquiring unit is used to receive second data, obtain original partial products according to the target code and the second data, and obtain the partial products of the target code according to the original partial products; the accumulation sub-circuit is used to accumulate the partial products of the target code to obtain a multiplication result; the state control circuit is used to obtain a storage indication signal and a read indication signal; the register control circuit is used to determine, according to the storage indication signal input by the state control circuit, the register circuit in which the multiplication result is to be stored; the register circuit is used to store the multiplication result; and the selection circuit is used to read, according to the read indication signal received, data of the multiplication result stored in the register circuit as the target operation result.
3. The multiplier according to claim 2, characterized in that the canonical signed digit encoding unit may comprise: a data input port and a target code output port; the data input port is used to receive the first data on which canonical signed digit encoding is to be performed, and the target code output port is used to output the target code obtained after canonical signed digit encoding is performed on the first data.
4. The multiplier according to claim 2 or 3, characterized in that the partial product acquiring unit is specifically used to perform conversion processing on the target code to obtain original partial products, perform sign extension processing on the original partial products to obtain sign-extended partial products, and obtain the partial products of the target code according to the sign-extended partial products.
5. The multiplier according to any one of claims 2 to 4, characterized in that the partial product acquiring unit comprises: a target code input port, a second data input port and a partial product output port; the target code input port is used to receive the target code, the second data input port is used to receive the second data, and the partial product output port is used to output the partial products of the target code.
6. The multiplier according to any one of claims 1 to 5, characterized in that the accumulation sub-circuit comprises: a Wallace tree group unit and an accumulation unit; wherein the output of the Wallace tree group unit is connected to the input of the accumulation unit; the Wallace tree group unit is used to accumulate the partial products of the target code to obtain an accumulation operation result, and the accumulation unit is used to accumulate the accumulation operation result.
7. The multiplier according to claim 6, characterized in that the Wallace tree group unit comprises: Wallace tree sub-units, the Wallace tree sub-units being used to accumulate the values in each column of the partial products of all target codes.
8. The multiplier according to claim 6 or 7, characterized in that the accumulation unit comprises: an adder, the adder being used to perform addition on the accumulated correction result received.
9. The multiplier according to claim 8, characterized in that the adder comprises: a carry signal input port, a sum-bit signal input port and a result output port; the carry signal input port is used to receive a carry signal, the sum-bit signal input port is used to receive a sum-bit signal, and the result output port is used to output the result of accumulating the carry signal and the sum-bit signal.
10. The multiplier according to any one of claims 1 to 9, characterized in that the register circuit comprises: register sub-circuits, the register sub-circuits being used to store the multiplication results corresponding to different storage indication signals.
11. a kind of multiplier, which is characterized in that the multiplier includes: multiplying operational circuit and revolution circuit, the multiplication
Computing circuit includes canonical signed number coding sub-circuit and cumulative sub-circuit, and the canonical signed number encodes sub-circuit
Output end is connect with the input terminal of the cumulative sub-circuit, the input of the output end of the cumulative sub-circuit and the revolution circuit
End connection, the revolution circuit include the first conversion sub-circuit and the second conversion sub-circuit;
Wherein, the canonical signed number coding sub-circuit is used to carry out canonical signed number coded treatment to the data received
Target code is obtained, and the partial product of target code is obtained according to the target code, the cumulative sub-circuit is used for described
The partial product of target code carries out accumulation process and obtains multiplication result, first conversion sub-circuit and the second conversion son electricity
Road is respectively used to carry out revolution processing to the multiplication result, obtains target operation result.
12. The multiplier according to claim 11, characterized in that the number conversion circuit includes an input port for receiving a data conversion signal; the data conversion signal is used for determining the type of data conversion performed by the number conversion circuit.
13. The multiplier according to claim 11 or 12, characterized in that the first conversion sub-circuit is specifically used for converting the multiplication result into a target operation result of floating-point type, and the second conversion sub-circuit is specifically used for converting the multiplication result into a target operation result of fixed-point type.
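The selection between floating-point and fixed-point output in claims 12 and 13 might look like the sketch below. The claims do not define the number formats, so everything concrete here is an assumption: the raw product is treated as a signed integer with 8 fractional bits, the fixed-point output is a saturating 16-bit value with 4 fractional bits, and the signal values `"float"` and `"fixed"` as well as all function names are hypothetical.

```python
FRAC_BITS = 8  # assumed number of fractional bits in the raw product


def to_floating(product: int, frac_bits: int = FRAC_BITS) -> float:
    """First conversion path: interpret the raw product as a real value."""
    return product / (1 << frac_bits)


def to_fixed(product: int, frac_bits: int = FRAC_BITS,
             out_frac_bits: int = 4, out_bits: int = 16) -> int:
    """Second conversion path: rescale to a narrower fixed-point format."""
    shift = frac_bits - out_frac_bits
    # Arithmetic shift; a right shift rounds toward negative infinity.
    scaled = product >> shift if shift >= 0 else product << -shift
    low = -(1 << (out_bits - 1))
    high = (1 << (out_bits - 1)) - 1
    return max(low, min(high, scaled))  # saturate instead of wrapping


def convert(product: int, conversion_signal: str):
    """Pick the conversion path indicated by the data conversion signal."""
    if conversion_signal == "float":
        return to_floating(product)
    if conversion_signal == "fixed":
        return to_fixed(product)
    raise ValueError(f"unknown conversion signal: {conversion_signal!r}")
```

Saturation and truncation are design choices of this sketch; a real circuit could equally round to nearest or wrap on overflow.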
14. A data processing method, characterized in that the method comprises:
Receiving data to be processed;
Performing canonical signed digit (CSD) encoding on the data to be processed to obtain partial products of a target code;
Accumulating the partial products of the target code to obtain a multiplication result;
Obtaining a storage indication signal and a read indication signal;
Storing multiple multiplication results into different register sub-circuits according to the storage indication signal;
Reading, according to the read indication signal, part of the data in the corresponding multiplication results stored in the different register sub-circuits, to obtain a target operation result.
15. The method according to claim 14, characterized in that performing canonical signed digit encoding on the data to be processed to obtain the partial products of the target code comprises:
Performing canonical signed digit encoding on the data to be processed to obtain initial partial products;
Performing sign bit extension on the initial partial products to obtain the partial products of the target code.
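Sign bit extension, as used in claim 15, widens a two's complement value without changing the number it represents. The following Python sketch shows the operation on raw bit patterns; the 4-bit and 8-bit widths in the usage lines and the helper names `sign_extend` and `to_signed` are assumptions chosen for illustration.

```python
def sign_extend(bits: int, from_width: int, to_width: int) -> int:
    """Widen a two's complement bit pattern by replicating its sign bit."""
    if to_width < from_width:
        raise ValueError("target width must not be smaller than source width")
    bits &= (1 << from_width) - 1
    if (bits >> (from_width - 1)) & 1:
        # Set every bit between from_width and to_width to 1.
        bits |= ((1 << to_width) - 1) ^ ((1 << from_width) - 1)
    return bits


def to_signed(bits: int, width: int) -> int:
    """Interpret a bit pattern of the given width as a signed integer."""
    if (bits >> (width - 1)) & 1:
        return bits - (1 << width)
    return bits


narrow = 0b1011                     # -5 as a 4-bit value
wide = sign_extend(narrow, 4, 8)    # same value as an 8-bit pattern
```

Extending every partial product to a common width lets the accumulation stage add them without special handling of negative terms.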
16. The method according to claim 15, characterized in that performing canonical signed digit encoding on the data to be processed to obtain the initial partial products comprises:
Performing canonical signed digit encoding on the data to be processed to obtain a target code;
Performing conversion processing according to the data to be processed and the target code to obtain the initial partial products.
17. The method according to claim 15 or 16, characterized in that performing sign bit extension on the initial partial products to obtain the partial products of the target code comprises: performing bit padding on the initial partial products to obtain the partial products of the target code.
18. The method according to any one of claims 14 to 17, characterized in that storing multiple multiplication results into different register sub-circuits according to the storage indication signal comprises:
Storing a first multiplication result corresponding to a first storage indication signal into a first register sub-circuit;
Storing a second multiplication result corresponding to a second storage indication signal into a second register sub-circuit.
19. The method according to any one of claims 14 to 18, characterized in that reading, according to the read indication signal, part of the data in the corresponding multiplication results stored in the different register sub-circuits to obtain the target operation result comprises:
Reading, according to a first read indication signal, a first part of the data in the first multiplication result stored in the first register sub-circuit, to obtain a first operation result;
Reading, according to a second read indication signal, a second part of the data in the first multiplication result stored in the first register sub-circuit, to obtain a second operation result;
Reading, according to a third read indication signal, a first part of the data in the second multiplication result stored in the second register sub-circuit, to obtain a third operation result;
Reading, according to a fourth read indication signal, a second part of the data in the second multiplication result stored in the second register sub-circuit, to obtain a fourth operation result.
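The store and read behaviour of claims 18 and 19 can be modelled with a short Python class. The claims do not say what the "first part" and "second part" of a result are, so this sketch assumes they are the low and high halves of a 32-bit result; the integer signal numbering (storage signals 1 and 2, read signals 1 to 4) and the class name `RegisterCircuit` are likewise hypothetical.

```python
class RegisterCircuit:
    """Two register sub-circuits, each holding one multiplication result."""

    def __init__(self, result_bits: int = 32) -> None:
        self.half_bits = result_bits // 2
        self.half_mask = (1 << self.half_bits) - 1
        self.sub_circuits = {}

    def store(self, storage_signal: int, result: int) -> None:
        """Storage signal n selects register sub-circuit n."""
        self.sub_circuits[storage_signal] = result

    def read(self, read_signal: int) -> int:
        """Read signals 1 and 2 address sub-circuit 1, signals 3 and 4 address sub-circuit 2.

        Odd signals return the first (assumed low) part, even signals the
        second (assumed high) part.
        """
        index = read_signal - 1
        sub_circuit = index // 2 + 1
        part = index % 2
        value = self.sub_circuits[sub_circuit]
        return (value >> (part * self.half_bits)) & self.half_mask


registers = RegisterCircuit()
registers.store(1, 0x12345678)
registers.store(2, 0x9ABCDEF0)
outputs = [registers.read(signal) for signal in (1, 2, 3, 4)]
```

Reading a stored result in parts lets a narrow output bus deliver a wide product over several cycles.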
20. A data processing method, characterized in that the method comprises:
Receiving a data conversion signal and data to be processed;
Performing canonical signed digit (CSD) encoding on the data to be processed to obtain partial products of a target code;
Accumulating the partial products of the target code to obtain a multiplication result;
Performing number conversion processing on the multiplication result according to the data conversion signal to obtain a target operation result, wherein the data conversion signal indicates the data type that the multiplier needs the target operation result to be converted to.
21. A machine learning operation device, characterized in that the machine learning operation device comprises one or more multipliers according to any one of claims 1 to 13, and is used for obtaining input data to be operated on and control information from other processing devices, performing a specified machine learning operation, and passing the execution result to the other processing devices through an I/O interface;
When the machine learning operation device comprises multiple multipliers, the multiple computing devices are connected through a preset specific structure and transmit data to one another;
Wherein the multiple multipliers are interconnected and transmit data through a PCIE bus to support larger-scale machine learning operations; the multiple multipliers share the same control system or have their own control systems; the multiple multipliers share memory or have their own memories; and the interconnection mode of the multiple multipliers is any interconnection topology.
22. A combined processing device, characterized in that the combined processing device comprises the machine learning operation device according to claim 21, a universal interconnection interface, and other processing devices;
The machine learning operation device interacts with the other processing devices to jointly complete a computing operation specified by a user.
23. The combined processing device according to claim 22, characterized by further comprising a storage device, the storage device being connected to the machine learning operation device and to the other processing devices respectively, and used for saving data of the machine learning operation device and the other processing devices.
24. A neural network chip, characterized in that the neural network chip comprises the machine learning operation device according to claim 21, the combined processing device according to claim 22, or the combined processing device according to claim 23.
25. An electronic device, characterized in that the electronic device comprises the chip according to claim 24.
26. A board card, characterized in that the board card comprises a memory device, a receiving device, a control device, and the neural network chip according to claim 24;
Wherein the neural network chip is connected to the memory device, the control device, and the receiving device respectively;
The memory device is used for storing data;
The receiving device is used for implementing data transmission between the chip and external equipment;
The control device is used for monitoring the state of the chip.
27. The board card according to claim 26, characterized in that
The memory device comprises multiple groups of storage units, each group of storage units being connected to the chip by a bus, and the storage units being DDR SDRAM;
The chip comprises a DDR controller for controlling data transmission to and data storage in each storage unit;
The receiving device is a standard PCIE interface.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910819020.4A CN110515589B (en) | 2019-08-30 | 2019-08-30 | Multiplier, data processing method, chip and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110515589A (en) | 2019-11-29 |
CN110515589B (en) | 2024-04-09 |
Family
ID=68629959
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910819020.4A Active CN110515589B (en) | 2019-08-30 | 2019-08-30 | Multiplier, data processing method, chip and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110515589B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1454347A (en) * | 2000-10-16 | 2003-11-05 | 诺基亚公司 | Multiplier and shift device using signed digit representation |
US20030220956A1 (en) * | 2002-05-22 | 2003-11-27 | Broadcom Corporation | Low-error canonic-signed-digit fixed-width multiplier, and method for designing same |
CN101178643A (en) * | 2006-11-09 | 2008-05-14 | 普诚科技股份有限公司 | Data conversion method and data conversion circuit capable of saving digital operation |
CN105183424A (en) * | 2015-08-21 | 2015-12-23 | 电子科技大学 | Fixed-bit-width multiplier with high accuracy and low energy consumption properties |
CN209895329U (en) * | 2019-08-30 | 2020-01-03 | 上海寒武纪信息科技有限公司 | Multiplier and method for generating a digital signal |
Non-Patent Citations (3)
Title |
---|
A. MCKEE: "Herz–Schur multipliers of dynamical systems", ADVANCES IN MATHEMATICS, vol. 331, 20 June 2018 (2018-06-20) * |
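万超 et al.: "A VLSI implementation of a high-speed digital FIR filter", Journal of Hefei University of Technology (Natural Science Edition), vol. 31, no. 5, pages 736 - 739 * |
王瑞光 et al.: "Design of a 16-bit parallel multiplier based on CSD encoding", Microcomputer Information, vol. 24, no. 23, pages 75 - 76 * |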
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111176725A (en) * | 2019-12-27 | 2020-05-19 | 北京市商汤科技开发有限公司 | Data processing method, device, equipment and storage medium |
JP2022518636A (en) * | 2019-12-27 | 2022-03-16 | 北京市商▲湯▼科技▲開▼▲發▼有限公司 | Data processing methods, equipment, equipment, systems, storage media and program products |
US11314457B2 (en) | 2019-12-27 | 2022-04-26 | Beijing Sensetime Technology Development Co., Ltd. | Data processing method for data format conversion, apparatus, device, and system, storage medium, and program product |
CN113568864A (en) * | 2020-04-29 | 2021-10-29 | 意法半导体股份有限公司 | Circuit, corresponding device, system and method |
CN112114776A (en) * | 2020-09-30 | 2020-12-22 | 合肥本源量子计算科技有限责任公司 | Quantum multiplication method and device, electronic device and storage medium |
CN112114776B (en) * | 2020-09-30 | 2023-12-15 | 本源量子计算科技(合肥)股份有限公司 | Quantum multiplication method, device, electronic device and storage medium |
WO2022199684A1 (en) * | 2021-03-26 | 2022-09-29 | 南京后摩智能科技有限公司 | Circuit based on digital domain in-memory computing |
CN113222132A (en) * | 2021-05-22 | 2021-08-06 | 上海阵量智能科技有限公司 | Multiplier, data processing method, chip, computer device and storage medium |
CN115857873A (en) * | 2023-02-07 | 2023-03-28 | 湖南三安半导体有限责任公司 | Multiplier, multiplication calculation method, processing system and storage medium |
CN115857873B (en) * | 2023-02-07 | 2023-05-09 | 兰州大学 | Multiplier, multiplication calculation method, processing system, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110515589B (en) | 2024-04-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110515589A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN109740739B (en) | Neural network computing device, neural network computing method and related products | |
CN109740754B (en) | Neural network computing device, neural network computing method and related products | |
CN111008003B (en) | Data processor, method, chip and electronic equipment | |
CN110163357A (en) | A kind of computing device and method | |
CN110362293B (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110515590A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110515587A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN111381808B (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110531954A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN111258541B (en) | Multiplier, data processing method, chip and electronic equipment | |
CN209895329U (en) | Multiplier and method for generating a digital signal | |
CN110647307B (en) | Data processor, method, chip and electronic equipment | |
CN210109863U (en) | Multiplier, device, neural network chip and electronic equipment | |
CN110515586B (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110515588B (en) | Multiplier, data processing method, chip and electronic equipment | |
CN111260070B (en) | Operation method, device and related product | |
CN209962284U (en) | Multiplier, device, chip and electronic equipment | |
CN113033788B (en) | Data processor, method, device and chip | |
CN113031909B (en) | Data processor, method, device and chip | |
CN110378477A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN111258542B (en) | Multiplier, data processing method, chip and electronic equipment | |
CN210006032U (en) | Multiplier, machine learning arithmetic device and combination processing device | |
CN110515585A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN210006082U (en) | Multiplier, device, neural network chip and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||