CN110413254A - Data processor, method, chip and electronic equipment - Google Patents
Data processor, method, chip and electronic equipment
- Publication number
- CN110413254A
- Authority
- CN
- China
- Prior art keywords
- data
- partial product
- product
- sign bits
- target code
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/50—Adding; Subtracting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application provides a data processor, a method, a chip and an electronic device. The data processor includes a first multiplication circuit, a second multiplication circuit and a partial product exchange circuit. The first multiplication circuit includes a first modified encoding sub-circuit and a first modified compression sub-circuit, and the second multiplication circuit includes a second modified encoding sub-circuit and a second modified compression sub-circuit. The data processor can perform canonical signed digit encoding on the received data, so that fewer valid partial products are obtained, which reduces the complexity of implementing multiplication or multiply-accumulate operations in the data processor.
Description
Technical field
This application relates to the field of computer technology, and in particular to a data processor, a method, a chip and an electronic device.
Background art
With the continuous development of digital electronic technology, the rapid development of all kinds of artificial intelligence (AI) chips places ever higher demands on high-performance data processors, where a data processor may be a multiplier, an adder or a multiply-accumulator. Neural network algorithms are among the algorithms most widely used in intelligent chips, and the multiply-accumulate operation performed by a multiply-accumulator is a common operation in neural network algorithms.
Currently, a data processor treats every three bits of the multiplier as one code, obtains partial products according to the multiplicand, and compresses all partial products with a Wallace tree to obtain the multiplication result or the multiply-accumulate result. However, in the conventional technique the codes contain many non-zero digits, so many corresponding valid partial products are generated, which makes the multiplication or multiply-accumulate operation in the data processor more complex.
Summary of the invention
In view of the above technical problem, it is necessary to provide a data processor, a method, a chip and an electronic device that can reduce the number of valid partial products obtained and thereby reduce computational complexity.
A data processor, comprising: a first multiplication circuit, a second multiplication circuit and a partial product exchange circuit. The first multiplication circuit includes a first modified encoding sub-circuit and a first modified compression sub-circuit, and the second multiplication circuit includes a second modified encoding sub-circuit and a second modified compression sub-circuit. The first modified encoding sub-circuit includes a first encoding branch and a first selection branch, and the second modified encoding sub-circuit includes a second encoding branch and a second selection branch. The first output terminal of the first modified encoding sub-circuit is connected to the first input terminal of the partial product exchange circuit, and the second output terminal of the first modified encoding sub-circuit is connected to the input terminal of the first modified compression sub-circuit. The first output terminal of the partial product exchange circuit is connected to the input terminal of the first modified encoding sub-circuit, and the second output terminal of the partial product exchange circuit is connected to the input terminal of the second modified encoding sub-circuit. The first output terminal of the second modified encoding sub-circuit is connected to the second input terminal of the partial product exchange circuit, and the second output terminal of the second modified encoding sub-circuit is connected to the input terminal of the second modified compression sub-circuit;
Wherein, the first encoding branch is used to perform canonical signed digit encoding on the received first data to obtain sign-extended first partial products; the first selection branch is used to select the first partial products of the target code from the sign-extended first partial products; and the first modified compression sub-circuit is used to compress the first partial products of the target code to obtain a first target operation result. The second encoding branch is used to perform canonical signed digit encoding on the received second data to obtain sign-extended second partial products; the second selection branch is used to select the second partial products of the target code from the sign-extended second partial products; and the second modified compression sub-circuit is used to compress the second partial products of the target code to obtain a second target operation result. The partial product exchange circuit is used to exchange the sign-extended first partial products and the sign-extended second partial products.
In one embodiment, the first multiplication circuit and the second multiplication circuit each include a first input terminal for receiving a function selection mode signal; the partial product exchange circuit includes a third input terminal for receiving the function selection mode signal; and the function selection mode signal is used to determine which mode of data operation the data processor can currently process.
In one embodiment, the first modified encoding sub-circuit includes a first modified encoding processing branch and a first partial product selection branch, and the output terminal of the first modified encoding processing branch is connected to the input terminal of the first partial product selection branch;
Wherein, the first modified encoding processing branch is used to perform canonical signed digit encoding on the received first data to obtain the first target code. The first partial product selection branch is used to obtain the sign-extended first partial products according to the first target code, select among the sign-extended first partial products, receive the sign-extended second partial products output by the partial product exchange circuit, and take the received sign-extended second partial products together with the sign-extended first partial products obtained after selection as the first partial products of the target code.
In one embodiment, the first modified encoding processing branch includes: a first modified encoding unit, a low-order partial product acquisition unit, a low-order selector group unit, a high-order partial product acquisition unit and a high-order selector group unit. The first output terminal of the first modified encoding unit is connected to the first input terminal of the low-order partial product acquisition unit, the output terminal of the low-order selector group unit is connected to the second input terminal of the low-order partial product acquisition unit, the second output terminal of the first modified encoding unit is connected to the first input terminal of the high-order partial product acquisition unit, and the output terminal of the high-order selector group unit is connected to the second input terminal of the high-order partial product acquisition unit;
Wherein, the first modified encoding unit is used to perform canonical signed digit encoding on the received first data, determine from the received function selection mode signal the bit width of the data that the data processor can process, and obtain the first target code according to that bit width. The low-order partial product acquisition unit is used to obtain the sign-extended first low-order partial products according to the first low-order target code in the received first target code and the first data. The low-order selector group unit is used to gate the values in the sign-extended first low-order partial products. The high-order partial product acquisition unit is used to obtain the sign-extended first high-order partial products according to the first high-order target code in the received first target code and the first data. The high-order selector group unit is used to gate the values in the sign-extended first high-order partial products.
In one embodiment, the first modified encoding unit includes: a first data input port, a first mode selection signal input port, a low-order target code output port and a high-order target code output port. The first data input port is used to receive the first data; the first mode selection signal input port is used to receive the function selection mode signal; the low-order target code output port is used to output the first low-order target code obtained by performing canonical signed digit encoding on the first data; and the high-order target code output port is used to output the first high-order target code obtained by performing canonical signed digit encoding on the first data.
In one embodiment, the low-order partial product acquisition unit includes: a low-order target code input port, a gating value input port, a first data input port and a low-order partial product output port. The low-order target code input port is used to receive the first low-order target code output by the first modified encoding unit; the gating value input port is used to receive the values in the sign-extended first low-order partial products obtained after gating by the low-order selector group unit; the first data input port is used to receive the first data; and the low-order partial product output port is used to output the sign-extended first low-order partial products.
In one embodiment, the high-order partial product acquisition unit includes: a high-order target code input port, a gating value input port, a first data input port and a high-order partial product output port. The high-order target code input port is used to receive the first high-order target code output by the first modified encoding unit; the gating value input port is used to receive the values in the sign-extended first high-order partial products output after gating by the high-order selector group unit; the first data input port is used to receive the first data; and the high-order partial product output port is used to output the sign-extended first high-order partial products.
In one embodiment, the low-order selector group unit includes a low-order selector, which is used to gate the values in the sign-extended first low-order partial products.
In one embodiment, the high-order selector group unit includes a high-order selector, which is used to gate the values in the sign-extended first high-order partial products.
In one embodiment, the first partial product selection branch includes: a function selection mode signal input port, a first partial product input port, a second partial product input port, a first partial product output port and a gated partial product output port. The function selection mode signal input port is used to receive the function selection mode signal; the first partial product input port is used to receive the sign-extended first partial products output by the first modified encoding unit; the second partial product input port is used to receive the sign-extended second partial products exchanged by the partial product exchange circuit; the first partial product output port is used to output the sign-extended first partial products that the partial product exchange circuit needs to exchange; and the gated partial product output port is used to output the gated sign-extended first partial products together with the received sign-extended second partial products.
In one embodiment, the first modified compression sub-circuit includes a modified Wallace tree group unit and an accumulation unit, and the output terminal of the modified Wallace tree group unit is connected to the input terminal of the accumulation unit. When data operations of different modes are processed, the modified Wallace tree group unit is used to accumulate each column of values in the obtained first partial products of the target code to obtain an accumulation result, and the accumulation unit is used to perform an addition on the accumulation result.
In one embodiment, the modified Wallace tree group unit includes: a low-order Wallace tree subunit, a selector and a high-order Wallace tree subunit. The output terminal of the low-order Wallace tree subunit is connected to the input terminal of the selector, and the output terminal of the selector is connected to the input terminal of the high-order Wallace tree subunit. The low-order Wallace tree subunit is used to accumulate each column of values in the first partial products of the target code to obtain the accumulation result; the selector is used to gate the carry input signal received by the high-order Wallace tree subunit; and the high-order Wallace tree subunit is used to accumulate each column of values in the first partial products of the target code to obtain the accumulation result.
In one embodiment, the accumulation unit includes an adder, which is used to perform an addition on the accumulation result.
In one embodiment, the second modified encoding sub-circuit includes a second modified encoding processing branch and a second partial product selection branch, and the output terminal of the second modified encoding processing branch is connected to the input terminal of the second partial product selection branch;
The second modified encoding processing branch is used to perform canonical signed digit encoding on the received second data to obtain the second target code. The second partial product selection branch is used to obtain the sign-extended second partial products according to the second target code, select among the sign-extended second partial products, receive the sign-extended first partial products output by the partial product exchange circuit, and take the received sign-extended first partial products together with the sign-extended second partial products obtained after selection as the second partial products of the target code.
In one embodiment, the second partial product selection branch includes: a function selection mode signal input port, a second partial product input port, a first partial product input port, a second partial product output port and a gated partial product output port. The function selection mode signal input port is used to receive the function selection mode signal; the second partial product input port is used to receive the sign-extended second partial products output by the second modified encoding processing branch; the first partial product input port is used to receive the sign-extended first partial products obtained after exchange by the partial product exchange circuit; the second partial product output port is used to output the sign-extended second partial products that the partial product exchange circuit needs to exchange; and the gated partial product output port is used to output the gated sign-extended second partial products together with the received sign-extended first partial products.
In one embodiment, the partial product exchange circuit includes: a function selection mode signal input port, a first partial product input port, a first partial product output port, a second partial product input port and a second partial product output port. The function selection mode signal input port is used to receive the function selection mode signal; the first partial product input port is used to receive the sign-extended first partial products, output by the first partial product selection branch, that need to be exchanged; the first partial product output port is used to output the sign-extended first partial products; the second partial product input port is used to receive the sign-extended second partial products, output by the second partial product selection branch, that need to be exchanged; and the second partial product output port is used to output the sign-extended second partial products.
A data processing method, the method comprising:
Receiving data to be processed and a function selection mode signal, wherein the function selection mode signal indicates which mode of data operation the data processor can currently process;
Judging, according to the function selection mode signal, whether the data to be processed needs to be split;
If the data to be processed needs to be split, splitting the data to be processed to obtain split data;
Performing canonical signed digit encoding on the split data to obtain a target code;
Performing conversion according to the target code and the split data to obtain sign-extended partial products;
Judging, according to the function selection mode signal, whether the sign-extended partial products need to be exchanged;
If the sign-extended partial products do not need to be exchanged, taking the sign-extended partial products as the partial products of the target code;
Compressing the partial products of the target code to obtain a target operation result.
In one embodiment, judging, according to the function selection mode signal, whether the data to be processed needs to be split comprises: judging, according to the function selection mode signal, whether the bit width of the data to be processed is equal to the data bit width of the mode of operation that the data processor can currently process.
In one embodiment, after judging, according to the function selection mode signal, whether the bit width of the data to be processed is equal to the data bit width of the mode of operation that the data processor can currently process, the method further comprises: if the data to be processed does not need to be split, proceeding to perform canonical signed digit encoding on the data to be processed to obtain the target code.
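The splitting decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented circuit: the helper name `split_operand`, the requirement that the data width be a multiple of the mode width, and the least-significant-chunk-first ordering are all hypothetical choices made for the example.

```python
def split_operand(value, data_width, mode_width):
    """Split `value` into mode_width-bit chunks, least significant first.

    Hypothetical helper: the text only states that data is split when its
    bit width differs from the width the current mode can process.
    """
    if not 0 <= value < (1 << data_width):
        raise ValueError("value does not fit in data_width bits")
    if data_width == mode_width:
        # Widths match: no splitting is needed, the data is encoded as-is.
        return [value]
    if data_width % mode_width != 0:
        raise ValueError("data_width must be a multiple of mode_width")
    mask = (1 << mode_width) - 1
    # Take successive mode_width-bit fields starting from bit 0.
    return [(value >> shift) & mask for shift in range(0, data_width, mode_width)]
```

For example, under these assumptions a 16-bit value `0xABCD` processed in an 8-bit mode yields the chunks `0xCD` and `0xAB`, while in a 16-bit mode it is passed through unchanged.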
In one embodiment, performing canonical signed digit encoding on the split data to obtain the target code comprises: converting each run of l consecutive bits of value 1 in the split data into (l+1) digits in which the highest digit is 1, the lowest digit is -1 and the remaining digits are 0, thereby obtaining the target code, where l is greater than or equal to 2.
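The run-conversion rule relies on the identity 2^(k+l) - 2^k = 2^k + 2^(k+1) + ... + 2^(k+l-1). The following sketch applies it in a single pass over an unsigned value. The function names, the unsigned interpretation and the extra top digit are assumptions made for illustration; the source does not specify how the sign bit or the overflow digit is handled.

```python
def csd_encode_runs(value, width):
    """Encode an unsigned `width`-bit value as signed digits in {-1, 0, 1}.

    Returns width + 1 digits, least significant first. Each maximal run of
    l >= 2 consecutive 1 bits becomes: -1 at the lowest position of the run,
    +1 just above the run, and 0 in between. Isolated 1 bits are kept.
    (Illustrative sketch; names and unsigned handling are assumptions.)
    """
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in width bits")
    bits = [(value >> i) & 1 for i in range(width)]
    digits = [0] * (width + 1)
    i = 0
    while i < width:
        if bits[i] == 0:
            i += 1
            continue
        j = i
        while j < width and bits[j] == 1:
            j += 1  # j ends one past the top of the run
        if j - i >= 2:
            digits[i] = -1
            digits[j] = 1
        else:
            digits[i] = 1
        i = j
    return digits


def digits_value(digits):
    """Evaluate a signed-digit list (least significant first)."""
    return sum(d * (1 << k) for k, d in enumerate(digits))
```

With these definitions, `csd_encode_runs(0b0111, 4)` gives `[-1, 0, 0, 1, 0]`, that is 8 - 1 = 7 with two non-zero digits instead of three. Fewer non-zero digits means fewer non-zero partial products in the later steps.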
In one embodiment, performing canonical signed digit encoding on the split data to obtain the target code comprises:
Performing canonical signed digit encoding on the split data to obtain an intermediate code;
Obtaining the target code according to the intermediate code and the function selection mode signal.
In one embodiment, performing conversion according to the target code and the split data to obtain the sign-extended partial products comprises:
Performing conversion according to the target code and the split data to obtain initial partial products;
Performing sign extension on the initial partial products to obtain the sign-extended partial products.
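The two steps above can be illustrated with a small software model. It is a sketch under assumptions: each signed digit d at position k is taken to select -x, 0 or +x, shifted left by k, and sign extension is modelled as two's complement truncation to the output width. The function names and the row layout are hypothetical and do not describe the actual selector circuits.

```python
def sign_extended_partial_products(x, digits, out_width):
    """Return one out_width-bit two's complement row per signed digit.

    x      : multiplicand as a Python int (may be negative)
    digits : signed digits in {-1, 0, 1}, least significant first
    Masking a negative Python int with (2**out_width - 1) yields its
    two's complement pattern, which is the sign-extended row.
    """
    mask = (1 << out_width) - 1
    return [((d * x) << k) & mask for k, d in enumerate(digits)]


def to_signed(value, width):
    """Interpret a width-bit pattern as a two's complement integer."""
    if (value >> (width - 1)) & 1:
        return value - (1 << width)
    return value
```

As a usage example, with `x = -3` and the digits `[-1, 0, 0, 1, 0]` (which represent 7), the rows for an 8-bit result are `[3, 0, 0, 232, 0]`; their sum modulo 256 is 235, which reads as -21 in two's complement, matching -3 * 7.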
In one embodiment, judging, according to the function selection mode signal, whether the sign-extended partial products need to be exchanged comprises: judging, according to the function selection mode signal, whether the data bit widths currently processed by the data processor are the same.
In one embodiment, after judging, according to the function selection mode signal, whether the sign-extended partial products need to be exchanged, the method further comprises: if the sign-extended partial products need to be exchanged, exchanging the high-order partial products or the low-order partial products among the sign-extended partial products.
In one embodiment, compressing the partial products of the target code to obtain the target operation result comprises:
Accumulating the partial products of the target code to obtain an intermediate operation result;
Accumulating the intermediate operation result to obtain the target operation result.
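A common way to realise this two-stage compression is carry-save reduction: 3:2 compressors (full adders applied column by column) reduce the partial product rows to two rows, and a final adder sums those two. The sketch below models that general technique in software. It is not the specific modified Wallace tree of this application, and the function names are hypothetical.

```python
def compress_3_to_2(a, b, c, mask):
    """Bitwise full adder over three rows: returns (sum row, carry row)."""
    s = a ^ b ^ c
    carry = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return s, carry


def wallace_reduce(rows, width):
    """Reduce rows to at most two rows with the same total modulo 2**width."""
    mask = (1 << width) - 1
    rows = [r & mask for r in rows]
    while len(rows) > 2:
        nxt = []
        # Compress complete groups of three rows; pass leftovers through.
        for i in range(0, len(rows) - 2, 3):
            nxt.extend(compress_3_to_2(rows[i], rows[i + 1], rows[i + 2], mask))
        nxt.extend(rows[len(rows) - len(rows) % 3:])
        rows = nxt
    return rows


def final_add(rows, width):
    """Final carry-propagate addition of the remaining rows."""
    return sum(rows) & ((1 << width) - 1)
```

Here `wallace_reduce` plays the role of the first accumulation (producing the intermediate result as a sum row and a carry row), and `final_add` plays the role of the second accumulation that yields the target operation result.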
In one embodiment, accumulating the intermediate operation result to obtain the target operation result comprises:
The low-order Wallace tree subunit accumulates the column values in the partial products of all target codes to obtain an accumulation result;
The selector gates the accumulation result according to the function selection mode signal to obtain a carry gating signal;
The high-order Wallace tree subunit performs accumulation according to the carry gating signal and the column values in the partial products of the target code to obtain the target operation result.
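The role of the selector between the two Wallace tree subunits can be illustrated with a simplified model. The assumption here, which the text does not state explicitly, is that in a wide mode the carry out of the low-order half is passed to the high-order half, while in a narrow mode the carry is forced to zero so the two halves produce independent results. The function name and the `wide_mode` flag are hypothetical.

```python
def split_accumulate(rows, half_width, wide_mode):
    """Accumulate low and high halves of each row with a gated carry.

    Returns (low_result, high_result), each half_width bits wide.
    wide_mode=True  : carry from the low half feeds the high half.
    wide_mode=False : carry is gated to 0 (assumed narrow-mode behaviour).
    """
    half_mask = (1 << half_width) - 1
    # Low-order subunit: accumulate the low columns.
    low_total = sum(r & half_mask for r in rows)
    low_result = low_total & half_mask
    carry_out = low_total >> half_width
    # Selector: gate the carry according to the mode.
    carry_in = carry_out if wide_mode else 0
    # High-order subunit: accumulate the high columns plus the gated carry.
    high_total = sum((r >> half_width) & half_mask for r in rows) + carry_in
    return low_result, high_total & half_mask
```

For the rows `0x12FF` and `0x3401` with `half_width = 8`, the wide mode returns `(0x00, 0x47)`, which matches the 16-bit sum `0x4700`, whereas the narrow mode returns `(0x00, 0x46)` because the carry from the low half is suppressed.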
With the data processor and method provided by this embodiment, the first modified encoding sub-circuit and the second modified encoding sub-circuit respectively perform canonical signed digit encoding on the received data to obtain the sign-extended first partial products and the sign-extended second partial products. Whether the partial product exchange circuit needs to exchange the sign-extended first partial products and the sign-extended second partial products is determined according to the received function selection mode signal. If an exchange is needed, then after the exchange the first modified encoding sub-circuit and the second modified encoding sub-circuit each take the sign-extended partial products they currently hold as the partial products of the target code, yielding the first partial products of the target code and the second partial products of the target code. Finally, the first modified compression sub-circuit and the second modified compression sub-circuit respectively compress the first partial products and the second partial products of the target code to obtain the target operation results. Because the data processor performs canonical signed digit encoding on the received data through the first modified encoding sub-circuit and the second modified encoding sub-circuit, fewer valid partial products are obtained, which reduces the complexity of implementing multiplication or multiply-accumulate operations in the data processor.
An embodiment of the application provides a machine learning arithmetic unit that includes one or more of the data processors described above. The machine learning arithmetic unit is used to obtain data to be operated on and control information from other processing devices, execute the specified machine learning operation, and pass the execution result to the other processing devices through an I/O interface;
When the machine learning arithmetic unit includes multiple data processors, the data processors can be connected to one another through a specific structure and transmit data;
Wherein, the multiple data processors are interconnected and transmit data through a PCIE bus to support larger-scale machine learning operations; the multiple data processors share one control system or have their own control systems; the multiple data processors share memory or have their own memories; and the multiple data processors may be interconnected in any interconnection topology.
An embodiment of the application provides a combined processing device that includes the machine learning arithmetic unit described above, a universal interconnection interface and other processing devices. The machine learning arithmetic unit interacts with the other processing devices to jointly complete the operation specified by the user. The combined processing device may also include a storage device, which is connected to the machine learning arithmetic unit and to the other processing devices respectively and is used to save the data of the machine learning arithmetic unit and of the other processing devices.
An embodiment of the application provides a neural network chip that includes the data processor described above, the machine learning arithmetic unit described above, or the combined processing device described above.
An embodiment of the application provides a neural network chip package structure that includes the neural network chip described above.
An embodiment of the application provides a board card that includes the neural network chip package structure described above.
An embodiment of the application provides an electronic device that includes the neural network chip described above or the board card described above.
An embodiment of the application provides a chip that includes at least one data processor as described in any of the above.
An embodiment of the application provides an electronic device that includes the chip described above.
Brief description of the drawings
Fig. 1 is a schematic circuit structure diagram of a data processor provided by one embodiment.
Fig. 2 is a schematic circuit structure diagram of another data processor provided by another embodiment.
Fig. 3 is a specific circuit structure diagram of the data processor provided by one embodiment.
Fig. 4a is a schematic diagram of the distribution pattern of the partial products obtained by a 16-bit data multiplication provided by one embodiment.
Fig. 4b is a schematic diagram of the distribution pattern of the partial products obtained by a 16 * 8 data multiply-accumulate operation provided by one embodiment.
Fig. 5 is a specific circuit structure diagram of the data processor provided by another embodiment.
Fig. 6 is a schematic flowchart of a data processing method provided by one embodiment.
Fig. 7 is a specific circuit structure diagram of the compression circuit for 8-bit data operations provided by another embodiment.
Fig. 8 is a schematic flowchart of another data processing method provided by one embodiment.
Fig. 9 is a structural diagram of a combined processing device provided by one embodiment.
Fig. 10 is a structural diagram of another combined processing device provided by one embodiment.
Fig. 11 is a schematic structural diagram of a board card provided by one embodiment.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and do not limit it.
The data processor provided by the present application can be applied in an AI chip, a field-programmable gate array (FPGA) chip, or other hardware circuit devices to perform multiplication or multiply-accumulate operations. Schematic structures of the data processor are shown in Fig. 1 and Fig. 2.
Fig. 1 is a structural diagram of a data processor provided by one embodiment. As shown in Fig. 1, the data processor includes a first multiplication operation circuit 11, a second multiplication operation circuit 12, and a partial product exchange circuit 13. The first multiplication operation circuit 11 includes a first modified encoding sub-circuit 111 and a first modified compression sub-circuit 112, and the second multiplication operation circuit 12 includes a second modified encoding sub-circuit 121 and a second modified compression sub-circuit 122. The first modified encoding sub-circuit 111 includes a first encoding branch 111a and a first selection branch 111b, and the second modified encoding sub-circuit 121 includes a second encoding branch 121a and a second selection branch 121b. The first output of the first modified encoding sub-circuit 111 is connected to the first input of the partial product exchange circuit 13, and the second output of the first modified encoding sub-circuit 111 is connected to the input of the first modified compression sub-circuit 112. The first output of the partial product exchange circuit 13 is connected to the input of the first modified encoding sub-circuit 111, and the second output of the partial product exchange circuit 13 is connected to the input of the second modified encoding sub-circuit 121. The first output of the second modified encoding sub-circuit 121 is connected to the second input of the partial product exchange circuit 13, and the second output of the second modified encoding sub-circuit 121 is connected to the input of the second modified compression sub-circuit 122.
The first encoding branch 111a performs canonical signed number encoding on the received first data to obtain sign-extended first partial products. The first selection branch 111b selects the first partial products of the target code from the sign-extended first partial products. The first modified compression sub-circuit 112 compresses the first partial products of the target code to obtain a first target operation result. The second encoding branch 121a performs canonical signed number encoding on the received second data to obtain sign-extended second partial products. The second selection branch 121b selects the second partial products of the target code from the sign-extended second partial products. The second modified compression sub-circuit 122 compresses the second partial products of the target code to obtain a second target operation result. The partial product exchange circuit 13 exchanges the sign-extended first partial products with the sign-extended second partial products.
Specifically, the above data processor can implement both data multiplication operations and data multiply-accumulate operations. Optionally, the first modified encoding sub-circuit 111 can receive the first data, and the second modified encoding sub-circuit 121 can receive the second data. The first data and the second data may each include two sub-data, which can be identical sub-data of the same bit width or different sub-data of the same bit width. A sub-data can serve as the multiplicand or as the multiplier in the multiplication or multiply-accumulate operation. Optionally, the two sub-data in the first data can be concatenated and input to the first modified encoding sub-circuit 111 as a whole, or they can be input to it separately at the same time; likewise, the two sub-data in the second data can be concatenated and input to the second modified encoding sub-circuit 121 as a whole, or input to it separately at the same time. The sub-data can be fixed-point numbers with a bit width of 2N, so the data obtained by concatenating two sub-data can have a bit width of 4N. Optionally, the first modified encoding sub-circuit 111 may include multiple data processing units with different functions. These units can have a canonical signed number encoding function or other conversion functions, and this embodiment places no limitation on this. Within a single data operation, one sub-data received by the first modified encoding sub-circuit 111 can serve as the multiplicand and the other as the multiplier; similarly, one sub-data received by the second modified encoding sub-circuit 121 can serve as the multiplicand and the other as the multiplier. It can also be understood that the bit width of the sign-extended first partial products and of the sign-extended second partial products can equal twice the bit width of the multiplicand in the multiplication or multiply-accumulate operation the data processor is currently handling. The number of sign-extended first partial products can equal the number of first partial products of the target code, and the number of sign-extended second partial products can equal the number of second partial products of the target code. The sign-extended first partial products may include sign-extended first low-order partial products and sign-extended first high-order partial products; the sign-extended second partial products may include sign-extended second low-order partial products and sign-extended second high-order partial products. The first partial products of the target code may include first low-order partial products of the target code and first high-order partial products of the target code; the second partial products of the target code may include second low-order partial products of the target code and second high-order partial products of the target code.
In this embodiment, the first modified encoding sub-circuit 111 can receive the multiplier of the operation and perform canonical signed number encoding on it to obtain the target code. The canonical signed number encoding method can be characterized as follows. An N-bit multiplier is processed from its low-order bits toward its high-order bits. If a run of l (l >= 2) consecutive bits of value 1 exists, those l consecutive 1s can be converted to the (l+1)-digit data "1(0)_{l-1}(-1)", that is, a 1, followed by (l-1) zeros, followed by a -1. The remaining (N-l) bits are then combined with the converted (l+1) digits to obtain a new data. The new data serves as the initial data of the next conversion stage, and the process repeats until the new data obtained after a conversion stage no longer contains any run of l (l >= 2) consecutive 1s. Performing canonical signed number encoding on an N-bit multiplier can therefore yield a target code with a bit width of (N+1). Further, in canonical signed number encoding, the data 11 can be converted to (100 - 001), so that 11 is equivalent to 10(-1); the data 111 can be converted to (1000 - 0001), so that 111 is equivalent to 100(-1); and so on. Other runs of l (l >= 2) consecutive 1s are converted in the same way.
For example, suppose the multiplier received by the first modified encoding sub-circuit 111 is "001010101101110". The first conversion stage applied to this multiplier yields the first new data "0010101011100(-1)0". The second conversion stage applied to the first new data yields the second new data "0010101100(-1)00(-1)0". The third conversion stage applied to the second new data yields the third new data "0010110(-1)00(-1)00(-1)0". The fourth conversion stage applied to the third new data yields the fourth new data "00110(-1)0(-1)00(-1)00(-1)0". The fifth conversion stage applied to the fourth new data yields the fifth new data "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l >= 2) consecutive 1s, so it can be called the initial code. One bit-padding step is then applied to the initial code to obtain the intermediate code, which marks the completion of the canonical signed number encoding. The bit width of the initial code can equal the bit width of the multiplier. Optionally, after the first modified encoding sub-circuit 111 performs canonical signed number encoding on the multiplier and obtains the new data (the initial code), if the highest and second-highest digits of the new data are "10" or "01", the sub-circuit can pad a digit 0 at the position one bit above the highest digit, so that the top three digits of the corresponding intermediate code are "010" or "001" respectively. Optionally, the bit width of the intermediate code can equal the bit width of the data the data processor is currently handling plus 1.
In addition, if the data received by the data processor has a bit width of 2N and the processor is currently handling N-bit data operations, the first modified encoding sub-circuit 111 can split the 2N-bit data into two groups of N-bit data and operate on each group separately; in this case, the two resulting (N+1)-bit intermediate codes can be combined to form the target code. If the data processor is currently handling 2N-bit data operations, the first modified encoding sub-circuit 111 can pad a digit 0 at the position one bit above the highest digit of the obtained (2N+1)-bit intermediate code (bit-supplementing) and use the resulting (2N+2)-bit data as the target code. In this embodiment, the step applied to the initial code is called bit-padding, and the step applied to the intermediate code is called bit-supplementing.
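The width bookkeeping in the two cases can be illustrated with a short Python sketch. The helper names are hypothetical, the initial codes are hand-encoded with the conversion rule described earlier for N = 4, and placing the high-order group first when the two intermediate codes are combined is an assumption, since the text does not state the order.

```python
def value(digits):
    """Integer value of an MSB-first signed-digit list."""
    total = 0
    for d in digits:
        total = 2 * total + d
    return total


def pad_zero(code):
    """Add one digit 0 above the highest digit."""
    return [0] + list(code)


def target_code_narrow(high_initial, low_initial):
    """N-bit mode: two N-digit initial codes -> two (N+1)-digit
    intermediate codes -> one combined (2N+2)-digit target code."""
    return pad_zero(high_initial) + pad_zero(low_initial)


def target_code_wide(initial):
    """2N-bit mode: 2N-digit initial code -> (2N+1)-digit intermediate
    code (bit-padding) -> (2N+2)-digit target code (bit-supplementing)."""
    intermediate = pad_zero(initial)
    return pad_zero(intermediate)


N = 4
HIGH_INITIAL = [1, 0, -1, 0]                # 0110     = 6
LOW_INITIAL = [0, 1, 0, -1]                 # 0011     = 3
WIDE_INITIAL = [1, 0, -1, 0, 0, 1, 0, -1]   # 01100011 = 99

narrow = target_code_narrow(HIGH_INITIAL, LOW_INITIAL)
wide = target_code_wide(WIDE_INITIAL)
print(len(narrow), len(wide))
```

Under these assumptions both modes produce a target code of the same width, 2N+2 digits, which is consistent with a single downstream datapath serving either mode.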
Optionally, the first multiplication operation circuit 11 and the second multiplication operation circuit 12 each include a first input for receiving a function selection mode signal, and the partial product exchange circuit 13 includes a third input for receiving the function selection mode signal. Optionally, the function selection mode signal determines which mode of data operation the data processor currently handles.
In this embodiment, each data processing unit in the first multiplication operation circuit 11 can receive the function selection mode signal, and each data processing unit in the second multiplication operation circuit 12 can receive the function selection mode signal. Note that within a single data operation, the function selection mode signals received by the first multiplication operation circuit 11, the second multiplication operation circuit 12, and the partial product exchange circuit 13 can be identical. Optionally, the function selection mode signal may take four different values, corresponding to four modes of data operation the data processor can handle: multiplication of N-bit * N-bit data, multiply-accumulate of N-bit * N-bit data, multiplication of 2N-bit * 2N-bit data, and multiply-accumulate of 2N-bit * N-bit data. For example, if the first data include two 2N-bit sub-data and the second data include two 2N-bit sub-data, the data processor can determine the mode of data operation it currently handles from the function selection mode signal it receives. The four function selection mode signals can be expressed as the binary values 00, 01, 10, and 11, or in other representations. Here mode=00 can indicate that the data processor currently handles multiplication of N-bit * N-bit data, mode=01 multiply-accumulate of N-bit * N-bit data, mode=10 multiplication of 2N-bit * 2N-bit data, and mode=11 multiply-accumulate of 2N-bit * N-bit data. It can also be understood that any one-to-one correspondence between the four function selection mode signals and the four modes of data operation is possible, and this embodiment places no limitation on this.
In addition, when the data processor handles a multiply-accumulate operation on 2N-bit * N-bit data, the partial product exchange circuit 13 can, as needed, exchange the sign-extended first low-order partial products or sign-extended first high-order partial products obtained by the first modified encoding sub-circuit 111 with the sign-extended second low-order partial products or sign-extended second high-order partial products obtained by the second modified encoding sub-circuit 121. When the data processor handles any of the other three modes of data operation, the partial product exchange circuit 13 is idle, and the sign-extended low-order and high-order partial products are not exchanged. Meanwhile, the two sub-data in the first data each have a bit width of 2N, and the two sub-data in the second data also each have a bit width of 2N. If the data processor currently handles a single N-bit * N-bit multiplication, then, as needed, one of the first data and the second data is 0, and in the two sub-data of the other data either the high-order values are 0 or the low-order values are 0; the first data and the second data can then be processed as they are. If the data processor currently handles a single 2N-bit * 2N-bit multiplication, then, as needed, one of the first data and the second data is 0, and the high-order and low-order values in the two sub-data of the other data are all non-zero. If the data processor currently handles two 2N-bit * 2N-bit multiplications, then, as needed, neither the first data nor the second data is 0.
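The example mode assignment above, together with the statement that the partial product exchange circuit is active only for the 2N-bit * N-bit multiply-accumulate mode, can be summarized in a small Python sketch. The enum and function names are hypothetical, and the 00/01/10/11 assignment is only the example mapping given in the text; any one-to-one mapping would do.

```python
from enum import Enum


class Mode(Enum):
    """Example encoding of the function selection mode signal."""
    MUL_N_BY_N = 0b00      # N-bit * N-bit multiplication
    MAC_N_BY_N = 0b01      # N-bit * N-bit multiply-accumulate
    MUL_2N_BY_2N = 0b10    # 2N-bit * 2N-bit multiplication
    MAC_2N_BY_N = 0b11     # 2N-bit * N-bit multiply-accumulate


def exchange_active(mode):
    """The exchange circuit swaps partial products only in the
    2N-bit * N-bit multiply-accumulate mode; otherwise it is idle."""
    return mode is Mode.MAC_2N_BY_N


for mode in Mode:
    print(f"mode={mode.value:02b} {mode.name} exchange={exchange_active(mode)}")
```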
In the data processor provided by this embodiment, the first modified encoding sub-circuit and the second modified encoding sub-circuit each perform canonical signed number encoding on the data they receive, obtaining the sign-extended first partial products and the sign-extended second partial products respectively. The received function selection mode signal determines whether the partial product exchange circuit needs to exchange the sign-extended first partial products with the sign-extended second partial products. If an exchange is needed, then after the exchange each of the two modified encoding sub-circuits takes the sign-extended partial products it currently holds as partial products of the target code, which yields the first partial products of the target code and the second partial products of the target code. Finally, the first modified compression sub-circuit and the second modified compression sub-circuit compress the first partial products and the second partial products of the target code respectively to obtain the target operation result. This data processor can perform not only multiplication but also multiply-accumulate operations, which improves its versatility. In addition, it does not need a separate accumulation pass over the multiplication result to complete a multiply-accumulate operation; a single operation pass directly produces the multiply-accumulate or multiplication result, which reduces power consumption. Furthermore, because the data processor performs canonical signed number encoding on the received data, fewer effective partial products are obtained, which reduces the complexity of the multiplication or multiply-accumulate operation.
Fig. 2 is a schematic structural diagram of a data processor provided by another embodiment. As shown in Fig. 2, the data processor includes a canonical signed number encoding circuit 21, a first partial product acquisition circuit 22, a second partial product acquisition circuit 23, a first compression circuit 24, and a second compression circuit 25. The canonical signed number encoding circuit 21 includes a canonical signed number encoding processing unit 211. The output of the canonical signed number encoding processing unit 211 is connected to the first input of the first partial product acquisition circuit 22 and to the first input of the second partial product acquisition circuit 23. The output of the first partial product acquisition circuit 22 is connected to the first input of the first compression circuit 24, and the output of the second partial product acquisition circuit 23 is connected to the first input of the second compression circuit 25.
The canonical signed number encoding processing unit 211 performs canonical signed number encoding on the received first data to obtain the target code. The first partial product acquisition circuit 22 receives the second data and obtains the first partial products of the target code from the target code and the second data. The second partial product acquisition circuit 23 receives the second data and obtains the second partial products of the target code from the target code and the second data. The first compression circuit 24 accumulates the first partial products of the target code, and the second compression circuit 25 accumulates the second partial products of the target code.
Specifically, the first data and the second data may each include two sub-data. The two sub-data in the first data can serve as multipliers in the multiplication or multiply-accumulate operation, and the two sub-data in the second data can serve as multiplicands. Optionally, the bit width of each sub-data can be 2N. The two sub-data in the first data can be concatenated and input to the canonical signed number encoding processing unit 211 as a whole, or input to it separately at the same time. The two sub-data in the second data can be concatenated and input as a whole to the first partial product acquisition circuit 22 and the second partial product acquisition circuit 23, or input separately at the same time to the first partial product acquisition circuit 22 and separately at the same time to the second partial product acquisition circuit 23. Optionally, canonical signed number encoding of the two sub-data in the first data yields a first target code and a second target code respectively, which are collectively called the target code. Optionally, the bit width of the first target code can equal the bit width of the second target code, and can also equal the bit width of the multiplier the data processor is currently handling plus 1. The number of first partial products of the target code can equal the bit width of the first target code, and the number of second partial products of the target code can equal the bit width of the second target code. Optionally, the first target code may include a first low-order target code and a first high-order target code, and the second target code may include a second low-order target code and a second high-order target code.
For example, suppose the first data include data A and data B, and the second data include data C and data D. If the data processor needs to multiply data A by data C and data B by data D, the canonical signed number encoding processing unit 211 can encode data A to obtain the first target code and encode data B to obtain the second target code. The unit 211 can then input the first target code (and/or the second target code) together with data C (or the second data) to the first partial product acquisition circuit 22, and input the second target code (and/or the first target code) together with data D (or the second data) to the second partial product acquisition circuit 23. Alternatively, it can input the first target code (and/or the second target code) together with data C (or the second data) to the second partial product acquisition circuit 23, and input the second target code (and/or the first target code) together with data D (or the second data) to the first partial product acquisition circuit 22. If what the first partial product acquisition circuit 22 and the second partial product acquisition circuit 23 receive is the second data formed by concatenating two sub-data, both circuits can split the second data (the multiplicand) to obtain the sub-data that needs to be multiplied and then, as needed, obtain partial products from that sub-data and the first target code or the second target code. Here "as needed" can be understood as the correspondence between the multiplicand the data processor currently has to process and its target code. In addition, if the bit width of the first target code equals 2N, the first high-order target code can be the high N digits of the first target code, and the first low-order target code can be the low N digits of the first target code.
Note that the first partial product acquisition circuit 22 can receive the first target code and the multiplicand input by the canonical signed number encoding processing unit 211 and obtain the first partial products of the target code; the second partial product acquisition circuit 23 can receive the second target code and the multiplicand input by the canonical signed number encoding processing unit 211 and obtain the second partial products of the target code. Optionally, the first partial products of the target code may include first low-order partial products of the target code and first high-order partial products of the target code; the second partial products of the target code may include second low-order partial products of the target code and second high-order partial products of the target code. Optionally, the first low-order partial products of the target code can be the partial products corresponding to the first low-order target code, and the first high-order partial products can be the partial products corresponding to the first high-order target code; the second low-order partial products of the target code can be the partial products corresponding to the second low-order target code, and the second high-order partial products can be the partial products corresponding to the second high-order target code.
Further, the first compression circuit 24 can accumulate the first partial products of the target code obtained by the first partial product acquisition circuit 22 (that is, the first low-order partial products and the first high-order partial products of the target code), and the second compression circuit 25 can accumulate the second partial products of the target code obtained by the second partial product acquisition circuit 23 (that is, the second low-order partial products and the second high-order partial products of the target code), thereby obtaining the target operation result. In this embodiment, the sub-data included in the first data and the second data received by the data processor all have a bit width of 2N.
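How a target code and a multiplicand give partial products that accumulate to the product can be sketched in Python. This is a behavioral model under stated assumptions: each digit of the target code is assumed to select -X, 0, or +X (X being the multiplicand) shifted to the digit's weight, a plain integer sum stands in for the compression circuit, and the multiplicand value 1234 is an arbitrary illustration. `TARGET_CODE` is the final signed-digit encoding of the multiplier 001010101101110 from this document's encoding example.

```python
# Signed-digit encoding of the multiplier 001010101101110 (decimal 5486), MSB first.
TARGET_CODE = [0, 1, 0, -1, 0, -1, 0, -1, 0, 0, -1, 0, 0, -1, 0]


def partial_products(code, multiplicand):
    """One partial product per digit: digit * multiplicand at the digit's weight."""
    width = len(code)
    products = []
    for pos, digit in enumerate(code):
        shift = width - 1 - pos          # weight of this digit is 2**shift
        products.append((digit * multiplicand) << shift)
    return products


def accumulate(products):
    """Stand-in for the compression circuit: sum all partial products."""
    return sum(products)


multiplicand = 1234                       # hypothetical example value
pps = partial_products(TARGET_CODE, multiplicand)
result = accumulate(pps)
nonzero = sum(1 for p in pps if p != 0)
print(result, nonzero)
```

In this sketch the signed-digit code gives 6 non-zero partial products, whereas the plain binary multiplier has 8 one-bits, which illustrates why the encoding reduces the number of effective partial products.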
Optionally, the canonical signed number encoding processing unit 211 includes a first input for receiving a function selection mode signal; the first partial product acquisition circuit 22 and the second partial product acquisition circuit 23 each include a second input for receiving the function selection mode signal; and the first compression circuit 24 and the second compression circuit 25 each include a second input for receiving the function selection mode signal. Optionally, the function selection mode signal determines which mode of data operation the data processor currently handles.
It can be understood that the function selection mode signal (mode) can take four different values, corresponding to four modes of data operation the data processor can handle. Optionally, within a single data operation, the function selection mode signals (mode) received by the canonical signed number encoding processing unit 211, the first partial product acquisition circuit 22, the second partial product acquisition circuit 23, the first compression circuit 24, and the second compression circuit 25 can be identical. The four function selection mode signals can be expressed with binary values as mode=00, mode=01, mode=10, and mode=11. The four modes of data operation may include multiplication of N-bit * N-bit data, multiply-accumulate of N-bit * N-bit data, multiplication of 2N-bit * 2N-bit data, and multiply-accumulate of 2N-bit * N-bit data. According to the function selection mode signal they receive, the first partial product acquisition circuit 22 and the second partial product acquisition circuit 23 can control whether they receive the first target code, the second target code, or both the first and second target codes from the canonical signed number encoding processing unit 211 for the subsequent operation.
In this embodiment, the canonical signed number encoding processing unit 211 can receive the multiplier of the operation and perform canonical signed number encoding on it to obtain the target code. The canonical signed number encoding method can be characterized as follows. An N-bit multiplier is processed from its low-order bits toward its high-order bits. If a run of l (l >= 2) consecutive bits of value 1 exists, those l consecutive 1s can be converted to the (l+1)-digit data "1(0)_{l-1}(-1)", that is, a 1, followed by (l-1) zeros, followed by a -1. The remaining (N-l) bits are then combined with the converted (l+1) digits to obtain a new data. The new data serves as the initial data of the next conversion stage, and the process repeats until the new data obtained after a conversion stage no longer contains any run of l (l >= 2) consecutive 1s. Performing canonical signed number encoding on an N-bit multiplier can therefore yield a target code with a bit width of (N+1). Further, in canonical signed number encoding, the data 11 can be converted to (100 - 001), so that 11 is equivalent to 10(-1); the data 111 can be converted to (1000 - 0001), so that 111 is equivalent to 100(-1); and so on. Other runs of l (l >= 2) consecutive 1s are converted in the same way.
For example, suppose the multiplier received by the canonical signed number encoding processing unit 211 is "001010101101110". The first conversion stage applied to this multiplier yields the first new data "0010101011100(-1)0". The second conversion stage applied to the first new data yields the second new data "0010101100(-1)00(-1)0". The third conversion stage applied to the second new data yields the third new data "0010110(-1)00(-1)00(-1)0". The fourth conversion stage applied to the third new data yields the fourth new data "00110(-1)0(-1)00(-1)00(-1)0". The fifth conversion stage applied to the fourth new data yields the fifth new data "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l >= 2) consecutive 1s, so it can be called the initial code. Bit-padding is then applied to the initial code to obtain the intermediate code, which marks the completion of the canonical signed number encoding. The bit width of the initial code can equal the bit width of the multiplier. Optionally, after the canonical signed number encoding processing unit 211 performs canonical signed number encoding on the multiplier and obtains the new data (the initial code), if the highest and second-highest digits of the new data are "10" or "01", the unit can pad a digit 0 at the position one bit above the highest digit, so that the top three digits of the corresponding intermediate code are "010" or "001" respectively. Optionally, the bit width of the intermediate code can equal the bit width of the data the data processor is currently handling plus 1.
In addition, if the data received by the data processor is 2N bits wide and the data processor can currently handle N-bit data operations, the canonical signed-digit encoding processing unit 211 in the data processor may split the 2N-bit data into two groups of N-bit data and process each group separately; the two resulting (N+1)-bit intermediate codes are then combined to serve as the target code. If the data processor can currently handle 2N-bit data operations, the canonical signed-digit encoding processing unit 211 may pad one digit of value 0 above the most significant digit of the resulting (2N+1)-bit intermediate code (i.e. padding processing) and use the padded (2N+2)-bit data as the target code.
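The two ways of forming the target code can be sketched as follows. This is a minimal model under stated assumptions: the multiplier is treated as an unsigned 2N-bit value, the high group is placed before the low group when the two intermediate codes are combined, and the names `csd_digits`, `target_code` and `digits_value` are hypothetical helpers, not identifiers from the source.

```python
def csd_digits(value, width):
    """Canonical signed-digit form, most significant digit first."""
    digits = []
    n = value
    while n != 0:
        d = (2 - (n & 3)) if (n & 1) else 0
        n -= d
        digits.append(d)
        n >>= 1
    if len(digits) > width:
        raise ValueError("width too small")
    return [0] * (width - len(digits)) + digits[::-1]


def target_code(multiplier, n, full_width):
    """Build a (2N+2)-digit target code from an unsigned 2N-bit multiplier.

    full_width=False: N-bit mode, two (N+1)-digit intermediate codes combined.
    full_width=True:  2N-bit mode, (2N+1)-digit intermediate code plus one 0.
    """
    if full_width:
        intermediate = csd_digits(multiplier, 2 * n + 1)
        return [0] + intermediate
    high = multiplier >> n
    low = multiplier & ((1 << n) - 1)
    # Assumption: high group first, then low group.
    return csd_digits(high, n + 1) + csd_digits(low, n + 1)


def digits_value(digits):
    """Integer value of a signed-digit string (most significant first)."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


split = target_code(0x1F03, 8, full_width=False)
whole = target_code(0x1F03, 8, full_width=True)
print(len(split), digits_value(split[:9]), digits_value(split[9:]))
print(len(whole), digits_value(whole))
```

In both modes the target code has 2N+2 digits, which is consistent with the same downstream partial product hardware serving either mode.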
In the data processor provided in this embodiment, the canonical signed-digit encoding processing unit performs canonical signed-digit encoding on the received first data to obtain the target code; the first partial product acquisition circuit obtains the first partial products of the target code from the received second data and the target code; the second partial product acquisition circuit obtains the second partial products of the target code from the received second data and the target code; and the first compression circuit and the second compression circuit each perform accumulation to obtain the target operation result. Because the data processor applies canonical signed-digit encoding to the received data, fewer valid partial products are produced, which reduces the complexity of implementing multiplication or multiply-accumulate operations. The data processor can implement not only multiplication but also multiply-accumulate operations, which improves its versatility. In addition, the data processor does not need to perform a separate accumulation on the multiplication result to complete a multiply-accumulate operation; a single operation pass directly implements either the multiply-accumulate or the multiplication operation, which reduces the power consumption of the data processor.
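The effect of the encoding on the number of valid partial products can be illustrated with a small arithmetic model. This sketch only models the arithmetic, not the compression circuits: it assumes a non-negative multiplier, folds the addend into the same summation as the partial products to mimic a single pass, and uses the hypothetical names `csd_digits_lsb_first` and `multiply_accumulate`.

```python
def csd_digits_lsb_first(value):
    """Canonical signed-digit form, least significant digit first."""
    digits = []
    n = value
    while n != 0:
        d = (2 - (n & 3)) if (n & 1) else 0
        n -= d
        digits.append(d)
        n >>= 1
    return digits


def multiply_accumulate(multiplicand, multiplier, addend=0):
    """Return (multiplicand * multiplier + addend, number of non-zero
    partial products) using one partial product per non-zero digit."""
    if multiplier < 0:
        raise ValueError("this sketch assumes a non-negative multiplier")
    partial_products = [
        (digit * multiplicand) << position  # -X, or X, shifted into place
        for position, digit in enumerate(csd_digits_lsb_first(multiplier))
        if digit != 0
    ]
    return sum(partial_products) + addend, len(partial_products)


result, count = multiply_accumulate(123, 5486, addend=1000)
print(result == 123 * 5486 + 1000, count, bin(5486).count("1"))
```

For the multiplier 5486 the signed-digit form has 6 non-zero digits, whereas its plain binary form has 8 ones, so fewer partial products need to be accumulated.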
Fig. 3 is a detailed structural diagram of a data processor provided by another embodiment. The first correction encoding sub-circuit 111 in the data processor includes a first correction encoding processing branch 1111 and a first partial product selection branch 1112; the output of the first correction encoding processing branch 1111 is connected to the input of the first partial product selection branch 1112.
The first correction encoding processing branch 1111 is configured to perform canonical signed-digit encoding on the received first data to obtain the first target code. The first partial product selection branch 1112 is configured to obtain the sign-extended first partial products according to the first target code, select among the sign-extended first partial products, receive the sign-extended second partial products output by the partial product exchange circuit 13, and use the received sign-extended second partial products together with the selected sign-extended first partial products as the first partial products of the target code.
Specifically, the first correction encoding sub-circuit 111 may perform canonical signed-digit encoding on the multiplier in the received first data to obtain the first target code, and obtain the sign-extended first partial products from the multiplicand in the first data and the first target code. Optionally, the bit width of the first target code may be equal to the bit width of the multiplier plus 1, and the bit width of each sign-extended first partial product may be equal to twice the bit width of the multiplicand currently processed by the data processor. Optionally, the number of sign-extended first partial products may be equal to the number of first partial products of the target code, and may also be equal to the bit width of the first target code.
Illustratively, suppose the data processor receives two data that are each 16 bits wide. If the data processor currently handles 8-bit × 8-bit multiplication, the first correction encoding sub-circuit 111 in the data processor may divide the 16-bit data into a high 8-bit group and a low 8-bit group and process the two groups separately. In this case the bit width of each sign-extended first partial product may be equal to 16; processing the high 8-bit data yields 9 sign-extended first high-order partial products, and processing the low 8-bit data yields 9 sign-extended first low-order partial products. If the data processor currently handles 16-bit × 16-bit multiplication, the first correction encoding sub-circuit 111 may process the two complete 16-bit data. In this case the bit width of each sign-extended first partial product may be equal to 32, and 18 sign-extended first partial products are obtained: the sign-extended partial products corresponding to the high 9 digits of the first target code may be called the sign-extended first high-order partial products, and those corresponding to the low 9 digits of the first target code may be called the sign-extended first low-order partial products.
Optionally, the second correction encoding sub-circuit 121 includes a second correction encoding processing branch 1211 and a second partial product selection branch 1212; the output of the second correction encoding processing branch 1211 is connected to the input of the second partial product selection branch 1212. The second correction encoding processing branch 1211 is configured to perform canonical signed-digit encoding on the received second data to obtain the second target code. The second partial product selection branch 1212 is configured to obtain the sign-extended second partial products according to the second target code, select among the sign-extended second partial products, receive the sign-extended first partial products output by the partial product exchange circuit 13, and use the received sign-extended first partial products together with the selected sign-extended second partial products as the second partial products of the target code.
It should be noted that when the data processor handles a multiply-accumulate operation on 2N-bit × N-bit data, the partial product exchange circuit 13 in the data processor may, as actually required, exchange the sign-extended first low-order partial products or the sign-extended first high-order partial products obtained by the first correction encoding processing branch 1111 with the sign-extended second low-order partial products or the sign-extended second high-order partial products obtained by the second correction encoding sub-circuit 121. Optionally, after the partial product exchange circuit 13 completes the exchange, the first correction encoding processing branch 1111 may combine its own sign-extended first partial products that were not exchanged with the received sign-extended second partial products to form the first partial products of the target code; the second correction encoding processing branch 1211 may combine its own sign-extended second partial products that were not exchanged with the received sign-extended first partial products to form the second partial products of the target code.
In this embodiment, the method by which the first correction encoding processing branch 1111 processes data is essentially the same as that of the second correction encoding processing branch 1211; the method by which the second correction encoding processing branch 1211 processes data is therefore not described again.
In the data processor provided in this embodiment, the first correction encoding processing branch performs canonical signed-digit encoding on the received first data to obtain the sign-extended first partial products; according to the data mode currently handled by the data processor, the first partial product selection branch selects among the sign-extended first partial products to obtain the first partial products of the target code; and the first correction compression sub-circuit accumulates the first partial products of the target code to obtain the target operation result. The data processor does not need to perform a separate accumulation on the multiplication result to complete a multiply-accumulate operation; a single operation pass directly implements either the multiply-accumulate or the multiplication operation, which reduces the power consumption of the data processor. Because the data processor also applies canonical signed-digit encoding to the received data, fewer valid partial products are produced, which reduces the complexity of implementing multiplication or multiply-accumulate operations.
In one embodiment, the first correction encoding processing branch 1111 in the data processor includes a first correction encoding unit 1111a, a low-order partial product acquisition unit 1111b, a low-order selector group unit 1111c, a high-order partial product acquisition unit 1111d and a high-order selector group unit 1111e. The first output of the first correction encoding unit 1111a is connected to the first input of the low-order partial product acquisition unit 1111b; the output of the low-order selector group unit 1111c is connected to the second input of the low-order partial product acquisition unit 1111b; the second output of the first correction encoding unit 1111a is connected to the first input of the high-order partial product acquisition unit 1111d; and the output of the high-order selector group unit 1111e is connected to the second input of the high-order partial product acquisition unit 1111d.
The first correction encoding unit 1111a is configured to perform canonical signed-digit encoding on the received first data, determine from the received function mode selection signal the bit width of the data that the data processor can handle, and obtain the first target code according to that bit width. The low-order partial product acquisition unit 1111b is configured to obtain the sign-extended first low-order partial products from the received first low-order target code of the first target code and the first data. The low-order selector group unit 1111c is configured to gate values in the sign-extended first low-order partial products. The high-order partial product acquisition unit 1111d is configured to obtain the sign-extended first high-order partial products from the received first high-order target code of the first target code and the first data. The high-order selector group unit 1111e is configured to gate values in the sign-extended first high-order partial products.
Specifically, the first correction encoding processing branch 1111 may receive the multiplier in the first data and perform canonical signed-digit encoding on it to obtain the first target code. The low-order partial product acquisition unit 1111b may obtain the sign-extended low-order partial products from the multiplicand in the received first data and the first target code obtained by the first correction encoding unit 1111a; the high-order partial product acquisition unit 1111d may obtain the sign-extended high-order partial products from the multiplicand in the received first data and the first target code obtained by the first correction encoding unit 1111a. The first data may include the multiplier and the multiplicand of a multiplication or multiply-accumulate operation. If the data processor currently handles N-bit data and the two data received by the first correction encoding unit 1111a are each 2N bits wide, the first correction encoding unit 1111a may automatically split the received 2N-bit data into high N-bit data and low N-bit data and perform canonical signed-digit encoding on each. The bit width of the resulting first high-order target code is N+1, and the bit width of the resulting first low-order target code is also N+1; the number of first high-order partial products of the target code and the number of first low-order partial products of the target code may each be equal to (N+1). If the data processor currently handles 2N-bit data and the two data received by the first correction encoding processing branch 1111 are each 2N bits wide, the first correction encoding processing branch 1111 may perform canonical signed-digit encoding on the received 2N-bit data to obtain a (2N+1)-bit intermediate code, pad the intermediate code to obtain (2N+2)-bit data, and use the (2N+2)-bit data as the first target code, where padding means placing one digit of value 0 above the most significant digit of the data. In this case the high (N+1) bits of the first target code may be called the first high-order target code, and the low (N+1) bits of the first target code may be called the first low-order target code. Optionally, the most significant digit of the first target code is the value 0 obtained by padding, and all values contained in the corresponding partial product of the target code may be 0.
It should be noted that, according to the received function mode selection signal, the low-order selector group unit 1111c may gate some digits of the sign-extended first low-order partial products as either the values in the sign-extended first low-order partial products obtained for N-bit multiplication or the values in the sign-extended first low-order partial products obtained for 2N-bit multiplication. Similarly, according to the received function mode selection signal, the high-order selector group unit 1111e may gate some digits of the sign-extended first high-order partial products as either the values in the sign-extended first high-order partial products obtained for N-bit multiplication or the values in the sign-extended first high-order partial products obtained for 2N-bit multiplication.
It can be understood that if the data received by the data processor is 2N bits wide and the data processor currently handles 2N-bit data operations, the low-order partial product acquisition unit 1111b may obtain the corresponding sign-extended low-order partial product from each digit of the first low-order target code; the low-order selector group unit 1111c may gate values in the sign-extended first low-order partial products; and the sign-extended low-order partial product is then combined with the gated values to obtain the sign-extended first low-order partial product. Optionally, the high-order partial product acquisition unit 1111d may obtain the corresponding sign-extended high-order partial product from each digit of the first high-order target code; the high-order selector group unit 1111e may gate values in the sign-extended first high-order partial products; and the sign-extended high-order partial product is then combined with the gated values to obtain the sign-extended first high-order partial product. Optionally, in canonical signed-digit encoding, the bit width of the first low-order target code may be equal to the bit width of the first high-order target code, and may also be equal to the number of sign-extended first low-order partial products corresponding to the low N-bit data or the number of sign-extended first high-order partial products corresponding to the high N-bit data. Optionally, the first correction encoding processing branch 1111 may include (N+1) low-order partial product acquisition units 1111b and may also include (N+1) high-order partial product acquisition units 1111d. Optionally, each low-order partial product acquisition unit 1111b may include 4N value generation subunits, and each high-order partial product acquisition unit 1111d may also include 4N value generation subunits; each value generation subunit can produce one digit of a sign-extended first low-order partial product. Meanwhile, the low-order partial product acquisition unit 1111b may determine the first low-order partial products of the target code from the obtained sign-extended first low-order partial products, and the high-order partial product acquisition unit 1111d may determine the first high-order partial products of the target code from the obtained sign-extended first high-order partial products.
In addition, the second correction encoding processing branch 1211 implements canonical signed-digit encoding in the same way as the first correction encoding processing branch 1111, and the two branches have the same internal structure and the same external output port functions; the method and structure by which the second correction encoding processing branch 1211 processes data are therefore not described again in this embodiment.
A kind of data processor provided in this embodiment, data processor pass through the in the first amendment coded treatment branch
One amendment coding unit carries out canonical signed number coded treatment to the data that receive, obtains the first low level target code and the
One high-order target code, and low portion product acquiring unit obtained according to the first low level target code it is low after symbol Bits Expanding
Bit position product, high-order portion product acquiring unit obtain the high-order portion product after symbol Bits Expanding according to the first high-order target code,
And then it determines the need for handing over the low portion product after symbol Bits Expanding and the high-order portion product after symbol Bits Expanding
Processing is changed, to obtain the partial product of target code, and accumulation process is carried out to the partial product of target code, obtains target operation knot
Fruit;The data processor can not only realize multiplying, additionally it is possible to which realization multiplies accumulating operation, to improve data processor
Versatility;Meanwhile the data processor can also carry out canonical signed number coded treatment to the data received, obtain
The number of live part product is less, realizes multiplying to reduce data processor or multiplies accumulating the complexity of operation.
In one embodiment, the first correction encoding unit 1111a in the data processor includes a first data input port 1111aa, a first mode selection signal input port 1111ab, a low-order target code output port 1111ac and a high-order target code output port 1111ad. The first data input port 1111aa is configured to receive the first data; the first mode selection signal input port 1111ab is configured to receive the function mode selection signal; the low-order target code output port 1111ac is configured to output the first low-order target code obtained after canonical signed-digit encoding of the first data; and the high-order target code output port 1111ad is configured to output the first high-order target code obtained after canonical signed-digit encoding of the first data.
Specifically, during multiplication the first correction encoding unit 1111a in the data processor may receive the first data through the first data input port 1111aa and the function mode selection signal through the first mode selection signal input port 1111ab, perform canonical signed-digit encoding on the multiplier in the first data to obtain the intermediate code, and determine from the received function mode selection signal whether the intermediate code needs padding, thereby obtaining the first target code. It then outputs the first low-order target code of the first target code through the low-order target code output port 1111ac and the first high-order target code of the first target code through the high-order target code output port 1111ad.
The data processor provided in this embodiment can perform canonical signed-digit encoding on the received data to reduce the number of valid partial products obtained during multiplication, which reduces the complexity of implementing multiplication, improves the operation efficiency of multiplication, and effectively reduces the power consumption of the data processor.
In one embodiment, the low-order partial product acquisition unit 1111b in the data processor includes a low-order target code input port 1111ba, a gated value input port 1111bb, a first data input port 1111bc and a low-order partial product output port 1111bd. The low-order target code input port 1111ba is configured to receive the first low-order target code output by the first correction encoding unit 1111a; the gated value input port 1111bb is configured to receive the values in the sign-extended first low-order partial products obtained after gating by the low-order selector group unit 1111c; the first data input port 1111bc is configured to receive the first data; and the low-order partial product output port 1111bd is configured to output the sign-extended first low-order partial products.
Specifically, the low-order partial product acquisition unit 1111b in the data processor may receive, through the low-order target code input port 1111ba, the first low-order target code output by the first correction encoding unit 1111a, and may receive, through the first data input port 1111bc, the multiplicand in the first data. Optionally, the low-order partial product acquisition unit 1111b may obtain the corresponding sign-extended first low-order partial products from the received first low-order target code and the received multiplicand of the multiplication or multiply-accumulate operation. Optionally, if the multiplicand received by the first data input port 1111bc of the low-order partial product acquisition unit 1111b is N bits wide, the bit width of each sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 1111b may be equal to 2N. Illustratively, if the low-order partial product acquisition unit 1111b receives an N-bit multiplicand X, it may obtain the corresponding original partial product from the multiplicand X and the three values -1, 1 and 0 contained in the first low-order target code, and obtain the sign-extended low-order partial product from the original partial product: the low (N+1) bits of the sign-extended low-order partial product may be equal to all the values contained in the original partial product, and the high (N-1) bits may be equal to the sign-bit value (i.e. the most significant bit value) of the original partial product. When the value in the first low-order target code is -1, the original partial product may be -X; when the value in the first low-order target code is 1, the original partial product may be X; and when the value in the first low-order target code is 0, the original partial product may be 0.
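The sign-extension rule in this paragraph can be sketched in a few lines. The sketch assumes the multiplicand X is a signed N-bit two's-complement value and that the (N+1)-bit original partial product is also held in two's complement; neither assumption is stated explicitly in the text, and the function name `sign_extended_partial_product` is illustrative.

```python
def sign_extended_partial_product(x, digit, n):
    """Return the 2N-bit sign-extended partial product for one code digit.

    x:     multiplicand, assumed signed with -2**(n-1) <= x < 2**(n-1)
    digit: one digit of the target code, -1, 0 or 1
    n:     multiplicand bit width N
    """
    if digit not in (-1, 0, 1):
        raise ValueError("target code digits are -1, 0 or 1")
    original = digit * x                        # -X, 0 or X
    low_bits = original & ((1 << (n + 1)) - 1)  # (N+1)-bit original partial product
    sign = (low_bits >> n) & 1                  # its most significant (sign) bit
    high_bits = ((1 << (n - 1)) - 1) if sign else 0  # (N-1) copies of the sign bit
    return (high_bits << (n + 1)) | low_bits


print(hex(sign_extended_partial_product(5, -1, 8)))     # 0xfffb, i.e. -5 in 16 bits
print(hex(sign_extended_partial_product(-128, -1, 8)))  # 0x80, i.e. +128
```

The second call shows why the original partial product needs N+1 bits: negating the most negative N-bit multiplicand gives a value that does not fit in N bits.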
It should be noted that, through the gated value input port 1111bb, the low-order partial product acquisition unit 1111b may receive the corresponding digit values of the sign-extended first low-order partial product that the low-order selector group unit 1111c gates for data operations in different modes; it then combines the sign-extended low-order partial product it currently obtains with the gated digit values to obtain the sign-extended first low-order partial product.
Optionally, the high-order partial product acquisition unit 1111d in the data processor includes a high-order target code input port 1111da, a gated value input port 1111db, a data input port 1111dc and a high-order partial product output port 1111dd. The high-order target code input port 1111da is configured to receive the first high-order target code output by the first correction encoding unit 1111a; the gated value input port 1111db is configured to receive the values in the sign-extended first high-order partial products output after gating by the high-order selector group unit 1111e; the data input port 1111dc is configured to receive the first data; and the high-order partial product output port 1111dd is configured to output the sign-extended first high-order partial products.
It can be understood that the method by which the high-order partial product acquisition unit 1111d obtains the sign-extended first high-order partial products is the same as the method by which the low-order partial product acquisition unit 1111b obtains the sign-extended first low-order partial products, so it is not described again in this embodiment. In addition, the internal circuit structure of the low-order partial product acquisition unit 1111b and of the high-order partial product acquisition unit 1111d may be the same, and the functions of their external output ports may also be the same; the specific structure of the high-order partial product acquisition unit 1111d is therefore not described again in this embodiment.
In the data processor provided in this embodiment, the low-order partial product acquisition unit may obtain the sign-extended low-order partial product from the first low-order target code and then combine it with the values gated by the low-order selector group unit to obtain the sign-extended first low-order partial product. The data processor then determines whether the sign-extended first low-order partial products and the sign-extended first high-order partial products need to be exchanged, so as to obtain the partial products of the target code, and accumulates the partial products of the target code to obtain data operation results for the different modes. The data processor can handle data operations in different modes, which improves its versatility. In addition, after the data processor applies canonical signed-digit encoding to the received data, fewer valid partial products are produced, which reduces the complexity of implementing multiplication.
In one embodiment, the low-order selector group unit 1111c in the data processor includes low-order selectors 1111ca; the multiple low-order selectors 1111ca are configured to gate values in the sign-extended first low-order partial products.
Specifically, the number of low-order selectors 1111ca in the low-order selector group unit 1111c may be equal to 3N*(N+1), where 2N may denote the bit width of the data currently processed by the data processor; the internal circuit structure of every low-order selector 1111ca in the low-order selector group unit 1111c may be the same. Optionally, in a multiplication or multiply-accumulate operation, each of the (N+1) low-order partial product acquisition units 1111b connected to the first correction encoding unit 1111a may include 4N value generation subunits, of which 2N value generation subunits may be connected to 2N low-order selectors 1111ca, each of these 2N value generation subunits being connected to one low-order selector 1111ca. Optionally, the 2N value generation subunits corresponding to these 2N low-order selectors 1111ca may be the value generation subunits corresponding to the high 2N bits of the sign-extended first low-order partial product; in addition to the function mode selection signal input port (mode), each of these 2N low-order selectors 1111ca has two other external input ports. Optionally, if the data processor can handle data operations in four different modes and the data received by the data processor is 2N bits wide, the signals received by the two other input ports of each low-order selector 1111ca may be, respectively, the value 0 and the sign-bit value of the corresponding sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 1111b when the data processor performs a 2N-bit data operation. The (N+1) low-order partial product acquisition units 1111b may be connected to (N+1) groups of 2N low-order selectors 1111ca. The sign-bit values received by different groups may be the same or different, but the 2N low-order selectors 1111ca within the same group receive the same sign-bit value, and that sign-bit value may be obtained from the sign-bit value of the sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 1111b connected to that group.
In addition, among the 4N value generation subunits included in each low-order partial product acquisition unit 1111b, N value generation subunits may be connected to no low-order selector 1111ca. The values obtained by these N value generation subunits may be the corresponding digit values of the corresponding sign-extended first low-order partial product, obtained from the first low-order target code for the data of different bit widths currently handled by the data processor. It can also be understood that the values obtained by these N value generation subunits correspond to the 1st through N-th digits of the corresponding sign-extended first low-order partial product, counting from the least significant digit (i.e. the 1st digit) toward the most significant digit.
It should be noted that, of the 4N value generation subunits in each of the above low-order partial product acquisition units 1111b, the remaining N value generation subunits can also be connected to N low-order selectors 1111ca, with each value generation subunit connected to one low-order selector 1111ca. Besides the function selection mode signal input port (mode), each of these N low-order selectors 1111ca has two other external input ports. The signals these two ports can receive are, respectively, the sign bit value in the sign-extended first low-order partial product obtained when the data processor performs a 2N-bit data operation, and the corresponding bit value in the sign-extended low-order partial product obtained when the data processor performs a 2N-bit data operation. The (N+1) low-order partial product acquisition units 1111b can be connected to (N+1) groups of N low-order selectors 1111ca. The sign bit values received by different groups may be the same or different, but the N low-order selectors 1111ca within one group all receive the same sign bit value, which is taken from the sign bit value in the sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 1111b connected to that group.
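The behaviour of one group of N low-order selectors 1111ca can be pictured as a row of two-input multiplexers. The sketch below is only illustrative: the text does not say which mode value selects which input, so the `use_sign_bit` flag and both function names are hypothetical.

```python
def low_selector(use_sign_bit, sign_bit, bit_value):
    """2:1 multiplexer: forward either the sign bit value or the corresponding bit value."""
    return sign_bit if use_sign_bit else bit_value


def gate_group(use_sign_bit, sign_bit, bit_values):
    """One group of selectors: every selector in the group sees the same sign bit value,
    while the corresponding bit values may differ from selector to selector."""
    return [low_selector(use_sign_bit, sign_bit, b) for b in bit_values]


# Hypothetical usage: the same group either replicates the sign bit or passes the bits through.
extended = gate_group(True, 1, [0, 1, 0, 0])
passed = gate_group(False, 1, [0, 1, 0, 0])
```

Modelling the mode signal as a single boolean keeps the sketch small; the actual function selection mode signal distinguishes four modes.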
In addition, the corresponding bit values in the sign-extended first low-order partial product received by each group of N low-order selectors 1111ca can be determined from the corresponding bit values in the sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 1111b connected to that group; within each group of N low-order selectors 1111ca, the corresponding bit values received by the individual low-order selectors 1111ca may be the same or different. The positions of the 4N value generation subunits in each low-order partial product acquisition unit 1111b are those of the 4N value generation subunits in the previous low-order partial product acquisition unit 1111b, shifted left by one value generation subunit. Optionally, among the first low-order partial products of all target codes that take part in the subsequent operation, only the first low-order partial product of the first target code can have a bit width equal to 4N, the bit width of the first sign-extended first low-order partial product; the bit width of the first low-order partial product of each remaining target code can be one less than that of the first partial product of the previous target code, and the bit width of the first high-order partial product of the last target code can be equal to (2N-1).
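The shift rule above can be made concrete with a small layout calculation. This is a sketch that follows only the stated one-position rule (each row starts one position further left and is one value narrower than the previous row); the function name and the choice of N = 8 with 9 rows are illustrative assumptions.

```python
def partial_product_layout(n, rows):
    """Return (lowest_column, width) for each partial product row.

    Row 0 is 4*n values wide and starts at column 0; every later row starts
    one column further left and is one value narrower than the row before it.
    """
    first_width = 4 * n
    return [(k, first_width - k) for k in range(rows)]


layout = partial_product_layout(n=8, rows=9)
# Highest occupied column of every row: the rule keeps them all in one column.
top_columns = {low + width - 1 for low, width in layout}
```

With this rule every row ends in the same highest column, which is the property the selectors rely on when they replicate sign bits.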
Optionally, the high-order selector group unit 1111e includes high-order selectors 1111ea; the multiple high-order selectors 1111ea are used to gate the values in the sign-extended first high-order partial product.
It should be noted that the way the high-order selectors 1111ea gate values can be described as follows.
Optionally, the number of high-order selectors 1111ea in the above high-order selector group unit 1111e can be equal to 3N*(N+1), where 2N can denote the bit width of the data the data processor is currently processing, and the internal circuit structure of every high-order selector 1111ea in the high-order selector group unit 1111e can be the same. Optionally, in a multiplication or multiply-accumulate operation, the first modified encoding unit 1111a can be connected to (N+1) high-order partial product acquisition units. Each high-order partial product acquisition unit may include 4N value generation subunits, of which 2N value generation subunits can be connected to 2N high-order selectors 1111ea, with each value generation subunit connected to one high-order selector 1111ea. Optionally, the 2N value generation subunits that correspond to these 2N high-order selectors 1111ea can be the value generation subunits for the low 2N bit values of the high-order partial product of the target code. Besides the function selection mode signal input port (mode), each of these 2N high-order selectors 1111ea has two other external input ports. Optionally, if the data processor can handle data operations in four different modes and the bit width of the data it receives is 2N, the signals received by the two other input ports of the above high-order selector 1111ea can be, respectively, the value 0 and the corresponding bit value in the sign-extended partial product obtained by the high-order partial product acquisition unit when the data processor performs a 2N-bit-wide data operation. The (N+1) high-order partial product acquisition units can be connected to (N+1) groups of 2N high-order selectors 1111ea, and the corresponding bit values received by the 2N high-order selectors 1111ea of each group may be the same or different.
In addition, of the 4N value generation subunits in each high-order partial product acquisition unit, the corresponding N value generation subunits can be connected to N high-order selectors 1111ea, with each value generation subunit connected to one high-order selector 1111ea. These N high-order selectors 1111ea can have the same internal circuit structure as the selectors described above, and besides the function selection mode signal input port (mode), each of them has two other external input ports. The signals received by these two ports can be, respectively, the sign bit value in the sign-extended partial product obtained when the data processor performs a 2N-bit data operation, and the sign bit value in the sign-extended partial product obtained when the data processor performs a 2N-bit data operation. The (N+1) high-order partial product acquisition units can be connected to (N+1) groups of N high-order selectors 1111ea. The sign bit values received by different groups may be the same or different, but the N high-order selectors 1111ea within one group all receive the same sign bit value, which is taken from the sign bit value in the sign-extended partial product obtained by the high-order partial product acquisition unit connected to that group. In addition, the corresponding bit values in the sign-extended partial product received by each group of N high-order selectors 1111ea can be determined from the sign bit value in the sign-extended partial product obtained by the high-order partial product acquisition unit connected to that group; within each group of N high-order selectors 1111ea, the corresponding bit values received by the individual high-order selectors 1111ea may be the same or different.
It should be noted that, of the 4N value generation subunits in each high-order partial product acquisition unit, the remaining N value generation subunits may not be connected to any high-order selector 1111ea. In that case, the values obtained by these N value generation subunits can be the corresponding bit values in the sign-extended partial product derived from the values in the high-order target code for the data of different bit widths that the data processor is currently processing. Equivalently, the values obtained by these N value generation subunits are all values from the (2N+1)th bit to the 3Nth bit of the sign-extended high-order partial product, counting from the lowest bit (the 1st bit) toward the highest bit. The positions of the 4N value generation subunits in each high-order partial product acquisition unit are those of the 4N value generation subunits in the previous high-order partial product acquisition unit, shifted left by one value generation subunit. Optionally, among the high-order partial products of all target codes that take part in the subsequent operation, only the high-order partial product of the first target code can have a bit width equal to 4N; the bit width of the high-order partial product of each remaining target code can be one less than that of the high-order partial product of the previous target code, and the bit width of the high-order partial product of the last target code can be equal to (2N-1).
In the data processor provided in this embodiment, the low-order selector group unit can gate the values in the low-order partial product to obtain the sign-extended first low-order partial product. The first partial product of the target code is then obtained from the sign-extended first low-order partial product, and a compression circuit accumulates the first partial products of the target codes to obtain the target operation result for each mode. The data processor can thus perform data operations in different modes, which improves its versatility.
In one embodiment, the data processor includes a first partial product selection branch 1112, which includes: a function selection mode signal input port (mode) 1112a, a first partial product input port 1112b, a second partial product input port 1112c, a first partial product output port 1112d and a gated partial product output port 1112e. The function selection mode signal input port (mode) 1112a is used to receive the function selection mode signal; the first partial product input port 1112b is used to receive the sign-extended first partial product input by the first modified encoding sub-circuit 111; the second partial product input port 1112c is used to receive the sign-extended second partial product exchanged by the partial product exchange circuit 13; the first partial product output port 1112d is used to output the sign-extended first partial product that the partial product exchange circuit 13 needs to exchange; and the gated partial product output port 1112e is used to output the gated sign-extended first partial product together with the received sign-extended second partial product.
Specifically, if the data processor is currently processing a multiply-accumulate operation on 2N-bit * N-bit data, the partial product exchange circuit 13 in the data processor can exchange the sign-extended second low-order partial product with the sign-extended first low-order partial product, or exchange the sign-extended second high-order partial product with the sign-extended first high-order partial product. In this case, the first partial product selection branch 1112 can receive, through the second partial product input port 1112c, the sign-extended second partial product exchanged by the partial product exchange circuit 13, and can output the sign-extended first partial product that needs to be exchanged to the partial product exchange circuit 13 through the first partial product output port 1112d. The gated partial product output port 1112e of the first partial product selection branch 1112 can output the sign-extended first partial product that does not need to be exchanged and the received sign-extended second partial product. Meanwhile, the first partial product selection branch 1112 takes the sign-extended first partial product that does not need to be exchanged, and/or the received sign-extended second partial product, as the first partial product of the target code and inputs it to the first modified compression sub-circuit 112 for compression.
In the data processor provided in this embodiment, the first partial product selection branch can select the sign-extended first partial product to obtain the first partial product of the target code. The data processor can therefore perform not only multiplication and multiply-accumulate operations on data of the same bit width but also multiply-accumulate operations on data of different bit widths, which improves its versatility.
In one embodiment, the data processor includes a first modified compression sub-circuit 112, which includes a modified Wallace tree group unit 1121 and an accumulation unit 1122; the output of the modified Wallace tree group unit 1121 is connected to the input of the accumulation unit 1122. The modified Wallace tree group unit 1121 is used, when data operations in different modes are processed, to accumulate each column of values in the obtained first partial products of the target codes and produce an accumulation result; the accumulation unit 1122 is used to perform an addition on that accumulation result.
Specifically, the above modified Wallace tree group unit 1121 can accumulate each column of values in the first partial products of the target codes obtained by the first modified encoding sub-circuit 111, and the accumulation unit 1122 adds the two operation results obtained by the modified Wallace tree group unit 1121 to obtain the target operation result. When the modified Wallace tree group unit 1121 performs the accumulation, the first partial products of all target codes are distributed as follows: the lowest-order value in the first partial product of the target code in each row lies one position to the right of the lowest-order value in the first partial product of the target code in the next row, while the highest-order value in the first partial product of every target code lies in the same column as the highest-order value in the first partial product of the first target code. Optionally, the modified Wallace tree group unit 1121 can accumulate each column of values in the first partial products of all target codes according to this distribution. Optionally, the two operation results obtained by the above modified Wallace tree group unit 1121 may include a sum output signal Sum and a carry output signal Carry.
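The idea of reducing many partial product rows to a Sum word and a Carry word, and only then performing one real addition, can be sketched at word level with 3:2 carry-save compression. This is a generic carry-save illustration, not the column-wise subunit structure of unit 1121; the function name and the 16-bit example are assumptions.

```python
def carry_save_reduce(rows, width):
    """Reduce a list of integer rows to two words (sum_word, carry_word).

    Each step replaces three rows by two using a bitwise 3:2 compressor:
    the XOR gives the per-bit sum, the majority shifted left gives the carries.
    The invariant a + b + c == s + carry holds modulo 2**width.
    """
    mask = (1 << width) - 1
    rows = [r & mask for r in rows]
    while len(rows) > 2:
        a, b, c = rows[:3]
        s = a ^ b ^ c
        carry = (((a & b) | (a & c) | (b & c)) << 1) & mask
        rows = rows[3:] + [s, carry]
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1]


# Four shifted copies of 13 are the partial products of 13 * 15.
sum_word, carry_word = carry_save_reduce([13 << k for k in range(4)], 16)
product = (sum_word + carry_word) & 0xFFFF  # the single final addition
```

No carry is propagated across the word until the last line, which is why the final adder is the only place where a fast carry scheme matters.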
For example, if the data processor is currently processing a 16-bit * 16-bit fixed-point multiplication, the distribution of the first partial products of the 9 target codes obtained through the first partial product selection branch 1112 is as shown in Fig. 4a, where hollow circles denote the bit values in a partial product and solid circles denote the sign extension bit values in a partial product.
If the data processor has the circuit structure shown in Fig. 3 and is currently processing a 16-bit * 8-bit fixed-point multiply-accumulate operation, the distribution of the first partial products of the target codes received by the first modified compression sub-circuit 112 or the second modified compression sub-circuit 122 is as shown in Fig. 4b. Here, hollow circles denote the partial products obtained by the first partial product selection branch 1112 or the second partial product selection branch 1212; crossed hollow circles denote either the sign-extended second partial products of the second partial product selection branch 1212 that the first partial product selection branch 1112 obtains through the partial product exchange circuit 13, or the sign-extended first partial products of the first partial product selection branch 1112 that the second partial product selection branch 1212 obtains through the partial product exchange circuit 13.
In addition, the second modified compression sub-circuit 122 processes data in the same way as the first modified compression sub-circuit 112, and the two sub-circuits have the same internal structure and the same external output port functions; the method and structure by which the second modified compression sub-circuit 122 processes data are therefore not described again in this embodiment.
In the data processor provided in this embodiment, the first modified compression sub-circuit accumulates the first partial products of the target codes, and the accumulation unit then adds the accumulation results to obtain the target operation result. The data processor can perform data operations in different modes, which improves its versatility and effectively reduces the area it occupies on an AI chip.
In one embodiment, the data processor includes a modified Wallace tree group unit 1121, which includes low-order Wallace tree subunits 1121a, a selector 1121b and high-order Wallace tree subunits 1121c. The output of the low-order Wallace tree subunit 1121a is connected to the input of the selector 1121b, and the output of the selector 1121b is connected to the input of the high-order Wallace tree subunit 1121c. The multiple low-order Wallace tree subunits 1121a are used to accumulate each column of values in the first partial products of the target codes to obtain the accumulation result; the selector 1121b is used to gate the carry input signal received by the high-order Wallace tree subunit 1121c; and the multiple high-order Wallace tree subunits 1121c are used to accumulate each column of values in the first partial products of the target codes to obtain the accumulation result.
Specifically, the circuit of each low-order Wallace tree subunit 1121a can be built from a combination of full adders and half adders, or from a combination of 4-2 compressors; the same holds for each high-order Wallace tree subunit 1121c. A low-order Wallace tree subunit 1121a or high-order Wallace tree subunit 1121c can be understood as a circuit that takes a multi-bit input signal and adds the input bits to produce two output signals. Optionally, the number of high-order Wallace tree subunits 1121c in the modified Wallace tree group unit 1121 can be equal to the bit width N of the multiplicand in the multiplication or multiply-accumulate operation that the data processor can currently process, and can also be equal to the number of low-order Wallace tree subunits 1121a. Adjacent low-order Wallace tree subunits 1121a can be connected in series, and adjacent high-order Wallace tree subunits 1121c can likewise be connected in series. Optionally, the output of the last low-order Wallace tree subunit 1121a is connected to the input of the selector 1121b, and the output of the selector 1121b is connected to the input of the first high-order Wallace tree subunit 1121c. Optionally, in the modified Wallace tree group unit 1121, each low-order Wallace tree subunit 1121a can add the values in the corresponding column of the first partial products of all target codes and can output two signals, a carry signal $Carry_i$ and a sum signal $Sum_i$, where i is the number of the low-order Wallace tree subunit 1121a and the first low-order Wallace tree subunit 1121a is numbered 1. Optionally, the number of input signals received by each low-order Wallace tree subunit 1121a can be equal to the number of first partial products of the target codes. In the modified Wallace tree group unit 1121, the total number of high-order Wallace tree subunits 1121c and low-order Wallace tree subunits 1121a can be equal to 2N. The first partial products of all target codes can span 2N columns in total from the lowest column to the highest column: the N low-order Wallace tree subunits 1121a can accumulate each column of values in the low N columns of the first partial products of all target codes, and the N high-order Wallace tree subunits 1121c can accumulate each column of values in the high N columns.
For example, if the data processor currently needs to process a multiplication of 2N-bit * 2N-bit data, the selector 1121b in the data processor can gate the carry output signal $Cout_N$ output by the last low-order Wallace tree subunit 1121a in the modified Wallace tree group unit 1121 as the carry input signal $Cin_{N+1}$ received by the first high-order Wallace tree subunit 1121c. If the data processor currently needs to process a multiplication of N-bit * N-bit data, the selector 1121b in the data processor can instead gate the value 0 as the carry input signal $Cin_{N+1}$ received by the first high-order Wallace tree subunit 1121c in the modified Wallace tree group unit 1121. This can also be understood as the data processor splitting the 2N-bit data it currently receives into high N-bit data and low N-bit data and multiplying each separately. In the modified Wallace tree group unit 1121, the numbers i of the low-order Wallace tree subunits 1121a, from the first to the last, can be written as 1, 2, ..., N; the numbers i of the high-order Wallace tree subunits 1121c, from the first to the last, can be written as N+1, N+2, ..., 2N.
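The effect of the selector 1121b on the carry path can be illustrated with a simplified analogy: a 2N-bit addition split into two N-bit halves, where the carry from the low half is either forwarded or replaced by 0. This is a sketch using a plain adder rather than Wallace tree columns (whose carries can be several bits wide); the function name and the `full_width_mode` flag are hypothetical.

```python
def add_with_carry_gate(a, b, n, full_width_mode):
    """Add two 2n-bit words half by half, gating the carry between the halves.

    full_width_mode=True  : the low half's carry-out feeds the high half (one 2n-bit lane).
    full_width_mode=False : the high half receives 0 instead (two independent n-bit lanes).
    """
    mask = (1 << n) - 1
    low = (a & mask) + (b & mask)
    carry_out_low = low >> n                                   # plays the role of Cout_N
    carry_in_high = carry_out_low if full_width_mode else 0    # the selector
    high = (a >> n) + (b >> n) + carry_in_high
    return ((high & mask) << n) | (low & mask)


one_lane = add_with_carry_gate(0x01FF, 0x0001, 8, True)
two_lanes = add_with_carry_gate(0x01FF, 0x0001, 8, False)
```

In the single-lane case the overflow of the low byte reaches the high byte; in the two-lane case it is discarded, so each half behaves as its own N-bit result.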
It should be noted that the signals handled by each low-order Wallace tree subunit 1121a and each high-order Wallace tree subunit 1121c in the modified Wallace tree group unit 1121 may include a carry input signal $Cin_i$, partial product value input signals and a carry output signal $Cout_i$. Optionally, the partial product value input signals received by each low-order Wallace tree subunit 1121a and each high-order Wallace tree subunit 1121c can be the values in the corresponding column of the first partial products of all target codes. The number of bits of the carry output signal $Cout_i$ of each low-order Wallace tree subunit 1121a and each high-order Wallace tree subunit 1121c can be $N_{Cout} = \mathrm{floor}((N_I + N_{Cin})/2) - 1$, where $N_I$ denotes the number of partial product value input signals of the low-order Wallace tree subunit 1121a or high-order Wallace tree subunit 1121c, $N_{Cin}$ denotes the number of its carry input signals, $N_{Cout}$ denotes the minimum number of its carry output signals, and floor denotes rounding down. Optionally, in the modified Wallace tree group unit 1121, the carry input signal received by each low-order Wallace tree subunit 1121a or high-order Wallace tree subunit 1121c can be the carry output signal of the previous low-order Wallace tree subunit 1121a or high-order Wallace tree subunit 1121c, and the carry input signal received by the first low-order Wallace tree subunit 1121a is the value 0. The carry input signal received by the first high-order Wallace tree subunit 1121c can be determined by the data bit width of the mode the data processor is currently processing together with the bit width of the multiplicand in the multiplication or multiply-accumulate operation currently being processed.
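The carry count formula can be evaluated directly. The sketch below only restates $N_{Cout} = \mathrm{floor}((N_I + N_{Cin})/2) - 1$ in code; the function name and the sample input counts are illustrative assumptions, not values taken from the text.

```python
def carry_out_count(n_inputs, n_carry_in):
    """Minimum number of carry output signals of one Wallace tree subunit.

    Implements N_Cout = floor((N_I + N_Cin) / 2) - 1 for non-negative integers,
    where integer division performs the rounding down.
    """
    return (n_inputs + n_carry_in) // 2 - 1


# Hypothetical column with 9 partial product inputs.
first_column = carry_out_count(9, 0)   # no carry inputs
later_column = carry_out_count(9, 3)   # 3 carry inputs from the previous subunit
```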
In the data processor provided in this embodiment, the modified Wallace tree group unit accumulates the partial products of the target codes to obtain two output signals, and the accumulation unit adds these two output signals to obtain the data operation result for each mode. The data processor can perform data operations in different modes, which improves its versatility and effectively reduces the area it occupies on an AI chip. In addition, the data processor does not need a separate accumulation pass over the multiplication result to complete a multiply-accumulate operation; a single operation pass directly performs the multiply-accumulate or multiplication, which reduces the power consumption of the data processor.
In one embodiment, the data processor includes an accumulation unit 1122, which includes an adder 1122a; the adder 1122a is used to perform an addition on the accumulation result.
Specifically, the adder 1122a can be an adder of various bit widths. Optionally, the adder 1122a can receive the two signals output by the modified Wallace tree group unit 1121, add them, and output the data operation result for the mode the data processor is currently processing. Optionally, the above adder 1122a can be a carry-lookahead adder whose data bit width can be equal to the bit width of the operation result output by the modified Wallace tree group unit 1121.
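A carry-lookahead adder derives every carry from per-bit generate and propagate signals. The sketch below is a behavioural model under that standard definition, not the circuit of adder 1122a; the loop evaluates the recurrence c[i+1] = g[i] | (p[i] & c[i]) sequentially, whereas hardware expands it into parallel logic. The function name is hypothetical.

```python
def carry_lookahead_add(a, b, width, carry_in=0):
    """Add two width-bit words using generate/propagate signals.

    g[i] = a_i AND b_i  (bit i generates a carry on its own)
    p[i] = a_i XOR b_i  (bit i propagates an incoming carry)
    Returns (result, carry_out).
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    carries = [carry_in]
    for i in range(width):
        carries.append(g[i] | (p[i] & carries[i]))
    result = 0
    for i in range(width):
        result |= (p[i] ^ carries[i]) << i
    return result, carries[width]


total, carry_out = carry_lookahead_add(0x00FF, 0x0001, 16)
```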
In the data processor provided in this embodiment, the accumulation unit adds the two signals output by the modified Wallace tree group unit and outputs the data operation result for each mode. The data processor does not need a separate accumulation pass over the multiplication result to complete a multiply-accumulate operation; a single operation pass directly performs the multiplication or multiply-accumulate operation, which reduces the power consumption of the data processor.
In one embodiment, the second partial product selection branch 1212 in the data processor includes: a function selection mode signal input port (mode) 1212a, a second partial product input port 1212b, a first partial product input port 1212c, a second partial product output port 1212d and a gated partial product output port 1212e. The function selection mode signal input port (mode) 1212a is used to receive the function selection mode signal; the second partial product input port 1212b is used to receive the sign-extended second partial product input by the second modified encoding sub-circuit 121; the first partial product input port 1212c is used to receive the sign-extended first partial product obtained after exchange by the partial product exchange circuit 13; the second partial product output port 1212d is used to output the sign-extended second partial product that the partial product exchange circuit 13 needs to exchange; and the gated partial product output port 1212e is used to output the gated sign-extended second partial product together with the received sign-extended first partial product.
Specifically, if the data processor is currently processing a multiply-accumulate operation on 2N-bit * N-bit data, the partial product exchange circuit 13 in the data processor can exchange the sign-extended second partial product with the sign-extended first partial product. The second partial product selection branch 1212 in the data processor can receive, through the first partial product input port 1212c, the sign-extended first partial product exchanged by the partial product exchange circuit 13, and can output the sign-extended second partial product that needs to be exchanged to the partial product exchange circuit 13 through the second partial product output port 1212d. The gated partial product output port 1212e can output the sign-extended second partial product that does not need to be exchanged and the received sign-extended first partial product. The second partial product selection branch 1212 then takes the sign-extended second partial product that does not need to be exchanged, and/or the received sign-extended first partial product, as the second partial product of the target code and inputs it to the second modified compression sub-circuit 122 for compression.
In the data processor provided in this embodiment, the second partial product selection branch can select the sign-extended partial product to obtain the partial product of the target code. The data processor can therefore perform not only multiplication and multiply-accumulate operations on data of the same bit width but also multiply-accumulate operations on data of different bit widths, which improves its versatility.
In one embodiment, the partial product exchange circuit 13 in the data processor includes: a function selection mode signal input port (mode) 131, a first partial product input port 132, a first partial product output port 133, a second partial product input port 134 and a second partial product output port 135. The function selection mode signal input port (mode) 131 is used to receive the function selection mode signal; the first partial product input port 132 is used to receive the sign-extended first partial product, input by the first modified encoding sub-circuit 111, that needs to be exchanged; the first partial product output port 133 is used to output the sign-extended first partial product; the second partial product input port 134 is used to receive the sign-extended second partial product, input by the second modified encoding sub-circuit 121, that needs to be exchanged; and the second partial product output port 135 is used to output the sign-extended second partial product.
Specifically, it can be understood that the partial product exchange circuit 13 determines, according to the function selection mode signal received at the function selection mode signal input port (mode) 131, whether the sign-extended first partial product needs to be exchanged with the sign-extended second partial product. The partial product exchange circuit 13 may exchange the sign-extended first low-order partial product with the sign-extended second low-order partial product; alternatively, it may exchange the sign-extended first high-order partial product with the sign-extended second high-order partial product. In this embodiment, however, the partial product exchange circuit 13 needs to exchange the sign-extended partial products only when the data processor has to perform a multiply-accumulate operation on 2N-bit * N-bit data; when processing data operations in the other three modes, the partial product exchange circuit 13 does not need to perform the exchange.
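The mode-dependent exchange described above can be sketched in software. This is only an illustration under assumptions: the `Mode` names, the dictionary layout with `"low"` and `"high"` entries, and the choice to swap the low-order halves (rather than the high-order halves, which the text equally allows) are hypothetical and are not taken from the circuit itself.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical names for the four operating modes; the text only states
    # that the exchange is needed for the 2N-bit * N-bit multiply-accumulate.
    MODE_A = 0
    MODE_B = 1
    MODE_C = 2
    MAC_2N_BY_N = 3


def exchange_partial_products(first, second, mode):
    """Swap the low-order sign-extended partial products of two sub-circuits.

    first, second -- dicts with "low" and "high" lists of partial products
    mode          -- function selection mode signal (assumed encoding)
    """
    if mode is not Mode.MAC_2N_BY_N:
        # The other three modes pass both inputs through unchanged.
        return first, second
    swapped_first = {"low": second["low"], "high": first["high"]}
    swapped_second = {"low": first["low"], "high": second["high"]}
    return swapped_first, swapped_second
```

Swapping the `"high"` entries instead would model the alternative mentioned in the text; only one of the two swaps is applied at a time.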
In the data processor provided in this embodiment, the partial product exchange circuit can exchange the sign-extended first partial product obtained by the first modified encoding sub-circuit with the sign-extended second partial product obtained by the second modified encoding sub-circuit, and thereby implement a multiply-accumulate operation on 2N-bit * N-bit data. The data processor can perform not only multiplication and multiply-accumulate operations on data of the same bit width, but also multiply-accumulate operations on data of different bit widths, which improves the versatility of the data processor.
In a data processor provided by another embodiment, the canonical signed digit encoding processing unit 211 in the data processor includes: a first data input port 2111, a function selection mode signal input port 2112, and a target code output port 2113. The first data input port 2111 is used to receive the first data to be subjected to canonical signed digit encoding; the function selection mode signal input port 2112 is used to receive the function selection mode signal; and the target code output port 2113 is used to output the target code obtained by performing canonical signed digit encoding on the first data.
Specifically, the canonical signed digit encoding processing unit 211 can determine, according to the received function selection mode signal, whether the data bit width that the data processor can currently process is N or 2N. If the data bit width that the canonical signed digit encoding processing unit 211 can currently process is N, the unit can automatically split each of the two received 2N-bit sub-data into high N-bit data (the high-order data) and low N-bit data (the low-order data), and perform canonical signed digit encoding on the high-order data and the low-order data separately. If the data bit width that the unit can currently process is 2N, the unit can treat each of the two 2N-bit sub-data as a whole and perform canonical signed digit encoding on the two sub-data separately.
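The mode-dependent encoding described above can be illustrated with a short software sketch. It rests on assumptions: it uses the textbook non-adjacent-form recoding as the canonical signed digit encoding, interprets operands as two's-complement values, and pads each code to width + 1 digits; the function names and the `wide` flag are hypothetical, and the exact digit format produced by unit 211 is not specified here.

```python
def to_signed(raw, width):
    """Interpret a raw `width`-bit pattern as a two's-complement integer."""
    return raw - (1 << width) if (raw >> (width - 1)) & 1 else raw


def csd_encode(value, width):
    """Return signed digits (-1, 0, 1), least significant first.

    Textbook non-adjacent form: no two neighbouring digits are both non-zero.
    The result is padded with zeros to width + 1 digits (an assumed length).
    """
    digits = []
    while value != 0:
        if value & 1:
            digit = 2 - (value & 3)  # +1 if value % 4 == 1, -1 if value % 4 == 3
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    digits += [0] * (width + 1 - len(digits))
    return digits


def encode_sub_data(raw, n, wide):
    """Encode one 2N-bit sub-datum according to the selected bit width.

    wide=True  -- treat the sub-datum as a single 2N-bit operand
    wide=False -- split it into high and low N-bit operands first
    """
    if wide:
        return {"whole": csd_encode(to_signed(raw, 2 * n), 2 * n)}
    low = raw & ((1 << n) - 1)
    high = raw >> n
    return {
        "high": csd_encode(to_signed(high, n), n),
        "low": csd_encode(to_signed(low, n), n),
    }
```

For example, with N = 4 the pattern 0x7D is encoded as the single value 125 in the wide mode, and as the two values 7 (high) and -3 (low) in the narrow mode.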
It should be noted that the first data may include two 2N-bit sub-data. If the canonical signed digit encoding processing unit 211 currently needs to perform canonical signed digit encoding on 2N-bit data, the low-order data in the first data may include the two low-order data corresponding to the two 2N-bit sub-data. If the canonical signed digit encoding processing unit 211 currently needs to process N-bit data, the unit can split each of the two 2N-bit sub-data into two N-bit sub-data, giving four N-bit sub-data in total; in that case the low-order data in the first data may include the four low-order data corresponding to the two 2N-bit sub-data. In addition, during canonical signed digit encoding, the number of low-order target codes obtained by the canonical signed digit encoding processing unit 211 may be equal to the number of high-order target codes obtained, and may also be equal to the number of first low-order partial products of the target code corresponding to the low-order data, or to the number of first high-order partial products of the target code corresponding to the high-order data. If the data processor is currently processing an N-bit * N-bit multiplication, one sub-datum in each of the first data and the second data is 0, that is, the high N-bit data or the low N-bit data in the first data and the second data are all 0. If the data processor is currently processing a 2N-bit * 2N-bit multiplication, one sub-datum in each of the first data and the second data is 0 and the other sub-datum is a non-zero 2N-bit value.
In the data processor provided in this embodiment, the canonical signed digit encoding processing unit performs canonical signed digit encoding on the received first data to obtain the target code; the partial products of the target code are then obtained according to the target code and accumulated to obtain the target operation result, so that data operations in several different modes are supported. Because the received data are encoded by the canonical signed digit encoding processing unit, the number of valid partial products obtained is small, which reduces the complexity of implementing multiplication or multiply-accumulate operations in the data processor. At the same time, the data processor can handle data operations in several different modes, which improves its versatility and effectively reduces the area it occupies on an AI chip.
In one embodiment, the first partial product acquisition circuit 22 in the data processor includes: a low-order partial product acquisition unit 221, a low-order selector group unit 222, a high-order partial product acquisition unit 223, and a high-order selector group unit 224. The first input of the low-order partial product acquisition unit 221 and the first input of the high-order partial product acquisition unit 223 are both connected to the output of the canonical signed digit encoding processing unit 211; the second input of the low-order partial product acquisition unit 221 is connected to the output of the low-order selector group unit 222; and the second input of the high-order partial product acquisition unit 223 is connected to the output of the high-order selector group unit 224.
The low-order partial product acquisition unit 221 is used to obtain the sign-extended first low-order partial product according to the low-order target code in the target code and the second data, and to obtain the first low-order partial product of the target code from the sign-extended first low-order partial product. The low-order selector group unit 222 is used to gate, according to the received function selection mode signal, the values in the sign-extended first low-order partial product. The high-order partial product acquisition unit 223 is used to obtain the sign-extended first high-order partial product according to the high-order target code in the target code and the second data, and to obtain the first high-order partial product of the target code from the sign-extended first high-order partial product. The high-order selector group unit 224 is used to gate, according to the received function selection mode signal, the values in the sign-extended first high-order partial product.
Specifically, it can be understood that the low-order partial product acquisition unit 221 can obtain the corresponding sign-extended low-order partial product according to each bit value in the low-order target code input by the canonical signed digit encoding unit 211; the low-order selector group unit 222 can gate the values in the sign-extended first low-order partial product; the sign-extended low-order partial product is then combined with the gated values to obtain the sign-extended first low-order partial product, from which the first low-order partial product of the target code is obtained. Similarly, the high-order partial product acquisition unit 223 can obtain the sign-extended high-order partial product corresponding to the high-order data in the first data according to each bit value in the high-order target code input by the canonical signed digit encoding unit 211; the high-order selector group unit 224 can gate the values in the sign-extended first high-order partial product; the sign-extended high-order partial product is then combined with the gated values to obtain the sign-extended first high-order partial product, from which the first high-order partial product of the target code is obtained.
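How a target code and the multiplicand yield sign-extended partial products can be illustrated in software. The sketch below is only an illustration under assumptions: the function names, the representation of the target code as a list of digits in {-1, 0, 1} (least significant first), and the 2N-bit row width are choices made for the example and are not details of units 221 and 223.

```python
def sign_extended_partial_products(digits, multiplicand, n):
    """Return one 2N-bit row per target-code digit.

    digits       -- signed digits (-1, 0 or 1), least significant first (assumed format)
    multiplicand -- signed N-bit integer
    n            -- operand bit width N
    """
    mask = (1 << (2 * n)) - 1
    rows = []
    for position, digit in enumerate(digits):
        # digit * multiplicand is -X, 0 or +X. Masking a negative Python int
        # yields its two's-complement pattern, i.e. the sign-extended row.
        rows.append(((digit * multiplicand) << position) & mask)
    return rows


def accumulate(rows, n):
    """Add the rows modulo 2**(2N) and reinterpret the sum as a signed value."""
    width = 2 * n
    total = sum(rows) & ((1 << width) - 1)
    return total - (1 << width) if total >> (width - 1) else total
```

With N = 4, the digits [-1, 0, 0, 1] represent 7 (8 - 1); multiplying by -5 gives the rows 5, 0, 0 and 216 (the 8-bit pattern of -40), which accumulate to -35.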
In this embodiment, the first partial product of the target code can be obtained from the first low-order partial product of the target code and the first high-order partial product of the target code. If the bit width of the first target code is equal to 2N and the values in the first low-order target code are numbered 1, ..., N starting from the lowest-order value, the corresponding sign-extended first low-order partial products may also be numbered 1, ..., N, and the first low-order partial products of the target code are numbered in the same way as the sign-extended first low-order partial products. Likewise, if the values in the first high-order target code are numbered N+1, ..., 2N starting from the lowest-order value, the corresponding sign-extended first high-order partial products may also be numbered N+1, ..., 2N, and the first high-order partial products of the target code are numbered in the same way as the sign-extended first high-order partial products. The distribution rule of the first partial products of all target codes can then be characterized as follows. The first low-order partial product of the first target code may be equal to the first sign-extended first low-order partial product, that is, the first partial product of the first target code. Starting from the first low-order partial product of the second target code, the highest-order value of the first low-order partial product of each target code is located in the same column as the highest-order value of the first partial product of the first target code; equivalently, the lowest-order value of the first low-order partial product of each target code is shifted one position to the left relative to the lowest-order value of the first low-order partial product of the previous target code. The first partial product of the target code that follows the first low-order partial product of the last target code may be the first high-order partial product of the first target code. The bit width of the first high-order partial product of the first target code may be equal to N; that is, taking the columns of the first sign-extended first low-order partial product as the reference, the N values by which the first sign-extended first high-order partial product is shifted to the left are not values in the first partial product of the target code. The first high-order partial products of the other target codes are distributed in the same manner.
It should be noted that if the data processor can currently process a 2N-bit * 2N-bit multiplication, the first partial product acquisition circuit 22 in the data processor may include (N+1) low-order partial product acquisition units 221 and (N+1) high-order partial product acquisition units 223; in this case, each low-order partial product acquisition unit 221 may include 4N value generation sub-units, and each high-order partial product acquisition unit 223 may also include 4N value generation sub-units. If the data processor currently needs to process N-bit data, the first partial product acquisition circuit 22 in the data processor may include (N+1)/2 low-order partial product acquisition units 221 and (N+1)/2 high-order partial product acquisition units 223; in this case, each low-order partial product acquisition unit 221 may include 2N value generation sub-units, and each high-order partial product acquisition unit 223 may include 2N value generation sub-units. Each value generation sub-unit can obtain one value in the sign-extended first partial product.
Optionally, the second partial product acquisition circuit 23 includes: a low-order partial product acquisition unit 231, a low-order selector group unit 232, a high-order partial product acquisition unit 233, and a high-order selector group unit 234. The first input of the low-order partial product acquisition unit 231 and the first input of the high-order partial product acquisition unit 233 are both connected to the output of the canonical signed digit encoding processing unit 211; the second input of the low-order partial product acquisition unit 231 is connected to the output of the low-order selector group unit 232; and the second input of the high-order partial product acquisition unit 233 is connected to the output of the high-order selector group unit 234.
The low-order partial product acquisition unit 231 is used to obtain the sign-extended first low-order partial product according to the low-order target code in the target code and the second data, and to obtain the first low-order partial product of the target code from the sign-extended first low-order partial product. The low-order selector group unit 232 is used to gate, according to the received function selection mode signal, the values in the sign-extended first low-order partial product. The high-order partial product acquisition unit 233 is used to obtain the sign-extended first high-order partial product according to the high-order target code in the target code and the second data, and to obtain the first high-order partial product of the target code from the sign-extended first high-order partial product. The high-order selector group unit 234 is used to gate, according to the received function selection mode signal, the values in the sign-extended first high-order partial product.
In addition, the method by which the first partial product acquisition circuit 22 obtains the sign-extended first partial product is the same as the method by which the second partial product acquisition circuit 23 obtains the sign-extended second partial product, so the method used by the second partial product acquisition circuit 23 is not described again in this embodiment. Furthermore, the internal circuit structures of the first partial product acquisition circuit 22 and the second partial product acquisition circuit 23 may be the same, and the functions of their external output ports may also be the same, so the specific structure of the second partial product acquisition circuit 23 is not described again in this embodiment.
In the data processor provided in this embodiment, the low-order partial product acquisition unit, the high-order partial product acquisition unit, and the selector group units obtain the sign-extended first partial product according to the low-order target code and the high-order target code; the first partial product of the target code is obtained from the sign-extended first partial product and is then accumulated to obtain the target operation result. The number of valid partial products obtained by the data processor is small, which reduces the complexity of implementing multiplication or multiply-accumulate operations. At the same time, the data processor does not need to perform a separate accumulation on the multiplication result to complete a multiply-accumulate operation; a single pass of computation directly implements the multiplication or multiply-accumulate operation, which reduces the power consumption of the data processor. In addition, the data processor can handle data operations in different modes, which improves its versatility.
In one embodiment, the low-order partial product acquisition unit 221 in the data processor includes: a low-order target code input port 2211, a gated value input port 2212, a second data input port 2213, and a low-order partial product output port 2214. The low-order target code input port 2211 is used to receive the first low-order target code input by the canonical signed digit encoding processing unit 211; the gated value input port 2212 is used to receive the values in the sign-extended first low-order partial product obtained after gating by the low-order selector group unit 222; the second data input port 2213 is used to receive the second data; and the low-order partial product output port 2214 is used to output the first low-order partial product of the target code.
Specifically, the low-order partial product acquisition unit 221 in the data processor can receive, through the low-order target code input port 2211, the low-order target code in the target code output by the canonical signed digit encoding unit 211, and can receive, through the second data input port 2213, the two sub-data (that is, the multiplicands) in the second data. Optionally, the low-order partial product acquisition unit 221 can obtain the sign-extended low-order partial product corresponding to the low-order data according to the received low-order target code and the received multiplicand of the multiplication or multiply-accumulate operation, and obtain the first low-order partial product of the target code from the sign-extended low-order partial product. Optionally, if the multiplicand received at the second data input port 2213 of the low-order partial product acquisition unit 221 has a bit width of N, the bit width of the sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 221 may be equal to 2N.
It should be noted that the low-order partial product acquisition unit 221 can receive, through the gated value input port 2212, the corresponding bit values in the sign-extended low-order partial product that the low-order selector group unit 222 gates for data operations in the different modes. The low-order partial product acquisition unit 221 then combines the sign-extended low-order partial product it has currently obtained with the gated bit values to obtain the sign-extended first low-order partial product.
Optionally, the data processor includes the high-order partial product acquisition unit 223, which includes: a high-order target code input port 2231, a gated value input port 2232, a second data input port 2233, and a high-order partial product output port 2234. The high-order target code input port 2231 is used to receive the high-order target code output by the canonical signed digit encoding unit 211; the gated value input port 2232 is used to receive the values in the sign-extended first high-order partial product output after gating by the high-order selector group unit 224; the second data input port 2233 is used to receive the second data; and the high-order partial product output port 2234 is used to output the first high-order partial product of the target code.
It can be understood that the method by which the low-order partial product acquisition unit 221 obtains the first low-order partial product of the target code is the same as the method by which the high-order partial product acquisition unit 223 obtains the first high-order partial product of the target code, so the method used by the high-order partial product acquisition unit 223 is not described again in this embodiment. In addition, the internal circuit structures of the low-order partial product acquisition unit 221 and the high-order partial product acquisition unit 223 may be the same, and the functions of their external output ports may be similar, so the specific structure of the high-order partial product acquisition unit 223 is not described again in this embodiment.
In the data processor provided in this embodiment, the low-order partial product acquisition unit can obtain the sign-extended low-order partial product according to each bit value in the low-order target code, and then combine the sign-extended low-order partial product with the values gated by the low-order selector group unit to obtain the sign-extended first low-order partial product; the first low-order partial product of the target code is obtained from the sign-extended first low-order partial product, and the first low-order partial product and the high-order partial product of the target code are then accumulated to obtain the results of data operations in the different modes. The number of valid partial products obtained by the data processor is small, which reduces the complexity of implementing multiplication or multiply-accumulate operations. At the same time, the data processor can handle data operations in different modes, which improves its versatility.
In one embodiment, the data processor includes the low-order selector group unit 222, which includes low-order selectors 2221; the plurality of low-order selectors 2221 are used to gate the values in the sign-extended first low-order partial product.
Specifically, the number of low-order selectors 2221 included in the low-order selector group unit 222 may be equal to 3N*(N+1), where 2N may denote the bit width of the data currently processed by the data processor, and the internal circuit structure of every low-order selector 2221 in the low-order selector group unit 222 may be the same. Optionally, if the data processor can currently process a 2N-bit * 2N-bit multiplication, each of the (N+1) low-order partial product acquisition units 221 connected to the canonical signed digit encoding unit 211 may include 4N value generation sub-units, of which 2N value generation sub-units may be connected to 2N low-order selectors 2221, each value generation sub-unit being connected to one low-order selector 2221. Optionally, the 2N value generation sub-units corresponding to these 2N low-order selectors 2221 may be the value generation sub-units corresponding to the high 2N bits of the sign-extended first low-order partial product, and the internal circuit structure of these 2N low-order selectors 2221 may be exactly the same as that of the selector 212; in addition to the function selection mode signal input port (mode), each of these 2N low-order selectors 2221 has two other input ports. Optionally, if the data processor can handle data operations in four different modes and the multiplicand received by the data processor has a bit width of 2N, the signals received at the two other input ports of the low-order selector 2221 may be, respectively, the value 0 and the sign-bit value in the corresponding sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 221 when the data processor performs a 2N-bit * 2N-bit multiplication. The (N+1) low-order partial product acquisition units 221 may be connected to (N+1) groups of 2N low-order selectors 2221. The sign-bit values received by different groups of 2N low-order selectors 2221 may be the same or different, but the 2N low-order selectors 2221 in the same group receive the same sign-bit value, which can be obtained from the sign-bit value in the sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 221 to which that group is connected.
In addition, of the 4N value generation sub-units included in each low-order partial product acquisition unit 221, N value generation sub-units may not be connected to any low-order selector 2221. The values obtained by these N value generation sub-units may be the corresponding bit values in the sign-extended first low-order partial product obtained from the values of the first low-order target code for the data of different bit widths that the data processor is currently multiplying. It can also be understood that the values obtained by these N value generation sub-units correspond to all values from bit 1 to bit N of the corresponding sign-extended first low-order partial product, counting from the lowest-order bit (bit 1) toward the highest-order bit.
It should be noted that, of the 4N value generation sub-units included in each low-order partial product acquisition unit 221, the remaining N value generation sub-units may also be connected to N low-order selectors 2221, each value generation sub-unit being connected to one low-order selector 2221. The internal circuit structure of these N low-order selectors 2221 may be the same as that of the selector 212, and in addition to the function selection mode signal input port (mode), each of these N low-order selectors 2221 has two other input ports. The signals received at these two other input ports are, respectively, the sign-bit value in the corresponding sign-extended first low-order partial product obtained when the data processor performs an N-bit * N-bit multiplication, and the corresponding bit value in the corresponding sign-extended first low-order partial product obtained when the data processor performs a 2N-bit * 2N-bit multiplication. The (N+1) low-order partial product acquisition units 221 may be connected to (N+1) groups of N low-order selectors 2221. The sign-bit values received by different groups of N low-order selectors 2221 may be the same or different, but the N low-order selectors 2221 in the same group receive the same sign-bit value, which can be obtained from the sign-bit value in the sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 221 to which that group is connected.
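The selector arrangement described above (N positions without a selector, N positions choosing between a sign bit and the actual bit, and 2N positions choosing between 0 and a sign bit) can be sketched as follows. This is a simplified model under assumptions: the function name `gate_low_partial_product`, the boolean `wide` flag standing in for the function selection mode signal, and the use of a signed Python integer as the ungated partial product are all hypothetical, and only the N-bit * N-bit and 2N-bit * 2N-bit cases are modelled.

```python
def gate_low_partial_product(partial_product, n, wide):
    """Build a 4N-bit sign-extended row, least significant bit first.

    partial_product -- signed integer value of the ungated partial product
    n               -- operand bit width N
    wide            -- True for 2N-bit * 2N-bit, False for N-bit * N-bit (assumed flag)
    """
    sign = 1 if partial_product < 0 else 0
    bits = []
    for position in range(4 * n):
        raw = (partial_product >> position) & 1
        if position < n:
            # Lowest N positions: no selector, the generated value is used directly.
            bits.append(raw)
        elif position < 2 * n:
            # Next N positions: selector picks the actual bit (wide mode)
            # or the sign bit of the N-bit result (narrow mode).
            bits.append(raw if wide else sign)
        else:
            # Highest 2N positions: selector picks the sign bit (wide mode) or 0.
            bits.append(sign if wide else 0)
    return bits


def bits_to_int(bits):
    """Pack a least-significant-first bit list into an unsigned integer."""
    return sum(bit << position for position, bit in enumerate(bits))
```

With N = 4 and a partial product of -3, the narrow mode produces the pattern 0x00FD (sign-extended to 2N bits, upper half zero), while the wide mode produces 0xFFFD (sign-extended to 4N bits).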
In addition, the corresponding bit values in the sign-extended first low-order partial product received by each group of N low-order selectors 2221 can be determined from the corresponding bit values in the sign-extended first low-order partial product obtained by the low-order partial product acquisition unit 221 to which that group of low-order selectors 2221 is connected; the corresponding bit values received by the individual low-order selectors 2221 in a group of N low-order selectors 2221 may be the same or different. The positions of the 4N value generation sub-units in each low-order partial product acquisition unit 221 may be shifted left by one value generation sub-unit relative to the positions of the 4N value generation sub-units in the previous low-order partial product acquisition unit 221. Optionally, among the first low-order partial products of all target codes that participate in subsequent operations, only the first low-order partial product of the first target code may have a bit width equal to the bit width 4N of the first sign-extended first low-order partial product; the bit width of the first low-order partial product of each remaining target code may be one less than that of the previous target code, and the bit width of the first high-order partial product of the last target code may be equal to (2N-1).
Optionally, the high-order selector group unit 224 includes high-order selectors 2241; the plurality of high-order selectors 2241 are used to gate the values in the sign-extended first high-order partial product.
It should be noted that the method by which the high-order selector 2241 gates values is the same as the method by which the high-order selector 1111ea gates values, so the gating method of the high-order selector 2241 is not described again in this embodiment.
In the data processor provided in this embodiment, the low-order selector group unit can gate the values in the low-order partial product to obtain the sign-extended first low-order partial product; the first partial product of the target code is then obtained from the sign-extended first low-order partial product and is accumulated by the compression circuit to obtain the target operation results of the different modes. The data processor can handle data operations in different modes, which improves its versatility.
Fig. 5 is a schematic diagram of the specific structure of a data processor provided by another embodiment. The data processor includes the first compression circuit 24, which includes a modified Wallace tree group unit 241 and an accumulation unit 242; the output of the modified Wallace tree group unit 241 is connected to the input of the accumulation unit 242. The modified Wallace tree group unit 241 is used to accumulate each column of values in the first partial products of all target codes obtained when data operations of the different modes are processed, to obtain accumulation results; the accumulation unit 242 is used to add the accumulation results.
Specifically, the modified Wallace tree group unit 241 can accumulate each column of values in the first low-order partial products and the first high-order partial products of the target code obtained by the first partial product acquisition circuit 22, and the accumulation unit 242 then accumulates the two operation results obtained by the modified Wallace tree group unit 241 to obtain the target operation result. When the modified Wallace tree group unit 241 performs the accumulation, the distribution rule of the first partial products of all target codes can be characterized as follows: in each row, the position of the lowest-order value of the first partial product of the corresponding target code is shifted one position to the right relative to the position of the lowest-order value of the first partial product of the target code in the next row; however, the highest-order value in the first partial product of each target code is located in the same column as the highest-order value in the first partial product of the first target code. Optionally, the modified Wallace tree group unit 241 can accumulate each column of values in the first partial products of all target codes according to this distribution rule. Optionally, the two operation results obtained by the modified Wallace tree group unit 241 may include a sum output signal Sum and a carry output signal Carry.
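The two-stage compression described above, in which a tree reduces the partial product rows to a Sum and a Carry word and a final adder combines them, can be sketched in software. This is a generic carry-save (3:2 compressor) reduction written for illustration; it is not the modified Wallace tree with mode selectors described in this document, and the function names are hypothetical.

```python
def carry_save_add(a, b, c, mask):
    """3:2 compressor applied to every column: three rows in, two rows out."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word & mask, carry_word & mask


def wallace_reduce(rows, width):
    """Reduce any number of rows to a (Sum, Carry) pair, modulo 2**width."""
    mask = (1 << width) - 1
    rows = [row & mask for row in rows]
    while len(rows) > 2:
        next_rows = []
        # Compress complete groups of three rows; leftover rows pass through.
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2], mask))
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1]


def final_add(sum_word, carry_word, width):
    """Role of the accumulation unit: add the two words produced by the tree."""
    return (sum_word + carry_word) & ((1 << width) - 1)
```

For instance, the rows 5, 0, 0 and 216 at a width of 8 bits reduce to a Sum and Carry pair whose final addition gives 221, the 8-bit pattern of -35.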
Optionally, the second compression circuit 25 includes a modified Wallace tree group unit 251 and an accumulation unit 252; the output of the modified Wallace tree group unit 251 is connected to the input of the accumulation unit 252. The modified Wallace tree group unit 251 is used to accumulate each column of values in the second partial products of all target codes obtained when data operations of the different modes are processed, to obtain accumulation results; the accumulation unit 252 is used to add the accumulation results.
It should be noted that the method that the first compressor circuit 24 carries out compression processing to the first partial product of target code,
It is identical as the second partial product progress method of compression processing of second compressor circuit 25 to target code, no longer to this present embodiment
Repeat the compression method of the second compressor circuit 25.In addition, the internal structure of the first compressor circuit 24 and the second compressor circuit 25, with
And the function of outside port is identical, the specific structure of this embodiment is not repeated the second compressor circuit 25.
In the data processor provided in this embodiment, the amendment Wallace tree group unit accumulates the first low-order partial product and the first high-order partial product of the target code to obtain an accumulation result, and the summing elements add that accumulation result to obtain the target operation result. The data processor can therefore process data operations of different modes, which improves its versatility and effectively reduces the area it occupies on an AI chip.
In one of the embodiments, continuing with the structural schematic diagram of the data processor shown in Fig. 5, the data processor includes the amendment Wallace tree group unit 241, which includes: a low-order Wallace tree subelement 2411, a selector 2412 and a high-order Wallace tree subelement 2413. The output end of the low-order Wallace tree subelement 2411 is connected to the input end of the selector 2412, and the output end of the selector 2412 is connected to the input end of the high-order Wallace tree subelement 2413. The multiple low-order Wallace tree subelements 2411 are used to accumulate each column of values in the first partial product of the target code, the selector 2412 is used to gate the carry input signal received by the high-order Wallace tree subelement 2413, and the multiple high-order Wallace tree subelements 2413 are used to accumulate each column of values in the first partial product of the target code to obtain the accumulation result.
Specifically, the circuit of each low-order Wallace tree subelement 2411 can be built from a combination of full adders and half adders, or from a combination of 4-2 compressors; the same holds for each high-order Wallace tree subelement 2413. In addition, both the low-order Wallace tree subelement 2411 and the high-order Wallace tree subelement 2413 can be understood as a circuit that processes a multi-bit input signal and adds its bits to produce two output signals. Optionally, the number of high-order Wallace tree subelements 2413 in the amendment Wallace tree group unit 241 can be equal to the bit width N of the multiplicand in the multiplication or multiply-accumulate operation currently processed by the data processor, and can also be equal to the number of low-order Wallace tree subelements 2411. The low-order Wallace tree subelements 2411 can be connected in series with one another, and so can the high-order Wallace tree subelements 2413. Optionally, the output end of the last low-order Wallace tree subelement 2411 is connected to the input end of the selector 2412, and the output end of the selector 2412 is connected to the input end of the first high-order Wallace tree subelement 2413. Optionally, in the amendment Wallace tree group unit 241, each low-order Wallace tree subelement 2411 can add the values in the corresponding column of the partial products of all target codes, and can output two signals: a carry signal Carry_i and a sum signal Sum_i, where i is the index of the low-order Wallace tree subelement 2411 and the first low-order Wallace tree subelement 2411 has index 0. Optionally, the number of input signals received by each low-order Wallace tree subelement 2411 can be equal to the number of first partial products of the target code. In the amendment Wallace tree group unit 241, the total number of high-order Wallace tree subelements 2413 and low-order Wallace tree subelements 2411 can be equal to 2N, and in the first partial products of all target codes the total number of columns from the lowest column to the highest column can also be equal to 2N. The N low-order Wallace tree subelements 2411 can accumulate each of the low N columns of the first partial products of all target codes, and the N high-order Wallace tree subelements 2413 can accumulate each of the high N columns of the first partial products of all target codes.
Optionally, the amendment Wallace tree group unit 251 in the second compressor circuit 25 includes: a low-order Wallace tree subelement 2511, a selector 2512 and a high-order Wallace tree subelement 2513. The output end of the low-order Wallace tree subelement 2511 is connected to the input end of the selector 2512, and the output end of the selector 2512 is connected to the input end of the high-order Wallace tree subelement 2513. The multiple low-order Wallace tree subelements 2511 are used to accumulate each column of values in the second partial product of the target code, the selector 2512 is used to gate the carry input signal received by the high-order Wallace tree subelement 2513, and the multiple high-order Wallace tree subelements 2513 are used to accumulate each column of values in the second partial product of the target code to obtain the accumulation result.
It should be noted that the circuit structure and function of the amendment Wallace tree group unit 241 in the first compressor circuit 24 are the same as those of the amendment Wallace tree group unit 251 in the second compressor circuit 25, so this embodiment does not repeat the specific structure of the amendment Wallace tree group unit 251.
In the data processor provided in this embodiment, the amendment Wallace tree group unit accumulates the partial products of the target code to obtain two output signals, and these two output signals are then added to obtain the data operation result of the selected mode. The data processor can therefore process data operations of different modes, which improves its versatility and effectively reduces the area it occupies on an AI chip. In addition, the data processor does not need to perform a separate accumulation on a multiplication result in order to complete a multiply-accumulate operation; a single operation pass directly implements either multiplication or multiply-accumulate, which reduces the power consumption of the data processor.
In a data processor provided by another embodiment, the data processor includes the summing elements 242, and the summing elements 242 include an adder 2421 that is used to perform an addition operation on the accumulation result.
Specifically, the adder 2421 can be an adder of different bit widths. Optionally, the adder 2421 can receive the two signals output by the amendment Wallace tree group unit 241, add these two output signals, and output the data operation result for the mode currently processed by the data processor. Optionally, the adder 2421 can be a carry-lookahead adder.
In the data processor provided in this embodiment, the summing elements add the two signals output by the amendment Wallace tree group unit and output the data operation result of the selected mode. The data processor does not need to perform a separate accumulation on a multiplication result in order to complete a multiply-accumulate operation; a single operation pass directly implements either multiplication or multiply-accumulate, which reduces the power consumption of the data processor.
In one of the embodiments, the data processor includes the adder 2421, which includes: a carry signal input port 2421a, a sum signal input port 2421b and an operation result output port 2421c. The carry signal input port 2421a is used to receive the carry signal, the sum signal input port 2421b is used to receive the sum signal, and the operation result output port 2421c is used to output the result of adding the carry signal and the sum signal.
Specifically, the adder 2421 can receive the carry signal Carry output by the amendment Wallace tree group unit 241 through the carry signal input port 2421a, receive the sum signal Sum output by the amendment Wallace tree group unit 241 through the sum signal input port 2421b, add the carry signal Carry and the sum signal Sum, and output the result through the operation result output port 2421c.
It should be noted that, during an operation, the data processor can use adders 2421 of different bit widths to add the carry output signal Carry and the sum output signal Sum output by the amendment Wallace tree group unit 241. The bit width of the data that the adder 2421 can process can be equal to 2 times the bit width of the multiplicand in the multiplication or multiply-accumulate operation that the data processor needs to perform.
In the data processor provided in this embodiment, the summing elements add the two signals output by the amendment Wallace tree group unit and output the data operation result of the selected mode. The data processor does not need to perform a separate accumulation on a multiplication result in order to complete a multiply-accumulate operation; a single operation pass directly implements either multiplication or multiply-accumulate, which reduces the power consumption of the data processor.
Fig. 6 is a flow diagram of the data processing method provided by one embodiment. The method can be carried out by the data processor shown in Fig. 1 and Fig. 3, and this embodiment concerns the process of implementing data operations in four different modes. As shown in Fig. 6, the method includes:
S101, receiving pending data and a function selection mode signal, wherein the function selection mode signal indicates the mode of data operation that the data processor can currently process.
Specifically, the pending data may include the multiplier and the multiplicand of a multiplication or multiply-accumulate operation. Optionally, the data processor can receive one pending data item through each of the first amendment coding sub-circuit and the second amendment coding sub-circuit. Each pending data item may include two pending sub-data items, which can have the same bit width or different bit widths. Optionally, the two sub-data items in the pending data can be spliced into a whole and then input to the first amendment coding sub-circuit and the second amendment coding sub-circuit, or they can be input to the first amendment coding sub-circuit and the second amendment coding sub-circuit separately and simultaneously. The pending sub-data can be fixed-point numbers with a bit width of 2N, in which case the data obtained by splicing two pending sub-data items has a bit width of 4N.
It should be noted that the first multiplication operation circuit and the second multiplication operation circuit can receive the same function selection mode signal. The function selection mode signal can take four different values, which correspond to the four modes of data operation that the data processor can process: N-bit * N-bit multiplication, N-bit * N-bit multiply-accumulate, 2N-bit * 2N-bit multiplication, and 2N-bit * N-bit multiply-accumulate. The data processor determines which specific mode it can currently process from the function selection mode signal it receives. In addition, one pending sub-data item in a pending data item can serve as the multiplier, and the other pending sub-data item can serve as the multiplicand, when the data processor performs a multiplication or multiply-accumulate operation.
S102, judging, according to the function selection mode signal, whether the pending data needs to be split.
Specifically, the data processor can determine the data bit width it can currently process from the function selection mode signal it receives, and thereby judge whether the pending data needs to be split. Splitting can be characterized as dividing the pending data into multiple groups of data with the same bit width.
Optionally, the step in S102 of judging, according to the function selection mode signal, whether the pending data needs to be split may include: judging, according to the function selection mode signal, whether the bit width of the pending data is equal to the data bit width of the mode of operation that the data processor can currently process.
Optionally, after the step in S102 of judging, according to the function selection mode signal, whether the pending data needs to be split, the method can also include: if the pending data does not need to be split, proceeding to perform canonical signed number encoding on the pending data to obtain the target code.
It should be noted that judging, according to the function selection mode signal, whether the pending data needs to be split can be understood as judging whether the bit width of the pending data is equal to the data bit width of the mode of operation that the data processor can currently process. If the two are equal, the pending data does not need to be split; otherwise, the pending data needs to be split. For example, if the two data items received by the first amendment coding sub-circuit and the second amendment coding sub-circuit in the data processor each have a bit width of N bits, and the data processor can currently process an N-bit * N-bit multiplication, then the bit width of the pending data is equal to the data bit width of the mode of operation that the data processor can currently process. Canonical signed number encoding can be characterized as a data processing procedure that encodes data using the values 0, -1 and 1. Optionally, the bit width of the target code can be equal to the bit width of the data currently processed by the data processor plus 1.
S103, if the pending data needs to be split, splitting the pending data to obtain the split data.
For example, if the two data items received by the first amendment coding sub-circuit and the second amendment coding sub-circuit in the data processor each have a bit width of 2N bits, and the data processor can currently process an N-bit * N-bit multiplication, then the first amendment coding sub-circuit and the second amendment coding sub-circuit can each automatically split the received data into high N-bit data and low N-bit data, so as to match the data bit width of the mode of operation that the data processor can currently process.
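The split decision in S102 and the split in S103 can be illustrated with a few lines of code. This is a minimal sketch under the assumption that data items are modelled as non-negative Python integers holding raw bit patterns; the helper names `needs_split` and `split_high_low` are hypothetical and are not taken from the source.

```python
def needs_split(data_width, mode_width):
    """Return True when the received data is wider than or otherwise differs
    from the width of the mode currently selected, as judged in S102."""
    return data_width != mode_width


def split_high_low(value, n):
    """Split a 2n-bit pattern into (high n bits, low n bits), as in S103."""
    mask = (1 << n) - 1
    return (value >> n) & mask, value & mask
```

For instance, with N = 8 a 16-bit item `0xABCD` received in an 8-bit mode would be split into the high part `0xAB` and the low part `0xCD`, each of which is then encoded separately.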
S104, performing canonical signed number encoding on the split data to obtain the target code.
Optionally, the step in S104 of performing canonical signed number encoding on the split data to obtain the target code may include: converting each run of l consecutive values of 1 in the split data into (l+1) digits whose highest digit is 1, whose lowest digit is -1 and whose remaining digits are 0, thereby obtaining the target code, where l is greater than or equal to 2.
Specifically, if the bit width of the pending data received by the data processor is 2N and the data bit width that the data processor can currently process is N, the first amendment coding sub-circuit and the second amendment coding sub-circuit in the data processor can automatically split the 2N-bit data into high N-bit data and low N-bit data, and perform canonical signed number encoding on the high N-bit data and the low N-bit data respectively to obtain the corresponding high-order target code and low-order target code. Optionally, the result of splitting the pending data may include the high N-bit data to be processed and the low N-bit data to be processed. If the bit width of the pending data is 2N, the high N-bit data can be called the high-order data to be processed, and the low N-bit data can be called the low-order data to be processed.
S105, performing conversion processing according to the target code and the split data to obtain the sign-extended partial products.
Specifically, the conversion processing can be characterized as converting each value in the target code, based on the multiplicand of the multiplication, into a sign-extended partial product. Optionally, the bit width of a sign-extended partial product can be equal to 2 times the bit width of the data currently processed by the data processor.
S106, judging, according to the function selection mode signal, whether the sign-extended partial products need to be swapped.
Optionally, the step in S106 of judging, according to the function selection mode signal, whether the sign-extended partial products need to be swapped may include: judging, according to the function selection mode signal, whether the bit widths of the data currently processed by the data processor are the same.
Specifically, only when the data processor processes a 2N-bit * N-bit multiply-accumulate operation can the partial product switched circuit, according to actual needs, swap the sign-extended first low-order partial product or the sign-extended first high-order partial product obtained by the first amendment coding sub-circuit with the sign-extended second low-order partial product or the sign-extended second high-order partial product obtained by the second amendment coding sub-circuit. This can also be understood as follows: when the data processor processes data operations in the other three modes, the partial product switched circuit is idle, and the sign-extended low-order partial products and sign-extended high-order partial products are not swapped. Meanwhile, the two sub-data items in each of the first data and the second data have a bit width of 2N. If the data processor can currently process one N-bit * N-bit multiplication, then, according to actual needs, one of the first data and the second data is 0, and in the two sub-data items of the other data either the high-order values are 0 or the low-order values are 0; in this case the first data and the second data can be computed as the original data. If the data processor can currently process one 2N-bit * 2N-bit multiplication, then, according to actual needs, one of the first data and the second data is 0, and both the high-order values and the low-order values in the two sub-data items of the other data are non-zero. If the data processor can currently process two 2N-bit * 2N-bit multiplications, then, according to actual needs, neither the first data nor the second data is 0.
It should be noted that judging whether the bit widths of the data currently processed by the data processor are the same can be characterized as judging whether the bit width of the multiplicand and the bit width of the multiplier currently processed by the data processor are equal.
Optionally, after the step in S106 of judging, according to the function selection mode signal, whether the sign-extended partial products need to be swapped, the method can also include: if the sign-extended partial products need to be swapped, swapping the high-order partial product or the low-order partial product among the sign-extended partial products.
S107, if the sign-extended partial products do not need to be swapped, using the sign-extended partial products as the partial products of the target code.
Specifically, if the sign-extended partial products do not need to be swapped, the first amendment coding sub-circuit can use the sign-extended first partial product it obtained as the first partial product of the target code, and the second amendment coding sub-circuit can use the sign-extended second partial product it obtained as the second partial product of the target code.
S108, compressing the partial products of the target code to obtain the target operation result.
Specifically, the data processor can accumulate the column values in the partial products of all target codes to obtain the target operation result. Optionally, the bit width of the target operation result can be equal to 2 times the bit width of the data currently processed by the data processor.
The data processing method provided in this embodiment receives pending data and a function selection mode signal; judges, according to the function selection mode signal, whether the pending data needs to be split; if so, splits the pending data to obtain the split data; performs canonical signed number encoding on the split data to obtain the target code; performs conversion processing according to the target code and the split data to obtain the sign-extended partial products; judges, according to the function selection mode signal, whether the sign-extended partial products need to be swapped; if not, uses the sign-extended partial products as the partial products of the target code; and compresses the partial products of the target code to obtain the target operation result. With this method the data processor can implement not only multiplication but also multiply-accumulate, which improves the versatility of the data processor. In addition, the method does not need to perform a separate accumulation on a multiplication result in order to complete a multiply-accumulate operation; a single operation pass directly implements either multiply-accumulate or multiplication, which reduces the power consumption of the data processor. Furthermore, the method performs canonical signed number encoding on the received data, so fewer valid partial products are obtained, which reduces the complexity of implementing multiplication or multiply-accumulate.
In one of the embodiments, the step in S104 of performing canonical signed number encoding on the split data to obtain the target code may include:
S1041, performing canonical signed number encoding on the split data to obtain an intermediate code.
Specifically, the split data that undergoes canonical signed number encoding can be the multiplier of the multiplication or multiply-accumulate operation.
S1042, obtaining the target code according to the intermediate code and the function selection mode signal.
Specifically, canonical signed number encoding can be characterized as follows. An N-bit multiplier is processed from its low-order values to its high-order values. If there is a run of l (l >= 2) consecutive values of 1, the run can be converted to the data "1(0)^(l-1)(-1)", that is, a 1 followed by (l-1) zeros and a final -1, and the remaining (N-l) digits are combined with the (l+1) converted digits to obtain a new data item. The new data item is then used as the initial data of the next conversion stage, and this continues until the new data obtained after conversion contains no run of l (l >= 2) consecutive values of 1. The bit width of the target code obtained by performing canonical signed number encoding on an N-bit multiplier can be equal to (N+1). Further, in canonical signed number encoding, data 11 can be converted to (100-001), i.e. data 11 is equivalent to 10(-1); data 111 can be converted to (1000-0001), i.e. data 111 is equivalent to 100(-1); and runs of l (l >= 2) consecutive values of 1 of other lengths are converted in the same way.
For example, suppose the multiplier received by the first amendment coding sub-circuit or the second amendment coding sub-circuit in the data processor is "001010101101110". The first new data obtained after the first conversion stage is "0010101011100(-1)0"; the second new data obtained after applying the second conversion stage to the first new data is "0010101100(-1)00(-1)0"; the third new data obtained after applying the third conversion stage to the second new data is "0010110(-1)00(-1)00(-1)0"; the fourth new data obtained after applying the fourth conversion stage to the third new data is "00110(-1)0(-1)00(-1)00(-1)0"; and the fifth new data obtained after applying the fifth conversion stage to the fourth new data is "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l >= 2) consecutive values of 1, so it can be called the initial code. After one padding operation is applied to the initial code, canonical signed number encoding is complete and the intermediate code is obtained. The bit width of the initial code can be equal to the bit width of the multiplier. Optionally, after the first amendment coding sub-circuit or the second amendment coding sub-circuit performs canonical signed number encoding on the multiplier and obtains the new data (i.e. the initial code), if the highest and second-highest values in the new data are "10" or "01", the sub-circuit can pad one value 0 at the position one place above the highest value of the new data, so that the highest three values of the corresponding intermediate code are "010" or "001" respectively. Optionally, the bit width of the intermediate code can be equal to the bit width of the data currently processed by the data processor plus 1.
In addition, if the data received by the data processor has a bit width of 2N and the data processor can currently process N-bit data operations, the first amendment coding sub-circuit or the second amendment coding sub-circuit in the data processor can split the 2N-bit data into two groups of N-bit data and operate on them separately; the two resulting (N+1)-bit intermediate codes are then combined to serve as the target code. If the data processor can currently process 2N-bit data operations, the first amendment coding sub-circuit or the second amendment coding sub-circuit in the data processor can pad one value 0 at the position one place above the highest value of the obtained (2N+1)-bit intermediate code (i.e. padding processing), and use the padded (2N+2)-bit data as the target code.
The data processing method provided in this embodiment performs canonical signed number encoding on the split data to obtain an intermediate code, and obtains the target code according to the intermediate code and the function selection mode signal. The method can perform multiplication and multiply-accumulate on data of several different bit widths, which effectively reduces the area the data processor occupies on an AI chip. Meanwhile, because the method performs canonical signed number encoding on the data, fewer valid partial products are obtained during the operation, which reduces the complexity of multiplication or multiply-accumulate and improves operation efficiency.
In one of the embodiments, the step in S105 of performing conversion processing according to the target code and the split data to obtain the sign-extended partial products may include:
S1051, performing conversion processing according to the target code and the split data to obtain an initial partial product.
Specifically, if a value in the target code is -1 and the split data is X, the initial partial product can be -X; if the value in the target code is 1, the initial partial product can be X; and if the value in the target code is 0, the initial partial product can be 0.
S1052, performing sign-bit extension on the initial partial product to obtain the sign-extended partial product.
Specifically, the bit width of the initial partial product can be equal to the bit width N of the data currently processed by the data processor, and the bit width of the sign-extended partial product can be equal to 2 times the bit width N of the data currently processed by the data processor. The N values of the initial partial product can be the low N values of the sign-extended partial product, and each of the high N values of the sign-extended partial product can be the highest value of the initial partial product, i.e. the sign bit of the initial partial product.
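Steps S1051 and S1052 can be sketched for a single code digit as follows. This is a minimal sketch, assuming values are stored as two's-complement bit patterns in non-negative Python integers and that -X is representable in N bits; the function names `initial_partial_product` and `sign_extend` are hypothetical.

```python
def initial_partial_product(digit, x, n):
    """Select -X, 0 or X for a target-code digit and return an n-bit pattern."""
    if digit not in (-1, 0, 1):
        raise ValueError("target-code digits must be -1, 0 or 1")
    return (digit * x) & ((1 << n) - 1)


def sign_extend(pattern, n):
    """Extend an n-bit pattern to 2n bits by copying its sign bit upward."""
    sign = (pattern >> (n - 1)) & 1
    upper = ((1 << n) - 1) if sign else 0   # high n bits all equal the sign bit
    return (upper << n) | pattern
```

With N = 8 and X = 5, the digit -1 gives the initial partial product `0xFB`, which extends to `0xFFFB`; the digit 1 gives `0x05`, which extends to `0x0005`.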
With the data processing method provided in this embodiment, fewer valid partial products are obtained, which reduces the complexity of multiplication or multiply-accumulate.
In one of the embodiments, the step in S108 of compressing the partial products of the target code to obtain the target operation result may include:
S1081, accumulating the partial products of the target code to obtain an intermediate operation result.
For example, the values of the low-order target code (bit width N+1) are numbered from the lowest-order value to the highest-order value, with the lowest-order value numbered 1 and the highest-order value numbered N+1, and the low-order partial products of the corresponding target code are numbered in the same way. Likewise, the values of the high-order target code (bit width M+1) are numbered from the lowest-order value to the highest-order value, with the lowest-order value numbered 1 and the highest-order value numbered M+1, and the high-order partial products of the corresponding target code are numbered in the same way. The distribution rule of the low-order partial products and the high-order partial products of all target codes can be characterized as follows: the lowest-order value of the high-order partial product of the target code numbered 1 lies in the same column as the second-lowest-order value of the low-order partial product of the target code numbered N+1. Taking the high-order partial product of the first target code as the reference, the second-lowest-order value of the high-order partial product of each other target code lies in the same column as the lowest-order value of the high-order partial product of the next target code. Taking the low-order partial product of the first target code as the reference, the second-lowest-order value of the low-order partial product of each other target code lies in the same column as the lowest-order value of the low-order partial product of the next target code.
It should be noted that the modified Wallace tree group unit may accumulate the values in each column of the partial products of all target codes.
S1062, accumulate the intermediate operation result through the accumulation unit to obtain the target operation result.
Optionally, the step in S1062 above of accumulating the intermediate operation result through the accumulation unit to obtain the target operation result may specifically include: the low-order Wallace tree subunits accumulate column values in the partial products of all target codes to obtain an accumulation result; the selector gates the accumulation result according to the function selection mode signal to obtain a carry gating signal; and the high-order Wallace tree subunits perform accumulation according to the carry gating signal and column values in the partial products of the target code to obtain the target operation result.
Specifically, from the distribution of the low-order partial products and the high-order partial products of all target codes, the partial products of all target codes span a total of 2N columns (N being the bit width of the data that the data processor is currently processing). Counting from the lowest bit, the columns may be numbered 0, ..., 2N-1, and columns 0 to N-1 may be called the low N columns. Optionally, the accumulation result may be the carry output signal Cout output by the last high-order Wallace tree subunit.
It should be noted that the N low-order Wallace tree subunits may accumulate the low N columns in numbering order to obtain the accumulation result. Optionally, the accumulation result may include the carry output signal Carry and the sum-bit output signal Sum of each Wallace tree subunit, as well as the output signal Cout of the last high-order Wallace tree subunit.
It can be understood that the selector in the modified Wallace tree group unit may, according to the received function selection mode signal, gate either the output signal Cout of the last low-order Wallace tree subunit or the value 0 to obtain the carry gating signal.
In this embodiment, from the distribution of the partial products of all target codes, the partial products of all target codes span a total of 2N columns (N being the bit width of the data that the data processor is currently processing). Counting from the lowest bit, the columns may be numbered 0, ..., 2N-1, and columns N to 2N-1 may be called the high N columns.
It should be noted that the N high-order Wallace tree subunits may accumulate the high N columns in numbering order and output the accumulation result. The carry input signal received by the first high-order Wallace tree subunit may be the carry gating signal output by the selector. If the data processor is currently processing an 8-bit data operation, the circuit structure of the corresponding modified compression sub-circuit may be as shown in Fig. 7.
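The role of the selector can be illustrated with a simplified behavioral model in Python. The sketch is an abstraction under stated assumptions rather than the circuit of Fig. 7: it replaces the low-order and high-order Wallace tree subunits with plain N-bit additions, and the function name `gated_add` and the boolean `pass_carry` (standing in for the function selection mode signal) are hypothetical. It shows only that gating Cout joins the two N-column halves into one 2N-column sum, whereas gating 0 keeps the two halves independent.

```python
def gated_add(a: int, b: int, n: int, pass_carry: bool) -> int:
    """Add two 2N-bit patterns with a gated carry between the low and high N columns."""
    mask = (1 << n) - 1
    low = (a & mask) + (b & mask)          # low N columns
    cout = low >> n                        # carry out of the last low-order column
    cin = cout if pass_carry else 0        # selector: gate Cout or the value 0
    high = ((a >> n) & mask) + ((b >> n) & mask) + cin  # high N columns
    return ((high & mask) << n) | (low & mask)
```

For N = 8, adding 0x01FF and 0x0001 gives 0x0200 when the carry is passed (one 16-bit sum) and 0x0100 when it is blocked (two independent 8-bit sums).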
In the data processing method provided in this embodiment, the modified Wallace tree group unit accumulates the partial products of the target code to obtain an intermediate operation result, and the accumulation circuit accumulates the intermediate operation result to obtain the target operation result. The method can perform multiplication on data of several different bit widths according to the function selection mode signal received by the data processor, which effectively reduces the area of the AI chip occupied by the data processor. The method also obtains fewer valid partial products, which reduces the complexity of the multiplication or multiply-accumulate operation and improves operation efficiency. In addition, the method does not need a further accumulation pass over the multiplication result to complete a multiply-accumulate operation; a single operation pass directly implements the multiplication or multiply-accumulate operation, which effectively reduces the power consumption of the data processor.
Fig. 8 is a schematic flowchart of a data processing method provided by one embodiment. The method may be carried out by the data processors shown in Fig. 2 and Fig. 5, and this embodiment concerns the process of implementing data operations in four different modes. As shown in Fig. 8, the method includes:
S201, receive data to be processed and a function selection mode signal, where the function selection mode signal indicates the mode of data operation that the data processor can currently handle.
Specifically, the data processor may receive one piece of data to be processed through the canonical signed digit encoding circuit and receive another piece of data to be processed through both the first partial product acquisition circuit and the second partial product acquisition circuit. The canonical signed digit encoding circuit, the first partial product acquisition circuit, and the second partial product acquisition circuit may receive the same function selection mode signal at the same time. Optionally, a piece of data to be processed may include two pieces of sub-data to be processed, which may have the same bit width or different bit widths. Optionally, the two pieces of sub-data in one piece of data to be processed may be concatenated and input to the canonical signed digit encoding circuit as a whole, or may be input to it separately at the same time. Likewise, the two pieces of sub-data in the other piece of data to be processed may be concatenated and input as a whole to the first partial product acquisition circuit and the second partial product acquisition circuit at the same time, or may be input to these two circuits separately at the same time. The sub-data to be processed may be fixed-point numbers with a bit width of 2N, in which case the data obtained by concatenating two pieces of sub-data has a bit width of 4N.
It should be noted that there may be four function selection mode signals, corresponding to the four modes of data operation that the data processor can handle: N-bit by N-bit multiplication, N-bit by N-bit multiply-accumulate, 2N-bit by 2N-bit multiplication, and 2N-bit by N-bit multiply-accumulate. In addition, one piece of sub-data in a piece of data to be processed may serve as the multiplier when the data processor performs the multiplication or multiply-accumulate operation, and the other piece of sub-data may serve as the multiplicand.
S202, perform canonical signed digit encoding on the data to be processed according to the function selection mode signal to obtain a target code.
Optionally, the step in S202 above of performing canonical signed digit encoding on the data to be processed according to the function selection mode signal to obtain the target code includes: according to the function selection mode signal, converting each run of l consecutive 1 bits in the data to be processed into (l+1) digits whose highest digit is 1, whose lowest digit is -1, and whose remaining digits are 0, thereby obtaining the target code, where l is greater than or equal to 2.
Specifically, if the data to be processed received by the data processor has a bit width of 2N and the data processor can currently handle a data bit width of N, the canonical signed digit encoding circuit in the data processor may automatically split the 2N-bit data into high N-bit data and low N-bit data and perform canonical signed digit encoding on each, obtaining the corresponding high-order target code and low-order target code.
Further, canonical signed digit encoding can be characterized as follows. An N-bit multiplier is processed from its low bits to its high bits. If there is a run of l (l >= 2) consecutive 1 bits, the run may be converted to the digits "1 (0)^(l-1) (-1)", that is, a 1 followed by l-1 zeros and then a -1, and the remaining (N-l) bits are combined with the (l+1) converted digits to form new data. The new data then serves as the initial data for the next stage of conversion, and the process repeats until the new data contains no run of l (l >= 2) consecutive 1 bits. The target code obtained by applying canonical signed digit encoding to an N-bit multiplier may have a bit width of (N+1). For instance, the data 11 can be written as (100 - 001), so 11 is equivalent to 10(-1); the data 111 can be written as (1000 - 0001), so 111 is equivalent to 100(-1); and other runs of l (l >= 2) consecutive 1 bits are converted in the same way.
For example, suppose the multiplier received by the canonical signed digit encoding circuit is "001010101101110". The first stage of conversion yields the first new data "0010101011100(-1)0". The second stage, applied to the first new data, yields the second new data "0010101100(-1)00(-1)0". The third stage, applied to the second new data, yields the third new data "0010110(-1)00(-1)00(-1)0". The fourth stage, applied to the third new data, yields the fourth new data "00110(-1)0(-1)00(-1)00(-1)0". The fifth stage, applied to the fourth new data, yields the fifth new data "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l >= 2) consecutive 1 bits, so it may be called the initial code; once padding has been applied to the initial code to obtain the intermediate code, canonical signed digit encoding is complete. The bit width of the initial code may equal the bit width of the multiplier. Optionally, after the canonical signed digit encoding circuit has encoded the multiplier and obtained the new data (the initial code), if the highest and second-highest digits of the new data are "10" or "01", the circuit may pad one digit 0 above the highest digit of the new data, so that the high three digits of the corresponding intermediate code are "010" or "001" respectively. Optionally, the bit width of the intermediate code may equal the bit width of the data currently processed by the data processor plus 1.
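The staged run conversion described above can be sketched in Python. This is a behavioral illustration under assumptions, not the encoding circuit: the names `csd_encode` and `csd_value` are hypothetical, digits are returned most significant first, and the sketch always pads one 0 digit above the multiplier so that the result has N+1 digits, which simplifies the padding rule in the text.

```python
def csd_encode(bits: str) -> list[int]:
    """Encode an N-bit binary string as N+1 signed digits in {-1, 0, 1}, MSB first."""
    digits = [int(b) for b in reversed(bits)] + [0]  # LSB first, plus one padding digit
    i = 0
    while i < len(digits) - 1:
        if digits[i] == 1 and digits[i + 1] == 1:
            j = i
            while digits[j] == 1:      # find the end of the run of consecutive 1s
                j += 1
            digits[i] = -1             # lowest digit of the run becomes -1
            for k in range(i + 1, j):
                digits[k] = 0          # the remaining digits of the run become 0
            digits[j] = 1              # the digit just above the run becomes 1
            i = j                      # the new 1 may begin another run
        else:
            i += 1
    return digits[::-1]


def csd_value(digits: list[int]) -> int:
    """Evaluate MSB-first signed digits as an integer."""
    return sum(d * (1 << k) for k, d in enumerate(reversed(digits)))
```

Applied to the multiplier "001010101101110", `csd_encode` converts the runs in the same low-to-high order as the five stages and returns the fifth new data with one leading 0 digit; `csd_value` confirms that the encoded digits still represent the same number.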
In addition, if the data received by the data processor has a bit width of 2N and the data processor can currently handle N-bit data operations, the canonical signed digit encoding circuit in the data processor may split the 2N-bit data into two groups of N-bit data and process each group separately; the two resulting (N+1)-digit intermediate codes may then be combined to serve as the target code. If the data processor can currently handle 2N-bit data operations, the canonical signed digit encoding circuit may pad one digit 0 above the highest digit of the resulting (2N+1)-digit intermediate code (that is, padding processing) and use the padded (2N+2)-digit data as the target code.
S203, obtain the first partial product of the target code and the second partial product of the target code according to the target code and the data to be processed.
Specifically, according to the actual operation requirement, the data processor may obtain the first partial product of the target code and the second partial product of the target code from the target code derived from one piece of sub-data to be processed (the multiplier in the multiplication or multiply-accumulate operation) and from the corresponding other piece of sub-data to be processed (the multiplicand in the multiplication or multiply-accumulate operation). The data processor may obtain the first partial product of the target code through the first partial product acquisition circuit and the second partial product of the target code through the second partial product acquisition circuit.
S204, compress the first partial product of the target code according to the function selection mode signal to obtain a first target operation result.
Optionally, the step in S204 above of compressing the first partial product of the target code according to the function selection mode signal to obtain the first target operation result includes: the low-order Wallace tree subunits accumulate column values in the first partial products of all target codes to obtain a first accumulation result; the selector gates the first accumulation result according to the function selection mode signal to obtain a first carry gating signal; and the high-order Wallace tree subunits perform accumulation according to the first carry gating signal and column values in the first partial products of the target code to obtain the first target operation result.
Specifically, the data processor may use the modified Wallace tree group unit in the first compression circuit to accumulate the first partial products of the target code and obtain the first accumulation result, determine which first carry gating signal to gate according to the data operation mode corresponding to the received function selection mode signal, and use the first carry gating signal as the carry input of the next addition, which adds the column values in the first partial products of the target code, to obtain the first target operation result. Optionally, the first accumulation result may include the sum-bit output signal Sum and the carry output signal Carry obtained when the modified Wallace tree group unit performs accumulation, and Sum and Carry may have the same bit width. In addition, the accumulation unit in effect accumulates the sum-bit output signal Sum and the carry output signal Carry. Optionally, the first target operation result may be 0 or a nonzero value.
It should be noted that the data processor may use the adder in the accumulation unit to add the carry output signal Carry and the sum-bit output signal Sum output by the modified Wallace tree group unit, and output the addition result. Optionally, each Wallace tree subunit in the modified Wallace tree group unit may output one carry output signal Carry_i and one sum-bit output signal Sum_i (i = 0, ..., 2N-1, where i is the number of each Wallace tree subunit, counting from 0). Optionally, the adder receives Carry = {[Carry_0 : Carry_(2N-2)], 0}; that is, the carry output signal Carry received by the adder has a bit width of N, its first 2N-1 bits correspond to the carry output signals of the first 2N-1 Wallace tree subunits in the modified Wallace tree group unit, and its last bit may be replaced by the value 0. Optionally, the sum-bit output signal Sum received by the adder has a bit width of N, and its bits may equal the sum-bit output signals of the Wallace tree subunits in the modified Wallace tree group unit.
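The relationship between Sum, Carry, and the final adder can be sketched with a generic carry-save reduction in Python. This is a textbook chain of 3:2 compressors, not the modified Wallace tree group unit itself; `csa` and `compress_and_add` are hypothetical names, and `width` is the assumed number of columns. The left shift of the carry word mirrors Carry = {[Carry_0 : Carry_(2N-2)], 0}: the carry produced in each column is added one column higher, a 0 fills the lowest position, and the carry out of the top column is dropped.

```python
def csa(a: int, b: int, c: int, width: int) -> tuple[int, int]:
    """3:2 compressor: reduce three words to a sum word and a shifted carry word."""
    mask = (1 << width) - 1
    sum_bits = (a ^ b ^ c) & mask
    # Each column's carry moves one column up; the lowest carry position is 0.
    carry_bits = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return sum_bits, carry_bits


def compress_and_add(partial_products: list[int], width: int) -> int:
    """Reduce partial products to Sum and Carry, then add them with one final adder."""
    mask = (1 << width) - 1
    terms = [p & mask for p in partial_products]
    while len(terms) > 2:
        a, b, c = terms.pop(), terms.pop(), terms.pop()
        terms.extend(csa(a, b, c, width))
    return sum(terms) & mask  # final addition of the Sum and Carry words
```

Only the last step uses a carry-propagating addition; every earlier step works column by column, which is the reason for keeping Sum and Carry separate until the accumulation unit.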
In this embodiment, from the distribution of the low-order partial products and the high-order partial products of all target codes, the partial products of all target codes span a total of 2N columns (N being the bit width of the data that the data processor is currently processing). Counting from the lowest bit, the columns may be numbered 0, ..., 2N-1, and columns 0 to N-1 may be called the low N columns. Optionally, the accumulation result may be the carry output signal Cout output by the last high-order Wallace tree subunit.
It should be noted that the N low-order Wallace tree subunits may accumulate the low N columns in numbering order to obtain the accumulation result. Optionally, the accumulation result may include the carry output signal Carry and the sum-bit output signal Sum of each Wallace tree subunit, as well as the output signal Cout of the last high-order Wallace tree subunit.
It can be understood that the selector in the modified Wallace tree group unit may, according to the received function selection mode signal, gate either the output signal Cout of the last low-order Wallace tree subunit or the value 0 to obtain the carry gating signal.
In this embodiment, from the distribution of the partial products of all target codes, the partial products of all target codes span a total of 2N columns (N being the bit width of the data that the data processor is currently processing). Counting from the lowest bit, the columns may be numbered 0, ..., 2N-1, and columns N to 2N-1 may be called the high N columns.
It should be noted that the N high-order Wallace tree subunits may accumulate the high N columns in numbering order and output the accumulation result. The carry input signal received by the first high-order Wallace tree subunit may be the first carry gating signal output by the selector.
S205, compress the second partial product of the target code according to the function selection mode signal to obtain a second target operation result.
Optionally, the step in S205 above of compressing the second partial product of the target code according to the function selection mode signal to obtain the second target operation result includes: the low-order Wallace tree subunits accumulate column values in the second partial products of all target codes to obtain a second accumulation result; the selector gates the second accumulation result according to the function selection mode signal to obtain a second carry gating signal; and the high-order Wallace tree subunits perform accumulation according to the second carry gating signal and column values in the second partial products of the target code to obtain the second target operation result.
Further, the data processor may use the modified Wallace tree group unit in the second compression circuit to accumulate the second partial products of the target code and obtain the second accumulation result, gate the second carry gating signal according to the function selection mode signal and the second accumulation result, and then perform accumulation on the second accumulation result according to the second carry gating signal to obtain the second target operation result. Optionally, the second target operation result may be 0 or a nonzero value.
In this embodiment, the data processor may carry out steps S204 and S205 simultaneously; this embodiment places no restriction on the order of the two steps.
The data processing method provided in this embodiment can determine, from the received function selection mode signal, which mode of data operation is currently to be handled. It can implement not only multiplication but also multiply-accumulate operations, which improves the versatility of the data processor. In addition, the method does not need a further accumulation pass over the multiplication result to complete a multiply-accumulate operation; a single operation pass directly implements the multiplication or multiply-accumulate operation, which also effectively reduces the power consumption of the data processor. Furthermore, the method applies canonical signed digit encoding to the received data to be processed, so that fewer valid partial products are obtained, which reduces the complexity of the multiplication or multiply-accumulate operation and improves operation efficiency.
In one embodiment, the step in S203 above of obtaining the first partial product of the target code and the second partial product of the target code according to the target code and the data to be processed includes:
S2031, perform conversion processing according to the first target code and the data to be processed to obtain a first original partial product.
Specifically, if a digit in the first target code is -1 and the data to be processed is X, the first original partial product may be -X; if the digit is 1, the first original partial product may be X; and if the digit is 0, the first original partial product may be 0.
S2032, perform sign-bit extension processing according to the first original partial product and the data to be processed to obtain the first partial product of the target code.
Specifically, the bit width of the first original partial product may equal N, the bit width of the data that the data processor is currently processing, and the bit width of the sign-extended first partial product may be 2N. The N bits of the first original partial product form the low N bits of the sign-extended first partial product, and each of the high N bits of the sign-extended first partial product takes the value of the most significant bit of the first original partial product, that is, its sign bit.
S2033, perform the conversion processing according to the second target code and the data to be processed to obtain a second original partial product.
S2034, perform sign-bit extension processing according to the second original partial product and the data to be processed to obtain the second partial product of the target code.
Optionally, the data processor may carry out steps S2031 and S2032 in parallel with steps S2033 and S2034, and no restriction is placed on the processing order.
With the data processing method provided in this embodiment, fewer valid partial products are obtained, which reduces the complexity of the multiplication or multiply-accumulate operation.
An embodiment of the present application further provides a machine learning operation device, which includes one or more of the data processors described in this application and is configured to obtain data to be operated on and control information from other processing devices, execute specified machine learning operations, and pass the execution results to peripheral devices through an I/O interface. Peripheral devices include, for example, cameras, displays, mice, keyboards, network cards, WiFi interfaces, and servers. When more than one data processor is included, the data processors may be linked and may transmit data through a specific structure, for example by being interconnected through a PCIE bus, to support larger-scale machine learning operations. In that case, the data processors may share one control system or have independent control systems, and they may share memory or each accelerator may have its own memory. In addition, any interconnection topology may be used.
The machine learning operation device has high compatibility and can be connected to various types of servers through a PCIE interface.
An embodiment of the present application further provides a combined processing device, which includes the above machine learning operation device, a universal interconnection interface, and other processing devices. The machine learning operation device interacts with the other processing devices to jointly complete the operation specified by the user. Fig. 9 is a schematic diagram of the combined processing device.
The other processing devices include one or more types of general-purpose or special-purpose processors such as a central processing unit (CPU), a graphics processing unit (GPU), or a neural network processor. The number of processors included in the other processing devices is not limited. The other processing devices serve as the interface between the machine learning operation device and external data and control, including data transfer, and perform basic control of the machine learning operation device such as starting and stopping it; the other processing devices may also cooperate with the machine learning operation device to complete operation tasks.
The universal interconnection interface is used to transmit data and control instructions between the machine learning operation device and the other processing devices. The machine learning operation device obtains the required input data from the other processing devices and writes it to the on-chip storage of the machine learning operation device; it may obtain control instructions from the other processing devices and write them to the on-chip control cache of the machine learning operation device; and it may also read the data in the storage module of the machine learning operation device and transmit it to the other processing devices.
Optionally, as shown in Fig. 10, the structure may further include a storage device connected to both the machine learning operation device and the other processing devices. The storage device is used to store data of the machine learning operation device and the other processing devices, and is particularly suitable for data required for an operation that cannot be entirely held in the internal storage of the machine learning operation device or the other processing devices.
The combined processing device may serve as the SoC (system on chip) of equipment such as mobile phones, robots, drones, and video surveillance equipment, which effectively reduces the die area of the control portion, increases processing speed, and lowers overall power consumption. In this case, the universal interconnection interface of the combined processing device is connected to certain components of the equipment, such as cameras, displays, mice, keyboards, network cards, and WiFi interfaces.
In some embodiments, a chip is also claimed, which includes the above machine learning operation device or combined processing device.
In some embodiments, a chip package structure is claimed, which includes the above chip.
In some embodiments, a board card is claimed, which includes the above chip package structure. As shown in Fig. 11, in addition to the above chip 389, the board card may include other supporting components, including but not limited to a memory device 390, a receiving device 391, and a control device 392.
The memory device 390 is connected to the chip in the chip package structure through a bus and is used to store data. The memory device may include multiple groups of storage units 393. Each group of storage units is connected to the chip through a bus. It can be understood that each group of storage units may be DDR SDRAM (Double Data Rate SDRAM, double data rate synchronous dynamic random-access memory).
DDR can double the speed of SDRAM without increasing the clock frequency, because it allows data to be read on both the rising edge and the falling edge of the clock pulse. DDR is therefore twice as fast as standard SDRAM. In one embodiment, the memory device may include 4 groups of storage units, and each group of storage units may include multiple DDR4 chips (particles). In one embodiment, the chip may internally include four 72-bit DDR4 controllers, in which 64 bits are used for data transmission and 8 bits are used for ECC checking. It can be understood that when DDR4-3200 chips are used in each group of storage units, the theoretical data transmission bandwidth can reach 25600 MB/s.
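The 25600 MB/s figure follows from simple arithmetic, sketched below in Python. The sketch assumes that DDR4-3200 denotes 3200 mega-transfers per second and that only the 64 data bits of each 72-bit controller carry payload (the 8 ECC bits are excluded); the function name `ddr_bandwidth_mb_per_s` is hypothetical.

```python
def ddr_bandwidth_mb_per_s(mega_transfers_per_s: int, data_bits: int) -> float:
    """Theoretical peak bandwidth in MB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = data_bits / 8
    return mega_transfers_per_s * bytes_per_transfer


# 3200 MT/s over a 64-bit data path gives 3200 * 8 = 25600 MB/s per controller.
peak = ddr_bandwidth_mb_per_s(3200, 64)
```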
In one embodiment, each group of storage units includes multiple double data rate synchronous dynamic random-access memories arranged in parallel. DDR can transmit data twice within one clock cycle. A controller for controlling the DDR is provided in the chip to control the data transmission and data storage of each storage unit.
The receiving device is electrically connected to the chip in the chip package structure. The receiving device is used to implement data transmission between the chip and external equipment (such as a server or a computer). For example, in one embodiment, the receiving device may be a standard PCIE interface: the server transfers the data to be processed to the chip through the standard PCIE interface, implementing data transfer. Preferably, when a PCIE 3.0 x16 interface is used for transmission, the theoretical bandwidth can reach 16000 MB/s. In another embodiment, the receiving device may be another interface; this application does not limit the specific form of such other interfaces, provided that the interface unit can implement the transfer function. In addition, the calculation results of the chip are still sent back to the external equipment (such as a server) by the receiving device.
The control device is electrically connected to the chip and is used to monitor the state of the chip. Specifically, the chip and the control device may be electrically connected through an SPI interface. The control device may include a microcontroller unit (MCU). The chip may include multiple processing chips, multiple processing cores, or multiple processing circuits, and can drive multiple loads; the chip may therefore be in different working states such as multi-load and light-load. The control device can regulate the working states of the multiple processing chips, multiple processing cores, and/or multiple processing circuits in the chip.
In some embodiments, an electronic device is also provided, which includes the above board card.
The electronic device may be a data processor, robot, computer, printer, scanner, tablet computer, smart terminal, mobile phone, dashboard camera, navigator, sensor, webcam, server, cloud server, camera, video camera, projector, watch, earphone, mobile storage, wearable device, vehicle, household appliance, and/or medical device.
The vehicle includes an aircraft, a ship, and/or a motor vehicle; the household appliance includes a television, air conditioner, microwave oven, refrigerator, rice cooker, humidifier, washing machine, electric lamp, gas stove, and range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound scanner, and/or an electrocardiograph.
It should be noted that, for simplicity of description, the foregoing embodiments are each expressed as a series of circuit combinations. However, those skilled in the art should understand that the application is not limited by the described circuit combinations, because according to the application certain circuits may be implemented in other ways or with other structures. In addition, those skilled in the art should also understand that the embodiments described in this specification are all optional embodiments, and the devices and modules involved are not necessarily required by the application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
The above embodiments express only several implementations of the application. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the application, and these all fall within the protection scope of the application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (33)
1. A data processor, characterized in that the data processor comprises: a first multiplication circuit, a second multiplication circuit, and a partial product exchange circuit, wherein the first multiplication circuit comprises a first modified encoding sub-circuit and a first modified compression sub-circuit, the second multiplication circuit comprises a second modified encoding sub-circuit and a second modified compression sub-circuit, the first modified encoding sub-circuit comprises a first encoding branch and a first selection branch, and the second modified encoding sub-circuit comprises a second encoding branch and a second selection branch; a first output terminal of the first modified encoding sub-circuit is connected to a first input terminal of the partial product exchange circuit, a second output terminal of the first modified encoding sub-circuit is connected to an input terminal of the first modified compression sub-circuit, a first output terminal of the partial product exchange circuit is connected to an input terminal of the first modified encoding sub-circuit, a second output terminal of the partial product exchange circuit is connected to an input terminal of the second modified encoding sub-circuit, a first output terminal of the second modified encoding sub-circuit is connected to a second input terminal of the partial product exchange circuit, and a second output terminal of the second modified encoding sub-circuit is connected to an input terminal of the second modified compression sub-circuit;
wherein the first encoding branch is configured to perform canonical signed digit encoding on received first data to obtain a sign-extended first partial product; the first selection branch is configured to select the first partial product of the target encoding from the sign-extended first partial product; the first modified compression sub-circuit is configured to compress the first partial product of the target encoding to obtain a first target operation result; the second encoding branch is configured to perform canonical signed digit encoding on received second data to obtain a sign-extended second partial product; the second selection branch is configured to select the second partial product of the target encoding from the sign-extended second partial product; the second modified compression sub-circuit is configured to compress the second partial product of the target encoding to obtain a second target operation result; and the partial product exchange circuit is configured to exchange the sign-extended first partial product and the sign-extended second partial product.
2. The data processor according to claim 1, characterized in that the first multiplication circuit and the second multiplication circuit each comprise a first input terminal for receiving a function selection mode signal; the partial product exchange circuit comprises a third input terminal for receiving the function selection mode signal; and the function selection mode signal is used to determine which of the different modes of data operation the data processor can currently process.
3. The data processor according to claim 2, characterized in that the first modified encoding sub-circuit comprises: a first modified encoding processing branch and a first partial product selection branch, an output terminal of the first modified encoding processing branch being connected to an input terminal of the first partial product selection branch;
wherein the first modified encoding processing branch is configured to perform canonical signed digit encoding on the received first data to obtain a first target encoding; and the first partial product selection branch is configured to obtain the sign-extended first partial product according to the first target encoding, select from the sign-extended first partial product, receive the sign-extended second partial product output by the partial product exchange circuit, and take the received sign-extended second partial product, together with the sign-extended first partial product obtained after the selection, as the first partial product of the target encoding.
4. The data processor according to claim 3, characterized in that the first modified encoding processing branch comprises: a first modified encoding unit, a low-order partial product acquisition unit, a low-order selector group unit, a high-order partial product acquisition unit, and a high-order selector group unit; a first output terminal of the first modified encoding unit is connected to a first input terminal of the low-order partial product acquisition unit, an output terminal of the low-order selector group unit is connected to a second input terminal of the low-order partial product acquisition unit, a second output terminal of the first modified encoding unit is connected to a first input terminal of the high-order partial product acquisition unit, and an output terminal of the high-order selector group unit is connected to a second input terminal of the high-order partial product acquisition unit;
wherein the first modified encoding unit is configured to perform canonical signed digit encoding on the received first data, determine, according to the received function selection mode signal, the bit width of the data that the data processor can process, and obtain the first target encoding according to that bit width; the low-order partial product acquisition unit is configured to obtain a sign-extended first low-order partial product according to the received first low-order target encoding in the first target encoding and the first data; the low-order selector group unit is configured to gate the values in the sign-extended first low-order partial product; the high-order partial product acquisition unit is configured to obtain a sign-extended first high-order partial product according to the received first high-order target encoding in the first target encoding and the first data; and the high-order selector group unit is configured to gate the values in the sign-extended first high-order partial product.
5. The data processor according to claim 4, characterized in that the first modified encoding unit comprises: a first data input port, a first mode selection signal input port, a low-order target encoding output port, and a high-order target encoding output port; the first data input port is used to receive the first data; the first mode selection signal input port is used to receive the function selection mode signal; the low-order target encoding output port is used to output the first low-order target encoding obtained after canonical signed digit encoding is performed on the first data; and the high-order target encoding output port is used to output the first high-order target encoding obtained after canonical signed digit encoding is performed on the first data.
6. The data processor according to claim 4 or 5, characterized in that the low-order partial product acquisition unit comprises: a low-order target encoding input port, a gated value input port, a first data input port, and a low-order partial product output port; the low-order target encoding input port is used to receive the first low-order target encoding output by the first modified encoding unit; the gated value input port is used to receive the values in the sign-extended first low-order partial product obtained after gating by the low-order selector group unit; the first data input port is used to receive the first data; and the low-order partial product output port is used to output the sign-extended first low-order partial product.
7. The data processor according to claim 4, characterized in that the high-order partial product acquisition unit comprises: a high-order target encoding input port, a gated value input port, a first data input port, and a high-order partial product output port; the high-order target encoding input port is used to receive the first high-order target encoding output by the first modified encoding unit; the gated value input port is used to receive the values in the sign-extended first high-order partial product output after gating by the high-order selector group unit; the first data input port is used to receive the first data; and the high-order partial product output port is used to output the sign-extended first high-order partial product.
8. The data processor according to claim 4, characterized in that the low-order selector group unit comprises: a low-order selector, the low-order selector being used to gate the values in the sign-extended first low-order partial product.
9. The data processor according to claim 4, characterized in that the high-order selector group unit comprises: a high-order selector, the high-order selector being used to gate the values in the sign-extended first high-order partial product.
10. The data processor according to claim 3, characterized in that the first partial product selection branch comprises: a function selection mode signal input port, a first partial product input port, a second partial product input port, a first partial product output port, and a gated partial product output port; the function selection mode signal input port is used to receive the function selection mode signal; the first partial product input port is used to receive the sign-extended first partial product output by the first modified encoding unit; the second partial product input port is used to receive the sign-extended second partial product exchanged by the partial product exchange circuit; the first partial product output port is used to output the sign-extended first partial product that needs to be exchanged by the partial product exchange circuit; and the gated partial product output port is used to output the sign-extended first partial product after gating, together with the received sign-extended second partial product.
11. The data processor according to claim 1, characterized in that the first modified compression sub-circuit comprises: a modified Wallace tree group unit and an accumulation unit, an output terminal of the modified Wallace tree group unit being connected to an input terminal of the accumulation unit; the modified Wallace tree group unit is configured to, when data operations of different modes are processed, accumulate the values of each column in the obtained first partial product of the target encoding to obtain an accumulation result; and the accumulation unit is configured to perform an addition on the accumulation result.
12. The data processor according to claim 11, characterized in that the modified Wallace tree group unit comprises: a low-order Wallace tree subunit, a selector, and a high-order Wallace tree subunit; an output terminal of the low-order Wallace tree subunit is connected to an input terminal of the selector, and an output terminal of the selector is connected to an input terminal of the high-order Wallace tree subunit; wherein the low-order Wallace tree subunit is configured to accumulate the values of each column in the first partial product of the target encoding to obtain the accumulation result; the selector is configured to gate the carry input signal received by the high-order Wallace tree subunit; and the high-order Wallace tree subunit is configured to accumulate the values of each column in the first partial product of the target encoding to obtain the accumulation result.
13. The data processor according to claim 11, characterized in that the accumulation unit comprises: an adder, the adder being used to perform an addition on the accumulation result.
14. The data processor according to claim 1, characterized in that the second modified encoding sub-circuit comprises: a second modified encoding processing branch and a second partial product selection branch, an output terminal of the second modified encoding processing branch being connected to an input terminal of the second partial product selection branch;
wherein the second modified encoding processing branch is configured to perform canonical signed digit encoding on the received second data to obtain a second target encoding; and the second partial product selection branch is configured to obtain the sign-extended second partial product according to the second target encoding, select from the sign-extended second partial product, receive the sign-extended first partial product output by the partial product exchange circuit, and take the received sign-extended first partial product, together with the sign-extended second partial product obtained after the selection, as the second partial product of the target encoding.
15. The data processor according to claim 14, characterized in that the second partial product selection branch comprises: a function selection mode signal input port, a second partial product input port, a first partial product input port, a second partial product output port, and a gated partial product output port; the function selection mode signal input port is used to receive the function selection mode signal; the second partial product input port is used to receive the sign-extended second partial product output by the second modified encoding processing branch; the first partial product input port is used to receive the sign-extended first partial product obtained after exchange by the partial product exchange circuit; the second partial product output port is used to output the sign-extended second partial product that needs to be exchanged by the partial product exchange circuit; and the gated partial product output port is used to output the sign-extended second partial product after gating, together with the received sign-extended first partial product.
16. The data processor according to claim 1, characterized in that the partial product exchange circuit comprises: a function selection mode signal input port, a first partial product input port, a first partial product output port, a second partial product input port, and a second partial product output port; the function selection mode signal input port is used to receive the function selection mode signal; the first partial product input port is used to receive the sign-extended first partial product that is output by the first partial product selection branch and needs to be exchanged; the first partial product output port is used to output the sign-extended first partial product; the second partial product input port is used to receive the sign-extended second partial product that is output by the second partial product selection branch and needs to be exchanged; and the second partial product output port is used to output the sign-extended second partial product.
17. A data processing method, characterized in that the method comprises:
receiving data to be processed and a function selection mode signal, wherein the function selection mode signal indicates which of the different modes of data operation the data processor can currently process;
determining, according to the function selection mode signal, whether the data to be processed needs to be split;
if the data to be processed needs to be split, splitting the data to be processed to obtain split data;
performing canonical signed digit encoding on the split data to obtain a target encoding;
performing conversion according to the target encoding and the split data to obtain a sign-extended partial product;
determining, according to the function selection mode signal, whether the sign-extended partial product needs to be exchanged;
if the sign-extended partial product does not need to be exchanged, taking the sign-extended partial product as the partial product of the target encoding;
compressing the partial product of the target encoding to obtain a target operation result.
18. The method according to claim 17, characterized in that determining, according to the function selection mode signal, whether the data to be processed needs to be split comprises: determining, according to the function selection mode signal, whether the bit width of the data to be processed is equal to the data bit width of the operation mode that the data processor can currently process.
19. The method according to claim 18, characterized in that, after determining, according to the function selection mode signal, whether the bit width of the data to be processed is equal to the data bit width of the operation mode that the data processor can currently process, the method further comprises: if the data to be processed does not need to be split, proceeding to perform canonical signed digit encoding on the data to be processed to obtain the target encoding.
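The split decision in claims 18 and 19 comes down to comparing two bit widths. The following is a minimal sketch under stated assumptions: the mode signal is taken to have already been decoded into a width `mode_bits`, the data width is assumed to be a multiple of that width, and the function name and the low-part-first ordering are hypothetical choices rather than details from the claims.

```python
def split_if_needed(data: int, data_bits: int, mode_bits: int) -> list[int]:
    """Return the operand unchanged, or its mode_bits-wide pieces (low piece first)."""
    if data_bits == mode_bits:
        # Widths match: no split, the data goes straight to encoding.
        return [data & ((1 << data_bits) - 1)]
    if data_bits % mode_bits != 0:
        raise ValueError("data width must be a multiple of the mode width")
    mask = (1 << mode_bits) - 1
    return [(data >> shift) & mask for shift in range(0, data_bits, mode_bits)]


print([hex(p) for p in split_if_needed(0xABCD, 16, 8)])   # ['0xcd', '0xab']
print([hex(p) for p in split_if_needed(0xABCD, 16, 16)])  # ['0xabcd']
```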
20. The method according to any one of claims 17 to 19, characterized in that performing canonical signed digit encoding on the split data to obtain the target encoding comprises: converting each run of l consecutive bits of value 1 in the split data into (l+1) digits in which the highest digit is 1, the lowest digit is -1, and the remaining digits are 0, to obtain the target encoding, where l is greater than or equal to 2.
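The run-replacement rule of claim 20 can be sketched in a few lines. The version below is an illustration only: it assumes an unsigned input of `width` bits, stores digits least significant first, and uses a hypothetical function name. Sign handling and the mode-dependent adjustments described elsewhere in the patent are not modelled.

```python
def csd_encode(value: int, width: int) -> list[int]:
    """Encode an unsigned integer as width+1 signed digits in {-1, 0, 1}, LSB first."""
    digits = [(value >> i) & 1 for i in range(width)] + [0]
    i = 0
    while i < width:
        if digits[i] == 1 and digits[i + 1] == 1:
            # A run of l >= 2 ones starts at position i; find the first digit after it.
            j = i
            while digits[j] == 1:
                j += 1
            digits[i] = -1                 # lowest digit of the run becomes -1
            for k in range(i + 1, j):
                digits[k] = 0              # digits in between become 0
            digits[j] = 1                  # the extra (l+1)-th digit becomes 1
            i = j                          # the new 1 may begin another run
        else:
            i += 1
    return digits


def csd_value(digits: list[int]) -> int:
    """Recover the integer represented by a list of signed digits (LSB first)."""
    return sum(d << i for i, d in enumerate(digits))


print(csd_encode(7, 3))    # [-1, 0, 0, 1]      i.e. 8 - 1
print(csd_encode(11, 4))   # [-1, 0, -1, 0, 1]  i.e. 16 - 4 - 1
```

Replacing a run of ones by a single +1 and a single -1 reduces the number of nonzero digits, and therefore the number of nonzero partial products that have to be summed.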
21. The method according to claim 17, characterized in that performing canonical signed digit encoding on the split data to obtain the target encoding comprises:
performing canonical signed digit encoding on the split data to obtain an intermediate encoding;
obtaining the target encoding according to the intermediate encoding and the function selection mode signal.
22. The method according to claim 17, characterized in that performing conversion according to the target encoding and the split data to obtain the sign-extended partial product comprises:
performing conversion according to the target encoding and the split data to obtain an initial partial product;
performing sign extension on the initial partial product to obtain the sign-extended partial product.
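The two steps of claim 22 can be illustrated as follows. This is a sketch under assumptions, not the patented circuit: each signed digit d selects -x, 0, or x as an initial partial product held in n+1 bits, which is then sign-extended to 2n bits and shifted to its digit position. The function names, the n+1-bit initial width, and the 2n-bit output width are choices made for this example.

```python
def sign_extend(pattern: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a from_bits-wide two's-complement pattern to to_bits."""
    pattern &= (1 << from_bits) - 1
    if pattern >> (from_bits - 1):        # sign bit is set
        pattern -= 1 << from_bits         # reinterpret as a negative integer
    return pattern & ((1 << to_bits) - 1)


def partial_products(x: int, digits: list[int], n: int) -> list[int]:
    """One 2n-bit partial product per signed digit (digits are LSB first)."""
    out_bits = 2 * n
    mask = (1 << out_bits) - 1
    products = []
    for i, d in enumerate(digits):
        initial = (d * x) & ((1 << (n + 1)) - 1)       # initial partial product
        extended = sign_extend(initial, n + 1, out_bits)
        products.append((extended << i) & mask)        # align to digit weight 2**i
    return products


# -3 multiplied by 7, with 7 written as the signed digits [-1, 0, 0, 1] (8 - 1)
pps = partial_products(-3, [-1, 0, 0, 1], n=8)
print([hex(p) for p in pps])      # ['0x3', '0x0', '0x0', '0xffe8']
print(hex(sum(pps) & 0xFFFF))     # 0xffeb, the 16-bit pattern of -21
```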
23. The method according to claim 17, characterized in that determining, according to the function selection mode signal, whether the sign-extended partial product needs to be exchanged comprises: determining, according to the function selection mode signal, whether the bit widths of the data currently processed by the data processor are the same.
24. The method according to claim 23, characterized in that, after determining, according to the function selection mode signal, whether the sign-extended partial product needs to be exchanged, the method further comprises: if the sign-extended partial product needs to be exchanged, exchanging the high-order partial product or the low-order partial product in the sign-extended partial product.
25. The method according to claim 17, characterized in that compressing the partial product of the target encoding to obtain the target operation result comprises:
accumulating the partial product of the target encoding to obtain an intermediate operation result;
accumulating the intermediate operation result to obtain the target operation result.
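The two-stage compression of claim 25 resembles the carry-save arrangement commonly used in Wallace tree multipliers: the rows are first reduced column by column without propagating carries, and a single adder then produces the result. The sketch below assumes 3:2 compressors (full adders) and hypothetical function names; it is not a description of the patented tree.

```python
def carry_save_add(a: int, b: int, c: int, mask: int) -> tuple[int, int]:
    """3:2 compression of every bit column: three rows in, a sum row and a carry row out."""
    sum_row = a ^ b ^ c
    carry_row = ((a & b) | (a & c) | (b & c)) << 1
    return sum_row & mask, carry_row & mask


def compress_and_add(rows: list[int], width: int) -> int:
    """Reduce the rows to at most two with 3:2 compressors, then add once."""
    mask = (1 << width) - 1
    rows = [r & mask for r in rows]
    while len(rows) > 2:
        reduced = []
        for k in range(0, len(rows) - 2, 3):
            reduced.extend(carry_save_add(rows[k], rows[k + 1], rows[k + 2], mask))
        reduced.extend(rows[len(rows) - len(rows) % 3:])   # rows left over this pass
        rows = reduced
    return sum(rows) & mask                                # final carry-propagate add


# Partial products of -3 * 7 in 16 bits (x = -3, signed digits [-1, 0, 0, 1])
print(hex(compress_and_add([0x0003, 0x0000, 0x0000, 0xFFE8], 16)))  # 0xffeb
```

Only the last step needs a full carry chain, which is why the reduction stage can be kept shallow in hardware.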
26. The method according to claim 25, characterized in that accumulating the intermediate operation result to obtain the target operation result comprises:
accumulating, by a low-order Wallace tree subunit, the column values in the partial products of all the target encodings to obtain an accumulation result;
gating, by a selector, the accumulation result according to the function selection mode signal to obtain a carry gating signal;
accumulating, by a high-order Wallace tree subunit, according to the carry gating signal and the column values in the partial product of the target encoding, to obtain the target operation result.
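The role of the selector between the low-order and high-order Wallace tree subunits can be pictured with a behavioural model. The sketch below assumes that the selector either passes the carry out of the low half into the high half (one full-width operation) or blocks it (two independent half-width operations); this interpretation, the equal split into halves, and all names are assumptions made for illustration.

```python
def gated_column_sum(rows: list[int], width: int, pass_carry: bool) -> int:
    """Sum rows in two halves, optionally forwarding the low-half carry."""
    half = width // 2
    low_mask = (1 << half) - 1
    high_mask = (1 << (width - half)) - 1
    rows = [r & ((1 << width) - 1) for r in rows]

    low_total = sum(r & low_mask for r in rows)             # low-order subunit
    carry_in = (low_total >> half) if pass_carry else 0     # selector gates the carry
    high_total = sum(r >> half for r in rows) + carry_in    # high-order subunit

    return ((high_total & high_mask) << half) | (low_total & low_mask)


rows = [0x00FF, 0x0001]
print(hex(gated_column_sum(rows, 16, pass_carry=True)))   # 0x100 (one 16-bit sum)
print(hex(gated_column_sum(rows, 16, pass_carry=False)))  # 0x0   (two separate 8-bit sums)
```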
27. A machine learning operation device, characterized in that the machine learning operation device comprises one or more data processors according to any one of claims 1 to 16, and is configured to obtain input data to be operated on and control information from other processing devices, perform a specified machine learning operation, and pass the processing result to other processing devices through an I/O interface;
when the machine learning operation device comprises multiple data processors, the multiple data processors are connected through a preset specific structure and transmit data;
wherein the multiple data processors are interconnected and transmit data through a PCIE bus to support larger-scale machine learning operations; the multiple data processors share one control system or have their own control systems; the multiple data processors share memory or have their own memories; and the interconnection of the multiple data processors is any interconnection topology.
28. A combined processing device, characterized in that the combined processing device comprises the machine learning operation device according to claim 27, a universal interconnection interface, and other processing devices;
the machine learning operation device interacts with the other processing devices to jointly complete a computing operation specified by a user.
29. The combined processing device according to claim 28, characterized by further comprising: a storage device, the storage device being connected to the machine learning operation device and the other processing devices respectively, and used to store data of the machine learning operation device and the other processing devices.
30. A neural network chip, characterized in that the machine learning chip comprises the machine learning operation device according to claim 27, or the combined processing device according to claim 28, or the combined processing device according to claim 29.
31. An electronic device, characterized in that the electronic device comprises the chip according to claim 30.
32. A board card, characterized in that the board card comprises: a memory device, a receiving device, a control device, and the neural network chip according to claim 30;
wherein the neural network chip is connected to the memory device, the control device, and the receiving device respectively;
the memory device is used to store data;
the receiving device is used to implement data transmission between the chip and an external device;
the control device is used to monitor the state of the chip.
33. The board card according to claim 32, characterized in that
the memory device comprises: multiple groups of storage units, each group of storage units being connected to the chip through a bus, the storage units being DDR SDRAM;
the chip comprises: a DDR controller, used to control the data transmission and data storage of each storage unit;
the receiving device is a standard PCIE interface.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910902610.3A CN110413254B (en) | 2019-09-24 | 2019-09-24 | Data processor, method, chip and electronic equipment |
CN201911349822.XA CN111008003B (en) | 2019-09-24 | 2019-09-24 | Data processor, method, chip and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910902610.3A CN110413254B (en) | 2019-09-24 | 2019-09-24 | Data processor, method, chip and electronic equipment |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911349822.XA Division CN111008003B (en) | 2019-09-24 | 2019-09-24 | Data processor, method, chip and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110413254A (en) | 2019-11-05 |
CN110413254B (en) | 2020-01-10 |
Family
ID=68370615
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911349822.XA Active CN111008003B (en) | 2019-09-24 | 2019-09-24 | Data processor, method, chip and electronic equipment |
CN201910902610.3A Active CN110413254B (en) | 2019-09-24 | 2019-09-24 | Data processor, method, chip and electronic equipment |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911349822.XA Active CN111008003B (en) | 2019-09-24 | 2019-09-24 | Data processor, method, chip and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN111008003B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111767025A (en) * | 2020-08-04 | 2020-10-13 | 腾讯科技(深圳)有限公司 | Chip comprising multiply-accumulator, terminal and control method of floating-point operation |
CN112558920A (en) * | 2020-12-21 | 2021-03-26 | 清华大学 | Signed/unsigned multiply-accumulate device and method |
CN113031911A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Multiplier, data processing method, device and chip |
CN113031918A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
CN113033799A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
CN113033788A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1454347A (en) * | 2000-10-16 | 2003-11-05 | 诺基亚公司 | Multiplier and shift device using signed digit representation |
US20070180015A1 (en) * | 2005-12-09 | 2007-08-02 | Sang-In Cho | High speed low power fixed-point multiplier and method thereof |
CN101923459A (en) * | 2009-06-17 | 2010-12-22 | 复旦大学 | Reconfigurable multiplication/addition arithmetic unit for digital signal processing |
CN103955585A (en) * | 2014-05-13 | 2014-07-30 | 复旦大学 | FIR (finite impulse response) filter structure for low-power fault-tolerant circuit |
CN104011665A (en) * | 2011-12-23 | 2014-08-27 | 英特尔公司 | Super Multiply Add (Super MADD) Instruction |
CN105183424A (en) * | 2015-08-21 | 2015-12-23 | 电子科技大学 | Fixed-bit-width multiplier with high accuracy and low energy consumption properties |
CN110190843A (en) * | 2018-04-10 | 2019-08-30 | 北京中科寒武纪科技有限公司 | Compressor circuit, Wallace tree circuit, multiplier circuit, chip and equipment |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5784305A (en) * | 1995-05-01 | 1998-07-21 | Nec Corporation | Multiply-adder unit |
CN1324456C (en) * | 2004-01-09 | 2007-07-04 | 上海交通大学 | Digital signal processor using mixed compression two stage flow multiplicaton addition unit |
CN100356315C (en) * | 2004-09-02 | 2007-12-19 | 中国人民解放军国防科学技术大学 | Design method of number mixed multipler for supporting single-instruction multiple-operated |
CN100552620C (en) * | 2007-09-21 | 2009-10-21 | 清华大学 | Large number multiplication device based on quadratic B ooth coding |
CN101625634A (en) * | 2008-07-09 | 2010-01-13 | 中国科学院半导体研究所 | Reconfigurable multiplier |
CN102591615A (en) * | 2012-01-16 | 2012-07-18 | 中国人民解放军国防科学技术大学 | Structured mixed bit-width multiplying method and structured mixed bit-width multiplying device |
CN107977191B (en) * | 2016-10-21 | 2021-07-27 | 中国科学院微电子研究所 | Low-power-consumption parallel multiplier |
CN108459840B (en) * | 2018-02-14 | 2021-07-09 | 中国科学院电子学研究所 | SIMD structure floating point fusion point multiplication operation unit |
Non-Patent Citations (1)
Title |
---|
HENG QUAN et al.: "A Novel Vector/SIMD Multiply-Accumulate Unit based on Reconfigurable Booth Array", 2010 10TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113031911A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Multiplier, data processing method, device and chip |
CN113031918A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
CN113033799A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
CN113033788A (en) * | 2019-12-24 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
CN113033788B (en) * | 2019-12-24 | 2023-08-18 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
CN113033799B (en) * | 2019-12-24 | 2023-09-08 | 上海寒武纪信息科技有限公司 | Data processor, method, device and chip |
CN111767025A (en) * | 2020-08-04 | 2020-10-13 | 腾讯科技(深圳)有限公司 | Chip comprising multiply-accumulator, terminal and control method of floating-point operation |
CN111767025B (en) * | 2020-08-04 | 2023-11-21 | 腾讯科技(深圳)有限公司 | Chip comprising multiply accumulator, terminal and floating point operation control method |
CN112558920A (en) * | 2020-12-21 | 2021-03-26 | 清华大学 | Signed/unsigned multiply-accumulate device and method |
CN112558920B (en) * | 2020-12-21 | 2022-09-09 | 清华大学 | Signed/unsigned multiply-accumulate device and method |
Also Published As
Publication number | Publication date |
---|---|
CN110413254B (en) | 2020-01-10 |
CN111008003B (en) | 2023-10-13 |
CN111008003A (en) | 2020-04-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110413254A (en) | Data processor, method, chip and electronic equipment | |
CN110362293A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN109086076A (en) | Neural network processing device and method for executing dot product instruction | |
CN110515589A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110163358A (en) | Computing device and method | |
CN107957976A (en) | Computational method and related product | |
CN110531954A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110515587A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN105913118A (en) | Artificial neural network hardware implementation device based on probability calculation | |
CN110515590A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN107957977A (en) | Computational method and related product | |
CN109711540A (en) | Computing device and board | |
CN111258544B (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110688087B (en) | Data processor, method, chip and electronic equipment | |
CN110647307B (en) | Data processor, method, chip and electronic equipment | |
CN110515588A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN111258541A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN110515586A (en) | Multiplier, data processing method, chip and electronic equipment | |
CN210006029U (en) | Data processor | |
CN210006030U (en) | Data processor | |
CN110554854B (en) | Data processor, method, chip and electronic equipment | |
CN209895329U (en) | Multiplier and method for generating a digital signal | |
CN111260070B (en) | Operation method, device and related product | |
CN111381875B (en) | Data comparator, data processing method, chip and electronic equipment | |
CN210109789U (en) | Data processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |