EP1665029A2 - Digital signal processing device - Google Patents

Digital signal processing device

Info

Publication number
EP1665029A2
EP1665029A2 EP04761027A EP04761027A EP1665029A2 EP 1665029 A2 EP1665029 A2 EP 1665029A2 EP 04761027 A EP04761027 A EP 04761027A EP 04761027 A EP04761027 A EP 04761027A EP 1665029 A2 EP1665029 A2 EP 1665029A2
Authority
EP
European Patent Office
Prior art keywords
unit
signal processing
digital signal
processing device
format
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP04761027A
Other languages
German (de)
English (en)
Inventor
Alois Hahn
Premysl Vaclavik
Heinz Gerald Krottendorfer
Christian Tiringer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ON DEMAND MICROELECTRONICS AG
Original Assignee
On Demand Microelectronics GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by On Demand Microelectronics GmbH
Publication of EP1665029A2
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/14 Conversion to or from non-weighted codes
    • H03M7/24 Conversion to or from floating-point codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q99/00 Subject matter not provided for in other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2207/00 Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/38 Indexing scheme relating to groups G06F7/38 - G06F7/575
    • G06F2207/3804 Details
    • G06F2207/3808 Details concerning the type of numbers or the way they are handled
    • G06F2207/3812 Devices capable of handling different types of numbers
    • G06F2207/3824 Accepting both fixed-point and floating-point numbers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/499 Denomination or exception handling, e.g. rounding or overflow
    • G06F7/49942 Significance control
    • G06F7/49947 Rounding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products

Definitions

  • The invention relates to a digital signal processing device, in particular a digital computing device, according to the introductory part of claim 1.
  • Digital signals are processed digitally using a wide variety of algorithms, the digital signals being derived, for example, from originally analog signals by sampling.
  • The signal processing can take place in the form of calculations in accordance with telecommunications algorithms, for example in order to implement a bandpass filter or the like.
  • The digital signal values are stored in binary form in the storage means, the values usually being stored in 2's complement representation as an integer or else as a fixed-point number.
  • the more complex floating-point format
  • DSP: digital signal processors
  • Format adjustments and roundings are carried out programmatically with the aid of a series of individual commands, several clock cycles being required for the execution of these commands; in some cases, the number of clock cycles required for this can be greater than the number of clock cycles for the actual algorithmic signal processing or calculation, which is of course particularly disadvantageous.
  • The invention provides a digital signal processing device with the features of claim 1.
  • Particularly advantageous embodiments and further developments are defined in the subclaims.
  • In accordance with a particularly preferred aspect, a special format conversion unit, preferably with a rounding unit, is thus integrated directly into the data path of the arithmetic unit.
  • The possible format conversions and, where applicable, rounding operations thus become a direct component of each signal processing command, so that generally no separate clock cycle is required.
  • Another advantage is that program creation is significantly simplified, since the programmer is automatically relieved of the problems associated with format conversion.
  • The number format conversion unit, possibly with the integrated rounding unit, does not need to be designed for a predetermined format; rather, a format specification or setting is particularly advantageously possible, for which purpose a format register is preferably provided as the format specification unit.
  • This format register is loaded once as required and then, based on its content, determines the format conversions and roundings and therefore the exact functioning of these units.
  • The format register contains fields for specifying the data format, such as the total number of digits and the number of digits after the decimal point, for the source format as well as for the target format.
  • A saturation function (also called a clipping function) can be integrated into the number format conversion unit in order to prevent a signal value from overflowing into the wrong sign when the maximum value is exceeded.
  • Integrating such a saturation function, i.e. installing a saturation unit in the format conversion unit, also ensures that no additional clock cycle is required, and, as mentioned, errors which could arise in connection with the format conversion and rounding function are prevented by this saturation function.
  • A comparable saturation function is preferably also assigned to the rounding unit, in order to identify any overflow when rounding up and to deliver the correct result.
  • FIG. 1 shows a block diagram of a signal processor known per se
  • FIG. 2 shows a schematic block diagram of an arithmetic unit of such a processor, with a number format conversion unit according to the invention, to which a format specification unit is assigned;
  • FIG. 5 shows, in two related sub-figures 5A and 5B, a detailed structure of the number format conversion unit including rounding unit and saturation unit;
  • FIG. 6 shows an example of a table with signed positive and negative 4-bit binary numbers, with a value range from -8 to +7;
  • FIG. 7 shows a comparable table with 4-bit binary numbers, each having two digits before the decimal point and two digits after the decimal point, the values extending from -2 to +1.75;
  • FIG. 8 schematically shows, in association with the arrangement of FIG. 5, an example of a number format conversion with rounding and saturation, with an overflow;
  • FIG. 9 shows a comparable example of a number format conversion with rounding and saturation, but now with an underflow.
  • FIG. 1 schematically shows in a block diagram the known structure of a processor, a program memory 1 being provided to which a program controller 2 is connected in order to correspondingly control an arithmetic logic unit 4 receiving the data to be processed from a data memory 3.
  • The Harvard architecture, as shown, or also the Von Neumann architecture is known for the construction of such arithmetic units 4; an arithmetic unit 4 with Harvard architecture is assumed here, although this is of course not to be seen as restrictive.
  • As will be explained in more detail below with reference to FIG. 3, the arithmetic unit 4 generally contains an arithmetic unit (ARU), and it defines a data path.
  • ARU: arithmetic unit
  • Each program instruction is carried out in three phases, the sequence being controlled with the aid of the program controller 2.
  • the so-called "fetch" phase (instruction fetch)
  • A command word is read out of the program memory and fed to the program controller 2, as illustrated in FIG. 1 with the reference symbol 1a.
  • This command word is decoded and split into individual micro-operations with which the arithmetic unit 4 is controlled. This is indicated in FIG. 1 by the connection 2a between the program controller 2 and the arithmetic unit 4.
  • In the "execute" phase the instruction is processed; accordingly, in this phase the micro-operations are passed on in the form of control signals via the connection 2a to the arithmetic unit 4 for actual execution, and in addition data can be loaded from the data memory 3 into the arithmetic unit 4 via the data connection 3a. In the arithmetic unit 4, this data is processed arithmetically and temporarily stored in registers. After this processing, the data obtained are stored again in the data memory 3, for example via a connection 4a.
  • The data memory 3 forms, for example, input storage means and at the same time output storage means for the arithmetic unit 4.
  • FIG. 2 shows the structure of an arithmetic unit 4 in somewhat more detail in a block diagram. Data A, B to be combined with one another are fed, for example, to input registers 5A, 5B (for example from the data memory 3 according to FIG. 1), which can be regarded as input storage means 5; the data then arrive in the arithmetic unit when the aforementioned micro-operations are processed, a multiplier unit 6 being provided here, for example, in series with an adder unit 7.
  • The result of these arithmetic operations is normally fed to output storage means, here illustrated schematically by a result register 8, the result being indicated by "Y".
  • A number format conversion unit 10, which at the same time contains a rounding unit, is also arranged directly in the data path, as will be explained in more detail below.
  • This number format conversion unit 10, hereinafter referred to as the conversion unit or also the adaptation unit, can convert the supplied data into a predetermined number format; for this purpose, as shown in FIG. 2, a format specification unit 11 is provided, in particular in the form of a format register, the output of which is connected to the conversion unit 10, as indicated in FIG. 2 by the connection 11a.
  • This format specification unit 11 can be filled with appropriate format information for the respective calculation or data processing operation, as indicated schematically in FIG. 2 at the input 11b.
  • The arrangement of the conversion unit 10 directly in the data path 9 leading from the input registers 5A, 5B to the result register 8 in the manner shown means that the desired format conversions and, where applicable, rounding operations can take place in the same clock cycle in which the arithmetic operations are carried out, with only a certain delay being accepted until the data appear at the output of the conversion unit 10.
  • The present hardware implementation of these conversion and rounding tasks directly in the data path 9 also simplifies programming, since in the respective program that is to be stored in the program memory 1 in FIG. 1, only the desired formats for storage in the format specification unit 11 have to be provided (unless these formats result automatically from the storage format of the data memory 3), and no conversion and rounding operations have to be programmed.
  • If the delay time to be taken into account in the present technology is rather long compared to the clock period, for example if it already takes up half a clock cycle, which could be the case with particularly fast arithmetic units 4 with particularly short clock cycles, a storage element (register) for buffering can be installed within the conversion unit 10, so that the format conversion and rounding started in the given clock cycle can be completed in a second clock cycle without the delay times impairing the result of the operations in the arithmetic unit, which is stored as the result Y in register 8.
  • In the case of a MAC function (MAC: multiply-accumulate), two input numbers (operands) are multiplied, and the multiplication result is then added to the content of an accumulator.
  • Such a MAC function is realized, for example, with the arithmetic unit 4 according to FIG. 3; in addition, the result obtained is subjected to a range adjustment (number format conversion and rounding).
  • The signed 2's complement representation is often used for the numbers, as will be explained in more detail below with reference to FIGS. 6 and 7, but the invention is of course not restricted to such representations. In the following description, however, for the sake of simplicity, such a signed 2's complement representation will be used throughout.
  • The desired numbers A, B for the multiplication to be carried out are read out from the data memory 3 and loaded into the registers 5A, 5B, which is accomplished by the program controller (program controller 2 in FIG. 1) by means of corresponding load commands.
  • The data memory 3 also receives "CONTROL" control commands from the program controller 2 via a control line 3b.
  • The data or operands A, B are then fed to the multiplier unit 6 in the next step, with a corresponding control signal (MUL / DIV: multiply / divide) being applied to it by the program controller 2 at 6b.
  • The multiplication result is fed via the connection 6a to the adder / subtractor 7, to which the program controller 2 supplies an add command (or subtract command; ADD / SUB) in a corresponding manner.
  • The current content of an accumulator 12 is fed from the output of this accumulator 12 to a second input of the adder / subtractor 7, as indicated at 12a in FIG. 3.
  • The result of this addition is again stored in the accumulator 12, cf. the output 7a of the adder 7; a multiplexer 13 is interposed, which is set by the program controller 2 via a control input 13b ("SELECT") such that the multiplexer 13 connects the adder output 7a to the corresponding input of the accumulator 12 (see the connection 13a between multiplexer 13 and accumulator 12).
  • The function of the accumulator 12 is initiated by the program controller 2 with a control input 12b ("OPERATION").
  • The multiply-accumulate command is usually repeated several times in a loop; as soon as the end result is present in the accumulator 12, in the present example it is stored again in the data memory 3, although the number format has to be adapted beforehand, since the width of the accumulator 12 is generally greater than the width of the data values A, B read from the data memory 3.
  • The multiplexer 13 is used to load the accumulator 12 with an initial value from the data memory 3 with its own instruction at the start of a loop. Usually the value "00" is used as this initial value.
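The multiply-accumulate loop described above can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 3: the names `mac_loop` and `to_signed` are hypothetical, and the wrap-around to 80 bits simply borrows the accumulator width that the description gives for the example of FIG. 5A.

```python
ACC_BITS = 80  # assumed accumulator width, taken from the example of FIG. 5A


def to_signed(raw, bits=ACC_BITS):
    """Interpret a raw bit pattern as a signed 2's complement number."""
    return raw - (1 << bits) if raw >> (bits - 1) else raw


def mac_loop(pairs, initial=0):
    """Multiply each operand pair (A, B) and add the product to the accumulator."""
    mask = (1 << ACC_BITS) - 1
    acc = initial & mask              # accumulator preloaded with the initial value
    for a, b in pairs:
        product = a * b               # role of the multiplier unit 6
        acc = (acc + product) & mask  # role of adder 7, stored back in accumulator 12
    return acc


raw = mac_loop([(3, 4), (-2, 5), (7, 1)])
print(to_signed(raw))  # 12 - 10 + 7 = 9
```

The accumulator is kept as a raw bit pattern so that the later format conversion can operate on bits; `to_signed` is only used to read the value back.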
  • As mentioned, the content of the accumulator 12 (output 12a) is thus transferred to the conversion unit 10 before being stored back in the data memory 3, for the purpose of number format conversion and preferably also for any rounding; the adaptation of the number format and the rounding performed there are described in more detail below with reference to FIG. 5.
  • The result of this is that the calculation result corresponds to the predetermined storage format, although a larger word width (number width, i.e. a larger number of bits per number) can be used for the calculation processes carried out in the arithmetic unit 4, for high accuracy of the calculation.
  • The corresponding control information is received by the conversion unit 10 from the format specification unit 11, preferably a register, which contains control data relating to the respectively defined format (FXD_FORMAT); this control information is loaded in advance, at the beginning of the program, during an initialization phase, in accordance with the storage format specifications, for example of the data memory 3. For example, at the beginning of the program, a value is read directly from the data memory 3, see output 3a in FIG. 3, and loaded into the specification unit 11 with the aid of a control signal 11b ("LOAD"). This word thus indicates the target format (DST: destination) which the result Y should have (cf. FIG.
  • The fields DST and SRC in register 11 can each be 8 bits long (cf. bit positions 0-7, a total of 0-15, in the specification unit 11 according to FIG. 4).
  • The format SRC in the specification unit 11 thus relates to the format of the number given at the output of the accumulator 12, the "source number", whereas the format DST specifies the target format of the data words for storage in the data memory 3.
  • Each field DST or SRC in register 11 contains the position of the decimal point in the form of an unsigned binary number, a value of "2" indicating, for example, that the number to be considered should have two decimal places, i.e. two digits to the right of the decimal point, so that the decimal point is shifted two digits from the rightmost digit to the left.
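A format register with two 8-bit fields can be modelled as a single 16-bit word. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and placing SRC in the upper byte and DST in the lower byte is a guess, since the text only states that each field is 8 bits long.

```python
def pack_format(src_frac, dst_frac):
    """Build a 16-bit format word from two 8-bit decimal-point positions.

    Assumed layout (not confirmed by the text): SRC in bits 8-15, DST in bits 0-7.
    """
    for field in (src_frac, dst_frac):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field must fit in 8 bits")
    return (src_frac << 8) | dst_frac


def unpack_format(reg):
    """Return (src_frac, dst_frac) from a 16-bit format word."""
    return (reg >> 8) & 0xFF, reg & 0xFF


def shift_amount(reg):
    """Difference of the decimal-point positions, SRC minus DST."""
    src_frac, dst_frac = unpack_format(reg)
    return src_frac - dst_frac


reg = pack_format(40, 16)  # the SRC and DST values used in the FIG. 5 example
print(hex(reg), unpack_format(reg), shift_amount(reg))
```

Loading the word once and deriving every later decision from it mirrors the idea that the register is written during initialization and then left alone.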
  • The conversion unit 10 supplies the result (Y; see also FIG. 2) at its actual output 10a according to FIG. 3, which result is stored directly in the data memory 3, serving as output storage means according to FIG. 3; in addition, the format conversion and rounding can also result in an overflow or underflow in the number adjustment (underflow: UFL; overflow: OFL), and corresponding status signals UFL and OFL are present at outputs 10b and 10c of the conversion unit 10; these two status signals UFL, OFL can preferably be fed to a status register 14 in order to be available for handling exceptional cases.
  • FIG. 5 is composed of FIGS. 5A and 5B, which are to be thought of as being joined along the dashed dividing lines in FIGS. 5A and 5B.
  • FIG. 5 also contains exemplary dimension details relating to the number of bits or the bit width of the individual data values to be processed, these dimension details corresponding entirely to common practical examples. For easier understanding, further explanations are given below on the basis of concrete but simplified numerical examples with lower bit counts, with particular reference to FIGS. 8 and 9; beforehand, 2's complement number representations are also explained with reference to FIGS. 6 and 7 with regard to "overflow" and "underflow".
  • The conversion unit 10, also called the ALIGN and ROUND unit (with regard to format adaptation and rounding), is supplied with the output value 12a of the accumulator 12, as can be seen in FIG. 3 as well as in FIG. 5.
  • The format of this output value at the output 12a of the accumulator 12 is subsequently to be adapted by the conversion unit 10 in accordance with the specification by the register 11 (generally called format specification unit 11) in such a way that the finally obtained data word (output 10a) is suitable for storage in the data memory 3 (or any other data storage, possibly with a different number format).
  • The conversion unit 10 is arranged directly in the data path (see data path 9 in FIG. 2) of the arithmetic unit 4, i.e. the operations carried out by the conversion unit 10 are preferably carried out in the same clock cycle as the arithmetic operations in the preceding arithmetic units 6, 7, with only a slight delay time occurring from stage to stage. If, however, extremely short clock cycles are specified and the circuit modules with which the individual components, in particular the conversion unit 10, are implemented cause a delay which is somewhat too great in comparison, intermediate storage can be provided within the conversion unit 10, and if appropriate also before and / or after the conversion unit 10, so as to carry out a first part of the operations in a first clock cycle and a second part in a second clock cycle.
  • The present conversion unit 10 also contains, as an integral hardware component, a rounding unit 15, which consists of individual logic modules and an adder, as will be explained in more detail below; furthermore, a so-called "saturation function" is integrated to prevent a change of sign in the event of a number overflow or underflow, cf. also the following explanations in connection with FIGS. 6 and 7.
  • The accumulator 12 has a width of 80 bits (cf. bit positions No. 0-79 in FIG. 5A), and a conversion into a number with a width of 32 bits is to take place in the conversion unit 10, which corresponds to the width of a data word in the data memory 3.
  • The format register 11 also contains a value of 40 in the SRC field (see FIG. 4) and a value of 16 in the DST field, which means that the 80-bit number from the accumulator 12 (the SRC number, i.e. the source number) has its decimal point to the right of bit No. 40, whereas the 32-bit target number (DST number) should have its decimal point to the right of bit No. 16 after the adaptation or conversion process.
  • The 80-bit number is expanded on both sides with the help of an expansion unit 16, namely on the right side, the LSB (least significant bit) side, by 32 bits, that is, by as many bits as the target word DST has, and these newly added 32 bits are all set to "0".
  • On the MSB (most significant bit) side, 32 bits, corresponding to the bit width of the target word, are likewise added for expansion, the value of these bits being selected in accordance with the value of the sign bit taken over from the accumulator 12, that is to say the bit at position "79".
  • This process is also referred to as "sign extend" (cf. also the bit field SIGN (SRC) of the expansion unit 16 in FIG. 5A).
  • This gives a total of 32 + 80 + 32 = 144 bits, i.e. bit No. 0 to bit No. 143, of which the bits at positions 32-111 form the original number at the output 12a of the accumulator 12.
  • The displacement unit 17, which can be formed, for example, by a multiplexer control block, receives the corresponding control information for this displacement from a control unit 17' that calculates the shift amount.
  • This control unit 17' calculates the shift amount from the values of the format specification register 11, which are present at its output 11a and are supplied to the control unit 17'.
  • The calculated shift amount results from the difference between the decimal point positions of the source format (SRC field in register 11) and the target format (DST field in register 11; see FIG. 4).
  • The control unit 17' can thus consist of a subtractor, which forms the difference between the two contents of the fields SRC and DST of the register 11, and it can also be integrated directly into the displacement unit 17 as a control stage.
  • In FIG. 5A, the bit chain thus obtained is illustrated schematically by a block 18; the dashed, oblique lines illustrate that the number originally from the accumulator 12 has now been shifted by a corresponding number of positions (namely to the right by 24 bits).
  • The bit positions released by the shift must be filled in with the correct sign, i.e. bits with the value of the sign bit of the source number (bit No. 79 in the accumulator 12) are used for filling.
  • The decimal point is then already in the right place, corresponding to that in the target number, and the target number can now be selected as a partial field from the total word, i.e. it can be taken from the bit chain 18.
  • The accuracy of the target number results from its width of 32 bits.
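The expansion and shift steps can be traced with plain integers. The following is a minimal sketch under stated assumptions: the function name and return layout are hypothetical, the field split (32 cut bits, 32 target bits, 80 sign bits) is inferred from the bit widths quoted in the text, and shift amounts are assumed not to exceed the 32-bit padding.

```python
def extend_and_shift(acc, src_frac, dst_frac, acc_bits=80, dst_bits=32):
    """Pad an accumulator bit pattern on both sides, then shift by SRC - DST.

    Returns (sign_field, dst_field, cut_field) as unsigned bit patterns.
    """
    total = dst_bits + acc_bits + dst_bits            # 32 + 80 + 32 = 144
    sign = (acc >> (acc_bits - 1)) & 1                # bit No. 79 of the source
    word = acc << dst_bits                            # zero padding on the LSB side
    if sign:                                          # "sign extend" on the MSB side
        word |= ((1 << dst_bits) - 1) << (dst_bits + acc_bits)
    signed = word - (1 << total) if sign else word    # view as a signed value
    shift = src_frac - dst_frac                       # role of control unit 17'
    signed = signed >> shift if shift >= 0 else signed << -shift
    word = signed & ((1 << total) - 1)                # back to a 144-bit pattern
    field_mask = (1 << dst_bits) - 1
    cut_field = word & field_mask                     # bits 0-31, cut-off places
    dst_field = (word >> dst_bits) & field_mask       # bits 32-63, target word
    sign_field = word >> (2 * dst_bits)               # bits 64-143, 80 sign bits
    return sign_field, dst_field, cut_field


acc = 3 << 39                                         # 1.5 with 40 fractional bits
sign_field, dst_field, cut_field = extend_and_shift(acc, 40, 16)
print(hex(dst_field))                                 # 1.5 with 16 fractional bits
```

Python's `>>` on a negative integer is an arithmetic shift, which is what fills the vacated positions with the sign bit here.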
  • the fields of the overall word are not changed, but only interpreted in the format of the target number. This can also be referred to as a "mask change", and this operation is illustrated by the arrow 18a in FIG. 5. The result of this is illustrated in FIG. 5 (more precisely: FIG.
  • A logic unit 20 is provided in order to recognize a possible overshoot or undershoot ("overflow" or "underflow") of the number range; via a connection 19b from the output of the subfield unit 19, it receives all 80 sign bits of the sign field 19SIGN and the sign bit of the target word in the target word field 19DST (bit at the point "31", indicated in the drawing with DST (32)).
  • If the number range is neither exceeded nor undershot, all sign bits are the same, either all equal to "0" or all equal to "1".
  • With an OR gate 21 it is now recognized whether all of the bit positions of the sign field have the value "0".
  • With an AND gate 22 it is recognized whether all bit positions of the sign field have the value "1".
  • A test block 23 determines whether the output signal (output 21a) of the OR gate 21 is not equal to "0" or the output signal 22a of the AND gate 22 is not equal to "1".
  • the test block 23 only has to determine whether there is an overshoot or an undershoot, and this determination is made with the aid of the sign bit of the source number, as it is contained in the accumulator 12, cf. also the connection 12s to the test block 23 in FIG. 5. If this sign bit (bit no.
  • The evaluation result of the test block 23 is also delivered via a connection 23a to a saturation unit 24, which is 33 bits wide, i.e. one bit more than the width of the target number, in order to be able to recognize a possible renewed overflow after a rounding addition, which is still to be described.
  • The saturation unit 24 sets the number supplied at 19a at its output 24a to the respective maximum end value in accordance with the test evaluation by test block 23 (output 23a relating to the UFL / OFL state).
  • In the event of an overflow (OFL), this is the largest positive number, i.e. in this case all bits with the exception of the sign bits (bits Nos. 31 and 32) are set to "1", whereas the sign bits at positions 31 and 32 are set to "0".
  • In the event of an underflow (UFL), the "largest" negative number, i.e. the negative number with the largest absolute value, is output at the output 24a. This means that all bits in this output number are set to the value "0", with the exception of the two sign bits No. 31 and No. 32, which are set to the value "1".
  • A corresponding undershoot signal UFL or overflow signal OFL is additionally emitted at the outputs 10b and 10c.
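The range check and the first saturation stage can be approximated in a few lines. This sketch makes several assumptions that are not taken verbatim from the text: the function name is hypothetical, an out-of-range value is flagged when the 81 examined bits are neither all "0" nor all "1", a positive source sign is mapped to OFL and a negative one to UFL, and the result is returned as a 32-bit pattern rather than the 33-bit internal word.

```python
def detect_and_saturate(sign_field, dst_field, src_sign, sign_bits=80, dst_bits=32):
    """Return (value, ofl, ufl) for a target word and its sign field."""
    dst_sign = (dst_field >> (dst_bits - 1)) & 1
    examined = (sign_field << 1) | dst_sign       # 80 sign bits plus target sign bit
    width = sign_bits + 1
    any_one = examined != 0                       # what the OR gate 21 reports
    all_one = examined == (1 << width) - 1        # what the AND gate 22 reports
    out_of_range = any_one and not all_one        # mixed bits: value does not fit
    ofl = out_of_range and src_sign == 0          # assumed: positive source overflows
    ufl = out_of_range and src_sign == 1          # assumed: negative source underflows
    if ofl:
        value = (1 << (dst_bits - 1)) - 1         # 0111...1, largest positive number
    elif ufl:
        value = 1 << (dst_bits - 1)               # 1000...0, most negative number
    else:
        value = dst_field                         # in range, passed through unchanged
    return value, ofl, ufl


print(detect_and_saturate(0, 0x18000, 0))         # in range
print(detect_and_saturate(1, 0, 0))               # too large for 32 bits
```

Checking the sign bits instead of comparing magnitudes keeps the test independent of the value itself, which is presumably why two simple gates are enough in hardware.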
  • Rounding up is only carried out if, in addition to a "1" bit at position No. 31, at least one further "1" bit occurs somewhere in the cut-off decimal places (here bit positions No. 0-31; a single such additional "1" bit is sufficient), or if only bit No. 31 has the value "1" and the LSB in the target word field 19DST also has the value "1".
  • Such rounding-up means that, with the aid of an adder 25, a "1" (generally: the smallest positive value) is added to the number obtained at the output of the saturation unit 24.
  • A logic unit 26 with an OR gate 27 and an AND gate 28 recognizes whether such a rounding (more precisely: rounding up) is actually to be carried out: the least significant bit (LSB) from the target word field 19DST (see connection 19c) and the cut bits (see connection 19d) are sent to the OR gate 27, whose output 27a, together with bit No. 31 of the cut low-order bits (see output 19e), is applied to the AND gate 28.
  • The aforementioned IEEE rounding provides for a rounding up, that is to say the addition of a "1" in the adder 25 (output "1" of AND gate 28, connection 28a), if any bit 19d or 19c is set to 1 and at the same time bit 19e (bit No. 31 of subfield unit 19) also has the value 1.
  • A further saturation unit 29 is connected to the output 25a of the adder 25, and this saturation unit 29 limits the output result (i.e. the target word) to the highest possible numerical value in the same way as previously described for the saturation unit 24.
  • This highest possible numerical value is output at the output 29a and stored in a register 30. If there is no overflow, the number obtained from the adder 25 is written directly into the register 30.
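The rounding decision and the second saturation stage can be written out gate by gate. This is a hedged sketch: both function names are hypothetical, and modelling the 33-bit word by duplicating the sign bit is an interpretation of the two sign bits No. 31 and No. 32 mentioned above.

```python
def round_up_needed(dst_lsb, cut_field, cut_bits=32):
    """Round-half-to-even decision from the target LSB and the cut-off bits."""
    half_bit = (cut_field >> (cut_bits - 1)) & 1            # top cut bit (19e)
    lower_bits = cut_field & ((1 << (cut_bits - 1)) - 1)    # remaining cut bits (19d)
    or_out = bool(lower_bits) or bool(dst_lsb)              # role of OR gate 27
    return bool(half_bit) and or_out                        # role of AND gate 28


def round_and_saturate(value, cut_field, dst_bits=32, cut_bits=32):
    """Add 1 if rounding up is required and clamp a resulting overflow."""
    sign = (value >> (dst_bits - 1)) & 1
    wide = value | (sign << dst_bits)                       # 33 bits, sign duplicated
    if round_up_needed(value & 1, cut_field, cut_bits):
        wide = (wide + 1) & ((1 << (dst_bits + 1)) - 1)     # role of adder 25
    upper = (wide >> dst_bits) & 1
    lower = (wide >> (dst_bits - 1)) & 1
    if upper != lower:                                      # sign bits disagree
        return (1 << (dst_bits - 1)) - 1                    # clamp to largest value
    return wide & ((1 << dst_bits) - 1)


print(round_and_saturate(2, 1 << 31))   # exact tie, even LSB: stays 2
print(round_and_saturate(3, 1 << 31))   # exact tie, odd LSB: becomes 4
```

The extra bit matters only when the largest positive value is rounded up; without it the sum would silently wrap to the most negative number.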
  • FIG. 6 shows, in a table, 4-bit binary numbers provided with a sign bit S, the value range in this example extending from -8 to +7.
  • The positive numbers are shown at P, the negative numbers at N.
  • If the sign bit S has the value "0", the number is positive (the number 0 is also counted among the positive numbers); if, on the other hand, the sign bit S is "1", the number is a negative number N.
  • When adding or subtracting, it can now happen that the number range limits are exceeded or undershot, cf. arrows 40 and 41 in FIG. 6.
  • The range P of the positive numbers can be exceeded ("overflow"), so that a negative number "arises", because the bit word "0111" (for the number +7) is followed in the binary number representation shown by the number "1000", which, however, is already the largest negative number (-8).
  • Conversely, if a negative number (large in terms of magnitude) is added to a negative number (see arrow 41 in FIG. 6), a positive number can arise (namely with a "0" at the position of the sign bit S), so that an underflow, i.e. an undershoot of the value range, results.
  • Fig. 7 illustrates signed 4-bit binary numbers (the sign again being in the 1st column of the bits) with integer parts I (I: integer) and two fractional bits F (F: fraction); the value range of these binary numbers extends from -2 to +1.75.
  • The numbers +0.75, +1.5 and +1.75 would be rounded up to +1, +2 and +2, respectively, if the fractional bits were cut away; however, no rounding up would be performed for the number +0.5.
  • The number 0.5 is rounded down, whereas 0.51 is already rounded up; likewise, the number 1.5 is rounded up, but not the number 2.5, whereas the number 3.5 is again rounded up, etc. In other words, a value lying exactly halfway between two results is rounded to the even one.
  • The 1st line in FIG. 8 shows an 8-bit source number SRC, which contains a 4-bit integer part and 4 fractional bits.
  • The leftmost bit of the integer part is the sign bit S.
  • The target number DST shown in the 8th line consists of 6 bits, the first three bits representing the integer part including the sign bit and the other three bits representing the fractional part.
  • The value of the source number SRC is +7.9375, which corresponds to the largest value that can be represented here.
  • The difference between the number of fractional bits of the source number SRC and that of the target number DST is to be calculated (which is accomplished with the control unit 17 according to FIG. 5), and this difference is “1” in the example of FIG. 8.
  • The bit chain is shifted to the right by one position, see line (3) in FIG. 8, and the value of the sign bit is padded on the left-hand side, i.e. specifically a “0” bit is added here.
  • In line (4) of FIG. 8, a new mask is placed over this chain, now with only six digits, corresponding to the number of bits of the target number DST. This mask is recognizable in FIG. 8 by a shorter block (compared to lines (1) to (3)).
  • The 6-bit number in the fourth line of FIG. 8 thereby becomes negative (a “1” bit in the leftmost position).
  • The nine bits to the left of it (including the sign bit of the target number) are now checked for equality, and since they are not all the same, an underflow / overflow condition is determined, cf. the logic unit 20 in FIG. 5.
  • The sign bit of the source number SRC is used to distinguish between the two cases; in the present case, this sign bit has the value “0”, so that an overflow (OFL) is determined. If the sign bit of the source number SRC had the value “1”, an underflow would be determined.
  • the saturation unit 24 FIG.
  • The target word DST now receives the highest positive value, as can be seen from the fifth line in FIG. 8, this value now being +3.875.
  • The rounding unit 15 recognizes the need for rounding up at R in FIG. 8, using the seven right-most bits for this. Accordingly, the target number DST is incremented by the value 0.125 (the smallest value representable with three fractional bits), this addition value being shown in the 6th line of FIG. 8, whereas the highest positive value obtained from the saturation unit 24 is shown in the 5th line.
  • The source number SRC is again an 8-bit number with a sign bit S and four fractional bits, the source number SRC shown having the largest negative value (in terms of magnitude), namely -4.000.
  • The target number should again have six bit positions; in accordance with this number of bits, the sign bit is extended with six “1” bits on the left-hand side (second line of the figure), and the bits on the right-hand side are filled with “0”. Then, again (see the 3rd line of FIG. 9), the chain is shifted by one position to the right, a “1” bit again being inserted on the left-hand side.
  • The adder 25 has no rounding value to add to the target number, i.e. the number remains the same at the output of the adder 25, cf. the 6th line in FIG. 9.
  • The further saturation unit 29 now recognizes no overflow or underflow (7th line in FIG. 9) and forwards the numerical value unchanged to the following register 30, cf. the 8th line in Fig. 9.
  • The configuration described in particular with reference to FIG. 5 can in practice preferably be implemented in combinatorial logic (in particular with AND and OR gates as well as with multiplexer chains for shifting etc.) without storage elements (registers) being provided in between. In this way, the format adjustments and any rounding operations can take place in the same clock cycle in which the arithmetic operations are carried out. If very short cycle times are to be realized, memory elements (registers) can also be provided between individual units, as already mentioned.
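
The number-range behaviour of Fig. 6 and the rounding rule of Fig. 7 can be tried out with a small software model. The following Python sketch is only an illustration under stated assumptions, not the circuit itself: the function names `wrap`, `saturate` and `round_half_even` are hypothetical, numbers are held as plain integers, and the mapping of the variables `lsb`, `sticky` and `guard` to the connections 19c, 19d and 19e is one reading of the description (in particular, 19d is taken to mean the cut bits below the topmost cut bit).

```python
def wrap(value, bits=4):
    """Two's-complement wrap-around of an adder without saturation (Fig. 6)."""
    value &= (1 << bits) - 1              # keep only the low `bits` bits
    if value & (1 << (bits - 1)):         # sign bit S set: negative number
        value -= 1 << bits
    return value


def saturate(value, bits=4):
    """Clamp to the representable range instead of wrapping around."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return max(lowest, min(highest, value))


def round_half_even(raw, cut):
    """Cut away the `cut` lowest bits of `raw`, rounding ties to even."""
    if cut <= 0:
        return raw
    kept = raw >> cut                                # retained bits
    lsb = kept & 1                                   # assumed: connection 19c
    guard = (raw >> (cut - 1)) & 1                   # assumed: topmost cut bit, 19e
    sticky = int(raw & ((1 << (cut - 1)) - 1) != 0)  # assumed: remaining cut bits, 19d
    round_up = (lsb | sticky) & guard                # OR gate, then AND gate
    return kept + round_up                           # adder adds the "1"


print(wrap(7 + 1), saturate(7 + 1))      # -8 7
print(wrap(-8 - 1), saturate(-8 - 1))    # 7 -8
for value in (0.5, 0.75, 1.5, 1.75, 2.5, 3.5):
    # two fractional bits, as in Fig. 7: scale by 4 to obtain the raw integer
    print(value, round_half_even(int(value * 4), cut=2))
```

With two fractional bits, the loop yields 0, 1, 2, 2, 2 and 4 for the six inputs, which matches the examples given for Fig. 7 (0.5 and 2.5 are not rounded up, 1.5 and 3.5 are).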

Abstract

The invention relates to a digital signal processing unit comprising: input storage means (3; 5); a computing unit (4) connected thereto, defining a data path (9) and containing at least one arithmetic unit (6) as well as a control input (2a) for specifying computing operations; and output storage means (8). A number format conversion unit (10) comprising a shift unit (17) is arranged in the data path (9) between the arithmetic unit (6; 7) and the output storage means (8). A number format specification unit (11) and a control unit (17'), connected to the latter and calculating the necessary shift operations on the basis of the number format specification, are associated with the number format conversion unit (10). The formatting operations are calculated automatically from the input format and output format information, and the corresponding instructions are applied to the shift unit (17).
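
As a rough illustration of how the shift can be derived from the two formats alone, the following Python sketch models the conversion in software. It is a simplified model under stated assumptions, not the claimed circuit: the function `convert` and its parameters are hypothetical, numbers are passed as signed integers scaled by the number of fractional bits, and the order of the stages (saturate, add the rounding bit, saturate again) simply mirrors the order in which the description mentions the saturation unit 24, the adder 25 and the saturation unit 29.

```python
def convert(raw, src_frac, dst_frac, dst_bits):
    """Convert the signed fixed-point integer `raw` (with `src_frac`
    fractional bits) to a `dst_bits`-wide word with `dst_frac` fractional bits.
    All names are illustrative assumptions."""
    shift = src_frac - dst_frac               # > 0: shift right, < 0: shift left
    lowest = -(1 << (dst_bits - 1))
    highest = (1 << (dst_bits - 1)) - 1

    if shift > 0:
        shifted = raw >> shift                # arithmetic shift repeats the sign bit
        guard = (raw >> (shift - 1)) & 1      # topmost cut bit
        sticky = int(raw & ((1 << (shift - 1)) - 1) != 0)  # remaining cut bits
        round_up = ((shifted & 1) | sticky) & guard         # ties go to even
    else:
        shifted = raw << -shift               # no bits are cut, nothing to round
        round_up = 0

    saturated = max(lowest, min(highest, shifted))   # first saturation stage
    rounded = saturated + round_up                   # adder
    return max(lowest, min(highest, rounded))        # second saturation stage


# Source: 8 bits with 4 fractional bits; target: 6 bits with 3 fractional bits.
print(convert(0b01111111, 4, 3, 6) / 8)   # +7.9375 overflows and saturates: 3.875
print(convert(-64, 4, 3, 6) / 8)          # -4.0 fits without rounding: -4.0
```

The two calls use the source values of the FIG. 8 and FIG. 9 examples from the description. In this model the first one saturates to +3.875 and the second one passes through unchanged as -4.0.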
EP04761027A 2003-09-08 2004-09-07 Unite de traitement de signaux numerique Withdrawn EP1665029A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AT0140603A AT413895B (de) 2003-09-08 2003-09-08 Digitale signalverarbeitungseinrichtung
PCT/AT2004/000305 WO2005024542A2 (fr) 2003-09-08 2004-09-07 Unite de traitement de signaux numerique

Publications (1)

Publication Number Publication Date
EP1665029A2 (fr) 2006-06-07

Family

ID=34229714

Family Applications (1)

Application Number Title Priority Date Filing Date
EP04761027A Withdrawn EP1665029A2 (fr) 2003-09-08 2004-09-07 Unite de traitement de signaux numerique

Country Status (5)

Country Link
US (1) US20070033152A1 (fr)
EP (1) EP1665029A2 (fr)
AT (1) AT413895B (fr)
CA (1) CA2537549A1 (fr)
WO (1) WO2005024542A2 (fr)

Also Published As

Publication number Publication date
US20070033152A1 (en) 2007-02-08
WO2005024542A2 (fr) 2005-03-17
CA2537549A1 (fr) 2005-03-17
AT413895B (de) 2006-07-15
ATA14062003A (de) 2005-10-15
WO2005024542A3 (fr) 2005-05-26

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20060302

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR

17Q First examination report despatched

Effective date: 20060714

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ON DEMAND MICROELECTRONICS AG

DAX Request for extension of the european patent (deleted)
GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 7/499 20060101AFI20071023BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20080326