CN103092571B - Single-instruction multiple-data arithmetic unit supporting multiple data types - Google Patents

Publication number
CN103092571B
Authority
CN
China
Prior art keywords
unit
data
operand
result
bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310009888.0A
Other languages
Chinese (zh)
Other versions
CN103092571A (en
Inventor
严晓浪
仇径
孟建熠
陈志坚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201310009888.0A priority Critical patent/CN103092571B/en
Publication of CN103092571A publication Critical patent/CN103092571A/en
Application granted granted Critical
Publication of CN103092571B publication Critical patent/CN103092571B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

A single-instruction multiple-data arithmetic unit supporting multiple data types comprises N atomic operation arrays. Each atomic operation array includes: an operand preparation unit, which operates on the input source operands according to the input operation type and data type information and outputs intermediate operands; an addition unit, which receives the intermediate operands, performs the addition, and outputs an addition result; a rounding unit, which rounds the addition result according to the input operation type and data type information and outputs a rounded result; a saturation unit, which saturates the addition result according to the input operation type and data type information and outputs a saturated result; and a result packing unit, which selects the output of the rounding unit or the saturation unit and packs the intermediate result into final data according to the data type information. The invention effectively supports multiple data widths and has good applicability.

Description

Single-instruction multiple-data arithmetic unit supporting multiple data types
Technical field
The present invention relates to multimedia arithmetic components, and in particular to an arithmetic unit.
Background art
Multimedia applications generally refer to operations such as capture, storage, conversion, transmission, encoding, and decoding of multimedia objects such as text, audio, still images, two-dimensional graphics, three-dimensional graphics, animation, and full-motion video. The most distinctive characteristics of multimedia signals are small data bit widths and high data throughput.
Single-instruction multiple-data (SIMD) technology uses one controller to control two or more parallel processing elements so that multiple data streams are computed simultaneously. Its parallelism lies in the fact that a single instruction makes several narrower data elements undergo the same operation in parallel, thereby achieving parallel operation.
Most existing arithmetic units can perform only one 32-bit operation, one 16-bit operation, or one 8-bit operation at a time. Although this is simple to implement, it handles the small bit widths and high data throughput of multimedia applications inefficiently. It is generally accepted that SIMD technology can greatly increase the parallelism of multimedia processing applications and improve the multimedia performance of the arithmetic unit, so a single-instruction multiple-data arithmetic unit needs to be designed.
Video, audio, and image data have different widths, and as multimedia technology develops, still more data widths can be expected to need support in the future. Most current SIMD arithmetic units support only one data width, either 8 bits or 16 bits, and therefore do not accelerate multimedia applications in a general way. A SIMD arithmetic unit with variable operand length is therefore needed.
Summary of the invention
To overcome the inability of existing arithmetic units to support multiple data widths and their resulting poor applicability, the present invention provides a single-instruction multiple-data arithmetic unit supporting multiple data types that effectively supports multiple data widths and has good applicability.
The technical solution adopted by the present invention to solve this technical problem is:
A single-instruction multiple-data arithmetic unit supporting multiple data types, the arithmetic unit comprising N atomic operation arrays, where N is any positive integer. Each atomic operation array uses one adder to perform arithmetic on data of multiple bit widths, and each atomic operation array comprises:
an operand preparation unit, which, according to the input operation type and data type information, applies inversion, sign-bit extension, bit-width extension, and carry extension to the input source operands and outputs intermediate operands;
an addition unit, which receives the intermediate operands from the operand preparation unit, performs the addition, and outputs an addition result;
a rounding unit, which rounds the addition result according to the input operation type and data type information and outputs a rounded result;
a saturation unit, which saturates the addition result according to the input operation type and data type information and outputs a saturated result;
a result packing unit, which, according to the operation type and data type information, selects the output of the rounding unit or the saturation unit and packs the intermediate result into final data according to the data type information.
Further, the arithmetic unit supports signed and unsigned operations and supports operations on different element widths, the element widths including word, half-word, and byte.
Preferably, in the operand preparation unit, the inversion operation inverts the source operands element by element according to the operation type. For addition, the first and second operands are left unchanged. For subtraction, the first operand is left unchanged and the second source operand is inverted. For an absolute-value operation, negative operands are inverted according to the sign bit of the data, and positive operands are left unchanged.
Further, in the operand preparation unit, the sign-bit extension operation extends the first and second source operands element by element. For signed numbers, one sign bit is added above the most significant bit of each element; for unsigned numbers, a 0 is added above the most significant bit of each element.
Further, in the operand preparation unit, the bit-width extension operation doubles the bit width of the first or second source operand element by element, according to the operation type and data type information.
In the operand preparation unit, one carry-extension bit is added at the least significant end of each element of the first and second operands. For subtraction, the carry-extension bit of every element of the first and second source operands is set to 1, which together with the inversion operation forms the two's complement of the subtrahend. For addition, the carry-extension bit of every element of the first and second source operands is set to 0.
The addition unit contains only one adder. According to the operand type, it adds one extended-precision bit to each element of the addition operands to record the carry out of that element's most significant bit, producing an extended intermediate value in which the carries of the individual elements are isolated from one another. The carry-extension bits are discarded at the end of the operation to obtain the addition result.
The rounding unit contains only one adder, which performs rounding on data of different bit widths.
In the saturation unit, according to the data type information, the addition-unit result and the sign-bit extension result are used to saturate the addition result element by element: on overflow the maximum value is taken, on underflow the minimum value is taken, and otherwise the result is left unchanged.
In the result packing unit, the result of the rounding unit or the saturation unit is selected according to the instruction type information and packed into the final result according to the data type information.
The main beneficial effect of the present invention is that it effectively supports multiple data widths and has good applicability.
Brief description of the drawings
Fig. 1 is a block diagram of the single-instruction multiple-data (SIMD) arithmetic unit.
Fig. 2 is a block diagram of the atomic operation unit.
Fig. 3 is a block diagram of the operand preparation unit.
Fig. 4 is a schematic diagram of 8-bit operand extension.
Fig. 5 is a schematic diagram of 16-bit operand extension.
Fig. 6 is a schematic diagram of 32-bit operand extension.
Fig. 7 is a schematic diagram of addition operand preparation.
Fig. 8 is a flow diagram of the rounding unit.
Fig. 9 is a flow diagram of the saturation unit.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings.
With reference to Figs. 1 to 9, a single-instruction multiple-data arithmetic unit supporting multiple data types comprises N atomic operation arrays (11), where N is any positive integer. Each atomic operation array uses one adder to perform arithmetic on data of multiple bit widths, and each atomic operation array (11) comprises:
an operand preparation unit (21), which, according to the input operation type and data type information, applies inversion, sign-bit extension, bit-width extension, and carry extension to the input source operands and outputs intermediate operands;
an addition unit (22), which receives the intermediate operands from the operand preparation unit, performs the addition, and outputs an addition result;
a rounding unit (23), which rounds the addition result according to the input operation type and data type information and outputs a rounded result;
a saturation unit (24), which saturates the addition result according to the input operation type and data type information and outputs a saturated result;
a result packing unit (25), which, according to the operation type and data type information, selects the output of the rounding unit or the saturation unit and packs the intermediate result into final data according to the data type information.
Typically, two 32-bit operands that require arithmetic are first fed to the operand preparation unit, which produces the corresponding 40-bit intermediate operands. The addition unit then computes the result, the rounding unit or the saturation unit is selected according to the instruction type to perform the corresponding operation and generate a 40-bit result, and finally the result packing unit packs the 40-bit result into the final result.
The arithmetic unit supports signed and unsigned operations and operations on different element widths; the typical supported element widths are word, half-word, and byte.
In the operand preparation unit (21), the inversion operation inverts the source operands element by element according to the operation type. For addition, the first and second operands are left unchanged. For subtraction, the first operand is left unchanged and the second source operand is inverted. For an absolute-value operation, negative operands are inverted according to the sign bit of the data, and positive operands are left unchanged.
In the operand preparation unit (21), the sign-bit extension operation extends the first and second source operands element by element. For signed numbers, one sign bit is added above the most significant bit of each element; for unsigned numbers, a 0 is added above the most significant bit of each element.
In the operand preparation unit (21), the bit-width extension operation doubles the bit width of the first or second source operand element by element, according to the operation type and data type information.
In the operand preparation unit (21), one carry-extension bit is added at the least significant end of each element of the first and second operands. For subtraction, the carry-extension bit of every element of the first and second source operands is set to 1, which together with the inversion operation forms the two's complement of the subtrahend. For addition, the carry-extension bit of every element of the first and second source operands is set to 0.
The addition unit (22) contains only one adder, which performs operations on data of different bit widths. Typically, a single 44-bit adder performs one 32-bit, two 16-bit, or four 8-bit arithmetic operations. According to the operand type, it adds one extended-precision bit to each element of the addition operands to record the carry out of that element's most significant bit, producing a 44-bit extended intermediate value in which the carries of the individual elements are isolated from one another. The carry-extension bits are discarded at the end of the operation to obtain a 40-bit addition result.
The rounding unit (23) contains only one adder, which performs rounding on data of different bit widths; typically it is a 32-bit adder that rounds one 32-bit, two 16-bit, or four 8-bit values.
The saturation unit (24), according to the data type information, uses the addition-unit result and the sign-bit extension result to saturate the addition result element by element: on overflow the maximum value is taken, on underflow the minimum value is taken, and otherwise the result is left unchanged.
The result packing unit (25) selects the result of the rounding unit or the saturation unit according to the instruction type information and packs it into the final result according to the data type information. Typically, from the 40-bit result it selects one 32-bit, two 16-bit, or four 8-bit results and packs them into the final 32-bit data.
To process data of different lengths with high parallelism and high data throughput, this embodiment provides a single-instruction multiple-data arithmetic unit, based on atomic operation arrays, that supports multiple data types.
Each atomic operation array comprises five main units: the operand preparation unit (21), the addition unit (22), the rounding unit (23), the saturation unit (24), and the result packing unit (25). The computations and results of the atomic operation arrays do not interfere with or affect one another. Simply replicating the atomic operation array extends the operand width, so that arithmetic can be performed on 32-bit, 64-bit, 96-bit, ... data to implement wider SIMD operations.
The operand preparation unit packs data in units of 32 bits and can be subdivided into an inversion unit and an extension unit.
The inversion unit (31) inverts the data passing through it, element by element, where the instruction type requires. For an absolute-value operation, a positive source operand keeps its original value and a negative source operand is inverted. For subtraction, and for comparison operations implemented through subtraction, the inversion unit leaves the first source operand unchanged and inverts the second source operand in preparation for forming the two's complement of the subtrahend in the next step: the inverted second source operand only needs to be incremented by one in the subsequent operation to yield the two's complement.
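As an illustration only, the inversion step can be modelled per element in Python. The function names and the operation labels 'add', 'sub', and 'abs' are hypothetical and do not come from the patent; the sketch assumes 8-bit elements held as unsigned bit patterns.

```python
def invert_element(value, width):
    """Bitwise NOT of one element, kept to `width` bits."""
    return ~value & ((1 << width) - 1)


def prepare_inversion(op, a, b, width=8):
    """Return (a', b') for one element pair.

    op is a hypothetical label: 'add', 'sub', or 'abs'.
    Elements are unsigned bit patterns of `width` bits.
    """
    sign = 1 << (width - 1)
    if op == "sub":
        # Keep the minuend, invert the subtrahend (the +1 comes later).
        return a, invert_element(b, width)
    if op == "abs":
        # Invert only when the sign bit says the element is negative.
        return (invert_element(a, width) if a & sign else a), b
    # Addition: both operands pass through unchanged.
    return a, b
```

The inverted value is not yet the two's complement; the missing increment is supplied later in the datapath, as the text describes.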
The extension unit meets the needs of the instruction by adding bit-width extension bits (32), sign-extension bits (33), and carry-extension bits (34). This lets the arithmetic unit perform four 8-bit additions, two 16-bit additions, or one 32-bit addition with a single adder operating on 40-bit extended operands.
The arithmetic unit supports arithmetic on operands of different bit widths. When the element bit widths of the two operands differ, the shorter operand must undergo bit-width extension (32) so that every element of both operands has the same width before the subsequent computation. For unsigned numbers, the extension is filled with zeros (43) (53); for signed numbers, it is filled with copies of the sign bit (44) (54), so that the two operands have the same bit width.
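A minimal sketch of the bit-width extension, assuming an element is doubled in width and held as an unsigned bit pattern. The function name `widen` is hypothetical.

```python
def widen(value, width, signed):
    """Extend a `width`-bit element to 2 * `width` bits.

    Unsigned elements are zero-filled; signed elements are filled
    with copies of the sign bit.
    """
    mask = (1 << width) - 1
    value &= mask
    if signed and value >> (width - 1):
        value |= mask << width  # fill the upper half with ones
    return value
```

For example, the 8-bit pattern 0x80 becomes 0xFF80 when treated as signed and 0x0080 when treated as unsigned.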
To preserve precision and prevent carry information from being lost, the operands undergo sign-bit extension (33). For an unsigned operand, a zero (41) (51) (61) is added above the most significant bit of each element; for a signed operand, one sign bit (42) (52) (62) is added above the most significant bit of each element.
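The one-bit extension described here can be sketched as follows. This is an illustrative model with a hypothetical function name, not the patented circuit.

```python
def sign_extend_element(value, width, signed):
    """Add one bit above the MSB of a `width`-bit element.

    The new bit is a copy of the sign bit for signed data and 0 for
    unsigned data, giving a (width + 1)-bit value.
    """
    mask = (1 << width) - 1
    value &= mask
    top = (value >> (width - 1)) & 1 if signed else 0
    return (top << width) | value
```

With the extra bit, the sum of two unsigned 8-bit elements always fits: 0xFF + 0xFF gives 0x1FE, which needs nine bits.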
To keep carries between adjacent elements from affecting one another, a carry-extension bit (34) is also provided: one bit is appended below the least significant bit of each element of the operand. For addition, this bit is uniformly set to 0 (45) (55) (63). For subtraction, it is uniformly set to 1 (46) (56) (64), so that when the minuend and subtrahend pass through the adder, the two extension bits generate a carry of 1 into the element. This adds one to the subtrahend and, together with the inversion unit described above, forms its two's complement.
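The carry-extension trick can be checked with a small model of one signed 8-bit element. The helper names are hypothetical, and the layout (sign-extension bit, data bits, carry-extension bit, from high to low) is an assumption drawn from the description rather than from the figures.

```python
def extend(value, width, signed, carry_bit):
    """Build [extension bit | data | carry-extension bit] for one element."""
    mask = (1 << width) - 1
    value &= mask
    top = (value >> (width - 1)) & 1 if signed else 0
    return (((top << width) | value) << 1) | carry_bit


def to_signed(value, bits):
    """Interpret a `bits`-wide pattern as a two's-complement integer."""
    return value - (1 << bits) if value >> (bits - 1) else value


def sub_element(a, b, width=8):
    """Compute a - b for one signed element using a single addition.

    b is inverted, and both carry-extension bits are 1, so the adder
    sees a + ~b + 1. The result is a (width + 1)-bit pattern.
    """
    inv_b = ~b & ((1 << width) - 1)
    x = extend(a, width, True, 1)
    y = extend(inv_b, width, True, 1)
    total = (x + y) & ((1 << (width + 2)) - 1)
    return total >> 1  # discard the carry-extension position
```

The two low bits sum to binary 10, which is exactly the carry of 1 that completes the two's complement. For instance, `to_signed(sub_element(3, 5), 9)` gives -2.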
The result of extending four 8-bit operands is shown in Fig. 4, the result of extending two 16-bit operands in Fig. 5, and the result of extending one 32-bit operand in Fig. 6.
The addition unit mainly comprises a 44-bit adder that performs the addition of the operands. The adder obtains the operands from the operand preparation unit and adds the two operands. To preserve precision, one extended-precision bit, set to zero, is added to each element of the addition operands to record the carry out of that element's most significant bit, so the adder generates a 44-bit intermediate result. For four 8-bit additions, the extended-precision bits are added at bits 10, 21, 32, and 43 of the operand (71); for two 16-bit additions, at bits 20, 21, 42, and 43 (72); for one 32-bit addition, at bits 40, 41, 42, and 43 (73).
Because the carry-extension bits serve no further purpose once the addition is complete, they can be discarded, so each addition unit ultimately produces a 40-bit result.
Because of the extended-precision and sign-extension bits, the result of four 8-bit additions is stored as four 10-bit intermediate results, the result of two 16-bit additions as two 20-bit intermediate results, and the result of one 32-bit addition as one 40-bit intermediate result. The adder can thus perform four 8-bit additions, two 16-bit additions, or one 32-bit addition without the results affecting one another, while retaining the exact result of the operation.
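The four 8-bit case can be sketched as one wide integer addition. This is a model under assumptions: each lane is laid out as carry-extension bit, eight data bits, sign-extension bit, then one zero extended-precision bit, which puts the extended-precision bits at positions 10, 21, 32, and 43 as stated above. The names `pack44` and `simd_add` are hypothetical.

```python
LANES, WIDTH = 4, 8
EXT = WIDTH + 2      # carry-extension bit + data + sign-extension bit
STRIDE = EXT + 1     # plus one zero extended-precision bit per lane


def pack44(elements, signed, carry_bit):
    """Pack four 8-bit elements into one 44-bit operand."""
    word = 0
    for i, v in enumerate(elements):
        v &= (1 << WIDTH) - 1
        top = (v >> (WIDTH - 1)) & 1 if signed else 0
        ext = (((top << WIDTH) | v) << 1) | carry_bit
        word |= ext << (STRIDE * i)
    return word


def simd_add(a_elems, b_elems, signed=False):
    """Add four lanes with a single wide addition.

    Returns four 10-bit lane results (carry-extension bit dropped).
    """
    total = pack44(a_elems, signed, 0) + pack44(b_elems, signed, 0)
    out = []
    for i in range(LANES):
        lane = (total >> (STRIDE * i)) & ((1 << STRIDE) - 1)
        out.append(lane >> 1)
    return out
```

Because each lane sum fits in its 11-bit slot, no carry crosses into the neighbouring lane. For signed data, the low nine bits of each lane result hold the two's-complement sum.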
Besides the adder, the addition unit also has a truncation result processing unit and a comparison result processing unit. For operations that take the high half or the low half of a result, the truncation result processing unit truncates the result as the instruction requires. For compare instructions, the comparison result processing unit sets or clears the designated register according to the sign of each element's result from the addition unit.
To better support fractional arithmetic, the arithmetic unit also supports taking the high half of a result. Both the take-high-half operation and the averaging operation can produce a fractional part, which requires rounding. The rounding unit rounds the operand as the instruction requires, using a single incrementer.
For the take-high-half operation (82), the rounding unit extracts the high half of the result (8 or 16 bits), according to the bit width of the element (16 or 32 bits), together with its rounding bit (bit 7 for a 16-bit operation, bit 15 for a 32-bit operation). It concatenates them, increments the value by 1, and then discards the last bit to obtain the final result.
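A small sketch of the take-high-half rounding, assuming the element is an unsigned bit pattern. The function name is hypothetical, and overflow of the incremented high half is left unhandled here.

```python
def high_half_rounded(value, width=16):
    """Take the high half of a `width`-bit element, rounded to nearest.

    The high half and the bit just below it (the rounding bit) are
    concatenated, incremented by 1, and the last bit is discarded.
    """
    half = width // 2
    high = (value >> half) & ((1 << half) - 1)
    rounding_bit = (value >> (half - 1)) & 1
    merged = (high << 1) | rounding_bit
    return (merged + 1) >> 1
```

For a 16-bit element, 0x1280 rounds up to 0x13 because bit 7 is set, while 0x127F stays at 0x12.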
For the averaging operation (83), the adder first produces the sum of the two operands. According to the bit width of the element (8, 16, or 32 bits), the rounding unit extracts the data part of the result and its rounding bit, concatenates them, increments the value by 1, and discards the last bit to obtain the final result. For 8-bit operations, bits 1 to 8, 11 to 18, 21 to 28, and 31 to 38 of the result are the data parts, and the corresponding rounding bits are bits 0, 10, 20, and 30. For 16-bit operations, bits 1 to 16 and 21 to 36 are the data parts, with rounding bits 0 and 20. For 32-bit operations, bits 1 to 32 are the data part, with rounding bit 0.
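For a single unsigned element, the concatenate, increment, and discard sequence reduces to (a + b + 1) >> 1. The sketch below follows the steps literally; the function name is hypothetical and only unsigned inputs are assumed.

```python
def rounded_average(a, b, width=8):
    """Rounded mean of two unsigned `width`-bit elements."""
    total = a + b                                # (width + 1)-bit sum
    data = (total >> 1) & ((1 << width) - 1)     # bits 1..width
    rounding_bit = total & 1                     # bit 0
    merged = (data << 1) | rounding_bit          # concatenate
    return (merged + 1) >> 1                     # increment, drop last bit
```

Halves round upward in this model, so the mean of 1 and 2 comes out as 2.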
Besides rounding, the rounding unit also implements the absolute-value operation. For the absolute-value instruction (81), a positive operand keeps its original value, and a negative operand is inverted and incremented by one. To save area and resources, this operation is also carried out by the rounding unit. Element by element, if the operand is positive, the operand is prepared as the data part and the rounding bit is prepared as 0; if the operand is negative, the inverted operand is prepared as the data part and the rounding bit is prepared as 1. According to the element length (8, 16, or 32 bits), the rounding unit takes the prepared data part and its rounding bit, concatenates them, increments the value by 1, and discards the last bit to obtain the final result, which is the absolute value. Because the negative range is one larger than the positive range, the most negative value keeps its original value.
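The reuse of the rounding path for absolute value can be sketched like this. The function name is hypothetical, and the element is assumed to be a two's-complement bit pattern of the given width.

```python
def abs_via_increment(value, width=8):
    """Absolute value of a two's-complement element via the rounding path.

    Negative input: data = ~value, rounding bit = 1, so the increment
    step supplies the +1 of the two's complement.
    Positive input: data = value, rounding bit = 0, value is unchanged.
    """
    mask = (1 << width) - 1
    value &= mask
    negative = (value >> (width - 1)) & 1
    data = (~value & mask) if negative else value
    rounding_bit = 1 if negative else 0
    merged = (data << 1) | rounding_bit
    return ((merged + 1) >> 1) & mask
```

In this model the most negative 8-bit pattern, 0x80, maps back to 0x80, which matches the remark that it keeps its original value.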
The saturation unit takes the result from the addition unit and saturates it as the instruction requires. Depending on the instruction operands, the saturation unit mainly performs positive saturation, negative saturation, and unsigned saturation.
The saturation unit decides whether saturation is needed from the sign-extension bit and extended-precision bit of the data together with the instruction type, element by element. For a non-truncating instruction that requires saturation: if the operand is unsigned, the result must be saturated whenever the sign-extension bit or the extended-precision bit is 1, and saturation yields the unsigned maximum; if the operand is positive, the result must be saturated whenever either bit is 1, and saturation yields the positive maximum; if the operand is negative, the result must be saturated whenever either bit is 0, and saturation yields the negative minimum. For instructions that keep the low bits, deciding whether the operation saturates requires examining not only the sign-extension and extended-precision bits but also the truncated data. When the operand is unsigned or a signed positive number, the result must be saturated whenever any of the truncated data, the sign-extension bit, or the extended-precision bit is 1; when the operand is negative, the result must be saturated whenever any of them is 0. The saturation values are the same as for non-truncating instructions.
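As a rough illustration of saturating one element, the sketch below uses a conventional overflow test on the (width + 1)-bit sum of extended operands. It is a simplified stand-in and not the exact bit rule of the patent: it covers addition only, ignores the truncating case, and the function name is hypothetical.

```python
def saturate(result, width=8, signed=True):
    """Clamp a (width + 1)-bit sum to a `width`-bit element.

    Unsigned: a set extension bit means overflow, so return the maximum.
    Signed: overflow occurs when the extension bit and the element's
    MSB disagree; the extension bit then gives the true sign.
    """
    result &= (1 << (width + 1)) - 1
    ext = (result >> width) & 1
    if not signed:
        return (1 << width) - 1 if ext else result
    msb = (result >> (width - 1)) & 1
    if ext == msb:
        return result & ((1 << width) - 1)   # in range, keep as is
    if ext == 0:
        return (1 << (width - 1)) - 1        # overflow: largest positive
    return 1 << (width - 1)                  # underflow: most negative
```

For example, 100 + 100 as signed 8-bit values clamps to 0x7F, and (-100) + (-100) clamps to 0x80.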
The result packing unit obtains data from the rounding unit or the saturation unit according to the instruction type. Absolute-value, averaging, and truncation operations take their data from the rounding unit; (saturating) addition, (saturating) subtraction, comparison, and maximum/minimum selection take their data from the saturation unit. Both units pass 40-bit data to the result packing unit. When the data bit width is 8, bits 30 to 37, 20 to 27, 10 to 17, and 0 to 7 are selected and packed into 32-bit data; when it is 16, bits 20 to 35 and 0 to 15; when it is 32, bits 0 to 31. This 32-bit value is the final result of the operation.
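The bit selection listed above amounts to a field-extraction table. A minimal sketch, with hypothetical names `FIELDS` and `pack_result`, using the bit ranges given in the text:

```python
# (low bit, high bit) of each lane inside the 40-bit word, lowest lane first.
FIELDS = {
    8: [(0, 7), (10, 17), (20, 27), (30, 37)],
    16: [(0, 15), (20, 35)],
    32: [(0, 31)],
}


def pack_result(word40, width):
    """Pick the data fields out of a 40-bit result and pack them into 32 bits."""
    out = 0
    for lane, (lo, hi) in enumerate(FIELDS[width]):
        field = (word40 >> lo) & ((1 << (hi - lo + 1)) - 1)
        out |= field << (lane * width)
    return out
```

Bits outside the listed ranges, such as the leftover extension bits, are simply dropped.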

Claims (8)

1. A single-instruction multiple-data (SIMD) arithmetic unit supporting multiple data types, the arithmetic unit comprising N atomic operation arrays, N being any positive integer, wherein each atomic operation array uses only two adders to perform parallel arithmetic on data of multiple bit widths, and each atomic operation array comprises:
an operand preparation unit, which, according to the input operation type and data type information, applies inversion, sign-bit extension, bit-width extension, and carry extension to the input source operands and outputs intermediate operands;
an addition unit, which receives the intermediate operands from the operand preparation unit, uses only one adder to perform the parallel addition, and outputs an addition result;
a rounding unit, which, according to the input operation type and data type information, uses only one adder to round the addition result in parallel and outputs a rounded result;
a saturation unit, which saturates the addition result according to the input operation type and data type information and outputs a saturated result;
a result packing unit, which, according to the operation type and data type information, selects the output of the rounding unit or the saturation unit and packs the intermediate result into final data according to the data type information;
typically, the 32-bit operands requiring arithmetic are first fed to the operand preparation unit, which produces the corresponding 40-bit intermediate operands; the addition unit then computes the result; the rounding unit or the saturation unit is selected according to the instruction type to perform the corresponding operation and generate a 40-bit result; and finally the result packing unit packs the 40-bit result into the final result; each said 32-bit operand may be one 32-bit operand, two 16-bit operands, or four 8-bit operands; the final result may be one 32-bit result, two 16-bit results, or four 8-bit results.
2. single-instruction multiple-data arithmetical unit as claimed in claim 1, it is characterized in that: arithmetical unit supports signed number and unsigned number computing, support the computing of different element width, the typical element supported is A word, B half-word, C byte, wherein A, B, C are positive integer, and meet relation A=B*2, B=C*2。
3. single-instruction multiple-data arithmetical unit as claimed in claim 1, it is characterised in that: source operand, in units of element, is carried out inversion operation according to action type by the operand preparatory unit inversion operation in arithmetical unit;If additive operation, keep first operand and second operand constant;If subtraction, keep first operand constant, the second source operand is negated;If negative operand is then negated according to the sign bit of data by signed magnitude arithmetic(al), align operand and remain unchanged。
4. The single-instruction multiple-data arithmetic unit as claimed in claim 1, characterized in that: in the operand preparation unit, the sign-extension operation sign-extends the first source operand and the second source operand element by element; for signed numbers, one sign bit is added above the most significant bit of each element, and for unsigned numbers, a 0 is added above the most significant bit of each element;
When the element bit widths of the operands are not the same, the bit-width extension operation doubles the bit width of the first source operand or the second source operand, element by element, according to the operation type and data type information;
A carry extension bit is added below the least significant bit of each element of the first operand and the second operand; for subtraction, the carry extension operation sets this bit to 1 for every element of both the first source operand and the second source operand, which, together with the inversion performed during operand preparation, forms the two's complement of the subtrahend; for addition, the carry extension operation sets this bit to 0 for every element of both source operands.
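A minimal sketch of the sign-extension and carry-extension steps of claim 4, assuming each element is widened into a contiguous field of `width + 2` bits laid out as [extension bit | element | carry extension bit]. The field layout and the name `prepare_operand` are assumptions made for illustration; the bit-width extension step and the inversion of claim 3 are not modeled here.

```python
def prepare_operand(word, width, signed, subtract):
    """Widen a packed 32-bit word into the intermediate operand.

    Each element of `width` bits becomes a field of width + 2 bits:
    one sign/zero extension bit on top, one carry extension bit below.
    Four 8-bit elements therefore fill exactly 40 bits.
    """
    field_bits = width + 2
    carry_bit = 1 if subtract else 0              # 1 for subtraction, 0 for addition
    out = 0
    for i in range(32 // width):
        elem = (word >> (i * width)) & ((1 << width) - 1)
        # Signed data repeats the sign bit; unsigned data is padded with 0.
        top = (elem >> (width - 1)) & 1 if signed else 0
        field = (top << (width + 1)) | (elem << 1) | carry_bit
        out |= field << (i * field_bits)
    return out
```

With this layout, two 16-bit elements occupy 36 bits and one 32-bit element occupies 34 bits, so all three data types fit in the 40-bit intermediate operand mentioned in claim 1.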
5. The single-instruction multiple-data arithmetic unit as claimed in claim 1, characterized in that: the addition unit comprises only one adder, which implements data-parallel operations on different bit widths; typically, one 40-bit adder implements one 32-bit, two 16-bit, or four 8-bit arithmetic operations; according to the operand type, one extended-precision bit is added to each element of the addition operands to record the leading-bit carry result of the addition, thereby isolating carries between data elements; the carry extension bits are finally discarded to obtain the initial operation result.
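The carry isolation described in claim 5 can be modeled with a single wide integer addition. The sketch below is a software model under stated assumptions, not the patented hardware: the field layout [extension bit | element | carry extension bit], the helper names `widen` and `simd_add_sub`, and the choice to invert the subtrahend after extension are all hypothetical.

```python
def widen(word, width, signed, carry_bit, invert):
    """Expand each element into a (width + 2)-bit field."""
    out = 0
    for i in range(32 // width):
        elem = (word >> (i * width)) & ((1 << width) - 1)
        top = (elem >> (width - 1)) & 1 if signed else 0
        ext = (top << width) | elem               # element plus extension bit
        if invert:                                # assumed: invert after extension
            ext ^= (1 << (width + 1)) - 1
        out |= ((ext << 1) | carry_bit) << (i * (width + 2))
    return out


def simd_add_sub(a, b, width, signed=True, subtract=False):
    """Return per-lane (width + 1)-bit raw results, lane 0 first."""
    field_bits = width + 2
    lanes = 32 // width
    c = 1 if subtract else 0
    # One addition covers all lanes. The carry extension bit of each lane
    # absorbs the carry out of the lane below, and for subtraction the two
    # 1 bits always generate the +1 of the two's complement.
    wide = widen(a, width, signed, c, False) + widen(b, width, signed, c, subtract)
    wide &= (1 << (lanes * field_bits)) - 1
    # Drop the carry extension bit of every field.
    return [(wide >> (i * field_bits + 1)) & ((1 << (width + 1)) - 1)
            for i in range(lanes)]
```

For instance, a signed byte-wise addition of `0x7F01FF80` and `0x01010180` yields the 9-bit lane results `0x100`, `0x000`, `0x002`, `0x080` (lane 0 first), that is -256, 0, 2 and 128, with no carry leaking between lanes.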
6. The single-instruction multiple-data arithmetic unit as claimed in claim 1, characterized in that: the rounding unit uses only one adder to round data of different bit widths in parallel; typically, one 32-bit adder implements the rounding of one 32-bit, two 16-bit, or four 8-bit data items.
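The claim does not state the rounding rule or how the single adder keeps lanes apart, so the following is only a sketch under two assumptions: rounding means the rounding-halving form `(sum + 1) >> 1` applied to each (width + 1)-bit raw sum, and lane isolation uses the well-known SWAR masking trick (add with the lane MSBs cleared, then restore them with XOR), which is a stand-in and not necessarily the patented circuit. The names `lane_add` and `round_halving` are hypothetical.

```python
def lane_add(x, y, width):
    """Add packed lanes modulo 2**width using one 32-bit addition."""
    high = 0
    for i in range(32 // width):
        high |= 1 << (width * (i + 1) - 1)        # MSB of every lane
    low_mask = 0xFFFFFFFF ^ high
    partial = (x & low_mask) + (y & low_mask)     # carries stop at each lane's MSB
    return partial ^ ((x ^ y) & high)             # fold the MSBs back in, carry-free


def round_halving(raw_sums, width):
    """Round-halve per-lane (width + 1)-bit sums; return a packed 32-bit word."""
    halves = 0
    incs = 0
    for i, s in enumerate(raw_sums):
        halves |= (s >> 1) << (i * width)         # truncated half of each sum
        incs |= (s & 1) << (i * width)            # rounding increment per lane
    return lane_add(halves, incs, width)
```

The identity `(s >> 1) + (s & 1) == (s + 1) >> 1` lets the rounding increment be applied after the shift, so the adder only needs to be as wide as the packed 32-bit result.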
7. The single-instruction multiple-data arithmetic unit as claimed in claim 1, characterized in that: the saturation unit, according to the data type information, uses the result of the addition unit and the sign-extension result to saturate the addition result element by element; on overflow the maximum value is taken, on underflow the minimum value is taken, and in all other cases the operation result remains unchanged.
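A minimal per-element sketch of the saturation rule in claim 7, assuming the raw addition result of each element is `width + 1` bits wide, with the extra top bit being the extension bit added during operand preparation. The function name `saturate` and its parameters are hypothetical; a hardware unit would apply the same test to all elements in parallel.

```python
def saturate(raw, width, signed, subtract=False):
    """Clamp one (width + 1)-bit raw result to a width-bit value.

    raw      : raw element result including the extension bit
    signed   : True for signed data, False for unsigned data
    subtract : True if the raw result came from a subtraction
    """
    mask = (1 << width) - 1
    ext = (raw >> width) & 1                      # extension bit of the result
    msb = (raw >> (width - 1)) & 1                # sign bit of the narrow result
    if signed:
        if ext != msb:                            # result does not fit in width bits
            if ext == 0:
                return (1 << (width - 1)) - 1     # overflow: largest positive value
            return 1 << (width - 1)               # underflow: most negative value
        return raw & mask
    if ext:
        # Unsigned: an add overflowed past the maximum, or a subtract went below 0.
        return 0 if subtract else mask
    return raw & mask
```

With 8-bit signed elements, a raw result of `0x080` (128) saturates to `0x7F`, a raw result of `0x100` (-256) saturates to `0x80`, and `0x1FF` (-1) passes through as `0xFF`.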
8. The single-instruction multiple-data arithmetic unit as claimed in claim 1, characterized in that: the result encapsulation unit selects the result of the rounding unit or of the saturation unit according to the instruction type information and, according to the data type information, packs it into the final operation result; typically, from a 40-bit operation result, one 32-bit, two 16-bit, or four 8-bit results are selected and packed into the final 32-bit data.
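The selection and packing of claim 8 can be sketched as below, assuming the 40-bit operation result keeps the same hypothetical field layout as the intermediate operand, that is `width + 2` bits per element with the payload sitting between an extension bit and a carry extension bit. The names `pack_result` and `select_and_pack` are illustrative only.

```python
def pack_result(wide, width):
    """Extract the width-bit payload of every field and pack into 32 bits."""
    field_bits = width + 2
    mask = (1 << width) - 1
    out = 0
    for i in range(32 // width):
        # Skip the carry extension bit (bit 0 of the field) and mask off
        # the extension bit above the payload.
        elem = (wide >> (i * field_bits + 1)) & mask
        out |= elem << (i * width)
    return out


def select_and_pack(rounded, saturated, use_saturation, width):
    """Choose the rounding or saturation result, then pack it."""
    return pack_result(saturated if use_saturation else rounded, width)
```

The extension bits carry no information once rounding or saturation has produced a value that fits the element width, which is why the sketch simply discards them.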
CN201310009888.0A 2013-01-10 2013-01-10 Support the single-instruction multiple-data arithmetical unit of numerous types of data Expired - Fee Related CN103092571B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310009888.0A CN103092571B (en) 2013-01-10 2013-01-10 Support the single-instruction multiple-data arithmetical unit of numerous types of data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310009888.0A CN103092571B (en) 2013-01-10 2013-01-10 Support the single-instruction multiple-data arithmetical unit of numerous types of data

Publications (2)

Publication Number Publication Date
CN103092571A CN103092571A (en) 2013-05-08
CN103092571B true CN103092571B (en) 2016-06-22

Family

ID=48205191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310009888.0A Expired - Fee Related CN103092571B (en) 2013-01-10 2013-01-10 Support the single-instruction multiple-data arithmetical unit of numerous types of data

Country Status (1)

Country Link
CN (1) CN103092571B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9916130B2 (en) * 2014-11-03 2018-03-13 Arm Limited Apparatus and method for vector processing
GB2533568B (en) 2014-12-19 2021-11-17 Advanced Risc Mach Ltd Atomic instruction
US20160179530A1 (en) * 2014-12-23 2016-06-23 Elmoustapha Ould-Ahmed-Vall Instruction and logic to perform a vector saturated doubleword/quadword add
WO2020108496A1 (en) * 2018-11-30 2020-06-04 上海寒武纪信息科技有限公司 Method and device for processing data in atomic operation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1432151A (en) * 2000-10-04 2003-07-23 Arm有限公司 Single instruction multiple data processing
CN101685386A (en) * 2008-08-15 2010-03-31 北京北大众志微系统科技有限责任公司 Arithmetic logic unit for processing data with any width and processing method thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120074762A (en) * 2010-12-28 2012-07-06 삼성전자주식회사 Computing apparatus and method based on reconfigurable simd architecture

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
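Design and Implementation of an Embedded Visual Media Processing System-on-Chip; Yan Ming et al.; Acta Electronica Sinica (《电子学报》); 20110228; Vol. 39 (No. 2); pp. 249-254 *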

Also Published As

Publication number Publication date
CN103092571A (en) 2013-05-08

Similar Documents

Publication Publication Date Title
KR102447636B1 (en) Apparatus and method for performing arithmetic operations for accumulating floating point numbers
TWI517038B (en) Instruction for element offset calculation in a multi-dimensional array
TWI470544B (en) Systems, apparatuses, and methods for performing a horizontal add or subtract in response to a single instruction
TWI525538B (en) Super multiply add (super madd) instruction
CN113762490A (en) Matrix multiplication acceleration of sparse matrices using column folding and extrusion
CN103092571B (en) Support the single-instruction multiple-data arithmetical unit of numerous types of data
US20070074007A1 (en) Parameterizable clip instruction and method of performing a clip operation using the same
CN107003846B (en) Method and apparatus for vector index load and store
CN112639722A (en) Apparatus and method for accelerating matrix multiplication
TWI511043B (en) Methods, apparatus, systems, and article of manufacture to generate sequences of integers
EP3855308A1 (en) Systems and methods for performing 16-bit floating-point vector dot product instructions
TWI543076B (en) Apparatus and method for down conversion of data types
CN108415882B (en) Vector multiplication using operand-based systematic conversion and retransformation
TW201802667A (en) Apparatuses, methods, and systems for element sorting of vectors
CN107145335B (en) Apparatus and method for vector instructions for large integer operations
CN108269226B (en) Apparatus and method for processing sparse data
TW202217603A (en) Systems, apparatuses, and methods for fused multiply add
US10564932B2 (en) Methods for calculating floating-point operands and apparatuses using the same
EP3238022B1 (en) Method and apparatus for performing big-integer arithmetic operations
US20170068517A1 (en) Decimal and binary floating point rounding
CN116860334A (en) System and method for calculating the number product of nibbles in two block operands
JPWO2016024508A1 (en) Multiprocessor device
TWI482086B (en) Systems, apparatuses, and methods for performing delta encoding on packed data elements
Seo et al. Consecutive operand-caching method for multiprecision multiplication, revisited
CN116561051A (en) Hardware acceleration card and heterogeneous computing system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160622

Termination date: 20200110
