CN101840323B - Device and method for division evolution of non-full flow water vectors supporting scalar quantity multiplexing - Google Patents


Info

Publication number
CN101840323B
Authority
CN
China
Prior art keywords
scalar
parts
division evolution
multiplexing
division
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010133813XA
Other languages
Chinese (zh)
Other versions
CN101840323A (en)
Inventor
刘宏伟
郇丹丹
张晓春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Loongson Technology Corp Ltd
Original Assignee
Loongson Technology Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2010-03-25
Filing date
2010-03-25
Publication date
2012-02-08
Application filed by Loongson Technology Corp Ltd
Priority to CN201010133813XA
Publication of CN101840323A
Application granted
Publication of CN101840323B

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention discloses a non-fully-pipelined vector division and square-root device and method that support scalar multiplexing. The device comprises a control module and at least two division and square-root components. The control module comprises a selection module, a scalar and vector instruction execution control module, a data path selection module and a result control module. The selection module generates a selection signal that determines which division and square-root component receives the input data; the scalar and vector instruction execution control module controls the execution of scalar and vector instructions according to the "busy" signals of the division and square-root components; the data path selection module generates the corresponding paths according to the selection signal; and the result control module produces results in different data formats according to the configuration.

Description

Non-fully-pipelined vector division and square-root device and method supporting scalar multiplexing
Technical field
The present invention relates to the technical field of microprocessor architecture, and in particular to a non-fully-pipelined vector division and square-root device and method supporting scalar multiplexing.
Background
As processor technology continues to develop, its fields of application keep expanding. In particular, with the growth of multimedia, scientific and other kinds of computation, general-purpose processors are all adding single-instruction multiple-data (SIMD) instruction sets. Every vector instruction in such an instruction set has a corresponding scalar instruction, because not all programs can be vectorized.
To raise peak throughput, such a vector extension must replicate its functional arithmetic units in parallel. For example, a functional unit with 256-bit-wide arithmetic capability needs four 64-bit double-precision arithmetic units.
Most operations are fully pipelined, so when a scalar instruction executes, the arithmetic units it leaves unused do not cause waste. The division and square-root unit, however, is not fully pipelined, and its execution takes an indeterminate number of cycles. If executing one scalar instruction blocks the issue of other scalar division and square-root instructions, the remaining division and square-root execution units are wasted.
At present, no related technology for non-fully-pipelined vector division and square root supporting scalar multiplexing has been found at home or abroad.
Summary of the invention
The object of the invention is to provide a non-fully-pipelined vector division and square-root device and method supporting scalar multiplexing. While a scalar instruction is executing, other scalar division and square-root instructions can still be issued, which greatly increases the utilization of the division and square-root components. Because division and square-root operations are not pipelined and usually block the entire functional unit, the invention also increases division throughput, significantly reduces pipeline stalls and improves the operating efficiency of the processor.
To achieve this object, the invention provides a non-fully-pipelined vector division and square-root device supporting scalar multiplexing. The device comprises a control module and at least two division and square-root components, wherein the control module comprises: a selection module, for generating a selection signal that selects which division and square-root component the input data is sent to; a counter, for counting the number of "busy" components among said at least two division and square-root components; a scalar and vector instruction execution control module, for controlling the execution of scalar and vector instructions according to the number of "busy" components obtained by said counter and the "busy" signals of said division and square-root components; a data path selection module, for generating the corresponding path according to said selection signal; and a result control module, for producing results in different data formats according to the configuration.
To achieve this object, the invention also provides a non-fully-pipelined vector division and square-root method supporting scalar multiplexing. The method comprises the following steps: S1: an instruction issue step of issuing instructions to the control module; S2: a selection step of generating a selection signal that selects which division and square-root component the input data is sent to, and a step of counting the number of "busy" components among said at least two division and square-root components; S3: a scalar and vector instruction execution control step of controlling the execution of scalar and vector instructions according to the number of "busy" components obtained by said counting and the "busy" signals of said division and square-root components; S4: a data path selection step of generating the corresponding path according to said selection signal; S5: an allocation step of allocating whether said division and square-root components participate in the operation; S6: a step of arranging the result data according to the configuration.
Beneficial effects of the invention: the device and method provide a structure in which scalar instructions reuse the sub-functional units of the vector unit. While keeping the basic vector structure, scalar division and square-root instructions reuse the sub-modules, which greatly improves the utilization of the functional unit at a small hardware cost.
Brief description of the drawings
Fig. 1 is a structural diagram of the non-fully-pipelined vector division and square-root device supporting scalar multiplexing according to a specific embodiment of the present invention;
Fig. 2 is a flowchart of the non-fully-pipelined vector division and square-root method supporting scalar multiplexing according to a specific embodiment of the present invention;
Fig. 3 is a schematic diagram of how single-precision and double-precision floating-point results are represented when the vector operation width is 256 bits.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, the device and method of the present invention for improving microprocessor component utilization and operating efficiency are further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
The present invention is described in further detail below with reference to the drawings and embodiments.
Fig. 1 is a structural diagram of the non-fully-pipelined vector division and square-root device supporting scalar multiplexing according to a specific embodiment of the present invention. Note that the four division and square-root components shown in the figure are only one preferred embodiment; there may be two, three, five or more such components, and they are identical arithmetic units.
The instruction issue unit issues instructions to the control module; an instruction may be an opcode. Specifically, one instruction is issued to the functional unit per cycle, and its operand is 256 bits wide. For a scalar instruction, the upper 192 bits are invalid, so the lowest 64 bits of the data path must be able to reach any one of the four division and square-root components. Which component they reach is controlled by the sel signal generated by the control module. For a vector operation, all 256 bits of the operand are meaningful, so each 64-bit operand is sent to its corresponding division and square-root module.
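The operand routing described above can be sketched in Python as a behavioral model, not as the patented circuit. The function name `route_operand`, the constants, and the use of `None` for "no data this cycle" are illustrative assumptions; only the 256-bit operand, the 64-bit lanes, the four components and the `sel` signal come from the text.

```python
LANES = 4                        # four division and square-root components, as in Fig. 1
LANE_BITS = 64                   # one double-precision lane
LANE_MASK = (1 << LANE_BITS) - 1


def route_operand(operand_256, is_vector, sel=0):
    """Hypothetical model of the input data path.

    Returns one entry per component: the 64-bit value it receives,
    or None if it receives no data this cycle.
    """
    if is_vector:
        # Vector: 64-bit slice i of the operand goes to component i.
        return [(operand_256 >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]
    # Scalar: only the low 64 bits are valid; sel picks the destination.
    inputs = [None] * LANES
    inputs[sel] = operand_256 & LANE_MASK
    return inputs
```

In this sketch a scalar operand with `sel=2` reaches only component 2, while a vector operand is split across all four components.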
The result side has a corresponding selection, with the path direction reversed: each of the four components has a data path to the lowest 64 bits of the result. The result selection is controlled by the signal res_control. The upper bits of the final result can be zero-filled or handled in a similar way.
To control the execution of scalar and vector instructions, the control module contains a counter that counts the number of "busy" components among the four modules. The instruction issued in each cycle is handled as follows:
(1) If it is a vector instruction: as long as the counter is non-zero, the preceding stage is notified that the division and square-root unit is "busy" and the whole functional unit blocks; otherwise the vector instruction is issued to the four components and executed in parallel.
(2) If it is a scalar instruction: the counter value is checked, and the "busy" signal is fed back to the preceding stage only when the value is 4. Otherwise a component is allocated to perform the operation. Various allocation strategies are possible; the example used here is fixed priority, that is, searching from the lowest-numbered component upward until a non-"busy" sub-component is found, routing the operand to it, and letting it perform the operation.
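The two issue rules above can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the name `issue`, the boolean list `busy`, and the `(stall, components)` return shape are hypothetical and chosen only for illustration.

```python
def issue(busy, is_vector):
    """Hypothetical model of the issue decision for one instruction.

    busy: list of booleans, one "busy" flag per component.
    Returns (stall, components), where stall tells the preceding stage
    to hold the instruction and components lists the units that start.
    """
    count = sum(busy)                 # value of the busy counter
    lanes = len(busy)
    if is_vector:
        if count != 0:
            return True, []           # rule (1): any busy component blocks a vector
        return False, list(range(lanes))
    if count == lanes:
        return True, []               # rule (2): stall only when every component is busy
    # Fixed priority: take the lowest-numbered component that is not busy.
    for index, is_busy in enumerate(busy):
        if not is_busy:
            return False, [index]
```

For example, with components 0 and 2 busy, a scalar instruction goes to component 1, while a vector instruction stalls.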
On the result side, because the division and square-root components perform iterative operations that take an indeterminate number of cycles, results may be produced in an order different from the issue order, and several results may compete for output. This result contention can also be resolved with fixed priority.
The operation of the non-fully-pipelined vector division and square-root unit supporting scalar multiplexing is illustrated below with two cases: two scalar instructions issued consecutively, and several scalar instructions followed by a vector instruction.
According to another specific embodiment of the present invention, scalar instructions are issued consecutively. There are five "busy" signals in two groups. The first group consists of the four "busy" signals of the individual sub-components. The other is the "busy" signal of the vector division unit, which is the OR of the four signals in the first group; it indicates that the division unit holds an instruction that is being computed but has not yet been submitted to the result bus. In this structure, when the first scalar instruction is issued, no busy signal is set, so selecting from low to high yields sub-component number 0. This number serves as the select signal of the multiplexers at the input side, so sub-component 0 receives the instruction and starts computing; its "busy" signal is set, and the vector unit busy signal is set as well.
When the second instruction is issued, its op and fmt fields identify it as a scalar instruction. The control module checks the busy signals of the four sub-components and selects, from low to high, the lowest-numbered one whose busy signal is not set; the sub-component number obtained is 1. The corresponding sel signal is therefore 1, the valid 64 bits of data in the instruction are sent to division and square-root component 1, the corresponding sub-component "busy" signal is set, and the operation begins.
At result submission, because a division or square-root instruction takes an indeterminate number of cycles, an instruction issued later is often submitted before one issued earlier, and several operations may also submit in the same cycle. Result output therefore uses bus contention: only one result is output per cycle, and arbitration again uses fixed priority, with component 0 having the highest priority and the others decreasing in order.
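The fixed-priority arbitration of the result bus can be sketched as follows. The function names `arbitrate` and `drain` and the boolean list `ready` are hypothetical; the sketch only models which component is granted the bus, one grant per cycle, with component 0 highest.

```python
def arbitrate(ready):
    """Return the index of the component granted the result bus this
    cycle (lowest index wins), or None if no result is waiting."""
    for index, has_result in enumerate(ready):
        if has_result:
            return index
    return None


def drain(ready):
    """Return the order in which waiting results leave the unit,
    assuming one grant per cycle and no new results arriving."""
    ready = list(ready)
    order = []
    while (winner := arbitrate(ready)) is not None:
        order.append(winner)
        ready[winner] = False
    return order
```

If components 1 and 3 finish in the same cycle, this model outputs the result of component 1 first and that of component 3 one cycle later.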
According to another specific embodiment of the present invention, a vector instruction follows several scalar instructions. As described above, the first instruction is assigned to sub-component 0, and the "busy" signal of sub-component 0 and the "busy" signal of the vector division unit are set.
When a subsequent instruction is issued, the control module uses op and fmt to determine whether the received instruction is a vector instruction, and checks the vector unit "busy" signal. Because several scalar instructions were issued earlier, are executing in the components, and have not all been submitted to the result bus, the vector unit "busy" signal is set. The whole unit therefore blocks: it signals the preceding pipeline stage that the functional unit is "full", so that no further operations are issued to it.
The blocking continues until all instructions in the functional unit have finished and been submitted through the result bus. At that point all sub-component "busy" signals are reset to 0, so the vector unit "busy" signal is also reset to 0. The vector division and square-root instruction is then issued, the blocking is removed, and the pipeline continues.
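The scalar-then-vector scenario can be replayed with a small cycle-level model. This is a sketch under simplifying assumptions: the class `DivSqrtUnit`, its `latency` argument and the `tick` method are hypothetical, and a component's "busy" flag is cleared as soon as its countdown ends, which ignores the result-bus arbitration described earlier.

```python
class DivSqrtUnit:
    """Hypothetical behavioral model of the busy signals."""

    def __init__(self, lanes=4):
        # Cycles left per sub-component; 0 means idle.
        self.remaining = [0] * lanes

    @property
    def sub_busy(self):
        return [cycles > 0 for cycles in self.remaining]

    @property
    def vector_busy(self):
        # The vector unit busy signal is the OR of the sub-component signals.
        return any(self.sub_busy)

    def try_issue(self, is_vector, latency):
        """Return True if the instruction was accepted, False if it stalls."""
        if is_vector:
            if self.vector_busy:
                return False
            self.remaining = [latency] * len(self.remaining)
            return True
        for index, cycles in enumerate(self.remaining):
            if cycles == 0:          # fixed priority, lowest index first
                self.remaining[index] = latency
                return True
        return False

    def tick(self):
        """Advance one cycle."""
        self.remaining = [max(0, cycles - 1) for cycles in self.remaining]


unit = DivSqrtUnit()
unit.try_issue(is_vector=False, latency=3)   # goes to sub-component 0
unit.try_issue(is_vector=False, latency=5)   # goes to sub-component 1
blocked = not unit.try_issue(is_vector=True, latency=4)
```

In this run the vector instruction stalls while either scalar operation is in flight and is accepted only after enough `tick()` calls have cleared every sub-component.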
Fig. 2 is a flowchart of the non-fully-pipelined vector division and square-root method supporting scalar multiplexing according to a specific embodiment of the present invention. As shown in Fig. 2, instructions are first issued to the control module; the issued instructions comprise at least one scalar instruction and at least one vector instruction. A selection signal is then generated to select which division and square-root component the input data is sent to. Next, the execution of scalar and vector instructions is controlled according to the "busy" signals of said division and square-root components. After that, the corresponding data path is generated according to the selection signal, and the division and square-root components are allocated to participate in the operation or not; this allocation step uses a priority-based allocation strategy.
Fig. 3 illustrates the result formats of the present invention. According to the IEEE 754 standard, single-precision floating-point data is 32 bits wide and double-precision floating-point data is 64 bits wide. Data whose name begins with Result_ represents the result sent out after data path selection; data whose name begins with rd_ represents the initial value of the destination register. A 256-bit vector width is used here as an example to illustrate the three modes (named in claim 7 as the zero-clear, hold and broadcast modes). Different formats are supported at the same time and can be configured flexibly, which is a great convenience for applications in different settings.
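The three result modes can be sketched for a scalar double-precision result as follows. The text names the modes but does not spell out their bit-level behavior, so the semantics below are assumptions: zero-clear fills the upper 192 bits with zeros, hold keeps the upper 192 bits of the destination register's initial value, and broadcast copies the 64-bit result into every lane. The function name and mode strings are likewise hypothetical, and single-precision layouts are not modeled.

```python
LANES = 4
LANE_BITS = 64
LANE_MASK = (1 << LANE_BITS) - 1
VECTOR_MASK = (1 << (LANES * LANE_BITS)) - 1


def format_scalar_result(result_64, rd_initial_256, mode):
    """Build the 256-bit value written back for a scalar result.

    Assumed semantics (not confirmed by the source):
      "zero"      upper 192 bits are zero
      "hold"      upper 192 bits keep the destination's initial value
      "broadcast" the result is copied into all four 64-bit lanes
    """
    result_64 &= LANE_MASK
    if mode == "zero":
        return result_64
    if mode == "hold":
        upper = rd_initial_256 & VECTOR_MASK & ~LANE_MASK
        return upper | result_64
    if mode == "broadcast":
        value = 0
        for lane in range(LANES):
            value |= result_64 << (lane * LANE_BITS)
        return value
    raise ValueError(f"unknown mode: {mode}")
```

Selecting the mode through one configuration argument mirrors the idea that the same data path serves all formats and only the final arrangement step changes.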
The specific embodiments above show the advantage of this way of working clearly. Compared with issuing the next instruction only after each instruction has been submitted, efficiency improves greatly when most of the instructions are scalar. This situation is quite common, because a vector processor often also has to run previously compiled, non-vectorized programs, in which all division operations are scalar. In that case the efficiency gain is even more pronounced.
In addition, the hardware cost of this method is small. When the area and timing requirements are not too strict, it can be used to raise the utilization of the arithmetic units and thereby improve operating efficiency.
Although the present invention has been described with reference to preferred embodiments, those skilled in the art will recognize that changes in form and detail may be made without departing from the spirit and scope of the invention. The invention is not intended to be limited to the specific embodiments disclosed as the best mode contemplated for carrying it out; rather, it includes all embodiments falling within the scope of the appended claims.

Claims (12)

1. A non-fully-pipelined vector division and square-root device supporting scalar multiplexing, comprising a control module and at least two division and square-root components, wherein:
said control module comprises:
a selection module, for generating a selection signal that selects which division and square-root component the input data is sent to;
a counter, for counting the number of "busy" components among said at least two division and square-root components;
a scalar and vector instruction execution control module, for controlling the execution of scalar and vector instructions according to the number of "busy" components obtained by said counter and the "busy" signals of said division and square-root components;
a data path selection module, for generating the corresponding path according to said selection signal;
a result control module, for arranging the results output by the functional components into a standard format, and capable of outputting results in different formats according to different configurations.
2. The non-fully-pipelined vector division and square-root device supporting scalar multiplexing as claimed in claim 1, further comprising an instruction issue unit for issuing instructions to said control module.
3. The non-fully-pipelined vector division and square-root device supporting scalar multiplexing as claimed in claim 2, wherein said instructions comprise at least one scalar instruction and at least one vector instruction.
4. The non-fully-pipelined vector division and square-root device supporting scalar multiplexing as claimed in claim 3, wherein said at least two division and square-root components are identical arithmetic units.
5. The non-fully-pipelined vector division and square-root device supporting scalar multiplexing as claimed in claim 4, wherein said control module further comprises a data arrangement module, for arranging the result data of said selection signal into a standard format.
6. The non-fully-pipelined vector division and square-root device supporting scalar multiplexing as claimed in claim 5, wherein said data arrangement module further comprises a selection module and method for the different result data formats, which can select among the different result data formats according to the configuration.
7. The non-fully-pipelined vector division and square-root device supporting scalar multiplexing as claimed in claim 6, wherein the supported result data formats comprise the following three: a zero-clear mode, a hold mode and a broadcast mode.
8. The non-fully-pipelined vector division and square-root device supporting scalar multiplexing as claimed in claim 7, wherein said control module further comprises an allocation module, for allocating whether said division and square-root components participate in the operation.
9. A non-fully-pipelined vector division and square-root method supporting scalar multiplexing, said method comprising the following steps:
S1: an instruction issue step of issuing instructions to a control module;
S2: a selection step of generating a selection signal that selects which division and square-root component the input data is sent to, and a step of counting the number of "busy" components among said at least two division and square-root components;
S3: a scalar and vector instruction execution control step of controlling the execution of scalar and vector instructions according to the number of "busy" components obtained by said counting and the "busy" signals of said division and square-root components;
S4: a data path selection step of generating the corresponding path according to said selection signal;
S5: an allocation step of allocating whether said division and square-root components participate in the operation;
S6: a step of arranging the result data according to the configuration.
10. The non-fully-pipelined vector division and square-root method supporting scalar multiplexing as claimed in claim 9, wherein said instructions in step S1 comprise at least one scalar instruction and at least one vector instruction.
11. The non-fully-pipelined vector division and square-root method supporting scalar multiplexing as claimed in claim 10, wherein the allocation step in step S5 uses a priority-based allocation strategy.
12. The non-fully-pipelined vector division and square-root method supporting scalar multiplexing as claimed in claim 10, wherein the arranging step in step S6 uses a strategy of realizing different modes according to different configurations.
CN201010133813XA 2010-03-25 2010-03-25 Device and method for division evolution of non-full flow water vectors supporting scalar quantity multiplexing Active CN101840323B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010133813XA CN101840323B (en) 2010-03-25 2010-03-25 Device and method for division evolution of non-full flow water vectors supporting scalar quantity multiplexing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010133813XA CN101840323B (en) 2010-03-25 2010-03-25 Device and method for division evolution of non-full flow water vectors supporting scalar quantity multiplexing

Publications (2)

Publication Number Publication Date
CN101840323A CN101840323A (en) 2010-09-22
CN101840323B true CN101840323B (en) 2012-02-08

Family

ID=42743712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010133813XA Active CN101840323B (en) 2010-03-25 2010-03-25 Device and method for division evolution of non-full flow water vectors supporting scalar quantity multiplexing

Country Status (1)

Country Link
CN (1) CN101840323B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0649086A1 (en) * 1993-10-18 1995-04-19 Cyrix Corporation Microprocessor with speculative execution
CN1142484C (en) * 2001-11-28 2004-03-17 中国人民解放军国防科学技术大学 Vector processing method of microprocessor
CN1987825A (en) * 2005-12-23 2007-06-27 中国科学院计算技术研究所 Fetching method and system for multiple line distance processor using path predicting technology

Also Published As

Publication number Publication date
CN101840323A (en) 2010-09-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100095 Building 2, Longxin Industrial Park, Zhongguancun environmental protection technology demonstration park, Haidian District, Beijing

Patentee after: Loongson Zhongke Technology Co.,Ltd.

Address before: 100080 No. 10 South Road, Haidian District Academy of Sciences, Beijing

Patentee before: LOONGSON TECHNOLOGY Corp.,Ltd.
