CN1622031A - Processor and method for trans-boundary aligned multiple transient memory data - Google Patents

Publication number
CN1622031A
Authority
CN
China
Prior art keywords
address
working storage
bit
output terminal
shift
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200310118814
Other languages
Chinese (zh)
Other versions
CN1297887C (en)
Inventor
梁伯嵩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sunplus Technology Co Ltd
Original Assignee
Sunplus Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sunplus Technology Co Ltd
Priority to CNB2003101188147A (patent CN1297887C)
Publication of CN1622031A
Application granted
Publication of CN1297887C
Anticipated expiration
Expired - Fee Related

Abstract

The present invention proposes a processor and a method for aligning data held in multiple registers across a boundary. The processor comprises: a decoder that decodes a multiple shift instruction; a register file with a plurality of N-bit registers; a shifter that concatenates the first output and the second output of the register file into a 2N-bit word, shifts the 2N-bit word by w bits, and outputs its first N bits; and a controller that sets up the register file according to the decoded multiple shift instruction, reads the contents of the corresponding registers, and writes the w-bit-shifted output of the shifter back into the register file.

Description

Processor and method for aligning data in multiple registers across a boundary
Technical field
The present invention relates to the technical field of data processing, and in particular to a processor and a method for aligning data in multiple registers across a boundary.
Background technology
When a processor processes data, whether the data is aligned affects the performance of many key operations, such as string and array operations. As shown in Fig. 1, data to be processed (ABCDEFGHIJKL) often straddles a memory storage boundary. Before the processor can perform a string or array operation on this data, it must first carry out many extra operations to restore the data to an aligned form; only then can the processor use the data.
One known method for handling unaligned data is to load the data into the processor and then use ordinary processor instructions to extract the required data. As shown in Fig. 2, the data at 100h (ZABC) is first loaded into register R16, and R16 is shifted left by 8 bits to remove the unwanted data (Z). The data at 104h (DEFG) is then loaded into register R17, and R17 is shifted right by 24 bits to remove the unwanted data (EFG). Finally, R16 and R17 are combined with an OR operation and the result is stored in R16, whose content is now the required data (ABCD). Following the same steps, the data EFGH and IJKL are loaded into registers R17 and R18 in turn.
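The five-step sequence described for Fig. 2 can be sketched as follows. This is only an illustration, not the program code of Fig. 2: the helper names `load_word` and `unaligned_word_shift_or` are hypothetical, and the sketch assumes big-endian byte order so that the byte at the lowest address (Z) occupies the most significant byte of the register, as the shift directions in the text imply.

```python
MASK32 = 0xFFFFFFFF  # registers are assumed to be 32 bits wide


def load_word(memory: bytes, base: int, address: int) -> int:
    """Aligned 32-bit load; big-endian is an assumption of this sketch."""
    offset = address - base
    return int.from_bytes(memory[offset:offset + 4], "big")


def unaligned_word_shift_or(memory: bytes, base: int, address: int) -> int:
    """Build the word that starts one byte past `address` in five steps."""
    r16 = load_word(memory, base, address)      # step 1: R16 = ZABC
    r16 = (r16 << 8) & MASK32                   # step 2: R16 = ABC0 (Z removed)
    r17 = load_word(memory, base, address + 4)  # step 3: R17 = DEFG
    r17 = r17 >> 24                             # step 4: R17 = 000D (EFG removed)
    r16 = r16 | r17                             # step 5: R16 = ABCD
    return r16


memory = b"ZABCDEFG"  # contents of 100h..107h in the example
word = unaligned_word_shift_or(memory, 0x100, 0x100)
print(word.to_bytes(4, "big"))  # b'ABCD'
```

Each aligned result word costs five instructions here, which is where the 5n figure for n words comes from.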
As the above description shows, if the unaligned data to be loaded is n words long (one word being 32 bits), the known method needs 5n instructions to describe the read operation and at least 5n instruction cycles to complete it. This makes the program code lengthy, consumes storage space, and increases the load on the processor, degrading its efficiency.
To address the lengthy code and poor efficiency that result from handling unaligned data with ordinary processor instructions, U.S. Patent No. 4,814,976 aligns unaligned data while it is being loaded, reading a datum that crosses a boundary in two passes. As shown in Fig. 3, the data at 101h to 103h (ABC) is first loaded into bytes 0, 1, and 2 of register R16; at this point byte 3 of R16 holds X (don't care). The data at 104h (D) is then loaded into byte 3 of R16, whose content is now the required data (ABCD). Following the same steps, the data EFGH and IJKL are loaded into registers R17 and R18 in turn.
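The two-pass load described for Fig. 3 can be modelled in a few lines. The function `load_partial` is a hypothetical helper written for this illustration; it only mirrors the byte movements stated in the text and does not reproduce the instruction set of the cited U.S. patent. Byte 0 is assumed to be the leftmost byte of the register.

```python
def load_partial(register: int, memory: bytes, start: int,
                 count: int, first_byte: int) -> int:
    """Copy `count` bytes from memory[start:] into the register, beginning
    at byte position `first_byte`; the other register bytes are kept."""
    reg = bytearray(register.to_bytes(4, "big"))
    reg[first_byte:first_byte + count] = memory[start:start + count]
    return int.from_bytes(reg, "big")


memory = b"ZABCDEFG"  # offsets 0..7 stand for addresses 100h..107h

r16 = 0                                    # byte 3 is a don't-care at first
r16 = load_partial(r16, memory, 1, 3, 0)   # pass 1: ABC into bytes 0, 1, 2
r16 = load_partial(r16, memory, 4, 1, 3)   # pass 2: D into byte 3
print(r16.to_bytes(4, "big"))  # b'ABCD'
```

Both passes write the same register and touch neighbouring memory words, which is the repeated access that the following paragraph identifies as a source of pipeline stalls and wasted bus bandwidth.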
As the above description shows, if the unaligned data to be loaded is n words long, this method needs 2n instructions to describe the read operation and at least 2n instruction cycles to complete it. Moreover, because the same memory and register locations are read and written repeatedly, the likelihood of a processor pipeline stall increases. Reading the same memory location repeatedly also wastes bus bandwidth, and the resulting delay is especially noticeable in systems without a cache.
Summary of the invention
The object of the present invention is to provide a processor and a method for aligning data in multiple registers across a boundary, so as to avoid the lengthy program code and storage consumption of the known techniques, and at the same time to avoid wasting bus bandwidth on repeated reads of the same memory location.
According to one feature of the present invention, a processor for aligning data in multiple registers across a boundary is proposed, which mainly comprises:
A decoding device, which decodes a multiple shift instruction;
A register file having a plurality of registers, each N bits wide (N being a positive integer); the register file can read registers according to a first address and a second address and output their contents through a first output and a second output, respectively, and can write one of the registers through an input according to a third address;
A shift unit coupled to the first output and second output of the register file, which concatenates the contents of the first output and second output into a 2N-bit word, shifts the 2N-bit word by w bits according to a shift value w (w being a positive integer), and outputs the first N bits of the shifted 2N-bit word; and
A control device coupled to the decoding device and the register file, which, according to the decoded multiple shift instruction, sets the first address, second address, third address, and shift value w, reads the contents of the corresponding registers so that the shift unit shifts them by w bits, and writes the output of the shift unit into the register file according to the third address.
In the described device, N is 32.
In the described device, w is one of 8, 16, and 24.
In the described device, the shift unit can shift by w bits to the left or to the right.
In the described device, the third address is set equal to the first address.
In the described device, the second address is set to the address following the first address.
According to another feature of the present invention, a method for aligning data in multiple registers across a boundary is proposed. The registers form a register file, each register being N bits wide (N being a positive integer); the register file can read registers according to a first address and a second address and output their contents through a first output and a second output, respectively, and can write one of the registers through an input according to a third address. The method mainly comprises the following steps:
(A) setting the first address, the second address, the third address, and a shift value w according to a multiple shift instruction;
(B) reading the contents of the corresponding registers according to the first address and the second address; and
(C) concatenating the register contents read in step (B) into a 2N-bit word, shifting the 2N-bit word by w bits, and writing the first N bits of the shifted 2N-bit word into one of the registers according to the third address.
In the described method, steps (A) to (C) are executed repeatedly until a predetermined number of registers have all been shifted.
In the described method, N is 32.
In the described method, w is one of 8, 16, and 24.
In the described method, the shift by w bits in step (C) can be to the left or to the right.
In the described method, the third address is set equal to the first address.
In the described method, the second address is set to the address following the first address.
Description of drawings
Fig. 1 is a schematic diagram of a set of unaligned data arranged in memory.
Fig. 2 shows program code with which a known technique loads a set of unaligned data.
Fig. 3 shows program code with which another known technique loads a set of unaligned data, together with a schematic diagram of the registers.
Fig. 4 is a block diagram of the processor of the present invention for aligning data in multiple registers across a boundary.
Fig. 5 is a detailed circuit diagram of the control device of the present invention.
Fig. 6 is a schematic diagram of the operation of the present invention.
Fig. 7 shows an example application of the present invention.
Embodiment
Fig. 4 shows a block diagram of the processor of the present invention for aligning data in multiple registers across a boundary. It includes a decoding device 100, a control device 200, a register file 300, and a shift unit 400. The register file 300 has a plurality of registers 3001, each N bits wide (N being a positive integer); in this embodiment N is preferably 32. The register file 300 can read registers 3001 according to a first address 301 and a second address 302 and output their contents through a first output 310 and a second output 320, respectively, and can write one of the registers 3001 through an input 330 according to a third address 303.
The decoding device 100 decodes a multiple shift instruction, which is either a multiple left shift instruction (Multiple Left Shift Instruction, MLSI) or a multiple right shift instruction (Multiple Right Shift Instruction, MRSI). The format of the multiple left shift instruction is MLSI Rx, Ry, w, which means that the contents of the registers in the range x to y are shifted left by w bits as a whole. The format of the multiple right shift instruction is MRSI Rx, Ry, w, which means that the contents of the registers in the range x to y are shifted right by w bits as a whole. After decoding a multiple shift instruction, the decoding device 100 produces the signals x, y, L_R*, and w and outputs them to the control device 200. The L_R* signal indicates whether the shift by w bits is to the left or to the right: when L_R* is 1 the shift is w bits to the left, and when L_R* is 0 the shift is w bits to the right.
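The signals produced by the decoding device can be illustrated with a small sketch that works on the assembly text of the instruction. The binary encoding of MLSI and MRSI is not given in this document, so parsing the textual form is purely an assumption made for illustration; `DecodedShift` and `decode` are hypothetical names.

```python
import re
from typing import NamedTuple


class DecodedShift(NamedTuple):
    x: int    # first register of the range
    y: int    # last register of the range
    l_r: int  # the L_R* signal: 1 = shift left, 0 = shift right
    w: int    # shift amount in bits


# Matches the textual forms "MLSI Rx, Ry, w" and "MRSI Rx, Ry, w".
_PATTERN = re.compile(r"^\s*(MLSI|MRSI)\s+R(\d+)\s*,\s*R(\d+)\s*,\s*(\d+)\s*$")


def decode(text: str) -> DecodedShift:
    """Turn the assembly text of a multiple shift instruction into signals."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a multiple shift instruction: {text!r}")
    mnemonic, x, y, w = match.groups()
    return DecodedShift(int(x), int(y), 1 if mnemonic == "MLSI" else 0, int(w))


print(decode("MLSI R16, R19, 8"))
```

For the instruction MLSI R16, R19, 8 this yields x = 16, y = 19, L_R* = 1, and w = 8.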
The shift unit 400 is coupled to the first output 310 and second output 320 of the register file 300. It concatenates the contents of the first output 310 and second output 320 into a 64-bit word, shifts this 64-bit word left or right by w bits (w being a positive integer) according to a shift value w and the L_R* signal, and outputs the first 32 bits of the shifted 64-bit word.
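A minimal sketch of the left-shift path of the shift unit, assuming that the "first" bits of the 64-bit word are its most significant bits. The right-shift path is left out because the text does not spell out its result in an example; `shift_unit_left` is a hypothetical name.

```python
def shift_unit_left(first: int, second: int, w: int, n: int = 32) -> int:
    """Concatenate two n-bit values, shift left by w bits, return the top n bits."""
    mask_2n = (1 << (2 * n)) - 1
    word_2n = (first << n) | second        # first output forms the upper half
    shifted = (word_2n << w) & mask_2n     # bits shifted past 2n are discarded
    return shifted >> n                    # the first (upper) n bits


zabc = int.from_bytes(b"ZABC", "big")
defg = int.from_bytes(b"DEFG", "big")
result = shift_unit_left(zabc, defg, 8)
print(result.to_bytes(4, "big"))  # b'ABCD'
```

With w = 8 the byte Z falls off the top of the 64-bit word and the byte D moves into the upper half, so the output is ABCD.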
The control device 200 is coupled to the decoding device 100 and the register file 300. According to the decoded x, y, L_R*, and w signals, it sets the first address 301, second address 302, and third address 303 of the register file 300 and the shift value w, and reads the contents of the corresponding registers of the register file 300 through the first output 310 and second output 320.
Fig. 5 is a detailed circuit diagram of the control device 200, which mainly comprises a multiplexer 210, a comparator 220, a first address register 230, an adder 240, and a second address register 250. The multiplexer 210 selects either the x signal produced by the decoding device 100 or the content of the second address register 250. The output of the multiplexer 210 is written into the first address register 230, whose output drives the first address 301 of the register file 300 to access the register 3001 that the first address 301 indicates. The adder 240 adds 1 to the content of the first address register 230 and writes the result into the second address register 250, whose content is used to access the register 3001 that the second address 302 indicates. The comparator 220 compares the content of the first address register 230 with the y signal produced by the decoding device 100; if the content of the first address register 230 is greater than or equal to the y signal, the comparator produces a stop signal.
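The address sequencing performed by the multiplexer, adder, and comparator can be sketched as a generator. This is a behavioural model under the assumption that one address pair is produced per execution cycle; it is not a description of the circuit of Fig. 5, and `address_sequence` is a hypothetical name.

```python
def address_sequence(x: int, y: int):
    """Yield (first_address, second_address) for each execution cycle."""
    first = x                  # multiplexer selects the x signal initially
    while first < y:           # comparator raises stop when first >= y
        second = first + 1     # adder writes first + 1 to the second address register
        yield first, second
        first = second         # multiplexer selects the second address register


print(list(address_sequence(16, 19)))  # [(16, 17), (17, 18), (18, 19)]
```

For x = 16 and y = 19 the model produces three address pairs before the stop condition is met.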
Fig. 6 is a schematic diagram of the operation of the present invention when executing the instruction MLSI R16, R19, 8, which shifts the contents of registers R16, R17, R18, and R19 left by 8 bits. When the first execution cycle begins, the decoding device 100 decodes the instruction and produces the signals x=16, y=19, L_R*=1, and w=8. The multiplexer 210 selects the x signal (=16) produced by the decoding device 100, so the control device 200 sets the first address register 230 to 16 and, through the adder 240, sets the second address register 250 to 17. Because the first address register 230 holds 16, which is less than 19, the comparator 220 does not produce the stop signal. The register file 300 therefore reads the content of register R16 (=ZABC) and the content of register R17 (=DEFG) according to the first address 301 (=16) and second address 302 (=17), and outputs them to the shift unit 400 through the first output 310 and second output 320.
The shift unit 400 concatenates the content of the first output 310 (=ZABC) and the content of the second output 320 (=DEFG) into a 64-bit word (=ZABCDEFG), shifts this 64-bit word left by 8 bits (=ABCDEFG0) according to the shift value w=8 and the signal L_R*=1, and outputs the first 32 bits (=ABCD) of the shifted 64-bit word (=ABCDEFG0). The control device 200 then writes the output of the shift unit 400 (=ABCD) into register R16 of the register file 300 according to the third address 303.
When the second execution cycle begins, the multiplexer 210 selects the content of the second address register 250 (=17), so the control device 200 sets the first address register 230 to 17 and, through the adder 240, sets the second address register 250 to 18. The cycle then proceeds in the same way as the first execution cycle, so when the second execution cycle ends, the content of register R17 is EFGH. Likewise, when the third execution cycle ends, the content of register R18 is IJKL.
When the fourth execution cycle begins, the multiplexer 210 selects the content of the second address register 250 (=19), and the control device 200 sets the first address register 230 to 19. Because the first address register 230 now holds 19, the comparator 220 produces the stop signal and execution stops; that is, only three execution cycles are needed.
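The whole walk-through of Fig. 6 can be reproduced with a short behavioural simulation. It assumes that the third address equals the first address and that the second address is the one following the first, as in the variants described in the summary; only the left shift is modelled. The function `mlsi` and the filler bytes after L in R19 are hypothetical.

```python
def mlsi(registers: dict, x: int, y: int, w: int, n: int = 32) -> int:
    """Shift registers x..y left by w bits as a whole; return the cycle count."""
    mask_2n = (1 << (2 * n)) - 1
    cycles = 0
    first = x
    while first < y:                       # stop once the first address reaches y
        second = first + 1
        word_2n = (registers[first] << n) | registers[second]
        shifted = (word_2n << w) & mask_2n
        registers[first] = shifted >> n    # write back through the third address
        first = second
        cycles += 1
    return cycles


def word(text: bytes) -> int:
    return int.from_bytes(text, "big")


# Register contents before the instruction; "???" stands for bytes after L
# that the example does not specify.
regs = {16: word(b"ZABC"), 17: word(b"DEFG"), 18: word(b"HIJK"), 19: word(b"L???")}
cycles = mlsi(regs, 16, 19, 8)
print(cycles, [regs[r].to_bytes(4, "big") for r in (16, 17, 18)])
```

The simulation takes three cycles and leaves ABCD, EFGH, and IJKL in R16, R17, and R18, in agreement with the description. R19 is only read, never written.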
Fig. 7 shows an application of the present invention. To load a set of unaligned data, load instructions (LW) first load the unaligned data into registers R16, R17, R18, and R19, and a single multiple left shift instruction (MLSI) of the present invention then completes the alignment. As shown in Fig. 7, the program code occupies only 5 words.
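The instruction counts of the three approaches can be compared with a small calculation. The 5n and 2n figures come from the background section; the expression (n + 1) + 1 for the present technique is an extrapolation from the single example of Fig. 7 (four LW instructions plus one MLSI for three aligned words) and is an assumption of this sketch, not a formula stated in the document.

```python
def instruction_counts(n: int) -> dict:
    """Instructions needed to obtain n aligned words from unaligned data."""
    return {
        "shift_and_or": 5 * n,                   # load, shift, load, shift, OR per word
        "two_partial_loads": 2 * n,              # two partial loads per word
        "aligned_loads_plus_mlsi": (n + 1) + 1,  # assumed: n + 1 LW, then one MLSI
    }


print(instruction_counts(3))
```

For the three-word example (ABCD, EFGH, IJKL) the counts are 15, 6, and 5 instructions respectively.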
As the above description shows, the technique of the present invention solves the problems of lengthy program code and storage consumption in the known techniques, and at the same time avoids wasting bus bandwidth on repeated reads of the same memory location.
It should be noted that the embodiments above are given only as examples for convenience of explanation. The scope of rights claimed by the present invention is defined by the claims and is not limited to the embodiments above.

Claims (13)

1. A processor for aligning data in multiple registers across a boundary, mainly comprising:
A decoding device, which decodes a multiple shift instruction;
A register file having a plurality of registers, each N bits wide (N being a positive integer), wherein the register file can read registers according to a first address and a second address and output their contents through a first output and a second output, respectively, and can write one of the registers through an input according to a third address;
A shift unit coupled to the first output and second output of the register file, which concatenates the contents of the first output and second output into a 2N-bit word, shifts the 2N-bit word by w bits according to a shift value w (w being a positive integer), and outputs the first N bits of the shifted 2N-bit word; and
A control device coupled to the decoding device and the register file, which, according to the decoded multiple shift instruction, sets the first address, second address, third address, and shift value w, reads the contents of the corresponding registers so that the shift unit shifts them by w bits, and writes the output of the shift unit into the register file according to the third address.
2. The device as claimed in claim 1, wherein N is 32.
3. The device as claimed in claim 1, wherein w is one of 8, 16, and 24.
4. The device as claimed in claim 1, wherein the shift unit can shift by w bits to the left or to the right.
5. The device as claimed in claim 1, wherein the third address is set equal to the first address.
6. The device as claimed in claim 1, wherein the second address is set to the address following the first address.
7. A method for aligning data in multiple registers across a boundary, the registers forming a register file, each register being N bits wide (N being a positive integer), wherein the register file can read registers according to a first address and a second address and output their contents through a first output and a second output, respectively, and can write one of the registers through an input according to a third address, the method mainly comprising the following steps:
(A) setting the first address, the second address, the third address, and a shift value w according to a multiple shift instruction;
(B) reading the contents of the corresponding registers according to the first address and the second address; and
(C) concatenating the register contents read in step (B) into a 2N-bit word, shifting the 2N-bit word by w bits, and writing the first N bits of the shifted 2N-bit word into one of the registers according to the third address.
8. The method as claimed in claim 7, wherein steps (A) to (C) are executed repeatedly until a predetermined number of registers have all been shifted.
9. The method as claimed in claim 7, wherein N is 32.
10. The method as claimed in claim 7, wherein w is one of 8, 16, and 24.
11. The method as claimed in claim 7, wherein the shift by w bits in step (C) can be to the left or to the right.
12. The method as claimed in claim 7, wherein the third address is set equal to the first address.
13. The method as claimed in claim 7, wherein the second address is set to the address following the first address.
CNB2003101188147A 2003-11-28 2003-11-28 Processor and method for trans-boundary aligned multiple transient memory data Expired - Fee Related CN1297887C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2003101188147A CN1297887C (en) 2003-11-28 2003-11-28 Processor and method for trans-boundary aligned multiple transient memory data

Publications (2)

Publication Number Publication Date
CN1622031A (en) 2005-06-01
CN1297887C (en) 2007-01-31

Family

ID=34761217

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2003101188147A Expired - Fee Related CN1297887C (en) 2003-11-28 2003-11-28 Processor and method for trans-boundary aligned multiple transient memory data

Country Status (1)

Country Link
CN (1) CN1297887C (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4814976C1 (en) * 1986-12-23 2002-06-04 Mips Tech Inc Risc computer with unaligned reference handling and method for the same
US7685212B2 (en) * 2001-10-29 2010-03-23 Intel Corporation Fast full search motion estimation with SIMD merge instruction

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108288964A (en) * 2017-01-09 2018-07-17 南亚科技股份有限公司 Circuit system
CN108288964B (en) * 2017-01-09 2021-08-27 南亚科技股份有限公司 Circuit system

Also Published As

Publication number Publication date
CN1297887C (en) 2007-01-31

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20070131

Termination date: 20141128

EXPY Termination of patent right or utility model