CN105843589B - A memory device applied to VLIW processors - Google Patents

A memory device applied to VLIW processors

Info

Publication number
CN105843589B
Authority
CN
China
Prior art keywords
data
storage arrangement
access
memory
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610157129.2A
Other languages
Chinese (zh)
Other versions
CN105843589A (en)
Inventor
任浩琪
吴俊�
赵朝兴
雷蕾
王文凯
张志峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Qianxin Technology Co.,Ltd.
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-03-18
Filing date
2016-03-18
Publication date
2018-05-08
Application filed by Tongji University
Priority to CN201610157129.2A
Publication of CN105843589A
Application granted
Publication of CN105843589B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G06F9/3887 Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]

Abstract

The present invention relates to a memory device applied to VLIW processors, intended to improve the efficiency with which multiple memory-access components in a VLIW processor access memory. The memory device comprises multiple sub-banks of identical data width, arranged in a two-dimensional array of rows and columns, and provides two working modes according to the combination of address signals. Mode one: when the memory device is used as an instruction memory or instruction cache, each access reads one VLIW instruction word. Mode two: when the memory device is used as a data memory or data cache, the data obtained in one access is used by the processor either as a single data word or as multiple data words for the processor's SIMD data lanes. Compared with the prior art, the invention offers high flexibility and high efficiency.

Description

A memory device applied to VLIW processors
Technical field
The present invention relates to the field of processor architecture, and more particularly to a memory device applied to VLIW processors.
Background
A digital signal processor (DSP) is a microprocessor with a specialized structure, designed specifically for processing large volumes of digital signals. A dedicated DSP generally also runs faster in real time than a general-purpose processor. Its main feature is powerful numerical computing capability, so it is used primarily in fields that involve large-scale digital information processing. The DSP has become an increasingly important chip in the digital world.
With the rapid development of new technology, the performance demanded of DSPs keeps rising. Techniques such as very long instruction word (VLIW) and single instruction, multiple data (SIMD) are widely used in DSP design. VLIW is a design approach that joins several instructions into one word so that they can execute simultaneously, which raises computing speed. SIMD is an instruction set that replicates multiple operands and packs them into large registers. In a SIMD processor, after instruction decoding, several execution units access memory at the same time, obtain all operands at once, and operate on them. When a DSP runs, however, memory accesses generally take a long time, and the access speed of the memory system has become the processor's bottleneck.
A DSP contains several components that access the instruction memory (IM), such as the instruction fetch unit of the processor core and the DMA module. Many components also access the data memory (DM), such as the multiple arithmetic units in the processor core, the DMA module, and the debug module. The traditional approach attaches all of these components to the processor bus, which lets every component reach the memory. Its drawback is that the components cannot access the memory concurrently, so system efficiency is low. Another strategy, which does allow concurrent access, replaces the ordinary single-port memory with a dual-port memory, but this increases the latency of a single access as well as the area and power consumption of the whole chip.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of the prior art by providing a flexible, efficient memory device applied to VLIW processors.
The object of the present invention can be achieved through the following technical solution:
A memory device applied to VLIW processors, intended to improve the efficiency with which multiple memory-access components in a VLIW processor access memory. The memory device comprises multiple sub-banks of identical data width, arranged in a two-dimensional array of rows and columns. Depending on the combination of address signals, the memory device works in one of two modes:
Mode one: when the memory device is used as an instruction memory or instruction cache, each access reads one VLIW instruction word.
Mode two: when the memory device is used as a data memory or data cache, the data obtained in one access is used by the processor either as a single data word or as multiple data words for the processor's SIMD data lanes.
The maximum data access width supported by the memory device is the product of the number of sub-banks in each row and the data width of each sub-bank.
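As a worked instance of this product, take the figures used later in the embodiments: sub-banks 32 bits wide, four of which are accessed together. Treating four as the number of sub-banks per row is an inference from the embodiments, and the symbols W_max, N_row and W_bank are introduced here only for illustration.

```latex
W_{\max} = N_{\text{row}} \times W_{\text{bank}} = 4 \times 32\ \text{bits} = 128\ \text{bits}
```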
In mode one, when the maximum data access width supported by the memory device is 128 bits, the VLIW instruction word is 128 bits long.
In mode two, when the maximum data access width supported by the memory device is 128 bits, the data obtained in one access is used by the processor as one 128-bit data word, or as two 64-bit, four 32-bit, eight 16-bit, or sixteen 8-bit data words for the processor's SIMD data lanes.
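The lane interpretations listed for mode two can be illustrated with a minimal Python sketch. The function name `split_lanes` is hypothetical, and the lane ordering (lane 0 taken from the least significant bits) is an assumption, since the text does not specify it.

```python
from typing import List


def split_lanes(word: int, lane_bits: int, total_bits: int = 128) -> List[int]:
    """Split one accessed word into equal-width SIMD lanes.

    Assumption: lane 0 holds the least significant bits of the word.
    """
    if lane_bits <= 0 or total_bits % lane_bits != 0:
        raise ValueError("lane width must divide the access width evenly")
    mask = (1 << lane_bits) - 1
    lane_count = total_bits // lane_bits
    return [(word >> (i * lane_bits)) & mask for i in range(lane_count)]


if __name__ == "__main__":
    sample = 0x00112233445566778899AABBCCDDEEFF
    # The same 128-bit access viewed at each lane width named in the text.
    for bits in (128, 64, 32, 16, 8):
        lanes = split_lanes(sample, bits)
        print(bits, len(lanes), [hex(x) for x in lanes])
```

The same accessed value is reused for every lane width, which is the point of mode two: the memory delivers 128 bits once and the processor decides how to partition them.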
When the data width of a memory access equals the maximum data access width supported by the memory device:
Multiple sub-banks are selected according to the high-order portion of the address signal, the selected sub-banks are addressed simultaneously according to the high-order portion of the address signal to access the corresponding data, and the accessed data are combined into data of the maximum data access width for use by the processor.
When the data width of a memory access is less than the maximum data access width supported by the memory device:
One sub-bank is selected according to the high-order portion of the address signal, and the selected sub-bank is addressed according to the low-order portion of the address signal to access the corresponding data for use by the processor.
Compared with the prior art, the present invention has the following advantages:
1. High flexibility: in each memory access cycle the invention can be configured for a different data width, so it supports a variety of access instructions more flexibly, such as LOAD8, LOAD16, and LOAD32.
2. High efficiency: when the memory device is used as an instruction memory, each access fetches one VLIW instruction word, which can contain several sub-instructions. When it is used as a data memory, each access fetches up to 128 bits of data, which can be divided into several data segments for use by multiple memory-access modules.
Brief description of the drawings
Fig. 1 shows one embodiment of the structure of the memory device of the present invention;
Fig. 2 shows an embodiment in which the memory device accesses 128-bit data and 32-bit data;
Fig. 3 shows an embodiment in which the memory device accesses 16-bit data;
Fig. 4 shows an embodiment in which the memory device accesses 8-bit data.
Detailed description
The present invention is described in detail below with reference to the drawings and specific embodiments.
Embodiment 1:
Fig. 1 shows one embodiment of the structure of the memory device of the present invention. In this embodiment the memory device is divided into 16 sub-banks of equal capacity, each with a data width of 32 bits. For ease of explanation, the 16 sub-banks are numbered with 4-bit binary numbers from "0000" to "1111", starting from sub-bank 105. In Fig. 1 one circle represents one byte (101), and the group of four circles marked 106 represents one 32-bit word of a sub-bank. 102 is the address signal. Four sub-banks are selected according to the address signal, and the data in the four selected sub-banks are picked out through 104. 103 is the preliminarily selected valid data; the final valid data are then selected from it by the methods described below. When data are written to the memory, the order is reversed.
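The sub-bank numbering can be pictured with a short Python sketch. It assumes a 4 x 4 grid in which the upper two bits of the 4-bit number give the row and the lower two bits give the column; that mapping is inferred from the "XX00" to "XX11" numbering in Embodiment 2 rather than stated for Fig. 1, and all names are hypothetical.

```python
NUM_BANKS = 16          # sub-banks in this embodiment
BANKS_PER_ROW = 4       # assumed: one row is the group accessed together
BANK_WIDTH_BITS = 32    # data width of each sub-bank


def bank_label(bank_id: int) -> str:
    """Return the 4-bit binary label used in the text, e.g. 5 -> '0101'."""
    if not 0 <= bank_id < NUM_BANKS:
        raise ValueError("bank_id out of range")
    return format(bank_id, "04b")


def banks_in_row(row: int) -> list:
    """Labels of the sub-banks in one row (assumed: row = upper two label bits)."""
    if not 0 <= row < NUM_BANKS // BANKS_PER_ROW:
        raise ValueError("row out of range")
    return [bank_label(row * BANKS_PER_ROW + col) for col in range(BANKS_PER_ROW)]


if __name__ == "__main__":
    for row in range(NUM_BANKS // BANKS_PER_ROW):
        print(" ".join(banks_in_row(row)))
    print("max access width:", BANKS_PER_ROW * BANK_WIDTH_BITS, "bits")
```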
Embodiment 2:
Fig. 2 shows an embodiment in which the memory device accesses 128-bit data and 32-bit data. In this embodiment, when the data width of the memory access is 128 bits (202), the address signal SEL equals "11", which means that access to four sub-banks is allowed. Suppose bits 4 to 5 of the address signal (ADDR[5:4]) equal "XX" (each "X" can be 0 or 1). The sub-banks accessed are then the four numbered "XX00", "XX01", "XX10", and "XX11". These four sub-banks are addressed simultaneously according to the high-order portion of the address signal, and the corresponding 32-bit data (201) in each are accessed. In this way 128 bits of data are accessed at once. When the memory device is used as an instruction memory or instruction cache, one contiguous 128-bit VLIW instruction word can be read at a time and sent to the VLIW instruction decoder. When the memory device is used as a data memory or data cache, the data obtained in one access can be used by the processor as one 128-bit data item for subsequent operations, or as two 64-bit, four 32-bit, eight 16-bit, or sixteen 8-bit data items for the processor's SIMD data lanes.
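The 128-bit access can be sketched in Python as follows. Two details are assumptions rather than statements from the text: the "high-order portion" is taken to be the address bits above bit 5, and sub-bank "XX00" is taken to supply the least significant 32 bits of the combined word. The function names and the list-of-lists memory model are hypothetical.

```python
def decode_128(addr: int):
    """Return (sub-bank numbers, word index) for a 128-bit access."""
    row = (addr >> 4) & 0b11      # ADDR[5:4] gives the "XX" prefix
    word_index = addr >> 6        # assumed: high-order part is ADDR above bit 5
    bank_ids = [(row << 2) | col for col in range(4)]   # XX00, XX01, XX10, XX11
    return bank_ids, word_index


def read_128(banks, addr: int) -> int:
    """Read one 32-bit word from each of the four selected sub-banks and combine them."""
    bank_ids, word_index = decode_128(addr)
    value = 0
    for col, bank_id in enumerate(bank_ids):
        # Assumed ordering: sub-bank XX00 supplies the lowest 32 bits.
        value |= (banks[bank_id][word_index] & 0xFFFFFFFF) << (32 * col)
    return value


if __name__ == "__main__":
    # 16 sub-banks, 4 words each; every word encodes its own bank and index.
    memory = [[(b << 8) | w for w in range(4)] for b in range(16)]
    print(decode_128(0x50))
    print(hex(read_128(memory, 0x50)))
```

All four sub-banks receive the same word index, so a single cycle yields the full 128 bits without any sub-bank being accessed twice.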
When the memory access width is 32 bits (203), the address signal SEL equals "10", which means that only one sub-bank can be accessed. In this case bits 2 to 5 of the address signal, ADDR[5:2], give the number of the sub-bank to access. The high-order portion of the address signal addresses that sub-bank and accesses the corresponding 32-bit data (204), which implements the 32-bit access. When the memory device is used as an instruction memory or instruction cache, one contiguous 32-bit instruction word can be read at a time and sent to the instruction decoder. When the memory device is used as a data memory or data cache, the data obtained in one access can be used by the processor as one 32-bit data item for subsequent operations, or as two 16-bit or four 8-bit data items for the processor's SIMD data lanes.
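For the 32-bit case, the address decoding reduces to picking one sub-bank. The sketch below follows the text in using ADDR[5:2] as the sub-bank number. Treating the bits above bit 5 as the word index within the sub-bank is an assumption, and the names are hypothetical.

```python
def decode_32(addr: int):
    """Return (sub-bank number, word index) for a 32-bit access."""
    bank_id = (addr >> 2) & 0xF   # ADDR[5:2] is the sub-bank number
    word_index = addr >> 6        # assumed: high-order part is ADDR above bit 5
    return bank_id, word_index


def read_32(banks, addr: int) -> int:
    """Read the single 32-bit word selected by the address."""
    bank_id, word_index = decode_32(addr)
    return banks[bank_id][word_index] & 0xFFFFFFFF


if __name__ == "__main__":
    memory = [[(b << 8) | w for w in range(4)] for b in range(16)]
    print(decode_32(0x58))
    print(hex(read_32(memory, 0x58)))
```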
Embodiment 3:
Fig. 3 shows an embodiment in which the memory device accesses 16-bit data. In this embodiment, when the data width of the memory access is 16 bits (302), the address signal SEL equals "01", which means that only one sub-bank may be accessed. In this case bits 2 to 5 of the address signal, ADDR[5:2], give the number of the sub-bank to access. The high-order portion of the address signal addresses that sub-bank and locates the corresponding 32-bit data. Bit 1 of the address signal, ADDR[1], then determines whether the upper or the lower 16 bits of this 32-bit data are accessed (301), which implements the 16-bit access. When the memory device is used as a data memory or data cache, the data obtained in one access can be used by the processor as one 16-bit data item for subsequent operations, or as two 8-bit data items for the processor's SIMD data lanes.
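The halfword selection can be sketched in Python. The text says only that ADDR[1] chooses between the upper and lower 16 bits; mapping ADDR[1] = 1 to the upper half is an assumption, as is using the address bits above bit 5 as the word index. The function name is hypothetical.

```python
def read_16(banks, addr: int) -> int:
    """Read one 16-bit halfword from the 32-bit word selected by the address."""
    bank_id = (addr >> 2) & 0xF   # ADDR[5:2] is the sub-bank number
    word_index = addr >> 6        # assumed: high-order part is ADDR above bit 5
    word32 = banks[bank_id][word_index]
    half_select = (addr >> 1) & 1  # ADDR[1]
    shift = 16 * half_select       # assumed: 1 selects the upper 16 bits
    return (word32 >> shift) & 0xFFFF


if __name__ == "__main__":
    memory = [[0] * 2 for _ in range(16)]
    memory[3][1] = 0xAABBCCDD
    base = (1 << 6) | (3 << 2)    # word index 1, sub-bank 3
    print(hex(read_16(memory, base)))
    print(hex(read_16(memory, base | 0b10)))
```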
Fig. 4 shows an embodiment in which the memory device accesses 8-bit data. In this embodiment, 401 is the 128 bits of data in total at the interfaces of the four preliminarily selected sub-banks. When the data width of the memory access is 8 bits (402), the address signal SEL equals "00", which means that only one sub-bank may be accessed. In this case bits 2 to 5 of the address signal, ADDR[5:2], give the number of the sub-bank to access. The high-order portion of the address signal addresses that sub-bank and locates the corresponding 32-bit data. Bits 0 to 1 of the address signal, ADDR[1:0], then determine which 8 bits of this 32-bit data are accessed (403), which implements the 8-bit access.
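Taken together, the four SEL values described in the embodiments amount to a small decode table. The Python sketch below gathers them in one hypothetical `decode` function and adds the 8-bit read. The SEL encodings and address fields come from the text; the word index (bits above bit 5) and the byte and halfword ordering (offset 0 at the least significant bits) are assumptions.

```python
# SEL encodings as given in the embodiments.
SEL_WIDTH_BITS = {0b11: 128, 0b10: 32, 0b01: 16, 0b00: 8}


def decode(sel: int, addr: int):
    """Return (width, sub-bank numbers, word index, bit offset within the 32-bit word)."""
    width = SEL_WIDTH_BITS[sel]
    word_index = addr >> 6                       # assumed high-order portion
    if width == 128:
        row = (addr >> 4) & 0b11                 # ADDR[5:4]
        bank_ids = [(row << 2) | col for col in range(4)]
        bit_offset = 0
    else:
        bank_ids = [(addr >> 2) & 0xF]           # ADDR[5:2]
        if width == 32:
            bit_offset = 0
        elif width == 16:
            bit_offset = 16 * ((addr >> 1) & 1)  # ADDR[1], ordering assumed
        else:
            bit_offset = 8 * (addr & 0b11)       # ADDR[1:0], ordering assumed
    return width, bank_ids, word_index, bit_offset


def read_8(banks, addr: int) -> int:
    """Read the single byte selected by an 8-bit access (SEL = 00)."""
    _, bank_ids, word_index, bit_offset = decode(0b00, addr)
    return (banks[bank_ids[0]][word_index] >> bit_offset) & 0xFF


if __name__ == "__main__":
    memory = [[0] for _ in range(16)]
    memory[5][0] = 0x11223344
    for low in range(4):
        print(hex(read_8(memory, (5 << 2) | low)))
```

Because only SEL and a few low address bits change between widths, the same sub-bank array serves LOAD8, LOAD16, LOAD32 and full-width accesses without reconfiguration.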
The scope of rights claimed by the present invention is not limited to the above. The invention may have various other embodiments. Without departing from the spirit and essence of the invention, those skilled in the art can make various corresponding changes and modifications according to the invention, but all such changes and modifications shall fall within the protection scope of the appended claims.

Claims (1)

  1. A memory device applied to VLIW processors, for improving the efficiency with which multiple memory-access components in a VLIW processor access memory, characterized in that the memory device comprises multiple sub-banks of identical data width, the sub-banks are arranged in a two-dimensional array of rows and columns, and the maximum data access width supported by the memory device is the product of the number of sub-banks in each row and the data width of each sub-bank,
    when the data width of a memory access equals the maximum data access width supported by the memory device:
    multiple sub-banks are selected according to the high-order portion of the address signal, the selected sub-banks are addressed simultaneously according to the high-order portion of the address signal to access the corresponding data, and the accessed data are combined into data of the maximum data access width for use by the processor;
    when the data width of a memory access is less than the maximum data access width supported by the memory device:
    one sub-bank is selected according to the high-order portion of the address signal, and the selected sub-bank is addressed according to the low-order portion of the address signal to access the corresponding data for use by the processor;
    the memory device provides two working modes according to the combination of address signals:
    mode one: when the memory device is used as an instruction memory or instruction cache, each access reads one VLIW instruction word, and when the maximum data access width supported by the memory device is 128 bits, the VLIW instruction word is 128 bits long;
    mode two: when the memory device is used as a data memory or data cache, the data obtained in one access is used by the processor either as a single data word or as multiple data words for the processor's SIMD data lanes, and when the maximum data access width supported by the memory device is 128 bits, the data obtained in one access is used by the processor as one 128-bit data word, or as two 64-bit, four 32-bit, eight 16-bit, or sixteen 8-bit data words for the processor's SIMD data lanes.
CN201610157129.2A 2016-03-18 2016-03-18 A memory device applied to VLIW processors Active CN105843589B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610157129.2A CN105843589B (en) 2016-03-18 2016-03-18 A memory device applied to VLIW processors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610157129.2A CN105843589B (en) 2016-03-18 2016-03-18 A memory device applied to VLIW processors

Publications (2)

Publication Number Publication Date
CN105843589A CN105843589A (en) 2016-08-10
CN105843589B true CN105843589B (en) 2018-05-08

Family

ID=56587331

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610157129.2A Active CN105843589B (en) 2016-03-18 2016-03-18 A memory device applied to VLIW processors

Country Status (1)

Country Link
CN (1) CN105843589B (en)

Also Published As

Publication number Publication date
CN105843589A (en) 2016-08-10

Similar Documents

Publication Publication Date Title
CN110494851B (en) Reconfigurable parallel processing
US11113057B2 (en) Streaming engine with cache-like stream data storage and lifetime tracking
US11099933B2 (en) Streaming engine with error detection, correction and restart
CN111433758B (en) Programmable operation and control chip, design method and device thereof
US20230185649A1 (en) Streaming engine with deferred exception reporting
CN109643233B (en) Data processing apparatus having a stream engine with read and read/forward operand encoding
CN102750133B (en) 32-Bit triple-emission digital signal processor supporting SIMD
CN102541774B (en) Multi-grain parallel storage system and storage
JP2004157593A (en) Multiport integration cache
CN104252392A (en) Method for accessing data cache and processor
CN115904501A (en) Stream engine with multi-dimensional circular addressing selectable in each dimension
Haque et al. Dew: A fast level 1 cache simulation approach for embedded processors with fifo replacement policy
US5333291A (en) Stride enhancer for high speed memory accesses with line fetching mode and normal mode employing boundary crossing determination
CN104035898B (en) A kind of memory access system based on VLIW type processors
CN114297097B (en) Many cores can define distributed shared storage structure
JP6679570B2 (en) Data processing device
CN105843589B (en) A memory device applied to VLIW processors
CN102073480A (en) Method for simulating cores of multi-core processor by adopting time division multiplex
CN102508802A (en) Data writing method based on parallel random storages, data reading method based on same, data writing device based on same, data reading device based on same and system
Haque et al. Susesim: a fast simulation strategy to find optimal l1 cache configuration for embedded systems
Shang et al. LACS: A high-computational-efficiency accelerator for CNNs
CN104317554B (en) Device and method of reading and writing register file data for SIMD (Single Instruction Multiple Data) processor
CN101930356B (en) Method for group addressing and read-write controlling of register file for floating-point coprocessor
CN101930355A (en) Register circuit realizing grouping addressing and read write control method for register files
Yousefzadeh et al. Energy-efficient in-memory address calculation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200812

Address after: 401, building e, phase I, Daheng science and Technology Park, Xili street, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Shenzhen xinghaiwei Technology Co., Ltd

Address before: 200092 Shanghai City, Yangpu District Siping Road No. 1239

Patentee before: TONGJI University

TR01 Transfer of patent right

Effective date of registration: 20210209

Address after: 518000 room 604, building 13, songpingshan residential building, Shenbao Road, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Qiao Hongbo

Address before: 518055 Room 401, building e, phase I, dahen Science Park, Xili street, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: Shenzhen xinghaiwei Technology Co., Ltd

TR01 Transfer of patent right

Effective date of registration: 20210331

Address after: Room a8-09, 13 / F, block a, building J1, phase II, innovation industrial park, 2800 innovation Avenue, Hefei high tech Zone, China (Anhui) pilot Free Trade Zone, Hefei City, Anhui Province, 230088

Patentee after: Hefei Qianxin Technology Co.,Ltd.

Address before: 518000 room 604, building 13, songpingshan residential building, Shenbao Road, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: Qiao Hongbo