KR20160109733A

KR20160109733A - Storage apparatus and method for processing a plurality of client data

Info

Publication number: KR20160109733A
Application number: KR1020150034701A
Authority: KR
Inventors: 한성철; 고준원; 송인철
Original assignee: 삼성전자주식회사
Priority date: 2015-03-12
Filing date: 2015-03-12
Publication date: 2016-09-21
Also published as: KR102338872B1

Abstract

The present invention relates to a storage device and method for processing data from multiple clients. A storage device according to one embodiment of the present invention comprises: a first-stage storage unit that receives client data generated in burst units by each of multiple clients; a second-stage storage unit that receives the client data from the first-stage storage unit and stores it, burst by burst, in multiple memory banks shared by the clients; and a third-stage storage unit that receives each client's data from the second-stage storage unit and stores it in transaction units, a transaction being the transfer unit for data processing. The present invention provides an efficient data processing method and device for collecting and transmitting data for each client.

Description

TECHNICAL FIELD

The present invention relates to a storage device and method for processing data from a plurality of clients.

In a communication system, a transmitter, receiver, or transceiver (or terminal) must process many different kinds of client data. In such a data processing scheme, the amount of data to be transferred differs from client to client, and the time at which each client generates data is independent of the others. Typically, the data of different clients is stored in separate memories, and the memory size per client is not uniform. Consequently, when the existing scheme uses a memory configuration specialized for each client, the small memories among them degrade the overall area efficiency. A client-specific memory configuration also has the problem that the memory size grows in proportion to the amount of data the processor handles at one time. Moreover, even when the amount of data per client changes with the operating scenario, the per-client memories cannot be efficiently redistributed (reallocated), so the design is burdened with a larger total memory.

The present invention provides an efficient data processing method and device for collecting and transmitting data for each client.

The present invention also provides a data storage method and device that efficiently process per-client data using memories arranged in a multi-stage structure.

According to an embodiment of the present invention, a storage device for processing data from a plurality of clients includes: a first-stage storage unit that receives and stores the client data generated in burst units by each of the plurality of clients; a second-stage storage unit that receives the client data from the first-stage storage unit and stores it, in burst units, in a plurality of memory banks shared by the plurality of clients; and a third storage unit that receives each client's data from the second-stage storage unit and stores data in transaction units, a transaction being the transfer unit for data processing.

According to an embodiment of the present invention, a storage method for processing data from a plurality of clients includes: receiving the client data generated in burst units by each of the plurality of clients and storing it in per-client memories; receiving the client data from the per-client memories and storing it, in burst units, in a plurality of memory banks shared by the plurality of clients; and receiving each client's data stored in the memory banks and storing, in a destination memory, data in transaction units, a transaction being the transfer unit for data processing.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A shows one configuration example of a data storage device that processes data from a plurality of clients in a communication system.
FIG. 1B shows another configuration example of a data storage device that processes data from a plurality of clients in a communication system.
FIG. 2 illustrates a pattern of data generated by a particular client.
FIG. 3 shows one configuration example of a data storage device that processes data from a plurality of clients in a communication system according to an embodiment of the present invention.
FIG. 4 shows one configuration example of a burst memory operating as the second-stage storage unit in the storage device according to an embodiment of the present invention.
FIG. 5 shows an example of an access pattern of the burst memory in the storage device according to an embodiment of the present invention.
FIG. 6 illustrates the memory area required per bit for different memory sizes according to an embodiment of the present invention.
FIG. 7 shows an example of the interface between a client and the data mover in the storage device according to an embodiment of the present invention.
FIG. 8 is a timing diagram illustrating a client data transfer operation in the storage device according to an embodiment of the present invention.
FIG. 9 shows an example of client data stored in the destination memory in the storage device according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the following description of embodiments of the present invention, detailed descriptions of related well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention.

In a communication system, a transmitter, a receiver, or a transceiver (hereinafter referred to as a 'terminal') may be implemented with one or more processors that run software and one or more hardware blocks (or functional blocks) designed to perform specific operations. Information is exchanged between the hardware blocks and the processor (or a subsystem of the processor). Some of the results that a hardware block obtains from its data operations may be sent to the processor for processing, and the data sent this way may be of various kinds depending on the characteristics of the hardware block. Such data is generated periodically or aperiodically in the hardware block and must then be delivered to the processor within an appropriate time. In a communication system, examples include transmit/receive I (In-phase)/Q (Quadrature) data for calibration, cell searcher results, channel estimates of the receiver, adaptive filter weights of the equalizer, and channel decoding results.

In what follows, a plurality of hardware blocks operating simultaneously are referred to as a plurality of clients, and the different data generated by those hardware blocks are referred to as different client data. To aid understanding of the present invention, an existing data storage device and data processing method for handling different client data are first described with reference to FIGS. 1A and 1B. The embodiments below assume a storage device used in a terminal of a communication system, but a storage device according to an embodiment of the present invention can equally be applied to any system that must collect, transfer, and process data from a plurality of clients.

FIG. 1A shows one configuration example of a data storage device that processes data from a plurality of clients in a communication system; the configuration of FIG. 1A is a data storage device that uses a slave bus interface structure.

Referring to FIG. 1A, the client data C_0 to C_{N-1} generated by a plurality of clients (client 0 to client N-1) 110 is stored in the corresponding per-client memories 130_0 to 130_{N-1} in the storage unit 130. The per-client memories (memory 0 to memory N-1) 130_0 to 130_{N-1} have different sizes, depending on the bit width W_0 to W_{N-1} of the data C_0 to C_{N-1} generated by the corresponding client and on the amount of data generated per unit time. A processor (not shown) can read, through the slave bus interface 150, the data that is output from the per-client memories 130_0 to 130_{N-1} and multiplexed by the multiplexer 135. In FIG. 1B, reference symbol S1 denotes the memory selection used to read data from the per-client memories 130_0 to 130_{N-1} in the slave bus interface structure.

FIG. 1B shows another configuration example of a data storage device that processes data from a plurality of clients in a communication system; the configuration of FIG. 1B is a data storage device that uses a DMA (Direct Memory Access) bus interface structure, in which data input and output are performed by means of interrupts.

Referring to FIG. 1B, the client data C_0 to C_{N-1} generated by a plurality of clients (client 0 to client N-1) 110 is stored in the corresponding per-client memories 130_0 to 130_{N-1} in the storage unit 130. The per-client memories (memory 0 to memory N-1) 130_0 to 130_{N-1} have different sizes, depending on the amount of each client's data C_0 to C_{N-1}. The data output from the per-client memories 130_0 to 130_{N-1} and multiplexed by the multiplexer 135 is delivered to the processor core 180, such as a DSP (Digital Signal Processor) or CPU (Central Processing Unit), by way of a data transfer module 160, which raises an interrupt to the processor core 180, and a RAM (Random Access Memory) 170.

In the configurations of FIGS. 1A and 1B, the per-client memories 130_0 to 130_{N-1} in the storage unit 130 are implemented as on-chip memory, for example flip-flops or SRAM (Static Random Access Memory), which raises the following problems. Each client memory among the per-client memories 130_0 to 130_{N-1} stores the amount of data that software processes at one time (hereinafter 'one transaction'). Flip-flops have the largest memory area per bit, and SRAM also has a relatively large area per bit, so the silicon area of a client memory grows substantially with the transaction size being processed.

In addition, an SRAM used as a client memory contains not only the cell matrix that stores the actual information but also control logic that decodes the address for reading and writing data and selects the cell to be accessed. The cell matrix grows in proportion to the number of bits stored, whereas the control logic grows more slowly than the cell matrix. The smaller the memory capacity, therefore, the more silicon area each bit occupies and the lower the area efficiency of the memory. In the existing implementation, the amount of data differs from client to client, so client memory sizes vary as well; when some client memories are very small, the increased silicon area per bit reduces the overall area efficiency. Furthermore, because each client memory is statically mapped to a specific client, the client memories cannot be efficiently reallocated according to the operating conditions. Finally, when the amount of data generated by each hardware block (that is, each client) varies across operating scenarios, each client memory must be sized for its largest data requirement over all scenarios, which increases the total memory area required.
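The area argument above can be made concrete with a toy model. The sketch below is only an illustration: the cost function, its constants, and the client memory sizes are assumptions and do not come from the patent. It assumes that the cell matrix grows linearly with capacity while the decoder and control logic grow roughly with the square root of capacity.

```python
import math

# Hypothetical area model (arbitrary units), not taken from the patent:
# the cell matrix grows linearly with capacity, while the address decoder
# and control logic grow roughly with the square root of the capacity.
def sram_area(bits, cell_area=1.0, ctrl_coeff=40.0):
    return cell_area * bits + ctrl_coeff * math.sqrt(bits)

def area_per_bit(bits):
    return sram_area(bits) / bits

# Illustrative per-client memory sizes in bits (assumed values).
client_sizes = [256, 512, 1024, 4096]

# Total area when every client has its own memory versus one pooled memory.
separate_area = sum(sram_area(b) for b in client_sizes)
pooled_area = sram_area(sum(client_sizes))

for bits in client_sizes:
    print(f"{bits:5d} bits: {area_per_bit(bits):.2f} area units per bit")
print(f"separate: {separate_area:.0f}, pooled: {pooled_area:.0f}")
```

Under this assumed model the smallest memory pays the most area per bit, and one pooled memory of the same total capacity is smaller than the sum of the separate memories.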

An embodiment of the present invention proposes a storage device that uses memories in a multi-stage structure and exploits the traffic characteristics of the client data sent by a plurality of clients to further reduce the memory area. The per-client data can be understood as the different kinds of data processed in a terminal or other device that uses the storage device.

The storage device according to an embodiment of the present invention stores most of the data that the processor handles at one time in memory with a relatively low implementation cost, and limits the use of memory with a high implementation cost to quickly storing the amount of data generated at one time. The storage device also reduces the use of small memories, which have poor area efficiency, in favor of memories above a certain capacity, thereby improving the overall area efficiency of the memory.

Furthermore, when the memory requirement of each client changes with the operating scenario, the storage device according to an embodiment of the present invention can easily change the amount of memory allocated to each client through a configuration register.

FIG. 2 illustrates a pattern of data generated by a particular client. Referring to FIG. 2, after transmitting data the client raises an interrupt 205, 207 to request processing by the processor. The processor gathers the data of one transaction 201, 203 and processes it at one time, but the data is actually generated in several small bursts 211, 213. An embodiment of the present invention therefore proposes first storing one burst per client in an intermediate memory and storing the complete data of one transaction in a destination memory with a relatively low implementation cost. The destination memory may be an external DRAM (Dynamic Random Access Memory) or an SRAM, and because each instance is large its area efficiency is good. In an embodiment of the present invention, a burst can be understood as the unit in which one client generates data, and a transaction as the unit in which data is transferred for processing by the processor (for example, delivered to software to perform a task).
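To see why buffering one burst instead of one transaction saves costly memory, the following sketch compares the two sizing rules. The client names, burst sizes, and burst counts are hypothetical values chosen only for illustration, and the sketch assumes that every burst of a client has the same size.

```python
from dataclasses import dataclass

@dataclass
class ClientTraffic:
    name: str
    burst_bits: int               # bits produced in one burst
    bursts_per_transaction: int   # bursts the processor handles together

    @property
    def transaction_bits(self) -> int:
        return self.burst_bits * self.bursts_per_transaction

# Hypothetical traffic figures, chosen only for illustration.
clients = [
    ClientTraffic("iq_calibration", burst_bits=2048, bursts_per_transaction=16),
    ClientTraffic("cell_search", burst_bits=512, bursts_per_transaction=8),
    ClientTraffic("channel_estimate", burst_bits=1024, bursts_per_transaction=4),
]

# Conventional scheme: each on-chip client memory holds a whole transaction.
per_transaction_on_chip = sum(c.transaction_bits for c in clients)
# Proposed scheme: the intermediate memory holds only one burst per client.
per_burst_on_chip = sum(c.burst_bits for c in clients)

print(per_transaction_on_chip, per_burst_on_chip)
```

With these assumed figures the intermediate memory needs less than a tenth of the capacity, and the rest of each transaction lives in the cheaper destination memory.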

FIG. 3 shows one configuration example of a data storage device that processes data from a plurality of clients in a communication system according to an embodiment of the present invention; the storage device of FIG. 3 is implemented with memories in a multi-stage structure.

The storage device of FIG. 3 includes a data mover 330 that receives and temporarily stores the different kinds of client data C_0 to C_{N-1} generated by a plurality of clients (client 0 to client N-1) 310, which correspond to a plurality of hardware blocks in the terminal, and then outputs the data to a destination memory 350. The data mover 330 includes a first-stage storage unit implemented as per-client FIFO (First In First Out) memories 331, and a second-stage storage unit implemented as a burst memory 335. The burst memory 335 stores the client data C_0 to C_{N-1} delivered from the first-stage storage unit, one burst per client, and outputs each stored burst to the destination memory 350 before the next burst arrives. The destination memory 350, which may be an external memory or the like, serves as the third-stage storage unit. A storage device built from multi-stage memories to process many different kinds of client data according to an embodiment of the present invention can thus be implemented with the first-stage to third-stage storage units. The different kinds of client data C_0 to C_{N-1} input to the FIFO memories 331 may differ in data amount. Although the embodiment above stores the client data from the first-stage storage unit in the burst memory 335 one burst at a time, this is only an example and the present invention is not limited to it; the burst memory 335 may also store client data in units of two or more bursts.

The operation of the data mover 330 in FIG. 3 is as follows. The input FIFO memories 331, which operate as the first-stage storage unit, receive N different client data C_0 to C_{N-1} from the N clients 310 and first store each in the input FIFO memory allocated to that client. The data held in each per-client input FIFO memory is formatted 333 in a predetermined manner and then written to the burst memory 335, which operates as the second-stage storage unit. When the data width W_0 to W_{N-1} of a client's data is smaller than the memory bank width W_B of the burst memory 335, the formatting 333 gathers the incoming data and repacks it to the bank width W_B so that the memory is used efficiently.
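The formatting step can be pictured as packing several narrow client words into one bank-width word. The sketch below is a minimal illustration under stated assumptions: the function names `pack_words` and `unpack_words` are hypothetical, the bank width is assumed to be a multiple of the client width, and the bit ordering (first word in the least significant bits) is an arbitrary choice that the patent does not specify.

```python
def pack_words(words, word_width, bank_width):
    """Pack narrow client words into bank-width words (first word in the low bits)."""
    if bank_width % word_width != 0:
        raise ValueError("this sketch assumes bank_width is a multiple of word_width")
    per_bank = bank_width // word_width
    mask = (1 << word_width) - 1
    packed = []
    for i in range(0, len(words), per_bank):
        value = 0
        for j, word in enumerate(words[i:i + per_bank]):
            value |= (word & mask) << (j * word_width)
        packed.append(value)
    return packed

def unpack_words(packed, word_width, bank_width, count):
    """Inverse of pack_words: recover `count` client words."""
    per_bank = bank_width // word_width
    mask = (1 << word_width) - 1
    words = []
    for value in packed:
        for j in range(per_bank):
            words.append((value >> (j * word_width)) & mask)
    return words[:count]

# Five 8-bit words packed into 32-bit bank words (W_B = 32 is an assumed value).
samples = [0x01, 0x02, 0x03, 0x04, 0x05]
banked = pack_words(samples, word_width=8, bank_width=32)
restored = unpack_words(banked, word_width=8, bank_width=32, count=len(samples))
```

Here five 8-bit samples occupy two 32-bit bank words instead of five, and the inverse function plays the role of a de-formatting step on the way out.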

In an embodiment of the present invention, the burst memory 335 stores one burst per client, so the amount of data it must hold is far smaller than if it stored an entire transaction. Because the burst memory 335 uses relatively expensive memory, the embodiment stores data in bursts, the unit in which a client generates data. Once a burst has been completely stored, the burst memory 335 outputs it to the third-stage storage unit before the next burst arrives and overwrites the existing data. Each client's data stored in the burst memory 335, the second-stage storage unit, is transferred to the third-stage storage unit during the processing time between adjacent bursts. Because the third-stage storage unit uses relatively inexpensive memory, it can store data in transactions, the unit in which data is transferred for processing by the processor (for example, the unit in which data is delivered to software through the processor). Software can then process one transaction of data at a time.
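The burst-then-drain behavior for a single client can be sketched as a small simulation. This is a simplified model, not the patented circuit: the class `DataMover` and its methods are hypothetical names, the burst slot stands in for the client's region of the burst memory, a Python list stands in for the destination memory, and the interrupt is assumed to be raised after the last burst of each transaction.

```python
class DataMover:
    """Toy model of one client's path: burst slot, then destination memory."""

    def __init__(self, bursts_per_transaction):
        self.bursts_per_transaction = bursts_per_transaction
        self.burst_slot = None    # second stage: holds at most one burst
        self.destination = []     # third stage: accumulates whole transactions
        self.interrupts = 0       # interrupts raised toward the processor
        self._drained = 0

    def write_burst(self, burst):
        # A new burst must not arrive before the previous one was drained.
        if self.burst_slot is not None:
            raise RuntimeError("burst overwritten before it was drained")
        self.burst_slot = list(burst)

    def drain(self):
        # Runs in the gap between adjacent bursts.
        if self.burst_slot is None:
            return
        self.destination.extend(self.burst_slot)
        self.burst_slot = None
        self._drained += 1
        if self._drained % self.bursts_per_transaction == 0:
            self.interrupts += 1  # one transaction is now complete

mover = DataMover(bursts_per_transaction=3)
for burst in ([1, 2], [3, 4], [5, 6]):
    mover.write_burst(burst)
    mover.drain()
```

After the three bursts, the destination list holds the complete transaction and one interrupt has been counted, while the burst slot never held more than one burst.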

According to the embodiment described above, when client data is stored in the high-cost burst memory, only the amount of one burst, generated at one time, needs to be stored rather than the full data of a transaction, the unit of one transfer. Transferring that burst to low-cost memory during the processing time between adjacent bursts improves memory utilization relative to cost.

If the interval between the current burst and the next is so short that there is not enough processing time to transfer the data to the third-stage storage unit, the shortage can be resolved by double buffering, in which the second-stage storage unit holds, for example, two or more bursts for the affected client data.
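The effect of double buffering can be shown with a queue whose depth is the number of bursts reserved for one client. The sketch below is illustrative only: `BurstSlots`, `run`, and the event list are hypothetical, and the events model two bursts that arrive back to back before any drain can take place.

```python
from collections import deque

class BurstSlots:
    """Burst storage for one client with room for `depth` bursts."""

    def __init__(self, depth):
        self.depth = depth
        self.pending = deque()

    def write(self, burst):
        if len(self.pending) == self.depth:
            raise OverflowError("a stored burst would be overwritten")
        self.pending.append(burst)

    def drain(self):
        return self.pending.popleft() if self.pending else None

def run(depth, events):
    """Replay burst and drain events; return the bursts that reached the third stage."""
    slots, delivered = BurstSlots(depth), []
    for kind, value in events:
        if kind == "burst":
            slots.write(value)
        else:  # "drain"
            burst = slots.drain()
            if burst is not None:
                delivered.append(burst)
    return delivered

# Two bursts arrive back to back, before the first one can be drained.
events = [("burst", "b0"), ("burst", "b1"), ("drain", None), ("drain", None)]
delivered_with_double_buffer = run(2, events)
```

With a depth of 2 both bursts are delivered in order; replaying the same events with a depth of 1 raises `OverflowError`, which corresponds to the overwrite that double buffering is meant to avoid.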

The storage device of FIG. 3 may also include a configuration register R1 that stores the settings for processing each client's data. The configuration register R1 may store at least one of the rules for processing each client's data, for example the start address that the client's data occupies in the burst memory 335, the address in the destination memory 350 where the client's data is finally stored, and the de-formatting 337 scheme corresponding to the formatting 333 scheme. In FIG. 3, the output FIFO 336 temporarily stores data read from the burst memory 335, and each client's data output from the output FIFO 336 passes through de-formatting 337 and is delivered to the third-stage storage unit through the eXDMAC (eXternal DMA Controller) 339, a DMA controller. Each client can also use the control information it passes to the data mover 330 to request that an interrupt be raised on a specific burst, and the storage device of FIG. 3 may include an interrupt generator I1 to handle such interrupts.

The third-stage storage unit, the destination memory 350 in FIG. 3, is the final destination of the transferred data and is an address space accessible to the processor. It may be, for example, a DRAM outside the chip or the large-capacity data memory of a particular processor inside the chip, and it receives data through the eXDMAC 339, a DMA controller. The third-stage storage unit stores one transaction per client for processing by the processor, and because its implementation cost per bit is lower than that of the first-stage or second-stage storage unit, the overall implementation cost is reduced. If processing by the processor is delayed, so that data of the next transaction might arrive and overwrite data before the current transaction has been processed, double buffering can be performed by reserving space for two or more transactions in the third-stage storage unit. The operations of the storage device including the first-stage to third-stage storage units may be controlled by a control unit (not shown).

FIG. 4 shows one configuration example of a burst memory operating as the second-stage storage unit in the storage device according to an embodiment of the present invention.

Referring to FIG. 4, the burst memory 335 includes an input crossbar 3351, a plurality of memory banks (bank 1 to bank M-1) 3353, and an output multiplexer 3355. N input ports are connected to the M memory banks 3353 through the input crossbar 3351, where N < M; that is, there are more memory banks than input ports. In this embodiment each memory bank is, for example, W_B bits wide. Each input port is controlled so that it is connected to one particular memory bank at any instant, and the output port selects one of the M memory banks at any instant and outputs its data.

In this embodiment, any region of the burst memory 335 can be allocated to a client, and the start address of each client within the memory banks 3353 can be set in the configuration register R1. This setting can be changed while the terminal is operating, so the allocation can be reconfigured for each operating scenario. The total size of the burst memory 335 can therefore be designed for the operating scenario in which the sum of one burst from every client is largest, and in other scenarios the burst memory 335 can be dynamically redistributed among the clients through the configuration register R1. This allocation scheme reduces the required memory area while still reserving space for the maximum burst size of each client.
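The sizing rule in this paragraph can be checked with a small calculation. The sketch below uses made-up scenario names and burst sizes (in bank-width words) and a hypothetical `allocate` helper that lays out each client's region back to back, which is one simple way to derive the start addresses a configuration register would hold.

```python
def allocate(burst_words):
    """Lay client bursts out back to back; return (start-address table, words used)."""
    table, next_free = {}, 0
    for client, size in burst_words.items():
        table[client] = next_free
        next_free += size
    return table, next_free

# Hypothetical burst sizes (in bank-width words) per client and scenario.
scenarios = {
    "scenario_a": {"c0": 64, "c1": 16, "c2": 32},
    "scenario_b": {"c0": 16, "c1": 64, "c2": 16},
}
client_names = ["c0", "c1", "c2"]

# Shared burst memory: sized for the scenario with the largest total.
shared_size = max(sum(sizes.values()) for sizes in scenarios.values())
# Dedicated memories: each client sized for its own worst case.
dedicated_size = sum(
    max(sizes[name] for sizes in scenarios.values()) for name in client_names
)

# Start addresses to program when switching to scenario_b.
table_b, used_b = allocate(scenarios["scenario_b"])
```

In this assumed example the shared memory needs 112 words where dedicated per-client memories would need 160, and switching scenarios only changes the start-address table.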

FIG. 5 is a diagram illustrating an example of an access pattern of the burst memory in a storage device according to an embodiment of the present invention.

Referring to FIG. 5, when a client accesses the burst memory 335, it accesses all of the memory banks 3353 sequentially, for example in a pattern that stripes across the banks in the horizontal direction. In this case one client accesses any given memory bank only once every M cycles, so if the input crossbar 3351 is scheduled so that no two clients access the same memory bank at the same moment, up to M clients can access the memory banks simultaneously without colliding with one another. Because data stored in the burst memory 335 is read out one client at a time, the number N of clients writing data to the burst memory 335 must be smaller than the number M of memory banks.
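The collision-free property of the striped access pattern can be checked with a short sketch. The mapping `(client + cycle) % num_banks`, in which each client starts at a different bank and advances one bank per cycle, is an assumption made for illustration; the text does not specify the exact phase assigned to each client.

```python
def bank_for(client, cycle, num_banks):
    """Bank accessed by `client` at `cycle` under an assumed striped schedule."""
    return (client + cycle) % num_banks


def is_collision_free(num_clients, num_banks, cycles):
    """True if no two clients hit the same bank in any of the given cycles."""
    for t in range(cycles):
        banks = [bank_for(c, t, num_banks) for c in range(num_clients)]
        if len(set(banks)) != len(banks):
            return False
    return True


def revisit_period(client, bank, num_banks):
    """Number of cycles between two consecutive accesses of `client` to `bank`."""
    hits = [t for t in range(3 * num_banks)
            if bank_for(client, t, num_banks) == bank]
    return hits[1] - hits[0]
```

Under this assumed mapping, any number of clients up to the number of banks can proceed in parallel, and each client returns to a given bank once every M cycles, which matches the behaviour described above.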

<Table 1> below shows an example of per-client allocation at the input crossbar 3351 according to this embodiment, namely an example of how memory banks are allocated over time when the number of clients is 8.

<Table 1>

As shown in <Table 1>, if eight consecutive clock cycles are defined as eight time slots and allocated in a round-robin manner, collision-free scheduling is obtained in a simple way. In this case, when a given client wants to start a data transfer, it may have to wait up to 7 clocks for its allocated time slot, so an input buffer that can hold the data in the meantime (that is, the input FIFO in FIG. 3) is required for each client. The per-client input FIFO, which is the first-stage storage unit, stores the data generated irregularly by each client without loss and then transfers it to the burst memory 335 in consecutive cycles.
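The worst-case wait of 7 clocks follows directly from the eight-slot round-robin frame, as the following sketch shows. The function name and the convention that slot numbers equal the clock cycle modulo 8 are assumptions for illustration; the actual slot assignment is given by <Table 1>.

```python
NUM_SLOTS = 8  # eight consecutive clock cycles form one round-robin frame


def wait_cycles(request_cycle, assigned_slot, num_slots=NUM_SLOTS):
    """Clocks a client waits from `request_cycle` until its assigned slot comes up."""
    return (assigned_slot - request_cycle) % num_slots


# Worst case over every combination of request time and assigned slot.
worst_case = max(
    wait_cycles(t, s) for t in range(NUM_SLOTS) for s in range(NUM_SLOTS)
)
```

A client whose slot has just passed waits the longest. If the client keeps producing one word per clock during that time, its input FIFO has to absorb the words produced over those cycles, which is why a small per-client buffer is sufficient.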

FIG. 6 is a diagram illustrating, for each memory size, the memory area required per bit (unit: gate count) according to an embodiment of the present invention. Here, one gate denotes the area of a two-input NAND gate.

Referring to FIG. 6, the smallest memory (Width 4, Depth 32) has an area of 3.83 gate/bit, and as the memory size increases the area per bit decreases, converging to 0.20 gate/bit. In the example of FIG. 6, the area per bit therefore differs by up to a factor of 19 between memories of different sizes. It can also be seen that once the memory size exceeds a certain size, the area per bit drops sharply to about 0.2 to 0.3 gate/bit. Accordingly, configuring the memory banks with a uniform memory size is more advantageous for reducing memory area than implementing memory banks with a wide distribution of sizes, as in the examples of FIGS. 1A and 1B. In this embodiment, therefore, the memory banks are configured with a uniform memory size regardless of the amount of data generated by each client.
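The factor of 19 quoted above is simply the ratio of the two per-bit densities given for FIG. 6. The short calculation below reproduces it and, purely as an illustration, compares the gate count of the 128 bits of a Width 4, Depth 32 memory at the two densities. The helper name is hypothetical.

```python
SMALL_GATES_PER_BIT = 3.83  # smallest memory in FIG. 6 (Width 4, Depth 32)
LARGE_GATES_PER_BIT = 0.20  # value the area converges to for large memories


def area_in_gates(total_bits, gates_per_bit):
    """Area in two-input NAND gate equivalents for a memory of `total_bits`."""
    return total_bits * gates_per_bit


ratio = SMALL_GATES_PER_BIT / LARGE_GATES_PER_BIT

bits = 4 * 32  # Width 4, Depth 32
area_as_small_memory = area_in_gates(bits, SMALL_GATES_PER_BIT)
area_inside_large_memory = area_in_gates(bits, LARGE_GATES_PER_BIT)
```

The same 128 bits cost roughly 490 gates as a standalone small memory but only about 26 gates when they are part of a large memory, which is the motivation for pooling client data in uniformly sized banks.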

Meanwhile, in an embodiment of the present invention, each client can transmit additional information in addition to the client data to be transferred to the destination memory 350.

FIG. 7 is a diagram illustrating an example of an interface between a client and the data transfer unit in a storage device according to an embodiment of the present invention.

The example of FIG. 7 shows control information (ControlInfo) and an interrupt request signal (InterruptReq) being delivered, in addition to W_i-bit client data (ClientData), from client i 310 to the data transfer unit 330 through the interface.

<Table 2> below shows an example of the control information.

<Table 2>

FIG. 8 is a timing diagram for explaining a client data transfer operation in a storage device according to an embodiment of the present invention; it shows an example of transferring client data using the control information of <Table 2>. For example, the case where ControlInfo[1:0] is "10" in <Table 2> corresponds to ControlInfo[1] being "1" and ControlInfo[0] being "0" in FIG. 8.

Describing the operation of FIG. 8 with reference to FIG. 7, reference numeral 801 denotes the clock, 803 and 805 denote the control information, 807 denotes the interrupt request signal, and 809 denotes the client data. Client i 310 holds the control information (ControlInfo) at "00" while idle. When starting the first transfer, client i 310 sends ControlInfo[0] 803 as, for example, "1" and sends the head address at which data storage is to start in the destination memory 350. Thereafter, when transmitting client data 809, client i 310 sends ControlInfo[1] 805 as "1" to inform the data transfer unit 330 that the client data 809 is valid. After one burst has been transferred, client i 310 sends the control information "11". On receiving the control information "11", the data transfer unit 330 confirms that this is the end of the burst and then activates the eXDMAC 339, which is the DMA controller, to start transferring the data to the destination memory 350. If client i 310 sends the interrupt request signal (InterruptReq) 807 as "1" at the end of a particular burst, the data transfer unit 330 generates an interrupt after the transfer of that burst is complete, notifying the processor that the data transfer of that client has finished. In addition, T_end_to_start in FIG. 8 denotes the time from one burst to the next burst.
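The handshake described above can be sketched as a small decoder on the data transfer unit side. Because <Table 2> is not reproduced here, the encoding used below is an assumption inferred from the description: "00" idle, "01" head address, "10" valid data, "11" end of burst. It is also assumed that the head address arrives on the client data lines and that InterruptReq is sampled in the end-of-burst cycle. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed ControlInfo[1:0] encoding (Table 2 itself is not available).
IDLE, HEAD, DATA, END = 0b00, 0b01, 0b10, 0b11


@dataclass
class Burst:
    head_address: int
    words: list = field(default_factory=list)
    interrupt: bool = False


def decode(samples):
    """Group per-clock (control_info, client_data, interrupt_req) samples into bursts."""
    bursts, current = [], None
    for ctrl, data, irq in samples:
        if ctrl == HEAD:
            # Start of a burst: the data lines are assumed to carry the head address.
            current = Burst(head_address=data)
        elif ctrl == DATA and current is not None:
            current.words.append(data)
        elif ctrl == END and current is not None:
            # End of burst: this is where the DMA transfer would be started.
            current.interrupt = bool(irq)
            bursts.append(current)
            current = None
    return bursts


samples = [
    (IDLE, 0, 0),
    (HEAD, 0x40, 0),
    (DATA, 10, 0),
    (DATA, 11, 0),
    (END, 0, 1),
    (IDLE, 0, 0),
]
decoded = decode(samples)
```

In this sketch a completed `Burst` stands in for the point at which the data transfer unit would hand the burst to the DMA controller, and the `interrupt` flag records whether the processor should be notified afterwards.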

FIG. 9 is a diagram illustrating an example of client data stored in the destination memory in a storage device according to an embodiment of the present invention.

Referring to FIG. 9, two base addresses (BA0, BA1) can be specified for each client in the setting register R1 (see FIG. 3), and if the two base addresses differ, the destination memory 350 can perform double buffering. Each burst is actually stored starting from the location obtained by adding the head address (HA0, HA1, ...) transmitted by the client to the base address, and at this time the data storage unit 350 can store additional information to make it easier for the processor to process the data. In FIG. 9, "n+1" denotes a transaction number. The transaction number is used to check, without using an interrupt, whether the data of a new transaction has been completely transferred. It is assumed that the transaction number is incremented by 1 each time an interrupt is generated for the client. Instead of performing interrupt handling, the processor can determine whether all of the data has been transferred by checking whether the transaction number has been updated, which shortens the processing time.
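The address calculation and the transaction-number check described above can be sketched as follows. The choice to alternate between BA0 and BA1 on every completed transaction is an assumption made for illustration (the text says only that two different base addresses enable double buffering), and the class and function names are hypothetical.

```python
class DestinationWriter:
    """Toy model of writes into the destination memory for a single client."""

    def __init__(self, ba0, ba1):
        self.bases = (ba0, ba1)
        self.transaction_number = 0
        self.memory = {}

    def write_burst(self, head_address, words, interrupt_req=False):
        # Assumed policy: the active buffer alternates with each transaction.
        base = self.bases[self.transaction_number % 2]
        start = base + head_address  # base address plus client-supplied head address
        for offset, word in enumerate(words):
            self.memory[start + offset] = word
        if interrupt_req:
            # The transaction number increases by 1 whenever an interrupt is generated.
            self.transaction_number += 1
        return start


def transaction_done(writer, last_seen):
    """Processor-side poll: has the transaction number changed since `last_seen`?"""
    return writer.transaction_number != last_seen


writer = DestinationWriter(ba0=0x1000, ba1=0x2000)
first = writer.write_burst(0x10, [1, 2])
second = writer.write_burst(0x20, [3], interrupt_req=True)
third = writer.write_burst(0x10, [9])
```

Polling `transaction_done` corresponds to the processor reading the stored transaction number instead of servicing an interrupt. While the processor works on the buffer at one base address, new bursts land in the other.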

The storage device using multi-stage memories (that is, hierarchically structured memories) according to the embodiment of the present invention described above is implemented with first-stage to third-stage storage units connected in sequence. The first-stage storage unit may be implemented as FIFO memories, built from small flip-flops or memories for each client, in order to store the data of each independently operating client. With the first-stage storage unit, data can be held without loss during the waiting time until writing to the second-stage storage unit can begin.

The second-stage storage unit stores one or more of the bursts that each client generates in concentrated fashion, and may be implemented as a burst memory including a plurality of memory banks that can be shared by a plurality of clients in order to reduce the implementation cost of the storage device. The plurality of memory banks have a uniform memory size so as to reduce memory area, allow client data delivered from the first-stage storage unit to be read simultaneously and written simultaneously to the plurality of memory banks, and avoid collisions when previously stored data is delivered to the third-stage storage unit. In addition, the second-stage storage unit can dynamically vary the per-client memory allocation by changing, through a software setting (for example, through a setting register), the location at which each client's data is stored in the burst memory and the amount of data. Using this configuration of the second-stage storage unit, the memory area can be reduced. The traffic characteristics of the data generated by each client can also be exploited to increase the area efficiency of the storage device.

The third-stage storage unit stores data in transaction units, which are the processing unit of the processor, and may store one or more transactions' worth of data for each client. The third-stage storage unit may use a relatively low-cost external memory or an on-chip memory (for example, a memory built into the processor), and, in storing each client's data at a specific location in the destination memory, may perform double buffering using two or more base addresses. Additional information can also be transferred automatically to a specific location in the third-stage storage unit to facilitate data processing by the processor. The additional information may be any information that helps with data processing, such as the transaction number.

According to the embodiment of the present invention described above, in a communication system in which data generated by a plurality of hardware blocks must be transferred to a specific memory, it is possible to provide a storage device and method using shared multi-stage memories that can efficiently process the data of all clients.

In addition, according to the embodiment of the present invention described above, a memory access scheme can be provided for a storage device that uses multi-stage memories. With this configuration, the use of small memories is reduced and large volumes of data are held in memories of at least a certain size, which improves the overall area efficiency of the memory. Furthermore, when the amount of data generated by each client changes with the operating scenario, the amount of memory allocated to each client can be reallocated efficiently.

Claims

1. A storage device for processing a plurality of client data, comprising:
a first-stage storage unit for receiving and storing the plurality of client data generated in burst units from a plurality of clients;
a second-stage storage unit for receiving the plurality of client data from the first-stage storage unit and storing the plurality of client data, in the burst units, in a plurality of memory banks shared by the plurality of clients; and
a third-stage storage unit for receiving each client data from the second-stage storage unit and storing data in a transaction unit, which is a transfer unit for data processing.

2. The storage device according to claim 1,
wherein the plurality of clients correspond to a plurality of processors operating independently, and the plurality of client data includes different kinds of data.

3. The storage device according to claim 1,
wherein the first-stage storage unit includes a plurality of client-specific memories each storing the plurality of client data.

4. The storage device according to claim 1,
wherein the first-stage storage unit stores the plurality of client data until writing to the second-stage storage unit can begin.

5. The storage device according to claim 1,
wherein the plurality of memory banks have a uniform memory size.

6. The storage device according to claim 1,
wherein the plurality of client data is written simultaneously to the plurality of memory banks in the burst units.

7. The storage device according to claim 1,
wherein the number of the plurality of memory banks is greater than the number of the plurality of clients.

8. The storage device according to claim 1,
wherein the location at which the plurality of client data is stored in the plurality of memory banks and the amount of data can be changed by a software setting.

9. The storage device according to claim 8,
wherein the second-stage storage unit includes a setting register, and
wherein the software setting can be changed using the setting register.

10. The storage device according to claim 1,
wherein the client-specific memory allocation for the plurality of memory banks is dynamically variable.

11. The storage device according to claim 1,
wherein the third-stage storage unit uses an external memory or an on-chip memory included in a processor.

12. The storage device according to claim 1,
wherein the third-stage storage unit performs double buffering using a plurality of base addresses when storing each client data.

13. The storage device according to claim 1,
wherein additional information transferred to a specific address of the third-stage storage unit is used for data processing by a processor.

14. The storage device according to claim 1,
wherein each client data stored in the second-stage storage unit is transferred to the third-stage storage unit during a processing time between adjacent bursts.

15. A storage method for processing a plurality of client data, comprising:
receiving the plurality of client data generated in burst units from a plurality of clients and storing the received data in client-specific memories;
receiving the plurality of client data from the client-specific memories and storing the plurality of client data, in the burst units, in a plurality of memory banks shared by the plurality of clients; and
receiving each client data stored in the plurality of memory banks and storing, in a destination memory, data in a transaction unit, which is a transfer unit for data processing.