KR102338872B1

KR102338872B1 - Storage apparatus and method for processing a plurality of client data

Info

Publication number: KR102338872B1
Application number: KR1020150034701A
Authority: KR
Inventors: 한성철; 고준원; 송인철
Original assignee: 삼성전자주식회사
Priority date: 2015-03-12
Filing date: 2015-03-12
Publication date: 2021-12-13
Also published as: KR20160109733A

Abstract

The present invention relates to a storage apparatus and method for processing a plurality of client data. A storage apparatus according to an embodiment of the present invention includes: a first-stage storage unit that receives and stores the plurality of client data, each generated in burst units by a plurality of clients; a second-stage storage unit that receives the plurality of client data from the first-stage storage unit and stores it, in burst units, in a plurality of memory banks shared by the plurality of clients; and a third storage unit that receives each client's data from the second-stage storage unit and stores data in transaction units, the transaction being the transfer unit for data processing.

Description

STORAGE APPARATUS AND METHOD FOR PROCESSING A PLURALITY OF CLIENT DATA

The present invention relates to a storage apparatus and method for processing a plurality of client data.

In a communication system, a data processing scheme that must handle many different types of client data, as in a transmitter, a receiver, or a transceiver (or terminal), faces clients whose required data volumes differ and whose data is generated at independent times. Typically, different client data is stored in separate memories, and the memory sizes for the individual client data are not uniform. If a conventional scheme therefore uses a memory configuration specialized for each client, the small memories among them reduce the overall area efficiency of the memory. Such a per-client memory configuration also has the problem that memory size grows in proportion to the amount of data the processor handles at one time. Moreover, even when the per-client data volume changes with the operating scenario, the per-client memories cannot be efficiently redistributed (reallocated), so a larger total memory must be provisioned in the design.

The present invention provides an efficient data processing method and apparatus for collecting and transferring per-client data.

The present invention also provides a data storage method and apparatus that efficiently process per-client data using memories arranged in a multi-stage structure.

A storage apparatus for processing a plurality of client data according to an embodiment of the present invention includes: a first-stage storage unit that receives and stores the plurality of client data, each generated in burst units by a plurality of clients; a second-stage storage unit that receives the plurality of client data from the first-stage storage unit and stores it, in burst units, in a plurality of memory banks shared by the plurality of clients; and a third storage unit that receives each client's data from the second-stage storage unit and stores data in transaction units, the transaction being the transfer unit for data processing.

A storage method for processing a plurality of client data according to an embodiment of the present invention includes: receiving the plurality of client data, each generated in burst units by a plurality of clients, and storing it in per-client memories; receiving the plurality of client data from the per-client memories and storing it, in burst units, in a plurality of memory banks shared by the plurality of clients; and receiving each client's data stored in the plurality of memory banks and storing, in a destination memory, data in transaction units, the transaction being the transfer unit for data processing.

FIG. 1A is a diagram showing one configuration example of a data storage apparatus that processes a plurality of client data in a communication system;
FIG. 1B is a diagram showing another configuration example of a data storage apparatus that processes a plurality of client data in a communication system;
FIG. 2 is a diagram illustrating a pattern of data generated by a specific client;
FIG. 3 is a diagram showing one configuration example of a data storage apparatus that processes a plurality of client data in a communication system according to an embodiment of the present invention;
FIG. 4 is a diagram showing one configuration example of a burst memory operating as the second-stage storage unit in the storage apparatus according to an embodiment of the present invention;
FIG. 5 is a diagram showing an example of an access pattern of the burst memory in the storage apparatus according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating, for each memory size, the memory area required per bit according to an embodiment of the present invention;
FIG. 7 is a diagram showing an example of an interface between a client and the data transfer unit in the storage apparatus according to an embodiment of the present invention;
FIG. 8 is a timing diagram for explaining a client data transfer operation in the storage apparatus according to an embodiment of the present invention;
FIG. 9 is a diagram showing an example of client data stored in the destination memory in the storage apparatus according to an embodiment of the present invention.

In the following description of embodiments of the present invention, detailed descriptions of related well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the invention.

First, in a communication system, a transmitter, a receiver, or a transceiver (hereinafter, a 'terminal') may be implemented with one or more processors that operate on software and one or more hardware blocks (or functional blocks) designed to perform predetermined operations. Information is exchanged between the hardware blocks and the processor (or a subsystem of the processor). Some of the results computed in a hardware block may be sent to the processor for processing, and the data sent to the processor may be of various types depending on the characteristics of the hardware block. Such data is generated periodically or aperiodically in the hardware block and must then be transferred to the processor within an appropriate time. In a communication system, examples include transmit/receive I (In-phase)/Q (Quadrature) data for calibration, cell searcher results, channel estimates of a receiver, adaptive filter weights of an equalizer, and channel decoding results.

Let a plurality of hardware blocks operating concurrently be called a plurality of clients, and let the different data generated by those hardware blocks be called different client data. To aid understanding of the present invention, a conventional data storage apparatus and data processing method for handling different client data are first described with reference to FIGS. 1A and 1B. The embodiments below are described assuming a storage apparatus used in a terminal of a communication system, but the storage apparatus according to embodiments of the present invention applies equally to any system that requires collecting, transferring, and processing data from multiple clients.

FIG. 1A is a diagram showing one configuration example of a data storage apparatus that processes a plurality of client data in a communication system; it shows a data storage apparatus using a slave bus interface structure.

Referring to FIG. 1A, the client data C_0 to C_{N-1} generated by the clients (client 0 to client N-1) 110 is stored in the corresponding per-client memories 130_0 to 130_{N-1} in the storage unit 130. The per-client memories (memory 0 to memory N-1) 130_0 to 130_{N-1} have different sizes depending on the bit width W_0 to W_{N-1} of the data C_0 to C_{N-1} generated by the corresponding client and on the amount of data generated per unit time. A processor (not shown) can read, through the slave bus interface 150, the data output from the per-client memories 130_0 to 130_{N-1} and multiplexed by the multiplexer 135. In FIG. 1B, reference numeral S1 denotes the memory selection used to read data from the per-client memories 130_0 to 130_{N-1} in the slave bus interface structure.

FIG. 1B is a diagram showing another configuration example of a data storage apparatus that processes a plurality of client data in a communication system; it shows a data storage apparatus using a DMA (Direct Memory Access) bus interface structure that performs data input/output through interrupts.

Referring to FIG. 1B, the client data C_0 to C_{N-1} generated by the clients (client 0 to client N-1) 110 is stored in the corresponding per-client memories 130_0 to 130_{N-1} in the storage unit 130. The per-client memories (memory 0 to memory N-1) 130_0 to 130_{N-1} have different sizes depending on the amounts of the different client data C_0 to C_{N-1}. The data output from the per-client memories 130_0 to 130_{N-1} and multiplexed by the multiplexer 135 is delivered to a processor core 180, such as a DSP (Digital Signal Processor) or a CPU (Central Processing Unit), via a data transfer module 160, which raises an interrupt to the processor core 180, and a RAM (Random Access Memory) 170.

In the configurations of FIGS. 1A and 1B, the per-client memories 130_0 to 130_{N-1} in the storage unit 130 are implemented as on-chip memory, for example flip-flops or SRAM (Static Random Access Memory), which has the following problems. Specifically, each client memory among the per-client memories 130_0 to 130_{N-1} stores the amount of data that software processes at one time (hereinafter, "one transaction"). As a client memory, a flip-flop has the largest memory area per bit, and SRAM also has a relatively large area per bit, so the silicon area of the client memory grows significantly with the transaction size being handled.

In addition, an SRAM used as a client memory contains not only the cell matrix that stores the actual information but also control logic that decodes the address for reading and writing data and selects the cell to be accessed. The cell matrix grows in proportion to the number of bits stored, whereas the control logic grows more slowly than the cell matrix. As a result, the smaller the memory capacity, the larger the silicon area occupied per bit and the lower the area efficiency of the memory. In the conventional implementation of client memories, the per-client data volumes differ, so the client memory sizes also vary, and when some client memories are very small, the increased silicon area per bit reduces the overall area efficiency. Furthermore, because each client memory is statically mapped to a specific client, the client memories cannot be efficiently reallocated according to the operating situation. Finally, when the amount of data generated by each hardware block (that is, each client) varies across operating scenarios, each client memory must be sized for the largest data requirement over all scenarios, which increases the total memory area required.

An embodiment of the present invention proposes a storage apparatus that uses memory in a multi-stage structure and exploits the traffic characteristics of the client data transmitted from a plurality of clients to further reduce memory area. The per-client data can be understood as the different types of data processed in a terminal or other device that uses the storage apparatus.

The storage apparatus according to an embodiment of the present invention stores most of the data that the processor handles at one time in memory with a relatively low implementation cost, and restricts memory with a high implementation cost to the fast storage of the amount of data generated at one time. The storage apparatus also reduces the use of small-capacity memories, whose area efficiency is poor, and uses memories above a certain capacity, thereby improving the overall area efficiency of the memory.

In addition, when the per-client memory requirement changes across operating scenarios, the storage apparatus according to an embodiment of the present invention can easily change the per-client memory allocation through a configuration register.

FIG. 2 is a diagram illustrating a pattern of data generated by a specific client. Referring to FIG. 2, after transferring data the client raises an interrupt 205, 207 to request processing by the processor. The processor gathers the data belonging to one transaction 201, 203 and processes it at once, but the data is actually generated as several small bursts 211, 213. An embodiment of the present invention therefore proposes first storing one burst per client in an intermediate memory and storing the full data of one transaction in a destination memory with a relatively low implementation cost. The destination memory may be an external DRAM (Dynamic Random Access Memory) or an SRAM; because each individual instance is large, its area efficiency is good. In embodiments of the present invention, a burst can be understood as, for example, the unit in which a single client generates data, and a transaction can be understood as the transfer unit of data processed through the processor (for example, delivered to software to perform a task).
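The saving from buffering one burst instead of one transaction can be illustrated with a small calculation. The following is a minimal sketch, not part of the disclosure: the client names, burst sizes, and burst counts are made-up values, and it assumes every burst of a client has the same size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientTraffic:
    """Hypothetical traffic description for one client (sizes in bits)."""
    name: str
    burst_bits: int               # data produced in one burst
    bursts_per_transaction: int   # bursts the processor consumes at once

    @property
    def transaction_bits(self) -> int:
        return self.burst_bits * self.bursts_per_transaction


def on_chip_bits_per_transaction(clients):
    """Conventional scheme: each client memory holds one whole transaction."""
    return sum(c.transaction_bits for c in clients)


def on_chip_bits_per_burst(clients):
    """Multi-stage scheme: the intermediate memory holds one burst per client."""
    return sum(c.burst_bits for c in clients)


# Made-up example values, for illustration only.
clients = [
    ClientTraffic("cell_search", burst_bits=512, bursts_per_transaction=8),
    ClientTraffic("channel_est", burst_bits=2048, bursts_per_transaction=4),
    ClientTraffic("decoder_out", burst_bits=1024, bursts_per_transaction=16),
]

print(on_chip_bits_per_transaction(clients))  # bits if whole transactions are buffered
print(on_chip_bits_per_burst(clients))        # bits if only one burst per client is buffered
```

With these illustrative numbers the costly intermediate memory shrinks from 28,672 bits to 3,584 bits; the remaining transaction data lives in the cheaper destination memory.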

FIG. 3 is a diagram showing one configuration example of a data storage apparatus that processes a plurality of client data in a communication system according to an embodiment of the present invention; it shows a storage apparatus implemented with memories in a multi-stage structure.

The storage apparatus of FIG. 3 includes a data transfer unit (data mover) 330 that receives the different types of client data C_0 to C_{N-1} generated by a plurality of clients (client 0 to client N-1) 310, which correspond to a plurality of hardware blocks in the terminal, stores the data temporarily, and then outputs it to a destination memory 350. The data transfer unit 330 includes a first-stage storage unit implemented as per-client FIFO (First In First Out) memories 331, and a second-stage storage unit implemented as a burst memory 335, which stores each of the client data C_0 to C_{N-1} delivered from the first-stage storage unit in units of one burst and outputs the stored burst data to the destination memory 350 before the next burst data arrives. The destination memory 350 may be an external memory or the like and serves as the third-stage storage unit. Accordingly, the storage apparatus according to an embodiment of the present invention, which consists of multi-stage memories that process a plurality of different types of client data, may be implemented with the first-stage to third-stage storage units. The different types of client data C_0 to C_{N-1} input to the FIFO memories 331 may differ in data volume. Although the embodiment above describes storing each of the client data C_0 to C_{N-1} delivered from the first-stage storage unit in the burst memory 335 in units of one burst, this is only an example and the present invention is not limited to it. Client data may also be stored in the burst memory 335 in units of two or more bursts.

Describing the operation of the data transfer unit 330 in FIG. 3 in detail, the input FIFO memories 331 operating as the first-stage storage unit receive N different client data C_0 to C_{N-1} from the N clients 310 and first store them in the input FIFO (First In First Out) memory allocated to each client. The per-client data stored in the per-client input FIFO memories is formatted 333 in a predetermined manner and then input to and stored in the burst memory 335 operating as the second-stage storage unit. When the data width W_0 to W_{N-1} of a client's data is smaller than the memory bank width W_B in the burst memory 335, the formatting 333 collects the input data and repacks it to the memory bank width W_B so that the memory is used efficiently.
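The formatting step described above can be sketched as bit packing. The example below is an illustration only: the function names are hypothetical, and the choice to place the first sample in the low-order bits of each bank word is an assumption, since the text does not specify a bit layout.

```python
def pack_words(samples, sample_width, bank_width):
    """Pack fixed-width samples into bank-width words.

    Assumption: the first sample of each group occupies the low-order bits.
    """
    if sample_width <= 0 or bank_width < sample_width:
        raise ValueError("sample_width must be positive and fit in bank_width")
    per_word = bank_width // sample_width
    mask = (1 << sample_width) - 1
    words = []
    for i in range(0, len(samples), per_word):
        word = 0
        for slot, sample in enumerate(samples[i:i + per_word]):
            word |= (sample & mask) << (slot * sample_width)
        words.append(word)
    return words


def unpack_words(words, sample_width, bank_width, count):
    """Inverse of pack_words: recover the first `count` samples."""
    per_word = bank_width // sample_width
    mask = (1 << sample_width) - 1
    samples = []
    for word in words:
        for slot in range(per_word):
            if len(samples) == count:
                return samples
            samples.append((word >> (slot * sample_width)) & mask)
    return samples


# Five 8-bit samples fit into two 32-bit bank words instead of five.
packed = pack_words([1, 2, 3, 4, 5], sample_width=8, bank_width=32)
print([hex(w) for w in packed])
print(unpack_words(packed, sample_width=8, bank_width=32, count=5))
```

Packing four 8-bit samples per 32-bit word means a narrow client consumes a quarter of the bank rows it would need if each sample occupied a full row.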

In an embodiment of the present invention, the burst memory 335 stores one burst per client, so the amount of data to be held in the burst memory 335 is far smaller than when an entire transaction is stored. Because the burst memory 335 uses relatively high-cost memory, the embodiment stores data in burst units, the unit in which a client generates data. Once one burst has been stored in the burst memory 335, the burst memory 335 outputs the stored burst to the third-stage storage unit before the next burst arrives and overwrites the existing data. Each client's data stored in the burst memory 335, the second-stage storage unit, is delivered to the third-stage storage unit during the processing time between adjacent bursts. Because the third-stage storage unit uses relatively low-cost memory, the embodiment can store data there in transaction units, the transfer unit of data processed through the processor (for example, the transfer unit of data delivered to software through the processor). The software can process the data of one transaction at once.

According to the embodiment described above, when client data is stored in the high-cost burst memory, only the amount of one generated burst needs to be stored rather than the full data of a transaction, the unit of one transfer. Delivering that burst data to low-cost memory during the processing time between adjacent bursts improves memory usage efficiency relative to cost.
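The burst-by-burst hand-off described above can be sketched as follows. This is a simplified software model under stated assumptions, not the hardware design: the function name is hypothetical, the burst memory is modelled as a single slot per client, and the flush is assumed always to finish before the next burst arrives.

```python
def move_transaction(bursts):
    """Pass bursts one at a time through a single-burst slot to a destination.

    Returns the assembled transaction and the peak occupancy of the slot,
    which stands in for the costly intermediate (burst) memory.
    """
    burst_slot = None      # models the per-client region of the burst memory
    destination = []       # models the low-cost destination memory
    peak_words = 0
    for burst in bursts:
        # The previous burst must already have been flushed.
        assert burst_slot is None
        burst_slot = list(burst)
        peak_words = max(peak_words, len(burst_slot))
        # Flush during the gap before the next burst arrives.
        destination.extend(burst_slot)
        burst_slot = None
    return destination, peak_words


transaction, peak = move_transaction([[1, 2], [3, 4, 5], [6]])
print(transaction)  # the whole transaction, assembled in the destination
print(peak)         # the slot never held more than the largest burst
```

The destination ends up holding the whole transaction while the intermediate slot never holds more than the largest single burst.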

If the interval between the current burst and the next burst is so short that there is not enough processing time to transfer the data to the third-stage storage unit, the shortage can be resolved for the affected client data (that is, the data for which processing time is insufficient) by double buffering, in which the second-stage storage unit holds, for example, two or more bursts' worth of data.
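The effect of double buffering can be illustrated with a small queue model. The class below is a hypothetical sketch, not the disclosed circuit: `depth` is the number of bursts the second stage can hold for one client, and an overflow stands for a burst that would overwrite data not yet transferred.

```python
from collections import deque


class BurstBuffer:
    """Holds up to `depth` bursts for one client; depth=2 is double buffering."""

    def __init__(self, depth=2):
        self.depth = depth
        self._bursts = deque()

    def store_burst(self, burst):
        """Accept a new burst, or fail if it would overwrite undrained data."""
        if len(self._bursts) == self.depth:
            raise OverflowError("burst arrived before an earlier one was drained")
        self._bursts.append(tuple(burst))

    def drain_burst(self):
        """Hand the oldest stored burst to the next stage (None if empty)."""
        return self._bursts.popleft() if self._bursts else None


# Two back-to-back bursts fit only when two burst slots are available.
double = BurstBuffer(depth=2)
double.store_burst([1, 2])
double.store_burst([3, 4])   # arrives before the first burst was drained
print(double.drain_burst())  # oldest burst leaves first
```

With `depth=1` the second `store_burst` call would raise `OverflowError`, which is the situation the additional burst slot is meant to prevent.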

The storage apparatus of FIG. 3 may also include a configuration register R1 that stores the settings for processing each client's data. The configuration register R1 may store at least one of the rules for processing each client's data, for example the start address that each client's data occupies in the burst memory 335, the address in the destination memory 350 where each client's data is finally stored, and the de-formatting 337 method corresponding to the formatting 333 method. In FIG. 3, the output FIFO 336 temporarily stores data read from the burst memory 335, and each client's data output from the output FIFO 336 passes through de-formatting 337 and is delivered to the third-stage storage unit through the eXDMAC (eXternal DMA controller) 339, which is a DMA controller. Each client can also use the control information it passes to the data transfer unit 330 to request that an interrupt be generated at a specific burst, and an interrupt generation unit I1 for handling that interrupt may be included in the storage apparatus of FIG. 3.

In FIG. 3, the third-stage storage unit, which is the destination memory 350, is the final destination to which data is transferred and is an address space accessible by the processor. The third-stage storage unit may be, for example, a DRAM external to the chip or a large-capacity data memory of a specific processor inside the chip, and it receives data through the eXDMAC 339, the DMA controller. The third-stage storage unit stores one transaction per client for data processing by the processor, and because its implementation cost per bit is lower than that of the first-stage or second-stage storage unit, the overall implementation cost is reduced. If processor delays create a risk that the data of the next transaction is transferred, and overwrites data, before processing of the current transaction has finished, space for two or more transactions can be reserved in the third-stage storage unit to perform double buffering. The operations of the storage apparatus including the first-stage to third-stage storage units described above may also be controlled by a control unit (not shown).

FIG. 4 is a diagram showing one configuration example of the burst memory operating as the second-stage storage unit in the storage apparatus according to an embodiment of the present invention.

Referring to FIG. 4, the burst memory 335 includes an input crossbar 3351, a plurality of memory banks (bank 0 to bank M-1) 3353, and an output multiplexer 3355. The N input ports are connected to the M memory banks 3353 through the input crossbar 3351, where N < M; that is, there are more memory banks than input ports. In this embodiment, each memory bank is, for example, W_B bits wide. Each input port is controlled so that it is connected to one specific memory bank at any given moment, and the output port selects and outputs one of the M memory banks at any given moment.

In this embodiment, any region of the burst memory 335 can be allocated to a client, and the start address of each client within the memory banks 3353 can be set in the configuration register (R1). This setting can be changed while the terminal is operating, so the allocation can be reconfigured to suit the operating scenario. The total size of the burst memory 335 can therefore be designed for the operating scenario in which the sum of one burst from every client is largest, and in other operating scenarios the burst memory 335 can be dynamically redistributed among the clients through the configuration register (R1). This allocation scheme reduces the required memory area while still reserving space for the maximum burst size of each client.
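The sizing rule and the per-scenario reallocation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the scenario names, client names, and burst sizes (in words) are hypothetical, and packing the clients back to back is only one possible way of choosing start addresses.

```python
# Hypothetical per-client burst sizes (in words) for two operating scenarios.
# The names and numbers are illustrative only and do not come from the patent.
SCENARIOS = {
    "scenario_a": {"client0": 64, "client1": 128, "client2": 32},
    "scenario_b": {"client0": 16, "client1": 96, "client2": 160},
}


def required_burst_memory_size(scenarios):
    """Size the shared memory for the scenario whose total burst is largest."""
    return max(sum(sizes.values()) for sizes in scenarios.values())


def start_addresses(burst_sizes):
    """Pack clients back to back and return the per-client start addresses
    that software would write to the configuration register (assumed layout)."""
    addresses = {}
    next_free = 0
    for client, size in burst_sizes.items():
        addresses[client] = next_free
        next_free += size
    return addresses


total_size = required_burst_memory_size(SCENARIOS)
layout_a = start_addresses(SCENARIOS["scenario_a"])
layout_b = start_addresses(SCENARIOS["scenario_b"])
```

With these made-up numbers the memory is sized for scenario_b, and switching scenarios only rewrites the start addresses rather than requiring a dedicated memory per client.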

FIG. 5 is a diagram illustrating an example of an access pattern of the burst memory in the storage device according to an embodiment of the present invention.

Referring to FIG. 5, when a client accesses the burst memory 335, it accesses all of the memory banks 3353 sequentially, for example in a pattern that stripes across them horizontally. A given client therefore accesses any particular memory bank only once every M cycles, so if the input crossbar 3351 schedules the accesses such that no two clients access the same memory bank at the same instant, up to M clients can access the memory banks simultaneously without colliding. Because data stored in the burst memory 335 is read out one client at a time, the number N of clients writing data to the burst memory 335 must be smaller than the number M of memory banks.
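The striped access pattern can be illustrated with a small sketch. It assumes, purely for illustration, that a linear word address maps to bank `address % M` and row `address // M`, and that each client is given a fixed phase offset so that it touches bank `(cycle + phase) % M` in a given cycle. The value M = 8 and all function names are hypothetical.

```python
M = 8  # number of memory banks (illustrative value)


def bank_and_row(address, num_banks=M):
    """Map a linear word address to (bank, row) under horizontal striping."""
    return address % num_banks, address // num_banks


def banks_in_use(cycle, phase_offsets, num_banks=M):
    """Bank touched by each client in `cycle`, given each client's phase offset."""
    return [(cycle + phase) % num_banks for phase in phase_offsets]


def is_collision_free(phase_offsets, num_banks=M):
    """True if no two clients ever touch the same bank in the same cycle."""
    phases = list(phase_offsets)
    return all(
        len(set(banks_in_use(cycle, phases, num_banks))) == len(phases)
        for cycle in range(num_banks)
    )
```

Under this mapping, consecutive words of a burst land in consecutive banks, and any set of clients with distinct phase offsets never collides, which is the property the crossbar scheduling relies on.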

<Table 1> below shows an example of how the input crossbar 3351 is allocated to each client according to this embodiment, namely an example of memory bank allocation over time when the number of clients is 8.

<Table 1>

As shown in <Table 1>, collision-free scheduling is easily achieved by defining eight consecutive clock cycles as eight time slots and allocating them in a round-robin manner. Because a client that wants to start a data transfer may have to wait up to 7 clocks for its allocated time slot, each client needs an input buffer (the input FIFO in FIG. 3) in which data can be held. The per-client input FIFO, which is the first-stage storage unit, stores the data generated irregularly by each client without loss and then transfers it to the burst memory 335 in consecutive cycles.
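The worst-case wait of 7 clocks can be checked with a short sketch. The rotation used here, in which client c is connected to bank `(c + cycle) % 8`, is an assumed round-robin assignment chosen for illustration; the actual contents of <Table 1> are not reproduced in this text, and the function names are hypothetical.

```python
NUM_SLOTS = 8  # eight clock cycles form eight time slots


def assigned_bank(client, cycle, num_slots=NUM_SLOTS):
    """Bank the crossbar connects `client` to in `cycle` (assumed rotation)."""
    return (client + cycle) % num_slots


def wait_cycles(client, start_bank, ready_cycle, num_slots=NUM_SLOTS):
    """Cycles the data must sit in the input FIFO until `start_bank` is connected."""
    return (start_bank - assigned_bank(client, ready_cycle, num_slots)) % num_slots


# Worst case over every client, start bank, and arrival cycle.
worst_case_wait = max(
    wait_cycles(client, bank, cycle)
    for client in range(NUM_SLOTS)
    for bank in range(NUM_SLOTS)
    for cycle in range(NUM_SLOTS)
)
```

In this sketch the input FIFO only has to absorb the data a client produces during those waiting cycles, after which the burst can stream into the banks on consecutive cycles.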

FIG. 6 is a diagram illustrating, for several memory sizes, the memory area required per bit (unit: gate count) according to an embodiment of the present invention. Here, 1 gate denotes the area of a two-input NAND gate.

Referring to FIG. 6, the smallest memory (Width 4, Depth 32) has an area of 3.83 gate/bit, and the area per bit decreases as the memory size grows, converging to 0.20 gate/bit. In the example of FIG. 6, the area per bit differs by a factor of up to about 19 between memories of different sizes. Once the memory exceeds a certain size, the area per bit drops sharply to about 0.2 to 0.3 gate/bit. It is therefore more advantageous, in terms of reducing memory area, to build the memory banks with a uniform size than to implement memory banks with widely varying sizes as in the examples of FIGS. 1A and 1B. Accordingly, in this embodiment the memory banks are given a uniform size regardless of the amount of data generated by each client.
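The factor of about 19 follows directly from the two endpoint values quoted from FIG. 6. The short calculation below uses only those two figures; the helper name is hypothetical, and it also works out what the 128 bits of the smallest memory would cost at each rate.

```python
SMALL_GATES_PER_BIT = 3.83  # Width 4, Depth 32 memory, as quoted from FIG. 6
LARGE_GATES_PER_BIT = 0.20  # value that large memories converge to in FIG. 6


def area_in_gates(bits, gates_per_bit):
    """Total area, in two-input NAND gate equivalents."""
    return bits * gates_per_bit


small_memory_bits = 4 * 32  # width x depth of the smallest memory
ratio = SMALL_GATES_PER_BIT / LARGE_GATES_PER_BIT
area_as_small_memory = area_in_gates(small_memory_bits, SMALL_GATES_PER_BIT)
area_inside_large_memory = area_in_gates(small_memory_bits, LARGE_GATES_PER_BIT)
```

The ratio comes out at roughly 19.15, so the same 128 bits cost about 490 gates as a standalone small memory but only about 26 gates when they are part of a large one.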

Meanwhile, in an embodiment of the present invention, each client may transmit additional information in addition to the client data to be transferred to the destination memory 350.

FIG. 7 is a diagram illustrating an example of an interface between a client and the data transfer unit in the storage device according to an embodiment of the present invention.

The example of FIG. 7 shows that, in addition to the W_i-bit client data (ClientData), control information (ControlInfo) and an interrupt request signal (InterruptReq) are passed through the interface from client i 310 to the data transfer unit 330.

<Table 2> below shows an example of the control information.

<Table 2>

FIG. 8 is a timing diagram for explaining a client data transfer operation in the storage device according to an embodiment of the present invention; it shows an example of transferring client data using the control information of <Table 2>. For example, ControlInfo[1:0] being "10" in <Table 2> corresponds to ControlInfo[1] being "1" and ControlInfo[0] being "0" in FIG. 8.

Describing the operation of FIG. 8 with reference to FIG. 7, reference numeral 801 denotes the clock, 803 and 805 denote the control information, 807 denotes the interrupt request signal, and 809 denotes the client data. Client i 310 holds the control information (ControlInfo) at "00" while idle. When starting the first transfer, client i 310 sends the control information ControlInfo[0] (803) as, for example, "1" and sends the head address at which data storage is to begin in the destination memory 350. Thereafter, whenever client i 310 sends client data 809, it sends the control information ControlInfo[1] (805) as "1" to tell the data transfer unit 330 that the client data 809 is valid. After one burst has been transferred, client i 310 sends the control information "11". On receiving the control information "11", the data transfer unit 330 recognizes the end of the burst and activates the eXDMAC 339, which is a DMA controller, to start transferring the data to the destination memory 350. If client i 310 sends the interrupt request signal (InterruptReq) (807) as "1" at the end of a particular burst, the data transfer unit 330 raises an interrupt after the transfer of that burst is complete, notifying the processor that the data transfer of that client has finished. As a further reference sign, T_end_to_start in FIG. 8 denotes the time from the end of one burst to the start of the next.
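The handshake described above can be sketched as a small receiver on the data transfer unit side. The encoding used here ("00" idle, "01" head address, "10" valid data, "11" end of burst) is inferred from the description, since the contents of <Table 2> are not reproduced in this text. The sketch further assumes that the head address arrives on the ClientData lines, that the "11" cycle carries no data, and that InterruptReq is sampled on the "11" cycle. All names are hypothetical.

```python
# Assumed ControlInfo[1:0] encoding, inferred from the description of FIG. 8.
IDLE, HEAD_ADDRESS, DATA_VALID, END_OF_BURST = 0b00, 0b01, 0b10, 0b11


def receive(samples):
    """Consume one (control_info, client_data, interrupt_req) tuple per clock
    and return the list of completed bursts."""
    bursts = []
    head_address, words = None, []
    for control_info, client_data, interrupt_req in samples:
        if control_info == HEAD_ADDRESS:
            # Start of a transfer: latch where the burst goes in the destination.
            head_address, words = client_data, []
        elif control_info == DATA_VALID:
            words.append(client_data)
        elif control_info == END_OF_BURST:
            # End of burst: this is where the DMA transfer would be started.
            bursts.append({
                "head_address": head_address,
                "data": words,
                "interrupt": bool(interrupt_req),
            })
            head_address, words = None, []
        # IDLE cycles are ignored.
    return bursts
```

A burst whose final cycle asserts InterruptReq is flagged so that, in this sketch, an interrupt would be raised once its transfer to the destination memory has completed.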

FIG. 9 is a diagram illustrating an example of client data stored in the destination memory in the storage device according to an embodiment of the present invention.

Referring to FIG. 9, two base addresses (BA0, BA1) can be specified for each client in the setting register (R1) (see FIG. 3), and if the two base addresses differ, the destination memory 350 can perform double buffering. Each burst is actually stored starting at the location obtained by adding the head address (HA0, HA1, ...) sent by the client to the base address. The destination memory 350 may also store additional information that makes it easier for the processor to process the data. In FIG. 9, "n+1" denotes a transaction number. The transaction number is used to check, without using an interrupt, whether the data of a new transaction has been completely transferred. It is assumed that the transaction number is incremented by 1 each time an interrupt is generated for the client. Instead of servicing the interrupt, the processor can determine whether all of the data has been transferred by checking whether the transaction number has been updated, which shortens the processing time.
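The addressing and the polling check can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, the destination memory is modeled as a dictionary, and switching to the other base address at the end of each transaction is an assumption about how the two base addresses are alternated.

```python
class DestinationWriter:
    """Toy model of bursts landing in the destination memory for one client."""

    def __init__(self, base_addresses):
        self.base_addresses = list(base_addresses)  # e.g. [BA0, BA1]
        self.active = 0              # index of the base address in use
        self.transaction_number = 0  # incremented once per finished transaction
        self.memory = {}             # address -> word

    def write_burst(self, head_address, words):
        """Store a burst starting at base address + head address."""
        start = self.base_addresses[self.active] + head_address
        for offset, word in enumerate(words):
            self.memory[start + offset] = word
        return start

    def finish_transaction(self):
        """Bump the transaction number and switch buffers (assumed policy)."""
        self.transaction_number += 1
        self.active = (self.active + 1) % len(self.base_addresses)
        return self.transaction_number


def new_transaction_available(writer, last_seen_number):
    """Processor-side poll: has the transaction number changed?"""
    return writer.transaction_number != last_seen_number
```

With two distinct base addresses, the processor can keep reading the transaction stored under one base address while the next transaction is being written under the other.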

The storage device according to the embodiment of the present invention described above, which uses multi-stage memories (that is, hierarchically structured memories), is implemented with sequentially connected first-stage to third-stage storage units. The first-stage storage unit may be implemented as FIFO memories, built from small flip-flops or memories for each client, to store the data of each independently operating client. With the first-stage storage unit, data can be held without loss during the waiting time until writing to the second-stage storage unit can begin.

The second-stage storage unit stores one or more of the bursts that each client generates in a concentrated manner, and to reduce the implementation cost of the storage device it may be implemented as a burst memory including a plurality of memory banks that can be shared by a plurality of clients. The memory banks have a uniform size so as to reduce memory area. Client data delivered from the first-stage storage unit can be read simultaneously and written simultaneously to the memory banks, and collisions can be prevented when previously stored data is passed on to the third-stage storage unit. The second-stage storage unit can also vary the memory allocation of each client dynamically by changing, through software settings (for example, through a setting register), the location at which the data of each client is stored in the burst memory and the amount of that data. This configuration of the second-stage storage unit reduces the memory area. The traffic characteristics of the data generated by each client can also be exploited to raise the area efficiency of the storage device.

The third-stage storage unit stores data in transaction units, a transaction being the processing unit of the processor, and may store one or more transactions of data per client. The third-stage storage unit may use an external memory or an on-chip memory (for example, a memory built into a processor) whose implementation cost is relatively low, and, in storing the data of each client at a specific location in the destination memory, it can perform double buffering using two or more base addresses. Additional information can also be transferred automatically to a specific location in the third-stage storage unit to make data processing by the processor easier. The additional information may be any information that helps with data processing, such as the transaction number.

According to the embodiment of the present invention described above, in a communication system in which data generated by a plurality of hardware blocks must be transferred to a specific memory, a storage device and method can be provided that use shared multi-stage memories capable of efficiently processing the data of all clients.

In addition, according to the embodiment of the present invention described above, a memory access scheme can be provided for a storage device that uses multi-stage memories. This configuration reduces the use of small memories and stores large volumes of data in memories above a certain size, which improves the overall area efficiency of the memory. Furthermore, when the amount of data generated by each client changes with the operating scenario, the amount of memory allocated to each client can be reallocated efficiently.

Claims

A storage device for processing a plurality of client data, comprising:
a first-stage storage unit configured to receive and store the plurality of client data from a plurality of clients;
a second-stage storage unit configured to receive the plurality of client data from the first-stage storage unit and to store each of the plurality of client data, in burst units, in a plurality of memory banks shared by the plurality of clients; and
a third-stage storage unit configured to receive the burst data stored in burst units from the second-stage storage unit and to store it in transaction units, a transaction being a transfer unit for data processing,
wherein the plurality of memory banks are included in the second-stage storage unit,
wherein the number of the plurality of memory banks is greater than the number of the plurality of clients,
wherein, when storing the burst data in transaction units, the third-stage storage unit performs double buffering using a plurality of base addresses, and
wherein each burst data stored in the second-stage storage unit is transferred to the third-stage storage unit during the time in which the client data to be stored next in the second-stage storage unit is processed.

The storage device of claim 1,
wherein the plurality of clients correspond to a plurality of independently operating processors, and the plurality of client data include different types of data.

The storage device of claim 1,
wherein the first-stage storage unit includes memories corresponding to each of the plurality of clients, each memory storing the client data of its client.

The storage device of claim 1,
wherein the first-stage storage unit stores the plurality of client data until writing to the second-stage storage unit can begin.

The storage device of claim 1,
wherein the plurality of memory banks have a uniform memory size.

(Deleted)

The storage device of claim 1,
wherein the location at which the plurality of client data are stored in the plurality of memory banks, and the amount of that data, are changed through software settings.

The storage device of claim 8,
wherein the second-stage storage unit includes a setting register, and
wherein the software setting is changed using the setting register.

The storage device of claim 1,
wherein the memory allocation of the plurality of clients to the plurality of memory banks is dynamically variable.

The storage device of claim 1,
wherein the third-stage storage unit uses an external memory or an on-chip memory included in a processor.

(Deleted)

The storage device of claim 1,
wherein additional information transmitted to a specific address of the third-stage storage unit is used for data processing by a processor.

(Deleted)

A method of processing a plurality of client data, comprising:
receiving the plurality of client data from each of a plurality of clients and storing the plurality of client data in a memory for each client;
receiving the plurality of client data from the memories for each client and storing each of the plurality of client data, in burst units, in a plurality of memory banks shared by the plurality of clients; and
receiving the burst data stored in burst units in the plurality of memory banks and storing it in a destination memory in transaction units, a transaction being a transfer unit for data processing,
wherein the number of the plurality of memory banks is greater than the number of the plurality of clients,
wherein, when storing the burst data in transaction units, the destination memory performs double buffering using a plurality of base addresses, and
wherein each burst data stored in the plurality of memory banks is transferred to the destination memory during the time in which the client data to be stored next in the plurality of memory banks is processed.