KR20010098904A - Multiprocessor object control

Info

Publication number: KR20010098904A
Application number: KR1020010022639A
Authority: KR
Inventors: Robert T. Killian; Ajai Narayan; Rajko Milovanovic; James M. Overturf; Schuyler T. Patton; Philip R. Thrift
Original assignee: William B. Kempler; Texas Instruments Incorporated
Priority date: 2000-04-26
Filing date: 2001-04-26
Publication date: 2001-11-08
Also published as: JP2002041308A; TW514832B

Abstract

A client-server system is disclosed that performs two-stage scheduling of server tasks, in which client deadline information from the first stage is used for subtask scheduling on the server in the second stage. In addition, the object broker of the system collapses client request calls and replies in order to keep data within the coprocessors, and a server memory management method is provided for multitasking, together with data flow through shared memory of multiple coprocessors to avoid main processor bus congestion.

Description

MULTIPROCESSOR OBJECT CONTROL

The present invention relates to electronic devices, and more particularly to multiprocessors and digital signal processors with distributed objects, and to related methods.

The growth of the Internet, combined with high-speed network access, has made distributed computing mainstream. The CORBA (Common Object Request Broker Architecture) and DCOM (Distributed Component Object Model) standards simplify object-oriented network programming and component software methods. A client application can therefore invoke a remote server object that provides data or functionality, which simplifies application programming. FIG. 24 shows a generic remote method invocation architecture. In effect, object-oriented programming hides implementation details and exposes only an object interface for queries by, or interaction with, other objects, and this is what makes such distributed computing possible.

The core of CORBA is the Object Request Broker (ORB), which provides a "bus" for interaction between objects, both local and remote. A CORBA object is a set of methods plus an interface. A client of a CORBA object uses a reference to the object as a handle for method invocations, as if the object were located in the client's address space. The ORB locates the implementation of the object (possibly on a remote server), prepares the object to receive invocation requests from the client application, carries the request (for example, its parameters) from the client to the object, and returns any reply from the object to the client. The object implementation interacts with the ORB through either an ORB interface or an Object Adapter (OA). FIG. 25 shows the overall CORBA architecture.

An Interface Definition Language (IDL) defines the interface of an object, including the methods to be invoked by clients, while hiding the specifics (data and implementation), as is usual in object-oriented programming. IDL typically provides data encapsulation, polymorphism, and inheritance. As shown in FIG. 24, a client invokes an object function by first making a call to the client stub (proxy). The stub marshals the call parameters into a message. The wire protocol sends the message to the server stub (skeleton). The server stub unmarshals the call parameters from the message and invokes the object function. The upper layer of FIG. 25 is the basic programming architecture, the middle layer is the remoting architecture, and the lower layer is the wire protocol architecture. Developers of client programs and server object programs work with the basic programming architecture. The remoting architecture makes interface pointers and object references meaningful across the client and server processes. The wire protocol architecture extends the remoting architecture so that it works across different hardware devices.
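The stub and skeleton roles described above can be illustrated with a minimal Python sketch. It is not CORBA code: the class names `Stub`, `Skeleton`, and `Decoder`, the method `scale`, and the use of JSON as the message format are all assumptions made for illustration, and an in-process function call stands in for the wire protocol.

```python
import json

class Decoder:
    """Hypothetical server object; stands in for any remote object."""
    def scale(self, samples, gain):
        return [s * gain for s in samples]

class Skeleton:
    """Server stub: unmarshals the request and invokes the real object."""
    def __init__(self, target):
        self.target = target

    def dispatch(self, message: bytes) -> bytes:
        request = json.loads(message.decode("utf-8"))      # unmarshal parameters
        method = getattr(self.target, request["method"])   # locate the object function
        result = method(*request["args"])                  # invoke it
        return json.dumps({"result": result}).encode("utf-8")  # marshal the reply

class Stub:
    """Client stub (proxy): marshals a call into a message and sends it."""
    def __init__(self, transport):
        self.transport = transport  # callable standing in for the wire protocol

    def scale(self, samples, gain):
        message = json.dumps({"method": "scale", "args": [samples, gain]}).encode("utf-8")
        reply = self.transport(message)                    # send and wait for the reply
        return json.loads(reply.decode("utf-8"))["result"] # unmarshal the return value

skeleton = Skeleton(Decoder())
proxy = Stub(skeleton.dispatch)
print(proxy.scale([1, 2, 3], 2))
```

The client code calls `proxy.scale(...)` exactly as it would call a local object, which is the point of the stub: marshaling and transport stay hidden behind the interface.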

As described in Chung et al., DCOM and CORBA Side by Side, Step by Step, and Layer by Layer, a simple application that uses a remote object on CORBA-enabled client and server processors can be created from the following five files. (1) An IDL file that defines the interface(s) of the object. The IDL compiler generates the client stub and object skeleton code, together with an interface header file used by both the client and the server. (2) An implementation header file that derives the server implementation class for the object from the interface. In essence, the implementation class is associated (by inheritance) with the interface class generated by the IDL compiler. (3) The implementation of the methods of the server class. (4) The main program for the server. This program creates an instance (object) of the server class. (5) The client application, which invokes the methods of the object through calls to the client stub.

For static object invocation, after compilation and before execution, CORBA registers the association between the interface name and the path name of the server executable in the Implementation Repository (see FIG. 25). For dynamic object invocation, the IDL compiler generates type information for each method in an interface and stores it in the Interface Repository. A client can query the Interface Repository to obtain run-time information about a particular interface and then use that information to construct and invoke a method on the object dynamically through the Dynamic Invocation Interface. Similarly, on the server side, the Dynamic Skeleton Interface allows a client to invoke an operation on an object that has no compile-time knowledge of the type of object it is implementing.
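The dynamic invocation path can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary `INTERFACE_REPOSITORY`, the function `dynamic_invoke`, and the `Decoder` interface are hypothetical stand-ins, not CORBA APIs. The sketch shows a caller that knows only interface and method names, looks up the parameter list at run time, and builds the call from it.

```python
# Hypothetical interface repository: interface name -> method name -> parameter names.
INTERFACE_REPOSITORY = {
    "Decoder": {"decode": ["frame", "quality"]},
}

class Decoder:
    """Illustrative server object whose type the caller does not know at compile time."""
    def decode(self, frame, quality):
        return f"decoded {frame} at quality {quality}"

def dynamic_invoke(target, interface, method, **params):
    """Build and issue a call at run time from repository type information."""
    signature = INTERFACE_REPOSITORY[interface][method]   # query the repository
    missing = [name for name in signature if name not in params]
    if missing:
        raise TypeError(f"missing parameters: {missing}")
    ordered_args = [params[name] for name in signature]   # order arguments per the signature
    return getattr(target, method)(*ordered_args)

result = dynamic_invoke(Decoder(), "Decoder", "decode", quality=3, frame="f0")
print(result)
```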

FIG. 26A illustrates the operation of the CORBA upper layer (basic programming architecture): a client requests an object and invokes its methods, and a server creates an object instance and makes it available to the client. In particular, object activation proceeds as follows. (1) The client calls a static function of the client stub for the object interface. (2) The ORB starts a server that contains an object supporting the object interface. (3) The server program creates the object and registers the object reference. (4) The ORB returns the object reference to the client application. Then, for object method invocation [1], [2], the client calls a method of the object interface, which ultimately invokes the method in the server. If the method returns values, the server sends them back to the client.

FIG. 26B shows the CORBA middle layer (remoting architecture), in which object activation proceeds as follows. (1) Upon receiving the call, the client stub delegates the task to the ORB. (2) The ORB consults the Implementation Repository, maps the call to the server path name, and activates the server program. (3) The server creates the object and also generates a unique reference ID to obtain an object reference. It registers the object reference with the ORB. (4) The constructor for the server also creates an instance of the skeleton class. (5) The ORB sends the object reference back to the client, creates an instance of the client stub class, and registers it in the client stub object table together with the corresponding object reference. (6) The client stub returns the object reference to the client. A client invocation of an object method then proceeds as follows. [1] Upon receiving the client call, the client stub creates a Request pseudo-object, marshals the parameters of the call into the pseudo-object, asks for the pseudo-object to be sent as a message in the channel to the server, and waits for the reply. [2] When the message reaches the server, the ORB finds the target skeleton, rebuilds the Request pseudo-object, and forwards it to the skeleton. [3] The skeleton unmarshals the parameters from the Request pseudo-object, invokes the method of the server object, marshals the return values, and returns from the skeleton method. The ORB builds a reply message and places it in the transmit buffer. [4] When the reply reaches the client side, the ORB call returns after the ORB reads the reply message from the receive buffer. The client stub then unmarshals the return values and passes them back to the client to complete the call.

As illustrated in FIG. 26C, object activation in the lower layer (wire protocol architecture) proceeds as follows. (1) Upon receiving the request, the client-side ORB selects a machine that supports the object and sends the request to the server-side ORB over TCP/IP. (2) When the server is started by the server-side ORB, the object is instantiated by the server, the ORB constructor is called, and the create function is invoked. The create function creates a socket endpoint, the object is assigned an object identity, and an object reference is created that contains the interface and implementation names, the reference identity, and the endpoint address. The object reference is registered with the ORB. (3) When the object reference is returned to the client side, the client stub extracts the endpoint address and establishes a socket connection to the server. Method invocation then proceeds as follows. [1] Upon receiving the call, the client stub marshals the parameters in Common Data Representation (CDR) format. [2] The request is sent to the target server over the established socket connection. [3] The target skeleton is identified by either the reference identity or the interface instance identity. [4] After invoking the actual method on the server object, the skeleton marshals the return values in CDR format.

Real-time extensions of CORBA typically provide quality of service (QoS) features such as predictable performance, secure operation, and resource allocation. See, for example, Gill et al., Applying Adaptive Middleware to Manage End-to-End QoS for Next-generation Distributed Applications.

CORBA components have been introduced as a metatype, and the associated Component Implementation Definition Language (CIDL) can be used to describe implementations. FIG. 27 shows the programming steps.

DCOM similarly has three layers and an architecture somewhat similar to that of CORBA.

U.S. Patent No. 5,748,468 to Notenboom and PCT published application WO 99/12097 of Equator Technologies each describe methods of allocating processor resources to multiple tasks. Notenboom considers a host processor together with a coprocessor and allocates coprocessor resources to tasks according to a conventional system. Equator Technologies schedules processor resources according to the task time consumption of each task, where each task presents at least one supported service level (processor resource consumption rate), and a resource manager admits a task if sufficient resources exist for a supported service level.

Systems having two or more processors, each with its own operating system or BIOS, include systems whose processors are widely separated across the Internet, as well as systems with two or more processors integrated on the same semiconductor die, such as a RISC CPU together with one or more DSPs.

The XDAIS standard specifies an interface for algorithms that run on DSPs, which provides reusable objects. XDAIS requires an algorithm to implement the standard interface IALG, together with an extension for running the algorithm. XDAIS also requires compliance with certain flexibility rules, such as relocatable code, and with naming conventions. A client application can manage an instance of the algorithm by calling into a table of function pointers. Using the XDAIS standard and guidelines, algorithm developers can develop or convert algorithms so that they plug easily into DSP application frameworks such as the iDSP Media Platform DSP framework.

The need for a quality of service (QoS) manager within a network node (client/server) arises in particular from the real-time service requirements of streaming-media applications. Streaming-media applications must handle heterogeneous codecs (encoders/decoders) and filters with unique rendering deadlines. These applications should also be able to exploit perceptual characteristics so that quality of service degrades gracefully. They should be able to handle a certain amount of jitter in the processing and rendering cycles. For example, in a video application the frame rate for rendering must be maintained at 30 frames per second (fps), which translates to a frame period of about 33 ms. However, the application should be able to tolerate a limited instantaneous variation, as negotiated with the server. Also, at 30 fps, visual perception can tolerate a frame drop of about 6 frames per second. The client application should likewise support graceful degradation in performance (momentary dropping of frames) and keep rendering in a steady state within the specified tolerance negotiated with the server. The QoS manager is the mechanism that provides the functions and capabilities needed to realize such a real-time system.
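The frame-rate arithmetic above can be checked with a short Python sketch. The function names and the `tolerance_ms` parameter are assumptions introduced here for illustration; the text gives only the 30 fps rate, the approximately 33 ms period, and the tolerable drop of about 6 frames per second.

```python
def frame_period_ms(frames_per_second):
    """Frame period in milliseconds for a given rendering rate."""
    return 1000.0 / frames_per_second

def within_jitter(actual_period_ms, frames_per_second, tolerance_ms):
    """True if an observed frame period stays within a negotiated tolerance.

    tolerance_ms is a hypothetical negotiated value, not taken from the text.
    """
    return abs(actual_period_ms - frame_period_ms(frames_per_second)) <= tolerance_ms

nominal = frame_period_ms(30)        # 30 fps nominal rate
degraded = frame_period_ms(30 - 6)   # rate after dropping about 6 frames per second

print(f"nominal period:  {nominal:.2f} ms")
print(f"degraded period: {degraded:.2f} ms")
```

Under these figures the nominal period is one thirtieth of a second, and dropping 6 frames per second leaves an effective rate of 24 fps.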

As broadband communications such as DSL and cable modems expand rapidly into new markets and deliver unprecedented volumes of data to consumer devices for processing and consumption, more efficient data handling, routing, and processing techniques will be needed to keep pace.

FIG. 20 is a diagram of how data is processed through the processing elements of current heterogeneous systems. Each data transaction is numbered to show its order in time. For each transaction, the data must pass across the system bus under the control of the Central Control Processor (CCP). The CCP initiates each transaction by sending a message or trigger over a control path to the various processing elements in the system.

The processing elements of FIG. 20 are separate processors (for example, a DSP, an ASIC, or a GPP), each capable of running a defined set of tasks. This is why each is shown with its own memory. The processing elements may also be individual tasks running on the same processor.

In some cases, the same data must pass across the system bus multiple times (for example, transactions 1 and 2, 3 and 4, and 5 and 6). In such a system, the data must cross the system bus a total of 2+(2×n) times, or six times in this case. Each crossing of the system bus, with its arbitration by the CCP, adds data flow overhead and reduces overall system throughput.
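The crossing count can be written out as a small Python sketch. The text does not define n; the sketch simply takes n as the parameter of the stated formula and confirms that n = 2 gives the six crossings mentioned. The second function rests on a further assumption, labeled in the code, that only the initial input and final output would still cross the bus if intermediate transfers bypassed it.

```python
def bus_crossings(n):
    """System-bus crossings for the FIG. 20 style data flow: 2 + (2 x n)."""
    return 2 + 2 * n

def crossings_saved(n):
    """Crossings avoided if only input and output still use the system bus.

    Assumption for illustration: the remaining 2 crossings are the initial
    input and the final output; the text does not state this breakdown.
    """
    return bus_crossings(n) - 2

print(bus_crossings(2))    # the six crossings of the example
print(crossings_saved(2))
```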

Data flow overhead reduces the amount of data that can pass through the system in a given time frame, and thereby limits the amount of data the system can process. Such a system performs less useful work than the sum of the capacities of its elements would otherwise indicate.

The present invention provides a client-server system having one or more of the following features: two-stage scheduling of server tasks; an object request broker for a client-server system that chains tasks on server DSPs; internal memory management for a multitasking processor, in which the internal memory is partitioned into processor overhead plus a task workspace that belongs to a single executing task at a time; and data flow in a heterogeneous system that comprises processing elements bus-connected to a central control processor plus memory shared among the processing elements so as to avoid the central control processor bus.

FIG. 1 shows the DSPORB architecture of a preferred embodiment.

FIG. 2 illustrates IDL compilation.

FIGS. 3-13 are QoS timing diagrams.

FIGS. 14-19 illustrate memory analysis of a preferred embodiment.

FIG. 20 shows a known data flow in a heterogeneous system.

FIGS. 21-23 show data flows of a preferred embodiment.

FIGS. 24-27 illustrate CORBA.

1. Overview

The preferred embodiment systems typically have a host processor running client applications and one or more server processors running server algorithms, and include an object request broker for algorithm objects, quality of service control for the object request broker, memory paging for algorithm objects, and data flow for algorithm objects. The preferred embodiments termed iDSPOrb apply to a system having a main processor and one or more DSP coprocessors.

iDSPOrb is a high-performance DSP Object Request Broker (DSPORB) that supports creating and accessing DSP objects from a General Purpose Processor (GPP) or from another DSP in a multiprocessor environment. iDSPOrb has a general structure and operation similar to those of CORBA. iDSPOrb has the following DSPORB features:

(1) iDSPOrb supports object binding and invocation across processor boundaries (DSP object procedure calls).

(2) iDSPOrb provides a GPP-side proxy interface, consisting of compile-time headers and stubs for static invocation, and a run-time dynamic invocation interface.

(3) iDSPOrb provides a DSP-side algorithm interface (stubs and headers) for building iDSP servers.

(4) iDSPOrb provides both synchronous and asynchronous invocation.

(5) iDSPOrb provides guaranteed real-time QoS.

(6) iDSPOrb provides frame-based processing and stream-based processing.

(7) iDSPOrb provides object-chaining data flow (intermediate results are kept in DSP memory).

(8) iDSPOrb is implemented on a high-bandwidth multichannel GPP/DSP I/O interface.

FIG. 1 shows the iDSPOrb architecture for a GPP/DSP dual processor in which the GPP functions as the "client" and the DSP functions as the "server."

The quality of service (QoS) manager of the iDSP system, referred to herein as iDSP-QoSM, is a mechanism (residing in the server) that provides a negotiated level of service to client applications. It provides a guaranteed quality of service in accordance with a predetermined degradation policy that is communicated to the client. iDSP-QoSM has the following characteristics. (1) It is defined within the limited context of a node residing on a network (intra-node), and a suitable QoS manager is assumed to exist for controlling inter-node (network) communication. (2) It is defined for a multiprocessor environment with load-sharing capability.

The functions performed by the iDSP-QoSM of the preferred embodiment include the following. (1) Monitoring the steady-state processing load on the servers in the system. (2) Distributing load from an overloaded server to its peers. (3) Negotiating service requirements with a client application in order to register any additional load on a server. (4) Predicting other loads on a server based on the specific characteristics of the individual objects served by that server. (5) Basing algorithm run-time estimates on processor cycles rather than on processing time, so that the run-time prediction scheme is not tied to the processor operating frequency.
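Functions (3) and (5) can be illustrated together with a minimal Python sketch of cycle-based admission. All names here (`AlgorithmLoad`, `admit`, the cycle counts, and the clock rates) are hypothetical values chosen for illustration, and the simple sum-of-demands test is an assumed policy, not the scheduling method of the preferred embodiment. The point shown is that a load expressed in cycles can be tested against any processor clock without re-profiling.

```python
from dataclasses import dataclass

@dataclass
class AlgorithmLoad:
    name: str
    cycles_per_frame: int      # run-time estimate in processor cycles
    frames_per_second: float   # rate requested by the client

def cycles_per_second(load):
    """Cycle demand of one algorithm, independent of clock frequency."""
    return load.cycles_per_frame * load.frames_per_second

def admit(existing, candidate, processor_hz):
    """Admit the candidate only if total cycle demand fits the processor clock."""
    demand = sum(cycles_per_second(l) for l in existing) + cycles_per_second(candidate)
    return demand <= processor_hz

running = [AlgorithmLoad("video_decode", 4_000_000, 30)]   # 120 Mcycles/s
request = AlgorithmLoad("audio_video_mix", 3_000_000, 30)  # 90 Mcycles/s

print(admit(running, request, 200_000_000))  # 210 Mcycles/s does not fit 200 MHz
print(admit(running, request, 300_000_000))  # the same estimates fit 300 MHz
```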

In Texas Instruments TMS320C62XX DSPs, the amount of internal (on-chip) data memory is limited. With the exception of the TMS320C6211 (and its derivatives), TMS320C62XX DSPs have no data cache to make external (off-chip) memory accesses efficient. Internal memory is the highest level of the data memory hierarchy of a TMS320C62XX DSP. Consequently, every algorithm that runs on a TMS320C62XX DSP wants to use internal memory for its data workspace, because that is the most efficient level for data memory access.

Typically, algorithms for DSPs have been developed on the assumption that they own the entire DSP processor, and therefore all of the DSP's internal memory. This makes it very difficult to integrate several different algorithms, whether the algorithms are the same (homogeneous) or different (heterogeneous). Algorithm developers need a set of rules that provide a common method of accessing and using system resources such as internal memory.

The preferred embodiments provide a method of increasing processor utilization when running multiple algorithms on a DSP with little data cache, by using a data paging architecture for the DSP internal memory. Developing DSP algorithms, or converting them, to comply with the data paging architecture can be accomplished under the Texas Instruments XDAIS standard.

This standard requires the algorithm developer to define at least one memory region that supports all of the data memory for the algorithm. Of these user-defined regions, one or all are selected by the algorithm developer to run in the internal memory of the TMS320C62X DSP. Within the DSP system software portion of the application, the internal memory is divided into system support and a data workspace (page). All algorithms in the DSP application share the workspace, and each owns the entire workspace while it executes. At a context switch between two algorithms, the DSP system software handles the transfers between the workspace and the external shadow memory of each algorithm. The preferred embodiments provide the following:

(1) Sharing internal data memory among two or more DSP algorithms on a DSP without a data cache increases processing efficiency.

(2) Running multiple algorithms from the same shared internal memory allows each algorithm to achieve maximum efficiency in the TMS320C62X DSP environment when it accesses data memory for its stack requirements and its internal variables.

(3) The architecture will work on any single processor that has internal memory and a DMA utility with access to that internal memory.

(4) Performing context switches only at data input frame boundaries gives the data paging architecture its best efficiency.
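The workspace sharing described above can be sketched in Python as a single internal page that is swapped against per-algorithm shadow copies at each context switch. The class `PagedWorkspace` and its method names are hypothetical, byte copies stand in for DMA transfers, and a zero-filled page is assumed for an algorithm that has not run yet; none of this is the patented implementation.

```python
class PagedWorkspace:
    """One shared internal workspace, owned by a single algorithm at a time."""

    def __init__(self, size):
        self.internal = bytearray(size)   # stands in for on-chip data memory
        self.shadow = {}                  # algorithm name -> external shadow copy
        self.owner = None

    def switch_to(self, name):
        """Context switch: page out the current owner, page in the next one."""
        if self.owner == name:
            return
        if self.owner is not None:
            self.shadow[self.owner] = bytes(self.internal)          # page out
        saved = self.shadow.get(name, bytes(len(self.internal)))    # zero-filled if new
        self.internal[:] = saved                                    # page in
        self.owner = name

workspace = PagedWorkspace(4)
workspace.switch_to("A")
workspace.internal[0] = 7     # algorithm A writes into the whole workspace
workspace.switch_to("B")
workspace.internal[0] = 9     # algorithm B sees its own view of the same memory
workspace.switch_to("A")
print(workspace.internal[0])  # A's data has been restored from shadow memory
```

Each switch costs two copies of the page, which is consistent with item (4): switching only at data input frame boundaries keeps the number of such transfers low.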

Data flow in an application may run from algorithm to algorithm, and the preferred embodiments allow the data to remain in one or more DSPs rather than being bussed to and from the GPP for each algorithm execution.

2. DSP ORB in a Dual-Processor Configuration

FIG. 1 shows the preferred embodiment ORB ("iDSPOrb") architecture for a dual processor that includes a general purpose processor (GPP) and a digital signal processor (DSP), in which the GPP acts as the "client" and the DSP acts as the "server." Note that iDSPOrb includes a quality of service (QoS) manager. FIG. 1 shows a client application invoking two DSP algorithm objects, "A" and "B". iDSPOrb first provides the GPP with object bindings to the proxy (client stub) objects "a" and "b". For example, "A" and "B" may be extensions of a DSPIDL interface to a decoder (DEC), as follows:

DSP측 애플리케이션(iDSP 서버라 함)은 DSPIDL 컴파일러에 의해 제공되는 아래의 알고리즘 인터페이스를 사용하여 구현된다:DSP-side applications (called iDSP servers) are implemented using the following algorithmic interfaces provided by the DSPIDL compiler:

GPP측 애플리케이션은 DSPIDL 컴파일러에 의해 제공되는 아래의 프록시 인터페이스를 사용하거나 또는 iDSPOrb 동적 호출 인터페이스를 사용하여 구현된다.GPP-side applications are implemented using the following proxy interfaces provided by the DSPIDL compiler, or using the iDSPOrb dynamic call interface.

동작시에, 버퍼를 처리하기 위해 GPP측 클라이언트 애플리케이션으로부터 "a"가 호출될 수 있다. 이 데이터는 DSP측의 실제 객체 "A"에 전달된다. 객체 연쇄 데이터 흐름을 사용하여, "A"의 출력은 "B"의 입력에 접속될 수 있어, 중재 데이터 버퍼가 GPP로 다시 전송되지 않는다. "B"는 이 데이터를 GPP로 복귀시키는 다른 처리 단계를 초래하는 "B"를 호출한다. iDSPOrb의 동적 호출 인터페이스는 동기식 및 비동기식 호출 모두를 지원한다.In operation, "a" may be called from the GPP side client application to process the buffer. This data is transferred to the actual object "A" on the DSP side. Using the object chain data flow, the output of "A" can be connected to the input of "B" so that the arbitration data buffer is not sent back to GPP. "B" calls "B" which results in another processing step of returning this data to GPP. iDSPOrb's dynamic call interface supports both synchronous and asynchronous calls.
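The benefit of object chaining can be illustrated with a small Python model that only counts how many buffers cross the GPP/DSP boundary. It is a sketch, not the iDSPOrb implementation: `DspObject`, `run_unchained`, and `run_chained` are hypothetical names, and the two processing functions are arbitrary stand-ins for algorithms "A" and "B".

```python
class DspObject:
    """Stand-in for a DSP-side algorithm object (hypothetical)."""
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn


def run_unchained(stages, data):
    """Invoke each stage separately: every result travels back to the GPP."""
    transfers = 0
    for stage in stages:
        transfers += 1            # GPP -> DSP: input buffer
        data = stage.fn(data)
        transfers += 1            # DSP -> GPP: output buffer
    return data, transfers


def run_chained(stages, data):
    """Connect outputs to inputs: intermediate buffers stay on the DSP."""
    transfers = 1                 # GPP -> DSP: input to the first stage
    for stage in stages:
        data = stage.fn(data)     # intermediate result remains in DSP memory
    transfers += 1                # DSP -> GPP: output of the last stage
    return data, transfers


A = DspObject("A", lambda buf: [x * 2 for x in buf])
B = DspObject("B", lambda buf: [x + 1 for x in buf])
```

With two stages the chained path needs two bus transfers instead of four, and the saving grows with each additional chained object.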

iDSPOrb need not be partitioned between a GPP and a single DSP. It can also operate in configurations with multiple DSPs. In that case the QoS manager (server side) load-balances the DSP algorithms among the available DSPs. Other configurations in which an algorithm interface is presented to the client application may include an ASIC (acting as a fixed-function DSP) or an ASIC plus a RISC processor.

2a. DSPIDL Compiler

iDSPOrb supports DSPIDL, an IDL (Interface Definition Language) with the following keywords:

module : a collection of interface specifications

For example, an H263 module might contain decoder and encoder interfaces.

interface : an interface specification

in : denotes an input argument

out : denotes an output argument

BUFFER : denotes the buffer type

STREAM : denotes the stream type

RESULT : denotes the return type of a function

The remaining keywords relate to memory usage and real-time behavior.

The general form of a DSPIDL file is:

where method is

and direction is in, out, or [in, out], and TYPE is BUFFER or STREAM. For example, the H263 IDL might generate the algorithm and proxy interfaces shown in Figure 2.

2b. Frame and Stream Processing

Frame processing and stream processing differ as follows.

Keywords:

BUFFER : functions with BUFFER as an argument type process data on a frame-by-frame basis.

STREAM : functions with STREAM as an argument type process a stream of frames, which typically spawns a task.

The connect functions attach object outputs to inputs (frames or streams, respectively). For buffers, the connect operator causes DSPORB to create a memory buffer on the DSP in which the output of one method call is stored for input to another method call (object chaining). For example:

For stream processing, a proxy call such as the following will typically cause a task to be created on the DSP side to manipulate the SIO stream (the implementation of H263_TIDEC_decodeStream will have the task do this). Unconnected streams provide I/O between the client proxy and the server.

2c. Real-Time QoS Manager

iDSPOrb can provide hard real-time QoS by allocating the resources needed to perform a given operation within a set time constraint, via the DSPORB_System_setTimeConstraint() and DSPORB_System_setPriority() interfaces. The GPP/DSP channel I/O driver allows multiple threads to run in parallel. The QoS manager is the part of the DSP-side iDSPOrb that (1) instantiates the algorithms required by the client, (2) updates constraints from the client application and manages resources to satisfy them (or reports that the constraints cannot be met), and (3) performs other functions.

2d. iDSPOrb Registration Service

iDSPOrb provides a class registration service with which server objects can register their services. For example, a server object can register with iDSPOrb as being able to decode MP3 audio. A client object creates a server object by supplying the name of the desired service. The iDSPOrb registration service can be used for any kind of DSP object service, but it is made aware of the media domain by providing a standard set of monikers for audio and video services:

Audio services              Video services
------------------------------------------------------
MP3 Audio Decode            MPEG1 Video Decode
MP3 Audio Encode            MPEG1 Video Encode
MPEG1 L2 Audio Decode       MPEG2 Video Decode
MPEG1 L2 Audio Encode       MPEG2 Video Encode
G.723 Decode                MPEG4 Video Decode
G.723 Encode                MPEG4 Video Encode
G.729 Decode                H.263 Decode
G.729 Encode                H.263 Encode
...                         ...

The iDSPOrb registration service enables iDSPOrb to create server objects dynamically at run time. When creating a server object, iDSPOrb dynamically allocates low-level I/O channels between the microprocessor and the DSP. These low-level channels can be accessed directly by the client object through the iDSPOrb streaming interface (see the DSPORB_Stream interface). The registration service also provides information that makes it possible to locate the DSP providing a particular service, which lets the QoS manager perform load balancing and projection scheduling (see the real-time QoS manager). For example, using the dynamic invocation model, the call DSPORB_ALG_create("MP3 Audio Decode", NULL) creates an instance of an MP3 audio decoder. iDSPOrb load-balances the system, and the client is shielded from the details of which DSP actually runs the decoder and which low-level streams were allocated to pass the data. The client can also enumerate the currently registered server classes by querying iDSPOrb. The function DSPORB_Alg *DSPORB_System_getServices() can be used to obtain an enumerator of the currently registered services. Then char *DSPORB_System_next(DSPORB_Alg *enum) can be called to obtain the name of each registered service. The enumeration can be reset to the beginning by calling DSPORB_System_reset(DSPORB_Handle *enum).
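The register, create-by-name, and enumerate pattern described for the registration service might be modeled as in the following Python sketch. This is an illustration only: `ServiceRegistry` and its methods are hypothetical and do not reproduce the DSPORB C API, and the DSP identifier stored with each service merely hints at the location information a QoS manager could use for load balancing.

```python
class ServiceRegistry:
    """Toy model of a class registration service (hypothetical names)."""

    def __init__(self):
        self._services = {}      # service name -> (dsp_id, factory)

    def register(self, name, dsp_id, factory):
        """A server object registers the service it can provide."""
        self._services[name] = (dsp_id, factory)

    def locate(self, name):
        """Return the DSP hosting the service, or None if unregistered."""
        entry = self._services.get(name)
        return entry[0] if entry else None

    def create(self, name, params=None):
        """Create a server object by service name; None if unknown."""
        entry = self._services.get(name)
        if entry is None:
            return None
        _, factory = entry
        return factory(params)

    def services(self):
        """Return a fresh enumerator over the registered service names."""
        return iter(sorted(self._services))


registry = ServiceRegistry()
registry.register("MP3 Audio Decode", 0, lambda params: {"codec": "mp3", "params": params})
registry.register("H.263 Decode", 1, lambda params: {"codec": "h263", "params": params})
```

Calling `registry.services()` again plays the role of resetting the enumeration to the beginning.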

2e. Media Framework Support

iDSPOrb can be used to support media processing acceleration by providing components for specific media frameworks, such as the following.

DirectShow (Windows Media): a filter object can be implemented to wrap an iDSPOrb codec client object and plugged into the DirectShow framework.

RealMedia Architecture (RealSystem G2): a renderer plug-in can be implemented to wrap an iDSPOrb codec client object and plugged into the RealSystem G2 framework.

iDSPOrb can also be plugged into JMF and QuickTime using the same approach.

The API for iDSPOrb is encapsulated in the DSPORB module. The data types and functions of the client (GPP) side DSPORB are described in detail below.

2f. Data Types

DSPORB_Alg: client proxy for a DSP algorithm object.

DSPORB_Fxn: function object for use with dynamic invocation.

DSPORB_Arg: function argument object for use with dynamic invocation.

DSPORB_Buffer and DSPORB_Stream are "subclasses" of DSPORB_Arg.

DSPORB_Params: supplies parameters for an algorithm, matching the IALG_Params algorithm parameter structure on the DSP side.

DSPORB_Buffer: buffer object.

DSPORB_Stream: stream object.

2g. DSPORB_Buffer Interface

- DSPORB_Buffer *DSPORB_Buffer_create(int size, int direction);

Creates a buffer object that can reference data of length size. direction is either DSPBUFFER_INPUT or DSPBUFFER_OUTPUT. The buffer direction must match the function call signature, or an iDSPOrb run-time error will occur.

Alternatively, DSPORB_Buffer *DSPORB_Buffer_create(DSPORB_Alg *, int, int); creates a buffer for use by the given object.

- unsigned char *DSPORB_Buffer_getData();

Gets the data referenced by the buffer object. Returns NULL if the buffer is connected to another buffer.

- void DSPORB_Buffer_setData(unsigned char *data)

Sets the buffer data pointer. If this buffer is connected to another buffer, the operation fails, because the memory for this buffer's data resides in DSP memory space.

- void DSPORB_Buffer_setSize(int)

Sets the size of the actual data.

- int DSPORB_Buffer_getSize()

Gets the size of the actual data.

- void DSPORB_Buffer_delete(DSPORB_Buffer *buffer)

- int DSPORB_Buffer_connect(DSPORB_Buffer *output, DSPORB_Buffer *input)

Connects an output buffer to an input buffer on the DSP. When these buffer objects are connected, the data stays on the DSP and is not transferred back to the GPP (iDSPOrb creates a buffer on the DSP to hold the intermediate result).
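The connected-buffer rules stated above (no data is returned for a connected buffer, and setting its data pointer fails) can be captured in a short Python model. The class and function names are hypothetical and the sketch says nothing about how iDSPOrb allocates DSP memory; it only mirrors the documented behavior as seen from the GPP side.

```python
INPUT, OUTPUT = "input", "output"


class Buffer:
    """GPP-side view of a buffer object (hypothetical model)."""

    def __init__(self, size, direction):
        self.size = size
        self.direction = direction
        self.connected = False
        self._data = None

    def get_data(self):
        # A connected buffer lives in DSP memory, so the GPP sees nothing.
        return None if self.connected else self._data

    def set_data(self, data):
        # Fails for connected buffers; returns True on success.
        if self.connected:
            return False
        self._data = data
        return True


def connect(output, input_):
    """Chain an output buffer to an input buffer; data then stays on the DSP."""
    if output.direction != OUTPUT or input_.direction != INPUT:
        raise ValueError("connect expects (output buffer, input buffer)")
    output.connected = True
    input_.connected = True
```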

2h. DSPORB_Stream Interface

The stream interface has the following methods.

- DSPORB_Stream *DSPORB_Stream_create(int n, int direction); Creates a stream that can hold n buffers. direction is either DSPSTREAM_INPUT or DSPSTREAM_OUTPUT.

- int DSPORB_Stream_issue(DSPORB_Buffer *buf); Sends the input buffer buf on an input stream, or places an empty buffer on the queue to be filled on an output stream. For connected streams this operation has no effect, because the streams are connected directly between algorithms.

- DSPORB_Buffer *DSPORB_Stream_reclaim(); Gets an output buffer from an output stream, or an input buffer that can be reissued on an input stream. For connected streams this operation has no effect.

- DSPORB_Stream_select(DSPORB_Stream array, int n_streams, int *mask, long millis); Blocks until a stream is ready for I/O.

- DSPORB_Stream_idle(DSPORB_Stream *str); Idles the stream.

- DSPORB_Stream_close(DSPORB_Stream *str); Closes the stream.

- DSPORB_Stream_connect(DSPORB_Stream *out, DSPORB_Stream *in); Connects an output stream to an input stream. The two streams then operate within the DSP processor space and are not accessible to the GPP.
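The issue/reclaim discipline of an input stream can be sketched as a bounded pair of queues. The Python model below is an assumption-laden illustration: `InputStream` and `process_one` are hypothetical, and `process_one` simply stands in for whatever the DSP side does when it consumes a buffer.

```python
from collections import deque


class InputStream:
    """Toy input stream holding at most n buffers (hypothetical model)."""

    def __init__(self, n):
        self.capacity = n
        self._pending = deque()    # issued, not yet consumed by the DSP
        self._done = deque()       # consumed, waiting to be reclaimed

    def issue(self, buf):
        """Queue a filled buffer for the DSP; False if the stream is full."""
        if len(self._pending) + len(self._done) >= self.capacity:
            return False
        self._pending.append(buf)
        return True

    def process_one(self, consume):
        """Stand-in for the DSP consuming the oldest pending buffer."""
        if self._pending:
            buf = self._pending.popleft()
            consume(buf)
            self._done.append(buf)

    def reclaim(self):
        """Return a consumed buffer for reuse, or None if none is ready."""
        return self._done.popleft() if self._done else None
```

A buffer slot becomes available again only after the buffer has been reclaimed, which is what bounds the memory committed to the stream.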

2i. DSPORB Dynamic Invocation Interface

The dynamic invocation interface has the following methods.

- int DSPORB_System_int(); Must be called first to initialize DSPOrb.

- DSPORB_Alg *DSPORB_Alg_create(const char *name, DSPORB_Params *params); Creates an instance of the algorithm referenced by the symbol 'name'.

- void DSPORB_Alg_delete(DSPORB_Handle alg); Deletes the algorithm instance.

- DSPORB_Fxn *DSPORB_Alg_getFxn(DSPORB_Alg *alg, const char *fxn_name); Returns the function object associated with the symbol 'fxn_name'.

- int DSPORB_Fxn_setTimeConstraint(DSPORB_Fxn *fxn); Sets a time bound for the execution of fxn. DSPOrb either allocates sufficient resources to meet this constraint or returns 0.

- int DSPORB_Fxn_setPriority(DSPORB_Fxn *fxn); Sets a priority level from 1 to 15.

- int DSPORB_Fxn_invoke(DSPORB_Fxn *fxn, DSPORB_Arg *args); Invokes the function on its inputs and outputs. The call blocks until all data is available on the unconnected outputs. 'NULL' may be passed for inputs and outputs connected with 'DSPORB_Buffer_connect'.

- int DSPORB_Fxn_invokeAsync(DSPORB_Fxn *fxn, DSPORB_Arg *arg); Invokes the function on its inputs and outputs. The call returns immediately; the application retrieves data from the output argument object using 'DSPORB_getData'.

- unsigned char *DSPORB_Arg_getData(DSPORB_Arg *output, long timeout); Gets data from the output argument object. Blocks until 'timeout' (in nanoseconds) expires, or indefinitely if 'timeout = -1'.

- void DSPORB_Arg_setCallback(DSPORB_Arg *output, unsigned char *(*getData)(DSPORB_Arg *)); Sets a callback function for the output argument; getData is called when data becomes available.

- void DSPORB_System_close(); Closes DSPOrb.
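The difference between the blocking and the asynchronous invocation styles can be shown with Python's standard thread pool. This is only an analogy under stated assumptions: `invoke`, `invoke_async`, `get_data`, and `decode` are hypothetical names, the timeout here is in seconds rather than the nanoseconds given for DSPORB_Arg_getData, and a worker thread stands in for the DSP.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

_pool = ThreadPoolExecutor(max_workers=2)   # worker threads stand in for the DSP


def decode(frame):
    """Placeholder algorithm: takes some time, returns a transformed frame."""
    time.sleep(0.05)
    return frame.upper()


def invoke(fxn, arg):
    """Synchronous style: block until the output data is available."""
    return fxn(arg)


def invoke_async(fxn, arg):
    """Asynchronous style: return at once with a handle to the pending output."""
    return _pool.submit(fxn, arg)


def get_data(pending, timeout=-1):
    """Wait up to `timeout` seconds (forever if -1); None if the wait expires."""
    try:
        return pending.result(None if timeout < 0 else timeout)
    except TimeoutError:
        return None
```

A callback style, as with the setCallback method, corresponds to `pending.add_done_callback(...)` on the returned handle.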

2j. iDSPOrb Examples

The first example shows how iDSPOrb is used, via the dynamic invocation interface, to interface to a TI H.263 decoder on a C6xxx. The second example shows the same program written with proxy stubs.

3. Quality of Service (QoS)

The preferred embodiment configuration for which the iDSPOrb quality of service manager (iDSP-QoSM) is defined consists of a host processor with a pool of DSPs as peer servers. An umbrella QoS manager, which performs all the functions needed to maintain a specified quality of service, manages this pool of DSP servers. The host is usually a general-purpose processor (GPP) connected to the DSPs through a hardware interface such as shared memory or a bus-type interface. The QoS manager may be part of iDSPOrb or, more generally, a separate manager on the DSPs. The system is driven by hardware and software interrupts. In the preferred implementation the main user (client) application runs on the GPP, and specific services run on the DSPs on a load-sharing basis. A framework such as the iDSP media framework may run concurrently with the QoS manager on all processors. The iDSP-QoS manager performs three main functions: (1) classification of objects, (2) scheduling of objects, and (3) prediction of the execution times of objects.

These functions are described below for a GPP/multi-DSP environment using media-specific examples.

3a. Classification of Objects

In a media-specific environment, objects translate to media codecs/filters (algorithms). Media objects may be classified by stream type, application type, or algorithm type. Depending on the type of algorithm, the QoS manager defines metrics known as the codec-cycle, filter-cycle, and so on.

3b. Scheduling of Objects (Hard Deadlines)

iDSP-QoSM schedules algorithm objects with a two-phase scheduler. The first phase is a high-level scheduler that determines whether a new media stream is schedulable on the DSP and sets hard real-time deadlines for codec-cycles. The second phase schedules individual media frames and uses the hard real-time deadlines from the first phase. The first phase runs at object negotiation time and generally on the host (GPP). The second phase runs on the DSP (server) on a per-frame basis.

The first phase of scheduling is when the QoS manager determines whether, on average, the object can be supported alongside the objects already running concurrently. Consideration of sufficient memory support for the object is required as part of first-phase scheduling. An object's memory buffers for internal use, input, and output must be fixed statically at the time of its instantiation, to remove the uncertainty of dynamic memory allocation. The iDSP media platform runs only XDAIS-compliant algorithms. Developers are required to define the processing times of their algorithms under different conditions. The approximate time required to transfer data to and from the server is determined at initialization time and is factored in by the QoS manager when it sets the deadline for each object.

Each DSP object is required to supply the following information to the QoS manager:

n : the codec-cycle, in number of frames (default: frames/second)

T_acc : the average time to compute a codec-cycle, in number of target server (DSP) cycles

T_ccd : the display time of a codec-cycle, in number of target server (DSP) cycles

For video codecs, n is typically the number of frames between consecutive I-frames (e.g., 15 frames). T_acc is then typically the sum of the maximum time required for an I-frame plus the average times required for the P- and B-frames. The QoS manager keeps track of T_ccd for all media objects. This time (in DSP cycles) is based on the current frame rate. For example, for a 30 fps video stream with n = 15, T_ccd = 125 Mcycles.
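The 125 Mcycle figure can be reproduced from the frame count and frame rate if a DSP clock of 250 MHz is assumed. The clock rate is not stated in the text; it is inferred here from the later pairing of 125 Mcycles with 500 ms, and the symbols f_frame and f_clk are introduced only for this illustration.

```latex
T_{ccd} \;=\; \frac{n}{f_{\text{frame}}}\, f_{\text{clk}}
        \;=\; \frac{15\ \text{frames}}{30\ \text{frames/s}} \times 250 \times 10^{6}\ \text{cycles/s}
        \;=\; 0.5\ \text{s} \times 250 \times 10^{6}\ \text{cycles/s}
        \;=\; 125 \times 10^{6}\ \text{cycles}
```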

The QoS manager can now determine whether a new stream is schedulable as follows. Let S be the sum of the codec-cycle compute times (T_acc) of all currently scheduled streams. If (S + T_acc) for the new stream is less than T_ccd for the new stream, the stream is schedulable; otherwise it is not. For example, suppose there is an Object-A with n = 15, T_acc = 39.5 Mcycles (158 ms) and T_ccd = 125 Mcycles (500 ms), and nothing is scheduled on the DSP (so S = 0). The QoS manager is notified to schedule resources for a new stream requesting Object-A. Since S + 39.5 = 39.5 Mcycles < 125 Mcycles (500 ms), the stream can be scheduled. When a second stream requesting Object-A arrives, it is also scheduled, since S + 39.5 = 79 Mcycles (316 ms) < 125 Mcycles (500 ms). A third stream can also be scheduled. A fourth stream, however, cannot be scheduled, because the total would be 158 Mcycles (632 ms) and the hard deadline of 500 ms could no longer be met. At this point the QoS manager will negotiate a reduced frame rate for the stream or, failing that, reject the stream altogether.
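The first-phase admission test for identical streams can be checked numerically with a few lines of Python. The function names are hypothetical; the test itself compares the accumulated compute time plus the new stream's T_acc against the new stream's codec-cycle display time T_ccd, which is the comparison the worked Object-A numbers use.

```python
def schedulable(scheduled_t_acc, new_t_acc, new_t_ccd):
    """First-phase test: S + T_acc(new) must be less than T_ccd(new)."""
    s = sum(scheduled_t_acc)          # S: load of the streams already admitted
    return s + new_t_acc < new_t_ccd


def admit_streams(t_acc, t_ccd, requests):
    """Admit identical streams one by one; return how many were accepted."""
    scheduled = []
    for _ in range(requests):
        if not schedulable(scheduled, t_acc, t_ccd):
            break                     # this stream would miss the hard deadline
        scheduled.append(t_acc)
    return len(scheduled)


# Object-A: T_acc = 39.5 Mcycles, T_ccd = 125 Mcycles, four streams requested.
accepted = admit_streams(39.5, 125.0, 4)
```

With these inputs three streams are accepted (118.5 Mcycles in total) and the fourth is refused, because 158 Mcycles exceeds the 125 Mcycle codec-cycle.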

A variant allows the scheduler to handle heterogeneous media objects with differing codec-cycle times. Objects with a longer T_ccd are prorated to the smallest T_ccd. For example, suppose there is an Object-B with n = 30, T_acc = 40 Mcycles (160 ms) and T_ccd = 169 Mcycles (675 ms), and two Object-A streams (as defined above) are scheduled on the DSP (so S = 79 Mcycles, or 316 ms). The new Object-B stream can be scheduled because S + 40*(125/169) ≈ 108.6 Mcycles (316 + 160*(500/675) ≈ 435 ms), which is below 125 Mcycles (500 ms). This is clearly correct, since even without prorating 79 + 40 < 125 Mcycles (316 + 160 < 500 ms), so all the streams can in fact be completed within the shorter codec-cycle deadline of 500 ms. What happens when a second stream requesting Object-B needs to be scheduled? The total becomes 108.6 + 40*(125/169) ≈ 138 Mcycles > 125 Mcycles (435 + 160*(500/675) ≈ 554 ms > 500 ms). The scheduler therefore rejects this stream and starts negotiation as described above.
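The prorating rule for heterogeneous objects can be sketched as follows, working in milliseconds so that the figures match the millisecond values quoted for Object-A (158 ms of 500 ms) and Object-B (160 ms of 675 ms). The function names and the tuple layout `(T_acc, T_ccd)` are assumptions made for this illustration.

```python
def prorated_load(t_acc, t_ccd, base_t_ccd):
    """Scale a stream's compute time to the shortest codec-cycle in the mix."""
    return t_acc * base_t_ccd / t_ccd


def schedulable_mixed(streams, new_stream):
    """Return (fits, total_load) for the scheduled streams plus one new stream.

    Each stream is a (T_acc, T_ccd) pair in the same time unit.
    """
    candidates = streams + [new_stream]
    base = min(t_ccd for _, t_ccd in candidates)      # smallest T_ccd
    load = sum(prorated_load(a, c, base) for a, c in candidates)
    return load < base, load


OBJECT_A = (158.0, 500.0)   # ms
OBJECT_B = (160.0, 675.0)   # ms

first_b = schedulable_mixed([OBJECT_A, OBJECT_A], OBJECT_B)
second_b = schedulable_mixed([OBJECT_A, OBJECT_A, OBJECT_B], OBJECT_B)
```

The first Object-B stream fits at roughly 435 ms of the 500 ms budget; adding a second one pushes the prorated load past 500 ms, so it is refused.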

iDSP-QoSM will negotiate with the application or its proxy to reserve sufficient processing bandwidth for a media object based on its codec-cycle. This negotiation takes into account the memory required by the object, the requested QoS level, and the available MIPS of the DSP given the other concurrently running DSP applications. As the object selection changes, the QoS manager renegotiates DSP processor bandwidth. The input parameters to the QoS manager's negotiation process require the application to specify the following for an object:

(1) DSP memory requirements (number and size of input/output buffers)

(2) Desired QoS level (typically expressed in frames per second)

(3) Worst-case run time to start the object

(4) Hard real-time deadline for sequences of media frames, called the codec-cycle (number of frames and average execution time)

Second-phase scheduling of objects in the iDSP-QoS manager is based on two aspects: which object's deadline comes first and which object has the highest priority. Consider the following example, in which Object-A has a deadline at 10 ms and Object-D has a deadline at 3 ms; the iDSP QoS manager can schedule Object-D to run first even though Object-A has the higher priority. Because the approximate run time of each object is known, a "No Later" time by which the object must start in order to still meet its deadline can be determined. In Figure 3, Object-D is assumed to finish before the "No Later" start point for Object-A. In this scenario there is no deadline conflict between the higher-priority Object-A and Object-D. Object-A therefore runs after the lower-priority Object-D.

Another scheduling example weighs priority against earliest deadline, for the case in which the "No Later" time of the higher-priority Object-A falls before the predicted end time of Object-D. In this case Object-A runs first because it has the higher priority, and Object-D is allowed to run afterwards only if doing so conforms to the frame-dropping parameters specified for it at object instantiation time. See Figure 4.
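The two cases of Figures 3 and 4 reduce to a single comparison between the predicted finish time of the earlier-deadline object and the "No Later" start time of the higher-priority object. The Python sketch below illustrates that decision for two objects; the dictionary fields, the run times (2, 4, and 9 ms), and the convention that a larger number means higher priority are all assumptions, and frame-dropping policy is left out.

```python
def no_later(obj):
    """Latest start time at which the object can still meet its deadline."""
    return obj["deadline"] - obj["run_time"]


def pick_first(now, x, y):
    """Choose which of two ready objects to run first."""
    early = min((x, y), key=lambda o: o["deadline"])      # earliest deadline
    high = max((x, y), key=lambda o: o["priority"])       # highest priority
    if early is high:
        return early["name"]
    # The earlier-deadline object may go first only if it is predicted to
    # finish before the higher-priority object's "No Later" start time.
    if now + early["run_time"] <= no_later(high):
        return early["name"]
    return high["name"]


obj_d = {"name": "D", "priority": 1, "deadline": 3.0, "run_time": 2.0}
obj_a_short = {"name": "A", "priority": 2, "deadline": 10.0, "run_time": 4.0}
obj_a_long = {"name": "A", "priority": 2, "deadline": 10.0, "run_time": 9.0}
```

With the short Object-A, D finishes at 2 ms, before A's "No Later" time of 6 ms, so D runs first; with the long Object-A the "No Later" time is 1 ms, so A runs first.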

For the iDSP QoS manager to manage deadlines with the best possible efficiency, the GPP should pass data input frames to the DSP subsystem as early as possible, so that the time between a frame's arrival and its deadline is maximized. The more time there is between a data frame's arrival and its deadline, the more flexibility iDSP-QoSM has in scheduling that object with other concurrent objects.

3c. Prediction of Object Run Times (Soft Deadlines)

A central function of iDSP-QoSM is predicting the processing time required for the next input frame of every scheduled object. The QoS manager predicts an object's run time by using statistics of previous run times to compute the expected run time for the next input frame. The expected run time for an object is a function of its previous run times (determined individually for each object) together with the maximum possible positive change (also object-specific). For video objects, for example, the periodicity of I-, P-, and B-frames is deterministic; future processing times can therefore be predicted from the type of the current frame and its position within the periodicity of the video frames. Such predictions, performed across all concurrent algorithms, help to dynamically reassign priorities based on predicted processing times and approaching hard deadlines.
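One simple way to realize such a prediction is to remember the last observed run time for each frame type and add the object's maximum positive change. The text does not fix the statistic to be used, so the following Python sketch is one assumed choice; `RunTimePredictor`, the frame pattern string, and the numbers in the usage note are hypothetical.

```python
class RunTimePredictor:
    """Predict the next frame's run time from the frame-type periodicity."""

    def __init__(self, pattern, max_positive_change):
        self.pattern = pattern                  # e.g. "IPP": one I-frame, two P-frames
        self.max_change = max_positive_change   # object-specific worst-case increase
        self.last = {}                          # frame type -> last observed run time
        self.position = 0                       # index of the next frame to process

    def _next_type(self):
        return self.pattern[self.position % len(self.pattern)]

    def observe(self, run_time):
        """Record the measured run time of the frame just processed."""
        self.last[self._next_type()] = run_time
        self.position += 1

    def predict_next(self):
        """Expected run time of the next frame, or None without history."""
        previous = self.last.get(self._next_type())
        if previous is None:
            return None
        return previous + self.max_change
```

For a pattern "IPP" with a maximum positive change of 1.0, observing 10.0 for the I-frame and 4.0 for the first P-frame yields a prediction of 5.0 for the second P-frame.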

These predictions are the key enabler for managing soft deadlines and jitter in processing time. Based on the predictions, the iDSP QoSM momentarily reschedules objects for processing. This momentary rescheduling takes place within each object's codec-cycle deadline (the hard deadline, defined on average). The method is unique in that individual frames are weighed against both hard and soft deadlines. In the example above, when the workload was averaged over the 500 ms overlap with Object-A, all frames of Object-B were assumed to require the same amount of time. That may not hold when a frame of Object-B needs more time during the actual overlap, or when Object-B is not given the average amount of time. Therefore, the frame closest to its codec-cycle deadline receives the highest priority.

If the predicted run time violates a user-defined time requirement, the QoS manager takes one of several possible actions.

In a single-DSP configuration:

(Level 1) Simple binary cut-off: this causes an automatic frame drop. The object concerned must be able to indicate whether a frame drop would have destructive consequences.

(Level 2) General reduction of the allocated run time of lower-priority objects, by pre-empting the object at the end of its allocated time. This may or may not cause a frame drop.

(Level 3) Objects are required to be able to accept QoS commands, such as scaling back the quality of output data.

In a multiple-DSP configuration:

(1) At the end of each QoS time slice, each DSP sends a message carrying load data to the GPP.

(2) The GPP redistributes objects only in the event of a computational deadline failure. This task reassignment is performed by the GPP (ORB layer) after it receives the "load data" from the serving DSPs. To reduce task-switching time, however, it is highly desirable that all DSPs operate from a common cluster of external memory space.
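The multiple-DSP policy above can be sketched as follows. This is a minimal illustration only: the `LoadReport` fields, the use of a fractional load figure, and the "move to the least-loaded DSP" choice are assumptions made for the example and are not specified in the text.

```python
from dataclasses import dataclass, field

@dataclass
class LoadReport:
    """Hypothetical per-time-slice message from one DSP to the GPP."""
    dsp_id: int
    load: float                                             # fraction of the slice used (assumed metric)
    deadline_failures: list = field(default_factory=list)   # objects that missed a deadline

def rebalance(reports):
    """Return {object: (from_dsp, to_dsp)}; empty when no deadline failed."""
    loads = {r.dsp_id: r.load for r in reports}
    moves = {}
    for report in reports:
        for obj in report.deadline_failures:
            target = min(loads, key=loads.get)   # least-loaded DSP (assumed policy)
            if target != report.dsp_id:
                moves[obj] = (report.dsp_id, target)
    return moves
```

Objects are left where they are unless a deadline failure is reported, which matches the statement that redistribution happens only in that case. The sketch does not update the load figures after a move; a fuller version would.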

All objects executing within the iDSP system must be deterministic in execution time. DSP objects can be divided into three types: compression of data (encoding), decompression of data (decoding), and data conversion (pre- or post-processing of data for an object). Objects are presented with data to process in blocks; these blocks are called input data frames. An object processes an input data frame and produces an output data frame. As with any computational data, both input and output data frames are bounded in size and in the amount of processing. Based on the size of any given input frame, the maximum amount of processing that the DSP, or any other computer for that matter, will have to perform on that frame can be determined exactly.

Before it is integrated into the iDSP system, each object must declare its worst-case run time for a single frame. This worst-case run time is used to calculate the run time of the first input data frame so that the object can be started. The QoS manager cannot characterize an input data frame before the object has run. Because encoder and decoder objects rarely perform in their worst-case scenario, the allocation for the first input frame will be substantial (since it must be predicted as the worst case). This worst-case schedule is likely to reserve more time than the first frame actually takes. A problem arises only if the actual run time exceeds the worst-case schedule.

As described above, the processing time of an algorithm object varies between input frames. Initially, the iDSP QoSM starts with the worst-case value for the first data input frame. After the first frame, the QoS manager predicts the processing time of the next input frame from the characteristics of the algorithm and the measured processing time of the first frame. For each subsequent frame, the QoS manager predicts an approximate processing time based on the semantics and history of the algorithm object. For example, an encoder object uses object semantics (e.g., I, P, and B frame types) together with the average encoding time of previous similar input frames to predict future encoding time requirements. Encoder objects work on input frames of the same size every time they are scheduled to run. Variation in processing time comes from factors such as the activity level within a frame and the degree of motion between frames. These variations are bounded, however. The processing times of two frames therefore differ by at most a finite maximum, which can be added to the predicted processing time to obtain the worst-case processing time for the next frame. See FIGS. 5-6.
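A minimal sketch of this encoder prediction, under stated assumptions: the class name `RuntimePredictor`, the per-frame-type arithmetic mean, and the clamping of the result to the declared worst case are illustrative choices, not details given in the text.

```python
from collections import defaultdict

class RuntimePredictor:
    """Predicts per-frame run time from the history of similar frames."""

    def __init__(self, worst_case_ms, max_delta_ms):
        self.worst_case_ms = worst_case_ms    # declared before integration
        self.max_delta_ms = max_delta_ms      # bounded frame-to-frame variation
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def record(self, frame_type, measured_ms):
        """Store the measured processing time of a finished frame."""
        self._sum[frame_type] += measured_ms
        self._count[frame_type] += 1

    def predict(self, frame_type):
        """Expected run time: worst case until a similar frame has been measured."""
        if self._count[frame_type] == 0:
            return self.worst_case_ms
        return self._sum[frame_type] / self._count[frame_type]

    def worst_case_next(self, frame_type):
        """Prediction plus the finite maximum difference (clamping is an assumption)."""
        return min(self.predict(frame_type) + self.max_delta_ms, self.worst_case_ms)
```

With a declared worst case of 10 ms and a maximum difference of 1.5 ms, two measured "P" frames of 3 ms and 5 ms give a prediction of 4 ms and a next-frame worst case of 5.5 ms, while an unmeasured "I" frame is still scheduled at the full 10 ms.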

Decoder objects are typically presented with variable-size input frames. The processing time of an input data frame is directly proportional to its size. To determine whether the next frame's processing time will increase, the QoS manager checks the size of the difference between the current and next input frame sizes. An argument similar to the encoder's also holds for the decoder: the difference in processing between two semantically similar frames is bounded. The maximum, or worst-case, processing time for a decoder corresponds to the largest possible buffer defined for the object. See FIG. 7.
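The proportionality described for decoders can be written down directly. In this sketch the function name and the cost constant `us_per_byte` are hypothetical; the text only states that processing time scales with frame size and that the worst case corresponds to the largest buffer.

```python
def decoder_estimate(frame_bytes, us_per_byte, max_buffer_bytes):
    """Return (expected_us, worst_case_us) for one decoder input frame.

    us_per_byte is an assumed, per-object calibration constant.
    """
    if frame_bytes > max_buffer_bytes:
        raise ValueError("frame exceeds the maximum buffer defined for the object")
    expected_us = frame_bytes * us_per_byte          # time proportional to size
    worst_case_us = max_buffer_bytes * us_per_byte   # largest possible buffer
    return expected_us, worst_case_us
```

For example, a 2000-byte frame at 2 microseconds per byte with an 8000-byte maximum buffer yields an expected 4000 microseconds against a worst case of 16000 microseconds.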

Conversion objects behave like encoder objects in that they always work on input frames of the same size. Each frame always takes the same amount of processing time, and the input frame is passed over once. The processing time per input frame therefore remains constant.

Each object receives from the user application a relative time by which the frame passed to it must be completed. For example, the application may specify that a frame must be processed within the next 7 ms. Because there is no common software clock between the host GPP and the DSP, deadlines can be specified only in relative terms. The transfer of data frames between the host and the DSP is assumed to be deterministic. The iDSP system maintains an internal clock, timestamps each data frame on arrival, and then computes its expected processing time. Once the expected processing time has been computed, the QoS manager schedules the data frame for execution.
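Converting a relative deadline into a time on the DSP's own clock might look like the sketch below. The names `ScheduledFrame`, `stamp_frame`, and `slack_ms` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass
class ScheduledFrame:
    arrival_ms: float     # timestamp taken from the DSP's internal clock
    deadline_ms: float    # arrival + relative deadline, on the same clock
    expected_ms: float    # predicted processing time

    @property
    def slack_ms(self):
        """Time left over if the frame starts on arrival and runs as predicted."""
        return self.deadline_ms - self.arrival_ms - self.expected_ms

def stamp_frame(dsp_clock_ms, relative_deadline_ms, expected_ms):
    """Timestamp a frame on arrival and anchor its relative deadline locally."""
    return ScheduledFrame(dsp_clock_ms, dsp_clock_ms + relative_deadline_ms, expected_ms)
```

A frame arriving at local time 1000 ms with a 7 ms relative deadline and a 4.5 ms prediction has a local deadline of 1007 ms and 2.5 ms of slack, with no reference to the host's clock.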

Before an object can be scheduled, the QoS manager determines the proper order of execution of the object relative to other concurrent objects. If no other object is processing an input frame, the object's frame is scheduled for immediate execution. If other objects are running, the QoS manager determines the execution order by considering the priority, expected deadline, and hard or soft real-time requirement of each requested object. See FIG. 8.

When multiple objects with different run-time priorities are combined on the same DSP, the QoS manager computes an expected run time for each object based on that object's specific run-time calculation. It then schedules the different tasks based on the scheduling object (TBD). Three scheduling scenarios are possible:

(1) All objects run to completion on their given input data frames and finish within their application-specified deadlines. This scenario is shown in FIG. 9; the important point is that every object in the figure completes before its deadline. If all objects complete before their respective deadlines, the work required of the QoS manager is minimal.

(2) The processing load increases for one or more objects (e.g., Object-B) but does not cause the expected deadline of the next object to be missed. Depending on the object, missing a deadline may be acceptable as long as successive data frames of the same object are processed within their deadlines. One example is an H.263 encoder, in which "I" frames take the longest to compute. The frame following an "I" frame is always a "P" frame, which generally has a much smaller processing requirement. This allows "I" frame processing to steal cycles from the following "P" frame processing. Missing the deadline on one frame is therefore not catastrophic if the next frame has enough processing headroom.

If the deadline for Object-B passes, the effect on the overall system must be determined. If Object-B's missed deadline does not cause the following objects to miss their expected deadlines, the overall system risk is minimal. See FIGS. 10-11.

(3) The processing load increases for one or more objects (e.g., Object-B), and this causes the expected deadline of the next object to be missed. See FIG. 12.

In this case, Object-B's missed deadline causes the following objects to miss their expected deadlines. Even so, the overall system risk may or may not be minimal. Each of the concurrently running objects may be able to steal cycles from its subsequent frames and thereby avoid a domino effect of missed deadlines.
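The three scenarios can be distinguished with a simple run-to-completion simulation. This is a simplified sketch: it assumes all frames are available at time zero and run back to back in the given order, and it treats more than one missed deadline as the knock-on case; the function name `classify` is hypothetical.

```python
def classify(jobs):
    """jobs: list of (name, run_ms, deadline_ms) in execution order.

    Returns (scenario, missed) where scenario is 1, 2, or 3 as numbered
    in the text and missed lists the objects that overran their deadline.
    """
    t = 0.0
    missed = []
    for name, run_ms, deadline_ms in jobs:
        t += run_ms                      # each object runs to completion
        if t > deadline_ms:
            missed.append(name)
    if not missed:
        return 1, missed                 # everything on time
    if len(missed) == 1:
        return 2, missed                 # isolated miss, no knock-on effect
    return 3, missed                     # a miss pushed later objects past their deadlines
```

For instance, `classify([("A", 2, 3), ("B", 5, 6), ("C", 2, 8)])` reports scenario 3, because Object-B finishing at 7 ms leaves Object-C finishing at 9 ms, past its 8 ms deadline.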

The iDSP QoSM provides a set of rules for handling soft deadlines. The rule set is designed to limit the snowballing of missed deadlines that can follow a single critical missed deadline.

(1) Every algorithm object provides the QoS manager with the maximum number of frame drops per second that it allows.

(2) After each processing period, each object updates a running count of its missed deadlines as a moving average.

(3) When an object exceeds its missed-deadline limit, its priority is changed to the highest value. The original priority is restored once the count falls back below the limit.

(4) All subsequent frames that miss their deadline after the limit has been exceeded are dropped. This temporarily degrades the QoS to the next lower level, and the momentary drop in QoS is reported to the client.

(5) As a rule, a frame is dropped only if the DSP has not started the object in question by the time the frame's deadline has passed.
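Rules (2) and (3) of this rule set can be illustrated with a small tracker. The sketch makes several assumptions not fixed by the text: an exponential moving average with smoothing factor `alpha` stands in for the moving average, the limit is expressed as a miss rate rather than drops per second, and priority 0 is taken to be the highest.

```python
class MissTracker:
    HIGHEST = 0  # assumed convention: 0 is the highest priority

    def __init__(self, priority, max_miss_rate, alpha=0.5):
        self.base_priority = priority
        self.priority = priority
        self.max_miss_rate = max_miss_rate   # stands in for the declared drop limit
        self.alpha = alpha                   # smoothing factor (assumption)
        self.miss_rate = 0.0

    def update(self, missed):
        """Call after each processing period.

        Returns True while the limit is exceeded, i.e. while further
        late frames of this object should be dropped.
        """
        sample = 1.0 if missed else 0.0
        self.miss_rate = self.alpha * sample + (1 - self.alpha) * self.miss_rate
        over = self.miss_rate > self.max_miss_rate
        # Boost to the highest priority while over the limit, restore afterwards.
        self.priority = self.HIGHEST if over else self.base_priority
        return over
```

With a limit of 0.6 and `alpha` of 0.5, one miss leaves the object at its original priority, a second consecutive miss raises it to the highest priority, and one on-time frame restores the original priority.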

3d. Throttle control for periodic media rendering

For a given algorithm object, the iDSP QoSM assumes that at most one request is in the wait queue at any instant. Media streams generally have a periodic deadline (e.g., 30 frames/second for a video stream) that is specified to the QoS manager as the required quality of service. The audio and video rendering components of a media system can buffer frames to accommodate variations in arrival time, allowing frames to arrive slightly ahead of schedule. This buffering is finite, however, and the upstream components of the media system must carefully throttle the relative rate at which frames are processed.

The iDSP QoSM provides two mechanisms for throttling the processing rate of algorithm objects.

(1) The client of the DSP algorithm object controls the rate at which it calls the processing function of the algorithm object (the server). This can lead to sub-optimal behavior of the QoS manager's scheduling algorithm, because each request is made only within the time period in which it must be completed. For example, consider algorithm object A, whose buffer A1 must be processed within period T1 and whose buffer A2 must be processed within period T2. In the figure, T1 and T2 are two consecutive periods, [x] denotes the arrival of buffer x, and {x} denotes completion of the processing of buffer x. See FIG. 13a.

(2) The QoS manager controls the throttling of the media stream. With this mechanism the client calls the processing function of the algorithm object with an input buffer as early as possible. The QoS manager attaches a "start deadline" to the input buffer, and the scheduler does not schedule the buffer until after its start deadline. The client blocks until processing of the current buffer is complete. See FIG. 13b.

Thus, in both cases, the QoS manager's ready queue holds at most one request per algorithm object at any instant.
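The second mechanism and the one-request-per-object invariant can be sketched as follows. The class `ThrottledQueue` and its method names are hypothetical, and the blocking of the client is modeled simply by rejecting a second submission while one is pending.

```python
class ThrottledQueue:
    """Ready queue holding at most one pending request per algorithm object."""

    def __init__(self):
        self._pending = {}   # object name -> (start_deadline_ms, buffer)

    def submit(self, obj, buffer, start_deadline_ms):
        """Queue a buffer; the client would block until complete() in a real system."""
        if obj in self._pending:
            raise RuntimeError(f"{obj} already has a request queued")
        self._pending[obj] = (start_deadline_ms, buffer)

    def runnable(self, now_ms):
        """Objects whose start deadline has passed, earliest start deadline first."""
        ready = [(start, obj) for obj, (start, _) in self._pending.items() if start <= now_ms]
        return [obj for _, obj in sorted(ready)]

    def complete(self, obj):
        """Finish processing and release the object's slot; returns the buffer."""
        return self._pending.pop(obj)[1]
```

A buffer submitted early with a start deadline of 33 ms is invisible to the scheduler at 10 ms and becomes runnable at 33 ms, which is how early submission is prevented from speeding up the stream.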

4. Memory paging

To run multiple algorithms optimally on a DSP, or on any processor for that matter, a set of rules must be established so that system resources are shared fairly among the algorithms. These rules specify access to processor peripherals such as DMA, access to internal memory, and the scheduling method for the algorithms. Once a set of rules has been accepted, a system interface can be developed into which algorithms plug in order to access system resources. A common system interface gives algorithm developers well-defined boundaries, allowing them to develop algorithms faster because they can concentrate solely on algorithm development and not on system support. An example of such an interface is the Texas Instruments iDSP Media Platform DSP framework. All access between the algorithms and the TMS320C62XX DSP takes place through this framework.

The Texas Instruments XDAIS standard establishes rules that allow one or more algorithms to be plugged into the iDSP Media Platform and let system integrators quickly assemble production-quality systems from one or more algorithms. The XDAIS standard requires algorithms to conform to a common interface requirement called the Alg interface. The standard imposes a number of rules, the most notable being that an algorithm may not define memory directly or access hardware peripherals directly. System services are provided through a single common interface for all algorithms. The system integrator therefore provides only a DSP framework that supports the Alg interface for all algorithms. The Alg interface also gives algorithm developers a means of accessing system services and of invoking their algorithms.

Algorithms must define their internal memory requirements exactly. This is necessary for a paging architecture that supports multiple algorithms accessing the same space in internal memory. XDAIS-compliant algorithms are required to specify their internal and external memory requirements.

Internal (on-chip) memory is divided into two regions. The first is the system overhead region, which holds the OS data structures for the particular DSP system configuration. The second region is used by an algorithm only when it is scheduled to execute. This second region is called the algorithm on-chip workspace; it may also be described as a data overlay or data memory page. See FIG. 14.

To determine how much memory is available for the algorithm on-chip workspace, the system developer takes the total amount of available internal data memory and subtracts the amount needed to support system software, such as OS support and data support for the paging architecture. OS structures such as tasks and semaphores should be capped by the DSP system designer at a maximum size that supports the total number of algorithms the designer wants to run concurrently. This keeps the OS support overhead to a minimum and increases the algorithm workspace.

For an algorithm to run in this environment, its internal memory requirement must be smaller than the size of the workspace. Otherwise the system integrator cannot integrate the algorithm, because there is a limit of exactly one page per algorithm. This architecture does not support multiple pages for an algorithm.
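The workspace budget and the one-page fit check reduce to simple arithmetic, sketched here. The function names and all byte counts in the usage note are hypothetical; real figures depend on the particular DSP and OS configuration.

```python
def workspace_size(total_internal, os_overhead, paging_overhead):
    """On-chip workspace = internal data memory minus system software needs (bytes)."""
    size = total_internal - os_overhead - paging_overhead
    if size <= 0:
        raise ValueError("no internal memory left for the algorithm workspace")
    return size

def fits_one_page(algo_internal_req, workspace):
    """An algorithm is integrable only if its requirement is smaller than the single page."""
    return algo_internal_req < workspace
```

For example, with 64 KB of internal data memory, 8 KB of OS structures, and 2 KB of paging support, the workspace is 54 KB; an algorithm needing 48 KB can be integrated, while one needing 60 KB cannot, since multiple pages per algorithm are not supported.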

The algorithm workspace is divided into three components: stack (mandatory), persistent memory, and non-persistent memory. Sometimes there is a fourth component, described below, which handles only the read-only portion of persistent memory. See FIG. 15.

An algorithm uses the on-chip workspace only while it is executing. When an algorithm is scheduled to execute, the DSP system software moves the algorithm's workspace from its external storage location (shadow storage) into the internal on-chip workspace. When the algorithm yields control, the DSP system software determines which algorithm will run next; if it is the same algorithm, the workspace does not need to be moved. If the next algorithm is a different one, the current workspace is saved to its shadow location in external memory and the next algorithm's workspace is moved in. See FIG. 16.

The entire workspace of an algorithm is not transferred on a context switch; only the used portion of the stack and the persistent data memory are moved. An algorithm's stack is at its highest level (least used) when the algorithm is at the top level of its call stack, in other words, when the algorithm is at its entry point.

The ideal context switch for an algorithm occurs when its stack is at its highest level, because that means the least data has to be moved off-chip to shadow storage. See FIG. 17.

The data page architecture of the preferred embodiments requires context switches to be as efficient as possible. Context-switch processing overhead takes away from the time the DSP could spend executing algorithms. Because the best time for a context switch is when an algorithm is at a call boundary, pre-emption of algorithms must be kept to a minimum. Pre-empting an algorithm while its stack is larger than the minimum degrades the overall system. Pre-emption is allowed when necessary, but on a very limited basis. See FIGS. 18-19.

A special case in the algorithm workspace arises when an algorithm requires read-only persistent memory. This kind of memory is used for look-up tables used by the algorithm. Because this memory is never modified, it only needs to be read into the workspace, never written back. This asymmetric page transfer reduces the overhead of an algorithm's context switch.
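The cost of a context switch under this paging scheme can be estimated by counting the words moved between the on-chip workspace and shadow storage. The sketch below is an illustrative model, not the framework's implementation: the `Workspace` fields and the function `switch_cost` are hypothetical, and non-persistent memory is assumed never to be transferred.

```python
from dataclasses import dataclass

@dataclass
class Workspace:
    name: str                 # identifies one algorithm instance
    stack_used: int           # words of stack in use at the switch
    persistent_rw: int        # writable persistent words
    persistent_ro: int = 0    # read-only persistent words (look-up tables)

def switch_cost(current, nxt):
    """Words moved across the on-chip boundary for one context switch."""
    if current is not None and current.name == nxt.name:
        return 0              # same algorithm runs again: nothing is moved
    # Page out: read-only data is never written back (asymmetric transfer).
    page_out = 0 if current is None else current.stack_used + current.persistent_rw
    # Page in: everything the next algorithm needs, including its tables.
    page_in = nxt.stack_used + nxt.persistent_rw + nxt.persistent_ro
    return page_out + page_in
```

The model shows why a switch at a call boundary is cheapest: `stack_used` is smallest there, and it is the only term that depends on where the algorithm is interrupted.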

This data paging architecture allows a single algorithm to be instantiated more than once. Because an algorithm specifies what it needs in terms of internal memory, the DSP system integrator can create more than one instance of the same algorithm. The DSP system software keeps track of the multiple instances and of when to schedule each instance of the algorithm. The number of instances is limited by how much external memory the DSP system has for holding the shadow versions of the algorithm instances.

The DSP system software must manage each instance so that it is matched correctly to its algorithm data when the algorithm is scheduled. Because most DSP algorithms are instantiated as tasks, the DSP system software can use the task environment pointer as a means of managing algorithm instances.

5. Chained data flow

The preferred embodiments of the data flow rely on integrating the processing elements, giving them a shared memory space, and routing data directly between processing elements without intervention by the GPP. Such a system is shown in FIG. 21.

When processing element PE_a has finished processing a set of data, it writes the resulting data to a predefined output buffer in shared memory. PE_a then notifies the next processing element in the chain, PE_b, via the appropriate control path. The notification indicates which shared-memory buffer PE_b should use as input. PE_b then reads the data from that input buffer for further processing. In this way data is passed between all processing elements until all the data has been consumed.
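The hand-off between PE_a and PE_b can be modeled in a few lines. This is a single-threaded sketch under stated assumptions: a dictionary stands in for external shared memory, a deque stands in for the control path, and the class and buffer names are hypothetical.

```python
from collections import deque

shared_memory = {}   # buffer id -> data (stands in for external shared memory)

class ProcessingElement:
    def __init__(self, name, work, out_buffer=None, successor=None):
        self.name = name
        self.work = work                # function applied to the input data
        self.out_buffer = out_buffer    # predefined output buffer id
        self.successor = successor      # next element in the chain, if any
        self.inbox = deque()            # control path: ids of buffers ready to read

    def notify(self, buffer_id):
        """Control message: tells this element which buffer to use as input."""
        self.inbox.append(buffer_id)

    def step(self):
        """Process one notified buffer; only the buffer id travels on the control path."""
        data = shared_memory[self.inbox.popleft()]
        result = self.work(data)
        if self.successor is None:
            return result               # terminal element consumes the data
        shared_memory[self.out_buffer] = result
        self.successor.notify(self.out_buffer)
        return None
```

A two-element chain is built by creating PE_b first and passing it as the successor of PE_a. Note that the payload is written once to shared memory and never copied through the notifying side, which is the property the text attributes to this data flow.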

As described above, a set of buffers is used to communicate data between two processing elements and constitutes an I/O channel between those elements. Multiple I/O channels may exist between any two processing elements, allowing multiple data streams to be processed by the system simultaneously (i.e., in parallel). FIG. 22 shows an example of parallel processing of multiple data streams (s1 and s2).

A series of processing elements connected by I/O channels constitutes a channel chain. Several channel chains can be defined within a particular system. For a mid-chain processing element, each input channel has an associated output channel. Terminal processing elements have only an input channel or only an output channel.

The input channel of a processing element defines the buffers from which data is read. The output channel of a processing element defines the buffers to which data is written, as well as the processing element to notify afterwards. The types of control messages between the data processing elements and the central control processor (CCP) are as follows.

(1) State messages: data stream processing started, stopped, paused, aborted, resumed, etc.

(2) Quality-of-service messages: time stamps, system load, resource free/busy, etc.

(3) Data stream control messages: start, stop, pause, resume, rewind, etc.

(4) System load messages: tasks running, number of active channels, channels per processing element, etc.

In the preferred embodiments, the creation of I/O channels and their association with processing elements are defined statically through a configuration file that is read at system initialization time. For each type of bitstream to be processed, the configuration file defines a channel chain (i.e., a data path) connecting the appropriate processing elements. The collective processing of all processing elements in a channel chain results in the data being completely consumed.

Where multiple data paths exist for a given bitstream, alternate or backup channel chains can be defined. A bitstream can be routed to one of these when any processing element of the primary channel chain is unavailable. Determination of the bitstream type at run time, together with dynamic QoS analysis, selects the channel chain through which the data is routed. At run time, all legal channel chains in the system are fixed and cannot be modified.
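Selecting between a primary and a backup channel chain from a static configuration can be sketched as below. The `CHAINS` table, its element names, and the availability set are hypothetical stand-ins for whatever the configuration file and QoS analysis would supply.

```python
# Hypothetical static configuration, fixed at initialization time:
# for each bitstream type, the primary chain first, then any backups.
CHAINS = {
    "h263": [
        ["io_dsp", "media_dsp", "overlay"],      # primary channel chain
        ["io_dsp", "media_dsp_2", "overlay"],    # backup channel chain
    ],
}

def select_chain(bitstream_type, available):
    """Return the first configured chain whose elements are all available, else None."""
    for chain in CHAINS.get(bitstream_type, []):
        if all(pe in available for pe in chain):
            return chain
    return None
```

The table itself is never modified at run time; only the choice among its predefined entries changes, which matches the statement that the legal channel chains are fixed.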

In other preferred embodiments, channel chains for different bitstreams can be constructed dynamically when a new bitstream arrives at the communications processor. Bitstream information derived at run time is sent in a control message to the CCP, which determines the processing elements required and dynamically allocates I/O channels between them. This approach lets resources go out of service or come online at run time, with the system adapting automatically.

In a shared memory heterogeneous system, data flows between processing elements through external shared memory without intervention by the CCP. The data never appears on the bus, so the rate of data processing is determined by shared memory access time rather than bus transfer time. Because CCP intervention is minimized, CCP response and processing delays are removed from the overall data flow time. This minimizes the data transfer time between processing elements and thereby increases system throughput.

5a. Example

A representative use of the data flow technique described herein is in a media processing system. Such a system initiates and controls streams of broadband media for processing such as decoding, encoding, translation, conversion, and scaling. It can process media streams originating from a local disk or from a remote machine or server via communication media such as cable modem, DSL, or wireless. FIG. 23 shows an example of such a system.

The media processing system of FIG. 23 has the following five processing elements:

(1) DSL or cable modem I/O front-end DSP

(2) media processing DSP

(3) video/graphics overlay processor

(4) H.263 decoder task

(5) color space converter task

An H.263 stream entering the front-end I/O DSP follows the channel chain defined by the numbered arcs 1 through 3. Each channel connects two processing elements and consists of a set of I/O buffers used to transfer data between the elements. Control flow is shown by the shaded arcs.

The H.263 stream flows from the I/O front-end DSP into the channel 1 I/O buffers defined in global shared memory. The I/O front-end DSP notifies the destination processing element associated with channel 1, namely the H.263 decoder task on the media processing DSP, that an input buffer is full and ready to be read. The H.263 decoder task reads the data from the channel 1 I/O buffer, decodes it, and writes the resulting YUV data to the channel 2 I/O buffer in local shared memory.
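The write, notify, and read sequence on a channel can be illustrated with the sketch below. It is a simplification under stated assumptions: the class `Channel` is hypothetical, it holds a single buffer rather than a set of buffers, a `threading.Event` stands in for the "buffer full" notification, and `fake_decode` is a placeholder rather than an H.263 decoder.

```python
import threading


class Channel:
    """One I/O buffer shared by a source element and a destination element."""

    def __init__(self, size):
        self.buffer = bytearray(size)
        self.length = 0
        self.full = threading.Event()  # set when the buffer is ready to be read

    def write(self, data):
        if self.full.is_set():
            raise BufferError("previous contents not yet consumed")
        if len(data) > len(self.buffer):
            raise ValueError("data larger than channel buffer")
        self.buffer[:len(data)] = data
        self.length = len(data)
        self.full.set()  # notify the destination element

    def read(self, timeout=None):
        if not self.full.wait(timeout):
            raise TimeoutError("no data available on channel")
        data = bytes(self.buffer[:self.length])
        self.full.clear()  # buffer may be refilled by the source
        return data


def fake_decode(payload):
    """Placeholder for the decoder task: reverses the bytes."""
    return payload[::-1]


# Channel 1 feeds the decoder task; channel 2 carries its output onward.
channel_1 = Channel(64)
channel_2 = Channel(64)
channel_1.write(b"stream")
channel_2.write(fake_decode(channel_1.read()))
```

In this sketch the payload moves only through the channel buffers; no controlling component copies or forwards it, which is the property the text attributes to the shared memory approach.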

Note that a channel may be interprocessor or intraprocessor. Data can be transferred between processors through global shared memory (interprocessor) or between tasks on one processor through local shared memory restricted to that processor (intraprocessor). In FIG. 4, channels 1 and 3 are interprocessor and channel 2 is intraprocessor.
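The interprocessor and intraprocessor distinction can be expressed as a simple lookup, sketched here. The placement table, the element names, and the assumed endpoints of channels 2 and 3 (decoder to color space converter, then converter to overlay processor) are illustrative assumptions; the text states only which channels are interprocessor and which is intraprocessor.

```python
# Hypothetical placement of processing elements on processors.
PROCESSOR_OF = {
    "io_frontend": "io_dsp",
    "h263_decoder": "media_dsp",
    "color_space_converter": "media_dsp",
    "overlay": "overlay_processor",
}

# Assumed channel chain for the H.263 example (channels 1, 2, 3).
H263_CHAIN = [
    ("io_frontend", "h263_decoder"),
    ("h263_decoder", "color_space_converter"),
    ("color_space_converter", "overlay"),
]


def memory_for(channel):
    """Pick the shared memory region for a channel's I/O buffers."""
    source, destination = channel
    if PROCESSOR_OF[source] == PROCESSOR_OF[destination]:
        return "local"   # intraprocessor: both endpoints on one processor
    return "global"      # interprocessor: endpoints on different processors


placement = [memory_for(channel) for channel in H263_CHAIN]
```

With these assumed placements, channels 1 and 3 resolve to global shared memory and channel 2 to local shared memory, matching the classification given in the text.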

6. Modifications

The preferred embodiments may be modified in various ways within the scope of the appended claims.

According to the present invention, a client-server system performs two-level server task scheduling in which client-level deadline information is used in the second-level scheduling of subtasks on the server. The object broker of the system collapses client request calls and returns so that data is kept on the coprocessor, and a server memory management method for multitasking, with data flow through the shared memory of multiple coprocessors, avoids congestion on the main processor bus.

Claims

(a) a first step of scheduling on a client to set a real-time deadline for a task on a server coupled to the client; and

(b) a second step of scheduling subtasks of the task on the server;

wherein the second scheduling step uses the real-time deadline of step (a).

The method of claim 1, wherein:

(a) the task comprises decoding a media stream; and

(b) the subtask comprises decoding a frame of the media stream.

An object request broker method for a client-server system, comprising:

(a) collapsing a first client request return and a second client request call; and

(b) connecting the output of a first server object to the input of a second server object, wherein the first server object and the second server object correspond to the first and second client requests, respectively.

The method of claim 3, wherein:

(a) the connecting is by creation at the server of a buffer for the intermediate result (the output of the first server object and the input of the second server object).

(a) allocating a first portion of processor memory to processor overhead; and

(b) allocating a second portion of said processor memory to a task workspace;

wherein the second portion can be occupied by only one task at a time.

The method of claim 5, wherein:

(a) the second portion of the memory comprises a stack component, a persistent memory component, and a non-persistent memory component.

A data flow method in a heterogeneous system wherein a bus is connected to each of a control processor and a plurality of processing elements, comprising:

(a) transferring data between the processing elements using a common memory separate from the bus.