CN1405679A - Multi-processor object control - Google Patents
- Publication number: CN1405679A (CN 01119611 A)
- Authority: CN (China)
- Prior art keywords: server, dsporb, client, data, algorithm
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Computer And Data Communications
Abstract
A client-server system schedules server tasks in two phases; information obtained in the first phase about client time limits is used in the second phase to schedule the server subtasks. An object request broker in the system keeps data in the coprocessors by decomposing the client's call-and-return requests. In addition, a memory shared among multiple coprocessors manages the memory and data flow of the multitasking server so as to avoid congesting the main processor.
Description
This application claims priority from provisional application Serial Nos. 60/199,753; 60/199,755; 60/199,917; and 60/199,754, all filed 04/26/2000.
The present invention relates to electronic devices, and more particularly to distributed objects and related methods for multiprocessors and digital signal processors.
The growth of Internet connectivity and high-speed network access has pushed distributed computing into the mainstream. The Common Object Request Broker Architecture (CORBA) and Distributed Component Object Model (DCOM) standards arose to simplify object-oriented network programming and component software. A client application can thus access a remote server object that provides data or functions, which simplifies application programming; Figure 24 illustrates a generic remote procedure call structure. In effect, object-oriented programming encapsulates the details and exposes only an object interface for querying or interacting with other objects in the distributed computation.
The core of CORBA is the Object Request Broker (ORB), which provides a "bus" for interaction among objects, whether local or remote. A CORBA object is an interface plus a set of methods. To invoke a method, a client of a CORBA object uses an object reference as if the object were located in the client's address space. The ORB is responsible for finding the implementation of an object (possibly on a remote server), preparing the object to receive a call request from a client application, transporting the request (for example, the parameters) from the client to the object, and returning any reply from the object to the client. The object implementation interacts with the ORB through an ORB interface or an object adapter. Figure 25 shows the overall CORBA architecture.
In object-oriented programming generally, an interface definition language (IDL) defines an object interface consisting of the methods that clients call, while the details (data, implementation) stay hidden. Typically, an IDL provides data encapsulation, polymorphism, and inheritance. As shown in Figure 24, a client calling a function of an object first makes a call to the client stub (proxy); the stub marshals the call parameters into a message; a wire protocol sends the message to the server stub (skeleton); the server stub unmarshals the call parameters from the message and calls the function of the object. The top layer of Figure 25 is the basic programming architecture, the middle layer is the remoting architecture, and the bottom layer is the wire protocol architecture. Developers of client programs and server object programs work with the basic programming architecture, which deals with object references and whatever is meaningful within the client and server processes. The wire protocol effectively extends the remoting architecture across different hardware devices.
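The stub and skeleton round trip described above can be sketched in C. This is only a minimal illustration under stated assumptions: the message layout (a one-byte method id followed by two 32-bit big-endian integers) is invented for the example and is not the CORBA wire format, and every name (`stub_marshal`, `skeleton_dispatch`, `server_add`, `remote_add`) is hypothetical. The wire transfer is simulated by handing the message buffer directly to the skeleton.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message: 1-byte method id + two 32-bit big-endian ints. */
enum { MSG_SIZE = 9, METHOD_ADD = 1 };

void put_u32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

uint32_t get_u32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The server object's function, which the skeleton ultimately calls. */
int32_t server_add(int32_t a, int32_t b) { return a + b; }

/* Client stub: marshal the call parameters into a message. */
void stub_marshal(uint8_t msg[MSG_SIZE], int32_t a, int32_t b) {
    msg[0] = METHOD_ADD;
    put_u32(msg + 1, (uint32_t)a);
    put_u32(msg + 5, (uint32_t)b);
}

/* Server skeleton: unmarshal the parameters and call the object.
   Returns 0 on success, -1 for an unknown method id. */
int skeleton_dispatch(const uint8_t msg[MSG_SIZE], int32_t *result) {
    assert(result != NULL);
    if (msg[0] != METHOD_ADD) return -1;
    *result = server_add((int32_t)get_u32(msg + 1), (int32_t)get_u32(msg + 5));
    return 0;
}

/* What the client sees: an ordinary function call. */
int32_t remote_add(int32_t a, int32_t b) {
    uint8_t msg[MSG_SIZE];
    int32_t r = 0;
    stub_marshal(msg, a, b);
    (void)skeleton_dispatch(msg, &r);
    return r;
}
```

A caller would simply write `remote_add(2, 3)`; the marshaling and dispatch stay hidden behind the stub, which is the point of the layering in Figure 25.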
As described by Chung et al., "DCOM and CORBA Side by Side, Step by Step, and Layer by Layer," a simple application with a remote object, allowing for a CORBA client and server, can be produced from five files: (1) An IDL file that defines the object interface. The IDL compiler generates the client stub and object skeleton code plus an interface header file used by both client and server. (2) An implementation header file that derives the server implementation class from the object interface. The implementation class is in fact related (by inheritance) to the interface class generated by the IDL compiler. (3) The implementation of the methods of the server class. (4) A main program for the server; this program instantiates an instance (object) of the server class. (5) A client application that invokes the object methods by calling the client stub.
For static object invocation, after compilation and before execution, CORBA registers the association between the interface name and the path name of the server executable in the implementation repository (see Figure 25). For dynamic object invocation, the IDL compiler also generates type information for each method in the interface, and a method on a dynamic object is invoked through the dynamic invocation interface. Likewise, on the server side, the dynamic skeleton interface allows a client to invoke an operation on an object that has no compile-time knowledge of the type of the object it implements.
Figure 26a shows the top-layer CORBA activity of a client requesting an object and invoking its methods, and of the server creating an object instance and making it available to the client. In particular, object activation proceeds as follows: (1) The client calls the client's static function for the object interface. (2) The ORB starts the server containing an object that supports the object interface. (3) The server instantiates the object and registers the object reference. (4) The ORB returns the object reference to the client application. Method invocation then proceeds as follows: [1] The client calls a method on the object interface, which eventually invokes the method in the server. [2] If the method returns values, the server sends these values back to the client.
Figure 26b illustrates the middle-layer (remoting architecture) activity of CORBA. Object activation: (1) Upon receiving the call, the client stub delegates the task to the ORB. (2) The ORB consults the implementation repository to map the call to its server path name, and activates the server program. (3) The server instantiates the object and creates a unique reference ID to obtain an object reference. It registers the object reference with the ORB. (4) The constructor for the server class also creates an instance of the skeleton class. (5) The ORB sends the object reference to the client, creates an instance of the client stub class, and registers it in the client stub object table with the corresponding object reference. (6) The client stub returns an object reference to the client. Subsequent method invocation by the client proceeds as follows: [1] Upon receiving the call, the client stub creates a request pseudo-object, marshals the call parameters into the pseudo-object, requests that the pseudo-object be translated into a message and sent over the channel to the server, and waits for the reply. [2] When the message arrives at the server, the ORB finds the target skeleton, reconstructs the request pseudo-object, and forwards it to the skeleton. [3] The skeleton unmarshals the parameters from the request pseudo-object, invokes the method of the server object, marshals the return values (if any), and returns from the skeleton method. The ORB builds a reply message and places it in the transmit buffer. [4] When the reply arrives at the client side, the ORB call returns after reading the reply message from the receive buffer. The client stub then unmarshals the return values and returns them to the client to complete the call.
As illustrated in Figure 26c, bottom-layer object activation comprises: (1) Upon receiving the request, the client-side ORB selects a machine that supports the object and sends a request over TCP/IP to the server-side ORB. (2) When the server is started by the server-side ORB, an object is instantiated by the server, the ORB constructor is called, and a create function is invoked. Inside the create function a socket endpoint is created, the object is assigned an object identity, and an object reference is created that contains the interface and implementation names, the reference identity, and the endpoint address. The object reference is registered with the ORB. (3) When the object reference is returned to the client side, the client stub extracts the endpoint address and establishes a socket connection to the server. Method invocation then proceeds as follows: [1] Upon receiving the call, the client stub marshals the parameters in Common Data Representation (CDR) format. [2] The request is sent to the target server over the established socket connection. [3] The target skeleton is identified by the reference identity or the interface instance identifier. [4] After the actual method of the server object has been invoked, the skeleton marshals the return values in CDR format.
Real-time extensions of CORBA typically provide quality of service (QoS) aspects such as predictable performance, secure operation, and resource allocation. See, for example, Gill et al., "Applying Adaptive Middleware to Manage End-to-End QoS for Next-Generation Distributed Applications."
CORBA components have been introduced as first-class elements of CORBA, with an associated component implementation definition language for describing implementations. Figure 27 illustrates the programming steps.
DCOM likewise has three layers and is somewhat similar to the CORBA architecture.
Notenboom, USP 5,748,468, and Equator Technologies, PCT published application WO 99/12097, each describe methods of allocating processor resources among multiple tasks. Notenboom considers a main processor plus a coprocessor, with coprocessor resources allocated to tasks according to a priority system. Equator Technologies schedules tasks according to their consumption of processor resources: each task has at least one supported service level (processor resource consumption rate), and a resource manager admits a task only if sufficient resources exist for one of its supported service levels.
Systems with two or more processors, each with its own operating system or BIOS, range from systems of widely separated processors connected over the Internet to systems with two or more processors integrated on the same semiconductor die, such as a RISC CPU plus one or more DSPs.
The XDAIS standard specifies an interface for algorithms on DSPs; this provides reusable objects. XDAIS requires that an algorithm implement the standard IALG interface plus an extension for running the algorithm. XDAIS also requires compliance with certain flexibility rules, such as relocatable code and naming conventions. A client application can manage an instance of an algorithm by calling through a table of function pointers. With the XDAIS standard and guidelines, algorithm developers can develop or convert an algorithm so that it plugs easily into a DSP application framework, such as the iDSP Media Platform DSP framework.
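Managing an algorithm instance through a table of function pointers can be sketched as follows. This is a simplified illustration only: the table `alg_fxns`, its three members, and the gain algorithm are hypothetical, and the real XDAIS IALG table has different members and signatures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified function table (not the real IALG definition). */
typedef struct alg_fxns {
    void *(*create)(int param);
    int   (*process)(void *inst, const int *in, int *out, int n);
    void  (*destroy)(void *inst);
} alg_fxns;

/* One algorithm implementing the table: multiply each sample by a gain. */
typedef struct { int gain; } gain_inst;

void *gain_create(int gain) {
    gain_inst *g = malloc(sizeof *g);
    if (g) g->gain = gain;
    return g;
}

int gain_process(void *inst, const int *in, int *out, int n) {
    const gain_inst *g = inst;
    for (int i = 0; i < n; i++) out[i] = in[i] * g->gain;
    return n;  /* number of samples processed */
}

void gain_destroy(void *inst) { free(inst); }

const alg_fxns GAIN_FXNS = { gain_create, gain_process, gain_destroy };

/* Client code: knows only the table, not the algorithm behind it. */
int run_algorithm(const alg_fxns *fxns, int param,
                  const int *in, int *out, int n) {
    assert(fxns != NULL);
    void *inst = fxns->create(param);
    if (!inst) return -1;
    int done = fxns->process(inst, in, out, n);
    fxns->destroy(inst);
    return done;
}
```

Because `run_algorithm` touches the algorithm only through the table, a different algorithm can be plugged in by passing a different table, which is the reuse property the standard is after.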
The need for a quality of service (QoS) manager in a network node clearly arises from the real-time service requirements of streaming-media-based applications. Streaming media applications must handle heterogeneous codecs (encoder/decoders) and filters with unique rendering deadlines. These applications should also be able to exploit the characteristics of human perception in order to degrade quality of service gracefully. They should tolerate reasonable amounts of delay and jitter in the rendering period. For example, in a video application the rendering frame rate must be maintained at 30 frames/sec (fps), i.e., a rendering period of 33 ms per frame. However, the application should be able to tolerate limited transient variations negotiated with the server. Moreover, at 30 fps, human visual perception can tolerate a loss of about 6 frames/sec. Client applications should also support graceful degradation of performance and maintain a steady state of rendering within a tolerance range negotiated with the server. A QoS manager provides the functions and mechanisms needed to realize such a real-time system.
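The arithmetic behind these figures can be made concrete with a short C sketch. The function names and the idea of expressing the negotiated tolerance as a single jitter bound in microseconds are assumptions made for illustration; the text does not specify how the tolerance is represented.

```c
#include <assert.h>

/* Rendering period in microseconds for a given frame rate. */
long frame_period_us(int fps) {
    assert(fps > 0);
    return 1000000L / fps;
}

/* Lowest acceptable delivered rate given a tolerated frame loss. */
int min_delivered_fps(int nominal_fps, int tolerated_loss_fps) {
    return nominal_fps - tolerated_loss_fps;
}

/* Returns 1 if a measured frame interval lies within +/- jitter_us
   of the nominal period, 0 otherwise. */
int within_jitter(long interval_us, int fps, long jitter_us) {
    long d = interval_us - frame_period_us(fps);
    if (d < 0) d = -d;
    return d <= jitter_us;
}
```

With these helpers, `frame_period_us(30)` gives the 33 ms period quoted above (33333 microseconds), and `min_delivered_fps(30, 6)` gives 24 fps as the floor implied by a tolerated loss of 6 frames/sec.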
As broadband communications such as DSL and cable modems enter new markets and deliver unprecedented data capacity to consumer devices for processing and consumption, more efficient data handling, routing, and processing techniques will be needed.
Figure 20 shows the data flow through the processing elements of a current heterogeneous system. Each data transfer is numbered to show the time order. Every data transfer must pass over the system bus under the control of a central control processor (CCP). The CCP initiates a data transfer by sending a message or trigger to each processing element through a control path in the system.
The processing elements in Figure 20 are shown as separate processors (for example, DSPs, ASICs, GPPs, and so on), each able to run a defined set of tasks. That is why each processor has its own memory. Processing elements may also be individual tasks running on the same processor.
In some cases the same data must cross the system bus several times (for example, transfers 1 and 2, 3 and 4, 5 and 6). In such a system the data must cross the system bus a total of 2 + (2 × n) times, or 6 times in this case. Each crossing of the system bus introduces additional data traffic and intervention by the CCP, which reduces the throughput of the overall system.
The additional data traffic reduces the amount of data that can pass through the system in a given time frame and therefore limits the amount of data the system can process. Such a system will likely perform less useful work than the capabilities of its individual components would suggest.
The invention provides a client-server system with one or more of the following features: two-phase scheduling of tasks on the server; an object request broker for a client-server system with task chaining on server DSPs; memory management for a multitasking processor in which internal memory is partitioned into processor overhead plus a task workspace, the workspace belonging in turn solely to each task as it executes; and a heterogeneous system comprising a central control processor plus bus-connected processing elements plus a shared memory for the processing elements, which avoids routing data flows in the heterogeneous system over the bus of the central control processor.
The drawings are heuristic for clarity.
Figure 1 shows a preferred embodiment DSPORB architecture.
Figure 2 illustrates the IDL compiler.
Figures 3-13 are QoS timing diagrams.
Figures 14-19 show preferred embodiment memory partitioning.
Figure 20 shows a known data flow in a heterogeneous system.
Figures 21-23 show preferred embodiment data flows.
Figures 24-27 illustrate CORBA.
Description of the embodiments
Preferred embodiment systems typically have a main processor running client applications plus one or more server processors running server algorithms, and include an object request broker for algorithm objects, quality-of-service control for the object request broker, memory paging for algorithm objects, and data flow for algorithm objects. One preferred embodiment, called iDSPOrb, applies to a system with a main processor and one or more DSP coprocessors.
iDSPOrb is a high-performance DSP object request broker (DSPORB) that supports the creation of, and access to, DSP objects from a general-purpose processor (GPP) or a DSP in a multiprocessor environment. iDSPOrb has a general architecture and operation similar to CORBA. iDSPOrb has the following DSPORB features:
(1) iDSPOrb supports object binding and invocation (DSP object calls) across processor boundaries.
(2) iDSPOrb provides, on the GPP side, a proxy interface consisting of compile-time headers and stubs for static invocation and a run-time dynamic invocation interface.
(3) iDSPOrb provides, for the DSP side, an algorithm interface (stubs and headers) for building an iDSP server.
(4) iDSPOrb provides synchronous and asynchronous invocation.
(5) iDSPOrb guarantees real-time QoS.
(6) iDSPOrb provides frame-based and stream-based processing.
(7) iDSPOrb provides chained data flow objects (intermediate results stay in DSP memory).
(8) iDSPOrb is implemented on a high-bandwidth, multichannel GPP/DSP I/O interface.
Figure 1 shows the iDSPOrb architecture for a dual-processor configuration in which the GPP acts as the "client" and the DSP as the "server".
The quality of service (QoS) manager in the iDSP system, referred to as iDSP-QoSM, is a mechanism (within a server) that provides negotiated service levels to client applications. It provides guaranteed quality of service with a predetermined degradation policy that is communicated to the client. iDSP-QoSM has the following characteristics: (1) It is defined within the limited scope of a node on the network (intra-node). It assumes that a suitable QoS manager controls inter-node (network) communication. (2) It is defined for a multiprocessor environment with load-sharing capability.
Functions performed by the preferred embodiment iDSP-QoSM include: (1) Monitoring the steady-state processing load on the servers in the system. (2) Distributing load from an overloaded server to its peer servers. (3) Negotiating service requirements with client applications in order to register additional load on a server. (4) Predicting the future load on a server based on the specific characteristics of each object being served. (5) Predicting algorithm run time in terms of processor cycles rather than processing time, so that the run-time prediction does not depend on the operating frequency of the processor.
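Function (5) can be illustrated with a small C sketch: an algorithm is characterized once by a cycle count, and the time prediction is derived for whatever clock rate the target DSP runs at. The function names, the use of MHz and microseconds as units, and the simple admission test are assumptions for illustration, not part of the described iDSP-QoSM.

```c
#include <assert.h>
#include <stdint.h>

/* Predicted run time in microseconds for an algorithm characterized by a
   cycle count. Cycles divided by MHz gives microseconds; the result is
   rounded up so the prediction stays conservative. */
uint64_t predicted_time_us(uint64_t cycles, uint32_t clock_mhz) {
    assert(clock_mhz > 0);
    return (cycles + clock_mhz - 1) / clock_mhz;
}

/* Admission test: does the predicted run time fit in the client deadline? */
int fits_deadline(uint64_t cycles, uint32_t clock_mhz, uint64_t deadline_us) {
    return predicted_time_us(cycles, clock_mhz) <= deadline_us;
}
```

The same 3,000,000-cycle characterization yields 15 ms on a 200 MHz part and 10 ms on a 300 MHz part, so the per-algorithm data need not be re-measured when the processor frequency changes.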
Texas Instruments TMS320C62XX DSPs have a limited amount of internal (on-chip) data memory. With the exception of the TMS320C6211 (and its variants), TMS320C62XX DSPs have no data cache for efficient access to external (off-chip) memory. Internal memory is at the highest level of the data memory hierarchy of a TMS320C62XX DSP. Therefore every algorithm running on a TMS320C62XX DSP wants to use internal memory for its data workspace, because that is the most efficient level at which to access data memory.
Typically, algorithms for DSPs are developed on the assumption that they own the entire DSP processor and hence all of the DSP internal memory. This makes integrating several algorithms, whether identical (homogeneous) or different (heterogeneous), very difficult. Algorithm developers need a set of rules that provides a common way to access and use system resources such as internal memory.
The preferred embodiments provide a method of improving processor utilization when running multiple algorithms on DSPs that lack a data cache, by applying a data memory paging architecture to the DSP internal memory. DSP algorithms developed or converted to the Texas Instruments XDAIS standard can use the data paging architecture. This standard requires the algorithm developer to define at least one memory region to support all of the memory used by the algorithm. The algorithm developer selects one or all of these user-defined regions to run in the internal memory of a TMS320C62X DSP. The system software partitions the internal memory of the DSP into a system support portion and a data workspace (page). All algorithms in the DSP application share the workspace, and each owns the entire workspace while it executes. On a context switch between two algorithms, the DSP system software handles the transfers between the workspace and the shadow memory of each algorithm. The preferred embodiments provide:
(1) Sharing of internal data memory between two or more DSP algorithms on a DSP that lacks a data cache, which improves processor utilization.
(2) Running multiple algorithms from the same shared memory, which allows each algorithm maximum efficiency in the TMS320C62X DSP environment when it accesses data memory for its stack and internal variables.
(3) An architecture that runs on any single processor that has internal memory and a DMA utility with access to the processor's internal memory.
(4) Context switching only on data input frame boundaries, which yields an efficient data paging architecture.
(5) Support for asynchronous page transfers of read-only algorithm data.
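The workspace and shadow-memory exchange on a context switch can be sketched in C as follows. This is a minimal model under stated assumptions: the names (`workspace`, `alg_context`, `switch_to`) are hypothetical, the page is only 64 bytes, and `memcpy` stands in for the DMA transfers that a real system would presumably use.

```c
#include <assert.h>
#include <string.h>

enum { WORKSPACE_BYTES = 64 };  /* tiny page, for illustration only */

/* Stand-in for the shared on-chip data workspace (the page). */
unsigned char workspace[WORKSPACE_BYTES];

/* Per-algorithm state: a shadow copy of its workspace, held off-chip. */
typedef struct {
    unsigned char shadow[WORKSPACE_BYTES];
} alg_context;

/* Algorithm that currently owns the workspace (none at start). */
alg_context *current_alg = 0;

/* Context switch at a frame boundary: page out the current owner,
   then page in the next algorithm. */
void switch_to(alg_context *next) {
    assert(next != 0);
    if (current_alg == next) return;  /* already owns the workspace */
    if (current_alg)
        memcpy(current_alg->shadow, workspace, WORKSPACE_BYTES);
    memcpy(workspace, next->shadow, WORKSPACE_BYTES);
    current_alg = next;
}
```

Each algorithm sees the full workspace while it runs, and its contents reappear intact the next time it is switched in, which is the property items (1) and (2) rely on.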
Data flow in an application may go from algorithm to algorithm, and for the execution of each algorithm the preferred embodiments keep the data in one or more DSPs rather than transferring it to a GPP.
1. DSP ORB in a dual-processor configuration
Figure 1 shows the preferred embodiment ORB architecture for a dual-processor configuration comprising a general-purpose processor (GPP) and a digital signal processor (DSP), in which the GPP acts as the "client" and the DSP as the "server". Note that iDSPOrb includes a quality of service (QoS) manager. Figure 1 shows a client application invoking two DSP algorithm objects, "A" and "B". iDSPOrb first provides object binding for the proxy objects "a" and "b" on the GPP. For example, "A" and "B" may be extensions of a DSPIDL interface for a decoder (DEC) as follows:
module DEC {
    interface IDecoder {
        ...
        int process([in] BUFFER input, [out] BUFFER output);
    }
    interface A : IDecoder {}
    interface B : IDecoder {}
}
An application on the DSP side (called the iDSP server) is built using the algorithm interface provided by the DSPIDL compiler:
DEC_A_Handle DEC_A_create(IALG_Params *p);
int DEC_A_decode(BUF_Handle in, BUF_Handle out);
An application on the GPP side is built using the proxy interface, which is also provided by the DSPIDL compiler:
DEC_A *DEC_A_create(DSPORB_Params *p);
int DEC_A_decode(DSPORB_Buffer *in, DSPORB_Buffer *out);
or by using the iDSPOrb dynamic invocation interface. At run time, the client application on the GPP side calls "a" to process a buffer. The data is passed to the actual object "A" on the DSP side. Using object-chained data flow, the output of "A" can be connected to the input of "B", so that the intermediate buffer data need not be transferred back to the GPP. A call to "b" invokes "B", which performs a further processing stage and returns the data to the GPP. The iDSPOrb dynamic invocation interface supports both synchronous and asynchronous calls.
iDSPOrb need not be partitioned between one GPP and a single DSP. It can also run in configurations with multiple DSPs. In that case the QoS manager (server side) performs load balancing of the DSP algorithms across the available DSPs. Other configurations may consist of an ASIC (acting as a fixed-function DSP), or of an ASIC plus a RISC processor, in which an algorithm interface is provided to the client application.
2a. DSPIDL compiler
iDSPOrb supports DSPIDL, an IDL (interface definition language) with the following keywords:
module: a collection of interface specifications. For example, an H263 module may contain decoder and encoder interfaces.
interface: an interface specification.
in: indicates an input argument.
out: indicates an output argument.
BUFFER: indicates a buffer type.
STREAM: indicates a stream type.
RESULT: indicates the return type of a function.
Other keywords cover memory usage and real-time operation.
The general form of a DSPIDL file is
module modulename {
    interface algorithm_1 [: alg1, alg2, …] {
        algorithm_1(PARAMS) // constructor
        method_1
        method_2
        method_3
        …
    }
    …
}
where a method is
RESULT function([direction] TYPE, …)
with direction being in, out, or [in, out], and TYPE being BUFFER or STREAM. For example, an H263 IDL can generate the algorithm and proxy interfaces shown in Figure 2.
2b. Frame and stream processing
Frame-based processing differs from stream-based processing as follows.
Keywords:
BUFFER: a function with buffer argument types processes data one frame at a time.
STREAM: a function with stream argument types processes a stream of frames, typically by spawning a task.
The function calls
DSPORB_Buffer_connect(DSPORB_Buffer *out, DSPORB_Buffer *in) and
DSPORB_Stream_connect(DSPORB_Stream *out, DSPORB_Stream *in)
connect the output of one object to the input of another (for frames and streams, respectively). For buffers, the connect operation causes DSPORB to create a buffer on the DSP in which the output of one method call is stored as the input of another method call (object chaining). For example:
DSPORB_Buffer_connect(yuvframe_out, yuvframe_in);
H263_TIDEC_decode(h263frame_in, yuvframe_out);
YUV_TI_toRGB(yuvframe_in, rgbframe_out);
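The effect of chaining on GPP/DSP traffic can be illustrated with a small mock in C. Nothing here is the iDSPOrb implementation: `mock_buffer`, `mock_connect`, `mock_invoke`, and the transfer counter are hypothetical stand-ins that only model which buffers would have to cross the processor boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock of a proxy buffer: either backed by GPP data or
   connected to a DSP-resident intermediate buffer. */
typedef struct mock_buffer {
    struct mock_buffer *peer;  /* non-NULL once connected */
    int on_dsp;                /* 1: data stays in DSP memory */
} mock_buffer;

/* Counts buffer copies across the GPP/DSP boundary. */
int gpp_transfers = 0;

/* Connect an output buffer to an input buffer: both now refer to
   storage on the DSP. */
void mock_connect(mock_buffer *out, mock_buffer *in) {
    assert(out != NULL && in != NULL);
    out->peer = in;
    in->peer = out;
    out->on_dsp = 1;
    in->on_dsp = 1;
}

/* Invoke a DSP method through its proxy: only unconnected buffers
   cross the processor boundary. */
void mock_invoke(mock_buffer *in, mock_buffer *out) {
    if (!in->on_dsp)  gpp_transfers++;  /* input sent GPP -> DSP  */
    if (!out->on_dsp) gpp_transfers++;  /* output sent DSP -> GPP */
}
```

Modeling the decode and color-conversion sequence above with connected YUV buffers costs two boundary crossings (the H.263 input and the RGB output) instead of four, because the intermediate YUV frame never leaves the DSP.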
For stream processing, a proxy call such as
H263_TIDEC_decodeStream(in_stream, out_stream);
will typically cause a task to be created on the DSP side that processes the two SIO streams (the implementation of H263_TIDEC_decodeStream will spawn a task). Unconnected streams provide I/O between the client proxy and the server.
2c. Real-time QoS manager
Through the DSPORB_System_setTimeConstraint() and DSPORB_System_setPriority() interfaces, iDSPOrb can provide hard real-time QoS by allocating the resources required to execute a given operation within a set time constraint. The GPP/DSP channel I/O driver allows multithreaded parallel operation.
The QoS manager is the part of the DSP-side iDSPOrb that (1) instantiates the algorithms that clients need, (2) updates the constraints of client applications, and (3) manages resources to satisfy those constraints (or reports back that a constraint cannot be satisfied).
2d. iDSPOrb registry service
iDSPOrb provides a registry service so that server objects can register their services. For example, a server may register with iDSPOrb as decoding MP3 audio. A client object instantiates a server object by supplying the name of the desired service. The iDSPOrb registry service can be used for any kind of DSP object service, but it is standardized on the following set of well-known media services for audio and video:
Audio services            Video services
MP3 Audio Decode          MPEG 1 Video Decode
MP3 Audio Encode          MPEG 1 Video Encode
MPEG1 L2 Audio Decode     MPEG2 Video Decode
MPEG1 L2 Audio Encode     MPEG2 Video Encode
G.723 Decode              MPEG4 Video Decode
G.723 Encode              MPEG4 Video Encode
G.729 Decode              H.263 Decode
G.729 Encode              H.263 Encode
At run time the iDSPOrb registry service allows iDSPOrb to instantiate service objects dynamically. When a service object is instantiated, iDSPOrb dynamically allocates low-level I/O channels between the microprocessor and the DSP. A client object can access these low-level channels directly through the iDSPOrb stream interface (see the DSPORB_Stream interface). The iDSPOrb registry service also provides information that lets iDSPOrb designate a DSP to provide a particular service and lets the QoS manager make load-balancing and scheduling decisions (see the real-time QoS manager). For example, using the dynamic invocation module, DSPORB_Alg_create("MP3 Audio Decode", NULL) is called to instantiate an MP3 audio decoder. iDSPOrb balances the load on the system and hides from the client the details of which DSP actually performs the decoding and of the low-level streams allocated to pass the data. A client can also enumerate the list of currently registered service types by querying iDSPOrb. The function DSPORB_Alg *DSPORB_System_getServices() can be used to obtain an enumerator of the currently registered services. Then char *DSPORB_System_next(DSPORB_Alg *enum) can be called to obtain the name of each registered service. The enumeration can be reset to the beginning by calling DSPORB_System_reset(DSPORB_Handle *enum).
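The register, enumerate, and reset pattern described above can be sketched with a small stand-alone mock in C. The types and functions here (`registry`, `service_enum`, `registry_add`, `registry_services`, `enum_next`, `enum_reset`) are hypothetical and are not the DSPORB_System functions; they only show the shape of the interaction.

```c
#include <assert.h>
#include <string.h>

enum { MAX_SERVICES = 16 };

/* Hypothetical registry: holds service names only. */
typedef struct {
    const char *names[MAX_SERVICES];
    int count;
} registry;

/* Hypothetical enumerator over a registry. */
typedef struct {
    const registry *reg;
    int pos;
} service_enum;

/* Register a service by name; returns 0 on success,
   -1 if the name is already registered or the registry is full. */
int registry_add(registry *r, const char *name) {
    assert(r != NULL && name != NULL);
    for (int i = 0; i < r->count; i++)
        if (strcmp(r->names[i], name) == 0) return -1;
    if (r->count == MAX_SERVICES) return -1;
    r->names[r->count++] = name;
    return 0;
}

/* Obtain an enumerator positioned at the first registered service. */
service_enum registry_services(const registry *r) {
    service_enum e = { r, 0 };
    return e;
}

/* Next registered name, or NULL when the enumeration is exhausted. */
const char *enum_next(service_enum *e) {
    return (e->pos < e->reg->count) ? e->reg->names[e->pos++] : NULL;
}

/* Reset the enumeration to the beginning. */
void enum_reset(service_enum *e) { e->pos = 0; }
```

A client would loop on `enum_next` until it returns NULL to list the services, then pass the chosen name to whatever call instantiates the service.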
2e. Media framework support
iDSPOrb can be used to support accelerated media processing by providing elements for specific media frameworks, such as
DirectShow (Windows Media): a filter object can be implemented that wraps an iDSPOrb client object for a codec and plugs into the DirectShow framework.
RealMedia Architecture (RealSystem G2): a plug-in can be implemented that wraps an iDSPOrb client object for a codec and plugs into the RealSystem G2 framework.
iDSPOrb can be plugged into JMF and QuickTime in the same way.
The iDSPOrb API is packaged in the DSPORB module. The data types and functions of the client (GPP) side DSPORB are as follows.
2f. Data types
DSPORB_Alg: a client proxy for a DSP algorithm object.
DSPORB_Fxn: a function object, used for dynamic invocation.
DSPORB_Arg: a function argument, used for dynamic invocation. DSPORB_Buffer and DSPORB_Stream are 'subclasses' of DSPORB_Arg.
DSPORB_Params: an algorithm parameter structure that supplies an algorithm with parameters matching IALG_Params on the DSP side.
DSPORB_Buffer: a buffer object.
DSPORB_Stream: a stream object.
2g. DSPORB_Buffer interface
-DSPORB_Buffer *DSPORB_Buffer_create(int size, int direction);
Creates a buffer object that can reference data of length size; direction is one of DSPBUFFER_INPUT or DSPBUFFER_OUTPUT. The buffer direction must match the function's invocation signature, or an iDSPOrb run-time error will occur. Alternatively, DSPORB_Buffer *DSPORB_Buffer_create(DSPORB_Alg *, int, int); creates a buffer for use by a given object.
-unsigned char *DSPORB_Buffer_getData();
Gets the data referenced by the buffer object. Returns NULL if this buffer is connected to another buffer.
-void DSPORB_Buffer_setData(unsigned char *data)
Sets the buffer data pointer. If this buffer is connected to another buffer, the operation fails, because the storage for the buffer data is in DSP memory space.
-void DSPORB_Buffer_setSize(int)
Sets the size of the actual data.
-int DSPORB_Buffer_getSize()
Gets the size of the actual data.
-void DSPORB_Buffer_delete(DSPORB_Buffer *buffer)
-int DSPORB_Buffer_connect(DSPORB_Buffer *output, DSPORB_Buffer *input)
Connects an output buffer to an input buffer on the DSP. When the buffer objects are connected, the data stays on the DSP and is not transferred back to the GPP (a buffer is created by iDSPOrb on the DSP to hold the intermediate result).
2h. DSPORB_Stream interface
The stream interface has the following methods.
- DSPORB_Stream *DSPORB_Stream_create(int n, int direction); Creates a stream that can hold n buffers. direction is one of DSPSTREAM_INPUT or DSPSTREAM_OUTPUT.
- int DSPORB_Stream_Issue(DSPORB_Buffer *buf); Sends the input buffer buf on an input stream, or queues an empty buffer to be filled on an output stream. For connected streams this operation has no effect, because the streams are connected directly between the algorithms.
- DSPORB_Buffer *DSPORB_Stream_reclaim(); Gets an output buffer from an output stream, or an input buffer that can be issued again on an input stream. For connected streams this operation has no effect.
- DSPORB_Stream_select(DSPORB_Stream array[], int n_stream, int *mask, long millis); Blocks until a stream is ready for I/O.
- DSPORB_Stream_idle(DSPORB_Stream *str); Idles a stream.
- DSPORB_Stream_close(DSPORB_Stream *str); Closes a stream.
- DSPORB_Stream_connect(DSPORB_Stream *out, DSPORB_Stream *in); Connects an output stream to an input stream. The two streams then operate entirely in DSP processor space without accessing the GPP.
2i. DSPORB dynamic invocation interface
The dynamic invocation interface has the following methods.
- int DSPORB_System_init(); Must be called first to initialize DSPOrb.
- DSPORB_Alg *DSPORB_Alg_create(const char *name, DSPORB_Params *params); Creates an instance of the algorithm referenced by the symbol 'name'.
- void DSPORB_Alg_delete(DSPORB_Handle alg); Deletes the algorithm instance.
- DSPORB_Fxn *DSPORB_Alg_getFxn(DSPORB_Alg *alg, const char *fxn_name); Returns the function object associated with the symbol 'fxn_name'.
- DSPORB_Fxn_setTimeConstraint(DSPORB_Fxn *fxn); Sets a time constraint for executing fxn. DSPOrb allocates enough resources to satisfy the constraint, otherwise it returns 0.
- int DSPORB_Fxn_setPriority(DSPORB_Fxn *fxn); Sets the priority, from 1 to 15.
- int DSPORB_Fxn_invoke(DSPORB_Fxn *fxn, DSPORB_Arg *args); Invokes a function with its inputs and outputs. The call blocks until data is available on all unconnected outputs. For inputs and outputs that are connected with 'DSPORB_Buffer_connect', 'NULL' can be passed.
- int DSPORB_Fxn_invokeAsync(DSPORB_Fxn *fxn, DSPORB_Arg *args); Invokes a function with its inputs and outputs. The call returns immediately; the application uses 'DSPORB_getData' to retrieve data from the output argument objects.
- unsigned char *DSPORB_Arg_getData(DSPORB_Arg *output, long timeout); Gets data from an output argument object. Blocks until timeout nanoseconds have elapsed; if 'timeout = -1', it blocks indefinitely.
- void DSPORB_Arg_setCallback(DSPORB_Arg *output, unsigned char *(*getData)(DSPORB_Arg *)); Sets a callback function on an output argument; getData is called when the data becomes available.
- void DSPORB_System_close(); Closes DSPOrb.
2j. iDSPOrb examples
The first example shows how to use iDSPOrb, through the dynamic invocation interface, to connect to the TI H.263 decoder on a C6xxx. The second example shows the same program written with proxy stubs.
/*
 * testH263-dii.cpp  Program to test DSPOrb
 *
 * Read a raw H.263 file, parse, decode frames using DSPOrb, and
 * write out YUV file.
 *
 * Usage: testH263 in_file out_file
 */
#include <stdio.h>
#include <stdlib.h>
#include "dsporb.h"
#include "h263.h"

const int MEMSIZE = 4 * 176 * 144 * 3;  /* enough for CIF */

static DSPORB_Alg *h263decoder;
static DSPORB_Fxn *h263decoderFxn;
static DSPORB_Buffer *h263inputArg;
static DSPORB_Buffer *h263outputArg;
static DSPORB_Arg *h263decoderFxnArgs[2];

int main(int argc, char **argv)
{
    /* frame is encoded H.263; buffer is YUV data */
    unsigned char *frame = (unsigned char *)malloc(MEMSIZE);
    unsigned char *buffer = (unsigned char *)malloc(MEMSIZE);

    DSPORB_System_init();
    h263decoder = DSPORB_Alg_create("H263D_TIDEC", NULL);
    h263decoderFxn = DSPORB_Alg_getFxn(h263decoder, "decode");
    h263inputArg = DSPORB_Buffer_create(MEMSIZE, DSPBUFFER_INPUT);
    h263outputArg = DSPORB_Buffer_create(MEMSIZE, DSPBUFFER_OUTPUT);
    h263decoderFxnArgs[0] = (DSPORB_Arg *)h263inputArg;
    h263decoderFxnArgs[1] = (DSPORB_Arg *)h263outputArg;

    /* in is H.263 file; out is YUV file */
    FILE *in = fopen(argv[1], "rb");
    FILE *out = fopen(argv[2], "wb");
    int n_bytes_in_frame;

    H263_initReader(in);
    while ((n_bytes_in_frame = H263_readFrame(frame, MEMSIZE)) > 0) {
        DSPORB_Buffer_setSize(h263inputArg, n_bytes_in_frame);
        DSPORB_Buffer_setData(h263inputArg, frame);
        DSPORB_Buffer_setSize(h263outputArg, MEMSIZE);
        DSPORB_Buffer_setData(h263outputArg, buffer);
        DSPORB_Fxn_invoke(h263decoderFxn, h263decoderFxnArgs);
        int s = DSPORB_Buffer_getSize(h263outputArg);
        printf("%d->%d\n", n_bytes_in_frame, s);
        if (s > 0)
            fwrite((const void *)buffer, 1, s, out);
    }
    fclose(in);
    fclose(out);
    DSPORB_System_close();
    return 0;
}

Now the stubs version:

/*
 * testH263-stubs.cpp  Program to test DSPOrb
 *
 * Read a raw H.263 file, parse, decode frames using DSPOrb, and
 * write out YUV file.
 *
 * Usage: testH263 in_file out_file
 */
#include <stdio.h>
#include <stdlib.h>
#include "dsporb.h"
#include "h263.h"
#include "H263-TIDEC.h"

const int MEMSIZE = 4 * 176 * 144 * 3;  /* enough for CIF */

static H263_TIDEC *h263decoder;
static DSPORB_Buffer *h263inputArg;
static DSPORB_Buffer *h263outputArg;

int main(int argc, char **argv)
{
    /* frame is encoded H.263; buffer is YUV data */
    unsigned char *frame = (unsigned char *)malloc(MEMSIZE);
    unsigned char *buffer = (unsigned char *)malloc(MEMSIZE);

    DSPORB_System_init();
    h263decoder = H263_TIDEC_create(NULL);
    /* buffers must exist before setSize/setData are called on them */
    h263inputArg = DSPORB_Buffer_create(MEMSIZE, DSPBUFFER_INPUT);
    h263outputArg = DSPORB_Buffer_create(MEMSIZE, DSPBUFFER_OUTPUT);

    /* in is H.263 file; out is YUV file */
    FILE *in = fopen(argv[1], "rb");
    FILE *out = fopen(argv[2], "wb");
    int n_bytes_in_frame;

    H263_initReader(in);
    while ((n_bytes_in_frame = H263_readFrame(frame, MEMSIZE)) > 0) {
        DSPORB_Buffer_setSize(h263inputArg, n_bytes_in_frame);
        DSPORB_Buffer_setData(h263inputArg, frame);
        DSPORB_Buffer_setSize(h263outputArg, MEMSIZE);
        DSPORB_Buffer_setData(h263outputArg, buffer);
        H263_TIDEC_decode(h263inputArg, h263outputArg);
        int s = DSPORB_Buffer_getSize(h263outputArg);
        printf("%d->%d\n", n_bytes_in_frame, s);
        if (s > 0)
            fwrite((const void *)buffer, 1, s, out);
    }
    fclose(in);
    fclose(out);
    DSPORB_System_close();
    return 0;
}
2. Quality of service (QoS)
In the preferred embodiment configuration, the quality-of-service manager of iDSPOrb (iDSP-QoS) is defined for a combination of a host processor and one or more digital signal processors (DSPs) acting as peer servers. An umbrella QoS manager performs all the functions needed to maintain a specified quality of service and manages the combination of DSP servers. The host processor is normally a general-purpose processor (GPP), connected to the DSPs through a hardware interface such as shared memory or a bus-type interface. The QoS manager can be part of iDSPOrb or, more generally, a separate manager on the DSPs. The system is driven by hardware and software interrupts. The preferred implementation is based on load sharing: the main user (client) application runs on the GPP and specialized services run on the DSPs. The QoS manager runs concurrently, and all the processors can sit under one framework, such as the iDSP Media Framework. The iDSP-QoS manager performs three main functions: (1) classification of objects, (2) scheduling of objects, and (3) prediction of object execution times.
These functions are described below, using concrete media examples in a GPP/multi-DSP environment.
3a. Classification of objects
In a media-specific environment, the objects translate to media codecs/filters (algorithms). Media objects can be classified by stream type, application type, or algorithm type. Depending on the algorithm type, the QoS manager defines metrics called the codec-cycle, the filter-cycle, and so on.
3b. Scheduling of objects (hard deadlines)
iDSP-QoSM schedules algorithm objects with a two-phase scheduler. The first phase is a high-level scheduler that determines whether a new media stream is schedulable on the DSP and sets the hard real-time deadlines for the codec-cycle. The second phase schedules individual media frames and makes use of the hard real-time deadlines from the first phase. The first phase runs at object negotiation time and typically runs on the host (GPP). The second phase runs on the DSPs (servers) on a per-frame basis.
The first scheduling phase is when the QoS manager determines whether, on average, the object can be supported alongside the objects that are already running. Also required as part of the first phase is a check that the object has sufficient support in terms of memory. The object's memory buffers for internal use and for input and output must be fixed and static at instantiation, which removes the uncertainty of dynamic memory allocation. The iDSP platform runs only XDAIS-compliant algorithms. Developers need to define the processing time of an algorithm under different conditions. The approximate time needed for initialization and for transferring data to the server is also determined; when the QoS manager sets the deadline for each object, it accounts for initialization.
Each DSP object must supply the following information to the QoS manager:
n: the number of frames in a codec-cycle (default: frames/second).
T_acc: the average computation time for one codec-cycle, in target server (DSP) cycles.
T_ccd: the codec-cycle duration, in target server (DSP) cycles.
For video codecs, n is normally the number of frames between consecutive I-frames (for example, 15 frames). T_acc is normally the maximum time required for the I-frame plus the sum of the average times required for the P and B frames. The QoS manager keeps track of T_ccd for every media object. This time (in DSP cycles) is based on the current frame rate. For example, for a 30 fps video stream with n = 15, let T_ccd = 125 Mcycles.
The QoS manager can now determine whether a new stream can be scheduled, as follows. Let S be the sum of the codec-cycle computation times (T_acc) of all currently scheduled streams. If (S + T_acc of the new stream) is less than the T_ccd of the new stream, the stream is schedulable; otherwise it is not. For example, suppose there is an object A with n = 15, T_acc = 39.5 Mcycles (158 ms), and T_ccd = 125 Mcycles (500 ms), and no tasks are scheduled on the DSP (so S = 0). A request arrives for the resources needed to schedule a new stream of object A. Because S + 39.5 = 39.5 Mcycles < 125 Mcycles (500 ms), we can schedule this stream. When a second stream of object A is requested, it is also scheduled, because S + 39.5 = 79 Mcycles (316 ms) < 125 Mcycles (500 ms). A third stream can also be scheduled (118.5 Mcycles, 474 ms). A fourth stream, however, cannot be scheduled, because it would need 158 Mcycles (632 ms), so the hard deadline of 500 ms could not be met. At this point the QoS manager negotiates to reduce the frame rate of the stream; if that fails, it rejects the stream entirely.
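The admission test described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function names are hypothetical and not part of the iDSPOrb API, and times are expressed in kilocycles so that the 39.5 Mcycle figure stays an integer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not an iDSPOrb call): first-phase admission test
 * for one new stream. s_scheduled is the sum of T_acc over all streams
 * already admitted; all values use the same unit (here kilocycles). */
bool stream_is_schedulable(long s_scheduled, long t_acc_new, long t_ccd_new)
{
    assert(s_scheduled >= 0 && t_acc_new > 0 && t_ccd_new > 0);
    return s_scheduled + t_acc_new < t_ccd_new;
}

/* Admits identical streams one at a time and returns how many fit
 * before the admission test first fails. */
int count_admitted(long t_acc, long t_ccd, int requested)
{
    long s = 0;
    int admitted = 0;
    for (int i = 0; i < requested; i++) {
        if (!stream_is_schedulable(s, t_acc, t_ccd))
            break;
        s += t_acc;      /* the admitted stream now contributes to S */
        admitted++;
    }
    return admitted;
}
```

With the object A figures from the text (T_acc = 39500, T_ccd = 125000 kilocycles), count_admitted(39500, 125000, 4) admits three streams and stops at the fourth.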
A modified approach lets the scheduler handle heterogeneous media objects that have different codec-cycle lengths. Objects with a longer T_ccd are prorated to the smallest T_ccd. For example, suppose there is an object B with n = 30, T_acc = 40 Mcycles (160 ms), and T_ccd = 169 Mcycles (675 ms), and two streams of object A (as defined above) are already scheduled on the DSP (so S = 79 Mcycles / 316 ms). We can schedule a new stream of object B because S + 40*(125/158) = 110.45 Mcycles (S + 160*500/675 = 435 ms). This can be shown to be correct because (79 + 40 < 125) Mcycles / (316 + 160 < 500) ms, so we can actually guarantee all streams within the shortest codec-cycle of 500 ms. What happens when a second stream of object B needs to be scheduled? 110.45 + 40*125/158 = 139 > 125 Mcycles / 435 + 160*(500/675) = 554 ms > 500 ms. The scheduler therefore rejects this stream and starts the negotiation described above.
iDSP-QoS negotiates with the application or its proxy to reserve enough processing bandwidth for a media object, based on its codec-cycle. The negotiation takes into account the memory the object requires, the requested QoS level, and the DSP MIPS available given the other DSP applications running concurrently. As the mix of objects changes, the QoS manager renegotiates DSP processor bandwidth. As input parameters to the QoS manager's negotiation process, the application must define the following for an object:
(1) The memory required on the DSP (number and size of input/output buffers).
(2) The desired QoS level (typically expressed as a frame rate).
(3) The worst-case run time to start the object.
(4) The hard real-time deadline for a sequence of media frames, called the codec-cycle (number of frames and average execution time).
The second scheduling phase in the iDSP-QoS manager is based on two aspects: whose deadline arrives first, and who has the higher priority. Consider the following example: if object A has a deadline in 10 ms and object D has a deadline in 3 ms, the iDSP-QoS manager schedules object D to run first even though object A has the higher priority. Because the approximate run time of each object is known, we can determine the "no-later-than" time at which an object must be started so that it always meets its deadline. In Fig. 3, object D is expected to finish before the "no-later-than" start point of object A. In this scenario there is no deadline conflict between the higher-priority object A and object D. Object A therefore runs after the lower-priority object D.
In another scheduling example, which weighs priority against earliest deadline, the "no-later-than" time of the higher-priority object A falls before the estimated completion time of object D. In this case object A runs first because it has the higher priority, and object D runs afterwards, but only if that instance of object D remains within its specified frame-drop parameters; see Fig. 4.
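The two cases above (Fig. 3 and Fig. 4) can be sketched as a single decision function. This is a minimal illustration, not the manager's actual code: job_t and pick_first are hypothetical names, a larger priority value is assumed to mean higher priority, and the run times used in the example are invented because the text gives only the deadlines.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int  priority;     /* assumption: larger value = higher priority */
    long deadline_us;  /* deadline, relative to the same clock as now_us */
    long est_run_us;   /* predicted run time */
} job_t;

/* "No-later-than" start time: starting later would miss the deadline. */
long latest_start_us(const job_t *j)
{
    return j->deadline_us - j->est_run_us;
}

/* Returns 0 if job a should run first, 1 if job b should run first. */
int pick_first(long now_us, const job_t *a, const job_t *b)
{
    assert(a != NULL && b != NULL);
    const job_t *early = (a->deadline_us <= b->deadline_us) ? a : b;
    const job_t *late  = (early == a) ? b : a;

    /* Earliest deadline also has priority: nothing to arbitrate. */
    if (early->priority >= late->priority)
        return early == a ? 0 : 1;

    /* Earliest deadline goes first if it finishes before the
     * higher-priority job's no-later-than start time (Fig. 3 case). */
    if (now_us + early->est_run_us <= latest_start_us(late))
        return early == a ? 0 : 1;

    /* Otherwise the higher-priority job wins (Fig. 4 case). */
    return late == a ? 0 : 1;
}
```

With A = {priority 2, deadline 10 ms, run 4 ms} and D = {priority 1, deadline 3 ms, run 2 ms}, D finishes at 2 ms, before A's latest start at 6 ms, so D runs first. If A instead needs 9 ms, its latest start is 1 ms and A runs first.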
For iDSP-QoS to manage deadlines as effectively as possible, the GPP must deliver input data frames to the DSP subsystem as early as possible, so that the time between an object's arrival and its deadline is as long as possible. The more time a data frame has between arrival and deadline, the more flexibly iDSP-QoS can schedule each object alongside the other concurrently running objects.
3c. Predicting object run time (soft deadlines)
A central function of iDSP-QoS is to predict the processing time required for the next input frame of every scheduled object. The prediction is non-trivial and specific to each object. The QoS manager computes the expected run time of the next input frame from statistics on previous run times. The expected run time of an object is a function of its earlier run times (unique to each object) and of how much it is likely to vary (also determined uniquely for each object). For example, in the case of a video object, the periodicity of I, P, and B frames is deterministic. Future processing times can therefore be predicted from the type of the current frame and its position in the video frame period. Carried out on all co-existing algorithms, this prediction directly helps dynamic priority reassignment based on the predicted processing times and the approaching hard deadlines.
These predictions are the key to managing soft deadlines and processing-time jitter. iDSP-QoS reschedules objects on the fly based on the predictions. This rescheduling takes place within the codec-cycle deadline of the individual objects (which defines the average hard deadline). In the example above, when we prorated the load, we assumed that all frames of object B take the same time, in line with the 500 ms of object A. This cannot hold in practice, because a frame of object B may require more time, or object B may not receive its average share of time over the whole of the actual overlap. The frame closest to its codec-cycle deadline therefore gets the higher priority.
If the predicted run time exceeds the user-defined required time, the QoS manager takes one of several possible actions. In a single-DSP configuration:
(Level 1) Simple binary termination: this causes an automatic frame drop. The object should indicate whether a frame drop would have severe consequences.
(Level 2) A general reduction of the run time allotted to lower-priority objects, pre-empting an object at the end of its allotted time. This may or may not cause frame drops.
(Level 3) The object needs the ability to accept QoS commands, such as scaling down the quality of its output data.
In a multi-DSP configuration:
(1) At the end of each QoS time slice, load information is sent from each DSP to the GPP.
(2) The GPP resorts to reallocating objects only when the estimated deadlines are being missed. Task reallocation is performed by the GPP after it receives the "load data" from the DSPs in service. However, to reduce task migration time, it is well worth having all DSPs work in a common cluster with one external memory space.
All objects executing in iDSP must have deterministic execution times. DSP objects fall into three classes: data compression (encoding), data decompression (decoding), and data conversion (pre- or post-processing of object data). Objects process data in blocks; these blocks are called input data frames. An object processes an input data frame and produces an output data frame. As with any computed data, the input and output data frames are bounded in size and in the amount of processing they require. Given the size of any input frame, the maximum amount of processing that the DSP, or any other computer for that matter, must perform on that input frame can be determined exactly.
Before it is integrated into the iDSP system, each object must state its worst-case run time for one frame. This worst-case run time is used as the run time of the first input frame so that the object can be started. Because encoder and decoder objects rarely run at worst case, the first input frame is costly (it must be budgeted at worst case). This worst-case budget may make the scheduled run time of the first frame longer than its actual run time.
As described earlier, the processing time of an algorithm object varies from one input frame to the next. Initially, iDSP-QoS starts the first input data frame with the worst-case value. After the first frame, the QoS manager predicts the processing time of the next input frame from the characteristics of the algorithm and the measured processing time of the first frame. For each subsequent frame, it predicts the approximate processing time from the semantics and history of the algorithm object. For example, an encoder object uses the object semantics (for example, the I, P, and B frame types) together with the average encoding time of earlier input frames of the same type, and this can be used to predict the time needed for future encodes. Encoder objects are given input frames of the same size each time they are scheduled to execute. The variation in processing time comes from many factors: the activity level of the image frame, the degree of motion between frames, and so on. These variations are bounded, however. The maximum difference in processing time between two frames is therefore limited, and this maximum difference can be added to the predicted processing time to determine the worst-case processing time for the next frame. See Figs. 5-6.
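The prediction scheme just described can be sketched as a small per-object predictor. This is an illustrative sketch, assuming a plain arithmetic mean per frame type; the type and function names are hypothetical, and the text does not say which statistic the QoS manager actually uses.

```c
#include <assert.h>
#include <string.h>

enum frame_type { FRAME_I, FRAME_P, FRAME_B, FRAME_TYPE_COUNT };

/* Per-object run-time history, kept separately for each frame type. */
typedef struct {
    long worst_case_us;  /* declared worst case, used before any history */
    long max_delta_us;   /* bound on frame-to-frame variation */
    long sum_us[FRAME_TYPE_COUNT];
    long count[FRAME_TYPE_COUNT];
} predictor_t;

void predictor_init(predictor_t *p, long worst_case_us, long max_delta_us)
{
    assert(p != NULL && worst_case_us > 0 && max_delta_us >= 0);
    memset(p, 0, sizeof *p);
    p->worst_case_us = worst_case_us;
    p->max_delta_us = max_delta_us;
}

/* Record the measured processing time of a completed frame. */
void predictor_record(predictor_t *p, enum frame_type t, long measured_us)
{
    p->sum_us[t] += measured_us;
    p->count[t]++;
}

/* Expected time: mean of earlier frames of the same type, or the
 * declared worst case when no frame of that type has been seen yet. */
long predictor_expected_us(const predictor_t *p, enum frame_type t)
{
    if (p->count[t] == 0)
        return p->worst_case_us;
    return p->sum_us[t] / p->count[t];
}

/* Worst-case estimate for the next frame: expected time plus the
 * bounded frame-to-frame difference. */
long predictor_bound_us(const predictor_t *p, enum frame_type t)
{
    if (p->count[t] == 0)
        return p->worst_case_us;
    return predictor_expected_us(p, t) + p->max_delta_us;
}
```

A scheduler would call predictor_record after each frame completes and predictor_bound_us before scheduling the next frame of that type.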
Decoder objects are generally given input frames of variable size. The processing time of an input data frame is proportional to its size. To determine whether the processing time of the next frame will increase, the QoS manager checks the difference in size between the current and the next input data frame. As discussed for encoders, it also holds for decoders that the difference in processing between two semantically identical frames is bounded. The maximum, or worst-case, processing time of the object corresponds to the largest possible buffer defined for the object. See Fig. 7.
Conversion objects operate like encoder objects: they always receive input frames of the same size. Every frame always consumes the same amount of processing time and is a single pass over the input frame. The processing time per input frame therefore always stays constant.
Each object receives from the user application a relative time within which the frame being passed must be completed by the object. For example, the application specifies that this frame must be processed within the next 7 ms. Because there is no common software clock between the host GPP and the DSP, deadlines can only be specified as relative times. We assume that the transfer time of a data frame between the host and the DSP is deterministic. The iDSP system maintains an internal clock against which each data frame is time-stamped on arrival, so that the expected processing time can be calculated. After the expected processing time has been calculated, the QoS manager schedules the data frame for execution.
Before an object can be scheduled, the QoS manager compares it with the other co-existing objects to determine the proper order in which the object executes. If no other object is processing an input frame, the object's frame is scheduled to execute immediately. If other objects are running, the QoS manager determines the order of execution by considering the priority of each requesting object, its expected deadline, and whether it has hard or soft real-time requirements. See Fig. 8.
When multiple objects with different run-time priorities are attached to the same DSP, the QoS manager computes the estimated run time of each object from that object's specific run times. It then orders the different tasks based on the objects being scheduled (TBD). There are three possible scheduling scenarios:
(1) All objects run to completion on the input data frames supplied and finish within the deadlines specified by the application. This scenario is shown in Fig. 9; note that in the figure every object completes before its own deadline. If all objects complete before their respective deadlines, the workload of the QoS manager is minimal.
(2) The processing load of one or more objects (for example, object B) increases but does not cause later objects to miss their projected deadlines. Depending on the object, missing one deadline is acceptable if the subsequent data frames of the same object are processed within their bounded deadlines. For example, in an H.263 encoder an "I" frame takes a long time to compute. The "P" frames that follow the "I" frame typically have smaller processing requirements. This allows the "I" frame to steal processing cycles from the following "P" frames. Missing the deadline of one frame therefore does not cause much trouble if there is enough processing headroom for the next frames.
Because the deadline of object B has been exceeded, the effect on the overall system must be determined. If the deadline miss caused by object B does not cause later objects to miss their projected deadlines, the risk to the overall system is minimal. See Figs. 10-11.
(3) The processing load of one or more objects (for example, object B) increases and causes later objects to miss their projected deadlines. See Fig. 12. In this case the missed deadline of object B causes later objects to miss their projected deadlines. Even when this happens, the risk to the overall system may or may not be minimal. Each co-running object can steal cycles from its subsequent frames and thereby avoid a domino effect of missed deadlines.
iDSP-QoS proposes a set of rules for managing soft deadlines. The rules are designed to limit the snowball effect of missed deadlines caused by a single critical missed deadline. (1) Each algorithm object supplies the QoS manager with the maximum number of frame drops per second that it allows. (2) After each processing cycle, each object updates its running count of "missed deadlines" as a moving average. (3) When an object exceeds its missed-deadline limit, its priority is changed to the highest value. Once the count drops below the limit, the original priority is restored. (4) After the limit is reached, all subsequent frames that fail to meet their deadlines are dropped. This causes the QoS to degrade temporarily to the next lower level. This (rare) transient QoS degradation is subsequently reported to the client. (5) In general, a frame is dropped only if the DSP has not started the object by the time its deadline has passed.
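Rules (2) and (3) can be sketched as a small state update that runs after each processing cycle. Several details here are assumptions made only for the illustration: the moving average is an exponentially weighted one with a smoothing factor of 0.25, the limit is expressed as a miss ratio rather than drops per second, and priority 15 (the top of the 1 to 15 range accepted by DSPORB_Fxn_setPriority) is taken to be the highest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SOFT_MAX_PRIORITY 15   /* assumption: 15 is the highest priority */
#define SOFT_ALPHA        0.25 /* assumption: EWMA smoothing factor */

typedef struct {
    int    base_priority;  /* priority negotiated for the object */
    int    priority;       /* priority currently in effect */
    double miss_avg;       /* moving average of missed deadlines (0..1) */
    double miss_limit;     /* limit supplied by the algorithm object */
} soft_state_t;

/* Called once per processing cycle with the outcome of that cycle. */
void soft_update(soft_state_t *s, bool missed_deadline)
{
    assert(s != NULL);
    /* Rule (2): update the running miss count as a moving average. */
    s->miss_avg = (1.0 - SOFT_ALPHA) * s->miss_avg
                + SOFT_ALPHA * (missed_deadline ? 1.0 : 0.0);
    /* Rule (3): boost to the highest priority while over the limit,
     * restore the original priority once back under it. */
    s->priority = (s->miss_avg > s->miss_limit) ? SOFT_MAX_PRIORITY
                                                : s->base_priority;
}
```

With a limit of 0.3, one missed deadline leaves the priority unchanged, a second consecutive miss boosts the object to priority 15, and two on-time cycles afterwards bring it back to its base priority.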
3d. Throttle control for periodic media rendering
For a given algorithm object, iDSP-QoS assumes that at most one request is queued at any instant; typically, periodic deadlines specify the quality-of-service constraint to the QoS manager. Audio and video rendering components in the media system can buffer frames, which allows frames to arrive slightly ahead of schedule and absorbs variations in arrival time. These buffers are finite, however, so upstream components of the media system must carefully throttle the relative rate at which frames are processed.
iDSP-QoSM provides two mechanisms for throttling the processing rate of an algorithm object.
(1) The client of the DSP algorithm object controls the rate at which it calls the algorithm object's processing function (on the server). This can lead to sub-optimal performance of the QoS manager's scheduling algorithm if requests are issued within the same time period in which they must be completed. For example, consider algorithm object A above, where buffer A1 must be processed within time period T1 and buffer A2 within time period T2. Here T1 and T2 are two consecutive periods, [x] denotes the arrival of buffer x, and {x} denotes completion of the processing of buffer x. See Fig. 13a.
(2) The QoS manager controls the throttling of the media stream. This mechanism lets the client call the processing function of an algorithm object with an input buffer as soon as one is available. The QoS manager then attaches a "start-deadline" to the input buffer. The client blocks until processing of its current buffer is complete. See Fig. 13b. Thus, in both cases, at any instant each algorithm object has at most one request queued in the QoS manager.
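The invariant stated above (at most one queued request per algorithm object) can be sketched as a one-slot queue. This is a simplified, hypothetical model: the names are invented for the illustration, and where the text says the client blocks, this sketch simply refuses the second request so that it stays single-threaded and easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One-slot request queue kept per algorithm object. */
typedef struct {
    bool pending;            /* true while a buffer is being processed */
    long start_deadline_us;  /* start-deadline attached by the manager */
} alg_slot_t;

/* Tries to queue one request. Returns false if a request is already
 * queued; a real client would block here instead of failing. */
bool slot_submit(alg_slot_t *slot, long start_deadline_us)
{
    assert(slot != NULL);
    if (slot->pending)
        return false;
    slot->pending = true;
    slot->start_deadline_us = start_deadline_us;
    return true;
}

/* Marks the current buffer as processed and frees the slot. */
void slot_complete(alg_slot_t *slot)
{
    assert(slot != NULL);
    slot->pending = false;
}
```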
3. Memory paging
To run multiple algorithms well on one DSP, or on any processor for that matter, a set of rules must be established so that system resources are shared fairly among the algorithms. These rules govern access to processor peripherals such as DMA, to internal memory, and to the algorithm scheduling method. Once a set of rules is accepted, a system interface can be developed for plugging in algorithms so that they can access system resources. A common interface gives algorithm developers well-defined boundaries within which to develop algorithms quickly, because they can concentrate solely on algorithm development without system-support issues. One such interface is the Texas Instruments iDSP Media Platform DSP framework. All accesses between an algorithm and a TMS320C62xx DSP go through this framework.
The Texas Instruments XDAIS standard establishes the rules needed to plug more than one algorithm into the iDSP Media Platform, so that system integrators can quickly assemble production-quality systems from one or more algorithms. The XDAIS standard requires an algorithm to conform to a common required interface called the Alg interface. The XDAIS standard imposes several rules; the most important is that an algorithm may not directly define memory or directly access hardware peripherals. System services are provided through a single common interface used by all algorithms. The system integrator therefore only has to provide one DSP framework that supports the Alg interface for all algorithms. The Alg interface also gives algorithm developers a means of accessing system services and invoking their algorithms.
An algorithm must define exactly the memory it requires. Supporting access by multiple algorithms to the same space in memory requires a paging architecture. XDAIS requires algorithms to specify their internal and external memory requirements.
Internal (on-chip) memory must be divided into two areas. The first is the system overhead area, which holds the OS data supporting a particular DSP system configuration. The second area is for use by algorithms, but only while they are scheduled and executing. The sizes of both memory areas must be fixed. The second area is called the algorithm on-chip workspace; this workspace can also be described as a data overlay or a data memory page. See Fig. 14.
To determine how large the algorithm on-chip workspace can be, the system developer takes the total amount of internal data memory available and subtracts the amount needed by supporting system software, such as OS support and data support for the paging architecture. The OS configuration, such as the number of tasks, semaphores, and so on, should be sized by the DSP system designer to support the maximum number of algorithms the designer intends to run concurrently. This keeps OS overhead to a minimum and enlarges the algorithm workspace.
For an algorithm to run in this environment, the memory it needs must be less than the size of the workspace. Otherwise the system integrator cannot integrate the algorithm. There is one restriction: each algorithm has only one page. The architecture does not support multiple pages per algorithm.
The algorithm workspace is divided into three parts: stack (mandatory), persistent memory, and non-persistent memory. There is sometimes a fourth part, discussed later, which holds the read-only portion of persistent memory. See Fig. 15.
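The sizing rule and the fit check can be sketched as follows. The names and the byte counts in the usage note are hypothetical, and treating an exact fit as acceptable is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Memory an algorithm declares for its single workspace page (bytes). */
typedef struct {
    size_t stack;           /* mandatory */
    size_t persistent;      /* survives between invocations */
    size_t non_persistent;  /* scratch, valid only while executing */
} alg_memory_t;

/* Workspace = total internal data memory minus system overhead. */
size_t workspace_size(size_t internal_total, size_t system_overhead)
{
    assert(system_overhead <= internal_total);
    return internal_total - system_overhead;
}

/* An algorithm can be integrated only if its one page fits.
 * Assumption: a page exactly as large as the workspace still fits. */
bool algorithm_fits(const alg_memory_t *m, size_t workspace)
{
    assert(m != NULL);
    return m->stack + m->persistent + m->non_persistent <= workspace;
}
```

For instance, with 64 KB of internal data memory and 16 KB of system overhead the workspace is 48 KB; an algorithm declaring 1 KB of stack, 8 KB persistent, and 16 KB non-persistent memory fits, while one declaring 32 KB of each of the latter two does not.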
Algorithm only uses the workspace in the chip when it is carried out.When an algorithm is scheduled when carrying out, dsp system software will be sent to the interior internal work district of chip to the workspace of algorithm from its internal storage unit (shade storage).Obey control when algorithm, dsp system software will be determined following which algorithm of this operation, not need to transmit the workspace if it is identical algorithm.Be stored in its shadow unit of internal storage if next algorithm is a current workspace of different algorithms, and the workspace of next algorithm is transmitted.Referring to Figure 16.
On a context switch, the entire workspace of an algorithm is not transferred. Only the used portion of the stack and the persistent data memory are transferred. An algorithm's stack is at its highest level (least used) when the algorithm is at the top of its call stack; in other words, when the algorithm is at its entry point.
The ideal context switch for an algorithm occurs when its stack is at its highest level, because that means the smallest amount of data has to be transferred off-chip to shadow storage. See Figure 17.
The data paging architecture of the preferred embodiment requires context switching to be as efficient as possible. Any extra time spent on context switching is time the DSP could have used to execute algorithms. Because the best time for a context switch is at an algorithm's call boundary, preemption of algorithms must be minimized. Preempting an algorithm while its stack is larger than its minimum degrades overall system performance. Avoiding preemption should be a requirement, but preemption is acceptable on a very limited basis. See Figures 18-19.
A special case of the algorithm workspace arises when the algorithm needs read-only persistent memory. The algorithm uses such memory for lookup tables. Because this memory is never modified, it only needs to be read in and never written back. This asymmetric page transfer reduces the algorithm's context-switch overhead.
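The transfer cost of a context switch under these rules can be estimated as follows. This is a sketch under stated assumptions: the `Workspace` fields and all byte counts are hypothetical, and the model simply counts bytes rather than performing any transfer.

```python
from dataclasses import dataclass


@dataclass
class Workspace:
    stack_used: int           # bytes of stack in use at the moment of the switch
    persistent_rw: int        # persistent memory the algorithm modifies
    persistent_ro: int = 0    # read-only persistent memory (lookup tables)
    non_persistent: int = 0   # scratch memory, never saved or restored


def bytes_paged_out(ws):
    # Read-only and non-persistent memory are never written back to shadow storage.
    return ws.stack_used + ws.persistent_rw


def bytes_paged_in(ws):
    # Read-only tables must be read in; scratch memory need not be.
    return ws.stack_used + ws.persistent_rw + ws.persistent_ro


# Hypothetical sizes: the same algorithm switched at its entry point and mid-call.
at_boundary = Workspace(stack_used=0, persistent_rw=1024,
                        persistent_ro=4096, non_persistent=8192)
preempted = Workspace(stack_used=1536, persistent_rw=1024,
                      persistent_ro=4096, non_persistent=8192)

print(bytes_paged_out(at_boundary), bytes_paged_in(at_boundary))  # 1024 5120
print(bytes_paged_out(preempted), bytes_paged_in(preempted))      # 2560 6656
```

The difference between the two rows is the used stack, which is why switching at a call boundary is preferred over preemption.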
With this data paging architecture, a single algorithm can be instantiated more than once. Because an algorithm defines the memory it needs, the DSP system integrator can create more than one instance of the same algorithm. The DSP system software keeps track of the multiple instances and schedules each one. Because a shadow copy must be kept for every algorithm instance, the number of instances is limited by how much memory the DSP system has.
The DSP system software must manage each instance so that it can match the correct algorithm data to the scheduled algorithm. Because most DSP algorithms are instantiated as tasks, the DSP system software can use the task environment pointer as a means of managing algorithm instances.
4. Data flow chaining
The data flow of the preferred embodiment relies on integrating the processing units, providing them with a shared memory space, and routing data directly among the processing units without intervention by the GPP. Such a system is shown in Figure 21.
When processing unit PEa has finished processing a block of data, it writes the resulting data to a predetermined output buffer in the shared memory. PEa then notifies the next processing unit in the chain, PEb, over the appropriate control path. PEb then reads the data from its input buffer for further processing. In this manner, data passes through all the required processing units until it has been consumed.
As described above, a set of buffers is used for communication between two processing units and forms an I/O channel between them. Multiple I/O channels can exist between any two processing units, allowing the system to process multiple data streams simultaneously (that is, in parallel). Figure 22 shows an example of parallel processing of multiple data streams s1 and s2.
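The hand-off between two processing units can be sketched in a few lines. This is a simplified, single-threaded model: the `Channel` class is hypothetical, a queue stands in for the buffers in shared memory, and the `ready` check stands in for the notification on the control path.

```python
from collections import deque


class Channel:
    """An I/O channel: a set of buffers shared by two processing units."""

    def __init__(self, name):
        self.name = name
        self.buffers = deque()   # stands in for buffers in shared memory

    def write(self, data):
        self.buffers.append(data)    # producer fills a buffer

    def read(self):
        return self.buffers.popleft()

    def ready(self):
        return bool(self.buffers)    # models "input buffer is ready to be read"


def run_unit(process, inp, out):
    """Drain the input channel, process each block, and pass the result on."""
    while inp.ready():
        out.write(process(inp.read()))


# One chain: source -> PEa -> PEb -> sink, carrying stream s1.
source, a_to_b, sink = Channel("in"), Channel("a_to_b"), Channel("out")
for block in ["frame0", "frame1"]:
    source.write(block)

run_unit(str.upper, source, a_to_b)           # PEa
run_unit(lambda b: b + "!", a_to_b, sink)     # PEb
print(list(sink.buffers))                     # ['FRAME0!', 'FRAME1!']
```

A second stream s2 would use its own `Channel` objects between the same two units, which is how multiple channels between a pair of units allow streams to be processed side by side.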
A series of processing units connected by I/O channels forms a channel chain. Several channel chains can be defined within a particular system. In this case, each input channel of a processing unit in the middle of a chain has an associated output channel. The processing units at the two ends of a chain have only an input channel or an output channel.
The input channel of a processing unit specifies which buffers to read data from. The output channel of a processing unit specifies which buffers to write data to and which subsequent processing unit to notify. The types of control information exchanged between the data processing units and the central control processor (CCP) are:
(1) Status information: data stream processing started, stopped, aborted, paused, resumed, and so on.
(2) Quality-of-service information: time stamps, system load, resources free/busy, and so on.
(3) Data flow control information: start, pause, resume, rewind, and so on.
(4) System load information: tasks running, number of active channels, channels per processing unit, and so on.
In one preferred embodiment, the creation and association of processing unit I/O channels is statically defined through a configuration file that is read at system initialization. For each type of bitstream to be processed, the configuration file defines a channel chain (that is, a data path) connecting the appropriate processing units. The combined processing by all the processing units in a channel chain results in the complete consumption of the data.
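The patent text does not give the configuration file format. As an illustration only, the sketch below assumes a JSON file that lists, for each bitstream type, the processing units in order; the unit names, the `raw_yuv` entry, and the `load_chains` helper are all hypothetical.

```python
import json

# Hypothetical configuration: bitstream type -> ordered list of processing units.
CONFIG_TEXT = """
{
  "h263":    ["io_dsp", "h263_decoder", "color_converter", "video_overlay"],
  "raw_yuv": ["io_dsp", "color_converter", "video_overlay"]
}
"""


def load_chains(text):
    """Return {bitstream type: [(channel number, source, destination), ...]}."""
    chains = {}
    for stream_type, units in json.loads(text).items():
        # Each adjacent pair of units is joined by one numbered I/O channel.
        chains[stream_type] = [
            (n, src, dst)
            for n, (src, dst) in enumerate(zip(units, units[1:]), start=1)
        ]
    return chains


chains = load_chains(CONFIG_TEXT)
for channel in chains["h263"]:
    print(channel)
# (1, 'io_dsp', 'h263_decoder')
# (2, 'h263_decoder', 'color_converter')
# (3, 'color_converter', 'video_overlay')
```

Because the file is read once at initialization, the resulting table of channels can be treated as fixed for the lifetime of the system.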
Where multiple data paths exist for a given bitstream, alternate or backup channel chains can be defined. If any processing unit in the initial channel chain is unavailable, the bitstream can be routed to an alternate channel chain. The channel chain is selected at run time according to the type of the bitstream being routed and dynamic QoS analysis. All legal channels in the system are fixed and cannot be modified at run time.
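Falling back to a backup chain can be sketched as a search over the statically defined chains. The chain contents and the `select_chain` helper are hypothetical, and this sketch checks only availability; it does not model the QoS analysis.

```python
def select_chain(chains, available):
    """Return the first chain whose processing units are all available.

    `chains` is ordered by preference: initial chain first, backups after.
    """
    for chain in chains:
        if all(unit in available for unit in chain):
            return chain
    raise LookupError("no usable channel chain")


# Hypothetical chains for one bitstream type, fixed at initialization.
H263_CHAINS = [
    ["io_dsp", "media_dsp", "video_overlay"],    # initial chain
    ["io_dsp", "backup_dsp", "video_overlay"],   # backup chain
]

all_units = {"io_dsp", "media_dsp", "backup_dsp", "video_overlay"}
print(select_chain(H263_CHAINS, all_units))
# ['io_dsp', 'media_dsp', 'video_overlay']
print(select_chain(H263_CHAINS, all_units - {"media_dsp"}))
# ['io_dsp', 'backup_dsp', 'video_overlay']
```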
In another preferred embodiment, the channel chains for different bitstreams can be constructed dynamically when a new bitstream arrives at the communication processor. Information derived from the bitstream at run time can be sent to the CCP in control messages, and the CCP then determines the required processing units and dynamically allocates the I/O channels between them. This approach allows resources to be taken out of service or brought online at run time, so that the system can adapt.
In a shared-memory heterogeneous system, data flows among the processing units through the internal shared memory without intervention by the CCP. The data never appears on the bus, so the speed of data processing is determined by the access time of the shared memory rather than by the transfer time of the bus. Because CCP intervention is minimized, the CCP response time and processing delay are removed from the overall data flow time. Minimizing the time spent transferring data among the processing units increases system throughput.
5a. Example
A typical data flow application of the techniques discussed here is a media processing system. Such a system initiates and controls broadband media streams for processing such as image decoding, encoding, transcoding, conversion, scaling, and so on. It can handle media streams from a local source or from a remote computer or server over communication media such as cable modem, DSL, or wireless. Figure 23 shows an example of such a system.
The media processing system in Figure 23 comprises five processing units:
(1) DSL or cable modem front-end I/O DSP
(2) Media DSP
(3) Video/graphics overlay processor
(4) H.263 decoder task
(5) Color space conversion task
An H.263 data stream enters the front-end I/O DSP and follows the channel chain defined by labels 1-3. Each channel connects two processing units and consists of a set of buffers used to pass data between the units. Control flow is shown by the hatched lines.
The H.263 data stream passes from the front-end I/O DSP into the I/O buffers defined for channel 1 in global shared memory. The front-end I/O DSP notifies the target processing unit associated with channel 1, namely the H.263 decoder task on the media DSP, that its input buffer is full and ready to be read. The H.263 decoder task reads the data from the channel 1 I/O buffers, decodes it, and writes the resulting YUV data to the channel 2 I/O buffers in local shared memory.
Note that a channel can run between processors or within a processor. Data can be passed through global shared memory (between processors) or through the "local" shared memory of a given processor (within a processor). In Fig. 4, channels 1 and 3 are between processors, and channel 2 is within a processor.
5. Modifications
The preferred embodiments can be modified in various ways while retaining the described features.
Claims (7)
1. the method for client-server scheduling comprises:
(a) phase one of dispatching on a client is for the task of the server that is coupled to described client is provided with real-time closing time; With
(b) subordinate phase of dispatching on the server of the subtask of described task, the subordinate phase of described scheduling is used the real-time closing time of step (a).
2. the dispatching method of claim 1, wherein:
(a) described task comprises a Media Stream decoding; With
(b) described subtask comprises a frame coding that is used for the frame of described Media Stream.
3. be used for the method for a kind of Object Request Broker of a client-server system, comprise:
(a) one first client requests of destruction is returned with one second client requests and is called; With
(b) input that outputs to a second server object of one first server object of link, wherein said first server and described second server be corresponding first and second client requests respectively.
4. the method for claim 3, wherein:
(a) described link is to realize by create an impact damper that is used for intermediate result (input of the output of described first object and described second object) in described server.
5. the method for a processor-server memory management in a client-server system comprises;
(a) be a first of a processor storage of expense distribution of processor; With
(b) distribute a second portion of described processor storage for the task workspace, wherein said second portion can be taken by single task only at every turn.
6. the method for claim 5, wherein:
(a) second portion of described storer comprises a stack element, a persistent storage element and a non-persistent memory element.
7. the method for streams data in a nonhomogeneous system, this system have each processing unit that a bus is connected to a processor controls and is connected to a plurality of processing units, comprising:
(a) by using an isolated common storage from described bus, in described processing unit, transmit data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN01119611A CN1405679A (en) | 2001-04-26 | 2001-04-26 | Multi-processor target control |
Publications (1)
Publication Number | Publication Date |
---|---|
CN1405679A true CN1405679A (en) | 2003-03-26 |
Family
ID=4663690
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN01119611A Pending CN1405679A (en) | 2001-04-26 | 2001-04-26 | Multi-processor target control |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN1405679A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1307543C (en) * | 2003-06-13 | 2007-03-28 | 三星电子株式会社 | Apparatus and method for initializing coprocessor for use in system comprised of main processor and coprocessor |
CN1318971C (en) * | 2003-11-18 | 2007-05-30 | 神达电脑股份有限公司 | Method of raising data storage device integrated working efficiency under multitask operating environment |
CN1312592C (en) * | 2004-02-26 | 2007-04-25 | 索尼株式会社 | Information processing system, information processing method, and computer program |
CN100382490C (en) * | 2004-07-05 | 2008-04-16 | 索尼株式会社 | Server/client system, information processing unit, information processing method, and computer program |
CN103493037A (en) * | 2011-03-16 | 2014-01-01 | 迈思肯系统公司 | Multi-core distributed processing for machine vision applications |
US9235455B2 (en) | 2011-03-16 | 2016-01-12 | Microscan Systems, Inc. | Multi-core distributed processing using shared memory and communication link |
CN103677984A (en) * | 2012-09-20 | 2014-03-26 | 中国科学院计算技术研究所 | Internet of Things calculation task scheduling system and method |
CN103677984B (en) * | 2012-09-20 | 2016-12-21 | 中国科学院计算技术研究所 | A kind of Internet of Things calculates task scheduling system and method thereof |
CN109815537A (en) * | 2018-12-19 | 2019-05-28 | 清华大学 | A kind of high-throughput material simulation calculation optimization method based on time prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |