WO2006049740A2 - Method for integrating multiple object files from heterogeneous architectures into a set of files - Google Patents
- Publication number: WO2006049740A2 (PCT/US2005/034460)
- Authority: WO, WIPO (PCT)
- Prior art keywords: code, processor, ocl, created, multiprocessor
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/54—Link editing before load time
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4482—Procedural
- G06F9/4484—Executing subprograms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
Definitions
- the present invention relates generally to the processing of object files and, more particularly, to the integrating of multiple object files from heterogeneous architectures.
- a program defined in one name space may reference a program defined in another name space.
- the processors involved may comprise different machine types, with different architectures, different instruction sets, and different forms of object files.
- a linker could misinterpret object code generated for another processor and handle the code incorrectly.
- the programmer could hard-code a call from a program running on one processor to a program in the name space of another processor, but the process could become cumbersome. With hard coding, runtime reference to the object code, dynamic linking and object sharing, and execution-time handling of an object from the combined multiprocessor name space would not be possible.
- the present invention is a method for integrating multiple object codes from heterogeneous architectures.
- the object code for the second-processor program is enclosed in a wrapper to create object code in the first-processor name space.
- the header of the wrapped object code defines a new symbol in the name space of the first processor, and the symbol points to the second-processor object code contained in the wrapped object code.
- the referencing program on the first processor references the wrapped object code.
- FIGURE 1 shows a block diagram of a multiprocessor comprising processors with distinct architectures.
- FIGURE 2 illustrates enclosing object code in ELF format in a wrapper.
- FIGURE 3 depicts a flow diagram of the execution of object code on one processor after a call from another processor.
- FIGURE 4 depicts a flow diagram of the creation of a wrapped object containing object code.
- FIGURE 1 shows a block diagram of a multiprocessor comprising processors with distinct architectures.
- the multiprocessor 100 comprises two processors, the PU 102 and the SPU 110, with heterogeneous architectures. Object files which run on one processor do not run on the other. Nevertheless, code running on the PU 102 may reference code designed to run on the SPU 110.
- the two processors, the PU 102 and the SPU 110, differ in their access to data.
- the PU 102 has access to system memory 108 and a cache 104, under the control of a first DMA controller 106.
- the DMA controller 106 handles load and store instructions to transfer data to and from the system memory 108 and the cache 104 and the PU 102.
- the data moving to and from the system memory 108 travels over a system bus 116.
- the SPU 110 does not have access to the system memory 108 through load and store instructions.
- a second DMA controller 114 transfers data from the system memory 108 to local store 112, and the SPU 110 can load and store from there.
- the DMA controller 114 is connected to the system memory 108 via system bus 116.
- in other embodiments, the architecture of the multiprocessor 100 is different.
- in one such embodiment, the multiprocessor 100 comprises multiple copies of the PU 102, all sharing a single system memory.
- in another, the multiple copies of the PU 102 each share a single cache.
- in yet another, some groups of one or more PUs share a cache, while some PUs do not have access to a cache.
- the SPU 110 has its own separate memory.
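The data paths described for FIGURE 1 (the SPU 110 reaching the system memory 108 only through the DMA controller 114 and the local store 112) can be modeled roughly as follows. This is a minimal sketch, not the patented hardware: the class names, method names, sizes, and addresses are all assumptions chosen for illustration.

```python
class SystemMemory:
    """Shared system memory 108, modeled as a flat bytearray."""
    def __init__(self, size):
        self.data = bytearray(size)


class LocalStore:
    """SPU-private local store 112."""
    def __init__(self, size):
        self.data = bytearray(size)


class DmaController:
    """Stands in for the second DMA controller 114. In this model it is the
    only path between system memory and the local store."""
    def __init__(self, system_memory, local_store):
        self.mem = system_memory
        self.ls = local_store

    def get(self, mem_addr, ls_addr, length):
        # system memory -> local store
        self.ls.data[ls_addr:ls_addr + length] = self.mem.data[mem_addr:mem_addr + length]

    def put(self, ls_addr, mem_addr, length):
        # local store -> system memory
        self.mem.data[mem_addr:mem_addr + length] = self.ls.data[ls_addr:ls_addr + length]


class Spu:
    """The SPU loads and stores only within its own local store."""
    def __init__(self, local_store):
        self.ls = local_store

    def load(self, ls_addr):
        return self.ls.data[ls_addr]

    def store(self, ls_addr, value):
        self.ls.data[ls_addr] = value


def demo():
    mem = SystemMemory(64)
    ls = LocalStore(16)
    dma = DmaController(mem, ls)
    spu = Spu(ls)

    mem.data[8:12] = bytes([1, 2, 3, 4])        # data placed in system memory
    dma.get(mem_addr=8, ls_addr=0, length=4)    # staged into the local store
    total = sum(spu.load(i) for i in range(4))  # SPU touches local data only
    spu.store(4, total)
    dma.put(ls_addr=4, mem_addr=32, length=1)   # result travels back by DMA
    return mem.data[32]
```

Note that `Spu` holds no reference to `SystemMemory` at all, which is the point of the model: any data it works on has to be staged by DMA first.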
- FIGURE 2 illustrates enclosing object code in ELF format in a wrapper.
- Object code 200 in ELF format for an SPU 110 routine comprises an ELF header section 202 and the remaining sections of the object code 204 for the routine. The remaining sections include program and data.
- the object code 200 is converted into object code 208, which is a PU 102 object, by adding a wrapper 210.
- the wrapper 210 contains the symbol definition of a PU 102 object with the same name as the SPU 110 routine. For example, if the SPU 110 routine is BAR-SPU, the wrapper 210 defines a symbol BAR-SPU, a PU 102 object.
- the object code 208 also contains the object code 200, including the ELF headers 212 and the remaining sections of the object code 214.
- the symbol BAR-SPU is a pointer to, or refers to, the object code 200 within the object code 208.
- the SPU object code 200 is an SPU object, BAR-SPU.o.
- the wrapped code 208 is a PU object, BAR-SPU-PU.o.
- the wrapping process makes possible the integration of multiple object files from heterogeneous architectures.
- the wrapping of an SPU 110 object creates a PU 102 object which can be treated for linking and loading purposes as any other PU 102 object.
- the SPU 110 object that was wrapped is handled correctly.
- the wrapping process makes possible the integration of PU 102 and SPU 110 objects.
- the linker links to the PU 102 object BAR-SPU-PU.o.
- This method supports static and dynamic linking and the object sharing of an SPU 110 object.
- the wrapping allows the loading of any SPU 110 file format.
- the wrapped PU object 208 is loaded.
- PU 102 runtime reference can be made to an SPU 110 object.
- the runtime reference on the PU 102 is to the PU 102 object BAR-SPU-PU.o.
- the wrapping also allows a clear separation of PU 102 object name space and SPU 110 object name space.
- Code running on the PU 102 does not have to refer directly to an SPU 110 object. Instead, the SPU 110 object is wrapped, creating a PU 102 object, and the PU 102 code refers to the wrapped object, a PU 102 object.
- the result is also a simple symbol association for PU 102 program reference.
- PU 102 code refers to a PU 102 symbol, which points to an SPU 110 object.
- the result gives the capability of pre-linking and mixing both PU 102 and SPU 110 objects.
- the wrapping process is friendly to library packaging for both static and dynamic needs.
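The wrapping described for FIGURE 2 can be illustrated with a toy byte layout: a wrapper header that defines a symbol and records where the embedded SPU object code sits. This is only a sketch under stated assumptions. The source does not specify the wrapper's byte format, so the `WRAP` marker, the header fields, and the function names `wrap_object` and `resolve` are hypothetical; an actual tool would presumably emit a genuine PU object file.

```python
import struct

ELF_MAGIC = b"\x7fELF"
WRAP_MAGIC = b"WRAP"              # hypothetical marker, not from the source
# Hypothetical header: magic, symbol-name length, payload offset, payload size.
HEADER = struct.Struct("<4sHII")


def wrap_object(symbol, spu_elf):
    """Return the bytes of a toy 'PU object' whose header defines `symbol`
    and points at the embedded SPU object code."""
    if not spu_elf.startswith(ELF_MAGIC):
        raise ValueError("payload does not look like an ELF object")
    name = symbol.encode("ascii")
    offset = HEADER.size + len(name)          # payload follows header + name
    header = HEADER.pack(WRAP_MAGIC, len(name), offset, len(spu_elf))
    return header + name + spu_elf


def resolve(wrapped, symbol):
    """Follow the wrapper symbol to the embedded SPU object code."""
    magic, name_len, offset, size = HEADER.unpack_from(wrapped, 0)
    if magic != WRAP_MAGIC:
        raise ValueError("not a wrapped object")
    name = wrapped[HEADER.size:HEADER.size + name_len].decode("ascii")
    if name != symbol:
        raise KeyError(symbol)
    return wrapped[offset:offset + size]
```

For example, `wrap_object("BAR-SPU", spu_bytes)` would play the role of producing BAR-SPU-PU.o from BAR-SPU.o, and `resolve(wrapped, "BAR-SPU")` returns the original SPU bytes unchanged, ELF header included.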
- FIGURE 3 depicts a flow diagram 300 of the execution of object code on one processor after a call from another processor.
- a program FOO running on the PU 102 calls the routine BAR, which runs on the SPU 110.
- the call to BAR is interpreted as a call to the PU 102 object BAR-SPU-PU.o.
- the wrapped code BAR-SPU-PU.o is run on the PU 102.
- the SPU object code for BAR, which is contained in the wrapped code BAR-SPU-PU.o, is then DMA'ed over to the local store 112 of the SPU 110.
- the SPU 110 starts executing the code.
- the result is DMA'ed back to the PU 102.
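The call sequence of FIGURE 3 can be traced in a small simulation. This is a sketch only: the two-opcode "instruction set", the memory sizes, the addresses, and the names `call_bar`, `dma`, and `foo` are invented for illustration, and the result is modeled as landing in a bytearray standing in for memory visible to the PU 102.

```python
class Spu:
    """Toy SPU: executes only what is present in its own local store."""

    def __init__(self, size=64):
        self.local_store = bytearray(size)

    def run(self, code_len, arg_addr, result_addr):
        # Hypothetical two-opcode instruction set, invented for this sketch:
        # 0x01 increments the accumulator, 0x02 doubles it.
        acc = self.local_store[arg_addr]
        for op in self.local_store[:code_len]:
            if op == 0x01:
                acc += 1
            elif op == 0x02:
                acc *= 2
            else:
                raise ValueError("unknown opcode %#x" % op)
        self.local_store[result_addr] = acc & 0xFF


def dma(src, src_addr, dst, dst_addr, length):
    """Stand-in for a DMA transfer between two memories."""
    dst[dst_addr:dst_addr + length] = src[src_addr:src_addr + length]


# The wrapped PU object: a PU-visible symbol whose payload is SPU code for BAR.
BAR_SPU_PU_O = {"symbol": "BAR-SPU", "spu_code": bytes([0x02, 0x01])}


def call_bar(system_memory, spu, arg_addr, result_addr):
    """PU-side view of FOO calling BAR through the wrapped object."""
    code = BAR_SPU_PU_O["spu_code"]
    n = len(code)
    dma(code, 0, spu.local_store, 0, n)                  # code -> local store
    dma(system_memory, arg_addr, spu.local_store, n, 1)  # argument -> local store
    spu.run(code_len=n, arg_addr=n, result_addr=n + 1)   # SPU executes locally
    dma(spu.local_store, n + 1, system_memory, result_addr, 1)  # result -> PU side
    return system_memory[result_addr]


def foo():
    """Stand-in for the PU program FOO."""
    system_memory = bytearray(16)
    system_memory[0] = 5                                 # argument for BAR
    return call_bar(system_memory, Spu(), arg_addr=0, result_addr=1)
```

With the toy program above (double, then increment), `foo()` returns 11 for the argument 5. The caller never handles SPU code directly; it only goes through the PU-side object.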
- FIGURE 4 depicts a flow diagram 400 of the creation of a wrapped object containing SPU 110 object code.
- the SPU 110 routine is named BAR.
- an SPU 110 object file is created for BAR in ELF format, BAR-SPU.o.
- This object file is created by a compiler or assembler compatible with the processor SPU 110.
- a wrapper is placed on this code to create PU 102 object code, BAR-SPU-PU.o.
- a system tool is available on the multiprocessor 100 to create the wrapper.
- in step 406, the system tool defines within the wrapper the PU 102 symbol BAR-SPU as a pointer to the SPU 110 object BAR-SPU.o, contained within the PU 102 object BAR-SPU-PU.o.
- once the SPU 110 file has been embedded in a PU 102 object file, it can be treated as an ordinary PU 102 file, and in step 408, the user can transform it to any file format, such as an executable, dynamic shared library, and/or archive format.
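The creation flow of FIGURE 4, followed by an ordinary symbol lookup, can be sketched with plain dictionaries standing in for object files and an archive. Everything here is an assumption made for illustration: the dictionary fields, the function names, and the toy archive index are hypothetical and do not describe the actual system tool or any real archive format.

```python
def make_spu_object(routine):
    """Stand-in for compiling or assembling the SPU routine into an ELF object."""
    return {"file": routine + "-SPU.o", "arch": "SPU",
            "bytes": b"\x7fELF" + routine.encode("ascii")}


def wrap_for_pu(spu_obj):
    """Wrap the SPU object and define a PU symbol pointing at it (cf. step 406)."""
    base = spu_obj["file"][:-2]                 # strip ".o", e.g. "BAR-SPU"
    return {"file": base + "-PU.o", "arch": "PU",
            "symbols": {base: spu_obj["bytes"]}}


def make_archive(pu_objects):
    """Bundle PU objects into a toy archive with a symbol index (cf. step 408)."""
    index = {}
    for obj in pu_objects:
        if obj["arch"] != "PU":
            raise ValueError("archive accepts PU objects only")
        for sym in obj["symbols"]:
            index[sym] = obj["file"]
    return {"members": {o["file"]: o for o in pu_objects}, "index": index}


def link(undefined_symbols, archive):
    """Resolve each undefined PU symbol to the archive member defining it.
    The lookup never inspects the payload, so it needs no SPU knowledge."""
    return {sym: archive["index"][sym] for sym in undefined_symbols}
```

Usage would look like `link(["BAR-SPU"], make_archive([wrap_for_pu(make_spu_object("BAR"))]))`, which maps the symbol BAR-SPU to the member BAR-SPU-PU.o. The raw SPU object is rejected by `make_archive`, mirroring the separation of the two name spaces.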
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Stored Programmes (AREA)
- Devices For Executing Special Programs (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP05800060A EP1815329A2 (en) | 2004-10-28 | 2005-09-23 | Method for integrating multiple object files from heterogeneous architectures into a set of files |
JP2007538925A JP5072599B2 (en) | 2004-10-28 | 2005-09-23 | How to integrate multiple object files from heterogeneous architectures into a set of files |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/976,264 US20060095898A1 (en) | 2004-10-28 | 2004-10-28 | Method for integrating multiple object files from heterogeneous architectures into a set of files |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2006049740A2 (en) | 2006-05-11 |
WO2006049740A3 (en) | 2006-08-10 |
Family
ID=36178031
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2005/034460 WO2006049740A2 (en) | 2004-10-28 | 2005-09-23 | Method for integrating multiple object files from heterogeneous architectures into a set of files |
Country Status (6)
Country | Link |
---|---|
US (1) | US20060095898A1 (en) |
EP (1) | EP1815329A2 (en) |
JP (1) | JP5072599B2 (en) |
KR (1) | KR100892191B1 (en) |
CN (1) | CN101048734A (en) |
WO (1) | WO2006049740A2 (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7644402B1 (en) * | 2004-03-17 | 2010-01-05 | Sun Microsystems, Inc. | Method for sharing runtime representation of software components across component loaders |
US8120610B1 (en) * | 2006-03-15 | 2012-02-21 | Adobe Systems Incorporated | Methods and apparatus for using aliases to display logic |
US8255919B2 (en) * | 2007-03-23 | 2012-08-28 | Qualcomm Atheros, Inc. | Distributed processing system and method |
EP2336883B1 (en) * | 2008-09-09 | 2018-03-07 | NEC Corporation | Programming system in multi-core environment, and method and program of the same |
KR100968774B1 (en) * | 2008-09-18 | 2010-07-09 | 고려대학교 산학협력단 | Multi-Processing System for including a large number of heterogeneous Processor and Driving Method of the Same |
US20110113409A1 (en) * | 2009-11-10 | 2011-05-12 | Rodrick Evans | Symbol capabilities support within elf |
US9235458B2 (en) | 2011-01-06 | 2016-01-12 | International Business Machines Corporation | Methods and systems for delegating work objects across a mixed computer environment |
US9052968B2 (en) * | 2011-01-17 | 2015-06-09 | International Business Machines Corporation | Methods and systems for linking objects across a mixed computer environment |
US9104504B2 (en) | 2013-03-13 | 2015-08-11 | Dell Products Lp | Systems and methods for embedded shared libraries in an executable image |
US9753710B2 (en) * | 2013-11-07 | 2017-09-05 | Netronome Systems, Inc. | Resource allocation with hierarchical scope |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5247678A (en) * | 1989-10-12 | 1993-09-21 | Texas Instruments Incorporated | Load time linker for software used with a multiprocessor system |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3051438B2 (en) * | 1989-10-12 | 2000-06-12 | テキサス インスツルメンツ インコーポレイテッド | How to give enhanced graphics capabilities |
GB2272085A (en) * | 1992-10-30 | 1994-05-04 | Tao Systems Ltd | Data processing system and operating system. |
US6029000A (en) * | 1997-12-22 | 2000-02-22 | Texas Instruments Incorporated | Mobile communication system with cross compiler and cross linker |
CA2343437A1 (en) * | 2001-04-06 | 2002-10-06 | Ibm Canada Limited-Ibm Canada Limitee | Method and system for cross platform, parallel processing |
US20030046448A1 (en) * | 2001-06-06 | 2003-03-06 | Claudius Fischer | Application programming interface layer for a device |
US7415703B2 (en) * | 2003-09-25 | 2008-08-19 | International Business Machines Corporation | Loading software on a plurality of processors |
US7444632B2 (en) * | 2003-09-25 | 2008-10-28 | International Business Machines Corporation | Balancing computational load across a plurality of processors |
US20060031821A1 (en) * | 2004-08-04 | 2006-02-09 | Rutter Budd J Ii | Divided compiling program application functionality for software development |
2004
- 2004-10-28 US US10/976,264 patent/US20060095898A1/en not_active Abandoned
2005
- 2005-09-23 WO PCT/US2005/034460 patent/WO2006049740A2/en active Application Filing
- 2005-09-23 CN CNA2005800370347A patent/CN101048734A/en active Pending
- 2005-09-23 KR KR1020077009699A patent/KR100892191B1/en not_active IP Right Cessation
- 2005-09-23 EP EP05800060A patent/EP1815329A2/en not_active Ceased
- 2005-09-23 JP JP2007538925A patent/JP5072599B2/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
JP5072599B2 (en) | 2012-11-14 |
EP1815329A2 (en) | 2007-08-08 |
JP2008518355A (en) | 2008-05-29 |
US20060095898A1 (en) | 2006-05-04 |
CN101048734A (en) | 2007-10-03 |
WO2006049740A3 (en) | 2006-08-10 |
KR20070088624A (en) | 2007-08-29 |
KR100892191B1 (en) | 2009-04-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1815329A2 (en) | Method for integrating multiple object files from heterogeneous architectures into a set of files | |
EP0996059B1 (en) | Class loading model | |
US5892966A (en) | Processor complex for executing multimedia functions | |
US6324686B1 (en) | Just in time compiler technique | |
US4455602A (en) | Digital data processing system having an I/O means using unique address providing and access priority control techniques | |
US20060248262A1 (en) | Method and corresponding apparatus for compiling high-level languages into specific processor architectures | |
EP1267256A2 (en) | Conditional execution of instructions with multiple destinations | |
US20070006184A1 (en) | Method and apparatus for combined execution of native code and target code during program code conversion | |
US20020194459A1 (en) | Method and apparatus for saving and restoring processor register values and allocating and deallocating stack memory | |
WO2005103924A1 (en) | Modified computer architecture with initialization of objects | |
TWI291098B (en) | Method and system for data optimization and protection in DSP firmware | |
WO2007107707A2 (en) | Computer architecture | |
JP2002525707A (en) | An accurate method for virtual call inlining | |
JP4202244B2 (en) | VLIW DSP and method of operating the same | |
US6550000B1 (en) | Processor to execute in parallel plurality of instructions using plurality of functional units, and instruction allocation controller | |
US8707013B2 (en) | On-demand predicate registers | |
JP2008518355A5 (en) | ||
GB2330673A (en) | Data processing apparatus | |
CN110073332B (en) | Data processing apparatus and method | |
EP2577464B1 (en) | System and method to evaluate a data value as an instruction | |
US6314564B1 (en) | Method for resolving arbitrarily complex expressions at link-time | |
US20030126589A1 (en) | Providing parallel computing reduction operations | |
US6408380B1 (en) | Execution of an instruction to load two independently selected registers in a single cycle | |
US8468511B2 (en) | Use of name mangling techniques to encode cross procedure register assignment | |
EP0803801A1 (en) | Method and apparatus for mixing objective-C and C++ objects |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A2. Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | WWE | Wipo information: entry into national phase | Ref document number: 2005800060. Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 200580037034.7, country of ref document: CN. Ref document number: 2007538925, country of ref document: JP. Ref document number: 1020077009699, country of ref document: KR |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | Wipo information: published in national office | Ref document number: 2005800060. Country of ref document: EP |