US20090247249A1 - Data processing - Google Patents
- Publication number
- US20090247249A1 (application number US12/280,144)
- Authority
- US
- United States
- Prior art keywords
- emulating
- processing units
- emulated
- data
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45504—Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
Definitions
- This invention relates to data processing.
- electronic games are well known and may be supplied on a variety of distribution media, such as magnetic and/or optical discs.
- General computers or dedicated games consoles may be used to play these games.
- the emulating processor runs native program code arranged so that such native instructions or groups of native instructions have the same effect as data processing instructions relating to the emulated system.
- This invention provides a data processor comprising a plurality of interconnected real processing units arranged to emulate the operation of an emulated processor having a plurality of interconnected emulated processing units, in which:
- At least one emulated processing unit is emulated by contributions from two or more real processing units;
- At least one real processing unit contributes to emulating two or more emulated processing units.
- the invention addresses a problem relevant to an emulating system which uses a multi-processor architecture, particularly (though not exclusively) one in which communication between processors (in the emulating system) is relatively slow compared to the general speed of operation of the emulating system.
- the invention recognises that a division of the emulation of an emulated processing unit between two (or more) emulating processing units can reduce the message traffic needed to provide communication between the emulations of those emulated processing units.
- Similarly, where a single emulating processing unit contributes to the emulation of multiple emulated processing units which normally communicate heavily with one another, the message traffic needed to provide communication between the emulations of those emulated processing units can be greatly reduced.
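The many-to-many relationship between emulated and emulating (real) processing units can be pictured as a simple assignment table. The following is a minimal sketch only: the unit labels, the `ASSIGNMENT` table and the function names are hypothetical illustrations and are not an assignment given in the text. It checks the two conditions listed above and shows when a message between two emulated units could stay inside one real unit.

```python
from collections import defaultdict

# Hypothetical assignment: emulated unit -> real units contributing to it.
# The labels are illustrative placeholders, not taken from the text.
ASSIGNMENT = {
    "emulated_A": {"real_1", "real_2"},  # emulation split across two real units
    "emulated_B": {"real_2"},            # shares real_2 with emulated_A
    "emulated_C": {"real_3"},
}


def invert(assignment):
    """Return a map of real unit -> emulated units it contributes to."""
    by_real = defaultdict(set)
    for emulated, reals in assignment.items():
        for real in reals:
            by_real[real].add(emulated)
    return dict(by_real)


def satisfies_both_conditions(assignment):
    """True if some emulated unit is split across two or more real units
    and some real unit contributes to two or more emulated units."""
    split = any(len(reals) >= 2 for reals in assignment.values())
    shared = any(len(ems) >= 2 for ems in invert(assignment).values())
    return split and shared


def needs_inter_unit_message(assignment, src, dst):
    """A message between two emulated units can stay local when at least one
    real unit contributes to both emulations; otherwise it must travel
    between real units."""
    return not (assignment[src] & assignment[dst])
```

In this toy table, traffic between `emulated_A` and `emulated_B` can be handled inside `real_2`, whereas traffic between `emulated_A` and `emulated_C` still has to cross between real units.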
- FIG. 1 schematically illustrates the overall system architecture of the PlayStation2
- FIG. 2 schematically illustrates the architecture of an Emotion Engine
- FIG. 3 schematically illustrates the configuration of a Graphics Synthesiser
- FIG. 4 schematically illustrates the structure of an emulating processor, in particular a Sony® PlayStation 3® device
- FIG. 5 schematically illustrates a cell processor
- FIG. 6 schematically illustrates a graphics unit
- FIG. 7 schematically illustrates logical interactions within the emulating processor.
- FIG. 1 schematically illustrates the overall system architecture of the PlayStation2 computer games machine.
- a system unit 10 is provided, with various peripheral devices connectable to the system unit.
- the system unit 10 comprises: an Emotion Engine 100 ; a Graphics Synthesiser 200 ; a sound processor unit 300 having dynamic random access memory (DRAM); a read only memory (ROM) 400 ; a compact disc (CD) and digital versatile disc (DVD) reader 450 ; a Rambus Dynamic Random Access Memory (RDRAM) unit 500 ; an input/output processor (IOP) 700 with dedicated RAM 750 .
- An (optional) external hard disk drive (HDD) 390 may be connected.
- the input/output processor 700 has two Universal Serial Bus (USB) ports 715 and an iLink or IEEE 1394 port (iLink is the Sony Corporation implementation of the IEEE 1394 standard) (not shown).
- the IOP 700 handles all USB, iLink and game controller data traffic. For example when a user is playing a game, the IOP 700 receives data from the game controller and directs it to the Emotion Engine 100 which updates the current state of the game accordingly.
- the IOP 700 has a Direct Memory Access (DMA) architecture to facilitate rapid data transfer rates. DMA involves transfer of data from main memory to a device without passing it through the CPU.
- the USB interface is compatible with Open Host Controller Interface (OHCI) and can handle data transfer rates of between 1.5 Mbps and 12 Mbps. Provision of these interfaces means that the PlayStation2 is potentially compatible with peripheral devices such as digital video cassette recorders (VCRs) e.g. camcorders, digital cameras, microphones, printers, and input devices such as a keyboard, mouse and joystick.
- In order for successful data communication to occur with a peripheral device connected to a USB port 715, an appropriate piece of software such as a device driver should be provided.
- Device driver technology is very well known and will not be described in detail here, except to say that the skilled man will be aware that a device driver or similar software interface may be required in the embodiment described here.
- a USB microphone 730 is connected to the USB port.
- the USB microphone 730 may be a hand-held microphone or may form part of a head-set that is worn by the human operator.
- the advantage of wearing a head-set is that the human operator's hands are free to perform other actions.
- the microphone includes an analogue-to-digital converter (ADC) and a basic hardware-based real-time data compression and encoding arrangement, so that audio data are transmitted by the microphone 730 to the USB port 715 in an appropriate format, such as a streaming compressed audio format for decoding at the PlayStation 2 system unit 10 .
- two other ports 705 , 710 are proprietary sockets allowing the connection of a proprietary non-volatile RAM memory card 720 for storing game-related information, a hand-held game controller 725 or a device (not shown) mimicking a hand-held controller, such as a dance mat.
- the system unit 10 may be connected to a network adapter 805 that provides an interface (such as an Ethernet interface) to a network.
- This network may be, for example, a LAN, a WAN or the Internet.
- the network may be a general network or one that is dedicated to game related communication.
- the network adapter 805 allows data to be transmitted to and received from other system units 10 that are connected to the same network, (the other system units 10 also having corresponding network adapters 805 ).
- the Emotion Engine 100 is a 128-bit Central Processing Unit (CPU) that has been specifically designed for efficient simulation of 3 dimensional (3D) graphics for games applications.
- the Emotion Engine components include a data bus, cache memory (part of its CPU core) and registers, all of which are 128-bit. This facilitates fast processing of large volumes of multi-media data.
- Conventional PCs, by way of comparison, have a basic 64-bit data structure.
- the floating point calculation performance of the PlayStation2 is 6.2 GFLOPs.
- the Emotion Engine also comprises MPEG2 decoder circuitry which allows for simultaneous processing of 3D graphics data and DVD data.
- the Emotion Engine performs geometrical calculations including mathematical transforms and translations and also performs calculations associated with the physics of simulation objects, for example, calculation of friction between two objects.
- the image rendering commands are output in the form of display lists.
- a display list is a sequence of drawing commands that specifies to the Graphics Synthesiser which primitive graphic objects (e.g. points, lines, triangles, sprites) to draw on the screen and at which co-ordinates.
- a typical display list will comprise commands to draw vertices, commands to shade the faces of polygons, render bitmaps and so on.
- the Emotion Engine 100 can asynchronously generate multiple display lists.
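A display list as described above can be modelled as an ordered sequence of drawing commands, each naming a primitive and its screen co-ordinates. This is a rough sketch under stated assumptions: the `DrawCommand` type, the `make_command` helper and the vertex count per primitive (in particular two corner points for a sprite) are hypothetical and do not represent the actual command format consumed by the Graphics Synthesiser.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed number of co-ordinate pairs per primitive (illustrative only).
VERTEX_COUNT = {"point": 1, "line": 2, "triangle": 3, "sprite": 2}


@dataclass(frozen=True)
class DrawCommand:
    primitive: str                         # one of the keys of VERTEX_COUNT
    vertices: Tuple[Tuple[int, int], ...]  # screen co-ordinates (x, y)


def make_command(primitive, vertices):
    """Validate and build one drawing command."""
    if primitive not in VERTEX_COUNT:
        raise ValueError(f"unknown primitive: {primitive}")
    if len(vertices) != VERTEX_COUNT[primitive]:
        raise ValueError(f"{primitive} needs {VERTEX_COUNT[primitive]} vertices")
    return DrawCommand(primitive, tuple(vertices))


# A display list is simply the commands in the order they should be drawn.
display_list = [
    make_command("triangle", [(10, 10), (50, 10), (30, 40)]),
    make_command("line", [(0, 0), (100, 100)]),
    make_command("point", [(5, 5)]),
]
```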
- the Graphics Synthesiser 200 is a video accelerator that performs rendering of the display lists produced by the Emotion Engine 100 .
- the Graphics Synthesiser 200 includes a graphics interface unit (GIF) which handles, tracks and manages the multiple display lists.
- the rendering function of the Graphics Synthesiser 200 can generate image data that supports several alternative standard output image formats, i.e., NTSC/PAL, High Definition TV and VESA.
- the rendering capability of graphics systems is defined by the memory bandwidth between a pixel engine and a video memory, each of which is located within the graphics processor.
- Conventional graphics systems use external Video Random Access Memory (VRAM) connected to the pixel logic via an off-chip bus which tends to restrict available bandwidth.
- the Graphics Synthesiser 200 of the PlayStation2 provides the pixel logic and the video memory on a single high-performance chip which allows for a comparatively large 38.4 Gigabyte per second memory access bandwidth.
- the Graphics Synthesiser is theoretically capable of achieving a peak drawing capacity of 75 million polygons per second. Even with a full range of effects such as textures, lighting and transparency, a sustained rate of 20 million polygons per second can be drawn continuously. Accordingly, the Graphics Synthesiser 200 is capable of rendering a film-quality image.
- the Sound Processor Unit (SPU) 300 is effectively the soundcard of the system which is capable of handling 3D digital sound such as Digital Theater Systems (DTS®) sound and AC-3 (also known as Dolby Digital) which is the sound format used for DVDs.
- a display and sound output device 305 such as a video monitor or television set with an associated loudspeaker arrangement 310 , is connected to receive video and audio signals from the graphics synthesiser 200 and the sound processing unit 300 .
- the main memory supporting the Emotion Engine 100 is the RDRAM (Rambus Dynamic Random Access Memory) module 500 licensed by Rambus Incorporated.
- This RDRAM memory subsystem comprises RAM, a RAM controller and a bus connecting the RAM to the Emotion Engine 100 .
- FIG. 2 schematically illustrates the architecture of the Emotion Engine 100 of FIG. 1 .
- the Emotion Engine 100 is a collective term for a number of processing units interconnected to give a desired set of functionality.
- the Emotion Engine comprises: a floating point unit (FPU) 104; a central processing unit (CPU) core 102; vector unit zero (VU0) 106; vector unit one (VU1) 108; a graphics interface unit (GIF) 110; an interrupt controller (INTC) 112; a timer unit 114; a direct memory access controller (DMAC) 116; an image data processor unit (IPU) 118; a dynamic random access memory controller (DRAMC) 120; and a sub-bus interface (SIF) 122. All of these individual processing units are connected via a 128-bit main bus 124.
- the CPU core 102 is a 128-bit processor clocked at 300 MHz (in fact 294.912 MHz, but 300 MHz tends to be used as shorthand for this figure).
- the CPU core has access to 32 MB of main memory via the DRAMC 120 .
- the CPU core 102 instruction set is based on MIPS III RISC with some MIPS IV RISC instructions together with additional multimedia instructions.
- MIPS III and IV are Reduced Instruction Set Computer (RISC) instruction set architectures proprietary to MIPS Technologies, Inc.
- Standard instructions are 64-bit, two-way superscalar, which means that two instructions can be executed simultaneously.
- Multimedia instructions use 128-bit instructions via two pipelines.
- the CPU core 102 comprises a 16 KB instruction cache, an 8 KB data cache and a 16 KB scratchpad RAM which is a portion of cache connected by a dedicated bus to the CPU, allowing data access independent of the main bus.
- the FPU 104 serves as a first co-processor for the CPU core 102 .
- vector unit zero 106 acts as a second co-processor.
- the FPU 104 comprises a floating point division calculator (FDIV).
- the vector units 106 and 108 perform mathematical operations and are essentially specialised FPUs that are extremely fast at evaluating the multiplication and addition of vector equations. They use Floating-Point Multiply-Adder Calculators (FMACs) for addition and multiplication operations and Floating-Point Dividers (FDIVs) for division and square root operations.
- the FMACs operate on 32-bit values so when an operation is carried out on a 128-bit value (composed of four 32-bit values) an operation can be carried out on all four parts concurrently.
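The four-lane behaviour described above can be illustrated with a multiply-add applied to each of the four parts of a 128-bit value. This is a sketch only: `fmac4` is a hypothetical name, Python floats are 64-bit rather than 32-bit, and the lanes are evaluated one after another here, whereas the text describes them being processed concurrently in hardware.

```python
def fmac4(acc, a, b):
    """Four-lane multiply-add: returns acc[i] + a[i] * b[i] for each lane.

    Each argument stands for one 128-bit value viewed as four 32-bit parts.
    """
    if not (len(acc) == len(a) == len(b) == 4):
        raise ValueError("each operand must have exactly four lanes")
    return tuple(acc_i + a_i * b_i for acc_i, a_i, b_i in zip(acc, a, b))


# Example: scale a 4-component vector by 2 and add it to an accumulator.
result = fmac4((0.0, 0.0, 0.0, 0.0), (1.0, 2.0, 3.0, 4.0), (2.0, 2.0, 2.0, 2.0))
```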
- VUs have built-in memory for storing micro-programs and interface with the rest of the system via Vector Interface Units (VIFs) referred to by the same number as the corresponding Vector Unit.
- Vector unit zero 106 can work as a coprocessor to the CPU core 102 via a dedicated 128-bit bus so it is essentially a second specialised FPU.
- Vector unit one 108 has a dedicated bus to the Graphics synthesiser 200 and thus can be considered as a completely separate processor.
- the inclusion of two vector units allows the software developer to split up the work between different parts of the CPU and the vector units can be used in either serial or parallel connection.
- Vector unit zero 106 comprises 4 FMACs and 1 FDIV. It is connected to the CPU core 102 via a coprocessor connection. It has 4 KB of vector unit memory for data and 4 KB of micro-memory for instructions. Vector unit zero 106 is useful for performing physics calculations associated with the images for display. It primarily executes non-patterned geometric processing together with the CPU core 102 .
- Vector unit one 108 comprises 5 FMACs and 2 FDIVs. It has no direct path to the CPU core 102 , although it does have a direct path to the GIF unit 110 . It has 16 KB of vector unit memory for data and 16 KB of micro-memory for instructions. Vector unit one 108 is useful for performing transformations. It primarily executes patterned geometric processing and directly outputs a generated display list to the GIF 110 .
- the GIF 110 is an interface unit to the Graphics Synthesiser 200 . It converts data according to a tag specification at the beginning of a display list packet and transfers drawing commands to the Graphics Synthesiser 200 whilst mutually arbitrating multiple transfers.
- the interrupt controller (INTC) 112 serves to arbitrate interrupts from peripheral devices, except the DMAC 116 .
- the timer unit 114 comprises four independent timers with 16-bit counters. The timers are driven either by the bus clock (at 1/16 or 1/256 intervals) or via an external clock.
- the DMAC 116 handles data transfers between main memory and scratchpad RAM, or between main memory or scratchpad RAM and peripherals. It arbitrates the main bus 124 at the same time. Performance optimisation of the DMAC 116 is a key way by which to improve Emotion Engine performance.
- the image processing unit (IPU) 118 is an image data processor that is used to expand compressed animations and texture images. It performs macro-block decoding, colour space conversion and vector quantisation.
- the sub-bus interface (SIF) 122 is an interface unit to the IOP 700 .
- the IOP has its own memory and bus to control I/O devices such as sound chips and storage devices.
- FIG. 3 schematically illustrates the configuration of the Graphics Synthesiser 200 .
- the Graphics Synthesiser comprises: a host interface 202 ; a pixel pipeline 206 ; a memory interface 208 ; a local memory 212 including a frame page buffer 214 and a texture page buffer 216 ; and a video converter 210 .
- the host interface 202 transfers data with the host (in this case the GIF 110 ). Both drawing data and buffer data from the host pass through this interface.
- the output from the host interface 202 is supplied to the graphics synthesiser 200 which develops the graphics to draw pixels based on vertex information received from the Emotion Engine 100 , and calculates information such as RGBA value, depth value (i.e. Z-value), texture value and fog value for each pixel.
- the RGBA value specifies the red, green, blue (RGB) colour components and the A (Alpha) component represents opacity of an image object.
- the Alpha value can range from completely transparent to totally opaque.
- the pixel data is supplied to the pixel pipeline 206 which performs processes such as texture mapping, fogging and Alpha-blending and determines the final drawing colour based on the calculated pixel information.
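Alpha-blending as mentioned above combines a new pixel colour with the colour already in the frame buffer according to the Alpha (opacity) value. The sketch below uses the common "source over destination" formula; this formula, the 0.0 to 1.0 alpha range and the name `alpha_blend` are assumptions chosen for illustration, and the text does not state the exact blending equation used by the pixel pipeline.

```python
def alpha_blend(src_rgb, dst_rgb, alpha):
    """Blend a source colour over a destination colour.

    alpha = 0.0 means the source is completely transparent,
    alpha = 1.0 means the source is totally opaque.
    Colour components are integers in the range 0..255.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0.0 and 1.0")
    return tuple(
        round(s * alpha + d * (1.0 - alpha))
        for s, d in zip(src_rgb, dst_rgb)
    )


opaque = alpha_blend((255, 0, 0), (0, 0, 255), 1.0)       # source wins
transparent = alpha_blend((255, 0, 0), (0, 0, 255), 0.0)  # destination kept
half = alpha_blend((200, 100, 0), (0, 100, 200), 0.5)     # even mix
```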
- the pixel pipeline 206 comprises 16 pixel engines PE 1 , PE 2 , . . . , PE 16 so that it can process a maximum of 16 pixels concurrently.
- the pixel pipeline 206 runs at 150 MHz (in fact 294.912/2 MHz) with 32-bit colour and a 32-bit Z-buffer.
- the memory interface 208 reads data from and writes data to the local Graphics Synthesiser memory 212 . It writes the drawing pixel values (RGBA and Z) to memory at the end of a pixel operation and reads the pixel values of the frame buffer 214 from memory. These pixel values read from the frame buffer 214 are used for pixel test or Alpha-blending.
- the memory interface 208 also reads from local memory 212 the RGBA values for the current contents of the frame buffer.
- the local memory 212 is a 32 Mbit (4 MB) memory that is built-in to the Graphics Synthesiser 200 . It can be organised as a frame buffer 214 , texture buffer 216 and a Z-buffer 215 .
- the frame buffer 214 is the portion of video memory where pixel data such as colour information is stored.
- the Graphics Synthesiser uses a 2D to 3D texture mapping process to add visual detail to 3D geometry. Each texture may be wrapped around a 3D image object and is stretched and skewed to give a 3D graphical effect.
- the texture buffer is used to store the texture information for image objects.
- the Z-buffer 215 (also known as the depth buffer) holds a depth value for each pixel.
- Images are constructed from basic building blocks known as graphics primitives or polygons. When a polygon is rendered with Z-buffering, the depth value of each of its pixels is compared with the corresponding value stored in the Z-buffer.
- If the value stored in the Z-buffer is greater than or equal to the depth of the new pixel value then this pixel is determined visible so that it should be rendered and the Z-buffer will be updated with the new pixel depth. If however the Z-buffer depth value is less than the new pixel depth value the new pixel value is behind what has already been drawn and will not be rendered.
- Alternative Z-buffer tests are available, so that (a) the new pixel always replaces the previous value, or (b) the new pixel replaces the previous pixel value if its depth is greater than or equal to the previous value stored in the Z buffer.
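The depth comparison and the two alternative tests described above can be sketched as follows. The mode names (`"default"`, `"always"`, `"gequal"`) and the function names are hypothetical labels for this illustration; only the comparison rules themselves are taken from the text.

```python
def depth_test(stored_z, new_z, mode="default"):
    """Decide whether a new pixel should be drawn.

    "default": draw if the stored depth is greater than or equal to the new depth.
    "always":  the new pixel always replaces the previous value.
    "gequal":  draw if the new depth is greater than or equal to the stored depth.
    """
    if mode == "default":
        return stored_z >= new_z
    if mode == "always":
        return True
    if mode == "gequal":
        return new_z >= stored_z
    raise ValueError(f"unknown depth test mode: {mode}")


def plot(frame, zbuf, x, y, colour, new_z, mode="default"):
    """Write the pixel and update the Z-buffer only if the depth test passes."""
    if depth_test(zbuf[y][x], new_z, mode):
        frame[y][x] = colour
        zbuf[y][x] = new_z
        return True
    return False


# A 1x1 frame buffer and Z-buffer, initialised to a far depth of 100.
frame = [[None]]
zbuf = [[100]]
drawn_near = plot(frame, zbuf, 0, 0, "red", 50)    # nearer than 100, so drawn
drawn_far = plot(frame, zbuf, 0, 0, "blue", 80)    # behind depth 50, so rejected
```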
- the local memory 212 has a 1024-bit read port and a 1024-bit write port for accessing the frame buffer and Z-buffer and a 512-bit port for texture reading.
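As a worked check, the 38.4 Gigabyte per second figure quoted earlier appears consistent with the two 1024-bit frame buffer ports if each port is assumed to transfer its full width once per cycle of a nominal 150 MHz clock. That per-cycle assumption, the use of the nominal rather than exact clock, the decimal definition of a gigabyte and the function name below are assumptions made for this sketch and are not stated in the text.

```python
NOMINAL_CLOCK_HZ = 150_000_000  # nominal 150 MHz pixel pipeline clock


def port_bandwidth_bytes_per_s(width_bits, clock_hz=NOMINAL_CLOCK_HZ):
    """Bandwidth of one port, assuming one full-width transfer per clock cycle."""
    return (width_bits // 8) * clock_hz


frame_read = port_bandwidth_bytes_per_s(1024)    # 1024-bit read port
frame_write = port_bandwidth_bytes_per_s(1024)   # 1024-bit write port
texture_read = port_bandwidth_bytes_per_s(512)   # 512-bit texture port

frame_total_gb = (frame_read + frame_write) / 1e9   # read + write, in GB/s
texture_gb = texture_read / 1e9                     # texture port, in GB/s
```

Under these assumptions the read and write ports together account for 38.4 GB/s, with the texture port contributing a further 9.6 GB/s.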
- the video converter 210 is operable to display the contents of the frame memory in a specified output format.
- the processing units shown in FIGS. 1 to 3 will be referred to as “emulated” processing units, whereas the processing units of the emulating system to be described below will be referred to as “emulating” processing units.
- both categories of processing units represent physical processing units capable of running native software appropriate to those processing units.
- FIG. 4 schematically illustrates the overall system architecture of the Sony® Playstation 3® entertainment device.
- a system unit 910 is provided, with various peripheral devices connectable to the system unit.
- the system unit 910 comprises: a Cell processor 1100 ; a Rambus® dynamic random access memory (XDRAM) unit 1500 ; a Reality Synthesiser graphics unit 1200 with a dedicated video random access memory (VRAM) unit 1250 ; and an I/O bridge 1700 .
- the system unit 910 also comprises a Blu Ray® Disk BD-ROM® optical disk reader 1430 for reading a disk 1440 and a removable slot-in hard disk drive (HDD) 1400 , accessible through the I/O bridge 1700 .
- the system unit also comprises a memory card reader 1450 for reading compact flash memory cards, Memory Stick® memory cards and the like, which is similarly accessible through the I/O bridge 1700 .
- the I/O bridge 1700 also connects to six Universal Serial Bus (USB) 2.0 ports 1710; a gigabit Ethernet port 1720; an IEEE 802.11b/g wireless network (Wi-Fi) port 1730; and a Bluetooth® wireless link port 1740 capable of supporting up to seven Bluetooth connections.
- the I/O bridge 1700 handles all wireless, USB and Ethernet data, including data from one or more game controllers 1751 .
- the I/O bridge 1700 receives data from the game controller 1751 via a Bluetooth link and directs it to the Cell processor 1100 , which updates the current state of the game accordingly.
- the wireless, USB and Ethernet ports also provide connectivity for other peripheral devices in addition to game controllers 1751 , such as: a remote control 1752 ; a keyboard 1753 ; a mouse 1754 ; a portable entertainment device 1755 such as a Sony Playstation Portable® entertainment device; a video camera such as an EyeToy® video camera 1756 ; and a microphone headset 1757 .
- peripheral devices may therefore in principle be connected to the system unit 910 wirelessly; for example the portable entertainment device 1755 may communicate via a Wi-Fi ad-hoc connection, whilst the microphone headset 1757 may communicate via a Bluetooth link.
- the Playstation 3 device is also potentially compatible with other peripheral devices such as digital video recorders (DVRs), set-top boxes, digital cameras, portable media players, Voice over IP telephones, mobile telephones, printers and scanners.
- a legacy memory card reader 1410 may be connected to the system unit via a USB port 1710 , enabling the reading of memory cards 1420 of the kind used by the Playstation® or Playstation 2® devices.
- the game controller 1751 is operable to communicate wirelessly with the system unit 910 via the Bluetooth link.
- the game controller 1751 can instead be connected to a USB port, thereby also providing power by which to charge the battery of the game controller 1751 .
- the game controller is sensitive to motion in 6 degrees of freedom, corresponding to translation and rotation in each axis. Consequently gestures and movements by the user of the game controller may be translated as inputs to a game in addition to or instead of conventional button or joystick commands.
- other wirelessly enabled peripheral devices such as the Playstation Portable device may be used as a controller.
- additional game or control information may be provided on the screen of the device.
- Other alternative or supplementary control devices may also be used, such as a dance mat (not shown), a light gun (not shown), a steering wheel and pedals (not shown) or bespoke controllers, such as a single or several large buttons for a rapid-response quiz game (also not shown).
- the remote control 1752 is also operable to communicate wirelessly with the system unit 910 via a Bluetooth link.
- the remote control 1752 comprises controls suitable for the operation of the Blu Ray Disk BD-ROM reader 1430 and for the navigation of disk content.
- the Blu Ray Disk BD-ROM reader 1430 is operable to read CD-ROMs compatible with the Playstation and PlayStation 2 devices, in addition to conventional pre-recorded and recordable CDs, and so-called Super Audio CDs.
- the reader 1430 is also operable to read DVD-ROMs compatible with the Playstation 2 and PlayStation 3 devices, in addition to conventional pre-recorded and recordable DVDs.
- the reader 1430 is further operable to read BD-ROMs compatible with the Playstation 3 device, as well as conventional pre-recorded and recordable Blu-Ray Disks.
- the system unit 910 is operable to supply audio and video, either generated or decoded by the Playstation 3 device via the Reality Synthesiser graphics unit 1200 , through audio and video connectors to a display and sound output device 1300 such as a monitor or television set, having a display screen 1305 and one or more loudspeakers 1310 .
- the audio connectors 1210 may include conventional analogue and digital outputs whilst the video connectors 1220 may variously include component video, S-video, composite video and one or more High Definition Multimedia Interface (HDMI) outputs. Consequently, video output may be in formats such as PAL or NTSC, or in 720p, 1080i or 1080p high definition.
- Audio processing (generation, decoding and so on) is performed by the Cell processor 1100 .
- the Playstation 3 device's operating system supports Dolby® 5.1 surround sound, DTS® surround sound, and the decoding of 7.1 surround sound from Blu-Ray® disks.
- the video camera 1756 comprises a single charge coupled device (CCD), an LED indicator, and hardware-based real-time data compression and encoding apparatus so that compressed video data may be transmitted in an appropriate format such as an intra-image based MPEG (motion picture expert group) standard for decoding by the system unit 910 .
- the camera LED indicator is arranged to illuminate in response to appropriate control data from the system unit 910 , for example to signify adverse lighting conditions.
- Embodiments of the video camera 1756 may variously connect to the system unit 910 via a USB, Bluetooth or Wi-Fi communication port.
- Embodiments of the video camera may include an associated microphone and also be capable of transmitting audio data.
- the CCD may have a resolution suitable for high-definition video capture. In use, images captured by the video camera may for example be incorporated within a game or interpreted as game control inputs.
- an appropriate piece of software such as a device driver should be provided.
- Device driver technology is well-known and will not be described in detail here, except to say that the skilled man will be aware that a device driver or similar software interface may be required in the present embodiment described.
- the Cell processor 1100 has an architecture comprising four basic components: external input and output structures comprising a memory controller 1160 and a dual bus interface controller 1170 A,B; a main processor referred to as the Power Processing Element (PPE) 1150 ; eight co-processors referred to as Synergistic Processing Elements (SPEs) 1110 A-H; and a circular data bus connecting the above components referred to as the Element Interconnect Bus 1180 .
- the total floating point performance of the Cell processor is 218 GFLOPS, compared with the 6.2 GFLOPs of the Playstation 2 device's Emotion Engine.
- the Power Processing Element (PPE) 1150 is based upon a two-way simultaneous multithreading Power 970 compliant PowerPC core (PPU) 1155 running with an internal clock of 3.2 GHz. It comprises a 512 kB level 2 (L2) cache and a 32 kB level 1 (L1) cache.
- the PPE 1150 is capable of eight single precision operations per clock cycle, translating to 25.6 GFLOPs at 3.2 GHz.
- the primary role of the PPE 1150 is to act as a controller for the Synergistic Processing Elements 1110 A-H, which handle most of the computational workload. In operation the PPE 1150 maintains a job queue, scheduling jobs for the Synergistic Processing Elements 1110 A-H and monitoring their progress. Consequently each Synergistic Processing Element 1110 A-H runs a kernel whose role is to fetch a job, execute it and synchronise with the PPE 1150 .
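- The job-queue arrangement described above can be sketched as follows. This is a minimal illustration only: Python threads stand in for SPEs and a `queue.Queue` stands in for the PPE's job queue. The names `spe_kernel` and `run_jobs`, and the default of six workers, are assumptions made for the example and are not taken from the source.

```python
import queue
import threading

def spe_kernel(spe_id, jobs, results):
    """Hypothetical SPE kernel loop: fetch a job, execute it, report back."""
    while True:
        job = jobs.get()                  # fetch the next job from the queue
        if job is None:                   # sentinel: no more work for this worker
            jobs.task_done()
            break
        job_id, func, arg = job
        results.put((job_id, spe_id, func(arg)))   # execute, then report the result
        jobs.task_done()

def run_jobs(work, n_spes=6):
    """Hypothetical PPE-side scheduler: queue the jobs and wait for completion."""
    jobs, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=spe_kernel, args=(i, jobs, results))
               for i in range(n_spes)]
    for w in workers:
        w.start()
    for job_id, (func, arg) in enumerate(work):
        jobs.put((job_id, func, arg))
    for _ in workers:
        jobs.put(None)                    # one stop sentinel per worker
    jobs.join()                           # block until every queued item is done
    for w in workers:
        w.join()
    out = {}
    while not results.empty():
        job_id, _spe, value = results.get()
        out[job_id] = value
    return [out[i] for i in range(len(work))]

squares = run_jobs([(lambda x: x * x, n) for n in range(8)])
```

Results are re-ordered by job identifier because the workers may finish in any order.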
- Each Synergistic Processing Element (SPE) 1110 A-H comprises a respective Synergistic Processing Unit (SPU′—to distinguish it from the Sound Processing Unit mentioned above) 1120 A-H, and a respective Memory Flow Controller (MFC) 1140 A-H comprising in turn a respective Dynamic Memory Access Controller (DMAC) 1142 A-H, a respective Memory Management Unit (MMU) 1144 A-H and a bus interface (not shown).
- Each SPU′ 1120 A-H is a RISC processor clocked at 3.2 GHz and comprising 256 kB local RAM 1130 A-H, expandable in principle to 4 GB.
- Each SPE gives a theoretical 25.6 GFLOPS of single precision performance.
- An SPU′ can operate on 4 single precision floating point numbers, 4 32-bit numbers, 8 16-bit integers, or 16 8-bit integers in a single clock cycle. In the same clock cycle it can also perform a memory operation.
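- The element counts quoted above are consistent with a single 128-bit operand divided into equal lanes. The short check below assumes that width; it is inferred from the figures (4 x 32 = 8 x 16 = 16 x 8 = 128) and is not stated in this passage.

```python
REGISTER_BITS = 128   # assumed operand width, inferred from the lane counts

def lanes(element_bits, register_bits=REGISTER_BITS):
    """Number of elements of a given width that fit in one operand."""
    return register_bits // element_bits

# Lane count for 32-bit, 16-bit and 8-bit elements.
lane_counts = {bits: lanes(bits) for bits in (32, 16, 8)}
```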
- the SPU′ 1120 A-H does not directly access the system memory XDRAM 1500 ; the 64-bit addresses formed by the SPU′ 1120 A-H are passed to the MFC 1140 A-H which instructs its DMA controller 1142 A-H to access memory via the Element Interconnect Bus 1180 and the memory controller 1160 .
- the Element Interconnect Bus (EIB) 1180 is a logically circular communication bus internal to the Cell processor 1100 which connects the above processor elements, namely the PPE 1150 , the memory controller 1160 , the dual bus interface 1170 A,B and the 8 SPEs 1110 A-H, totalling 12 participants. Participants can simultaneously read and write to the bus at a rate of 8 bytes per clock cycle. As noted previously, each SPE 1110 A-H comprises a DMAC 1142 A-H for scheduling longer read or write sequences.
- the EIB comprises four channels, two each in clockwise and anti-clockwise directions. Consequently for twelve participants, the longest step-wise data-flow between any two participants is six steps in the appropriate direction.
- the theoretical peak instantaneous EIB bandwidth for 12 slots is therefore 96 B (bytes) per clock, in the event of full utilisation through arbitration between participants. This equates to a theoretical peak bandwidth of 307.2 GB/s (gigabytes per second) at a clock rate of 3.2 GHz.
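- The EIB figures above can be reproduced with a few lines of arithmetic. The sketch below assumes the twelve participants sit at evenly spaced positions on the ring and that data may travel in whichever direction is shorter; the helper name `ring_steps` is hypothetical.

```python
PARTICIPANTS = 12           # PPE, memory controller, two bus interfaces, eight SPEs
BYTES_PER_CLOCK = 8         # per participant, as stated in the text
CLOCK_HZ = 3_200_000_000    # 3.2 GHz

def ring_steps(src, dst, n=PARTICIPANTS):
    """Fewest hops between two ring positions, going clockwise or anti-clockwise."""
    d = (dst - src) % n
    return min(d, n - d)

peak_bytes_per_clock = PARTICIPANTS * BYTES_PER_CLOCK       # 96 B per clock
peak_gb_per_s = peak_bytes_per_clock * CLOCK_HZ / 1e9       # about 307.2 GB/s
longest_path = max(ring_steps(0, k) for k in range(PARTICIPANTS))
```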
- the memory controller 1160 comprises an XDRAM interface 1162 , developed by Rambus Incorporated.
- the memory controller interfaces with the Rambus XDRAM 1500 with a theoretical peak bandwidth of 25.6 GB/s.
- the dual bus interface 1170 A,B comprises a Rambus FlexIO® system interface 1172 A,B.
- the interface is organised into 12 channels each being 8 bits wide, with five paths being inbound and seven outbound. This provides a theoretical peak bandwidth of 62.4 GB/s (36.4 GB/s outbound, 26 GB/s inbound) between the Cell processor and the I/O Bridge 1700 via the controller 1170 A and the Reality Synthesiser graphics unit 1200 via controller 1170 B.
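- The split between outbound and inbound bandwidth follows directly if the peak figure is shared equally among the 12 channels. That equal share is an assumption made here for illustration; the text gives only the totals. Working in MB/s keeps the arithmetic in whole numbers.

```python
TOTAL_MB_PER_S = 62_400     # 62.4 GB/s expressed in MB/s
CHANNELS = 12
OUTBOUND, INBOUND = 7, 5    # channel counts from the text

per_channel = TOTAL_MB_PER_S // CHANNELS        # 5200 MB/s per channel (assumed equal share)
outbound_gb = OUTBOUND * per_channel / 1000     # GB/s leaving the Cell processor
inbound_gb = INBOUND * per_channel / 1000       # GB/s arriving at the Cell processor
```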
- Data sent by the Cell processor 1100 to the Reality Synthesiser graphics unit 1200 will typically comprise display lists, being a sequence of commands to draw vertices, apply textures to polygons, specify lighting conditions, and so on.
- the Reality Synthesiser graphics (RSX) unit 1200 is a video accelerator based upon the NVidia® G70/71 architecture that processes and renders lists of commands produced by the Cell processor 1100 .
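- A display list of the kind described above can be pictured as an ordered sequence of small command records that the graphics unit replays. The sketch below is purely illustrative: the opcode names (`light`, `texture`, `vertex`) and the `Command` type are hypothetical and do not reflect the actual command format used by the hardware.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Command:
    op: str        # hypothetical opcode name, e.g. "vertex"
    args: tuple    # operands for the command

def build_triangle(texture_id):
    """Build a display list for one lit, textured triangle."""
    return [
        Command("light", (0.0, 0.0, 1.0)),
        Command("texture", (texture_id,)),
        Command("vertex", (0.0, 0.0, 0.0)),
        Command("vertex", (1.0, 0.0, 0.0)),
        Command("vertex", (0.0, 1.0, 0.0)),
    ]

def replay(display_list):
    """Consume the commands in order, tracking state and collecting vertices."""
    state = {"light": None, "texture": None, "vertices": []}
    for cmd in display_list:
        if cmd.op == "vertex":
            state["vertices"].append(cmd.args)
        else:
            state[cmd.op] = cmd.args      # latest light/texture setting wins
    return state

state = replay(build_triangle(texture_id=7))
```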
- the RSX unit 1200 comprises a host interface 1202 operable to communicate with the bus interface controller 1170 B of the Cell processor 1100 ; a vertex pipeline 1204 (VP) comprising eight vertex shaders 1205 ; a pixel pipeline 1206 (PP) comprising 24 pixel shaders 1207 ; a render pipeline 1208 (RP) comprising eight render output units (ROPs) 1209 ; a memory interface 1210 ; and a video converter 1212 for generating a video output.
- the RSX 1200 is complemented by 256 MB double data rate (DDR) video RAM (VRAM) 1250 , clocked at 600 MHz and operable to interface with the RSX 1200 at a theoretical peak bandwidth of 25.6 GB/s.
- VRAM 1250 maintains a frame buffer 1214 and a texture buffer 1216 .
- the texture buffer 1216 provides textures to the pixel shaders 1207 , whilst the frame buffer 1214 stores results of the processing pipelines.
- the RSX can also access the main memory 1500 via the EIB 1180 , for example to load textures into the VRAM 1250 .
- the vertex pipeline 1204 primarily processes deformations and transformations of vertices defining polygons within the image to be rendered.
- the pixel pipeline 1206 primarily processes the application of colour, textures and lighting to these polygons, including any pixel transparency, generating red, green, blue and alpha (transparency) values for each processed pixel.
- Texture mapping may simply apply a graphic image to a surface, or may include bump-mapping (in which the notional direction of a surface is perturbed in accordance with texture values to create highlights and shade in the lighting model) or displacement mapping (in which the applied texture additionally perturbs vertex positions to generate a deformed surface consistent with the texture).
- the render pipeline 1208 performs depth comparisons between pixels to determine which should be rendered in the final image.
- the render pipeline and vertex pipeline 1204 can communicate depth information between them, thereby enabling the removal of occluded elements prior to pixel processing, and so improving overall rendering efficiency.
- the render pipeline 1208 also applies subsequent effects such as full-screen anti-aliasing over the resulting image.
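- The depth comparison performed by the render pipeline can be illustrated with a simple depth-buffer test. This is a generic sketch, not the RSX implementation; it assumes that a smaller depth value means nearer to the viewer, and the function name `depth_test` is hypothetical.

```python
def depth_test(fragments, width, height, far=float("inf")):
    """Keep, for each pixel, the colour of the fragment nearest the viewer.

    fragments: iterable of (x, y, depth, colour) tuples; smaller depth is
    nearer (an assumed convention).
    """
    depth = [[far] * width for _ in range(height)]
    colour = [[None] * width for _ in range(height)]
    for x, y, z, c in fragments:
        if z < depth[y][x]:           # nearer than anything seen so far at this pixel
            depth[y][x] = z
            colour[y][x] = c
    return colour

image = depth_test(
    [(0, 0, 5.0, "red"), (0, 0, 2.0, "blue"), (1, 0, 9.0, "green")],
    width=2, height=1,
)
```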
- Both the vertex shaders 1205 and pixel shaders 1207 are based on the shader model 3.0 standard. Up to 136 shader operations can be performed per clock cycle, with the combined pipeline therefore capable of 74.8 billion shader operations per second, outputting up to 840 million vertices and 10 billion pixels per second.
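- The two shader throughput figures above imply a clock rate for the combined pipeline, although the text does not state one. The check below simply divides one figure by the other; the resulting value is an inference from the quoted numbers, not a figure given in the source.

```python
OPS_PER_CLOCK = 136                 # shader operations per clock cycle
OPS_PER_SECOND = 74_800_000_000     # 74.8 billion shader operations per second

# Clock rate implied by the two figures, in Hz.
implied_clock_hz = OPS_PER_SECOND // OPS_PER_CLOCK
```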
- the total floating point performance of the RSX 1200 is 1.8 TFLOPS.
- the RSX 1200 operates in close collaboration with the Cell processor 1100 ; for example, when displaying an explosion, or weather effects such as rain or snow, a large number of particles must be tracked, updated and rendered within the scene.
- the PPU 1155 of the Cell processor may schedule one or more SPEs 1110 A-H to compute the trajectories of respective batches of particles.
- the RSX 1200 accesses any texture data (e.g. snowflakes) not currently held in the video RAM 1250 from the main system memory 1500 via the element interconnect bus 1180 , the memory controller 1160 and a bus interface controller 1170 B.
- the or each SPE 1110 A-H outputs its computed particle properties (typically coordinates and normals, indicating position and attitude) directly to the video RAM 1250 ; the DMA controller 1142 A-H of the or each SPE 1110 A-H addresses the video RAM 1250 via the bus interface controller 1170 B.
- the assigned SPEs become part of the video processing pipeline for the duration of the task.
- the PPU 1155 can assign tasks in this fashion to six of the eight SPEs available; one SPE is reserved for the operating system, whilst one SPE is optionally disabled.
- the disabling of one SPE provides a greater level of tolerance during fabrication of the Cell processor, as it allows for one SPE to fail the fabrication process.
- the eighth SPE provides scope for redundancy in the event of subsequent failure by one of the other SPEs during the life of the Cell processor.
- the PPU 1155 can assign tasks to SPEs in several ways. For example, SPEs may be chained together to handle each step in a complex operation, such as accessing a DVD, video and audio decoding, and error masking, with each step being assigned to a separate SPE. Alternatively or in addition, two or more SPEs may be assigned to operate on input data in parallel, as in the particle animation example above.
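- The two assignment patterns just described, chaining and parallel operation, can be contrasted in a short sketch. It is illustrative only: the chain runs its stages one after another in a single thread as a stand-in for stages that would each occupy a separate SPE, worker threads stand in for SPEs working in parallel, and the stage functions are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def chain(stages, item):
    """Pass one item through each stage in turn (stand-in for one stage per SPE)."""
    for stage in stages:
        item = stage(item)
    return item

def parallel(func, batches, n_workers=2):
    """Apply the same function to several batches concurrently (one batch per worker)."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(func, batches))

# Hypothetical placeholders for "read", "decode" and "error masking" steps.
stages = [lambda raw: raw.strip(), lambda s: s.upper(), lambda s: s.replace("?", "_")]
decoded = chain(stages, "  fr?me ")

def step(batch, dt=0.5):
    """Advance each (position, velocity) particle by one time step."""
    return [(p + v * dt, v) for p, v in batch]

moved = parallel(step, [[(0.0, 2.0)], [(1.0, -2.0)]])
```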
- Software instructions implemented by the Cell processor 1100 and/or the RSX 1200 may be supplied at manufacture and stored on the HDD 1400 , and/or may be supplied on a data carrier or storage medium such as an optical disk or solid state memory, or via a transmission medium such as a wired or wireless network or internet connection, or via combinations of these.
- the software supplied at manufacture comprises system firmware and the Playstation 3 device's operating system (OS).
- the OS provides a user interface enabling a user to select from a variety of functions, including playing a game, listening to music, viewing photographs, or viewing a video.
- the interface takes the form of a so-called cross media-bar (XMB), with categories of function arranged horizontally.
- the user navigates by moving through the functions horizontally using a game controller 1751 , remote control 1752 or other suitable control device so as to highlight the desired function, at which point options pertaining to that function appear as a vertically scrollable list centred on that function, which may be navigated in analogous fashion.
- the Playstation 3 device may select appropriate options automatically (for example, by commencing the game), or may provide relevant options (for example, to select between playing an audio disk or compressing its content to the HDD 1400 ).
- the OS provides an on-line capability, including a web browser, an interface with an on-line store from which additional game content, demos and other media may be downloaded, and a friends management capability, providing on-line communication with other Playstation 3 device users nominated by the user of the current device; for example, by text, audio or video depending on the peripheral devices available.
- the on-line capability also provides for on-line communication, content download and content purchase during play of a suitably configured game, and for updating the firmware and OS of the Playstation 3 device itself.
- the reproduction of the operation of the PS2 arrangement is an emulation rather than a simulation. That is to say, it is not the case that all of the operations contributing to the functionality of the PS2 are reproduced by the emulating system in a lock-step, clock-by-clock manner. Rather, some functions may be carried out by time division on a single emulating processing unit, and in general the processing units communicate with one another only when there is a need (within the emulated system) to do so.
- the PPE 1150 controls the overall operation of the emulating system and runs an operating system (OS) for the emulating system. It also has one thread which provides interpretation of native Emotion Engine PS2 instructions into native SPE instructions which it supplies, with any associated information (such as allocation of emulation functions—see below) which is required to carry out the respective part of the emulation, to the relevant SPEs via the EIB, while another thread provides the function of recompiling new code native to the emulating system to provide the particular functionality defined by the interpreted PS2 code. Emulation of the various parts of the PS2 system described above is devolved to the eight SPEs acting as emulating processing units, which emulate PS2 functionality as set out below.
- the precise identity of the individual SPEs is just a convenient notation and has no technical significance because of the nature of the message-passing interface between the SPEs. So, for example, the operations assigned to the SPEs 1110 A and 1110 B could be swapped in their entirety with no technical effect on the overall emulation process. It will also be appreciated that the disabling of one SPE can be carried out as mentioned above, so that the tasks are in fact divided amongst the remaining seven SPEs.
- the PS2 used a conventional bus for communication between the various emulated processing units.
- the emulating system makes use of the EIB 1180 for passing messages between SPEs and between an SPE, the PPE 1150 and the I/O bridge 1700 (and/or other system devices such as the RSX 1200 ).
- the PS2 system had conventional memory access arrangements to access the RDRAM 500 .
- the emulating system uses a distributed DMA system (the DMA controllers 1142 A-H).
- the main system memory 1500 is treated as a common memory “pool”, with all SPEs having access to it.
- Each of the SPEs runs locally, on its own time clock.
- the SPEs run software to allow parts of the functionality of the PS2 system to be emulated.
- synchronisation is required only when the emulated processing units emulated by the EPUs need to communicate with one another. At that time, synchronisation takes place just between the devices concerned, using a message transfer mechanism via the EIB.
- one of the SPEs places a message onto the EIB (including a source SPE identifier, a destination SPE identifier etc), addressed to the other of the SPEs.
- the message may include a request for a certain piece of data, or may include a data item which is being sent to that other SPE involved in that particular synchronisation.
- Once an acknowledgement is returned by that other SPE, the transaction is complete. This is a reliable but rather slow method of synchronising two emulated processors.
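- The message-and-acknowledgement exchange described above can be sketched as follows. This is a toy model under stated assumptions: a per-SPE inbox queue stands in for delivery over the EIB, and the `Message` fields and the labels "data" and "ack" are hypothetical names chosen for the example.

```python
import queue
import threading
from dataclasses import dataclass

@dataclass
class Message:
    src: int                # source SPE identifier
    dst: int                # destination SPE identifier
    kind: str               # "data" or "ack" (hypothetical labels)
    payload: object = None

class Bus:
    """Toy stand-in for the EIB: one inbox per SPE."""
    def __init__(self, n):
        self.inbox = [queue.Queue() for _ in range(n)]

    def send(self, msg):
        self.inbox[msg.dst].put(msg)

    def receive(self, spe_id):
        return self.inbox[spe_id].get()     # blocks until a message arrives

def sender(bus, me, other, item):
    """Send a data item, then wait for the other side to acknowledge it."""
    bus.send(Message(me, other, "data", item))
    reply = bus.receive(me)
    return reply.kind == "ack" and reply.src == other

def receiver(bus, me, store):
    """Accept one message, keep its payload, and acknowledge the sender."""
    msg = bus.receive(me)
    store.append(msg.payload)
    bus.send(Message(me, msg.src, "ack"))

bus, received = Bus(2), []
t = threading.Thread(target=receiver, args=(bus, 1, received))
t.start()
completed = sender(bus, 0, 1, "sample-block")
t.join()
```

The sender blocks until the acknowledgement arrives, which is what makes the exchange reliable and also what makes it comparatively slow.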
- the way in which the SPEs are logically arranged is shown schematically in FIG. 5 .
- Example paths of logical communication between the SPEs are also shown, although these need not be exhaustive. It can be seen that some functions are shared on the same SPE which avoids entirely the need to use the message-passing mechanism to communicate between them. So, providing the emulation of two (or more) emulated processing units on a single SPE can improve the system's performance by reducing the amount of inter-SPE communication needed.
- Another feature which is not exhaustively indicated on FIG. 5 , for clarity of the diagram, is that where the emulated processing units emulated by two different SPEs need to communicate with one another a lot in order to carry out particular functions of the PS2, a part of the functionality of one real processor can be carried out by the “other” processor's SPE.
- the PS2 sound processing unit is mostly emulated on one SPE. This processes samples and mixes them into the final samples for output. It also processes accesses to its register map.
- the registers used to write to sound processing unit sample memory are emulated on the IOP's SPE which manages the queuing of accesses, directly accesses the sample memory image in main memory, and raises any interrupts these might cause as though they had been routed from the sound processing unit.
- An example of one emulating SPE handling the emulation of multiple PS2 system devices is that the emulation of the PS2's IOP is shared (on a single emulating device) with the bulk of the emulation of a CD disk controller.
- the PPE can vary the distribution of emulation tasks between the SPEs (during an overall emulation operation), so as to alter which SPE emulates a particular emulated processing unit, and/or which emulated processing units are emulated by a particular SPE.
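- The benefit of placing heavily communicating emulated units on the same SPE can be illustrated by counting the messages that have to cross between SPEs under two placements. The traffic figures and unit labels below are invented for the example and are not measurements from the described system.

```python
def cross_spe_messages(traffic, placement):
    """Count messages that must cross between SPEs for a given placement.

    traffic:   {(unit_a, unit_b): number of messages exchanged}
    placement: {unit: index of the SPE that emulates it}
    """
    return sum(n for (a, b), n in traffic.items()
               if placement[a] != placement[b])

# Hypothetical message counts between emulated PS2 units.
traffic = {("IOP", "CD"): 900, ("IOP", "SPU"): 300, ("EE", "IOP"): 50}

separate = {"EE": 0, "IOP": 1, "CD": 2, "SPU": 3}   # one emulated unit per SPE
shared = {"EE": 0, "IOP": 1, "CD": 1, "SPU": 3}     # IOP and CD controller share an SPE

before = cross_spe_messages(traffic, separate)
after = cross_spe_messages(traffic, shared)
```

With these invented figures, co-locating the IOP and CD controller emulations removes their mutual traffic from the message-passing mechanism entirely.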
Abstract
Description
- This invention relates to data processing.
- As an example of data processing, electronic games are well known and may be supplied on a variety of distribution media, such as magnetic and/or optical discs. General computers or dedicated games consoles may be used to play these games.
- There is sometimes a need to emulate the operation of one processor on another processor. That is to say, the emulating processor runs native program code arranged so that such native instructions or groups of native instructions have the same effect as data processing instructions relating to the emulated system.
- A situation in which this need arises is where a data processor has been upgraded by the manufacturer to a new “generation”—for example, a new hardware architecture or instruction protocol, but the manufacturer still wants software relating to the older generation device to be handled (so-called backwards compatibility). Often the only way of achieving this is for the newer generation device to run emulation software which in turn acts in response to instructions relating to the older generation device. In this case, while it is of course acknowledged that running an emulation is generally much more processor-intensive than running native software, the general trend of generational improvements in the performance of data processing hardware is such that the increased processing overhead can usually be handled.
- This invention provides a data processor comprising a plurality of interconnected real processing units arranged to emulate the operation of an emulated processor having a plurality of interconnected emulated processing units, in which:
- at least one emulated processing unit is emulated by contributions from two or more real processing units; and
- at least one real processing unit contributes to emulating two or more emulated processing units.
- The invention addresses a problem relevant to an emulating system which uses a multi-processor architecture, particularly (though not exclusively) one in which communication between processors (in the emulating system) is relatively slow compared to the general speed of operation of the emulating system. The invention recognises that a division of the emulation of an emulated processing unit between two (or more) emulating processing units can reduce the message traffic needed to provide communication between the emulations of those emulated processing units. Similarly, by grouping together (on a single emulating processing unit) the emulation of multiple processing units which normally communicate heavily with one another, once again the message traffic needed to provide communication between the emulations of those emulated processing units can be greatly reduced. These measures, taken together, can provide a faster and more efficient emulation.
- Various further respective aspects and features of the invention are defined in the appended claims.
- Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:
-
FIG. 1 schematically illustrates the overall system architecture of the PlayStation2; -
FIG. 2 schematically illustrates the architecture of an Emotion Engine; -
FIG. 3 schematically illustrates the configuration of a Graphics Synthesiser; -
FIG. 4 schematically illustrates the structure of an emulating processor, in particular a Sony® PlayStation 3® device; -
FIG. 5 schematically illustrates a cell processor; -
FIG. 6 schematically illustrates a graphics unit; and -
FIG. 7 schematically illustrates logical interactions within the emulating processor. - Referring now to the drawings,
FIG. 1 schematically illustrates the overall system architecture of the PlayStation2 computer games machine. Asystem unit 10 is provided, with various peripheral devices connectable to the system unit. - The
system unit 10 comprises: an EmotionEngine 100; aGraphics Synthesiser 200; asound processor unit 300 having dynamic random access memory (DRAM); a read only memory (ROM) 400; a compact disc (CD) and digital versatile disc (DVD)reader 450; a Rambus Dynamic Random Access Memory (RDRAM)unit 500; an input/output processor (IOP) 700 withdedicated RAM 750. An (optional) external hard disk drive (MD) 390 may be connected. - The input/
output processor 700 has two Universal Serial Bus (USB)ports 715 and an iLink or IEEE 1394 port (iLink is the Sony Corporation implementation of the IEEE 1394 standard) (not shown). The IOP 700 handles all USB, iLink and game controller data traffic. For example when a user is playing a game, the IOP 700 receives data from the game controller and directs it to the Emotion Engine 100 which updates the current state of the game accordingly. The IOP 700 has a Direct Memory Access (DMA) architecture to facilitate rapid data transfer rates. DMA involves transfer of data from main memory to a device without passing it through the CPU. The USB interface is compatible with Open Host Controller Interface (OHCI) and can handle data transfer rates of between 1.5 Mbps and 12 Mbps. Provision of these interfaces means that the PlayStation2 is potentially compatible with peripheral devices such as digital video cassette recorders (VCRs) e.g. camcorders, digital cameras, microphones, printers, and input devices such as a keyboard, mouse and joystick. - Generally, in order for successful data communication to occur with a peripheral device connected to a
USB port 715, an appropriate piece of software such as a device driver should be provided. Device driver technology is very well known and will not be described in detail here, except to say that the skilled man will be aware that a device driver or similar software interface may be required in the embodiment described here. - In the present embodiment, a
USB microphone 730 is connected to the USB port. It will be appreciated that theUSB microphone 730 may be a hand-held microphone or may form part of a head-set that is worn by the human operator. The advantage of wearing a head-set is that the human operator's hands are free to perform other actions. The microphone includes an analogue-to-digital converter (ADC) and a basic hardware-based real-time data compression and encoding arrangement, so that audio data are transmitted by themicrophone 730 to theUSB port 715 in an appropriate format, such as a streaming compressed audio format for decoding at the PlayStation 2system unit 10. - Apart from the USB ports, two
other ports RAM memory card 720 for storing game-related information, a hand-heldgame controller 725 or a device (not shown) mimicking a hand-held controller, such as a dance mat. - The
system unit 10 may be connected to anetwork adapter 805 that provides an interface (such as an Ethernet interface) to a network. This network may be, for example, a LAN, a WAN or the Internet. The network may be a general network or one that is dedicated to game related communication. Thenetwork adapter 805 allows data to be transmitted to and received fromother system units 10 that are connected to the same network, (theother system units 10 also having corresponding network adapters 805). - The
Emotion Engine 100 is a 128-bit Central Processing Unit (CPU) that has been specifically designed for efficient simulation of 3 dimensional (3D) graphics for games applications. The Emotion Engine components include a data bus, cache memory (part of its CPU core) and registers, all of which are 128-bit. This facilitates fast processing of large volumes of multi-media data. Conventional PCs, by way of comparison, have a basic 64-bit data structure. The floating point calculation performance of the PlayStation2 is 6.2 GFLOPs. The Emotion Engine also comprises MPEG2 decoder circuitry which allows for simultaneous processing of 3D graphics data and DVD data The Emotion Engine performs geometrical calculations including mathematical transforms and translations and also performs calculations associated with the physics of simulation objects, for example, calculation of friction between two objects. It produces sequences of image rendering commands which are subsequently utilised by theGraphics Synthesiser 200. The image rendering commands are output in the form of display lists. A display list is a sequence of drawing commands that specifies to the Graphics Synthesiser which primitive graphic objects (e.g. points, lines, triangles, sprites) to draw on the screen and at which co-ordinates. Thus a typical display list will comprise commands to draw vertices, commands to shade the faces of polygons, render bitmaps and so on. TheEmotion Engine 100 can asynchronously generate multiple display lists. - The
Graphics Synthesiser 200 is a video accelerator that performs rendering of the display lists produced by theEmotion Engine 100. TheGraphics Synthesiser 200 includes a graphics interface unit (GIF) which handles, tracks and manages the multiple display lists. The rendering function of theGraphics Synthesiser 200 can generate image data that supports several alternative standard output image formats, i.e., NTSC/PAL, High Definition TV and VESA. In general, the rendering capability of graphics systems is defined by the memory bandwidth between a pixel engine and a video memory, each of which is located within the graphics processor. Conventional graphics systems use external Video Random Access Memory (VRAM) connected to the pixel logic via an off-chip bus which tends to restrict available bandwidth. However, theGraphics Synthesiser 200 of the PlayStation2 provides the pixel logic and the video memory on a single high-performance chip which allows for a comparatively large 38.4 Gigabyte per second memory access bandwidth. The Graphics Synthesiser is theoretically capable of achieving a peak drawing capacity of 75 million polygons per second. Even with a full range of effects such as textures, lighting and transparency, a sustained rate of 20 million polygons per second can be drawn continuously. Accordingly, theGraphics Synthesiser 200 is capable of rendering a film-quality image. - The Sound Processor Unit (SPU) 300 is effectively the soundcard of the system which is capable of handling 3D digital sound such as Digital Theater Surround (DTS®) sound and AC-3 (also known as Dolby Digital) which is the sound format used for DVDs.
- A display and
sound output device 305, such as a video monitor or television set with an associatedloudspeaker arrangement 310, is connected to receive video and audio signals from thegraphics synthesiser 200 and thesound processing unit 300. - The main memory supporting the
Emotion Engine 100 is the RDRAM (Rambus Dynamic Random Access Memory)module 500 licensed by Rambus Incorporated. This RDRAM memory subsystem comprises RAM, a RAM controller and a bus connecting the RAM to theEmotion Engine 100. -
FIG. 2 schematically illustrates the architecture of theEmotion Engine 100 ofFIG. 1 . TheEmotion Engine 100 is a collective term for a number of processing units interconnected to give a desired set of functionality. Viewed in this context, the Emotion Engine comprises: a floating point unit (FPU) 104; a central processing unit (CPU)core 102; vector unit zero (VU0) 106; vector unit one (VU1) 108; a graphics interface unit (GIF) 110; an interrupt controller (INTC) 112; atimer unit 114; a directmemory access controller 116; an image data processor unit (IPU) 118; a dynamic random access memory controller (DRAMC) 120; a sub-bus interface (SIF) 122; and all of these individual processing units are connected via a 128-bitmain bus 124. - The
CPU core 102 is a 128-bit processor clocked at 300 MHz (in fact 294.912 MHz, but 300 MHz tends to be used as shorthand for this figure). The CPU core has access to 32 MB of main memory via theDRAMC 120. TheCPU core 102 instruction set is based on MIPS III RISC with some MIPS IV RISC instructions together with additional multimedia instructions. MIPS III and IV are Reduced Instruction Set Computer (RISC) instruction set architectures proprietary to MIPS Technologies, Inc. Standard instructions are 64-bit, two-way superscalar, which means that two instructions can be executed simultaneously. Multimedia instructions, on the other hand, use 128-bit instructions via two pipelines. TheCPU core 102 comprises a 16 KB instruction cache, an 8 KB data cache and a 16 KB scratchpad RAM which is a portion of cache connected by a dedicated bus to the CPU, allowing data access independent of the main bus. - The
FPU 104 serves as a first co-processor for theCPU core 102. Thevector unit 106 acts as a second co-processor. TheFPU 104 comprises a floating point division calculator (FDIV). Thevector units CPU core 102 via a dedicated 128-bit bus so it is essentially a second specialised FPU. Vector unit one 108, on the other hand, has a dedicated bus to theGraphics synthesiser 200 and thus can be considered as a completely separate processor. The inclusion of two vector units allows the software developer to split up the work between different parts of the CPU and the vector units can be used in either serial or parallel connection. - Vector unit zero 106 comprises 4 FMACS and 1 FDIV. It is connected to the
CPU core 102 via a coprocessor connection. It has 4 KB of vector unit memory for data and 4 KB of micro-memory for instructions. Vector unit zero 106 is useful for performing physics calculations associated with the images for display. It primarily executes non-patterned geometric processing together with theCPU core 102. - Vector unit one 108 comprises 5 FMACS and 2 FDIVs. It has no direct path to the
CPU core 102, although it does have a direct path to theGIF unit 110. It has 16 KB of vector unit memory for data and 16 KB of micro-memory for instructions. Vector unit one 108 is useful for performing transformations. It primarily executes patterned geometric processing and directly outputs a generated display list to theGIF 110. - The
GIF 110 is an interface unit to theGraphics Synthesiser 200. It converts data according to a tag specification at the beginning of a display list packet and transfers drawing commands to theGraphics Synthesiser 200 whilst mutually arbitrating multiple transfer. The interrupt controller (INTC) 112 serves to arbitrate interrupts from peripheral devices, except theDMAC 116. - The
timer unit 114 comprises four independent timers with 16-bit counters. The timers are driven either by the bus clock (at 1/16 or 1/256 intervals) or via an external clock. The DMAC 116 handles data transfers between main memory and scratchpad RAM, or between main memory or scratchpad RAM and peripherals. It arbitrates the main bus 124 at the same time. Performance optimisation of the DMAC 116 is a key way by which to improve Emotion Engine performance. The image processing unit (IPU) 118 is an image data processor that is used to expand compressed animations and texture images. It performs macro-block decoding, colour space conversion and vector quantisation. Finally, the sub-bus interface (SIF) 122 is an interface unit to the IOP 700. The IOP has its own memory and bus to control I/O devices such as sound chips and storage devices. -
FIG. 3 schematically illustrates the configuration of the Graphics Synthesiser 200. The Graphics Synthesiser comprises: a host interface 202; a pixel pipeline 206; a memory interface 208; a local memory 212 including a frame page buffer 214 and a texture page buffer 216; and a video converter 210. - The
host interface 202 transfers data with the host (in this case the GIF 110). Both drawing data and buffer data from the host pass through this interface. The output from thehost interface 202 is supplied to thegraphics synthesiser 200 which develops the graphics to draw pixels based on vertex information received from theEmotion Engine 100, and calculates information such as RGBA value, depth value (i.e. Z-value), texture value and fog value for each pixel. The RGBA value specifies the red, green, blue (RGB) colour components and the A (Alpha) component represents opacity of an image object. The Alpha value can range from completely transparent to totally opaque. The pixel data is supplied to thepixel pipeline 206 which performs processes such as texture mapping, fogging and Alpha-blending and determines the final drawing colour based on the calculated pixel information. - The
pixel pipeline 206 comprises 16 pixel engines PE1, PE2, . . . , PE16 so that it can process a maximum of 16 pixels concurrently. Thepixel pipeline 206 runs at 150 MHz (in fact 294.912/2 MHz) with 32-bit colour and a 32-bit Z-buffer. Thememory interface 208 reads data from and writes data to the localGraphics Synthesiser memory 212. It writes the drawing pixel values (RGBA and Z) to memory at the end of a pixel operation and reads the pixel values of theframe buffer 214 from memory. These pixel values read from theframe buffer 214 are used for pixel test or Alpha-blending. Thememory interface 208 also reads fromlocal memory 212 the RGBA values for the current contents of the frame buffer. Thelocal memory 212 is a 32 Mbit (4 MB) memory that is built-in to theGraphics Synthesiser 200. It can be organised as aframe buffer 214,texture buffer 216 and a Z-buffer 215. Theframe buffer 214 is the portion of video memory where pixel data such as colour information is stored. - The Graphics Synthesiser uses a 2D to 3D texture mapping process to add visual detail to 3D geometry. Each texture may be wrapped around a 3D image object and is stretched and skewed to give a 3D graphical effect. The texture buffer is used to store the texture information for image objects. The Z-buffer 215 (also known as depth buffer) is the memory available to store the depth information for a pixel. Images are constructed from basic building blocks known as graphics primitives or polygons. When a polygon is rendered with Z-buffering, the depth value of each of its pixels is compared with the corresponding value stored in the Z-buffer. If the value stored in the Z-buffer is greater than or equal to the depth of the new pixel value then this pixel is determined visible so that it should be rendered and the Z-buffer will be updated with the new pixel depth. 
If however the Z-buffer depth value is less than the new pixel depth value the new pixel value is behind what has already been drawn and will not be rendered. Alternative Z-buffer tests are available, so that (a) the new pixel always replaces the previous value, or (b) the new pixel replaces the previous pixel value if its depth is greater than or equal to the previous value stored in the Z buffer.
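The depth tests described above can be sketched in a few lines of Python. This is only an illustration of the comparison rules in the text, not the Graphics Synthesiser's actual implementation; the names `DepthTest`, `passes` and `plot`, and the use of nested lists as buffers, are assumptions made for the example.

```python
from enum import Enum


class DepthTest(Enum):
    NEARER_OR_EQUAL = 1   # default test: stored depth >= new depth
    ALWAYS = 2            # alternative (a): new pixel always replaces
    GREATER_OR_EQUAL = 3  # alternative (b): new depth >= stored depth


def passes(test, stored_depth, new_depth):
    """Return True if the new pixel should replace the stored one."""
    if test is DepthTest.ALWAYS:
        return True
    if test is DepthTest.GREATER_OR_EQUAL:
        return new_depth >= stored_depth
    return stored_depth >= new_depth


def plot(frame, zbuf, x, y, colour, depth, test=DepthTest.NEARER_OR_EQUAL):
    """Write one pixel; update frame buffer and Z-buffer only if it passes."""
    if passes(test, zbuf[y][x], depth):
        frame[y][x] = colour
        zbuf[y][x] = depth
        return True
    return False


# Hypothetical 1x1 buffers; the Z-buffer starts "infinitely far away".
frame = [[None]]
zbuf = [[float("inf")]]
plot(frame, zbuf, 0, 0, "red", 5.0)    # accepted: nothing drawn yet
plot(frame, zbuf, 0, 0, "blue", 7.0)   # rejected: behind the red pixel
```

Under the default test a pixel is rejected when the stored depth is less than the new depth, which matches the "behind what has already been drawn" case in the text.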
- The
local memory 212 has a 1024-bit read port and a 1024-bit write port for accessing the frame buffer and Z-buffer and a 512-bit port for texture reading. Thevideo converter 210 is operable to display the contents of the frame memory in a specified output format. - An arrangement will now be described to allow the emulation of the system described with reference to
FIGS. 1 to 3. Note that for convenience, the processing units shown in FIGS. 1 to 3 will be referred to as “emulated” processing units, whereas in the emulating system to be described below, processing units of that (emulating) system will be referred to as “emulating” processing units. To avoid any possible confusion, note that both categories of processing units (“emulated” and “emulating” processing units) represent physical processing units capable of running native software appropriate to those processing units. -
FIG. 4 schematically illustrates the overall system architecture of the Sony® Playstation 3® entertainment device. Asystem unit 910 is provided, with various peripheral devices connectable to the system unit. - The
system unit 910 comprises: aCell processor 1100; a Rambus® dynamic random access memory (XDRAM)unit 1500; a RealitySynthesiser graphics unit 1200 with a dedicated video random access memory (VRAM)unit 1250; and an I/O bridge 1700. - The
system unit 910 also comprises a Blu Ray® Disk BD-ROM® optical disk reader 1430 for reading a disk 1440 and a removable slot-in hard disk drive (HDD) 1400, accessible through the I/O bridge 1700. Optionally the system unit also comprises a memory card reader 1450 for reading compact flash memory cards, Memory Stick® memory cards and the like, which is similarly accessible through the I/O bridge 1700. - The
I/O bridge 1700 also connects to six Universal Serial Bus (USB) 2.0 ports 1710; a gigabit Ethernet port 1720; an IEEE 802.11b/g wireless network (Wi-Fi) port 1730; and a Bluetooth® wireless link port 1740 capable of supporting up to seven Bluetooth connections. - In operation the
I/O bridge 1700 handles all wireless, USB and Ethernet data, including data from one or more game controllers 1751. For example when a user is playing a game, the I/O bridge 1700 receives data from the game controller 1751 via a Bluetooth link and directs it to the Cell processor 1100, which updates the current state of the game accordingly. - The wireless, USB and Ethernet ports also provide connectivity for other peripheral devices in addition to
game controllers 1751, such as: aremote control 1752; akeyboard 1753; amouse 1754; aportable entertainment device 1755 such as a Sony Playstation Portable® entertainment device; a video camera such as an EyeToy® video camera 1756; and amicrophone headset 1757. Such peripheral devices may therefore in principle be connected to thesystem unit 910 wirelessly; for example theportable entertainment device 1755 may communicate via a Wi-Fi ad-hoc connection, whilst themicrophone headset 1757 may communicate via a Bluetooth link. - The provision of these interfaces means that the Playstation 3 device is also potentially compatible with other peripheral devices such as digital video recorders (DVRs), set-top boxes, digital cameras, portable media players, Voice over IP telephones, mobile telephones, printers and scanners.
- In addition, a legacy
memory card reader 1410 may be connected to the system unit via aUSB port 1710, enabling the reading ofmemory cards 1420 of the kind used by the Playstation® or Playstation 2® devices. - In the present embodiment, the
game controller 1751 is operable to communicate wirelessly with thesystem unit 910 via the Bluetooth link. However, thegame controller 1751 can instead be connected to a USB port, thereby also providing power by which to charge the battery of thegame controller 1751. In addition to one or more analogue joysticks and conventional control buttons, the game controller is sensitive to motion in 6 degrees of freedom, corresponding to translation and rotation in each axis. Consequently gestures and movements by the user of the game controller may be translated as inputs to a game in addition to or instead of conventional button or joystick commands. Optionally, other wirelessly enabled peripheral devices such as the Playstation Portable device may be used as a controller. In the case of the Playstation Portable device, additional game or control information (for example, control instructions or number of lives) may be provided on the screen of the device. Other alternative or supplementary control devices may also be used, such as a dance mat (not shown), a light gun (not shown), a steering wheel and pedals (not shown) or bespoke controllers, such as a single or several large buttons for a rapid-response quiz game (also not shown). - The
remote control 1752 is also operable to communicate wirelessly with thesystem unit 910 via a Bluetooth link. Theremote control 1752 comprises controls suitable for the operation of the Blu Ray Disk BD-ROM reader 1430 and for the navigation of disk content. - The Blu Ray Disk BD-
ROM reader 1430 is operable to read CD-ROMs compatible with the Playstation and PlayStation 2 devices, in addition to conventional pre-recorded and recordable CDs, and so-called Super Audio CDs. Thereader 1430 is also operable to read DVD-ROMs compatible with the Playstation 2 and PlayStation 3 devices, in addition to conventional pre-recorded and recordable DVDs. Thereader 1430 is further operable to read BD-ROMs compatible with the Playstation 3 device, as well as conventional pre-recorded and recordable Blu-Ray Disks. - The
system unit 910 is operable to supply audio and video, either generated or decoded by the Playstation 3 device via the RealitySynthesiser graphics unit 1200, through audio and video connectors to a display andsound output device 1300 such as a monitor or television set, having adisplay screen 1305 and one ormore loudspeakers 1310. Theaudio connectors 1210 may include conventional analogue and digital outputs whilst thevideo connectors 1220 may variously include component video, S-video, composite video and one or more High Definition Multimedia Interface (HDMI) outputs. Consequently, video output may be in formats such as PAL or NTSC, or in 720p, 1080i or 1080p high definition. - Audio processing (generation, decoding and so on) is performed by the
Cell processor 1100. The Playstation 3 device's operating system supports Dolby® 5.1 surround sound, Digital Theater Systems (DTS) surround sound, and the decoding of 7.1 surround sound from Blu-Ray® disks. - In the present embodiment, the
video camera 1756 comprises a single charge coupled device (CCD), an LED indicator, and hardware-based real-time data compression and encoding apparatus so that compressed video data may be transmitted in an appropriate format such as an intra-image based MPEG (motion picture expert group) standard for decoding by thesystem unit 910. The camera LED indicator is arranged to illuminate in response to appropriate control data from thesystem unit 910, for example to signify adverse lighting conditions. Embodiments of thevideo camera 1756 may variously connect to thesystem unit 910 via a USB, Bluetooth or Wi-Fi communication port. Embodiments of the video camera may include an associated microphone and also be capable of transmitting audio data. In embodiments of the video camera, the CCD may have a resolution suitable for high-definition video capture. In use, images captured by the video camera may for example be incorporated within a game or interpreted as game control inputs. - In general, in order for successful data communication to occur with a peripheral device such as a video camera or remote control via one of the communication ports of the
system unit 910, an appropriate piece of software such as a device driver should be provided. Device driver technology is well-known and will not be described in detail here, except to say that the skilled man will be aware that a device driver or similar software interface may be required in the present embodiment described. - Referring now to
FIG. 5 , theCell processor 1100 has an architecture comprising four basic components: external input and output structures comprising amemory controller 1160 and a dualbus interface controller 1170A,B; a main processor referred to as the Power Processing Element (PPE) 1150; eight co-processors referred to as Synergistic Processing Elements (SPEs) 1110A-H; and a circular data bus connecting the above components referred to as theElement Interconnect Bus 1180. The total floating point performance of the Cell processor is 218 GFLOPS, compared with the 6.2 GFLOPs of the Playstation 2 device's Emotion Engine. - The Power Processing Element (PPE) 1150 is based upon a two-way simultaneous multithreading Power 970 compliant PowerPC core (PP) 1155 running with an internal clock of 3.2 GHz. It comprises a 512 kB level 2 (L2) cache and a 32 kB level 1 (L1) cache. The
PPE 1150 is capable of eight single precision operations per clock cycle, translating to 25.6 GFLOPs at 3.2 GHz. The primary role of the PPE 1150 is to act as a controller for the Synergistic Processing Elements 1110A-H, which handle most of the computational workload. In operation the PPE 1150 maintains a job queue, scheduling jobs for the Synergistic Processing Elements 1110A-H and monitoring their progress. Consequently each Synergistic Processing Element 1110A-H runs a kernel whose role is to fetch a job, execute it and synchronise with the PPE 1150. - Each Synergistic Processing Element (SPE) 1110A-H comprises a respective Synergistic Processing Unit (SPU′, to distinguish it from the Sound Processing Unit mentioned above) 1120A-H, and a respective Memory Flow Controller (MFC) 1140A-H comprising in turn a respective Direct Memory Access Controller (DMAC) 1142A-H, a respective Memory Management Unit (MMU) 1144A-H and a bus interface (not shown). Each SPU′ 1120A-H is a RISC processor clocked at 3.2 GHz and comprising 256 kB
local RAM 1130A-H, expandable in principle to 4 GB. Each SPE gives a theoretical 25.6 GFLOPS of single precision performance. An SPU′ can operate on 4 single precision floating point numbers, 4 32-bit numbers, 8 16-bit integers, or 16 8-bit integers in a single clock cycle. In the same clock cycle it can also perform a memory operation. The SPU′ 1120A-H does not directly access the system memory XDRAM 1500; the 64-bit addresses formed by the SPU′ 1120A-H are passed to the MFC 1140A-H which instructs its DMA controller 1142A-H to access memory via the Element Interconnect Bus 1180 and the memory controller 1160. - The Element Interconnect Bus (EIB) 1180 is a logically circular communication bus internal to the
Cell processor 1100 which connects the above processor elements, namely the PPE 1150, the memory controller 1160, the dual bus interface 1170A,B and the 8 SPEs 1110A-H, totalling 12 participants. Participants can simultaneously read and write to the bus at a rate of 8 bytes per clock cycle. As noted previously, each SPE 1110A-H comprises a DMAC 1142A-H for scheduling longer read or write sequences. The EIB comprises four channels, two each in clockwise and anti-clockwise directions. Consequently for twelve participants, the longest step-wise data-flow between any two participants is six steps in the appropriate direction. The theoretical peak instantaneous EIB bandwidth for 12 slots is therefore 96 B (bytes) per clock (12 participants at 8 B each), in the event of full utilisation through arbitration between participants. This equates to a theoretical peak bandwidth of 307.2 GB/s (gigabytes per second) at a clock rate of 3.2 GHz (96 B × 3.2 GHz). - The
memory controller 1160 comprises anXDRAM interface 1162, developed by Rambus Incorporated. The memory controller interfaces with theRambus XDRAM 1500 with a theoretical peak bandwidth of 25.6 GB/s. - The
dual bus interface 1170A,B comprises a Rambus FlexIO® system interface 1172A,B. The interface is organised into 12 channels each being 8 bits wide, with five paths being inbound and seven outbound. This provides a theoretical peak bandwidth of 62.4 GB/s (36.4 GB/s outbound, 26 GB/s inbound) between the Cell processor and the I/O Bridge 1700 via the controller 1170A and the Reality Synthesiser graphics unit 1200 via controller 1170B. - Data sent by the
Cell processor 1100 to the Reality Synthesiser graphics unit 1200 will typically comprise display lists, being a sequence of commands to draw vertices, apply textures to polygons, specify lighting conditions, and so on. - Referring now to
FIG. 6, the Reality Synthesiser graphics (RSX) unit 1200 is a video accelerator based upon the NVidia® G70/71 architecture that processes and renders lists of commands produced by the Cell processor 1100. The RSX unit 1200 comprises a host interface 1202 operable to communicate with the bus interface controller 1170B of the Cell processor 1100; a vertex pipeline 1204 (VP) comprising eight vertex shaders 1205; a pixel pipeline 1206 (PP) comprising 24 pixel shaders 1207; a render pipeline 1208 (RP) comprising eight render output units (ROPs) 1209; a memory interface 1210; and a video converter 1212 for generating a video output. The RSX 1200 is complemented by 256 MB double data rate (DDR) video RAM (VRAM) 1250, clocked at 600 MHz and operable to interface with the RSX 1200 at a theoretical peak bandwidth of 25.6 GB/s. In operation, the VRAM 1250 maintains a frame buffer 1214 and a texture buffer 1216. The texture buffer 1216 provides textures to the pixel shaders 1207, whilst the frame buffer 1214 stores results of the processing pipelines. The RSX can also access the main memory 1500 via the EIB 1180, for example to load textures into the VRAM 1250. - The
vertex pipeline 1204 primarily processes deformations and transformations of vertices defining polygons within the image to be rendered. - The
pixel pipeline 1206 primarily processes the application of colour, textures and lighting to these polygons, including any pixel transparency, generating red, green, blue and alpha (transparency) values for each processed pixel. Texture mapping may simply apply a graphic image to a surface, or may include bump-mapping (in which the notional direction of a surface is perturbed in accordance with texture values to create highlights and shade in the lighting model) or displacement mapping (in which the applied texture additionally perturbs vertex positions to generate a deformed surface consistent with the texture). - The render
pipeline 1208 performs depth comparisons between pixels to determine which should be rendered in the final image. Optionally, if the intervening pixel process will not affect depth values (for example in the absence of transparency or displacement mapping) then the render pipeline andvertex pipeline 1204 can communicate depth information between them, thereby enabling the removal of occluded elements prior to pixel processing, and so improving overall rendering efficiency. In addition, the renderpipeline 1208 also applies subsequent effects such as full-screen anti-aliasing over the resulting image. - Both the
vertex shaders 1205 andpixel shaders 1207 are based on the shader model 3.0 standard. Up to 136 shader operations can be performed per clock cycle, with the combined pipeline therefore capable of 74.8 billion shader operations per second, outputting up to 840 million vertices and 10 billion pixels per second. The total floating point performance of theRSX 1200 is 1.8 TFLOPS. - Typically, the
RSX 1200 operates in close collaboration with theCell processor 1100; for example, when displaying an explosion, or weather effects such as rain or snow, a large number of particles must be tracked, updated and rendered within the scene. In this case, thePPU 1155 of the Cell processor may schedule one ormore SPEs 1110A-H to compute the trajectories of respective batches of particles. Meanwhile, theRSX 1200 accesses any texture data (e.g. snowflakes) not currently held in thevideo RAM 1250 from themain system memory 1500 via theelement interconnect bus 1180, thememory controller 1160 and abus interface controller 1170B. The or eachSPE 1110A-H outputs its computed particle properties (typically coordinates and normals, indicating position and attitude) directly to thevideo RAM 1250; theDMA controller 1142A-H of the or eachSPE 1110A-H addresses thevideo RAM 1250 via thebus interface controller 1170B. Thus in effect the assigned SPEs become part of the video processing pipeline for the duration of the task. - In general, the
PPU 1155 can assign tasks in this fashion to six of the eight SPEs available; one SPE is reserved for the operating system, whilst one SPE is optionally disabled. The disabling of one SPE provides a greater level of tolerance during fabrication of the Cell processor, as it allows for one SPE to fail the fabrication process. Alternatively if all eight SPEs are functional, then the eighth SPE provides scope for redundancy in the event of subsequent failure by one of the other SPEs during the life of the Cell processor. - The
PPU 1155 can assign tasks to SPEs in several ways. For example, SPEs may be chained together to handle each step in a complex operation, such as accessing a DVD, video and audio decoding, and error masking, with each step being assigned to a separate SPE. Alternatively or in addition, two or more SPEs may be assigned to operate on input data in parallel, as in the particle animation example above. - Software instructions implemented by the
Cell processor 1100 and/or theRSX 1200 may be supplied at manufacture and stored on theHDD 1400, and/or may be supplied on a data carrier or storage medium such as an optical disk or solid state memory, or via a transmission medium such as a wired or wireless network or internet connection, or via combinations of these. - The software supplied at manufacture comprises system firmware and the Playstation 3 device's operating system (OS). In operation, the OS provides a user interface enabling a user to select from a variety of functions, including playing a game, listening to music, viewing photographs, or viewing a video. The interface takes the form of a so-called cross media-bar (XMB), with categories of function arranged horizontally. The user navigates by moving through the functions horizontally using a
game controller 1751,remote control 1752 or other suitable control device so as to highlight the desired function, at which point options pertaining to that function appear as a vertically scrollable list centred on that function, which may be navigated in analogous fashion. However, if a game, audio ormovie disk 1440 is inserted into the BD-ROMoptical disk reader 1430, the Playstation 3 device may select appropriate options automatically (for example, by commencing the game), or may provide relevant options (for example, to select between playing an audio disk or compressing its content to the HDD 1400). - In addition, the OS provides an on-line capability, including a web browser, an interface with an on-line store from which additional game content, demos and other media may be downloaded, and a friends management capability, providing on-line communication with other Playstation 3 device users nominated by the user of the current device; for example, by text, audio or video depending on the peripheral devices available. The on-line capability also provides for on-line communication, content download and content purchase during play of a suitably configured game, and for updating the firmware and OS of the Playstation 3 device itself.
- The operation of the PS2 arrangement described with reference to
FIGS. 1 to 3 is reproduced (or very nearly reproduced) by software running on the arrangement of FIGS. 4 to 6, despite the fact that the emulating processing units of FIGS. 4 to 6 have (in general terms) a different architecture, speed, memory accessing capabilities and so on, compared to the emulated processing units of FIGS. 1 to 3. - The reproduction of the operation of the PS2 arrangement is an emulation rather than a simulation. That is to say, it is not the case that all of the operations contributing to the functionality of the PS2 are reproduced by the emulating system in a lock-step, clock-by-clock manner. Rather, some functions may be carried out by time division on a single emulating processing unit, and in general the processing units communicate with one another only when there is a need (within the emulated system) to do so.
- The
PPE 1150 controls the overall operation of the emulating system and runs an operating system (OS) for the emulating system. It also has one thread which provides interpretation of native Emotion Engine PS2 instructions into native SPE instructions which it supplies, with any associated information (such as allocation of emulation functions—see below) which is required to carry out the respective part of the emulation, to the relevant SPEs via the EIB, while another thread provides the function of recompiling new code native to the emulating system to provide the particular functionality defined by the interpreted PS2 code. Emulation of the various parts of the PS2 system described above is devolved to the eight SPEs acting as emulating processing units, which emulate PS2 functionality as set out below. It will be appreciated that the precise identity of the individual SPEs is just a convenient notation and has no technical significance because of the nature of the message-passing interface between the SPEs. So, for example, the operations assigned to theSPEs -
Emulating processing unit | Emulated PS2 functionality |
---|---|
SPE 1110A | IPU 118 |
SPE 1110B | Emotion Engine CPU 102 and Vector Unit 0 |
SPE 1110C | VIF 0, VIF 1, GIF 110 |
SPE 1110D | Vector Unit 1 |
SPE 1110E | GS 200 (i.e. that part of the operation of the GS 200 specific to the PS2 system; the SPE 1110G also interfaces with a graphics controller of the emulating system (not shown) for non-PS2-specific graphics operations) |
SPE 1110F | generally unused, but can be used to recompile code to emulate Vector Unit 1, to ease the load on the one thread of the PPE 1150 described above |
SPE 1110G | SPU 300 |
SPE 1110H | IOP 700 and SIF 122 |

- The PS2 used a conventional bus for communication between the various emulated processing units. The emulating system makes use of the
EIB 1180 for passing messages between SPEs and between an SPE, the PPE 1150 and the I/O bridge 1700 (and/or other system devices such as the RSX 1200). The PS2 system had conventional memory access arrangements to access the RDRAM 500. The emulating system uses a distributed DMA system (the DMA controllers 1142A-H). The main system memory 1500 is treated as a common memory “pool”, with all SPEs having access to it. - Each of the SPEs runs locally, on its own time clock. The SPEs run software to allow parts of the functionality of the PS2 system to be emulated. As between emulating processing units (EPUs), synchronisation is required only when the emulated processing units emulated by the EPUs need to communicate with one another. At that time, synchronisation takes place just between the devices concerned, using a message transfer mechanism via the EIB.
- To achieve this, when synchronisation (of emulated functionality) is required between two SPEs, one of the SPEs places a message onto the EIB (including a source SPE identifier, a destination SPE identifier etc), addressed to the other of the SPEs. The message may include a request for a certain piece of data, or may include a data item which is being sent to that other SPE involved in that particular synchronisation. When an acknowledgement is returned by that other SPE, the transaction is complete. This is a reliable but rather slow method of synchronising two emulated processors.
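The request/acknowledge exchange described above might be modelled as follows. This is a minimal single-threaded sketch, not the emulator's actual code: the `Bus` class is a toy stand-in for the EIB, and the names `Message`, `serve_one` and `transaction`, the `kind` field and the dictionary used as each SPE's local state are all assumptions introduced for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    source: str          # source SPE identifier
    destination: str     # destination SPE identifier
    kind: str            # "request", "data" or "ack" (hypothetical labels)
    payload: object = None


class Bus:
    """Toy stand-in for the EIB: one FIFO mailbox per attached SPE."""

    def __init__(self, spe_ids):
        self.mailboxes = {spe_id: deque() for spe_id in spe_ids}

    def send(self, message):
        self.mailboxes[message.destination].append(message)

    def receive(self, spe_id):
        box = self.mailboxes[spe_id]
        return box.popleft() if box else None


def serve_one(bus, spe_id, local_data):
    """Destination side: handle one pending message and acknowledge it."""
    message = bus.receive(spe_id)
    if message is None:
        return None
    if message.kind == "request":
        reply = local_data.get(message.payload)   # look up the requested item
    else:                                         # "data": store the item sent
        key, value = message.payload
        local_data[key] = value
        reply = None
    bus.send(Message(spe_id, message.source, "ack", reply))
    return message


def transaction(bus, source, destination, kind, payload, destination_data):
    """Source side: send, let the destination run, then collect the ack."""
    bus.send(Message(source, destination, kind, payload))
    serve_one(bus, destination, destination_data)
    ack = bus.receive(source)
    if ack is None or ack.kind != "ack":
        raise RuntimeError("transaction not acknowledged")
    return ack.payload
```

A transaction here completes only once the acknowledgement has been read back by the source, which reflects why the text calls the method reliable but rather slow: every synchronisation costs a round trip.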
- The way in which the SPEs are logically arranged is shown schematically in
FIG. 5 . Example paths of logical communication between the SPEs are also shown, although these need not be exhaustive. It can be seen that some functions are shared on the same SPE which avoids entirely the need to use the message-passing mechanism to communicate between them. So, providing the emulation of two (or more) emulated processing units on a single SPE can improve the system's performance by reducing the amount of inter-SPE communication needed. - Another feature which is not exhaustively indicated on
FIG. 5 , for clarity of the diagram, is that where the emulated processing units emulated by two different SPEs need to communicate with one another a lot in order to carry out particular functions of the PS2, a part of the functionality of one real processor can be carried out by the “other” processor's SPE. - For example, the PS2 sound processing unit is mostly emulated on one SPE. This processes samples and mixes them into the final samples for output. It also processes accesses to its register map. However, the registers used to write to sound processing unit sample memory are emulated on the IOP's SPE which manages the queuing of accesses, directly accesses the sample memory image in main memory, and raises any interrupts these might cause as though they had been routed from the sound processing unit.
- Another example of a device in the PS2 system which is implemented on more than one SPE is the DMAC, whose function is distributed between the emulating components primarily used for the Emotion Engine, VIF, GIF, IPU and others.
- An example of one emulating SPE handling the emulation of multiple PS2 system devices is that the emulation of the PS2's IOP is shared (on a single emulating device) with the bulk of the emulation of a CD disk controller.
- This division of the emulation of an emulated processing unit between two (or more) SPEs (emulating processing units) again can reduce the message traffic needed to provide communication between the emulations of those emulated processing units. The PPE can vary the distribution of emulation tasks between the SPEs (during an overall emulation operation), so as to alter which SPE emulates a particular emulated processing unit, and/or which emulated processing units are emulated by a particular SPE.
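The effect of co-locating emulated processing units on one SPE can be illustrated with a small Python sketch. The assignment below follows the SPE allocation listed earlier; the function names and the example `traffic` list are hypothetical and chosen only to show how a reassignment by the PPE could reduce the number of inter-SPE messages.

```python
# Emulated unit -> emulating SPE, following the allocation listed earlier.
DEFAULT_ASSIGNMENT = {
    "IPU 118": "SPE 1110A",
    "Emotion Engine CPU 102": "SPE 1110B",
    "Vector Unit 0": "SPE 1110B",
    "VIF 0": "SPE 1110C",
    "VIF 1": "SPE 1110C",
    "GIF 110": "SPE 1110C",
    "Vector Unit 1": "SPE 1110D",
    "GS 200": "SPE 1110E",
    "SPU 300": "SPE 1110G",
    "IOP 700": "SPE 1110H",
    "SIF 122": "SPE 1110H",
}


def needs_message(assignment, unit_a, unit_b):
    """Units emulated on the same SPE communicate without the bus."""
    return assignment[unit_a] != assignment[unit_b]


def reassign(assignment, unit, spe):
    """Return a new assignment with one emulated unit moved to another SPE."""
    updated = dict(assignment)
    updated[unit] = spe
    return updated


def message_count(assignment, traffic):
    """Count the communications in `traffic` that must cross between SPEs."""
    return sum(1 for a, b in traffic if needs_message(assignment, a, b))


# Hypothetical communication pattern between emulated units.
traffic = [
    ("Emotion Engine CPU 102", "Vector Unit 0"),
    ("Emotion Engine CPU 102", "Vector Unit 1"),
    ("Vector Unit 1", "GIF 110"),
]
before = message_count(DEFAULT_ASSIGNMENT, traffic)
moved = reassign(DEFAULT_ASSIGNMENT, "Vector Unit 1", "SPE 1110B")
after = message_count(moved, traffic)
```

In this made-up pattern, moving Vector Unit 1 onto the SPE that emulates the Emotion Engine CPU removes one of the two bus messages, at the cost of extra load on that SPE; the real trade-off would depend on the workload.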
- In so far as the embodiments of the invention described above are implemented, at least in part, using software-controlled data processing apparatus, it will be appreciated that a computer program providing such software control, a storage medium by which such a computer program is stored and a transmission medium by which such a computer program is transmitted are envisaged as aspects of the present invention. It is noted that such software may be provided on a storage medium such as an optical disk or a hardware memory, and/or via a transmission medium such as a network connection or the internet.
Claims (13)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0603446A GB2435335A (en) | 2006-02-21 | 2006-02-21 | Multi-processor emulation by a multi-processor |
GB0603446.6 | 2006-02-21 | ||
PCT/GB2007/000587 WO2007096602A1 (en) | 2006-02-21 | 2007-02-19 | Data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
US20090247249A1 (en) | 2009-10-01 |
Family
ID=36178463
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/280,144 Abandoned US20090247249A1 (en) | 2006-02-21 | 2007-02-19 | Data processing |
Country Status (5)
Country | Link |
---|---|
US (1) | US20090247249A1 (en) |
EP (1) | EP1987426A1 (en) |
JP (2) | JP2009527836A (en) |
GB (1) | GB2435335A (en) |
WO (1) | WO2007096602A1 (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8458754B2 (en) | 2001-01-22 | 2013-06-04 | Sony Computer Entertainment Inc. | Method and system for providing instant start multimedia content |
US7770050B2 (en) | 2006-05-03 | 2010-08-03 | Sony Computer Entertainment Inc. | Method and apparatus for resolving clock management issues in emulation involving both interpreted and translated code |
US7792666B2 (en) | 2006-05-03 | 2010-09-07 | Sony Computer Entertainment Inc. | Translation block invalidation prehints in emulation of a target system on a host system |
US7813909B2 (en) | 2006-05-03 | 2010-10-12 | Sony Computer Entertainment Inc. | Register mapping in emulation of a target system on a host system |
US9483405B2 (en) | 2007-09-20 | 2016-11-01 | Sony Interactive Entertainment Inc. | Simplified run-time program translation for emulating complex processor pipelines |
US8060356B2 (en) | 2007-12-19 | 2011-11-15 | Sony Computer Entertainment Inc. | Processor emulation using fragment level translation |
JP5242628B2 (en) * | 2010-05-06 | 2013-07-24 | 株式会社スクウェア・エニックス | A high-level language that improves programmer productivity in game development |
US8433759B2 (en) | 2010-05-24 | 2013-04-30 | Sony Computer Entertainment America Llc | Direction-conscious information sharing |
CN114882149A (en) | 2022-03-31 | 2022-08-09 | 北京智明星通科技股份有限公司 | Animation rendering method and device, electronic equipment and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5701479A (en) * | 1993-06-15 | 1997-12-23 | Xerox Corporation | Pipelined image processing system for a single application environment |
US5966515A (en) * | 1996-12-31 | 1999-10-12 | Unisys Corporation | Parallel emulation system and method |
US20020128812A1 (en) * | 2001-03-12 | 2002-09-12 | International Business Machines Corporation | Time-multiplexing data between asynchronous clock domains within cycle simulation and emulation environments |
US20030225565A1 (en) * | 2002-06-03 | 2003-12-04 | Broadcom Corporation | Method and system for deterministic control of an emulation |
US20040054992A1 (en) * | 2002-09-17 | 2004-03-18 | International Business Machines Corporation | Method and system for transparent dynamic optimization in a multiprocessing environment |
US20040054517A1 (en) * | 2002-09-17 | 2004-03-18 | International Business Machines Corporation | Method and system for multiprocessor emulation on a multiprocessor host system |
US20050188373A1 (en) * | 2004-02-20 | 2005-08-25 | Sony Computer Entertainment Inc. | Methods and apparatus for task management in a multi-processor system |
US6978233B1 (en) * | 2000-03-03 | 2005-12-20 | Unisys Corporation | Method for emulating multi-processor environment |
US7089175B1 (en) * | 2000-10-26 | 2006-08-08 | Cypress Semiconductor Corporation | Combined in-circuit emulator and programmer |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH03198161A (en) * | 1989-12-27 | 1991-08-29 | Fujitsu Ltd | Logical design parallel processing system |
JPH06202877A (en) * | 1992-12-28 | 1994-07-22 | Fujitsu Ltd | Emulator |
US5581705A (en) * | 1993-12-13 | 1996-12-03 | Cray Research, Inc. | Messaging facility with hardware tail pointer and software implemented head pointer message queue for distributed memory massively parallel processing system |
JPH08263306A (en) * | 1995-03-10 | 1996-10-11 | Xerox Corp | Data-processing system for pipeline data processing and pipeline data-processing method |
US6339752B1 (en) * | 1998-12-15 | 2002-01-15 | Bull Hn Information Systems Inc. | Processor emulation instruction counter virtual memory address translation |
JP3964142B2 (en) * | 2000-08-15 | 2007-08-22 | 株式会社ソニー・コンピュータエンタテインメント | Emulation device and component, information processing device, emulation method, recording medium, program |
US6907519B2 (en) * | 2001-11-29 | 2005-06-14 | Hewlett-Packard Development Company, L.P. | Systems and methods for integrating emulated and native code |
Date | Country | Application | Publication | Status |
---|---|---|---|---|
2006-02-21 | GB | GB0603446A | GB2435335A (en) | Withdrawn (not active) |
2007-02-19 | EP | EP07712762A | EP1987426A1 (en) | Ceased (not active) |
2007-02-19 | US | US12/280,144 | US20090247249A1 (en) | Abandoned (not active) |
2007-02-19 | WO | PCT/GB2007/000587 | WO2007096602A1 (en) | Application Filing (active) |
2007-02-19 | JP | JP2008555861A | JP2009527836A (en) | Pending (active) |
2011-05-30 | JP | JP2011120250A | JP5746916B2 (en) | Active |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120092458A1 (en) * | 2010-10-11 | 2012-04-19 | Texas Instruments Incorporated | Method and Apparatus for Depth-Fill Algorithm for Low-Complexity Stereo Vision |
US10554955B2 (en) * | 2010-10-11 | 2020-02-04 | Texas Instruments Incorporated | Method and apparatus for depth-fill algorithm for low-complexity stereo vision |
US10486064B2 (en) | 2011-11-23 | 2019-11-26 | Sony Interactive Entertainment America Llc | Sharing buffered gameplay in response to an input request |
US10610778B2 (en) | 2011-11-23 | 2020-04-07 | Sony Interactive Entertainment America Llc | Gaming controller |
US10960300B2 (en) | 2011-11-23 | 2021-03-30 | Sony Interactive Entertainment LLC | Sharing user-initiated recorded gameplay with buffered gameplay |
US11065533B2 (en) | 2011-11-23 | 2021-07-20 | Sony Interactive Entertainment LLC | Sharing buffered gameplay in response to an input request |
RU2620956C2 (en) * | 2012-03-13 | 2017-05-30 | СОНИ КОМПЬЮТЕР ЭНТЕРТЕЙНМЕНТ АМЕРИКА ЭлЭлСи | System and method for console game data retrieval and sharing |
US10525347B2 (en) | 2012-03-13 | 2020-01-07 | Sony Interactive Entertainment America Llc | System and method for capturing and sharing console gaming data |
Also Published As
Publication number | Publication date |
---|---|
EP1987426A1 (en) | 2008-11-05 |
JP2009527836A (en) | 2009-07-30 |
WO2007096602A1 (en) | 2007-08-30 |
JP5746916B2 (en) | 2015-07-08 |
JP2011227908A (en) | 2011-11-10 |
GB2435335A (en) | 2007-08-22 |
GB0603446D0 (en) | 2006-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20090247249A1 (en) | | Data processing |
US8705845B2 (en) | | Entertainment device and method of interaction |
US20060269086A1 (en) | | Audio processing |
US9048859B2 (en) | | Method and apparatus for compressing and decompressing data |
US20100048290A1 (en) | | Image combining method, system and apparatus |
US8311384B2 (en) | | Image processing method, apparatus and system |
US8360856B2 (en) | | Entertainment apparatus and method |
US8269691B2 (en) | | Networked computer graphics rendering system with multiple displays for displaying multiple viewing frustums |
US20100328354A1 (en) | | Networked Computer Graphics Rendering System with Multiple Displays |
WO2006024873A2 (en) | | Image rendering |
US20100035678A1 (en) | | Video game |
US8587589B2 (en) | | Image rendering |
WO2006027596A1 (en) | | Data processing |
WO2010151511A1 (en) | | Networked computer graphics rendering system with multiple displays |
EP1889645B1 (en) | | Data processing |
WO2008035027A1 (en) | | Video game |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: SONY COMPUTER ENTERTAINMENT INC., JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HUGHES, JONATHAN COLIN;SARGAISON, STEWART;REEL/FRAME:022256/0328;SIGNING DATES FROM 20090108 TO 20090120 |
| AS | Assignment | Owner name: SONY COMPUTER ENTERTAINMENT INC.,JAPAN Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE TYPOGRAPHICAL ERROR IN THE NAME OF THE FIRST INVENTOR TO READ "HUGHES, COLIN JONATHAN". PREVIOUSLY RECORDED ON REEL 022256 FRAME 0328. ASSIGNOR(S) HEREBY CONFIRMS THE THE FIRST INVENTOR'S NAME WAS ORIGINALLY IDENTIFIED AS "HUGHES, JONATHAN COLIN";ASSIGNOR:HUGHES, COLIN JONATHAN;REEL/FRAME:024273/0686 Effective date: 20090204 |
| AS | Assignment | Owner name: SONY NETWORK ENTERTAINMENT PLATFORM INC., JAPAN Free format text: CHANGE OF NAME;ASSIGNOR:SONY COMPUTER ENTERTAINMENT INC.;REEL/FRAME:027448/0895 Effective date: 20100401 |
| AS | Assignment | Owner name: SONY COMPUTER ENTERTAINMENT INC., JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SONY NETWORK ENTERTAINMENT PLATFORM INC.;REEL/FRAME:027449/0469 Effective date: 20100401 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
| AS | Assignment | Owner name: SONY INTERACTIVE ENTERTAINMENT EUROPE LIMITED, UNITED KINGDOM Free format text: CHANGE OF NAME;ASSIGNOR:SONY COMPUTER ENTERTAINMENT EUROPE LIMITED;REEL/FRAME:043198/0110 Effective date: 20160729 |