USRE39345E1 - Audio/Video/Computer graphics synchronous reproducing/synthesizing system and method - Google Patents

Audio/Video/Computer graphics synchronous reproducing/synthesizing system and method

Info

Publication number
USRE39345E1
Authority
US
United States
Prior art keywords
computer graphics
decoding
video
graphics data
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US10/165,635
Inventor
Jiro Katto
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Priority to US10/165,635 priority Critical patent/USRE39345E1/en
Application granted granted Critical
Publication of USRE39345E1 publication Critical patent/USRE39345E1/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8543 Content authoring using a description language, e.g. Multimedia and Hypermedia information coding Expert Group [MHEG], eXtensible Markup Language [XML]
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4305 Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43074 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of additional data with content streams on the same device, e.g. of EPG data or interactive icon with a TV program
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4341 Demultiplexing of audio and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4348 Demultiplexing of additional data and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462 Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H04N21/4622 Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8146 Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics

Definitions

  • the present invention relates to a system for synchronously reproducing/synthesizing an audio signal, a video signal, and computer graphics data.
  • MPEG1 and MPEG2 are known international coding standards defined by the MPEG (Moving Picture Experts Group), working group (WG) 11 in SC29 under JTC1 (Joint Technical Committee 1), which handles common matters in the data processing field of the ISO (International Organization for Standardization) and the IEC (International Electrotechnical Commission).
  • the MPEG assumes a variety of applications. As for synchronization, systems using phase lock and systems not based on phase lock are assumed.
  • an audio signal coding clock (sampling rate of an audio signal) and a video signal coding clock (frame rate of a video signal) are phase-locked to a common SCR (System Clock Reference).
  • a time stamp representing time of decoding/reproduction is added to a multiplexed bit stream.
  • a decoding system realizes phase lock and sets a time reference. More specifically, synchronization between the coding system and the decoding system is established.
  • the audio signal and the video signal are decoded on the basis of the time stamp, thereby realizing reproduction/display of the audio signal and the video signal which are synchronized with each other.
  • the audio signal and the video signal are independently processed and decoded in accordance with corresponding time stamps added by the coding system.
  • FIG. 16 shows the configuration of a system for reproducing/displaying an audio signal and a video signal from an MPEG system stream based on phase lock, which is described in ISO/IEC 13818-1, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Systems”, November 1994.
  • a demultiplexer 1 separates a bit stream in which an audio signal and a video signal are compressed and multiplexed in accordance with the MPEG standard, into a compressed audio signal stream, a time stamp, the SCR (System Clock Reference) or PCR (Program Clock Reference) of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video signal.
  • An audio buffer 2 buffers the compressed audio signal stream separated by the demultiplexer 1 .
  • An audio PLL (Phase Locked Loop) 3 receives the SCR/PCR of the audio signal separated by the demultiplexer 1 and generates a decoding clock.
  • An audio signal decoder 4 decodes the compressed audio signal stream from the audio buffer 2 at a timing indicated by the time stamp of the audio signal in accordance with the decoding clock supplied from the audio PLL 3 .
  • An audio memory 5 stores the decoded audio signal supplied from the audio signal decoder 4 and outputs the audio signal.
  • a video buffer 7 buffers the compressed video signal stream separated by the demultiplexer 1 .
  • a video PLL 8 receives the SCR/PCR of the video signal separated by the demultiplexer 1 and generates a decoding clock.
  • a video signal decoder 9 decodes the compressed video signal stream from the video buffer 7 at a timing indicated by the time stamp of the video signal in accordance with the decoding clock supplied from the video PLL 8 .
  • a video memory 10 stores the decoded video signal supplied from the video signal decoder 9 and outputs the video signal.
  • the audio PLL 3 and the video PLL 8 control the oscillation frequency such that the SCR/PCR of the coding system, which is supplied from the demultiplexer 1 , matches the timer counter value of the STC (System Time Clock) of the audio PLL 3 and the video PLL 8 .
  • the audio signal and the video signal are decoded at the timing indicated by the time stamp, thereby realizing synchronous reproduction/display of the audio signal and the video signal.
  • a coding system 24 receives an audio signal, a video signal, and computer graphics data, codes these data, and multiplexes them or outputs them independently to a transmission system/storage system 25 .
  • a decoding system 26 extracts the integrated data from the transmission system/storage system 25 , decodes the data, and outputs the audio signal and integrated image data of the video signal and computer graphics. Interaction from the observer through a pointing device such as a mouse or a joystick, e.g., viewpoint movement in a three-dimensional space on the display screen, is received.
  • a typical example is ISO/IEC WD 14772: “The Virtual Reality Modeling Language Specification: The VRML2.0 Specification” (VRML).
  • the VRML is a description language for transmitting/receiving CG data through a network such as the Internet and for forming/sharing a virtual space.
  • the VRML supports ISO/IEC 11172 (MPEG1), which is standardized as an audio/video signal coding standard. More specifically, on the coding system side, the MPEG1 stream used in the VRML description, the sound source position of an audio signal, and a three-dimensional object on which a video signal is mapped are designated. On the decoding system side, a three-dimensional space is formed in accordance with the received VRML description, the audio sound source and the video object are arranged in the three-dimensional space, and the audio signal and the video signal are synchronously reproduced/displayed in accordance with time stamp information contained in the MPEG1 stream.
  • the VRML also supports an animation of a three-dimensional object. More specifically, on the coding system side, the start and end times of each event, the duration of one cycle, the contents of each event, and interaction between events are described in a script. On the decoding system side, a three-dimensional space is formed in accordance with the received VRML description, events are generated on the basis of unique time management, and an animation is displayed.
  • time ti and parameters Xi (color, shape, normal vector, direction, position, and the like) of the object at the time ti are described and defined.
  • the parameters of the object at time t (ti < t < ti+1) are obtained by interpolation, and an animation is displayed.
  • FIG. 17 shows the arrangement of a conventional decoding system (in the VRML, this system is normally called a “browser”) for receiving the VRML description and displaying the three-dimensional space.
  • Conventional decoding systems of this type are, e.g., “Live3D” available from Netscape, “CyberPassage” available from Sony, and “Web-Space” available from SGI, which are publicly accessible through the Internet.
  • an AV buffer 21 buffers a bit stream in which an audio signal and a video signal are compressed and multiplexed.
  • the demultiplexer 1 separates the bit stream in which the audio signal and the video signal are compressed and multiplexed, which is supplied from the AV buffer 21 , into a compressed audio signal stream and a compressed video signal stream.
  • the audio signal decoder 4 decodes the compressed audio signal stream supplied from the demultiplexer 1 .
  • the audio memory 5 stores the decoded audio signal supplied from the audio signal decoder 4 and outputs the audio signal.
  • a modulator 6 modulates the audio signal from the audio memory 5 on the basis of a viewpoint, the viewpoint moving speed, the sound source position, and the sound source moving speed, which are supplied from a rendering engine 15 .
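The patent does not specify how the modulator derives its output from this geometry. As a purely illustrative sketch (function name, attenuation law, and Doppler formula are all assumptions, not the patent's method), the modulation could look like this:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature (assumption)

def modulate(base_gain, viewpoint, viewpoint_vel, source, source_vel):
    """Return (gain, pitch_ratio) for one audio block from the geometry
    the rendering engine reports: viewpoint/source are 3-D positions,
    *_vel their velocities. Inverse-distance attenuation and a classical
    Doppler factor are illustrative choices."""
    dx = [s - v for s, v in zip(source, viewpoint)]
    dist = math.sqrt(sum(d * d for d in dx)) or 1e-6
    unit = [d / dist for d in dx]                  # viewpoint -> source
    # Radial speeds along the line of sight (positive = approaching).
    v_listener = sum(u * v for u, v in zip(unit, viewpoint_vel))
    v_source = -sum(u * v for u, v in zip(unit, source_vel))
    gain = base_gain / dist                        # 1/r attenuation
    pitch = (SPEED_OF_SOUND + v_listener) / (SPEED_OF_SOUND - v_source)
    return gain, pitch

# A source 10 m ahead, closing at 5 m/s: gain 0.1, pitch slightly above 1.
print(modulate(1.0, (0, 0, 0), (0, 0, 0), (10, 0, 0), (-5, 0, 0)))
```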
  • the video signal decoder 9 decodes the compressed video signal stream supplied from the demultiplexer 1 .
  • the video memory 10 stores the decoded video signal supplied from the video signal decoder 9 .
  • a CG buffer 22 buffers a compressed computer graphics data stream (or an uncompressed stream).
  • a CG decoder 12 decodes the compressed computer graphics data stream supplied from the CG buffer 22 and generates decoded computer graphics data, and at the same time, outputs event time management information.
  • a CG memory 13 stores the decoded computer graphics data supplied from the CG decoder 12 and outputs the computer graphics data.
  • An event generator 14 determines reference time on the basis of a clock supplied from a system clock generator 20 and outputs an event driving instruction in accordance with the event time management information (e.g., a time stamp) supplied from the CG decoder 12 .
  • the rendering engine 15 receives the video signal supplied from the video memory 10 , the computer graphics data supplied from the CG memory 13 , the event driving instruction supplied from the event generator 14 , and viewpoint movement data supplied from a viewpoint movement detector 17 and outputs the viewpoint, the viewpoint moving speed, the sound source position, the sound source moving speed, and the synthesized image of the video signal and the computer graphics data.
  • a video/CG memory 16 stores the synthesized image of the video signal and the computer graphics data and outputs the synthesized image.
  • the viewpoint movement detector 17 receives a user input from a pointing device such as a mouse or a joystick and outputs it as viewpoint movement data.
  • Synchronization among the audio signal, the video signal, and the computer graphics data is realized by reproducing/displaying them using, as reference time, the system clock in the decoding system in accordance with the time stamp or event generation timing, as in synchronization not based on phase lock in the MPEG.
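As a rough sketch of this time-stamp-driven reproduction, the event generator can be modeled as a priority queue drained against the reference clock. The class below is illustrative only, not the patent's implementation:

```python
import heapq

class EventGenerator:
    """Fire CG events when the reference time (seconds on the decoding
    system's clock) passes their time stamps. Names are illustrative."""

    def __init__(self):
        self._queue = []  # min-heap of (time_stamp, event)

    def schedule(self, time_stamp, event):
        heapq.heappush(self._queue, (time_stamp, event))

    def poll(self, reference_time):
        """Return every event whose time stamp has been reached."""
        due = []
        while self._queue and self._queue[0][0] <= reference_time:
            due.append(heapq.heappop(self._queue)[1])
        return due

gen = EventGenerator()
gen.schedule(0.40, "start rotation")
gen.schedule(0.20, "change color")
print(gen.poll(0.25))   # ['change color']
```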
  • a synthesizing system for synchronizing a video signal and a computer graphics image is proposed in Japanese Patent Laid-Open No. 7-212653.
  • This conventional synthesizing system delays a fetched video signal by a time required for generation of a computer graphics image, thereby realizing synchronous synthesis/display of the video signal and the computer graphics data.
  • the conventional audio signal/video signal/computer graphics data synthesizing/reproducing system shown in FIG. 17 assumes only synchronization without phase lock and does not address a method of establishing synchronization between the coding system and the decoding system.
  • a preprocessing section 23 enclosed by a broken line operates asynchronously with the coding system and writes decoding results in the corresponding memories 5 , 10 , and 13 .
  • Reproduction of the audio and video signals read out from the memories and the animation of the computer graphics data based on an event driving instruction are executed using the system clock unique to the decoding system as reference time.
  • the conventional decoding system fetches all audio/video/computer graphics mixed data in advance, and starts audio reproduction, video reproduction, and animation of the computer graphics data based on an event driving instruction only after all decoding results are written in the memories.
  • this system can hardly cope with communication/broadcasting applications in which data is transferred continuously.
  • since all processing operations depend on the system clock unique to the decoding system, synchronous reproduction becomes difficult when the transfer delay varies.
  • an audio/video/computer graphics data synchronous reproducing/synthesizing system comprising separation means for separating a bit stream in which an audio signal, a video signal, and computer graphics data are compressed and multiplexed, into a compressed audio signal stream, an audio signal time reference value, a compressed video signal stream, a video signal time reference value, and a compressed computer graphics data stream, first clock generation means for generating a first decoding clock on the basis of the audio signal time reference value from the separation means, first decoding means for decoding the audio signal from the compressed audio signal stream from the separation means and the first decoding clock from the first clock generation means, first storage means for storing the decoded audio signal from the first decoding means, modulation means for modulating the audio signal from the first storage means in accordance with sound source control information, second clock generation means for generating a second decoding clock on the basis of the video signal time reference value from the separation means, second decoding means for decoding the video signal from the compressed video signal stream from the separation means and the second decoding clock from the second clock generation means, second storage means for storing the decoded video signal from the second decoding means, third decoding means for decoding the computer graphics data and event time management information from the compressed computer graphics data stream from the separation means, third storage means for storing the decoded computer graphics data from the third decoding means, event generation means for generating an event driving instruction in accordance with the event time management information from the third decoding means, detection means for detecting viewpoint movement of an observer and outputting viewpoint movement data, and synthesizing means for receiving the video signal from the second storage means, the computer graphics data from the third storage means, the event driving instruction from the event generation means, and the viewpoint movement data from the detection means and outputting the sound source control information and a synthesized image of the video signal and the computer graphics data.
  • FIG. 1 is a block diagram of a synchronous reproducing/synthesizing system according to the first embodiment of the present invention.
  • FIG. 2 is a block diagram of a synchronous reproducing/synthesizing system according to the second embodiment of the present invention.
  • FIG. 3 is a block diagram of a synchronous reproducing/synthesizing system according to the third embodiment of the present invention.
  • FIG. 4 is a block diagram of a synchronous reproducing/synthesizing system according to the fourth embodiment of the present invention.
  • FIG. 5 is a block diagram of a synchronous reproducing/synthesizing system according to the fifth embodiment of the present invention.
  • FIG. 6 is a block diagram of a synchronous reproducing/synthesizing system according to the sixth embodiment of the present invention.
  • FIG. 7 is a block diagram of a synchronous reproducing/synthesizing system according to the seventh embodiment of the present invention.
  • FIG. 8 is a block diagram of a synchronous reproducing/synthesizing system according to the eighth embodiment of the present invention.
  • FIG. 9 is a view showing an audio/video/computer graphics multiplexing method.
  • FIG. 10 is a timing chart for explaining the operation of the system shown in FIG. 1.
  • FIG. 11 is a timing chart for explaining influence of rendering delay in the system shown in FIG. 1.
  • FIG. 12 is a timing chart for explaining the operation of the system shown in FIG. 2.
  • FIG. 13 is a timing chart for explaining the operation of the system shown in FIG. 3.
  • FIG. 14 is a timing chart for explaining the operation of the system shown in FIG. 4.
  • FIG. 15 is a block diagram showing the concept of an audio/video/computer graphics integrated transmission/storage system.
  • FIG. 16 is a block diagram of a conventional audio/video synchronous reproducing system.
  • FIG. 17 is a block diagram of a conventional audio/video/computer graphics synthesizing system.
  • FIG. 1 shows a synchronous reproducing/synthesizing system according to the first embodiment of the present invention.
  • the decoding system in the system of the first embodiment comprises a demultiplexer 101 , an audio buffer 102 , an audio PLL 103 , an audio decoder 104 , an audio memory 105 , a modulator 106 , a video buffer 107 , a video PLL 108 , a video decoder 109 , a video memory 110 , a CG buffer 111 , a CG decoder 112 , a CG memory 113 , an event generator 114 , a rendering engine 115 , a video/CG memory 116 , and a viewpoint movement detector 117 .
  • the demultiplexer 101 separates a bit stream in which an audio signal, a video signal, and computer graphics data are compressed and multiplexed, into a compressed audio signal stream, a time stamp of the audio signal, the SCR (System Clock Reference) or PCR (Program Clock Reference) of the audio signal, a compressed video signal stream, a time stamp of the video signal, the SCR or PCR of the video signal, and a compressed computer graphics data stream.
  • the audio buffer 102 buffers the compressed audio signal stream separated by the demultiplexer 101 .
  • the audio PLL 103 receives the SCR/PCR of the audio signal separated by the demultiplexer 101 and generates a decoding clock.
  • the audio decoder 104 decodes the compressed audio signal stream from the audio buffer 102 at a timing indicated by the time stamp of the audio signal in accordance with the decoding clock supplied from the audio PLL 103 .
  • the audio memory 105 stores the decoded audio signal supplied from the audio decoder 104 and outputs the audio signal.
  • the modulator 106 modulates the audio signal from the audio memory 105 in accordance with the viewpoint, the viewpoint moving speed, the sound source position, and the sound source moving speed which are supplied from the rendering engine 115 .
  • the video buffer 107 buffers the compressed video signal stream separated by the demultiplexer 101 .
  • the video PLL 108 receives the SCR/PCR of the video signal separated by the demultiplexer 101 and generates a decoding clock.
  • the video decoder 109 decodes the compressed video signal stream from the video buffer 107 at a timing indicated by the time stamp of the video signal in accordance with the decoding clock supplied from the video PLL 108 .
  • the video memory 110 stores the decoded video signal supplied from the video decoder 109 and outputs the video signal.
  • the CG buffer 111 buffers the compressed computer graphics data stream separated by the demultiplexer 101 .
  • the CG decoder 112 decodes the compressed computer graphics data stream and generates decoded computer graphics data, and at the same time, outputs event time management information.
  • the CG memory 113 stores the decoded computer graphics data supplied from the CG decoder 112 and outputs the computer graphics data.
  • the event generator 114 determines reference time on the basis of the clock supplied from the video PLL 108 and outputs an event driving instruction in accordance with the event time management information (e.g., a time stamp) supplied from the CG decoder 112 .
  • the rendering engine 115 receives the video signal supplied from the video memory 110 , the computer graphics data supplied from the CG memory 113 , the event driving instruction supplied from the event generator 114 , and the viewpoint movement data supplied from the viewpoint movement detector 117 and outputs a viewpoint, a viewpoint moving speed, a sound source position, a sound source moving speed, and the synthesized image of the video signal and the computer graphics data.
  • the video/CG memory 116 stores the synthesized image of the video signal and the computer graphics data and outputs the synthesized image.
  • the viewpoint movement detector 117 receives a user input from a pointing device such as a mouse or a joystick and outputs it as viewpoint movement data.
  • the compressed streams of an audio signal, a video signal, and computer graphics data are multiplexed, as shown in FIG. 9 .
  • the audio signal and the video signal are divided into audio packets 140 and video packets 150 each constituted by compressed data and a header containing a time stamp.
  • the computer graphics data is divided into CG packets 160 each constituted by compressed data (the data is sometimes not compressed), event time management information, and a header. These packets are put into a group, and a pack header containing an SCR or PCR is added to the group to generate a multiplexed stream 170 .
  • the demultiplexer 101 separates the multiplexed stream 170 into the audio packet 140 , the video packet 150 , the CG packet 160 , and the SCR/PCR again.
  • the audio packet 140 , the video packet 150 , and the CG packet 160 are stored in the audio buffer 102 , the video buffer 107 , and the CG buffer 111 , respectively.
  • the SCR/PCR is output to the audio PLL 103 and the video PLL 108 and used to control the oscillation frequency.
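A minimal sketch of this pack/packet layout and of the routing performed by the demultiplexer 101, with simplified field names assumed for illustration (the patent's actual bitstream syntax follows MPEG and is not reproduced here):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Packet:
    kind: str                     # 'audio' | 'video' | 'cg'
    time_stamp: Optional[float]   # decode/display time; CG packets may
                                  # carry event time management info instead
    payload: bytes

@dataclass
class Pack:
    scr: float                    # SCR/PCR carried in the pack header
    packets: List[Packet]

def demultiplex(stream: List[Pack]):
    """Route packets to per-media buffers and collect SCR/PCR values,
    mirroring the demultiplexer 101 of FIG. 1."""
    buffers = {"audio": [], "video": [], "cg": []}
    clock_refs = []
    for pack in stream:
        clock_refs.append(pack.scr)        # drives the audio/video PLLs
        for pkt in pack.packets:
            buffers[pkt.kind].append(pkt)
    return buffers, clock_refs
```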
  • the audio decoder 104 , the video decoder 109 , and the CG decoder 112 decode the compressed data stored in the audio buffer 102 , the video buffer 107 , and the CG buffer 111 , respectively.
  • the decoding results are written in the audio memory 105 , the video memory 110 , and the CG memory 113 , respectively.
  • event time management information on the time axis is separated by the CG decoder 112 and sent to the event generator 114 .
  • the event generator 114 has a function of matching the description format of time used to reproduce the audio signal and the video signal with the description format of time used for event driving of the computer graphics.
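For example, assuming the media side uses the 90 kHz tick of MPEG time stamps while CG event descriptions use seconds (as VRML animation times do), this matching amounts to a unit conversion:

```python
MEDIA_TICKS_PER_SECOND = 90_000   # MPEG time stamps tick at 90 kHz

def media_ticks_to_seconds(pts: int) -> float:
    """Convert an MPEG time stamp to the seconds-based time used by
    CG event descriptions."""
    return pts / MEDIA_TICKS_PER_SECOND

def seconds_to_media_ticks(t: float) -> int:
    return round(t * MEDIA_TICKS_PER_SECOND)

assert media_ticks_to_seconds(seconds_to_media_ticks(1.5)) == 1.5
```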
  • the clock frequency sent from the coding system, which is recovered from the SCR/PCR, and the time stamps contained in the packet headers are used to decode these signals in synchronism with the coding system.
  • the computer graphics data need not always be decoded in this way because the computer graphics is not based on the concept of a sampling rate while the audio signal and the video signal are sampled at a predetermined period on the time axis.
  • the operation clock of the computer graphics decoder can be arbitrarily set. Conversely, in the decoding system, decoding of computer graphics data must be ended before the computer graphics data is actually displayed.
  • the audio signal, the video signal, and the computer graphics data written in the audio memory 105 , the video memory 110 , and the CG memory 113 respectively are reproduced and synthesized by processing in the modulator 106 and the rendering engine 115 .
  • An interaction from the observer (user) is detected by the viewpoint movement detector 117 through the pointing device such as a mouse and reflected, as viewpoint movement data, to the rendering result from the rendering engine 115 .
  • the behavior of an object along the time axis is controlled in accordance with an event driving instruction generated from the event generator 114 .
  • the decoding clock generated by the video PLL 108 is used as reference time of the event generator 114 .
  • the video signal and the computer graphics data are synchronized with each other without being influenced by a transmission delay or jitter.
  • FIG. 10 shows the flow of decoding and reproduction/display in the first embodiment of the present invention.
  • A#n represents decoding and reproduction of the nth audio signal.
  • V#n represents decoding and reproduction/display of the frame of the nth video signal.
  • CG#n represents decoding and reproduction/display of the scene of the nth computer graphics data.
  • Decoding of the audio or video signal is started/ended at times defined by the time stamp.
  • FIG. 10 shows a case wherein decoding is complete in a time much shorter than the frame interval. However, decoding which requires a longer time can also be performed by delaying reproduction/display by a predetermined time.
  • although the designated times of the time stamps of the audio signal and the video signal are intentionally changed in FIG. 10, the same time may be set, excluding the first decoding.
  • decoding of the computer graphics data is started/ended before reproduction/display.
  • Computer graphics is not displayed while the first scene is being decoded.
  • Computer graphics data of the second scene is decoded in the background of display processing of the first scene.
  • the timing for multiplexing computer graphics data in the coding system must be taken into consideration.
  • computer graphics data is determined in advance. Therefore, when the generation time of the computer graphics data in the decoding system is predicted, the appropriate end time can be easily set unless a scene change frequently occurs.
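A sketch of this scheduling constraint follows; the individual delay terms are illustrative assumptions, since the patent only requires that decoding end before display:

```python
def latest_mux_time(display_time, predicted_decode_time, transfer_delay,
                    safety_margin=0.1):
    """Latest time (seconds) at which the coding system may emit a CG
    scene so that its decoding ends before the scene is displayed."""
    return display_time - transfer_delay - predicted_decode_time - safety_margin

print(latest_mux_time(display_time=10.0,
                      predicted_decode_time=0.8,
                      transfer_delay=0.3))   # 8.8
```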
  • Reproduction/display is realized in accordance with decoding results while synchronizing the audio signal, the video signal, and the computer graphics data.
  • in FIG. 10, however, a delay necessary for rendering is not taken into consideration.
  • FIG. 11 shows the flow of decoding and reproduction/display considering the rendering delay.
  • an operation of performing the first rendering using a video frame to be displayed at a certain time point, starting the second rendering using a video frame to be displayed at the end time of the first rendering, and starting display of a synthesized image at the start of a frame immediately after the end of the first rendering is repeated.
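The following sketch reproduces this pipelining in integer milliseconds; the callbacks `render_ms` and `display` are hypothetical stand-ins, not interfaces from the patent:

```python
def pipelined_render(num_frames, frame_ms, render_ms, display):
    """Render the video frame due 'now', display the synthesized image
    at the first frame boundary after rendering ends, and start the
    next rendering with the frame due at that boundary (FIG. 11)."""
    t = 0
    while True:
        frame = t // frame_ms                  # video frame due at time t
        if frame >= num_frames:
            break
        t_end = t + render_ms(frame)           # rendering occupies [t, t_end)
        t = -(-t_end // frame_ms) * frame_ms   # next frame boundary
        display(frame, at=t)

# Example: 33 ms frames, 70 ms renderings -> every third frame is shown.
pipelined_render(12, 33, lambda f: 70,
                 lambda f, at: print(f"frame {f} displayed at {at} ms"))
```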
  • FIG. 2 shows the system configuration of the second embodiment of the present invention.
  • the same reference numerals as in FIG. 1 denote the same elements in FIG. 2 , and a detailed description thereof will be omitted.
  • the difference from the first embodiment shown in FIG. 1 will be mainly described below.
  • a video deformation circuit 118 is added to the first embodiment.
  • the rendering engine 115 and the video/CG memory 116 are replaced by a rendering engine 130 and a video/CG memory 131 , respectively.
  • the rendering engine 130 has not only the function of the rendering engine 115 but also a function of outputting the two-dimensional projection information of an object on which a video signal is mapped to the video deformation circuit 118 .
  • the video deformation circuit 118 deforms the video signal using the video signal supplied from a video memory 110 and the two-dimensional projection information of the object supplied from the rendering engine 130 and outputs the video signal.
  • the video/CG memory 131 overwrites the output from the video deformation circuit 118 , i.e., the deformed video signal on the output from the rendering engine 130 , i.e., the synthesized image of the video signal and computer graphics data and stores the data.
  • the operation of the system shown in FIG. 2 will be described next.
  • the operation of decoding the audio signal, the video signal, and the computer graphics data and writing these data in an audio memory 105 , the video memory 110 , and a CG memory 113 , respectively, is the same as that of the system shown in FIG. 1 .
  • the rendering engine 130 additionally has a function of outputting the two-dimensional projection information of the object on which the video signal is mapped.
  • the two-dimensional projection information comprises a set of coordinates of a two-dimensional projection plane of a three-dimensional graphic on which the video signal is mapped and binary data which is given in units of coordinates and has a value of “1” when the projection plane is not hidden by a front object or “0” when the projection plane is hidden by a front object.
  • This data can be easily acquired as a by-product obtained upon applying a well-known hidden surface removal algorithm such as “z-buffer”, “depth-sorting” or “binary space-partitioning” in rendering.
  • the two-dimensional projection information is used by the video deformation circuit 118 to deform the video signal and mask the hidden surface portion.
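A sketch of that masked overwrite, with the two-dimensional projection information reduced to per-pixel tuples (an illustrative encoding; the patent leaves the data layout open):

```python
def overwrite_video(synth, video, projection):
    """Overwrite the rendered image with deformed video, as the video
    deformation circuit 118 does. `projection` maps a screen pixel
    (x, y) to a video pixel (u, v), with visible = 0 where a front
    object hides the projection plane. Images are 2-D lists for brevity."""
    for x, y, u, v, visible in projection:
        if visible:                       # masked-out pixels keep the
            synth[y][x] = video[v][u]     # rendering engine's output
    return synth

synth = [["bg"] * 4 for _ in range(2)]
video = [["v00", "v01"], ["v10", "v11"]]
projection = [(0, 0, 0, 0, 1), (1, 0, 1, 0, 1), (2, 0, 0, 1, 0)]
print(overwrite_video(synth, video, projection)[0])  # ['v00', 'v01', 'bg', 'bg']
```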
  • This processing can be realized by using an LSI used in an existing video editor or the like.
  • the output from the video deformation circuit 118 is overwritten on the synthesized image as a rendering result from the rendering engine 130 immediately before the image is written in the video/CG memory 131 , and output as a new synthesized image.
  • the video/CG memory 131 enables preferential writing of the output from the video deformation circuit 118 .
  • FIG. 12 shows the flow of decoding and reproduction/display in the system shown in FIG. 2 .
  • FIG. 12 is different from FIG. 11 in that the video signal is written in synchronism with the audio signal at the same timing as shown in FIG. 9 .
  • a video frame to be displayed at the display time is deformed using the two-dimensional projection information given not as the result of the rendering being performed at the display time but as the result of the rendering performed immediately before, and is overwritten in the video/CG memory 131 .
  • FIG. 3 shows the system configuration of the third embodiment of the present invention.
  • the same reference numerals as in FIG. 1 denote the same elements in FIG. 3 , and a detailed description thereof will be omitted.
  • the difference from the first embodiment shown in FIG. 1 will be mainly described below.
  • a delay circuit 119 is added to the first embodiment shown in FIG. 1 .
  • the rendering engine 115 is replaced by a rendering engine 133 .
  • the delay circuit 119 outputs an audio signal, i.e., the output from a modulator 106 in accordance with a rendering delay supplied from the rendering engine 133 .
  • the operation of the system shown in FIG. 3 will be described next.
  • the operation of decoding the audio signal, the video signal, and the computer graphics data and writing these data in an audio memory 105 , a video memory 110 , and a CG memory 113 , respectively, is the same as that of the system shown in FIG. 1 .
  • the rendering engine 133 additionally has a function of outputting a rendering delay time.
  • the delay circuit 119 has a function of delaying the audio signal in accordance with the rendering delay time supplied from the rendering engine 133 and outputting the audio signal.
  • FIG. 13 shows the flow of decoding and reproduction/display in the system shown in FIG. 3 .
  • FIG. 13 is different from FIG. 11 in that the audio signal and the synthesized image of the video signal and the computer graphics data are synchronously reproduced/displayed although some frame thinning takes place in display of the synthesized image.
  • Setting of the delay time of the audio signal is realized by safely estimating the rendering time of the rendering engine 133 .
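Such a fixed, safely estimated delay can be pictured as a FIFO on decoded audio blocks; the block granularity and sizing below are illustrative assumptions:

```python
from collections import deque

class AudioDelay:
    """Sketch of the delay circuit 119: a FIFO sized to a conservative
    estimate of the rendering delay, expressed in audio blocks."""

    def __init__(self, delay_blocks):
        self._fifo = deque([b"\x00"] * delay_blocks)  # silence while filling

    def push(self, block):
        self._fifo.append(block)
        return self._fifo.popleft()       # output lags input by the delay

delay = AudioDelay(delay_blocks=2)
for i, block in enumerate([b"a", b"b", b"c", b"d"]):
    print(i, delay.push(block))           # b'\x00', b'\x00', b'a', b'b'
```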
  • FIG. 4 shows the system configuration of the fourth embodiment of the present invention.
  • the same reference numerals as in FIG. 2 denote the same elements in FIG. 4 , and a detailed description thereof will be omitted.
  • the difference from the second embodiment shown in FIG. 2 will be mainly described below.
  • a delay circuit 119 is added to the second embodiment shown in FIG. 2 .
  • the delay circuit 119 outputs an audio signal, i.e., the output from a modulator 106 on the basis of a predetermined delay.
  • the rendering engine 130 shown in FIG. 2 is replaced by a rendering engine 134 .
  • the operation of the system shown in FIG. 4 will be described next.
  • the operation of decoding the audio signal, the video signal, and the computer graphics data and writing these data in an audio memory 105 , a video memory 110 , and a CG memory 113 , respectively, and performing rendering by the rendering engine 134 is the same as that of the system shown in FIG. 2 .
  • the delay circuit 119 additionally has a function of delaying the audio signal by a predetermined time and outputting the audio signal.
  • FIG. 14 shows the flow of decoding and reproduction/display in the system shown in FIG. 4 .
  • FIG. 14 is different from FIG. 12 in that the audio signal and the synthesized image of the video signal and the computer graphics data are completely synchronously reproduced/displayed.
  • Setting of the delay time of the audio signal is realized by adaptively measuring the rendering delay time supplied from the rendering engine 134 or safely estimating the rendering time of the rendering engine 134 .
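The adaptive alternative can be sketched as a smoothed estimate of measured rendering times plus a margin; the smoothing constant and margin are assumptions, not values from the patent:

```python
class AdaptiveDelayEstimate:
    """Exponential moving average of observed rendering times, plus a
    safety margin, used as the audio delay (a sketch of 'adaptively
    measuring the rendering delay')."""

    def __init__(self, alpha=0.1, margin=0.010):
        self.alpha, self.margin = alpha, margin
        self.estimate = 0.0

    def observe(self, render_time):
        self.estimate += self.alpha * (render_time - self.estimate)
        return self.estimate + self.margin   # delay applied to the audio

est = AdaptiveDelayEstimate()
for t in (0.062, 0.071, 0.066):
    print(round(est.observe(t), 4))          # estimate rises toward ~70 ms
```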
  • the audio signal and the video signal are decoded from one multiplexed bit stream.
  • the computer graphics data is acquired via a route different from that of the audio signal and the video signal. With this processing, synchronous reproduction can be performed in the fourth embodiment, as in the first embodiment.
  • FIG. 5 shows the system configuration of the fifth embodiment of the present invention.
  • the same reference numerals as in FIG. 1 denote the same elements in FIG. 5 , and a detailed description thereof will be omitted.
  • the difference from the first embodiment will be mainly described below.
  • the demultiplexer 101 in the first embodiment shown in FIG. 1 is replaced by a demultiplexer 132 .
  • the demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video.
  • Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
  • the fifth embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the first embodiment.
  • the remaining operations are the same as those of the first embodiment, and a detailed description thereof will be omitted.
  • FIG. 6 shows the system configuration of the sixth embodiment of the present invention.
  • the same reference numerals as in FIG. 2 denote the same elements in FIG. 6 , and a detailed description thereof will be omitted.
  • the difference from the second embodiment shown in FIG. 2 will be mainly described below.
  • the demultiplexer 101 in the second embodiment shown in FIG. 2 is replaced by a demultiplexer 132 .
  • the demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video.
  • Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
  • the sixth embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the second embodiment.
  • the remaining operations are the same as those of the second embodiment, and a detailed description thereof will be omitted.
  • FIG. 7 shows the system configuration of the seventh embodiment of the present invention.
  • the same reference numerals as in FIG. 3 denote the same elements in FIG. 7 , and a detailed description thereof will be omitted.
  • the difference from the third embodiment shown in FIG. 3 will be mainly described below.
  • the demultiplexer 101 in the third embodiment is replaced by a demultiplexer 132 .
  • the demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video.
  • Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
  • the seventh embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the third embodiment.
  • the remaining operations and advantages are the same as those of the third embodiment, and a detailed description thereof will be omitted.
  • FIG. 8 shows the system configuration of the eighth embodiment of the present invention.
  • the same reference numerals as in FIG. 4 denote the same elements in FIG. 8 , and a detailed description thereof will be omitted.
  • the difference from the fourth embodiment shown in FIG. 4 will be mainly described below.
  • the demultiplexer 101 in the fourth embodiment is replaced by a demultiplexer 132 .
  • the demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video.
  • Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
  • the eighth embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the fourth embodiment.
  • the remaining operations and advantages are the same as those of the fourth embodiment, and a detailed description thereof will be omitted.
  • the first effect of the present invention is that synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
  • reference time information supplied from the coding system is also used as a time reference for the decoding system.
  • the second effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
  • a video frame to be displayed at the current time is deformed using video signal deformation information obtained from an immediately preceding rendering result, and overwritten on the synthesized image.
  • the third effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
  • the output of the audio signal is delayed in consideration of the rendering delay time to be expected.
  • the fourth effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
  • a video frame to be displayed at the current time is deformed using video signal deformation information obtained from an immediately preceding rendering result, and overwritten on the synthesized image, and at the same time, the output of the audio signal is delayed in consideration of the rendering delay time to be expected.
  • the fifth effect of the present invention is that synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed and computer graphics data acquired via a different route.
  • reference time information supplied from the coding system is also used as a time reference for the decoding system.
  • the sixth effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed and computer graphics data acquired via a different route.
  • the seventh effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed and computer graphics data acquired via a different route.
  • the output of the audio signal is delayed in consideration of the rendering delay time to be expected.
  • the eighth effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed and computer graphics data acquired via a different route.
  • a video frame to be displayed at the current time is deformed using video signal deformation information obtained from an immediately preceding rendering result, and overwritten on the synthesized image, and at the same time, the output of the audio signal is delayed in consideration of the rendering delay time to be expected.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Processing Or Creating Images (AREA)
  • Studio Circuits (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)

Abstract

In an audio/video/computer graphics data synchronous reproducing/synthesizing system, a demultiplexer separates a bit stream in which an audio signal, a video signal, and computer graphics data are compressed and multiplexed, into compressed audio and video signal streams, audio and video signal time reference values, and a compressed computer graphics data stream. An audio PLL generates a first decoding clock. An audio decoder decodes the audio signal. An audio memory stores the decoded audio signal. A modulator modulates the audio signal in accordance with sound source control information. A video PLL generates a second decoding clock. A video decoder decodes the video signal. A video memory stores the decoded video signal. A CG decoder decodes the computer graphics data and event time management information. A CG memory stores the decoded computer graphics data. An event generator generates an event driving instruction. A detector detects viewpoint movement of an observer. A rendering engine receives the video signal, the computer graphics data, the event driving instruction, and viewpoint movement data and outputs a synthesized image of the video signal and the computer graphics data and the sound source control information. The method for this system is also disclosed.

Description

BACKGROUND OF THE INVENTION
The present invention relates to a system for synchronously reproducing/synthesizing an audio signal, a video signal, and computer graphics data.
Known international coding standards for a system which compresses, codes, and multiplexes an audio signal (or a speech signal) and a video signal, transmits/stores the multiplexed signal, expands the transmitted/stored signal, and decodes it to the original audio and video signals are MPEG1 and MPEG2, defined by the MPEG (Moving Picture Experts Group) in working group (WG) 11 in SC29 under JTC1 (Joint Technical Committee 1) for handling common matters in the data processing field of the ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission).
The MPEG assumes a variety of applications. As for synchronization, systems using phase lock and systems not based on phase lock are assumed.
In synchronization using phase lock, an audio signal coding clock (sampling rate of an audio signal) and a video signal coding clock (frame rate of a video signal) are phase-locked to a common SCR (System Clock Reference).
A time stamp representing time of decoding/reproduction is added to a multiplexed bit stream. A decoding system realizes phase lock and sets a time reference. More specifically, synchronization between the coding system and the decoding system is established. In addition, the audio signal and the video signal are decoded on the basis of the time stamp, thereby realizing reproduction/display of the audio signal and the video signal which are synchronized with each other.
When phase lock is not employed, the audio signal and the video signal are independently processed and decoded in accordance with corresponding time stamps added by the coding system.
FIG. 16 shows the configuration of a system for reproducing/displaying an audio signal and a video signal from an MPEG system stream based on phase lock, which is described in ISO/IEC 13818-1, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Systems”, November 1994.
Referring to FIG. 16, a demultiplexer 1 separates a bit stream in which an audio signal and a video signal are compressed and multiplexed in accordance with the MPEG standard, into a compressed audio signal stream, a time stamp, the SCR (System Clock Reference) or PCR (Program Clock Reference) of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video signal.
An audio buffer 2 buffers the compressed audio signal stream separated by the demultiplexer 1. An audio PLL (Phase Locked Loop) 3 receives the SCR/PCR of the audio signal separated by the demultiplexer 1 and generates a decoding clock. An audio signal decoder 4 decodes the compressed audio signal stream from the audio buffer 2 at a timing indicated by the time stamp of the audio signal in accordance with the decoding clock supplied from the audio PLL 3. An audio memory 5 stores the decoded audio signal supplied from the audio signal decoder 4 and outputs the audio signal.
A video buffer 7 buffers the compressed video signal stream separated by the demultiplexer 1. A video PLL 8 receives the SCR/PCR of the video signal separated by the demultiplexer 1 and generates a decoding clock. A video signal decoder 9 decodes the compressed video signal stream from the video buffer 7 at a timing indicated by the time stamp of the video signal in accordance with the decoding clock supplied from the video PLL 8. A video memory 10 stores the decoded video signal supplied from the video signal decoder 9 and outputs the video signal.
The audio PLL 3 and the video PLL 8 control the oscillation frequency such that the SCR/PCR of the coding system, which is supplied from the demultiplexer 1, matches the timer counter value of the STC (System Time Clock) of the audio PLL 3 and the video PLL 8. With this processing, the time reference of the decoding system is set, and synchronization between the coding system and the decoding system is established.
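This PLL behavior can be sketched in software as a local clock whose rate is nudged by the error between received SCR/PCR values and the local STC; the gain and units below are illustrative, not parameters from the standard or the patent:

```python
class SoftwareSTC:
    """Sketch of the PLL of FIG. 16: slew a local System Time Clock
    toward received SCR/PCR values instead of free-running."""

    def __init__(self, nominal_hz=90_000, gain=0.05):
        self.rate = nominal_hz      # effective ticks per second
        self.gain = gain
        self.stc = 0.0

    def tick(self, elapsed_seconds):
        self.stc += self.rate * elapsed_seconds

    def on_clock_reference(self, pcr):
        error = pcr - self.stc           # coder time minus local time
        self.rate += self.gain * error   # speed up if behind, slow if ahead

stc = SoftwareSTC()
for k in range(1, 6):
    stc.tick(1.0)
    stc.on_clock_reference(k * 90_060)   # coder runs ~0.07% fast
print(round(stc.rate))                   # rate pulled above 90,000
```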
Next, the audio signal and the video signal are decoded at the timing indicated by the time stamp, thereby realizing synchronous reproduction/display of the audio signal and the video signal.
Along with recent development of the computer and LSI technologies, computer graphics (CG) is popularly used in various fields. Accordingly, attempts for integrating an audio signal (or a speech signal), a video signal, and computer graphics data and transmitting/storing the integrated data have been extensively made.
As shown in FIG. 15, a coding system 24 receives an audio signal, a video signal, and computer graphics data, codes them, and either multiplexes them or outputs them independently to a transmission system/storage system 25.
A decoding system 26 extracts the integrated data from the transmission system/storage system 25, decodes the data, and outputs the audio signal together with integrated image data of the video signal and computer graphics. Interaction from the observer using a pointing device such as a mouse or a joystick, e.g., viewpoint movement in a three-dimensional space on the display screen, is also received. A typical example is ISO/IEC WD 14772: "The Virtual Reality Modeling Language Specification: The VRML2.0 Specification" (VRML).
The VRML is a description language for transmitting/receiving CG data through a network represented by the Internet and for forming/sharing a virtual space. The VRML supports ISO/IEC 11172 (MPEG1), which is standardized as an audio/video signal coding standard. More specifically, on the coding system side, the MPEG1 stream used in the VRML description, the sound source position of an audio signal, and a three-dimensional object on which a video signal is mapped are designated. On the decoding system side, a three-dimensional space is formed in accordance with the received VRML description, the audio sound source and the video object are arranged in the three-dimensional space, and the audio signal and the video signal are synchronously reproduced/displayed in accordance with time stamp information contained in the MPEG1 stream.
The VRML also supports an animation of a three-dimensional object. More specifically, on the coding system side, the start and end times of each event, the duration of one cycle, the contents of each event, and interaction between events are described in a script. On the decoding system side, a three-dimensional space is formed in accordance with the received VRML description, events are generated on the basis of the decoding system's own time management, and an animation is displayed.
Alternatively, on the coding system side, times t_i and parameters X_i (color, shape, normal vector, direction, position, and the like) of the object at each time t_i are described and defined. On the decoding system side, the parameters of the object at a time t (t_i < t < t_(i+1)) are obtained by interpolation, and an animation is displayed.
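For instance, linear interpolation between two described key frames can be sketched as follows; the function and parameter layout are hypothetical, and actual VRML interpolator nodes define the real behavior.

    # Sketch of key-frame interpolation: given times t_i and parameter
    # vectors X_i from the coding side, recover X(t) for t_i < t < t_(i+1).
    from bisect import bisect_right

    def interpolate(key_times, key_params, t):
        # key_times:  ascending list [t_0, t_1, ...]
        # key_params: matching list of parameter tuples (e.g., position, color)
        i = bisect_right(key_times, t) - 1
        i = max(0, min(i, len(key_times) - 2))      # clamp to a valid interval
        t0, t1 = key_times[i], key_times[i + 1]
        a = (t - t0) / (t1 - t0)                    # normalized position in interval
        x0, x1 = key_params[i], key_params[i + 1]
        return tuple((1 - a) * p0 + a * p1 for p0, p1 in zip(x0, x1))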
For the VRML, a binary format replacing the conventional script description has also been examined. This enables reduction of redundancy of a script description or shortening of the processing time for converting the script description into a high-speed rendering format on the decoding system, thereby improving the transmission efficiency and realizing high-speed three-dimensional display.
As a means for reducing the redundancy of a script description, reference can be made to, e.g., M. Deering, "Geometry Compression", Computer Graphics Proceedings, Annual Conference Series, pp. 13-20, August 1995. This reference describes an efficient system for compressing the vertex data representation used to describe a three-dimensional object.
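The flavor of such compression can be caricatured as quantizing vertex coordinates to a grid and coding only the difference from the preceding vertex, which is small for spatially coherent meshes. This toy sketch omits the normal/color coding, mesh structures, and entropy coding of the actual scheme, and the step size is an arbitrary assumption.

    # Toy sketch of vertex delta coding (not Deering's actual format).
    def delta_encode(vertices, step=0.001):
        prev = (0, 0, 0)
        deltas = []
        for v in vertices:
            q = tuple(round(c / step) for c in v)        # quantize to the grid
            deltas.append(tuple(a - b for a, b in zip(q, prev)))
            prev = q
        return deltas                                    # small integers, cheap to code

    def delta_decode(deltas, step=0.001):
        prev = (0, 0, 0)
        vertices = []
        for d in deltas:
            prev = tuple(a + b for a, b in zip(prev, d))
            vertices.append(tuple(c * step for c in prev))
        return vertices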
FIG. 17 shows the arrangement of a conventional decoding system (in the VRML, this system is normally called a "browser") for receiving the VRML description and displaying the three-dimensional space. Conventional decoding systems of this type include, e.g., "Live3D" available from Netscape, "CyberPassage" available from Sony, and "WebSpace" available from SGI, which are publicly available through the Internet.
Referring to FIG. 17, an AV buffer 21 buffers a bit stream in which an audio signal and a video signal are compressed and multiplexed. The demultiplexer 1 separates the bit stream in which the audio signal and the video signal are compressed and multiplexed, which is supplied from the AV buffer 21, into a compressed audio signal stream and a compressed video signal stream.
The audio signal decoder 4 decodes the compressed audio signal stream supplied from the demultiplexer 1. The audio memory 5 stores the decoded audio signal supplied from the audio signal decoder 4 and outputs the audio signal. A modulator 6 modulates the audio signal from the audio memory 5 on the basis of a viewpoint, the viewpoint moving speed, the sound source position, and the sound source moving speed, which are supplied from a rendering engine 15.
The video signal decoder 9 decodes the compressed video signal stream supplied from the demultiplexer 1. The video memory 10 stores the decoded video signal supplied from the video signal decoder 9.
A CG buffer 22 buffers a compressed computer graphics data stream (or an uncompressed stream). A CG decoder 12 decodes the compressed computer graphics data stream supplied from the CG buffer 22 and generates decoded computer graphics data, and at the same time, outputs event time management information. A CG memory 13 stores the decoded computer graphics data supplied from the CG decoder 12 and outputs the computer graphics data.
An event generator 14 determines reference time on the basis of a clock supplied from a system clock generator 20 and outputs an event driving instruction in accordance with the event time management information (e.g., a time stamp) supplied from the CG decoder 12.
The rendering engine 15 receives the video signal supplied from the video memory 10, the computer graphics data supplied from the CG memory 13, the event driving instruction supplied from the event generator 14, and viewpoint movement data supplied from a viewpoint movement detector 17 and outputs the viewpoint, the viewpoint moving speed, the sound source position, the sound source moving speed, and the synthesized image of the video signal and the computer graphics data.
A video/CG memory 16 stores the synthesized image of the video signal and the computer graphics data and outputs the synthesized image. The viewpoint movement detector 17 receives a user input from a pointing device such as a mouse or a joystick and outputs it as viewpoint movement data.
Synchronization among the audio signal, the video signal, and the computer graphics data is realized by reproducing/displaying them using, as reference time, the system clock in the decoding system in accordance with the time stamp or event generation timing, as in synchronization not based on phase lock in the MPEG.
A synthesizing system for synchronizing a video signal and a computer graphics image is proposed in Japanese Patent Laid-Open No. 7-212653. This conventional synthesizing system delays a fetched video signal by a time required for generation of a computer graphics image, thereby realizing synchronous synthesis/display of the video signal and the computer graphics data.
In the conventional audio signal/video signal synthesizing/reproducing system shown in FIG. 16, processing of computer graphics data is not mentioned at all.
In addition, the conventional audio signal/video signal/computer graphics data synthesizing/reproducing system shown in FIG. 17 assumes only synchronization without phase lock, and a method of establishing synchronization between the coding system and the decoding system is not referred to.
In FIG. 17, a preprocessing section 23 enclosed by a broken line operates asynchronously with the coding system and writes decoding results in the corresponding memories 5, 10, and 13.
Reproduction of the audio and video signals read out from the memories and the animation of the computer graphics data based on an event driving instruction are executed using the system clock unique to the decoding system as reference time.
The conventional decoding system (VRML browser) fetches all of the mixed audio/video/computer graphics data in advance, and starts audio reproduction, video reproduction, and animation of the computer graphics data based on an event driving instruction only after all decoding results have been written in the memories.
For this reason, this system can hardly be applied to a communication/broadcasting system that transfers data continuously. In addition, since all processing operations depend on the system clock unique to the decoding system, synchronous reproduction becomes difficult when the transfer delay varies.
The system proposed in Japanese Patent Laid-Open No. 7-212653 has the following problems.
(1) The system does not cope with an audio signal.
(2) The system does not cope with compression.
(3) The system does not separately consider the coding system and the decoding system.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide an audio/video/computer graphics synchronous reproducing/synthesizing system for synchronizing an audio signal, a video signal, and computer graphics data.
In order to achieve the above object, according to the present invention, there is provided an audio/video/computer graphics data synchronous reproducing/synthesizing system comprising separation means for separating a bit stream in which an audio signal, a video signal, and computer graphics data are compressed and multiplexed, into a compressed audio signal stream, an audio signal time reference value, a compressed video signal stream, a video signal time reference value, and a compressed computer graphics data stream, first clock generation means for generating a first decoding clock on the basis of the audio signal time reference value from the separation means, first decoding means for decoding the audio signal from the compressed audio signal stream from the separation means and the first decoding clock from the first clock generation means, first storage means for storing the decoded audio signal from the first decoding means, modulation means for modulating the audio signal from the first storage means in accordance with sound source control information, second clock generation means for generating a second decoding clock on the basis of the video signal time reference value from the separation means, second decoding means for decoding the video signal from the compressed video signal stream from the separation means and the second decoding clock from the second clock generation means, second storage means for storing the decoded video signal from the second decoding means, third decoding means for decoding the computer graphics data and event time management information from the compressed computer graphics data stream from the separation means, third storage means for storing the decoded computer graphics data from the third decoding means, event generation means for generating an event driving instruction on the basis of the second decoding clock from the second clock generation means and the event time management information from the third decoding means, detection means for detecting viewpoint movement of an observer using a pointing device, and rendering means for receiving the video signal stored in the second storage means, the computer graphics data stored in the third storage means, the event driving instruction from the event generation means, and viewpoint movement data from the detection means, and outputting a synthesized image of the video signal and the computer graphics data and the sound source control information used by the modulation means.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a synchronous reproducing/synthesizing system according to the first embodiment of the present invention;
FIG. 2 is a block diagram of a synchronous reproducing/synthesizing system according to the second embodiment of the present invention;
FIG. 3 is a block diagram of a synchronous reproducing/synthesizing system according to the third embodiment of the present invention;
FIG. 4 is a block diagram of a synchronous reproducing/synthesizing system according to the fourth embodiment of the present invention;
FIG. 5 is a block diagram of a synchronous reproducing/synthesizing system according to the fifth embodiment of the present invention;
FIG. 6 is a block diagram of a synchronous reproducing/synthesizing system according to the sixth embodiment of the present invention;
FIG. 7 is a block diagram of a synchronous reproducing/synthesizing system according to the seventh embodiment of the present invention;
FIG. 8 is a block diagram of a synchronous reproducing/synthesizing system according to the eighth embodiment of the present invention;
FIG. 9 is a view showing an audio/video/computer graphics multiplexing method;
FIG. 10 is a timing chart for explaining the operation of the system shown in FIG. 1;
FIG. 11 is a timing chart for explaining influence of rendering delay in the system shown in FIG. 1;
FIG. 12 is a timing chart for explaining the operation of the system shown in FIG. 2;
FIG. 13 is a timing chart for explaining the operation of the system shown in FIG. 3;
FIG. 14 is a timing chart for explaining the operation of the system shown in FIG. 4;
FIG. 15 is a block diagram showing the concept of an audio/video/computer graphics integrated transmission/storage system;
FIG. 16 is a block diagram of a conventional audio/video synchronous reproducing system; and
FIG. 17 is a block diagram of a conventional audio/video/computer graphics synthesizing system.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention will be described below in detail with reference to the accompanying drawings.
[First Embodiment]
FIG. 1 shows a synchronous reproducing/synthesizing system according to the first embodiment of the present invention. Referring to FIG. 1, the decoding system in the system of the first embodiment comprises a demultiplexer 101, an audio buffer 102, an audio PLL 103, an audio decoder 104, an audio memory 105, a modulator 106, a video buffer 107, a video PLL 108, a video decoder 109, a video memory 110, a CG buffer 111, a CG decoder 112, a CG memory 113, an event generator 114, a rendering engine 115, a video/CG memory 116, and a viewpoint movement detector 117.
The demultiplexer 101 separates a bit stream in which an audio signal, a video signal, and computer graphics data are compressed and multiplexed, into a compressed audio signal stream, a time stamp of the audio signal, the SCR (System Clock Reference) or PCR (Program Clock Reference) of the audio signal, a compressed video signal stream, a time stamp of the video signal, the SCR or PCR of the video signal, and a compressed computer graphics data stream.
The audio buffer 102 buffers the compressed audio signal stream separated by the demultiplexer 101.
The audio PLL 103 receives the SCR/PCR of the audio signal separated by the demultiplexer 101 and generates a decoding clock.
The audio decoder 104 decodes the compressed audio signal stream from the audio buffer 102 at a timing indicated by the time stamp of the audio signal in accordance with the decoding clock supplied from the audio PLL 103. The audio memory 105 stores the decoded audio signal supplied from the audio decoder 104 and outputs the audio signal. The modulator 106 modulates the audio signal from the audio memory 105 in accordance with the viewpoint, the viewpoint moving speed, the sound source position, and the sound source moving speed which are supplied from the rendering engine 115.
The video buffer 107 buffers the compressed video signal stream separated by the demultiplexer 101.
The video PLL 108 receives the SCR/PCR of the video signal separated by the demultiplexer 101 and generates a decoding clock.
The video decoder 109 decodes the compressed video signal stream from the video buffer 107 at a timing indicated by the time stamp of the video signal in accordance with the decoding clock supplied from the video PLL 108. The video memory 110 stores the decoded video signal supplied from the video decoder 109 and outputs the video signal.
The CG buffer 111 buffers the compressed computer graphics data stream separated by the demultiplexer 101. The CG decoder 112 decodes the compressed computer graphics data stream supplied from the CG buffer 111 and generates decoded computer graphics data, and at the same time, outputs event time management information.
The CG memory 113 stores the decoded computer graphics data supplied from the CG decoder 112 and outputs the computer graphics data. The event generator 114 determines reference time on the basis of the clock supplied from the video PLL 108 and outputs an event driving instruction in accordance with the event time management information (e.g., a time stamp) supplied from the CG decoder 112.
The rendering engine 115 receives the video signal supplied from the video memory 110, the computer graphics data supplied from the CG memory 113, the event driving instruction supplied from the event generator 114, and the viewpoint movement data supplied from the viewpoint movement detector 117 and outputs a viewpoint, a viewpoint moving speed, a sound source position, a sound source moving speed, and the synthesized image of the video signal and the computer graphics data.
The video/CG memory 116 stores the synthesized image of the video signal and the computer graphics data and outputs the synthesized image. The viewpoint movement detector 117 receives a user input from a pointing device such as a mouse or a joystick and outputs it as viewpoint movement data.
The operation of the system shown in FIG. 1 will be described next. In the coding system, the compressed streams of an audio signal, a video signal, and computer graphics data are multiplexed, as shown in FIG. 9.
First, the audio signal and the video signal are divided into audio packets 140 and video packets 150, each constituted by compressed data and a header containing a time stamp. The computer graphics data is divided into CG packets 160, each constituted by compressed data (in some cases the data is not compressed), event time management information, and a header. These packets are put into a group, and a pack header containing an SCR or PCR is added to the group to generate a multiplexed stream 170.
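The resulting layout can be sketched with simple record types; the field names below are illustrative assumptions, and the actual bit-stream syntax is defined by the MPEG systems standard.

    # Sketch of the multiplexed layout of FIG. 9 (field names are illustrative).
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Packet:
        kind: str                    # "audio" (140), "video" (150), or "cg" (160)
        time_stamp: Optional[int]    # header time stamp, in clock ticks
        event_info: Optional[bytes]  # CG packets only: event time management info
        payload: bytes               # compressed (or, for CG, possibly raw) data

    @dataclass
    class Pack:
        scr: int                     # SCR/PCR carried in the pack header
        packets: List[Packet]        # the grouped audio/video/CG packets

    # A multiplexed stream 170 is then a sequence of such packs.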
In the decoding system shown in FIG. 1, the demultiplexer 101 separates the multiplexed stream 170 into the audio packet 140, the video packet 150, the CG packet 160, and the SCR/PCR again.
The audio packet 140, the video packet 150, and the CG packet 160 are stored in the audio buffer 102, the video buffer 107, and the CG buffer 111, respectively. The SCR/PCR is output to the audio PLL 103 and the video PLL 108 and used to control the oscillation frequency.
The audio decoder 104, the video decoder 109, and the CG decoder 112 decode the compressed data stored in the audio buffer 102, the video buffer 107, and the CG buffer 111, respectively. The decoding results are written in the audio memory 105, the video memory 110, and the CG memory 113, respectively.
For the computer graphics data, event time management information on the time axis is separated by the CG decoder 112 and sent to the event generator 114. The event generator 114 has a function of matching the description format of time used to reproduce the audio signal and the video signal with the description format of time used for event driving of the computer graphics.
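One concrete form of this matching is converting between the 90 kHz tick units of MPEG time stamps and the seconds-based times typically used in CG event descriptions; the helpers below are a hypothetical illustration of that conversion.

    # Hypothetical helpers matching the two time description formats.
    TICKS_PER_SECOND = 90_000        # MPEG time stamps count 90 kHz ticks

    def ticks_to_seconds(ticks):
        return ticks / TICKS_PER_SECOND

    def seconds_to_ticks(event_time_s):
        return round(event_time_s * TICKS_PER_SECOND)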
In decoding the audio signal and the video signal, the clock frequency, which is sent from the coding system and changes depending on the SCR/PCR, and the time stamps contained in the packet headers are used to decode these signals in synchronism with the coding system.
However, the computer graphics data need not always be decoded in this way: while the audio signal and the video signal are sampled at a predetermined period on the time axis, computer graphics is not based on the concept of a sampling rate.
More specifically, in the decoding system, as long as decoding ends before actual display of the computer graphics, the operation clock of the computer graphics decoder can be arbitrarily set. Conversely, in the decoding system, decoding of computer graphics data must be ended before the computer graphics data is actually displayed.
The audio signal, the video signal, and the computer graphics data written in the audio memory 105, the video memory 110, and the CG memory 113, respectively, are reproduced and synthesized by processing in the modulator 106 and the rendering engine 115.
An interaction from the observer (user) is detected by the viewpoint movement detector 117 through the pointing device such as a mouse and is reflected, as viewpoint movement data, in the rendering result from the rendering engine 115. The behavior of an object along the time axis, which is defined by the computer graphics data, is controlled in accordance with an event driving instruction generated by the event generator 114.
The decoding clock generated by the video PLL 108, not the system clock of the decoding system, is used as the reference time of the event generator 114. With this arrangement, the video signal and the computer graphics data are synchronized with each other without being influenced by a transmission delay or jitter.
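A minimal sketch of such an event generator follows, assuming the video PLL exposes its recovered STC as a tick counter (as in the earlier PLL sketch) and that event times have already been converted to the same tick units; the class and method names are assumptions.

    # Sketch: events fire against the decoding clock recovered by the video
    # PLL, not against the free-running system clock of the decoding system.
    import heapq

    class EventGenerator:
        def __init__(self, video_pll):
            self.pll = video_pll     # exposes the recovered STC in ticks
            self.pending = []        # heap of (fire_time, seq, event)
            self.seq = 0             # tie-breaker for simultaneous events

        def schedule(self, fire_time_ticks, event):
            heapq.heappush(self.pending, (fire_time_ticks, self.seq, event))
            self.seq += 1

        def poll(self):
            # Issue event driving instructions whose time has arrived on
            # the coder-locked time axis.
            fired = []
            while self.pending and self.pending[0][0] <= self.pll.stc:
                fired.append(heapq.heappop(self.pending)[2])
            return fired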
FIG. 10 shows the flow of decoding and reproduction/display in the first embodiment of the present invention.
Referring to FIG. 10, A#n represents decoding and reproduction of the nth audio signal. V#n represents decoding and reproduction/display of the frame of the nth video signal. CG#n represents decoding and reproduction/display of the scene of the nth computer graphics data.
Decoding of the audio or video signal is started/ended at times defined by the time stamp. FIG. 10 shows a case wherein decoding is completed in a time much shorter than the frame interval. However, decoding which requires a longer time can also be performed by delaying reproduction/display by a predetermined time. In FIG. 10, the times designated by the time stamps of the audio signal and the video signal are intentionally made different, except for the first decoding, although the same times may be set.
On the other hand, decoding of the computer graphics data is started/ended before reproduction/display. Computer graphics is not displayed while the first scene is being decoded. Computer graphics data of the second scene is decoded in the background of display processing of the first scene.
To guarantee the appropriate end time, the timing for multiplexing computer graphics data in the coding system must be taken into consideration. In many cases, computer graphics data is determined in advance. Therefore, when the generation time of the computer graphics data in the decoding system is predicted, the appropriate end time can be easily set unless a scene change frequently occurs.
Reproduction/display is realized in accordance with decoding results while synchronizing the audio signal, the video signal, and the computer graphics data. In the first embodiment of the present invention, however, a delay necessary for rendering is not taken into consideration.
FIG. 11 shows the flow of decoding and reproduction/display considering the rendering delay. In FIG. 11, the following operation is repeated: the first rendering is performed using the video frame to be displayed at a certain time point; the second rendering is started using the video frame to be displayed at the end time of the first rendering; and display of the synthesized image is started at the start of the frame immediately after the end of the first rendering.
As is apparent from FIG. 11, synchronization between the audio signal and the synthesized image of the video signal and computer graphics data is lost.
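The timing of FIG. 11 corresponds to a rendering loop of roughly the following shape; the objects, method names, and frame pacing are assumptions made purely for illustration.

    # Sketch of the FIG. 11 pipeline: each rendering starts from the video
    # frame that is current at that moment, and its result is shown at the
    # first frame boundary after rendering ends, so the image lags the audio.
    def render_loop(video_memory, cg_memory, engine, display, frame_period):
        t = 0.0
        while True:
            frame = video_memory.frame_at(t)        # frame due for display "now"
            image, delay = engine.render(frame, cg_memory.current_scene())
            # Display at the start of the frame immediately after rendering ends.
            t_display = ((t + delay) // frame_period + 1) * frame_period
            display.show(image, at=t_display)
            t = t_display                           # next rendering starts here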
[Second Embodiment]
FIG. 2 shows the system configuration of the second embodiment of the present invention. The same reference numerals as in FIG. 1 denote the same elements in FIG. 2, and a detailed description thereof will be omitted. The differences from the first embodiment shown in FIG. 1 will be mainly described below.
In the system of the second embodiment shown in FIG. 2, a video deformation circuit 118 is added to the first embodiment. In addition, the rendering engine 115 and the video/CG memory 116 are replaced by a rendering engine 130 and a video/CG memory 131, respectively.
The rendering engine 130 has not only the function of the rendering engine 115 but also a function of outputting the two-dimensional projection information of an object on which a video signal is mapped to the video deformation circuit 118. The video deformation circuit 118 deforms the video signal using the video signal supplied from a video memory 110 and the two-dimensional projection information of the object supplied from the rendering engine 130 and outputs the video signal.
The video/CG memory 131 overwrites the output from the video deformation circuit 118, i.e., the deformed video signal, on the output from the rendering engine 130, i.e., the synthesized image of the video signal and the computer graphics data, and stores the result.
The operation of the system shown in FIG. 2 will be described next. The operation of decoding the audio signal, the video signal, and the computer graphics data and writing these data in an audio memory 105, the video memory 110, and a CG memory 113, respectively, is the same as that of the system shown in FIG. 1. The rendering engine 130 additionally has a function of outputting the two-dimensional projection information of the object on which the video signal is mapped.
More specifically, the two-dimensional projection information comprises a set of coordinates of a two-dimensional projection plane of a three-dimensional graphic on which the video signal is mapped and binary data which is given in units of coordinates and has a value of “1” when the projection plane is not hidden by a front object or “0” when the projection plane is hidden by a front object. This data can be easily acquired as a by-product obtained upon applying a well-known hidden surface removal algorithm such as “z-buffer”, “depth-sorting” or “binary space-partitioning” in rendering.
The two-dimensional projection information is used by the video deformation circuit 118 to deform the video signal and mask the hidden surface portion. This processing can be realized by using an LSI used in an existing video editor or the like.
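As a sketch, this information can be collected while resolving visibility with a z-buffer; the fragment format and the "video" object tag below are assumptions, and a real renderer obtains the same data as a by-product of its own hidden surface removal.

    # Sketch: gather the two-dimensional projection information of the
    # video-mapped object during z-buffer hidden surface removal.
    def project_with_mask(fragments, width, height):
        INF = float("inf")
        zbuf = [[INF] * width for _ in range(height)]
        owner = [[None] * width for _ in range(height)]
        video_pixels = set()
        for x, y, z, obj in fragments:              # rasterized fragments
            if obj == "video":
                video_pixels.add((x, y))            # footprint of the video object
            if z < zbuf[y][x]:                      # standard depth test
                zbuf[y][x] = z
                owner[y][x] = obj
        # Each projected coordinate is tagged 1 when the projection plane is
        # not hidden by a front object there, or 0 when it is hidden.
        return [(x, y, 1 if owner[y][x] == "video" else 0)
                for (x, y) in sorted(video_pixels)]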
The output from the video deformation circuit 118 is overwritten on the synthesized image as a rendering result from the rendering engine 130 immediately before the image is written in the video/CG memory 131, and output as a new synthesized image. When the rendering engine 130 is performing the rendering operation, the video/CG memory 131 enables preferential writing of the output from the video deformation circuit 118.
FIG. 12 shows the flow of decoding and reproduction/display in the system shown in FIG. 2. FIG. 12 is different from FIG. 11 in that the video signal is written in synchronism with the audio signal, at the same timing as shown in FIG. 10.
To display the video signal, the video frame to be displayed at the display time is deformed using the two-dimensional projection information given not by the rendering being performed at the display time but by the rendering performed immediately before, and is overwritten in the video/CG memory 131.
[Third Embodiment]
FIG. 3 shows the system configuration of the third embodiment of the present invention. The same reference numerals as in FIG. 1 denote the same elements in FIG. 3, and a detailed description thereof will be omitted. The differences from the first embodiment shown in FIG. 1 will be mainly described below.
In the system of the third embodiment shown in FIG. 3, a delay circuit 119 is added to the first embodiment shown in FIG. 1. In addition, the rendering engine 115 is replaced by a rendering engine 133. The delay circuit 119 outputs an audio signal, i.e., the output from a modulator 106 in accordance with a rendering delay supplied from the rendering engine 133.
The operation of the system shown in FIG. 3 will be described next. The operation of decoding the audio signal, the video signal, and the computer graphics data and writing these data in an audio memory 105, a video memory 110, and a CG memory 113, respectively, is the same as that of the system shown in FIG. 1. The rendering engine 133 additionally has a function of outputting a rendering delay time. The delay circuit 119 has a function of delaying the audio signal in accordance with the rendering delay time supplied from the rendering engine 133 and outputting the audio signal.
FIG. 13 shows the flow of decoding and reproduction/display in the system shown in FIG. 3. FIG. 13 is different from FIG. 11 in that the audio signal and the synthesized image of the video signal and the computer graphics data are synchronously reproduced/displayed although some frame thinning takes place in display of the synthesized image.
Setting of the delay time of the audio signal is realized by safely estimating the rendering time of the rendering engine 133.
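Such a fixed delay, sized from a safe (worst-case) estimate of the rendering time, can be sketched as a ring buffer of audio samples; the sample rate and class name are assumptions.

    # Sketch of the delay circuit: delay the audio by a conservatively
    # estimated rendering time so it stays aligned with the delayed image.
    from collections import deque

    class AudioDelay:
        def __init__(self, delay_seconds, sample_rate=48_000):
            n = int(delay_seconds * sample_rate)
            self.buf = deque([0.0] * n, maxlen=n)   # primed with silence

        def process(self, sample):
            out = self.buf[0]          # the sample from delay_seconds ago
            self.buf.append(sample)    # maxlen deque drops the oldest entry
            return out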
[Fourth Embodiment]
FIG. 4 shows the system configuration of the fourth embodiment of the present invention. The same reference numerals as in FIG. 2 denote the same elements in FIG. 4, and a detailed description thereof will be omitted. The differences from the second embodiment shown in FIG. 2 will be mainly described below.
In the system of the fourth embodiment shown in FIG. 4, a delay circuit 119 is added to the second embodiment shown in FIG. 2. The delay circuit 119 outputs an audio signal, i.e., the output from a modulator 106 on the basis of a predetermined delay. In addition, in the system of the fourth embodiment, the rendering engine 130 shown in FIG. 2 is replaced by a rendering engine 134.
The operation of the system shown in FIG. 4 will be described next. The operation of decoding the audio signal, the video signal, and the computer graphics data and writing these data in an audio memory 105, a video memory 110, and a CG memory 113, respectively, and performing rendering by the rendering engine 134 is the same as that of the system shown in FIG. 2. The delay circuit 119 additionally has a function of delaying the audio signal by a predetermined time and outputting the audio signal.
FIG. 14 shows the flow of decoding and reproduction/display in the system shown in FIG. 4. FIG. 14 is different from FIG. 12 in that the audio signal and the synthesized image of the video signal and the computer graphics data are completely synchronously reproduced/displayed.
Setting of the delay time of the audio signal is realized by adaptively measuring the rendering delay time supplied from the rendering engine 134 or safely estimating the rendering time of the rendering engine 134.
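The adaptive alternative can be sketched as smoothing the delay values reported by the rendering engine; the smoothing factor and safety margin below are assumed choices, not values prescribed by the embodiment.

    # Sketch: adapt the audio delay to the rendering delay reported by the
    # rendering engine, smoothed to avoid jitter in the audio timing.
    class AdaptiveDelayEstimator:
        def __init__(self, alpha=0.1, margin=0.005):
            self.alpha = alpha         # smoothing factor (assumed)
            self.margin = margin       # safety margin in seconds (assumed)
            self.estimate = 0.0

        def update(self, measured_render_delay):
            self.estimate = ((1 - self.alpha) * self.estimate
                             + self.alpha * measured_render_delay)
            return self.estimate + self.margin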
As in the existing VRML, the audio signal and the video signal are decoded from one multiplexed bit stream. However, the computer graphics data is acquired via a route different from that of the audio signal and the video signal. With this processing, synchronous reproduction can be performed in the fourth embodiment, as in the first embodiment.
[Fifth Embodiment]
FIG. 5 shows the system configuration of the fifth embodiment of the present invention. The same reference numerals as in FIG. 1 denote the same elements in FIG. 5, and a detailed description thereof will be omitted. The differences from the first embodiment will be mainly described below.
In the system of the fifth embodiment shown in FIG. 5, the demultiplexer 101 in the first embodiment shown in FIG. 1 is replaced by a demultiplexer 132.
In the system shown in FIG. 5, the demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video. Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
The fifth embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the first embodiment. The remaining operations are the same as those of the first embodiment, and a detailed description thereof will be omitted.
[Sixth Embodiment]
FIG. 6 shows the system configuration of the sixth embodiment of the present invention. The same reference numerals as in FIG. 2 denote the same elements in FIG. 6, and a detailed description thereof will be omitted. The differences from the second embodiment shown in FIG. 2 will be mainly described below.
In the system of the sixth embodiment shown in FIG. 6, the demultiplexer 101 in the second embodiment shown in FIG. 2 is replaced by a demultiplexer 132.
The demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video. Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
The sixth embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the second embodiment. The remaining operations are the same as those of the second embodiment, and a detailed description thereof will be omitted.
[Seventh Embodiment]
FIG. 7 shows the system configuration of the seventh embodiment of the present invention. The same reference numerals as in FIG. 3 denote the same elements in FIG. 7, and a detailed description thereof will be omitted. The differences from the third embodiment shown in FIG. 3 will be mainly described below.
In the system of the seventh embodiment shown in FIG. 7, the demultiplexer 101 in the third embodiment is replaced by a demultiplexer 132.
The demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video. Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
The seventh embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the third embodiment. The remaining operations and advantages are the same as those of the third embodiment, and a detailed description thereof will be omitted.
[Eighth Embodiment]
FIG. 8 shows the system configuration of the eighth embodiment of the present invention. The same reference numerals as in FIG. 4 denote the same elements in FIG. 8, and a detailed description thereof will be omitted. The differences from the fourth embodiment shown in FIG. 4 will be mainly described below.
In the system of the eighth embodiment shown in FIG. 8, the demultiplexer 101 in the fourth embodiment is replaced by a demultiplexer 132.
The demultiplexer 132 separates compressed audio and video signal streams corresponding to the existing MPEG into a compressed audio signal stream, a time stamp, the SCR or PCR of the audio signal, a compressed video signal stream, a time stamp, and the SCR or PCR of the video. Computer graphics data is fetched by a CG buffer 111 via a route different from that of the compressed streams.
The eighth embodiment is advantageous in that the existing MPEG system and VRML system are fused without being changed, and synchronous reproduction is realized, unlike the fourth embodiment. The remaining operations and advantages are the same as those of the fourth embodiment, and a detailed description thereof will be omitted.
As has been described above, according to the present invention, the following effects are obtained.
(1) The first effect of the present invention is that synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
The reason for this is that, in the present invention, reference time information supplied from the coding system is also used as a time reference for the decoding system.
(2) The second effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
The reason for this is that, in the present invention, a video frame to be displayed at the current time is deformed using video signal deformation information obtained from an immediately preceding rendering result, and overwritten on the synthesized image.
(3) The third effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
The reason for this is that, in the present invention, the output of the audio signal is delayed in consideration of the rendering delay time to be expected.
(4) The fourth effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal, a video signal, and computer graphics data are multiplexed.
The reason for this is that, in the present invention, a video frame to be displayed at the current time is deformed using video signal deformation information obtained from an immediately preceding rendering result, and overwritten on the synthesized image, and at the same time, the output of the audio signal is delayed in consideration of the rendering delay time to be expected.
(5) The fifth effect of the present invention is that synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed, together with computer graphics data acquired via a different route.
The reason for this is that, in the present invention, reference time information supplied from the coding system is also used as a time reference for the decoding system.
(6) The sixth effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed, together with computer graphics data acquired via a different route.
The reason for this is that a video frame to be displayed at the current time is deformed using video signal deformation information obtained from an immediately preceding rendering result, and overwritten on the synthesized image.
(7) The seventh effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed, together with computer graphics data acquired via a different route.
The reason for this is that, in the present invention, the output of the audio signal is delayed in consideration of the rendering delay time to be expected.
(8) The eighth effect of the present invention is that even when a rendering delay is generated, synchronous reproduction/synthesis of audio/video/computer graphics data is enabled from a compressed stream in which an audio signal and a video signal are multiplexed, together with computer graphics data acquired via a different route.
The reason for this is that, in the present invention, a video frame to be displayed at the current time is deformed using video signal deformation information obtained from an immediately preceding rendering result, and overwritten on the synthesized image, and at the same time, the output of the audio signal is delayed in consideration of the rendering delay time to be expected.

Claims (53)

1. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
separation means for separating a bit stream in which an audio signal, a video signal, and computer graphics data are compressed and multiplexed, into a compressed audio signal stream, an audio signal time reference value, a compressed video signal stream, a video signal time reference value, and a compressed computer graphics data stream;
first clock generation means for generating a first decoding clock on the basis of the audio signal time reference value from said separation means;
first decoding means for decoding the audio signal from the compressed audio signal stream from said separation means and the first decoding clock from said first clock generation means;
first storage means for storing the decoded audio signal from said first decoding means;
modulation means for modulating the audio signal from said first storage means in accordance with sound source control information;
second clock generation means for generating a second decoding clock on the basis of the video signal time reference value from said separation means;
second decoding means for decoding the video signal from the compressed video signal stream from said separation means and the second decoding clock from said second clock generation means;
second storage means for storing the decoded video signal from said second decoding means;
third decoding means for decoding the computer graphics data and event time management information from the compressed computer graphics data stream from said separation means;
third storage means for storing the decoded computer graphics data from said third decoding means;
event generation means for generating an event driving instruction on the basis of the second decoding clock from said second clock generation means and the event time management information from said third decoding means;
detection means for detecting viewpoint movement of an observer using a pointing device; and
rendering means for receiving the video signal stored in said second storage means, the computer graphics data stored in said third storage means, the event driving instruction from said event generation means, and viewpoint movement data from said detection means, and outputting a synthesized image of the video signal and the computer graphics data and the sound source control information used by said modulation means.
2. An apparatus according to claim 1, further comprising:
first buffer means for buffering the compressed audio signal stream from said separation means and outputting the compressed audio signal stream to said first decoding means;
second buffer means for buffering the compressed video signal stream from said separation means and outputting the compressed video signal stream to said second decoding means; and
third buffer means for buffering the compressed computer graphics data stream from said separation means and outputting the compressed computer graphics data stream to said third decoding means.
3. An apparatus according to claim 1, further comprising fourth storage means for storing the synthesized image of the video signal and the computer graphics data from said rendering means.
4. An apparatus according to claim 3, further comprising video deformation means for deforming the video signal from said second storage means on the basis of two-dimensional projection information of an object, and wherein
said rendering means outputs the two-dimensional projection information of said object on which the video signal is mapped to said video deformation means, and
said fourth storage means overwrites the deformed video signal from said video deformation means on the synthesized image of the video signal and the computer graphics data from said rendering means and stores the synthesized image.
5. An apparatus according to claim 1, further comprising delay means for delaying the modulated audio signal from said modulation means.
6. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
separation means for separating a bit stream in which an audio signal and a video signal are compressed and multiplexed, into a compressed audio signal stream, an audio signal time reference value, a compressed video signal stream, and a video signal time reference value;
first clock generation means for generating a first decoding clock on the basis of the audio signal time reference value from said separation means;
first decoding means for decoding the audio signal from the compressed audio signal stream from said separation means and the first decoding clock from said first clock generation means;
first storage means for storing the decoded audio signal from said first decoding means;
modulation means for modulating the audio signal from said first storage means in accordance with sound source control information;
second clock generation means for generating a second decoding clock on the basis of the video signal time reference value from said separation means;
second decoding means for decoding the video signal from the compressed video signal stream from said separation means and the second decoding clock from said second clock generation means;
second storage means for storing the decoded video signal from said second decoding means;
third decoding means for decoding computer graphics data and event time management information from a compressed computer graphics data stream which is not multiplexed with the compressed audio and video signal streams;
third storage means for storing the decoded computer graphics data from said third decoding means;
event generation means for generating an event driving instruction on the basis of the second decoding clock from said second clock generation means and the event time management information from said third decoding means;
detection means for detecting viewpoint movement of an observer using a pointing device; and
rendering means for receiving the video signal stored in said second storage means, the computer graphics data stored in said third storage means, the event driving instruction from said event generation means, and viewpoint movement data from said detection means, and outputting a synthesized image of the video signal and the computer graphics data and the sound source control information used by said modulation means.
7. An apparatus according to claim 6, further comprising:
first buffer means for buffering the compressed audio signal stream from said separation means and outputting the compressed audio signal stream to said first decoding means;
second buffer means for buffering the compressed video signal stream from said separation means and outputting the compressed video signal stream to said second decoding means; and
third buffer means for buffering the compressed computer graphics data stream which is not multiplexed with the compressed audio and video signal streams.
8. An apparatus according to claim 6, further comprising fourth storage means for storing the synthesized image of the video signal and the computer graphics data from said rendering means.
9. An apparatus according to claim 8, further comprising
video deformation means for deforming the video signal from said second storage means on the basis of two-dimensional projection information of an object, and wherein
said rendering means outputs the two-dimensional projection information of said object on which the video signal is mapped to said video deformation means, and
said fourth storage means overwrites the deformed video signal from said video deformation means on the synthesized image of the video signal and the computer graphics data from said rendering means and stores the synthesized image.
10. An apparatus according to claim 6, further comprising delay means for delaying the modulated audio signal from said modulation means.
11. An audio/video/computer graphics data synchronous reproducing/synthesizing method, comprising the steps of:
separating a bit stream in which an audio signal, a video signal, and computer graphics data are compressed and multiplexed, into compressed audio and video signal streams, audio and video signal identification reference values, and a compressed computer graphics data stream;
generating a signal clock from the audio and video identification reference values;
decoding the audio and video signals from the compressed audio and video signal streams, respectively, using the generated signal clock;
decoding event time reference information from the compressed computer graphics data stream, and
generating an event driving instruction from the generated signal clock and the decoded event time reference information to synchronize a synthesized image of the video signal and the computer graphics data with the audio signal.
12. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
an audio decoder decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
a video decoder decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
a computer graphics decoder decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of said second timing information and said event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
13. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a clock generator generating a first decoding clock and a second decoding clock on the basis of a signal time reference value;
an audio decoder decoding the compressed audio signal stream on the basis of said first decoding clock to produce an audio signal;
a video decoder decoding the compressed video signal stream on the basis of said second decoding clock to produce a video signal;
a computer graphics decoder decoding the compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of said second decoding clock and said event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
14. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a demultiplexer receiving a compressed audio signal stream, a compressed video signal stream, a signal time reference value, and a compressed computer graphics data stream;
a clock generator generating a first decoding clock and a second decoding clock on the basis of the signal time reference value;
an audio decoder decoding the compressed audio signal stream on the basis of the first decoding clock to produce an audio signal;
a video decoder decoding the compressed video signal stream on the basis of the second decoding clock to produce a video signal;
a computer graphics decoder decoding the compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of the second decoding clock and the event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
15. An apparatus according to claim 14, further comprising:
a modulator modulating the decoded audio signal in accordance with sound source control information,
wherein said rendering engine outputs the sound source control information used by said modulator.
16. An apparatus according to claim 14, further comprising:
a viewpoint movement detector detecting viewpoint movement of an observer indicated by a pointing device;
wherein said rendering engine receives viewpoint movement data from said viewpoint movement detector.
17. An apparatus according to claim 14, further comprising:
a video/computer graphics (CG) memory storing the synthesized image of the decoded video signal and the decoded computer graphics data; and
a video deformation circuit deforming the video signal in said video/CG memory on the basis of two-dimensional projection information of an object,
wherein said rendering engine outputs the two-dimensional projection information of said object on which the video signal is mapped to said video deformation circuit, and
wherein said video/CG memory overwrites the deformed video signal from said video deformation circuit on the synthesized image of the video signal and the computer graphics data from said rendering engine and stores the synthesized image.
18. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
decoding a compressed audio signal stream on the basis of a first timing information to produce an audio signal;
decoding a compressed video signal stream on the basis of a second timing information to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of said second timing information and said event time management information to synchronize the video signal with the computer graphics data.
19. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
generating a first decoding clock and a second decoding clock on the basis of a signal time reference value;
decoding a compressed audio signal stream on the basis of said first decoding clock to produce an audio signal;
decoding a compressed video signal stream on the basis of said second decoding clock to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of said second decoding clock and said event time management information to synchronize the video signal with the computer graphics data.
20. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
receiving a compressed audio signal stream, a compressed video signal stream, a signal time reference value, and a compressed computer graphics data stream;
generating a first decoding clock and a second decoding clock on the basis of the signal time reference value;
decoding the compressed audio signal stream on the basis of the first decoding clock to produce an audio signal;
decoding the compressed video signal stream on the basis of the second decoding clock to produce a video signal;
decoding the compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of the second decoding clock and the event time management information to synchronize the video signal with the computer graphics data.
21. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
an audio decoder decoding a compressed audio signal stream on the basis of an audio timestamp and a first decoding clock to produce an audio signal;
a video decoder decoding a compressed video signal stream on the basis of a video timestamp and a second decoding clock to produce a video signal;
a computer graphics decoder decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of said second decoding clock and said event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
22. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a clock generator generating a first decoding clock on the basis of an audio clock reference, and generating a second decoding clock on the basis of a video clock reference;
an audio decoder decoding a compressed audio signal stream on the basis of an audio timestamp and said first decoding clock to produce an audio signal;
a video decoder decoding a compressed video signal stream on the basis of a video timestamp and said second decoding clock to produce a video signal;
a computer graphics decoder decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of said second decoding clock and said event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
23. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a demultiplexer receiving a compressed audio signal stream, a compressed video signal stream, an audio clock reference, a video clock reference, an audio timestamp, a video timestamp and a compressed computer graphics data stream;
a clock generator generating a first decoding clock on the basis of said audio clock reference, and generating a second decoding clock on the basis of the video clock reference;
an audio decoder decoding the compressed audio signal stream on the basis of the audio timestamp and the first decoding clock to produce an audio signal;
a video decoder decoding the compressed video signal stream on the basis of the video timestamp and the second decoding clock to produce a video signal;
a computer graphics decoder decoding the compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of the second decoding clock and the event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
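The event generator of claim 23 is the piece that ties the computer graphics timeline to the video clock: it reads the event time management information decoded from the computer graphics stream and issues an event driving instruction whenever the second decoding clock passes an event time. A sketch, under the assumption that the event times arrive as an ascending list in the same 90 kHz units:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef void (*event_cb)(size_t event_index);

typedef struct {
    const int64_t *times;  /* ascending event times from the CG stream */
    size_t count, next;
} event_gen_t;

/* Issue an event driving instruction for every event whose time the
 * second (video) decoding clock has reached. */
static void event_gen_pump(event_gen_t *g, int64_t video_clock, event_cb fire)
{
    while (g->next < g->count && g->times[g->next] <= video_clock)
        fire(g->next++);
}

static void drive_cg_event(size_t i) { printf("drive CG event %zu\n", i); }

int main(void)
{
    const int64_t times[] = { 0, 90000, 180000 }; /* 0 s, 1 s, 2 s */
    event_gen_t g = { times, 3, 0 };
    event_gen_pump(&g, 95000, drive_cg_event);    /* fires events 0 and 1 */
    return 0;
}

Because the comparison uses the video decoding clock rather than wall time, computer graphics events stay locked to decoded video frames even while that clock is being slewed.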
24. An apparatus according to claim 23, further comprising:
a modulator modulating the decoded audio signal in accordance with sound source control information,
wherein said rendering engine outputs the sound source control information used by said modulator.
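Claim 24 leaves open what the sound source control information carries; one plausible reading, consistent with a rendered 3-D scene, is the distance from the viewpoint to the sound source, with the modulator applying a distance-dependent gain. That reading, and the inverse-distance law below, are assumptions for illustration only.

#include <math.h>
#include <stddef.h>

/* Unity gain inside the reference distance, 1/d falloff beyond it. */
static float source_gain(float distance, float ref_distance)
{
    return ref_distance / fmaxf(distance, ref_distance);
}

/* Modulate a block of decoded PCM samples by the computed gain. */
static void modulate(float *pcm, size_t n, float gain)
{
    for (size_t i = 0; i < n; i++)
        pcm[i] *= gain;
}

Each time the rendering engine re-renders the scene, it would hand the modulator fresh control information, so the audio level tracks the moving source.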
25. An apparatus according to claim 23, further comprising:
a viewpoint movement detector detecting viewpoint movement of an observer indicated by a pointing device;
wherein said rendering engine receives viewpoint movement data from said viewpoint movement detector.
26. An apparatus according to claim 23, further comprising:
a video/computer graphics (CG) memory storing the synthesized image of the decoded video signal and the decoded computer graphics data; and
a video deformation circuit deforming the video signal in said video/CG memory on the basis of two-dimensional projection information of an object,
wherein said rendering engine outputs the two-dimensional projection information of said object on which the video signal is mapped to said video deformation circuit, and
wherein said video/CG memory overwrites the deformed video signal from said video deformation circuit on the synthesized image of the video signal and the computer graphics data from said rendering engine and stores the synthesized image.
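Claim 26 maps the decoded video onto an object's two-dimensional projection: the rendering engine reports where the object lands on screen, the deformation circuit warps the video frame to fit, and the warped result is overwritten onto the synthesized image in the video/CG memory. The coarse forward-mapping sketch below assumes the projection information is the four projected corner points of a quadrilateral and uses 8-bit grayscale buffers for brevity.

#include <stdint.h>

typedef struct { float x, y; } pt2;

/* Warp src (sw x sh) onto the quad (TL, TR, BR, BL) inside dst (dw x dh),
 * overwriting the synthesized image already stored there. Forward mapping
 * on a fixed grid; a real circuit would inverse-map per destination pixel. */
static void deform_video(const uint8_t *src, int sw, int sh,
                         uint8_t *dst, int dw, int dh,
                         const pt2 quad[4])
{
    const int N = 255;                    /* sampling grid resolution */
    for (int j = 0; j <= N; j++) {
        float v = (float)j / N;
        for (int i = 0; i <= N; i++) {
            float u = (float)i / N;
            /* bilinear point on the projected quadrilateral */
            float x = (1-u)*(1-v)*quad[0].x + u*(1-v)*quad[1].x
                    + u*v*quad[2].x + (1-u)*v*quad[3].x;
            float y = (1-u)*(1-v)*quad[0].y + u*(1-v)*quad[1].y
                    + u*v*quad[2].y + (1-u)*v*quad[3].y;
            int dx = (int)x, dy = (int)y;
            if (dx < 0 || dx >= dw || dy < 0 || dy >= dh)
                continue;
            dst[dy * dw + dx] =
                src[(int)(v * (sh - 1)) * sw + (int)(u * (sw - 1))];
        }
    }
}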
27. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
decoding a compressed audio signal stream on the basis of an audio timestamp and a first decoding clock to produce an audio signal;
decoding a compressed video signal stream on the basis of a video timestamp and a second decoding clock to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of said second decoding clock and said event time management information to synchronize the video signal with the computer graphics data.
28. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
generating a first decoding clock on the basis of an audio clock reference, and generating a second decoding clock on the basis of a video clock reference;
decoding a compressed audio signal stream on the basis of an audio timestamp and said first decoding clock to produce an audio signal;
decoding a compressed video signal stream on the basis of a video timestamp and said second decoding clock to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of said second decoding clock and said event time management information to synchronize the video signal with the computer graphics data.
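Claim 28 differs from the single-reference variants above in keeping two independently disciplined clocks, one steered by the audio clock reference and one by the video clock reference, so the two media can drift-correct separately. A compact sketch, again assuming a small proportional slew gain:

#include <stdint.h>

typedef struct { int64_t audio_stc, video_stc; } clock_pair_t;

/* Slew one clock a fraction of the way toward its received reference. */
static void slew(int64_t *stc, int64_t reference)
{
    *stc += (reference - *stc) / 16;
}

static void on_audio_reference(clock_pair_t *c, int64_t ref)
{
    slew(&c->audio_stc, ref);   /* first decoding clock */
}

static void on_video_reference(clock_pair_t *c, int64_t ref)
{
    slew(&c->video_stc, ref);   /* second decoding clock, shared with
                                   the event generation step */
}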
29. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
receiving a compressed audio signal stream, a compressed video signal stream, an audio clock reference, a video clock reference, an audio timestamp, a video timestamp and a compressed computer graphics data stream;
generating a first decoding clock on the basis of the audio clock reference, and generating a second decoding clock on the basis of the video clock reference;
decoding the compressed audio signal stream on the basis of the audio timestamp and the first decoding clock to produce an audio signal;
decoding the compressed video signal stream on the basis of the video timestamp and the second decoding clock to produce a video signal;
decoding the compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of the second decoding clock and the event time management information to synchronize the video signal with the computer graphics data.
30. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a demultiplexer receiving a compressed audio signal stream, a compressed video signal stream, a signal time reference value, and a compressed computer graphics data stream;
an audio decoder decoding the compressed audio signal stream on the basis of the signal time reference value to produce an audio signal;
a video decoder decoding the compressed video signal stream on the basis of the signal time reference value to produce a video signal;
a computer graphics decoder decoding the compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating, in synchronism with said video decoder, an event driving instruction on the basis of the event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
31. An audio/video/computer graphics data synchronous reproducing/synthesizing system according to claim 30, wherein said video decoder and said event generator use the same decoding clock.
32. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a demultiplexer receiving a compressed audio signal stream, a compressed video signal stream, an audio clock reference, a video clock reference, an audio timestamp, a video timestamp and a compressed computer graphics data stream;
an audio decoder decoding the compressed audio signal stream on the basis of the audio timestamp and the audio clock reference to produce an audio signal;
a video decoder decoding the compressed video signal stream on the basis of the video timestamp and the video clock reference to produce a video signal;
a computer graphics decoder decoding the compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating, in synchronism with said video decoder, an event driving instruction on the basis of the event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
33. An audio/video/computer graphics data synchronous reproducing/synthesizing system according to claim 32, wherein said video decoder and said event generator use the same decoding clock.
34. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a demultiplexer receiving a compressed audio signal stream, a compressed video signal stream, an audio clock reference, a video clock reference, an audio timestamp, a video timestamp and a compressed computer graphics data stream;
a clock generator generating a first decoding clock on the basis of said audio clock reference, and generating a second decoding clock on the basis of the video clock reference;
an audio decoder decoding the compressed audio signal stream on the basis of the audio timestamp and the first decoding clock to produce an audio signal;
a video decoder decoding the compressed video signal stream on the basis of the video timestamp and the second decoding clock to produce a video signal;
a computer graphics decoder decoding the compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating, in synchronism with said video decoder, an event driving instruction on the basis of the event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
35. An audio/video/computer graphics data synchronous reproducing/synthesizing system according to claim 34, wherein said video decoder and said event generator use the same decoding clock.
36. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
receiving a compressed audio signal stream, a compressed video signal stream, a signal time reference value, and a compressed computer graphics data stream;
decoding the compressed audio signal stream on the basis of the signal time reference value to produce an audio signal;
decoding the compressed video signal stream on the basis of the signal time reference value to produce a video signal;
decoding the compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of the event time management information to synchronize the video signal with the computer graphics data.
37. A method according to claim 36, wherein decoding the compressed video signal stream and generating an event driving instruction use the same decoding clock.
38. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
receiving a compressed audio signal stream, a compressed video signal stream, an audio clock reference, a video clock reference, an audio timestamp, a video timestamp and a compressed computer graphics data stream;
decoding the compressed audio signal stream on the basis of the audio timestamp and the audio clock reference to produce an audio signal;
decoding the compressed video signal stream on the basis of the video timestamp and the video clock reference to produce a video signal;
decoding the compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of the event time management information to synchronize the video signal with the computer graphics data.
39. A method according to claim 38, wherein decoding the compressed video signal stream and generating an event driving instruction use the same decoding clock.
40. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
receiving a compressed audio signal stream, a compressed video signal stream, an audio clock reference, a video clock reference, an audio timestamp, a video timestamp and a compressed computer graphics data stream;
generating a first decoding clock on the basis of the audio clock reference, and generating a second decoding clock on the basis of the video clock reference;
decoding the compressed audio signal stream on the basis of the audio timestamp and the first decoding clock to produce an audio signal;
decoding the compressed video signal stream on the basis of the video timestamp and the second decoding clock to produce a video signal;
decoding the compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of the event time management information to synchronize the video signal with the computer graphics data.
41. A method according to claim 40, wherein decoding the compressed video signal stream and generating an event driving instruction use the same decoding clock.
42. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
a demultiplexer receiving a compressed audio signal stream, a compressed video signal stream, and a compressed computer graphics data stream;
an audio decoder decoding the compressed audio signal stream to produce an audio signal;
a video decoder decoding the compressed video signal stream to produce a video signal;
a computer graphics decoder decoding the compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating, in synchronism with said video decoder, an event driving instruction on the basis of the event time management information; and
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data.
43. An audio/video/computer graphics data synchronous reproducing/synthesizing system according to claim 42, wherein said video decoder and said event generator use the same decoding clock.
44. A method of synchronously reproducing/synthesizing audio, video and computer graphics data, comprising:
receiving a compressed audio signal stream, a compressed video signal stream, and a compressed computer graphics data stream;
decoding the compressed audio signal stream to produce an audio signal;
decoding the compressed video signal stream to produce a video signal;
decoding the compressed computer graphics data stream to produce computer graphics data and event time management information; and
generating an event driving instruction on the basis of the event time management information to synchronize the video signal with the computer graphics data.
45. A method according to claim 44, wherein decoding the compressed video signal stream and generating an event driving instruction use the same decoding clock.
46. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
an audio decoder decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
a video decoder decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
a computer graphics decoder decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating, in synchronism with the video decoder, an event driving instruction on the basis of said event time management information;
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data; and
a video deformation circuit for deforming the decoded video signal in accordance with two-dimensional projection information provided by the rendering engine.
47. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
an audio decoder decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
a video decoder decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
a computer graphics decoder decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of said second timing information and said event time management information;
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data; and
a video deformation circuit for deforming the decoded video signal in accordance with two-dimensional projection information provided by the rendering engine.
48. A method of synchronously reproducing/synthesizing audio, video and computer graphics data, comprising:
decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
generating an event driving instruction on the basis of said event time management information to synchronize the video signal with the computer graphics data;
deforming the decoded video signal in accordance with two-dimensional projection information; and
outputting a synthesized image of the video signal and the computer graphics data, wherein the synthesized image includes a deformed video image.
49. A method of synchronously reproducing/synthesizing audio, video and computer graphics data, comprising:
decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
generating an event driving instruction on the basis of said second timing information and said event time management information to synchronize the video signal with the computer graphics data;
deforming the decoded video signal in accordance with two-dimensional projection information; and
outputting a synthesized image of the video signal and the computer graphics data, wherein the synthesized image includes a deformed video image.
50. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
an audio decoder decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
a video decoder decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
a computer graphics decoder decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating, in synchronism with said video decoder, an event driving instruction on the basis of said event time management information;
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data; and
an audio delay circuit for delaying the output of said audio signal in accordance with a rendering delay signal provided by the rendering engine.
51. An audio/video/computer graphics data synchronous reproducing/synthesizing system comprising:
an audio decoder decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
a video decoder decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
a computer graphics decoder decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
an event generator generating an event driving instruction on the basis of said second timing information and said event time management information;
a rendering engine receiving the decoded video signal, the decoded computer graphics data and the event driving instruction, to output a synthesized image of the video signal and the computer graphics data; and
an audio delay circuit for delaying the output of said audio signal in accordance with a rendering delay signal provided by the rendering engine.
52. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
generating an event driving instruction on the basis of said event time management information to synchronize the video signal with the computer graphics data;
generating an audio delay signal in accordance with a rendering delay of a rendering engine;
outputting a synthesized image of the video signal and the computer graphics data; and
outputting a delayed audio signal in accordance with the audio delay signal such that the delayed audio signal is synchronized with the synthesized image of the video signal and the computer graphics data.
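Claims 50-53 compensate for the time the rendering engine spends synthesizing the image: the decoded audio is held back by the reported rendering delay so that sound and the synthesized picture leave the system together. A ring-buffer delay line is one natural realization; the millisecond interface and this shape are assumptions, not claim text.

#include <stdlib.h>

typedef struct {
    float *buf;
    size_t len, pos;
} delay_line_t;

/* Size the delay line from the rendering delay signal (milliseconds). */
static int delay_init(delay_line_t *d, unsigned delay_ms, unsigned rate_hz)
{
    d->len = (size_t)delay_ms * rate_hz / 1000;
    d->pos = 0;
    d->buf = d->len ? calloc(d->len, sizeof(float)) : NULL;
    return (d->len == 0 || d->buf) ? 0 : -1;
}

/* Push one decoded sample, pop the sample from delay_ms ago. */
static float delay_process(delay_line_t *d, float in)
{
    if (d->len == 0)
        return in;               /* zero rendering delay: pass through */
    float out = d->buf[d->pos];
    d->buf[d->pos] = in;
    d->pos = (d->pos + 1) % d->len;
    return out;
}

For a 40 ms rendering delay at 48 kHz the line holds 1,920 samples, so the audio output lags its decode time by exactly the delay the rendering engine reported.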
53. A method of synchronously reproducing/synthesizing audio, video and computer graphics data comprising:
decoding a compressed audio signal stream on the basis of first timing information to produce an audio signal;
decoding a compressed video signal stream on the basis of second timing information to produce a video signal;
decoding a compressed computer graphics data stream to produce computer graphics data and event time management information;
generating an event driving instruction on the basis of said second timing information and said event time management information to synchronize the video signal with the computer graphics data;
generating an audio delay signal in accordance with a rendering delay of a rendering engine;
outputting a synthesized image of the video signal and the computer graphics data; and
outputting a delayed audio signal in accordance with the audio delay signal such that the delayed audio signal is synchronized with the synthesized image of the video signal and the computer graphics data.
US10/165,635 1996-10-25 2002-06-06 Audio/Video/Computer graphics synchronous reproducing/synthesizing system and method Expired - Fee Related USRE39345E1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/165,635 USRE39345E1 (en) 1996-10-25 2002-06-06 Audio/Video/Computer graphics synchronous reproducing/synthesizing system and method

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP30136096A JP2970558B2 (en) 1996-10-25 1996-10-25 Audio / video / computer graphics synchronous reproduction / synthesis system and method
US08/957,587 US6072832A (en) 1996-10-25 1997-10-24 Audio/video/computer graphics synchronous reproducing/synthesizing system and method
US10/165,635 USRE39345E1 (en) 1996-10-25 2002-06-06 Audio/Video/Computer graphics synchronous reproducing/synthesizing system and method

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US08/957,587 Reissue US6072832A (en) 1996-10-25 1997-10-24 Audio/video/computer graphics synchronous reproducing/synthesizing system and method

Publications (1)

Publication Number Publication Date
USRE39345E1 true USRE39345E1 (en) 2006-10-17

Family

ID=17895939

Family Applications (2)

Application Number Title Priority Date Filing Date
US08/957,587 Ceased US6072832A (en) 1996-10-25 1997-10-24 Audio/video/computer graphics synchronous reproducing/synthesizing system and method
US10/165,635 Expired - Fee Related USRE39345E1 (en) 1996-10-25 2002-06-06 Audio/Video/Computer graphics synchronous reproducing/synthesizing system and method

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US08/957,587 Ceased US6072832A (en) 1996-10-25 1997-10-24 Audio/video/computer graphics synchronous reproducing/synthesizing system and method

Country Status (2)

Country Link
US (2) US6072832A (en)
JP (1) JP2970558B2 (en)

Families Citing this family (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW376629B (en) * 1997-12-19 1999-12-11 Toshiba Corp Digital image decoding method and device
JP3407287B2 (en) * 1997-12-22 2003-05-19 日本電気株式会社 Encoding / decoding system
JPH11232487A (en) * 1998-02-13 1999-08-27 Sony Corp Information processor, its processing method and provided medium
JP3422686B2 (en) * 1998-06-12 2003-06-30 三菱電機株式会社 Data decoding device and data decoding method
US6374314B1 (en) 1998-09-28 2002-04-16 Raytheon Company Method for managing storage of data by storing buffer pointers of data comprising a sequence of frames in a memory location different from a memory location for pointers of data not comprising a sequence of frames
US6381647B1 (en) * 1998-09-28 2002-04-30 Raytheon Company Method and system for scheduling network communication
KR100335084B1 (en) * 1999-02-10 2002-05-02 구자홍 apparatus and method for decoding multi program in multi-channel system
US6711620B1 (en) * 1999-04-14 2004-03-23 Matsushita Electric Industrial Co. Event control device and digital broadcasting system
US7458013B2 (en) * 1999-05-12 2008-11-25 The Board Of Trustees Of The Leland Stanford Junior University Concurrent voice to text and sketch processing with synchronized replay
US6724918B1 (en) * 1999-05-12 2004-04-20 The Board Of Trustees Of The Leland Stanford Junior University System and method for indexing, accessing and retrieving audio/video with concurrent sketch activity
JP4618960B2 (en) * 1999-09-21 2011-01-26 エヌエックスピー ビー ヴィ Clock recovery
US6429902B1 (en) * 1999-12-07 2002-08-06 Lsi Logic Corporation Method and apparatus for audio and video end-to-end synchronization
US6925097B2 (en) * 2000-03-29 2005-08-02 Matsushita Electric Industrial Co., Ltd. Decoder, decoding method, multiplexer, and multiplexing method
KR100448452B1 (en) * 2000-06-09 2004-09-13 엘지전자 주식회사 Method for supporting menu of a high-density recording medium
US6975363B1 (en) * 2000-08-31 2005-12-13 Microsoft Corporation Methods and systems for independently controlling the presentation speed of digital video frames and digital audio samples
JP4208398B2 (en) * 2000-10-05 2009-01-14 株式会社東芝 Moving picture decoding / reproducing apparatus, moving picture decoding / reproducing method, and multimedia information receiving apparatus
US7095945B1 (en) * 2000-11-06 2006-08-22 Ati Technologies, Inc. System for digital time shifting and method thereof
US7154906B2 (en) * 2001-01-30 2006-12-26 Canon Kabushiki Kaisha Image processing apparatus, image processing method, image processing program, and computer-readable storage medium storing image processing program code
JP3895115B2 (en) * 2001-02-01 2007-03-22 ソニー株式会社 Data transmission method, data transmission device, and data reception device
JP2002297496A (en) * 2001-04-02 2002-10-11 Hitachi Ltd Media delivery system and multimedia conversion server
JP4166964B2 (en) * 2001-05-17 2008-10-15 パイオニア株式会社 Transmitting apparatus and control method thereof, receiving apparatus and control method thereof
EP1433317B1 (en) * 2001-07-19 2007-05-02 Thomson Licensing Fade resistant digital transmission and reception system
JP2003111078A (en) * 2001-09-27 2003-04-11 Fujitsu Ltd Contents coder, contents decoder, contents distributor, contents reproduction device, contents distribution system, contents coding method, contents decoding method, contents coding program, and contents decoding program
US7088398B1 (en) * 2001-12-24 2006-08-08 Silicon Image, Inc. Method and apparatus for regenerating a clock for auxiliary data transmitted over a serial link with video data
JP3906712B2 (en) * 2002-02-27 2007-04-18 株式会社日立製作所 Data stream processing device
AU2003245096A1 (en) * 2002-07-04 2004-01-23 Lg Electronics Inc. Read-only recording medium containing menu data and menu displaying method therefor
JP4343111B2 (en) * 2002-10-02 2009-10-14 エルジー エレクトロニクス インコーポレーテッド Recording medium having data structure for managing reproduction of graphic data, recording / reproducing method using the recording medium, and apparatus therefor
EP1547080B1 (en) * 2002-10-04 2012-01-25 LG Electronics, Inc. Recording medium having a data structure for managing reproduction of graphic data and recording and reproducing methods and apparatuses
US20040078828A1 (en) * 2002-10-18 2004-04-22 Parchman Travis Randall Recovering timing for television services
US20040103622A1 (en) * 2002-11-30 2004-06-03 Nor Joanne H. Magnetic equine hood
JP4611285B2 (en) * 2003-04-29 2011-01-12 エルジー エレクトロニクス インコーポレイティド RECORDING MEDIUM HAVING DATA STRUCTURE FOR MANAGING GRAPHIC DATA REPRODUCTION, RECORDING AND REPRODUCING METHOD AND APPARATUS THEREFOR
US7616865B2 (en) * 2003-04-30 2009-11-10 Lg Electronics Inc. Recording medium having a data structure for managing reproduction of subtitle data and methods and apparatuses of recording and reproducing
KR20050005074A (en) * 2003-07-01 2005-01-13 엘지전자 주식회사 Method for managing grahics data of high density optical disc, and high density optical disc therof
KR20050004339A (en) * 2003-07-02 2005-01-12 엘지전자 주식회사 Method for managing grahics data of high density optical disc, and high density optical disc therof
KR20050064150A (en) * 2003-12-23 2005-06-29 엘지전자 주식회사 Method for managing and reproducing a menu information of high density optical disc
EP1713269B1 (en) 2004-01-13 2012-08-08 Panasonic Corporation Recording medium, reproduction device, recording method, program, and reproduction method
JP4500694B2 (en) * 2004-02-06 2010-07-14 キヤノン株式会社 Imaging device
JP4496170B2 (en) 2004-02-17 2010-07-07 パナソニック株式会社 Recording method, playback device, program, and playback method
US7751436B2 (en) * 2005-05-24 2010-07-06 Sony Corporation System and method for dynamically establishing PLL speed based on receive buffer data accumulation for streaming video
KR100766496B1 (en) * 2005-07-08 2007-10-15 삼성전자주식회사 Hdmi transmission system
US7684443B2 (en) * 2006-06-09 2010-03-23 Broadcom Corporation PCR clock recovery in an IP network
WO2008129816A1 (en) * 2007-03-28 2008-10-30 Panasonic Corporation Clock synchronization method
US20100231788A1 (en) * 2009-03-11 2010-09-16 Chao-Kuei Tseng Playback system and method synchronizing audio and video signals
JP5139399B2 (en) * 2009-10-21 2013-02-06 株式会社東芝 REPRODUCTION DEVICE AND REPRODUCTION DEVICE CONTROL METHOD
US10448083B2 (en) 2010-04-06 2019-10-15 Comcast Cable Communications, Llc Streaming and rendering of 3-dimensional video
US11711592B2 (en) 2010-04-06 2023-07-25 Comcast Cable Communications, Llc Distribution of multiple signals of video content independently over a network
WO2011130874A1 (en) * 2010-04-20 2011-10-27 Thomson Licensing Method and device for encoding data for rendering at least one image using computer graphics and corresponding method and device for decoding
US20110317770A1 (en) * 2010-06-24 2011-12-29 Worldplay (Barbados) Inc. Decoder for multiple independent video stream decoding
US9204123B2 (en) * 2011-01-14 2015-12-01 Comcast Cable Communications, Llc Video content generation
US9276989B2 (en) 2012-03-30 2016-03-01 Adobe Systems Incorporated Buffering in HTTP streaming client
US9245491B2 (en) * 2012-12-13 2016-01-26 Texas Instruments Incorporated First de-compressing first compressing schemes for pixels from/to bus interface
CN107835397B (en) * 2017-12-22 2019-12-24 成都华栖云科技有限公司 Multi-lens video synchronization method
KR20230060339A (en) * 2021-10-27 2023-05-04 삼성전자주식회사 Method and apparatus for processing and displaying graphic and video

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5640195A (en) * 1993-02-19 1997-06-17 Canon Kabushiki Kaisha Multimedia communication system, multimedia information transmitting apparatus and multimedia information receiving apparatus
JPH07212653A (en) * 1994-01-18 1995-08-11 Matsushita Electric Ind Co Ltd Picture processing unit
US5771075A (en) * 1994-12-08 1998-06-23 Lg Electronics Inc. Audio/video synchronizer
US5774206A (en) * 1995-05-10 1998-06-30 Cagent Technologies, Inc. Process for controlling an MPEG decoder
US5812791A (en) * 1995-05-10 1998-09-22 Cagent Technologies, Inc. Multiple sequence MPEG decoder
US5896140A (en) * 1995-07-05 1999-04-20 Sun Microsystems, Inc. Method and apparatus for simultaneously displaying graphics and video data on a computer display
US5966121A (en) * 1995-10-12 1999-10-12 Andersen Consulting Llp Interactive hypervideo editing system and interface
US5703877A (en) * 1995-11-22 1997-12-30 General Instrument Corporation Of Delaware Acquisition and error recovery of audio data carried in a packetized data stream
US5894328A (en) * 1995-12-27 1999-04-13 Sony Corporation Digital signal multiplexing method and apparatus and digital signal recording medium
US5933597A (en) * 1996-04-04 1999-08-03 Vtel Corporation Method and system for sharing objects between local and remote terminals
US6002441A (en) * 1996-10-28 1999-12-14 National Semiconductor Corporation Audio/video subprocessor method and structure
US6044397A (en) * 1997-04-07 2000-03-28 At&T Corp System and method for generation and interfacing of bitstreams representing MPEG-coded audiovisual objects
US5949436A (en) * 1997-09-30 1999-09-07 Compaq Computer Corporation Accelerated graphics port multiple entry gart cache allocation system and method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Geometry Compression", Michael Deering, Sun Microsystems, pp. 13-20, 1995. *
ISO/IEC JTC1/SC29/WG11 N0801, Nov. 13, 1994. *
ISO/IEC WD 14772: "The Virtual Reality Modeling Language Specification: The VRML 2.0 Specification" (VRML), Jul. 15, 1996. *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060074921A1 (en) * 2002-07-24 2006-04-06 Total Immersion Method and system enabling real time mixing of synthetic images and video images by a user
US7471301B2 (en) * 2002-07-24 2008-12-30 Total Immersion Method and system enabling real time mixing of synthetic images and video images by a user
US8098737B1 (en) * 2003-06-27 2012-01-17 Zoran Corporation Robust multi-tuner/multi-channel audio/video rendering on a single-chip high-definition digital multimedia receiver
US20070109444A1 (en) * 2003-10-15 2007-05-17 Matsushita Electric Industrial Co., Ltd. AV synchronization system
US7812886B2 (en) * 2003-10-15 2010-10-12 Panasonic Corporation AV synchronization system
US20080304571A1 (en) * 2004-09-02 2008-12-11 Ikuo Tsukagoshi Content Receiving Apparatus, Method of Controlling Video-Audio Output Timing and Content Providing System
US8189679B2 (en) * 2004-09-02 2012-05-29 Sony Corporation Content receiving apparatus, method of controlling video-audio output timing and content providing system

Also Published As

Publication number Publication date
US6072832A (en) 2000-06-06
JPH10136259A (en) 1998-05-22
JP2970558B2 (en) 1999-11-02

Similar Documents

Publication Publication Date Title
USRE39345E1 (en) Audio/Video/Computer graphics synchronous reproducing/synthesizing system and method
US6584125B1 (en) Coding/decoding apparatus, coding/decoding system and multiplexed bit stream
CN1091336C (en) Caption data processing circuit and method therefor
US7292610B2 (en) Multiplexed data producing apparatus, encoded data reproducing apparatus, clock conversion apparatus, encoded data recording medium, encoded data transmission medium, multiplexed data producing method, encoded data reproducing method, and clock conversion method
US5381181A (en) Clock recovery apparatus as for a compressed video signal
US6101591A (en) Method and system for selectively independently or simultaneously updating multiple system time clocks in an MPEG system
US5467137A (en) Method and apparatus for synchronizing a receiver as for a compressed video signal using differential time code
US20040179554A1 (en) Method and system of implementing real-time video-audio interaction by data synchronization
US6602299B1 (en) Flexible synchronization framework for multimedia streams
EP1006726B1 (en) Data processing method for a data stream including object streams
US6233695B1 (en) Data transmission control system in set top box
KR20010024033A (en) Device for demultiplexing coded data
US20020080399A1 (en) Data processing apparatus, data processing method, data processing program, and computer-readable memory storing codes of data processing program
US6122020A (en) Frame combining apparatus
KR100390138B1 (en) A method for transmitting isochronous data, a method and apparatus for restoring isochronous data, a decoder for restoring information data
KR100240331B1 (en) Apparatus for synchronizing a video and an audio signals for a decoder system
US6970514B1 (en) Signal processing device, signal processing method, decoding device, decoding method and recording medium
JP3441257B2 (en) Data transmission equipment
US20050057686A1 (en) Method and apparatus for sending and receiving and for encoding and decoding a telop image
JP2000187940A (en) Recording/reproducing device and recorder
JP2004040807A (en) Data flow reproducing method and apparatus, and system and signal related thereto
EP1148723B1 (en) Special reproduction data generating device, medium, and information aggregate
KR20030004061A (en) Multilayer multiplexing for generating an MPEG2 transport stream from elementary MPEG2 and MPEG4 streams
JP2003337955A (en) System and method for data transmission and reception, device and method for data transmission, device and method for data reception, and program
JP2006014180A (en) Data processor, data processing method and program therefor

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 8

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees