EP0954924A1 - Information stream syntax for indicating the presence of a splice point

Information stream syntax for indicating the presence of a splice point

Info

Publication number
EP0954924A1
EP0954924A1 EP98903583A EP98903583A EP0954924A1 EP 0954924 A1 EP0954924 A1 EP 0954924A1 EP 98903583 A EP98903583 A EP 98903583A EP 98903583 A EP98903583 A EP 98903583A EP 0954924 A1 EP0954924 A1 EP 0954924A1
Authority
EP
European Patent Office
Prior art keywords
stream
splice
indicium
bitstream
splicing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP98903583A
Other languages
English (en)
French (fr)
Other versions
EP0954924A4 (de)
Inventor
Robert Norman Hurst, Jr.
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sarnoff Corp
Original Assignee
Sarnoff Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/996,871 (US6038000A)
Application filed by Sarnoff Corp
Publication of EP0954924A1
Publication of EP0954924A4
Withdrawn (current legal status)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8455 Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142 Detection of scene cut or scene change
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/179 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scene or a shot
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/87 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23608 Remultiplexing multiplex streams, e.g. involving modifying time stamps or remapping the packet identifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23614 Multiplexing of additional data and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/24 Systems for the transmission of television signals using pulse code modulation
    • H04N7/52 Systems for transmission of a pulse code modulated video signal with one or more other pulse code modulated signals, e.g. an audio signal or a synchronizing signal

Definitions

  • the invention relates to communication systems in general, and more particularly, the invention relates to a method for identifying and utilizing splicing "in-points" and splicing "out-points" in an MPEG-like information stream.
  • MPEG: Moving Pictures Experts Group
  • MPEG-1 refers to ISO/IEC standards 11172, incorporated herein by reference.
  • MPEG-2 refers to ISO/IEC standards 13818, incorporated herein by reference.
  • a compressed digital video system is described in the Advanced
  • ATSC: Advanced Television Systems Committee
  • a program transport stream is formed by multiplexing individual elementary streams which share a common time base (i.e., the same 27 MHz clock source).
  • the elementary streams comprise encoded video, audio or other bit streams.
  • the elementary streams may be, but do not have to be, in a packetized elementary stream (PES) format prior to transport multiplexing.
  • a PES consists of a packet header followed by a packet payload.
  • when the elementary streams are multiplexed, they are formed into transport packets, and a control bit stream that describes the program (also formed into transport packets) is added.
  • the invention splices a first information stream into a second information stream.
  • the first information stream includes at least one entrance indicium that identifies an appropriate point of entrance to the stream.
  • the second information stream includes at least one exit indicium that identifies an appropriate point of exit from the stream.
  • the invention monitors the two streams until the appropriate points are found and, in response to a control signal, splices the first stream into the second stream.
  • the inventive splicer includes a pre-splice buffer receiving a first information stream and producing a buffered information stream; a bitstream examiner receiving the first information stream and responsively causing the pre-splice buffer to position an entrance point of the buffered information stream at an output of the buffer; a switch for coupling either the buffered information stream or a second information stream to an output; and a switch controller for monitoring the second information stream and, in response to a control signal and the detection of an exit point in the second information stream, causing the switch to couple the buffered information stream to an output.
  • FIG. 1 shows a block diagram of a compressed bitstream splicing system including the invention;
  • FIG. 2 depicts a flow chart of a seamless splicing process in accordance with the invention;
  • FIG. 3 shows a detailed block diagram of the splicer of FIG. 1;
  • FIG. 4 depicts a block diagram of a digital studio comprising a plurality of interoperable islands and including the invention;
  • FIGs. 5A-5C depict a plurality of splicing scenarios; and
  • FIG. 6A and FIG. 6B together depict a flow diagram of a routine 600 suitable for identifying in-points and out-points in accordance with the invention.
  • identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.
  • the invention is generally described within the context of a digital television studio that includes a plurality of operative environments which receive and process various bitstreams and which have associated switching capabilities according to the invention.
  • the switching capabilities allow seamless or non-seamless splicing of a plurality of, e.g., video transport streams to produce an output stream.
  • a combination of seamless and non-seamless bitstreams may be produced to provide a controllably degraded output stream.
  • the invention is a two-input bitstream splicer which performs switching, splicing or insertion operations on a pair of MPEG-compliant input transport streams to produce an output stream. It must be noted that the principles of the invention apply to bitstream switchers or splicers having more than two inputs and to input streams other than MPEG-compliant input streams.
  • the invention may be implemented using a general purpose computer system that is programmed to perform the functions discussed below. As programmed, the general purpose computer becomes a specific purpose apparatus for splicing digital data bit streams.
  • the invention may be used for both seamless and non-seamless splicing of bitstreams.
  • Seamless splicing means seamless butt-splicing of two streams to form a resultant output stream that produces a continuous, undisturbed flow of information (e.g., video or audio without glitches or artifacts).
  • Non-seamless splicing produces a resultant output signal which may have a disturbed information flow (e.g., visual or aural distortions, disturbances and artifacts).
  • each bitstream is a transport stream comprising video, audio and (possibly) other information.
  • the invention is applicable to packetized elementary and other elementary streams. Additionally, it is assumed that the splicing points are determined with respect to the video information. This may result in some distortions in the spliced audio and other information, since the audio and other information may not temporally "line up" on a packet by packet basis.
  • Splicing consists of making a transition in an output stream from a "from-stream" to a "to-stream."
  • the from-stream is ideally exited at an "out-point" and the to-stream is ideally entered at an "in-point."
  • An out-point is a place in a presently-selected stream (i.e., "from-stream") where the stream may be ended, and some other stream (i.e., "to-stream") spliced on.
  • An "in-point" is a place in the other stream where the information may begin to be spliced on to another stream.
  • a "splicing segment" is defined as the portion of an information stream between an in-point and an out-point.
  • a splicing segment may include multiple out-points and in-points. Thus, it is desirable to include as many in-points and out-points as possible in a stream to allow for maximum flexibility in splicing.
  • a delay-parameter, e.g., a video buffering verifier (VBV) delay for MPEG-compliant streams.
  • a splicing segment with a known in-point delay-parameter and with out-points having the same known delay-parameter may include within itself shorter valid splicing segments with different values of the delay-parameter.
  • information streams are divided into transport packets.
  • Packets containing video may be intermixed with packets containing audio, auxiliary data, or other information.
  • a video stream out-point is the end of the last video transport packet of the stream of interest. The video stream before and through the last packet must meet the splicing definition of an out-point.
  • a video stream in-point is the beginning of the first video transport packet of a splice segment (SS). It must be noted that other information in the transport stream, specifically audio, is unlikely to be neatly segmented at in-points and out-points.
  • a critical aspect of splicing information streams is the proper processing of the various delay parameters.
  • One parameter of concern is the delay parameter associated with the various information streams.
  • the delay parameter is the video buffering verifier (VBV) delay parameter.
  • Another parameter is the latency, or transitional period, inherent in a splicing operation. For example, a typical splice occurs at a certain time, i.e., a "splice time." Prior to the splice time an output information stream comprises a from-stream.
  • during the transitional period following the splice time, the output stream may include information from both the from-stream and the to-stream.
  • after the transitional period, the output stream includes information from only the to-stream. It is assumed that the from-stream and the to-stream are each valid. There are certain constraints on the streams that must be met if the splicing is to be seamless. Seamless splicing implies that the resultant spliced bitstream will not cause discontinuities in the future.
  • An MPEG Splice Segment is defined at the transport level and includes functionality at the video (and audio) levels.
  • An information-bearing splice segment may be as short as a single frame.
  • a splice segment may even be a zero frame length segment (although such a SS might be MPEG non-compliant).
  • Such a zero-length segment is simply an in-point followed by an out-point (i.e., an "in-out-point").
  • a SS may also be very long, including many GOPs.
  • In general, the length of a SS is not constrained, and the SS should include multiple out-points to enable seamless exiting from the segment.
  • a possible exception is a SS comprising a television commercial.
  • the television commercial SS can be deliberately produced without defined out-points so that exiting the commercial segment is not seamless.
  • An MPEG SS should be an MPEG compliant stream having consistent transport stream and elementary stream time stamps (e.g., PCR, PTS and DTS) and an associated delay parameter (e.g., a VBV delay), thereby allowing a decoder to properly decode and present the information in the SS.
  • the first information frame (e.g., video access unit) at an in-point of an MPEG video SS must be an I-Frame.
  • the second frame shall not reference information frames prior to the in-point (i.e., if the second frame is a B-frame, the B-frame may not reference frames prior to the in-point).
  • the last frame before an out-point should not be a B-frame (in display order).
  • An audio SS will have an in-point consisting of the beginning of an audio frame and an out-point consisting of the last byte of an audio frame. There may be other constraints placed on the stream to address issues of, e.g., coding error-build-up, tuning-time and minimum picture quality.
  • the in-point of a video SS must begin with a sequence header, although the SS may contain multiple sequence headers.
  • a SS may contain additional header information to indicate that the sequence header is also an in-point. It is necessary to distinguish the SS in-point sequence header from a sequence header included for tuning-time or picture quality, since seamless splicing can only be guaranteed on in-points. Since the in-point should follow a sequence end code (SEC), it is desirable to include the SEC just before the in-point, thereby obviating the need to include the SEC on the end of an out-point.
  • the out-point may include the SEC.
  • An MPEG-type splice count-down, if used, must end (i.e., equal zero) at the out-point.
  • FIG. 1 shows a block diagram of a compressed bitstream splicing system 100 including the invention.
  • the system 100 includes a first compressed bitstream stream source 110, a second compressed bitstream stream source 120, a splicer 300, a controller 105 and an optional splice monitor 130.
  • the first compressed bitstream stream source 110, illustratively a "live feed" from a transport stream encoder, produces a first MPEG-compliant transport stream S6.
  • the second compressed bitstream stream source 120 is illustratively a server (e.g., a video disk, tape machine, or other storage device) which stores video and audio elementary streams and transport encodes the stored streams to produce a second MPEG-compliant transport stream S7.
  • the stored information may comprise, e.g., advertisement or local programming information to be spliced into the first transport stream.
  • the splicer 300 selectively couples one of the two input transport streams S6, S7 to a transmitter or other subsystem as an output stream S9.
  • An optional splice monitor 130 monitors various parameters of the spliced output signal S9, e.g., delay parameter, buffer utilization information, synchronization, bitstream source and the like.
  • the optional splice monitor 130 is responsive to the controller 105 and the splicer 300.
  • the splicer 300 receives the first transport stream S6, illustratively a television program produced by a first source, and the second transport stream S7, illustratively an advertisement produced by a second source. In response to a control signal SELECT, the splicer produces an output signal S9 comprising either the first S6 or second S7 transport stream.
  • the control signal SELECT may include priority information which causes the splicer 300 to respond immediately, within a defined time interval or when certain conditions exist (i.e., specific alignments of stream entrance or exit points).
  • the splicer 300 produces a signal ACKNOWLEDGE which is used to acknowledge the SELECT signal and provide specific details about the splice operation (e.g., exact time of splice, error conditions and the like). The operation of the splicer 300 is described more fully below with respect to FIG. 3.
  • the actual splicing operation is the process that takes place within the splicer 300 that does what is necessary to actually switch amongst the bitstreams. This involves stopping, in an orderly manner, the flow of packets from the from-stream; starting, in an orderly manner, the flow of packets from the to-stream; and adjusting the header information in the output stream. During some interval, packets from both the from-stream and the to-stream are likely to be intermixed. Splicing operations must be synchronized to be seamless. To ensure that input streams arrive at the appropriate splicers at the time they are needed, several synchronizing operations may be performed.
  • it is assumed that the output stream is continuous and that the actual splice is taken to be a change in the content of the output stream from a from-stream to a to-stream.
  • the time stamps in the output stream should also maintain continuity from one stamp to the next (this is related to stream content) and the splicing mechanism should adjust the output stream time-stamps.
  • the MPEG "discontinuity" header flag should be utilized such that an indication of new time stamps (or a time stamp discontinuity) is provided to a decoder.
  • the splicing process must have some notion of time, since it is this local notion of time that must be used to produce the output time-stamps.
  • the splicing process gets its notion of time from some timing source such as the OC-12c interface and the current time is derived from either stream content or set-time messages.
  • the local notion of time must be moderately continuous and well behaved.
  • both the end of the from-stream and the beginning of the to-stream must be available at the actual splice hardware that is producing the output.
  • all buffering within the splicing process must be finite and defined.
  • a decision to switch streams must be made.
  • the source of an output stream is actually switched in response to that decision.
  • the decision to splice may be content related, such as a switch from a from-stream to a to-stream when a content-related data element is encountered in one of the streams.
  • the from-stream may be monitored and, in response to the detection of, e.g., a black-screen or a scene change, a splice decision may be made. This operational decision does not require synchronization.
  • the decision requires that the splicer (or a controller) analyze, e.g., the from-stream to detect the data element.
  • the decision to splice may also be data-flow related, such as a switch from a from-stream to a to-stream on some particular packet or upon the start or stop of information flow.
  • the decision to splice may be time-related, such as a switch from a program to commercial at noon. Time-related decisions must be referenced to the splicer's local frame-of-reference.
  • a message-passing process passes the decision information to the splicer in time for the splicer to be ready to make the splice in its frame-of-reference. Given that the decision to splice at some time has been made, the splice will be made at the next available splice point, based upon the from-stream and the to-stream.
  • the decision to splice may be event driven, such as the pushing of a button (e.g., the director's "take" command, as depicted in the splicing system 100 of FIG. 1).
  • an acknowledge message may be required.
  • This message, when delivered to the originator of the splice decision (e.g., the controller), will allow an intelligent choice to be made about time-outs and actions like panic non-seamless splices. Time-outs and determinations about corrective actions to remedy splice failures are a policy matter for the originator of the splice decision. Time-out and forced switch may be a service implemented by the splicer, but only as a convenience.
  • An operational unit may feed back an appropriate acknowledgment message to a controlling entity.
  • the contents of such a feedback message may include one or more of the following parameters: 1) a splice did or did not take place; 2) the local time-of-day that the splice occurred; 3) the delay-parameter value of the to-stream; 4) the delay-parameter value of the from-stream; 5) the current (post-splice) sync-buffer delay (e.g., in delay seconds); 6) the future time a splice will take place (if the switcher can compute this value); and 7) any exceptions or errors.
  • Exceptions and errors may include the fact that no splice took place, that the decision parameters passed by the controller were incorrect (e.g., syntactically or logically), that the to-stream was not ready, that a time-out occurred or that an audio-failure occurred (e.g., the dropping of an excessive number of audio frames).
  • Additional information includes: 1) the amount of time that the audio information from the from-stream will be needed; 2) an indication that the inputs are buffered correctly and ready for a new splice; and 3) other information useful to the controller or the splicing process itself.
  • the precise time at which a seamless splice takes place may not be pre-determined, since the seamless splice depends upon the arrival of an in-point in the to-stream. In the case of a decision to splice seamlessly there are several sub-decisions which must be made about what to do if the splice does not take place within some time limit. The choices are as follows. First, simply wait for a seamless splice to occur. Depending upon studio operational goals, this may not be acceptable. Second, define a fixed time-out period and, if the splicer has not spliced within the defined time-out period, perform a non-seamless splice (i.e., switch streams in as controllable a manner as possible).
  • are available when out-points occur in the from-stream. If the amount buffered is insufficient (e.g., more than a second elapses between successive in-points in a from-stream), then the buffer will overflow and will contain invalid information. This condition is remedied by an appropriate number of in-points and out-points being inserted into the bitstreams. If bitstreams do not have in-points and out-points often enough, then those bitstreams cannot be seamlessly spliced at those times. Moreover, to the extent that there is packet or cell jitter in the arrival time of input bitstreams, a first-in, first-out (FIFO) buffer (with output clocked at the nominal data rate) is expected to smooth the flow.
  • Server-generated streams must be carefully generated so that the data does not arrive at the splicer too early or too late. If the data arrives too early, there is some risk of overflow of an input buffer. If it is assumed that the splicer has enough synchronization buffering to hold a second or so of video, then it would seem that server streams can be delivered in any pattern of flow that never exceeds the just-in-time limit, and the one-second-early limit. Of course, there may be peak rate limitations on the splicer.
  • any stream processed in a studio containing the splicer is expected to have the same reference clock rate.
  • Remotely-generated streams, by the time they have reached a splicer, should be the same as locally-generated real-time streams.
  • the remote source may be genlocked to the local studio. This can be done via a reverse channel or by locking both to an external reference, such as a timing signal derived from the Global Positioning System (GPS).
  • Splice monitoring is an important aspect of splicing, especially in a studio environment.
  • Content-related monitoring may comprise the steps of viewing an image on a display device (i.e., "monitor") and responsively changing parameters of the bitstreams producing the image (e.g., splicing).
  • Optional splice monitor 130 may be used for content-related monitoring by, e.g., a director.
  • Another form of monitoring is the qualitative assessment of a monitored bitstream.
  • Optional splice monitor 130 may be used to retrieve qualitative information from the spliced output signal S9, e.g., delay parameter, buffer utilization information, synchronization information, bitstream source identification and the like.
  • the optional splice monitor 130 is responsive to the controller 105 and the splicer 300 to either process the information and return, e.g., an operational summary, or to couple the qualitative information directly to the controller 105 and the splicer 300 for further processing.
  • a director (i.e., a human)
  • monitors (i.e., decoders driving displays)
  • an output stream (i.e., a program)
  • a first example is a "soonest" mode of operation.
  • the director presses a "take" button TAKE based upon an event seen on an output monitor 132, a from-stream monitor 136 or a to-stream monitor 134.
  • a queued up (e.g., server-stored) to-stream is ready and aligned at an in-point.
  • an out-point will arrive at the end of the from-stream sync-buffer.
  • the from-stream contains up to 1/4 second of delay.
  • the amount of output monitor delay (i.e., the time between the "take" command TAKE and a change in scene on the output monitor 132) is between 1/2 and one second. If the director responded to a scene on the from-stream monitor 136, the amount of from-stream monitor delay is between 1/4 and 1/2 second and the output monitor delay is 1/2 second. If the director responded to a scene on the to-stream monitor 134, the to-stream monitor 134 is continuous (i.e., no monitor delay) and the output monitor delay is negative 1/4 second (i.e., the scene changes 1/4 second after the "take" button TAKE is pressed and the image displayed occurred 1/4 second prior to the press of the button).
  • a second example is the "next" mode of operation.
  • a queued up to-stream is flushed from a to-stream synchronization buffer and the next segment beginning with an in-point is queued up within up to 1/4 second.
  • the to-stream synchronization buffer also has zero to 1/4 seconds of random delay. When the in-point arrives the splice is made.
  • the amount of output monitor delay is between 1/2 and one second. If the director responded to a scene on the from-stream monitor 136, the amount of from-stream monitor delay is between 1/2 and 3/4 second and the output monitor delay is 1/2 second. If the director responded to a scene on the to-stream monitor 134, the to-stream monitor 134 is continuous and the output monitor 132 switches to a new scene between zero and 1/4 second later.
  • the choice of "soonest" or "next" mode of splicing is an operational one, and may be based upon which disconcerting effect (delay or back-up) is least objectionable. To alleviate these effects an amount of delay may be inserted into the splicer inputs. If this delay matches the monitor delay, and the monitors are connected to the inputs of the delays, then the apparent delay between monitor scenes and button action is less, but the delay to final output is greater.
  • a separate monitor control unit may be built to simulate the bit-stream switching and show the simulated results of the bitstream switch, thereby providing more flexibility to the director.
  • FIG. 3 shows a detailed block diagram of the splicer 300 of FIG. 1.
  • the splicer 300 selects one of a first input bitstream S6 and a second input bitstream S7 as an output bitstream S8.
  • the output bitstream S8 is optionally time stamped to produce a retimed output stream S9.
  • the first and second input bitstreams S6, S7 are, illustratively, MPEG-compliant transport streams including at least video and audio elementary streams.
  • the video and audio elementary streams may be in a packetized elementary stream (PES) format.
  • the second bitstream S7 is currently selected as the output bitstream (i.e., S7 is the from-stream) and the first bitstream S6 will be selected as the output bitstream (i.e., S6 is the to-stream) after a splicing operation.
  • the first input bitstream S6 is coupled to a first bitstream examiner 310A and a first synchronization buffer 320A.
  • the first bitstream examiner 310A examines the first bitstream for entrance points which have been included in the first input bitstream S6. When an in-point is found, the contents of the synchronization buffer are discarded (i.e., the buffer is "flushed") and the in-point is stored in the first memory portion of the synchronization buffer.
  • the synchronization buffer may be constructed as a first-in, first-out (FIFO) buffer. The process of searching for in-points and flushing the buffer is repeated until the first input bitstream S6 is selected by the splicer.
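  • The flush-on-in-point behavior of the synchronization buffer can be sketched as follows. This is an illustrative model only: the `Packet` type, its `is_in_point` flag (standing in for the result of the bitstream examiner) and the `SyncBuffer` class are hypothetical names, not elements of the described splicer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    label: str
    is_in_point: bool = False  # hypothetical flag set by a bitstream examiner


class SyncBuffer:
    """FIFO that always starts at the most recently received in-point."""

    def __init__(self) -> None:
        self._fifo = deque()

    def push(self, packet: Packet) -> None:
        if packet.is_in_point:
            # Discard everything buffered so far ("flush"), so that the
            # in-point occupies the first position of the buffer.
            self._fifo.clear()
        self._fifo.append(packet)

    def head(self):
        return self._fifo[0] if self._fifo else None

    def __len__(self) -> int:
        return len(self._fifo)


buf = SyncBuffer()
for pkt in [Packet("a1", True), Packet("a2"), Packet("b1", True), Packet("b2")]:
    buf.push(pkt)
```

After the loop, the buffer holds only `b1` and `b2`, with the in-point `b1` at its head.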
  • the output bitstream S3A of the first synchronization buffer 320A is coupled to a switch controller 340 and a first working buffer 330A.
  • the first working buffer 330A produces an output signal S4A which is coupled to a packet switching unit 350.
  • the second input bitstream S7 is coupled to a second bitstream examiner 310B and a second synchronization buffer 320B. If the second bitstream were not presently selected as the output stream, then the second bitstream examiner 310B and synchronization buffer 320B would operate in the same manner as described above with respect to the first bitstream examiner 310A and synchronization buffer 320A.
  • the second bitstream examiner 310B examines the second bitstream for exit points which have been included in the second input bitstream S7. In the "selected mode" of operation, the second bitstream examiner 310B is not used and the second synchronization buffer 320B serves as a constant delay buffer which produces a delayed bitstream S3B.
  • the delayed bitstream S3B is coupled to a working buffer 330B and a switch controller 340.
  • the second working buffer 330B produces an output signal S4B which is coupled to packet switching unit 350.
  • the second working buffer 330B holds the selected bitstream long enough to allow for overlap of old audio packets with current video packets. This allows audio frames to continue to completion after a splice is made.
  • the synchronization of audio and video frames is discussed in more detail below and in U.S. patent application serial number 08/864,321, filed May 28, 1997 and incorporated herein by reference.
  • a splice decision is made by a controller (e.g., controller 105) and coupled to the switch controller 340 via a control signal SELECT. Assuming that the splice decision equates to the command "splice seamlessly at the next opportunity," the switch controller 340 responds by scanning the currently selected output stream (i.e., bitstream S3B) for out-points. It is assumed that an in-point is positioned at the end of the first synchronization buffer 320A. When an out-point arrives on the from-stream, the switch controller 340 causes, via a control signal A/B, the switch 350 to begin coupling video packets from the to-stream through the switch to an optional header adjuster. At an appropriate time any audio packets within the to-stream are also switched.
  • the optional header adjuster 360 alters time-stamps in the selected output stream S8 to produce a retimed output stream S9.
  • the retiming of the program clock reference (PCR), presentation time stamps (PTS) and decode time stamps (DTS) of the selected stream S8 may be necessary to ensure that the splice is, in fact, seamless to a decoder.
  • the header adjuster 360 includes a 27MHz (local) station clock 362 which is utilized by a local PCR and PCRB generator 364. To retime the presentation and decode time stamps it is necessary to partially decode (i.e., packetized elementary stream (PES) layer) the selected transport stream S8.
  • the selected stream S8 is processed by a PTS and DTS detection and retiming unit 366 to produce a PTS and DTS retimed stream S8P.
  • the PTS and DTS retimed stream is transport encoded and time stamped by PCR detection and retiming unit 368 to produce a retimed transport stream S9.
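  • The arithmetic behind time-stamp retiming can be sketched as follows, assuming the standard MPEG-2 Systems representation: PTS and DTS are 33-bit counts of a 90 kHz clock, and a PCR consists of a 33-bit base (90 kHz) plus a 9-bit extension counting 27 MHz cycles from 0 to 299. The function names and the idea of a single additive offset are illustrative assumptions; the text does not specify how units 366 and 368 compute their adjustments.

```python
STAMP_MODULO = 1 << 33  # PTS, DTS and PCR base are 33-bit values


def retime_stamp(stamp_90khz: int, offset_90khz: int) -> int:
    """Shift a PTS or DTS by an offset, wrapping at 33 bits."""
    return (stamp_90khz + offset_90khz) % STAMP_MODULO


def pcr_fields(station_clock_27mhz: int) -> tuple:
    """Split a 27 MHz station clock count into (PCR base, PCR extension)."""
    base = (station_clock_27mhz // 300) % STAMP_MODULO
    extension = station_clock_27mhz % 300
    return base, extension


# One second of the 27 MHz station clock equals 90,000 ticks of the PCR base.
one_second = pcr_fields(27_000_000)
wrapped = retime_stamp(STAMP_MODULO - 10, 20)
```

Applying the same offset to the PTS and DTS of a frame preserves the interval between decoding and presentation of that frame.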
  • An alternate embodiment of a header adjuster is the PTS-DTS retimer discussed in more detail in U.S. patent application serial number 08/864,326, filed May 28, 1997 and incorporated herein by reference.
  • the invention may be implemented using a general purpose computer system that is programmed to perform the various functions.
  • the embodiment of FIG. 3 may be implemented as a computer program utilizing portions of memory to provide buffering, and an algorithm directed to the examination, control, switching and header adjustment functions.
  • the splicer 300 produces a signal ACKNOWLEDGE which is used to acknowledge the SELECT signal and provide specific details about the splice operation (e.g., exact time of splice, error conditions and the like).
  • a routine for splicing will now be described with respect to FIG. 2.
  • FIG. 2 illustrates a splicing routine in accordance with the invention.
  • the splicing routine is entered at step 202 when the decision to splice is made. For the purpose of this discussion, it is assumed that the decision is to seamlessly splice from the currently selected (from) stream S4B to another (to-stream) stream S4A.
  • the decision is examined at step 204. If the decision of step 202 is to splice as soon as possible, then the routine proceeds to step 208. If the decision is to splice at the next in-point (e.g., skip the presently buffered GOP in the to-stream), then the synchronization buffer (e.g., 320A) is flushed.
  • the splice is made (step 220) and the routine is exited (step 230).
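  • The two decision branches of the routine ("soonest" and "next") can be sketched in simplified form. The sketch ignores real-time behavior and models packets as `(kind, label)` tuples, where `kind` is `"in"`, `"out"` or `"data"`; the function `splice`, the `Mode` enumeration and this packet model are assumptions made for illustration, not the routine of FIG. 2 itself.

```python
from enum import Enum


class Mode(Enum):
    SOONEST = "soonest"  # use the in-point already queued in the sync buffer
    NEXT = "next"        # flush the sync buffer and wait for the next in-point


def splice(mode, queued, to_rest, from_stream):
    """Return the output packet list for a splice from from_stream to the to-stream.

    queued  -- to-stream packets already buffered, starting at an in-point
    to_rest -- to-stream packets that have not yet reached the buffer
    """
    to_rest = iter(to_rest)
    if mode is Mode.NEXT:
        queued = []                  # flush the synchronization buffer
        for pkt in to_rest:          # skip ahead to the next in-point
            if pkt[0] == "in":
                queued = [pkt]
                break
    output = []
    for pkt in from_stream:          # pass the from-stream until its out-point
        output.append(pkt)
        if pkt[0] == "out":
            break
    output.extend(queued)            # the splice: switch to the to-stream
    output.extend(to_rest)
    return output


from_stream = [("data", "f1"), ("out", "f2"), ("data", "f3")]
queued = [("in", "a1"), ("data", "a2")]
to_rest = [("data", "a3"), ("in", "b1"), ("data", "b2")]

soonest = [label for _, label in splice(Mode.SOONEST, queued, to_rest, from_stream)]
next_mode = [label for _, label in splice(Mode.NEXT, queued, to_rest, from_stream)]
```

In "soonest" mode the buffered segment `a1, a2, a3` follows the out-point `f2`; in "next" mode that segment is skipped and the output continues with `b1`.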
  • the context in which a splicing decision (step 202) is made is relevant to the amount of information necessary to perform a seamless splice. If the splice decision is made in the context of building play-to-air edit lists, it is necessary for the streams to be spliced to have the same value of delay-parameter. If the splice decision is made in the context of creating a live production, it is necessary for the streams being spliced to have matching delay-parameters and splice points which occur often enough to meet operational
  • the invention will now be described within the context of a digital television studio including a number of distinct operating environments (such as servers or edit-suites) which receive, process and transmit various information streams.
  • the operating environments, or "islands of interoperability," may be interconnected to perform one or more operations on the various information streams.
  • the studio output may be delivered to end-users (e.g., the public) via ATSC broadcast, cable, telephone and satellite transmission and the like.
  • the studio output may also be stored for later use in, e.g., a server or on CD-ROM or video tape.
  • the invention is also useful in video teleconferencing and other applications.
  • FIG. 4 depicts a block diagram of digital studio comprising a plurality of interoperable islands and including the invention.
  • the digital studio 400 of FIG. 4 includes interoperable islands 401, 402 and 404-409.
  • the digital studio 400 also includes a first compressed bitstream stream source 110, a second compressed bitstream stream source 120, a splicer 300, a controller 105 and an optional splice monitoring unit.
  • the first compressed bitstream stream source 110, illustratively a "live feed" from a transport stream encoder, produces a first MPEG-compliant transport stream S6.
  • the second compressed bitstream stream source 120 is illustratively a server (e.g., a video disk, tape machine, or other storage device) which stores video and audio elementary streams and transport encodes the stored streams to produce a second MPEG-compliant transport stream S7.
  • the first and second compressed bitstream sources 110, 120 operate in substantially the same manner as previously described with respect to the bitstream splicing system 100 of FIG. 1.
  • the digital studio 400 includes a controller 105 which performs those functions previously described with respect to the bitstream splicing system 100 of FIG. 1, and other functions which will be described below.
  • Island 300 roughly equates to the splicer 300 of the bitstream splicing system 100 of FIG. 1.
  • Each of the islands receives a plurality of information streams.
  • islands 401 and 402 each receive information streams from a NETWORK FEED and a LOCAL FEED.
  • Controller 105 communicates with each of the islands via a control channel C.
  • the control channel C is used to direct the flow of information throughout the studio (i.e., between islands) and to direct the processing of the information within the islands.
  • the controller 105 provides the splicing decisions and any necessary parameters associated with the intended splice.
  • the islands respond by performing, monitoring and acknowledging (via control channel C) the various splicing operations.
  • a digital studio according to the invention may be described as an interconnected group of "splicing islands" which perform particular processing functions on received bitstreams to produce output bitstreams.
  • the islands form individually distinct operating environments (e.g., storage environments, editing environments, processing environments and the like) which cooperate with each other via a controller 105 to produce one or more output bitstreams (e.g., S9, OUTPUT STREAM).
  • Each island operates at a known delay-parameter value and all splicing within an island is (ideally) seamless.
  • the splicing and processing functions are under the general control of controller 105, but may be locally controlled as necessary. For example, an operator sitting at an editing station may logically comprise one island.
  • the streams to be edited are routed to an editing island (e.g., island 407) in response to commands transmitted via control channel C from controller 105.
  • a signal may be switched through several islands (e.g., 406 and 300) prior to being stored in a storage unit (not shown) at the editing island (e.g., 407).
  • An alternate mode of studio operation is to controllably operate one or more islands in a non-seamless mode.
  • the non-seamless mode may be required in several circumstances where a splice or other transition between bitstreams must occur rapidly, and a range of bitstream degradation is permissible. It must be noted that non-seamless switching may produce errors which are propagated to subsequent islands receiving a degraded bitstream. These errors may be mitigated, if necessary, by, e.g., dropping damaged or inferior access units or groups of access units (e.g., video frames) or by adding additional access units.
  • the splicing operation is unlikely to be seamless (i.e., the buffer will likely overflow). In this case, frames may be dropped to avoid the overflow condition.
  • the splicer needs to adjust time stamps to cause a number of frame repeats (i.e., add frames) while the buffer fills.
  • the buffer may also be increased by splicing short, all-black frames on the end of a short delay-parameter sequence to build up the value of the delay-parameter in current use.
  • splicing operations take place in operational units (e.g., splicing islands), such as routing switchers, play-to-air switchers, production switchers or other switchers. Therefore, it is desirable to support a plurality of data formats and bitrates.
  • 422@HIGH and 420@HIGH television studio formats each support multiple picture formats and bit rates. Therefore, it may be necessary to splice, e.g., a bitstream comprising a 1280 by 960 picture element, 60Hz Progressive Scan picture onto the end of a bitstream comprising a 1920 by 1080 picture element, 59.94Hz interlaced picture.
  • Both of the above example splices may be seamlessly made if the streams being spliced have matching delay parameters. Therefore, it is important that the controller that makes the splice decision know the delay parameters of the various streams to be spliced.
  • the delay parameter of a stream may be calculated by an operational unit receiving a stream or included within the stream as part of the
  • switch controller 340 includes a bitstream calculator which calculates the delay parameters of the input streams S6, S7. It should be noted that the delay parameter calculation may also be performed by the bitstream examiners 310A, 310B or the optional splice monitor 130.
  • Another critical aspect of splicing information streams is the determination of in-point and out-point locations in the streams to be spliced. To properly perform a seamless splice it is necessary to find the in-point of the to-stream and the out-point of the from-stream. Moreover, a splice segment may include in-points and out-points having different delay-parameter values. There are several options available for finding the appropriate splice points.
  • the entire to-stream or from-stream may be analyzed by the splicer in real time (i.e., "on the fly").
  • a real-time analysis is difficult for a to-stream because an in-point cannot readily be deduced from the stream without playing the stream to its end.
  • the length of an I-frame is not known in advance. By the time the first I-frame has ended, and its length is known, it is probably too late for the information to be used. It must be noted that this problem may be overcome by using, e.g., a more powerful computing device.
  • a real-time analysis is easier for a from-stream because the delay-parameter of the from-stream is known (from the in-point or otherwise), the presentation time-stamps in the stream indicate when frames leave the decode buffer, and bit-counts (or packet counts) indicate when the frames enter the decode buffer.
  • the frame rate is also known from sequence headers.
  • an external table may be created to contain indications of where splice points are.
  • This approach assumes that the information about in-point and out-point locations has been computed elsewhere (e.g., during a stream encoding process).
  • This approach requires that the in-points and out-points be indexed in some manner (e.g., Nth packet from a marker, first packet after a time-of-day reference, and the like).
  • This approach also requires the updating of a splice table
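  • A minimal sketch of the external-table option follows, assuming splice points are indexed as packet counts from a marker (one of the indexing schemes mentioned above). The table layout, the example packet numbers and the function `next_splice_point` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical splice table for one stream: sorted packet indices, counted
# from a marker, at which in-points and out-points occur.
splice_table = {
    "in": [0, 450, 900],
    "out": [449, 899, 1349],
}


def next_splice_point(points, current_packet):
    """Return the first splice point at or after current_packet, or None."""
    i = bisect_left(points, current_packet)
    return points[i] if i < len(points) else None


upcoming_out = next_splice_point(splice_table["out"], 500)
upcoming_in = next_splice_point(splice_table["in"], 450)
```

Because the table lives outside the stream, any operation that adds or removes packets invalidates the indices, which is why the table must be updated.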
  • in-point and out-point markers may be placed within the information stream directly.
  • An MPEG compliant information stream includes header portions where such a marker may be included.
  • Both in-points and out-points should be marked and, ideally, the marking should occur at the system, transport and PES levels.
  • the delay-parameter associated with the stream or splicing segment and an audio offset i.e., a displacement of audio-frame boundaries from associated video frames
  • the MPEG count-down feature should also be used to indicate that, e.g., an out-point is approaching (decreasingly positive countdown) or an in-point has been transmitted (increasingly negative-countdown).
  • Bitstream Generation. To help ensure seamless splicing it may be necessary to create the bitstreams to be spliced in a certain manner.
  • bitstreams that can be spliced
  • the creation of the stream content and the insertion of appropriate splice control information (i.e., in-point and out-point markers).
  • a desired value of the delay-parameter is known in advance.
  • other goals such as how often an in-point is wanted, are also known.
  • the creation of the bitstream becomes a matter of rate-control. For each frame, there is a not-to-be-exceeded bit-count.
  • the rate-control task is to encode each frame with the best quality possible within the bit-budget.
  • the per-frame bit budget is computed as the transmission-bit-rate divided by the frame-rate.
  • For complex GOP encoding a forward analysis of the created stream may be made. The allocation of bits among frames must be done to assure that a decoder buffer doesn't underflow.
  • a first constraint which may be applied to the created stream is the defining of a splice segment as a fixed GOP structure (e.g., 13 frames arranged in the following display order: "...IBBPBBPBBPBBP"). This approach is straightforward at the expense of unnecessarily degraded picture quality.
  • a scene cut on the last P frame of an "...IBBPBBPBBPBBP" GOP would be reproduced with a very small bit budget.
  • GOP structure that is ideal for all applications.
  • the loss of flexibility implied in this approach is probably unacceptable.
  • a second constraint which may be applied to complex GOP encoding is the insertion of in-points and out-points at predetermined time intervals (e.g., 2 and 0.5 seconds, respectively). This approach does not require the use of a specific GOP structure, therefore the encoder is free to select frame type based upon the input pictures. There are various rate-control issues to be resolved when switching between MPEG streams or splice segments.
  • One rate-control issue involves the amount of data transmitted to a decoder buffer. For example, the decoder buffer will not overflow if the buffer contents (measured in bits) at any out-point is less-than-or-equal-to the decoder buffer contents (measured in bits) measured at the most recent in-point. It is not necessary to know the actual number of bits, it is only necessary to ensure that the number of bits in the decoder buffer does not increase from in-point to out-point.
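  • The no-overflow condition above can be expressed as a simple check: every out-point that follows an in-point must find the decoder buffer no fuller than it was at that in-point. The function name and the idea of supplying measured fullness values as a list are assumptions for illustration.

```python
def segment_is_overflow_safe(fullness_at_in_point, fullness_at_out_points):
    """Return True if no out-point has more bits buffered than the in-point.

    Only the comparison matters; the values may be in any consistent unit,
    so the absolute number of bits need not be known.
    """
    return all(f <= fullness_at_in_point for f in fullness_at_out_points)


safe = segment_is_overflow_safe(4_000_000, [3_500_000, 4_000_000])
unsafe = segment_is_overflow_safe(4_000_000, [3_500_000, 4_200_000])
```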
  • the above-described amount of time may be defined as the "Delay-Parameter" for the stream.
  • the frame sizes (measured in time to transmit the frames at the operating bit rate) must be consistent with the operating delay parameters to ensure seamless splicing.
  • the delay parameters are the end-to-end VBV size (measured in time) and the VBV contents (measured in time) at the beginning/end of a stream.
  • An additional, globally defined value is the maximum size of physical buffers (in bits). This maximum size must be greater than the maximum VBV size implied by the MPEG profile and level indication criteria.
  • the decode time stamp at an out-point of a from-stream should be one frame time of the stream greater than the DTS and PTS of the last frame of the from-stream.
  • a splicing decision is made by some human.
  • the decision may be made in the process of generating a list of programming to be transmitted by a television studio or in real time as the studio is transmitting.
  • the splicing decision may be made by some surrogate process, such as a preprogrammed command to splice a station identification announcement into the studio transmission every day at 12:05 AM.
  • the decision may be to splice at a particular time in the future or immediately.
  • Several parameters of the streams to be spliced may be known at the time of the decision, though these parameters may change prior to the actual splicing operation.
  • the splicing decision is usually made with some knowledge of the to-stream, such as the stream length, VBV delay parameter and the like. It is possible for the from-stream to be unknown at the time of the decision (e.g., the daily message is inserted into whichever stream is being transmitted at the time of insertion).
  • a decision contains the following elements. First, the operational unit which sources the to-stream, the operational unit which performs the splice and the stream or segments to be spliced. Second, the time the splice is to take place. The time may be "now," a particular time of day or the occurrence of some logical condition. "Now" means make the next splice after the arrival of the splice-now message. The now decision may arise from direct human action (e.g., button-press) or some external controlling process deciding to send a splice-now message. The logical condition may be the occurrence of a time-code (e.g.,
  • SMPTE in a particular information stream
  • a time stamp e.g., PTS or DTS
  • a reference time e.g., PCR
  • some other detectable event e.g., an input stream PID changes.
  • the logical events may be combined in a logical manner to determine a splice time and select appropriate streams for splicing.
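  • The elements of a splice decision listed above can be gathered into a small record, sketched here. All field names, the representation of time of day as seconds after midnight and the `stream_state` dictionary are hypothetical; the text does not define a message format.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SpliceDecision:
    to_stream_source: str   # operational unit that sources the to-stream
    splicing_unit: str      # operational unit that performs the splice
    segment: str            # stream or segment to be spliced
    at_time_of_day_s: Optional[float] = None            # None means no time condition
    condition: Optional[Callable[[dict], bool]] = None  # e.g., a PID change

    def is_due(self, now_s: float, stream_state: dict) -> bool:
        """True when every stated condition holds; with none stated, "now"."""
        time_ok = self.at_time_of_day_s is None or now_s >= self.at_time_of_day_s
        cond_ok = self.condition is None or self.condition(stream_state)
        return time_ok and cond_ok


# Station identification spliced in daily at 12:05 AM (300 s after midnight).
station_id = SpliceDecision("server", "play-to-air switcher", "station-id",
                            at_time_of_day_s=300.0)
# Splice as soon as the input stream PID changes.
on_pid_change = SpliceDecision("live feed", "splicer", "program",
                               condition=lambda s: s.get("pid_changed", False))
```

Combining the time test and the logical condition with a logical AND is one possible combination; other combinations could be expressed inside the `condition` callable.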
  • a to-stream comprises either 24 or 30 frames per second (fps) video streams including only I-frames.
  • the delay parameter of the to-stream is equal to one frame time at the slowest frame rate (i.e., 42mS if 24 fps).
  • each of the I-frames contains fewer bits than can be sent in one unit of display time (i.e., one 42mS frame time) at the bit rate for the frame. If the bit rate is 150 Mb/s, a 30fps frame contains no more than 5 Mb. If the bit rate is 150 Mb/s, a 24fps frame contains no more than 6.25 Mb.
  • the presentation time stamp indicating when the last frame is to be presented has a value 42mS in the future.
  • the from-stream were at 30Hz (33mS frame time)
  • 9mS after the out-point the last frame of the from-stream will be taken from the decoder buffer, and 33mS later the first frame of the to-stream will be needed.
  • If the to-stream is also 30Hz, the first frame will have been delivered 9mS before it is needed. If the from-stream were at 24Hz, and the to-stream were also at 24Hz, the to-stream frame arrives just in time.
  • streams are coded with a bit-count between the in-point and following out-points that is calculated from the bit-rate and the frame-time (i.e., bit-rate * frame-time).
  • the presentation time-stamps are set to values that all agree with the delay-parameter (i.e., first frame presented delay-parameter after the first bit arrives.).
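  • The arithmetic of this first example can be reproduced with a short sketch. It uses the rounded frame times from the text (42mS at 24 fps, 33mS at 30 fps) and assumes that the first to-stream frame is needed one delay-parameter after the out-point and that each frame takes at most one frame time to transmit. The function names are illustrative.

```python
def max_frame_bits(bit_rate_bps: float, fps: float) -> float:
    """Largest frame that can be sent in one frame time (bit-rate * frame-time)."""
    return bit_rate_bps / fps


def last_from_frame_removed_ms(delay_parameter_ms: int, from_frame_time_ms: int) -> int:
    """Time after the out-point at which the last from-stream frame is decoded."""
    return delay_parameter_ms - from_frame_time_ms


def to_frame_slack_ms(delay_parameter_ms: int, to_frame_time_ms: int) -> int:
    """How early the first to-stream frame is fully delivered (0 = just in time)."""
    return delay_parameter_ms - to_frame_time_ms


DELAY_PARAMETER_MS = 42  # one frame time at 24 fps, rounded as in the text

limit_30 = max_frame_bits(150e6, 30)                        # 5 Mb
limit_24 = max_frame_bits(150e6, 24)                        # 6.25 Mb
removed = last_from_frame_removed_ms(DELAY_PARAMETER_MS, 33)
slack_30 = to_frame_slack_ms(DELAY_PARAMETER_MS, 33)
slack_24 = to_frame_slack_ms(DELAY_PARAMETER_MS, 42)
```

With a 30Hz from-stream the last frame leaves the decoder buffer 9mS after the out-point; a 30Hz to-stream frame then arrives 9mS early, while a 24Hz to-stream frame arrives with no slack.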
  • the second example is a complex GOP transmission format.
  • stream is a 30 frame per second video (and associated audio) stream having delay-parameter of 250mS, a display order of "...IBBPBBPBBPBBPBBP" and transmission order of "...IPBBPBBPBBPBBPBB" (where "I" represents an I-frame, "P" represents a
  • This GOP structure includes in-points on the I-frames and out-points on the frame immediately preceding the I-frames. At each out-point, the PTS associated with the last P frame is 250mS in the future.
  • the rate control ensures that the decoder buffer doesn't underflow on the I-frame. It must contain less than 250mS worth of bits.
  • the valid-MPEG constraint implies that the P-frame following the I frame also doesn't underflow. It is not necessary that the I-frame use all 250mS.
  • the third example is a multiple output example. For purposes of the third example it is assumed that the stream is a 30 frame per second video (and associated audio) stream having the following GOP structure:
  • the stream is also assumed to have a delay-parameter of 250mS and a transmission bitrate of 20Mb/s (i.e., 670Kb per frame). If an I-frame takes 231mS and a P-frame takes 20mS then, after 15 frames, the decoder buffer contents have subsided to a level below the level at the in-point to the stream. This may be calculated using an equation such as the following:
  • each I-frame may be an in-point and all the P-frames after the fifteenth P-frame may be out-points.
  • the decoder buffer reacts as follows. At the splice point, the buffer contains
  • each P-frame is 400 Kb and the buffer contains 2.8 Mb.
  • the I-frame comes in, increasing the buffer contents. Since the incoming I-frame adds 670Kb per frame time and each P-frame taken out removes 400Kb, the buffer contains 4.7Mb after the seven P-frames are taken out. The I-frame is then presented, removing 4.6Mb from the buffer and, therefore, leaving 100Kb in the buffer.
  • the delay in the buffer is approximately zero.
  • Each P-frame now adds 400 Kb in 20mS and every 33 ms 400Kb is used.
  • delay in buffer increases by 13mS every frame time. After 15 frames, the delay stored in the buffer has reached the delay-parameter value. At this time a splice to another sequence may be made because the buffer is able to receive an I-frame.
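  • The buffer bookkeeping of this third example can be checked with a short sketch. It assumes that "Kb" means 1000 bits, that seven 400Kb P-frames are waiting in the buffer at the splice point, and that 670Kb (the rounded per-frame figure from the text) arrives during each frame time. The names below are illustrative.

```python
KB = 1_000                      # assumption: "Kb" denotes 1000 bits

BIT_RATE = 20_000 * KB          # 20 Mb/s transmission bit rate
FPS = 30
P_FRAME_BITS = 400 * KB
I_FRAME_BITS = 4_600 * KB
BITS_PER_FRAME_TIME = 670 * KB  # rounded figure used in the text


def buffer_around_i_frame(p_frames_waiting: int = 7):
    """Return buffer fullness just before and just after the I-frame is removed."""
    fullness = p_frames_waiting * P_FRAME_BITS       # 2.8 Mb at the splice point
    for _ in range(p_frames_waiting):
        fullness += BITS_PER_FRAME_TIME              # I-frame data arriving
        fullness -= P_FRAME_BITS                     # one P-frame decoded
    return fullness, fullness - I_FRAME_BITS


def p_frame_delay_gain_ms() -> float:
    """Delay gained per frame time while P-frames arrive faster than they are used."""
    frame_time_ms = 1000 / FPS
    transmit_ms = 1000 * P_FRAME_BITS / BIT_RATE     # 20 mS per P-frame
    return frame_time_ms - transmit_ms


before, after = buffer_around_i_frame()
gain = p_frame_delay_gain_ms()
```

The sketch gives about 4.7 Mb before the I-frame is removed and about 0.1 Mb afterwards, and a gain of roughly 13mS of buffered delay per P-frame, in line with the figures quoted in the text.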
  • Compressed Audio Splicing. The following discussion of splicing of compressed audio is limited to the issue of splicing combined video-audio streams in the audio-follows-video mode. The composition of streams from separately edited audio and video streams is not considered here.
  • Compressed audio is carried in frames.
  • Each audio frame is of fixed duration and contains a fixed number of bits.
  • the audio frame size, or duration is different from any of the video frame sizes, or durations. This means that audio frames will not align with splice points. Audio frames can be considered to be randomly aligned with the video. Therefore, when making a splice, the alignment of the audio with the video will be different for the to-stream and the from-stream.
  • a Presentation Time Stamp exists in each audio stream.
  • the audio and video PTSs refer to the same reference to allow the required synchronization.
  • the to-stream becomes the output stream. It is important to note that, particularly due to audio constraints, the process of switching may extend in time before and after the actual switch instant.
  • Audio information frames in an information stream are ideally located within a limited time difference from respective video information arrival at the end of the decoder buffer. If there is a 1/2 second end-to-end video buffer delay, then audio packets should be approximately 1/2 second later in a transmission stream than corresponding (i.e., having the same presentation time stamps) video packets. If this assumption is correct, then the switching operational unit must save audio information from the from stream for this 1/2 second after the video
  • the source stream must continue for 1/2 second after the splicer has switched to another stream. It is also amusing to contemplate rapid switching among several streams.
  • the overlapped audio packets may simply be broken. This is not the most desirable approach because it relies upon the CRC to prevent the use of partial packets. If the CRC fails one time in 64K packets, at about 30 packets/second, every few thousand seconds there is a potential undetected error. When a broken-packet CRC fails, it fails every time the packet is used. This means that a failure that produces a click may end up reproduced every time the same splice is made. This requires the use of a garbage-collecting process to remove broken audio frames.
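  • The "every few thousand seconds" estimate follows directly from the two figures quoted above, as this short sketch shows. The function name is illustrative, and treating "64K" as 65,536 is an assumption.

```python
def seconds_between_undetected_errors(packets_per_miss: int = 64 * 1024,
                                      packets_per_second: float = 30.0) -> float:
    """Average time between broken packets that slip past the CRC."""
    return packets_per_miss / packets_per_second


interval_s = seconds_between_undetected_errors()
```

At these rates the interval is a little over 2,000 seconds, i.e., roughly 36 minutes, assuming every packet counted is a broken one presented to the CRC.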
  • a second approach to splicing audio is to include "unfinished" from-stream audio frames (i.e., those overlapping a video splice) in the resultant stream.
  • the to-stream audio frames are then retimed such that they butt up against the "unfinished" from-stream audio frame.
  • This technique assures continuous audio at the expense of continuous inspection, buffering and adjustment of audio frames and packets.
  • the first complete to-stream audio frame is selected as the first audio frame to play because the to-stream frame which has already started is likely to be delayed too much to match the end of the "unfinished" from-frame. This technique also causes a slightly distorted lip-sync.
  • a third approach to splicing audio is to maintain alignment of audio with its corresponding video, that is, to leave a gap between the end of the from-stream audio frame and the beginning of the to-stream audio frame.
  • This approach advantageously relies on the MPEG decoder specification which requires that audio frame gaps are muted.
  • the audio presentation-time-stamps are adjusted by the same adjustment amount applied to the video frames. In this manner lip sync is maintained.
  • the third approach is especially useful when many splices (e.g., the creation of a sequence from a succession of short splice segments) may produce audio artifacts due to the muting.
  • FIG. 5 depicts a plurality of splicing scenarios involving audio alignment which illustrate aspects of audio-video splicing, assuming the above cited third approach is used to maintain alignment of audio with its corresponding video.
  • FIG. 5A depicts the simple splicing case where both audio streams align with their corresponding video. The splicer delays both to-streams and simply switches at the splice point.
  • FIG. 5B depicts the splicing case where the from-stream video and from-stream audio are aligned, but the to-stream video and to-stream audio are not aligned.
  • a partial to-stream audio frame is discarded.
  • the next complete to-audio frame is passed to the output with appropriate delay.
  • FIG. 5C depicts the typical splicing case where neither audio stream aligns with its corresponding video stream. It can be seen that a from-stream audio frame has already begun before the splice point. This audio frame is buffered and transferred to the output. It does not end until some fraction of a frame time after the splice. The to-stream audio frame that spans the splice point cannot be used. The next to-stream audio frame also cannot be used: it begins too early and would overlap the last from-stream audio frame. The first to-stream audio frame that appears in the output stream begins D time units after the splice point. This delay may be as much as two audio frames.
  • the lip-sync is preserved, but as much as 32 ms of from-stream audio overlaps the to-stream video. Also, the first to-stream audio begins as late as 64 ms after to-stream video begins. Finally, the splicer performing the splicing operation must buffer a whole audio frame in each work buffer.
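The frame selection of FIG. 5C can be sketched as follows. This is a minimal illustration, not the splicer described in the specification: it assumes fixed-length audio frames, integer time units (milliseconds here), and to-stream times already shifted onto the output timeline. The function and parameter names are hypothetical.

```python
def ceil_div(a, b):
    """Ceiling division for integers (b > 0)."""
    return -(-a // b)


def select_audio_frames(splice_t, from_phase, to_phase, frame_dur):
    """Pick the last from-stream and first to-stream audio frame at a splice.

    Assumption: from-stream frames start at from_phase + k * frame_dur and
    to-stream frames start at to_phase + m * frame_dur, all in the same
    integer time units as splice_t.
    """
    # Last from-stream frame that has already begun before the splice point.
    k = ceil_div(splice_t - from_phase, frame_dur) - 1
    last_from_start = from_phase + k * frame_dur
    last_from_end = last_from_start + frame_dur

    # First to-stream frame that does not overlap that from-stream frame.
    m = ceil_div(last_from_end - to_phase, frame_dur)
    first_to_start = to_phase + m * frame_dur

    return {
        "overlap": last_from_end - splice_t,    # from-audio over to-video
        "first_to_start": first_to_start,
        "delay": first_to_start - splice_t,     # the delay D of FIG. 5C
        "muted_gap": first_to_start - last_from_end,
    }


result = select_audio_frames(splice_t=1000, from_phase=10, to_phase=5, frame_dur=32)
print(result)
```

Under these assumptions the overlap is always shorter than one frame and D is always shorter than two frames, which matches the 32 ms and 64 ms bounds quoted above for a 32 ms audio frame.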
  • every audio frame includes a PTS. It is possible that some equipment manufacturers only include a PTS every, e.g., third audio frame. In this case, or the case where there is no audio PTS, a splicing operation may be performed after calculating a virtual time-stamp.
  • the virtual time stamp is derived from the approximate real-time delay of audio-frames from video reference time-stamps.
  • the virtual audio-time-stamp is then incremented by the (known) audio-frame duration on successive audio-frame starts. This process may be implemented as a backup process to ensure that non-time-stamped audio streams do not get into a studio where time stamps are crucial to the operation of the studio.
  • Auxiliary Data Splicing: auxiliary data is included in many MPEG streams. This data is usually present as contiguous, unbreakable streams of unknown length. By contrast, a compressed audio stream is relatively well-behaved and predictable.
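The virtual time-stamp calculation can be sketched as follows. This is an illustration under stated assumptions: time stamps are in 90 kHz units and wrap at 33 bits, as MPEG PTS values do, and the approximate audio delay relative to the video reference time stamp is supplied by the caller. The function name and its parameters are hypothetical.

```python
PTS_MODULO = 1 << 33  # MPEG PTS values are 33 bits wide


def assign_virtual_pts(frame_pts, video_ref_pts, audio_delay, frame_dur):
    """Return one time stamp per audio frame, real where present, virtual otherwise.

    frame_pts     -- list with the PTS of each audio frame, or None if absent
    video_ref_pts -- video reference time stamp (assumed known to the splicer)
    audio_delay   -- approximate delay of audio relative to that reference
    frame_dur     -- known audio-frame duration, in the same units
    """
    stamps = []
    current = None
    for pts in frame_pts:
        if pts is not None:
            current = pts                                   # trust a real PTS
        elif current is None:
            current = (video_ref_pts + audio_delay) % PTS_MODULO  # seed
        else:
            current = (current + frame_dur) % PTS_MODULO    # increment
        stamps.append(current)
    return stamps


# A PTS on every third frame, 32 ms frames (2880 ticks at 90 kHz).
print(assign_virtual_pts([9000, None, None, 17640], 0, 0, 2880))
```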
  • the auxiliary data stream may be associated with a corresponding video stream having a delay that is not now specified.
  • there are several methods for handling auxiliary data in a splicing operation, including: 1) ignore it and splice the auxiliary data at the same time as the video data; 2) insert auxiliary data through a separate path to, e.g., a play-to-air switcher (this data may comprise a program guide or other consumer-related information); and 3) define a set of segmentation markers for auxiliary data and rely upon these markers in switchers to keep segmentation correct (this requires knowledge about the content of auxiliary data streams and the lengths of auxiliary data segments within those streams).
  • the auxiliary data may also be switched with or without a delay, and the delay may be a parameter passed to the switcher by a decision making controller.
  • the auxiliary data may be input to the splicer via an auxiliary input.
  • Input arriving on the auxiliary-data input may be buffered and inserted into the output stream on a space-available basis as a replacement for null packets. In this case it becomes some other system unit's responsibility to align such data within streams, and to provide channel capacity for inserted aux-data by, e.g., reducing a video stream data rate.
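Space-available insertion of buffered auxiliary data can be sketched as follows. The sketch assumes 188-byte MPEG-2 transport packets and relies on the fact that null packets carry PID 0x1FFF. The function names and the queue-based buffering are illustrative assumptions, not the splicer of the specification.

```python
from collections import deque

NULL_PID = 0x1FFF  # PID reserved for null packets in MPEG-2 transport streams


def pid_of(packet):
    """Extract the 13-bit PID from bytes 1 and 2 of a transport packet."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def insert_aux(packets, aux_queue):
    """Yield the output stream, replacing null packets with queued aux packets."""
    for packet in packets:
        if pid_of(packet) == NULL_PID and aux_queue:
            yield aux_queue.popleft()   # use the free slot for auxiliary data
        else:
            yield packet                # pass everything else through unchanged


def make_packet(pid):
    """Build a dummy 188-byte packet with the given PID (for demonstration)."""
    return bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10]) + bytes(184)


stream = [make_packet(0x100), make_packet(NULL_PID),
          make_packet(NULL_PID), make_packet(0x100)]
aux = deque([make_packet(0x200)])
print([hex(pid_of(p)) for p in insert_aux(stream, aux)])
```

Because the auxiliary packets only occupy slots that were already empty, the output rate is unchanged; if too few null packets are available, the queue simply grows, which is why channel capacity must be provided elsewhere in the system.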
  • header information comprises, e.g., a splicing_point_flag, a splice_type field and a splice_countdown field.
  • the splicing_point_flag when set to 1, indicates that a splice_countdown field is present.
  • the splice_countdown field is an 8-bit integer specifying the number of transport stream packets remaining until a splicing point is reached, such as the end of a video frame.
  • the splice_type field is a 4-bit field used to derive splice_decoding_delay and max_splice_rate data from, e.g., a table storing such data.
  • the standard use of these header flags and fields to implement a splicing function is defined in the MPEG specification.
  • splice points within a transport stream may be in-points, out-points or both.
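Reading the splicing_point_flag and splice_countdown field from a transport packet can be sketched as follows. The byte offsets follow the MPEG-2 Systems adaptation field layout (flags byte after the length byte, optional 6-byte PCR and OPCR before splice_countdown). The function name is hypothetical, and the splice_type field, which MPEG-2 Systems carries in the adaptation field extension, is left out to keep the sketch short.

```python
def parse_splice_fields(packet):
    """Return (splicing_point_flag, splice_countdown) for one 188-byte packet.

    splice_countdown is None when the flag is not set, because the field
    is then absent from the adaptation field.
    """
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte transport packet")

    adaptation_field_control = (packet[3] >> 4) & 0x3
    if not (adaptation_field_control & 0x2):
        return (0, None)                 # no adaptation field in this packet

    af_len = packet[4]                   # adaptation_field_length
    if af_len == 0:
        return (0, None)                 # length byte only, no flags

    flags = packet[5]
    if not (flags & 0x04):               # splicing_point_flag
        return (0, None)

    pos = 6
    if flags & 0x10:                     # PCR_flag: skip 6-byte PCR
        pos += 6
    if flags & 0x08:                     # OPCR_flag: skip 6-byte OPCR
        pos += 6
    if pos > 4 + af_len:
        raise ValueError("adaptation field too short for splice_countdown")

    # splice_countdown is an 8-bit two's complement integer.
    countdown = int.from_bytes(packet[pos:pos + 1], "big", signed=True)
    return (1, countdown)
```

Only the transport header and adaptation field are touched; the payload is never inspected.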
  • An out-point is equivalent to the MPEG-definition of a splice point.
  • An in-point comprises a splice point (i.e., an out-point) followed by a sequence header that is immediately followed by an I-frame. Therefore, in-points within a particular stream may be identified by finding out-points followed by sequence headers that are followed immediately by I-frames.
  • a to-stream may be entered at any in-point, as described above, even if the out-point of a from-stream is not followed by a sequence header or an I-frame.
  • the splicing_point_flag of the packet immediately preceding the out-point must equal one, and the splice_countdown field of that packet must equal zero.
  • the above-described embodiment requires that the bitstream be parsed down to the elementary layer to examine, e.g., the picture_coding_type field in the picture header to determine if an I-frame is present.
  • entrance and exit indicia comprise information residing within the transport layer, thereby obviating the need to parse the bitstream down to its elementary layer.
  • an out-point in a from-stream is indicated by the splicing_point_flag being equal to one and the splice_countdown field being equal to zero.
  • an in-point in a to-stream is indicated by the splicing_point_flag being equal to one and the splice_countdown field being equal to negative one.
  • an in-point packet, i.e., the packet that immediately follows an in-point
  • an out-point packet, i.e., the packet including an out-point
  • the splice_type of an in-point packet indicates the suitability of splicing the in-packet to an out-packet in that the in-packet and the out-packet should both have the same splice_type.
  • an out-point that is not also an in-point must have the splicing_point_flag equal to zero in the packet immediately following the packet with the splicing_point_flag being equal to one and the splice_countdown being equal to zero.
  • the splicing_point_flag indicates that the out-point is not associated with an in-point. This is because the splicing_point_flag must be equal to one for the contents of the splice_countdown field to be valid, and the splice_countdown field must be valid and equal to negative one for an in-point.
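The transport-layer rules above can be sketched as follows. The `Pkt` tuple stands in for the already-parsed adaptation header of one transport packet; it and the function names are hypothetical, and the sketch only restates the conditions given in the text (flag and countdown values, next-packet rule, matching splice_type).

```python
from typing import NamedTuple, Optional


class Pkt(NamedTuple):
    """Parsed splice fields of one transport packet (illustrative only)."""
    splicing_point_flag: int
    splice_countdown: Optional[int] = None
    splice_type: Optional[int] = None


def is_out_point_packet(p):
    return p.splicing_point_flag == 1 and p.splice_countdown == 0


def is_in_point_packet(p):
    return p.splicing_point_flag == 1 and p.splice_countdown == -1


def classify_out_points(packets):
    """List (index, kind) per out-point; kind is 'out' or 'in+out'.

    An out-point is also an in-point only if the next packet is an
    in-point packet; otherwise the next packet's flag is zero.
    """
    result = []
    for i, p in enumerate(packets):
        if is_out_point_packet(p):
            nxt = packets[i + 1] if i + 1 < len(packets) else None
            both = nxt is not None and is_in_point_packet(nxt)
            result.append((i, "in+out" if both else "out"))
    return result


def can_splice(out_pkt, in_pkt):
    """True if the to-stream may be entered here from the from-stream."""
    return (is_out_point_packet(out_pkt)
            and is_in_point_packet(in_pkt)
            and out_pkt.splice_type == in_pkt.splice_type)


stream = [Pkt(0), Pkt(1, 0, 3), Pkt(1, -1, 3), Pkt(0), Pkt(1, 0, 3), Pkt(0)]
print(classify_out_points(stream))
```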
  • indicia for identifying in-points (i.e., entrance indicia) and out-points (i.e., exit indicia) within an MPEG-like bitstream provides several advantages.
  • One advantage, as previously described, is the ability to identify in-points and out-points by parsing only the adaptation header of the transport layer, and avoiding the parsing of the elementary layer.
  • Another advantage of the preferred embodiment is that each transport packet is self-contained in the sense that each packet contains sufficient information to determine if the particular packet comprises an in-point, an out-point or both. This allows the identification of an in-point or out-point in a packet without regard to any previous packets.
  • it is desirable to generate bitstreams according to the in-point and out-point syntax described above. Specifically, it is desirable to include entrance and exit indicia at many places within the bitstream, illustratively, at every I-frame during the bitstream encoding process.
  • FIG. 6 depicts a flow diagram of a routine 600 suitable for identifying in-points and out-points in accordance with the invention.
  • routine 600 is suitable for use by, illustratively, the bitstream examiners 310A
  • the routine 600 is entered at step 605 when a transport packet within a stream to be examined (e.g., S6 or S7) is received by, e.g., a bitstream examiner (e.g., 310A or 310B).
  • the routine 600 proceeds to step 610, where the packet header of the received packet is examined, and to step 615, where a query is made as to whether the splicing_point_flag within the adaptation header of the received packet is equal to 1.
  • If the query at step 615 is answered affirmatively, the routine 600 proceeds to step 620, where a query is made as to whether the splice_countdown field is equal to 0. If the query at step 620 is answered affirmatively, then the routine proceeds to step 635, where the packet is identified as containing an out-point. Such identification may take the form of setting an "out-point-ready" flag suitable for use in, e.g., step 210 of the routine 200 of FIG. 2. The routine then proceeds to step 635, where it is exited.
  • If the query at step 620 is answered negatively, then the routine 600 proceeds to step 630, where a query is made as to whether the splice_countdown field is equal to -1. If the query at step 630 is answered affirmatively, then the routine proceeds to step 640, where the packet is identified as containing an in-point. Such identification may take the form of setting an "in-point-queued" flag suitable for use in step 208 of the routine 200 of FIG. 2. The routine then proceeds to step 645, where it is exited.
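The decision flow of routine 600 can be sketched as follows. The step numbers in the comments refer to the queries described above. The `ExaminerState` class and the function signature are hypothetical, and the behavior when a query is answered negatively without a further step being described (the routine simply exits) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ExaminerState:
    """Flags a bitstream examiner might expose to the splicing routine."""
    out_point_ready: bool = False
    in_point_queued: bool = False


def routine_600(splicing_point_flag, splice_countdown, state):
    """Examine one packet's adaptation header fields and update the flags."""
    # Step 615: is the splicing_point_flag equal to 1?
    if splicing_point_flag != 1:
        return state                     # assumed: exit, countdown not valid

    # Step 620: is splice_countdown equal to 0?
    if splice_countdown == 0:
        state.out_point_ready = True     # packet contains an out-point
    # Step 630: is splice_countdown equal to -1?
    elif splice_countdown == -1:
        state.in_point_queued = True     # packet contains an in-point
    return state


state = ExaminerState()
routine_600(1, 0, state)
print(state)
```

Each call looks at a single packet only, which reflects the self-contained property noted above.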

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
EP98903583A 1997-01-21 1998-01-21 Informationsstromsyntax und anzeige des vorhandenseins eines spleisspunkts Withdrawn EP0954924A4 (de)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US3597597P 1997-01-21 1997-01-21
US3597P 1997-01-21
US08/996,871 US6038000A (en) 1997-05-28 1997-12-23 Information stream syntax for indicating the presence of a splice point
US996871 1997-12-23
PCT/US1998/001036 WO1998032281A1 (en) 1997-01-21 1998-01-21 Information stream syntax for indicating the presence of a splice point

Publications (2)

Publication Number Publication Date
EP0954924A1 true EP0954924A1 (de) 1999-11-10
EP0954924A4 EP0954924A4 (de) 2003-05-14

Family

ID=26712673

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98903583A Withdrawn EP0954924A4 (de) 1997-01-21 1998-01-21 Informationsstromsyntax und anzeige des vorhandenseins eines spleisspunkts

Country Status (3)

Country Link
EP (1) EP0954924A4 (de)
JP (1) JP2001509354A (de)
WO (1) WO1998032281A1 (de)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6154496A (en) * 1997-11-25 2000-11-28 Philips Electronics N.A. Corp. Video buffer for seamless splicing of MPEG streams
US6785289B1 (en) * 1998-06-05 2004-08-31 Sarnoff Corporation Method and apparatus for aligning sub-stream splice points in an information stream
WO2000001160A2 (en) * 1998-06-29 2000-01-06 Limt Technology Ab Method and apparatus for splicing data streams
US6912251B1 (en) * 1998-09-25 2005-06-28 Sarnoff Corporation Frame-accurate seamless splicing of information streams
FR2784844B1 (fr) * 1998-10-14 2002-03-29 France Telecom Procede de basculement de la ou des composantes video d'un premier programme audiovisuel numerique sur la ou les composantes d'un second programme audiovisuel numerique
FR2784845B1 (fr) * 1998-10-14 2001-02-23 France Telecom Procede de basculement de la ou des composantes video d'un premier programme audiovisuel sur la ou les composantes video d'un second programme audiovisuel numerique
GB2347812A (en) * 1999-03-08 2000-09-13 Nds Ltd Real time splicing of video signals
US6549669B1 (en) 1999-04-27 2003-04-15 Telefonaktiebolaget L M Ericsson (Publ) Predictive audio and video encoding with smooth scene switching capability
CN1321394A (zh) * 1999-07-16 2001-11-07 皇家菲利浦电子有限公司 A/v数据流的记录和编辑
US6985188B1 (en) * 1999-11-30 2006-01-10 Thomson Licensing Video decoding and channel acquisition system
JP4467984B2 (ja) * 2002-01-18 2010-05-26 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ オーディオのコード化
US7206494B2 (en) 2002-05-09 2007-04-17 Thomson Licensing Detection rules for a digital video recorder
US7260308B2 (en) 2002-05-09 2007-08-21 Thomson Licensing Content identification in a digital video recorder
JP5720691B2 (ja) 2010-09-30 2015-05-20 富士通株式会社 動画像符号化装置、動画像符号化方法及び動画像符号化用コンピュータプログラム
CN103838691B (zh) 2012-11-27 2018-08-14 中兴通讯股份有限公司 实现高速数据传输的方法及通用接口芯片
EP4088227A4 (de) 2020-01-07 2024-01-24 Nokia Technologies Oy Syntax auf hoher ebene für komprimierte darstellung neuronaler netze

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5534944A (en) * 1994-07-15 1996-07-09 Matsushita Electric Corporation Of America Method of splicing MPEG encoded video
JP3484832B2 (ja) * 1995-08-02 2004-01-06 ソニー株式会社 記録装置、記録方法、再生装置及び再生方法

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"INFORMATION TECHNOLOGY - GENERIC CODING OF MOVING PICTURES AND ASSOCIATED AUDIO INFORMATION: SYSTEMS" INTERNATIONAL STANDARD, NEW YORK, NY, US, 15 April 1996 (1996-04-15), pages I-XY,1-119, XP000667435 *
See also references of WO9832281A1 *
WEISS S M: "SWITCHING FACILITIES IN MPEG-2: NECESSARY BUT NOT SUFFICIENT" SMPTE JOURNAL, SMPTE INC. SCARSDALE, N.Y, US, vol. 104, no. 12, 1 December 1995 (1995-12-01), pages 788-802, XP000543847 ISSN: 0036-1682 *

Also Published As

Publication number Publication date
WO1998032281A1 (en) 1998-07-23
EP0954924A4 (de) 2003-05-14
JP2001509354A (ja) 2001-07-10

Similar Documents

Publication Publication Date Title
US6038000A (en) Information stream syntax for indicating the presence of a splice point
US6137834A (en) Method and apparatus for splicing compressed information streams
US6741290B1 (en) Processing coded video
EP0881838B1 (de) Korrektur der vorgegebenen Zeit
US6907081B2 (en) MPEG encoder control protocol for on-line encoding and MPEG data storage
US5859660A (en) Non-seamless splicing of audio-video transport streams
EP1397918B1 (de) Verbinden von digitalen videotransportströmen
US6912251B1 (en) Frame-accurate seamless splicing of information streams
US5877812A (en) Method and apparatus for increasing channel utilization for digital video transmission
US6678332B1 (en) Seamless splicing of encoded MPEG video and audio
US6806909B1 (en) Seamless splicing of MPEG-2 multimedia data streams
US7477692B2 (en) Video encoding for seamless splicing between encoded video streams
US7254175B2 (en) Frame-accurate seamless splicing of information streams
WO1998032281A1 (en) Information stream syntax for indicating the presence of a splice point
WO2010125582A2 (en) Method and apparatus for splicing a compressed data stream
EP3360334B1 (de) Spleissungssystem und -verfahren für digitale medien
KR100517794B1 (ko) 압축된정보스트림을스플라이싱하는방법및장치
WO2000062551A1 (en) Frame-accurate seamless splicing of information streams

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19990708

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB IT NL

A4 Supplementary search report drawn up and despatched

Effective date: 20030328

RIC1 Information provided on ipc code assigned before grant

Ipc: 7H 04N 7/24 B

Ipc: 7H 04N 5/262 A

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20060307