MXPA00001917A - Frame-accurate editing of encoded a/v sequences - Google Patents
Frame-accurate editing of encoded A/V sequences
- Publication number
- MXPA00001917A (MXPA/A/2000/001917A)
- Authority
- MX
- Mexico
- Prior art keywords
- sequence
- frames
- frame
- bridge
- sequences
- Prior art date
Abstract
A method and apparatus are provided for generating bridge segments (B) to enable editing jumps to be made from one A/V segment (A) to another (C) whilst handling timing and frame constraints imposed by the A/V segment compliance with coding conventions, such as MPEG. The bridge segment is constructed by copying data from the two sequences (A, C) to be bridged, with some demultiplexing, decoding, remultiplexing and re-encoding of this data to maintain the validity of the edited data stream. Different procedures in terms of copying and/or re-encoding are applied in dependence on the picture encoding types at the source and destination of the edit via the bridging segment.
Description
FRAME-ACCURATE EDITING OF ENCODED AUDIO/VIDEO SEQUENCES
The present invention relates to the storage, retrieval and editing of frame-based encoded audio and/or video data, particularly, but not essentially, in conjunction with optical disc storage for the data and the use of coding schemes complying with the MPEG protocol. Recently there has been a demand for domestic and commercial audio and/or video (hereinafter "A/V") devices that support a large amount of user interactivity, and from this arises the need to join A/V segments seamlessly, such that the transition between the end of one segment and the beginning of the next can be handled smoothly by the decoder. This implies that, from the point of view of the user, there is no perceptible change in the displayed frame rate and the audio continues uninterrupted. The applications for seamless video are numerous, with particular domestic applications including the editing of home movies and the removal of commercial breaks and other discontinuities in recorded broadcast material. Further examples include backgrounds for sprites (computer-generated images): an example of the use of this technique would be an animated character that runs in front of an MPEG-encoded video stream. Another is a series of character-user interactions presented as short, seamless clips, where the outcome of an interaction determines which clip appears next. A development of this is the interactive movie, in which the user (viewer) can influence the plot. The branch points along the path that a user chooses to take through the interactive movie should appear seamless, otherwise the user will lose the suspension of disbelief normally associated with watching a movie. A problem with frame-encoded sequences, in particular those such as MPEG-compliant schemes involving predictive coding between frames for at least the video content, is that it is not possible to jump easily other than from the last frame of a first group of pictures (GOP) to the first frame of a new GOP, let alone from one arbitrarily selected frame to another. This is due to temporal dependencies, timing constraints and buffering considerations, among others, as will be discussed hereinafter. Therefore, an object of the present invention is to enable the playback of stored audio and/or video clips or frame sequences in such a way as to allow them to be joined without causing perceptible disturbances. According to the present invention, there is provided a data processing apparatus comprising means operable to read frame-based data sequences from a storage device and edit them, such as to link from a first edit point in a first sequence of frames to a second edit point in a second sequence, where, for each of the stored frame sequences, a number of the frames (hereinafter referred to as "I frames") are intra-coded without reference to any other frame of the sequence, a number (hereinafter referred to as "P frames") are each coded with reference to one further frame of the sequence, and the remainder (hereinafter referred to as "B frames") are each coded with reference to two or more further frames of the sequence; the apparatus including bridge generation means configured to create a bridge sequence of frames linking the first and second edit points, by selectively incorporating frames of the first and second stored frame sequences and selectively re-encoding one or more of the frames within the bridge sequence, as determined by the coding type (I, P or B) of the frames of the first and second sequences indicated by the respective edit points.
Also in accordance with the present invention, there is provided a method for editing frame-based data sequences, such as to link from a first edit point in a first sequence of frames to a second edit point in a second sequence, where, for each of the sequences of frames, a number of the frames (hereinafter "I frames") are intra-coded without reference to any other frame of the sequence, a number (hereinafter "P frames") are each coded with reference to one further frame of the sequence, and the remainder (hereinafter "B frames") are each coded with reference to two or more further frames of the sequence; the method including the step of creating a bridge frame sequence linking the first and second edit points, the bridge frame sequence incorporating frames of the first and second frame sequences with selective re-encoding of frames within the bridge sequence, as determined by the coding type of the frames of the first and second sequences indicated by the respective edit points.
Through the use of bridge sequence generation, which may be effected by a suitably configured subsection of a signal processing apparatus handling data transfer to and from the storage device, means are provided to solve the problem of making frame-accurate edits of video and/or audio in MPEG-compliant and similar program streams where, due to the temporal dependencies and buffering models used in such coding and multiplexing techniques, simple cut-and-paste edits cannot be made at arbitrary frame boundaries. Further features of the present invention are set forth in the appended claims, the disclosure of which is incorporated herein by reference, and to which the reader's attention is now directed. These and other aspects of the invention are described in terms of the following exemplary, but non-limiting, embodiments. The preferred embodiments will now be described by way of example only, and with reference to the accompanying drawings, in which: Figure 1 is a schematic block representation of an optical disc recording/reproducing apparatus suitable for embodying the invention;
Figure 2 is a more detailed schematic showing the components within the apparatus of Figure 1; Figure 3 represents the recording of information blocks in sequence areas of an optical disc; Figure 4 represents the reproduction of the information stored on the disc of Figure 3; Figure 5 illustrates in general terms the editing of stored video data, with the bridge sequences omitted; Figure 6 represents the splice points required for a pair of MPEG video picture streams in presentation order; Figure 7 illustrates the sequence boundaries in relation to a generated bridge sequence; Figure 8 schematically depicts differences in the duration of video and audio signal frames and their relation to data packet size; Figure 9 represents the creation of a bridge segment between two sequences of A/V frames; and Figure 10 illustrates audio packet delay in a stream of composite A/V packets. The following description considers in particular A/V devices operating according to the MPEG standards
(ISO/IEC 11172 for MPEG-1 and, in particular, ISO/IEC 13818 for MPEG-2), although the person skilled in the art will recognise the applicability of the present invention to other A/V coding schemes that do not conform to the MPEG standard. The following describes how the present invention solves the problem of making frame-accurate edits of video and/or audio in an MPEG Program Stream where, due to the temporal dependencies and buffering patterns used in MPEG coding and multiplexing, simple cut-and-paste edits cannot be made at arbitrary frame boundaries. To facilitate editing, bridge sequences are generated - that is, short sequences of MPEG data specially constructed (in a manner to be described) to link two original MPEG data recordings. As will be described, under certain circumstances it becomes necessary to partially decode and re-encode sections of this data to construct a valid MPEG stream. The final element in the video edit is a control structure, or play list. It tells the reproduction system how to sequence through the two streams. It contains the starting point in the original stream and information about the start of the bridge sequence. It contains information about where to jump into the second stream from the end of the bridge sequence. It may also contain other information to make the management of the reproduction easier.
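The play-list control structure described above can be sketched as a small data structure. The field names and address scheme below are illustrative assumptions for the purpose of the sketch, not the patent's actual on-disc format:

```python
from dataclasses import dataclass

@dataclass
class PlayListEntry:
    """One edit in the play list: where to leave the first stream,
    which bridge sequence to play, and where to resume in the second
    stream (all addresses are hypothetical logical addresses)."""
    stream1_start: int   # where playback of stream 1 begins
    stream1_exit: int    # address at which to jump out of stream 1
    bridge_start: int    # address of the generated bridge sequence
    bridge_end: int      # address of the last byte of the bridge
    stream2_entry: int   # address at which playback resumes in stream 2

def playback_order(entry: PlayListEntry):
    """Return the three contiguous reads the player performs for one edit:
    stream 1 up to the exit, the whole bridge, then stream 2 onwards."""
    return [
        (entry.stream1_start, entry.stream1_exit),
        (entry.bridge_start, entry.bridge_end),
        (entry.stream2_entry, None),  # continue to end of stream 2 (or next entry)
    ]
```

A player walking the play list simply issues these reads in order; the bridge hides the discontinuity between the two original recordings.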
Figure 1 shows an embodiment of an apparatus suitable for embodying the present invention, in the form of an optical disc recording and reproducing device. The description of the device concentrates on the handling of frame-based video signals, although it will be recognised that other types of signal, such as audio or data signals, may alternatively or additionally be processed, and that the invention is equally applicable to other storage devices, such as magnetic data storage media and computer hard disk drives. The apparatus comprises an input terminal 1 for receiving a video signal to be recorded on the optical disc 3. Further, the apparatus comprises an output terminal 2 for supplying a video signal reproduced from the disc. The data area of the disc 3 consists of a contiguous range of physical sectors having corresponding sector addresses. This address space is divided into sequence areas, a sequence area being a contiguous sequence of sectors. The apparatus as shown in Figure 1 is broken down into two main system parts, namely the disc subsystem 6 and what is referred to herein as the video recorder subsystem 8, which controls both recording and reproduction. The two subsystems are characterised by a number of features, as will be readily understood, including that the disc subsystem can be addressed transparently in terms of logical addresses and can guarantee a maximum sustainable bit rate for reading and/or writing. Figure 2 shows a schematic version of the apparatus in greater detail. The apparatus comprises a signal processing unit 100, which is incorporated in the subsystem 8 of Figure 1. The signal processing unit 100 receives the video signal via the input terminal 1 and processes the video data into a channel signal for recording on the disc 3. A read/write unit, indicated by dashed line 102, is incorporated in the disc subsystem 6 of Figure 1.
The read/write unit 102 comprises a read/write head 104 configured to read from and write to the optical disc 3. Positioning means 106 are present to position the head 104 in a radial direction across the disc 3. A read/write amplifier 108 is present to amplify the signals to and from the disc 3. A motor 110 rotates the disc 3 in response to a motor control signal supplied by the signal generating unit 112. A microprocessor 114 is present to control all the circuits via control lines 116, 118 and 120.
The signal processing unit 100 is adapted to convert the video data received via the input terminal 1 into information blocks in the channel signal. The size of the information blocks may be variable but may (for example) be between 2 Mb and 4 Mb. The writing unit 102 is adapted to write an information block of the channel signal into a sequence area on the disc 3. The information blocks corresponding to the original video signal are written into a number of sequence areas that are not necessarily contiguous, as can be seen in the recording diagram of Figure 3; this is known as a fragmented recording. It is a feature of the disc subsystem that it is able to record and write such fragmented recordings fast enough to satisfy real-time deadlines. To allow the editing of video data recorded in a first recording step on the disc 3, the apparatus is further provided with an input unit 130 for receiving an output position (exit point) in a first video signal recorded on the disc 3 and for receiving an input position (entry point) in a second video signal recorded on the same disc. Additionally, the apparatus comprises a bridge sequence generating unit 134, incorporated in the signal processing unit 100, for generating the bridge sequence linking the two video streams as described in detail below. The recording of a video signal will be discussed briefly with reference to Figure 3. In the video recorder subsystem, the video signal, which is a real-time signal, is converted into a real-time file RTF as shown in the upper part of Figure 3. The real-time file consists of a succession of signal block sequences SEQ to be recorded in corresponding (though fragmented) sequence areas. There is no restriction on the location of the sequence areas on the disc and, consequently, any two consecutive sequence areas comprising portions of the recorded video signal data can be anywhere in the logical address space LAS, as shown in the lower part of Figure 3.
Within each sequence area, the real-time data is allocated contiguously. Each real-time file represents a single A/V stream. The A/V stream data is obtained by concatenating the sequence data in file-sequence order. Next, the reproduction of a video signal recorded on the disc 3 will be discussed briefly with reference to Figure 4. The reproduction of a video signal is controlled by means of a playback control (PBC) program. In general, each PBC program defines a new playback sequence PBS, which may comprise an edited version of the recorded video and/or audio segments, and may specify a succession of segments from respective sequence areas. As can be seen from a comparison of Figures 3 and 4, the PBC required to recreate the original file sequence (of Figure 3) reorders the fragmented recorded segments to provide a succession of reproduced frames corresponding to the original sequence. The editing of one or more video signals recorded on the disc 3 is discussed with reference to Figure 5, which shows two video signals indicated by two sequences of fragments named "file A" and "file B". To make an edited version of one or more of the originally recorded video signals, a new PBC program is generated to define the A/V sequence obtained by concatenating parts of the first A/V recordings in a new order. The parts can be from the same recording or from different recordings. To play a PBC program, data from several parts of (one or more) real-time files have to be supplied to a decoder. This implies a new data stream obtained by concatenating parts of the streams represented by each real-time file. In Figure 5, this is illustrated for a PBC program that uses three parts, one from file A and two from file B.
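Rebuilding an A/V stream from a fragmented recording, as described above, amounts to reading the sequence areas in file order rather than disk order. A minimal sketch, in which the area table stands in for the file system's own metadata (an assumption for illustration):

```python
def read_stream(disk: bytes, sequence_areas):
    """Rebuild an A/V stream from fragmented sequence areas.

    `sequence_areas` lists (start, length) pairs in *file* order;
    their physical positions on disk may be anywhere (fragmented
    recording), but concatenating them in file order recovers the
    original stream.
    """
    return b"".join(disk[start:start + length]
                    for start, length in sequence_areas)
```

A PBC program for an edited version would simply supply a different table of (start, length) parts, possibly drawn from several files.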
Figure 5 shows the edited version starting at point P1 in the succession of sequence areas of file A and continuing to point P2 in a further sequence area of file A. Playback then jumps to point P3 in a sequence area of file B and continues to point P4 in a further sequence area of file B. Next, playback jumps to point P5 in the same file B, which may be at a point in the succession of sequence areas of file B earlier than point P3, or at a point in the sequence later than point P4. From point P5 in the sequence area of file B, playback continues to point P6. The generation of the bridge sequences for the transitions P2-P3 and P4-P5 has been omitted from Figure 5 for reasons of clarity: the reasons for, and means for generating, those bridge sequences will now be considered. As will be understood, in general terms the following examples relate to frame-based rather than file-based editing: this is because the general unit of video coding in MPEG is the frame. It will be recognised by those skilled in the art that compliance with MPEG is not mandatory (as mentioned above) and that the techniques described herein can also be applied to non-MPEG frame-based data.
To create a seamless edit from one MPEG stream to another using a bridge sequence, a number of factors and conditions have to be observed, as summarised below and considered in detail hereinafter. Beginning with the elementary streams, and considering first the video aspects: Field sequence: the field order (top/bottom) must be preserved across all jumps, into or out of the bridge sequence. Resolution change: if there is a change in resolution, seamless reproduction can be guaranteed if required; the apparatus can be simplified if a limited number of permissible resolution levels is used (for example half or full). 3:2 pulldown: the field order (top/bottom) must be preserved across all jumps. Mixed frame rates (for example from NTSC and PAL sources): in these circumstances, seamless reproduction can only be guaranteed at additional apparatus cost and complexity, since such mixing requires a change of vertical synchronisation for display. Such mixing of standards, and consequently of frame rates, should therefore be avoided wherever possible.
Image types: different operations will be required depending on the picture type (I, P, B) involved, as discussed below. Turning now to the audio aspects, the first of these is that of gaps. For an edit in a combined A/V stream, the join will generally be seamless in the video, but there may be a discontinuity in the audio frame structure - either in the form of a gap or an overlap - because audio frames are generally of different duration to video frames. To handle this, information is needed in the play list to assist control of the player. Another audio aspect is that of frame structure, it being the responsibility of the creator of the bridge sequence to ensure that a continuous sequence of complete audio frames is presented to the decoder. Considering the multiplexing aspects, jumps in the System Clock Reference (SCR) time base can be expected at any frame boundary at the connection, and consequently the decoder must be able to reconstruct the correct time base. Additionally, across all seamless jumps the System Target Decoder (STD) constraints must be respected, it being the responsibility of the process that creates a bridge sequence to ensure this.
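Several of the video-side conditions listed above (matching frame rate, matching or permitted resolution, preserved field order) can be expressed as a simple pre-flight check on the two sequences to be joined. The dictionary keys below are assumed metadata fields for illustration only:

```python
def can_join_seamlessly(seq1, seq2, allow_resolution_change=False):
    """Check the join conditions the text imposes on the video side.

    seq1/seq2 are dicts with hypothetical 'frame_rate', 'resolution'
    and 'top_field_first' metadata.
    """
    # Mixed frame rates (e.g. NTSC vs PAL) require a vertical-sync
    # change for display and should be avoided.
    if seq1["frame_rate"] != seq2["frame_rate"]:
        return False
    # A resolution change is only tolerable if the player supports it
    # (e.g. a limited set of permissible levels such as half/full).
    if seq1["resolution"] != seq2["resolution"] and not allow_resolution_change:
        return False
    # Field order (top/bottom) must be preserved across the jump.
    if seq1["top_field_first"] != seq2["top_field_first"]:
        return False
    return True
```

A bridge generator would run such a check before committing to a seamless join, falling back to a non-seamless edit otherwise.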
For multiplexing, the concern of audio skew arises: in a typical implementation there will be a skew between the arrival times of concurrently decoded audio and video. To handle this, the decoder must have the facility to read ahead additional audio frames. Finally, the main disc-layout concern is that of allocation requirements, it being the responsibility of the process that creates the edit to ensure that the minimum contiguous-area requirements are satisfied. As suggested above, the connections must be made seamlessly in terms of the decoding and presentation of the video stream, as generally illustrated in Figure 6. Accordingly, the unneeded pictures after the exit point or before the entry point are excluded during a process that re-encodes a part of the sequences around the edit point. The continuous supply of data is a precondition for seamless decoding, which must be guaranteed by the file system. At the end of the sequence before the connection (SEQ.1) an MPEG End of Sequence code is placed, and at the beginning of the sequence after the connection point (SEQ.2) there is a sequence header. The video material at the end of SEQ.1 and at the beginning of SEQ.2 will probably need to be re-encoded. As shown in Figure 7, the connection is made by creating a video bridge sequence. The bridge sequence consists of re-encoded video of the original content on either side of the exit point and the entry point. The first part of the bridge forms the end of SEQ.1. This is a piece of video encoded up to and including the intended exit point. It is re-encoded to connect to the preceding frames of SEQ.1 and form a compliant, continuous elementary stream. Similarly, the second part of the bridge forms the start of SEQ.2. This consists of data encoded from the entry point in SEQ.2 onwards.
The data is re-encoded to give an effective starting point for decoding, and to connect to the rest of SEQ.2 to form a compliant, continuous elementary stream. The video bridge contains the connection between the two sequences. All video data in SEQ.1 and SEQ.2 complies with the MPEG video specification, with SEQ.2 starting with an I picture and a GOP header. The I picture is the first presentation unit in that GOP (temporal reference = 0). This ensures that there is a "clean break" between the video data of the sequences, and means that the last bit of the video data of SEQ.1 is delivered before any bit of the video data of SEQ.2. The additional restrictions imposed include that the video presentation units defined in the bitstream must be continuous across the connection, with no frame or field gap in the presentation at the connection. In terms of the audio, the difference in the sizes of the video and audio frames can lead to a gap in the sequence of audio presentation units at a connection. Although a gap of less than one audio frame duration can be tolerated, it is preferred to insert an additional audio frame at this point, so that there is an overlap in the definition of the audio presentation units of less than one audio frame period. As to the multiplexing aspects, at the end of SEQ.1 and the start of SEQ.2 the sections that form the bridge sequence are re-encoded, remultiplexed and stored in a multiplex bridge, to ensure that the STD model is obeyed. To satisfy the requirements of this STD model, the multiplex bridge is likely to be longer in time than the bridge sequence. The timing of all presentation units before, during and after the connection point is determined by a single reference timeline, so that in the reproduction model the reproduction is seamless. For file allocation, the connection is constructed in such a way as to guarantee the continuous supply of data by the file system.
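The gap-versus-overlap bookkeeping above is simple arithmetic: count how many complete audio frames are needed to cover the video up to the join, with the preferred overshoot of less than one audio frame rather than a gap. A sketch, using illustrative figures (24 ms MPEG audio frames, 40 ms PAL video frames are assumptions, not values from the text):

```python
import math

def audio_frames_for_video(video_end_time, audio_frame_duration):
    """Return (n_frames, overlap): the number of complete audio frames
    needed to cover the video up to `video_end_time`, overshooting by
    `overlap` seconds (always less than one audio frame), as the text
    prefers over leaving a gap at the join."""
    n_frames = math.ceil(video_end_time / audio_frame_duration)
    overlap = n_frames * audio_frame_duration - video_end_time
    return n_frames, overlap
```

For example, ten 40 ms PAL video frames (0.400 s) need seventeen 24 ms audio frames, giving an 8 ms overlap at the join.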
This is done by allocating the multiplex bridge as part of a new allocation that connects into the bodies of SEQ.1 and SEQ.2. The jump point out of the original data of SEQ.1 to the new area containing the end of SEQ.1 must be chosen to satisfy the conditions on allocation of contiguous blocks of data for real-time data supply; the new allocation must contain at least the multiplex bridge. This new allocation may be larger than the multiplex bridge if required. The length of the bridge allocation (which contains the end of SEQ.1 and the start of SEQ.2, including the multiplex bridge) should be chosen so that the conditions on the permissible degree of fragmentation are satisfied, and the point at which SEQ.2 jumps back to the original data sequence should be chosen to satisfy the aforementioned condition on the allocation of contiguous blocks. It should be noted that the jump points near the end of SEQ.1 and near the start of SEQ.2 are not directly linked to the start and end of the multiplex bridge. They must be chosen by the system that creates the edit so as to satisfy the allocation rules. It is always possible to choose jump points that satisfy the continuous-supply conditions for an arbitrary choice of the entry point and exit point of the edit.
At the video sequence level, frames can be copied from the original sequences or decoded and then re-encoded to form the bridge sequence. The decision to re-encode (to preserve quality) or copy (to improve speed) depends on several factors: re-encoding can be unavoidable because the reference picture used is no longer present; re-encoding may be indicated because the reference picture has changed, but, since the picture content is the same (albeit re-encoded), it may be decided to copy instead, trading accuracy for speed; or re-encoding may be chosen to reduce the bit rate. There are a few combinations of cases that have to be considered, as will now be described. In these examples, the letters I, P and B have their conventional MPEG picture (frame) meanings; the subscript numbers after the picture-type letter indicate the presentation order of the picture; the subscript letters indicate source (s) or destination (d); and bold letters identify the particular picture illustrating the current example.
The first example has the source picture (the frame in the first sequence) from which to jump being a B picture. In presentation order:
I0s B1s B2s P3s B4s B5s P6s B7s B8s P9s B10s B11s
In the order of the bitstream:
I0s B-1s B-2s P3s B1s B2s P6s B4s B5s P9s B7s B8s
If the jump were made directly after the B5s picture, the decoder would present incorrectly. Therefore, if the exit point of the edit is a B picture, the jump must be made at the previous P picture (in presentation order) and the B pictures in the bridge sequence re-encoded. The sequence of pictures up to the exit point is then:
I0s B-1s B-2s P3s B1s B2s IMG_REF B*4s B*5s
where IMG_REF is a reference picture (I or P) taken from the destination stream, and B*4s and B*5s correspond in picture content to the source-stream frames B4s and B5s but have been re-encoded on the basis of the new reference picture.
In an alternative arrangement, to ensure a "clean break" at the connection as explained above, the coding type of picture B*4s can be changed to a P frame, so that the injection of IMG_REF from the destination sequence into the source sequence is avoided. With this change, the sequence of pictures up to the exit point becomes:
I0s B-1s B-2s P3s B1s B2s P*4s B*5s
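The rule of this first example (a B-picture exit forces the jump back to the previous P picture, with the trailing B pictures re-encoded, and optionally the first of them converted to P to avoid injecting IMG_REF) can be sketched as a small planning function. This is an illustrative summary, not the patent's literal procedure:

```python
def plan_exit(pictures, exit_index, avoid_ref_injection=True):
    """Plan the tail of the bridge for an exit at pictures[exit_index].

    `pictures` is the presentation-order list of picture types, e.g.
    ['I','B','B','P','B','B','P',...]. Returns (anchor, actions) where
    `anchor` is the picture at which the copy stops and `actions`
    lists the trailing pictures to re-encode.
    """
    if pictures[exit_index] in ("I", "P"):
        # I/P exit: every copied picture decodes correctly; plain copy.
        return exit_index, []
    # B exit: back up to the previous I/P picture in presentation order...
    anchor = exit_index
    while pictures[anchor] == "B":
        anchor -= 1
    # ...then re-encode the B pictures between the anchor and the exit.
    actions = []
    for i in range(anchor + 1, exit_index + 1):
        if i == anchor + 1 and avoid_ref_injection:
            actions.append((i, "re-encode as P"))  # clean break: no IMG_REF
        else:
            actions.append((i, "re-encode as B"))
    return anchor, actions
```

Applied to the sequence above with the exit at B5s, it stops the copy at P3s and re-encodes B4s (as P) and B5s.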
In the next example, the source picture from which to jump is a P or I picture. In presentation order, the original sequence is:
I0s B1s B2s P3s B4s B5s P6s B7s B8s P9s B10s B11s
In the order of the bitstream:
I0s B-1s B-2s P3s B1s B2s P6s B4s B5s P9s B7s B8s
If the exit point is P6s, then the jump is made after B5s in bitstream order. All pictures in the excerpted sequence will be decoded correctly and can therefore simply be copied. The case is the same for an I picture in place of the P picture. In the third example, the destination picture to which the jump is made is a B picture. In presentation order, the original destination sequence is:
I0d B1d B2d P3d B4d B5d P6d B7d B8d P9d B10d B11d
In the order of the bitstream, the original destination sequence is:
I0d B-1d B-2d P3d B1d B2d P6d B4d B5d P9d B7d B8d
The composite bridge sequence is
XXXXXXXX P6d B4d B5d P9d B7d B8d
where the Xs are the pictures copied or re-encoded from the source sequence. There are two cases (depending on whether we jump out at an I/P picture or a B picture, as above), with the following respective possibilities for the XXX stream:
I0s B-1s B-2s P3s B1s B2s IMG_REF B*4s B*5s
I0s B-1s B-2s P3s B1s B2s P6s B4s B5s
In either case, P6d needs to be re-encoded since it has lost its reference picture; B4d must be removed from the sequence; B5d must be re-encoded; and P9d and all the other pictures in the GOP (Group of Pictures) should be re-encoded because P6d has been re-encoded. In practice, however, it may be possible simply to copy P9d and accept the limited quality degradation caused by the mismatch, although all pictures after the splice may need their temporal references changed. Again, to preserve the clean break at the connection, we may re-encode P6d, changing its picture type to an I frame; B4d must be excluded and B5d re-encoded. Again, all frames should then be re-encoded, but it may be considered sufficient simply to recalculate the temporal references. The last of these examples considers the case where the destination picture for the jump is an I or P picture. In presentation order, the original destination sequence is:
I0d B1d B2d P3d B4d B5d P6d B7d B8d P9d B10d B11d P12d B13d B14d P15d
In bitstream order, the original destination sequence is:
I0d B-1d B-2d P3d B1d B2d P6d B4d B5d P9d B7d B8d P12d B10d B11d P15d B13d B14d
The composite bridge sequence is:
XXXXXXXX P9d B7d B8d P12d B10d B11d
where the Xs are the pictures copied or re-encoded from the source sequence. As before, there are two cases, depending on whether we jump out at an I/P picture or a B picture, with the following generalised possibilities for the XXX stream:
I0s B-1s B-2s P3s B1s B2s IMG_REF B*4s B*5s
I0s B-1s B-2s P3s B1s B2s P6s B4s B5s
In either case, P9d needs to be re-encoded since it has lost its reference picture; B7d and B8d must be removed from the sequence since neither is relevant to the edited stream; and P12d and all the other pictures in the GOP should be re-encoded because P9d has been re-encoded. However, it may be possible simply to copy P12d and accept the limited quality degradation caused by the mismatch, although all pictures after the splice may need their temporal references changed. As described above, the first option's IMG_REF can be avoided by changing the frame type of B*4s, to preserve the clean break. Moving on to the field sequence, this must be preserved across a seamless join. Normally, with frame-structured coding not using 3:2 pulldown, this is the default behaviour when edits are based on frames or field pairs. Where the repeat_first_field and top_field_first flags (options of the standard MPEG encoder) are used, care must be taken to ensure preservation of field dominance. This is easier if a DTS/PTS time stamp (Decoding Time Stamp / Presentation Time Stamp) is present on each coded frame. Where frames do not carry time stamps, it is necessary to inspect the repeat_first_field and top_field_first flags to determine the field sequence. This is an additional criterion that must be satisfied at a jump. To give the "clean break" at the edit points, the multiplexing is restricted so that all data for SEQ.1 is delivered to the STD input before the first data for SEQ.2 is delivered. This gives a single point of discontinuity in the data supply. Note that both SEQ.1 and SEQ.2, when considered independently, may suitably (though not essentially) comply with the ISO/IEC 13818-1 P-STD, although other multiplexing methods may be used. The coding and multiplexing of video packets in both SEQ.1 and SEQ.2 are constrained by buffer continuity, as will be described later.
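The four cases worked through above reduce to a small decision table keyed on the picture types at the exit and entry points. The sketch below is an informal summary of that logic, not the patent's exhaustive rules:

```python
def bridge_recode_plan(source_exit_type, dest_entry_type):
    """Summarise the re-encoding work for a bridge, per the four examples.

    Types are 'I', 'P' or 'B'. Returns (source_side, dest_side)
    descriptions of the work needed on each side of the bridge.
    """
    if source_exit_type == "B":
        src = ("jump at the previous P picture and re-encode the trailing "
               "B pictures (the first may be converted to P for a clean break)")
    else:  # 'I' or 'P' exit: all copied pictures decode correctly
        src = "copy up to the exit point unchanged"
    if dest_entry_type == "I":
        dst = "an I picture with a GOP header can start SEQ.2 cleanly"
    else:  # 'P' or 'B' entry: the anchor P has lost its reference
        dst = ("re-encode the entry picture that lost its reference, drop the "
               "B pictures that precede the entry point, and re-encode (or at "
               "least re-reference) the remainder of the GOP")
    return src, dst
```

This is the trade-off the text describes: copying where the skipped pictures still decode correctly, re-encoding wherever a reference picture has been removed or replaced.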
The last packet of an audio stream in SEQ.1 contains the sample whose presentation interval includes the end of the presentation period of the last video frame in SEQ.1. The first audio packet of an audio stream in SEQ.2 contains the sample whose presentation interval includes the start of the presentation period of the first video frame in SEQ.2. This definition means that the audio packets at the end of SEQ.1 and at the start of SEQ.2 can overlap in their delivery time: the player behaviour required in this case is defined later. While playing a single sequence, only data from a single time base are present in the STD buffers, with the STD model operating as the P-STD model defined in ISO/IEC 13818-1 (MPEG-2 Systems): in this way, continuous presentation of audio and video can be guaranteed. During the transition from one sequence to another, the time base of SEQ.2 is likely not to be the same as that of SEQ.1. The presentation of video data is required to continue seamlessly. There may be an overlap OV in the presentation time of the audio presentation units, as illustrated in Figure 8. Turning to the handling of time-stamp discontinuities in the MPEG stream, with a jump from a first sequence to the bridge sequence and then to the second sequence, there will be a change in the time base of the SCR/DTS/PTS time stamps recorded with the stream at the discontinuity where the two sequences meet in the middle of the bridge. Figure 9 illustrates this arrangement for the following example, with the bridge sequence B between the first sequence A and the second sequence C. In the example, each image n has a size in bits given by A_n. Each image has decoding and presentation time stamps based on the system clock, DTS_An and PTS_An. Each image has a DTS and PTS value recorded in the bitstream, or inferred from previously recorded values, DTS'_An and PTS'_An.
Each image has a start code that is delivered to the STD model at a time that can be derived from the recorded SCR values, SCR'_An. Each image has a start code arrival time, the real time in the STD model at which the data is delivered to the STD buffer, SCR_An. The image period is T. For sequence A: SCR'_An = SCR_An and DTS'_An = DTS_An for all images n (in other words, the system clock in the player and the time base recorded in the stream are the same). In cases where this is not true (for example, after a previous jump), they will differ by a constant. Considering the timing of the video presentation, presentation is continuous, without a gap, across the junction. Using the following:
PTS1_final: the PTS in the bitstream of the last video presentation unit of SEQ.1;
PTS2_initial: the PTS in the bitstream of the first video presentation unit of SEQ.2;
Tpp: the presentation period of the last video unit of SEQ.1.
Next, the offset between the two time bases, STC_delta, is calculated from the data in the two bitstreams: STC_delta + PTS2_initial = PTS1_final + Tpp. Consequently STC_delta = PTS1_final - PTS2_initial + Tpp. Until the time T1 (SCR1_end_video), the time at which the last video packet of SEQ.1 has completely entered the STD, the timing of input to the STD is determined by the SCRs of the packets in SEQ.1 and the STC. The remaining packets of SEQ.1 should enter the STD at the mux_rate of SEQ.1. The time at which the last bit of SEQ.1 enters the buffer is T2. If N is the number of bits in the trailing audio packets, then: ΔT = T2 - T1 = N / mux_rate. After the time T2, the input timing to the STD is determined by the clock STC' of the new time base and the SCRs of SEQ.2, where STC' is calculated as follows: STC' = STC - STC_delta. Note that this definition of the input schedule creates an overlap in the delivery time of the trailing audio access units of SEQ.1 and any leading audio access units of SEQ.2. There is no overlap or interleaving of the data between the two sequences; video packets should be constructed so that they do not overlap. Decoders require some additional audio buffering (approximately 1 s) to handle the overlap in the time bases. As regards buffering, there are several situations to consider. The most restrictive is when full compliance with the MPEG-2 PS STD model is required. A more relaxed approach allows greater buffering (of double size) during the transition to the bridge sequence. To comply fully with the STD, the MPEG PS requires that data must not remain more than 1 s in the STD buffer. Therefore, 1 s after a jump it is known that the only data in the STD buffer comes from the new sequence.
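The time-base offset arithmetic can be checked numerically. The sketch below uses hypothetical 90 kHz clock-tick values and a 25 Hz frame rate (assumptions of mine, not figures from the text), and the sign convention STC' = STC - STC_delta is my reconstruction of the garbled original:

```python
TICKS = 90_000          # PTS/DTS/SCR ticks per second (assumed 90 kHz clock)
Tpp = TICKS // 25       # presentation period of one 25 Hz video frame

pts1_final = 450_000    # PTS of the last video unit of SEQ.1 (hypothetical)
pts2_initial = 90_000   # PTS of the first video unit of SEQ.2 (hypothetical)

# From STC_delta + PTS2_initial = PTS1_final + Tpp:
stc_delta = pts1_final - pts2_initial + Tpp

def stc_prime(stc):
    """Project the player clock (SEQ.1 time base) onto SEQ.2's time base."""
    return stc - stc_delta

# The first video unit of SEQ.2 is then presented exactly one frame period
# after the last unit of SEQ.1:
assert stc_prime(pts1_final + Tpp) == pts2_initial
```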
We can construct the delivery time (SCR) of the last bit of the last packet containing video data delivered from SEQ.1 by examining the SCR values of the SEQ.1 packets and the mux_rate: in the following this value is SCR1_end_video. Taking: SCR1_end_video as the value of the STC measured as the last bit of the last video packet of SEQ.1 is delivered to the STD: this can be calculated from the SCRs in the packet headers of SEQ.1 and the mux rate.
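As a sketch, SCR1_end_video can be derived from the SCR of the last video pack and the multiplex rate (the function name and units here are mine, for illustration only):

```python
def last_bit_time(pack_scr_s, pack_bits, mux_rate_bps):
    """Time at which the last bit of a pack enters the STD buffer: the pack's
    SCR gives the arrival time of its first bit (seconds here), and the
    remaining bits arrive at the multiplex rate."""
    return pack_scr_s + pack_bits / mux_rate_bps

# e.g. an 8 Mbit pack delivered at 8 Mbit/s starting at t = 10 s
# finishes entering the buffer at t = 11 s.
```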
SCR'2_start_video as the SCR value encoded in the first video packet of SEQ.2 (measured on the time base of SEQ.2); and SCR2_start_video as the value of SCR'2_start_video projected onto the time base of SEQ.1, which can be calculated as follows: SCR2_start_video = SCR'2_start_video + STC_delta. To satisfy the STD across the jump, two conditions are required. The first is that the delivery of the trailing audio in SEQ.1 (followed by the leading audio in SEQ.2) must be contained in the interval defined by SCR1_end_video and SCR2_start_video, as follows: SCR1_end_video + ΔT_A + T_B ≤ SCR2_start_video. Note that T_B has been added to the inequality as a result of allowing leading audio packets in SEQ.2. To satisfy this inequality, it may be necessary to re-encode and/or remultiplex part of one or both sequences. The second condition required is that the video delivery of SEQ.1 followed by the video of SEQ.2, as defined by the SCRs of SEQ.1 and the SCRs of SEQ.2 projected onto the same time base, does not produce an overflow of the video buffer. Turning now to the audio aspects, and starting with packet alignment, there is usually a substantial offset between the arrival times of concurrently encoded audio and video (approximately 100 ms or more on average). This means that, at the end of the reading of the last required video frame in sequence A, several further audio frames (and indeed video, unless it can be skipped) have to be read out of the multiplexed stream. Either the jump must be delayed and the video stalled, or, preferably, the audio should be remultiplexed into the bridge sequence. Referring to Figure 10, if the video pack V4 contains the end of the last video image before the jump, the audio packets A2, A3 and A4 may need to be extracted from sequence A and copied and remultiplexed into the bridge sequence. The same applies to the jump back from sequence C after the bridge.
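The first STD condition above can be expressed as a simple check (names are hypothetical; all times are taken on SEQ.1's time base, and the projection sign follows my reconstruction of the equations):

```python
def project_to_seq1(scr_prime, stc_delta):
    """Project an SCR coded on SEQ.2's time base onto SEQ.1's time base."""
    return scr_prime + stc_delta

def trailing_audio_fits(scr1_end_video, scr2_start_video, dt_a, t_b):
    """True if SEQ.1's trailing audio (delivered over dt_a) followed by
    SEQ.2's leading audio (delivered over t_b) fits in the interval between
    the end of SEQ.1's video delivery and the start of SEQ.2's."""
    return scr1_end_video + dt_a + t_b <= scr2_start_video
```

If the check fails, part of one or both sequences must be re-encoded or remultiplexed, as the text notes.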
The first audio frames read are likely to be earlier in presentation time than the video at the jump point, but the audio must be continuous across the jump. Having chosen the jump point for the video, it is necessary to ensure that the audio in the bridge sequence fits the audio in sequence C. In terms of audio gaps, owing to the difference in audio and video frame durations, there will be an interruption in the periodicity of the audio frames at the point where the seamless splice is made (in the video). This interruption is up to approximately one audio frame (24 ms) in length, and will occur near the video frame that marks the splice. Timing information in the play list would help the audio decoder handle this interruption. At the disk-allocation level, once the elementary stream and multiplexing requirements have been satisfied, it is also necessary to ensure that the bridge sequence is sufficiently long that it can be assigned to a contiguous range of addresses on the disk, and that the sections on either side of the bridge, in sequence A and sequence C, remain sufficiently long. An example of this is described in our commonly assigned European Patent Application No. 98200888.0, filed on 19 March 1998. The basic requirement is that, for a particular disk configuration, the bridge sequence is between 2 and 4 Mbytes in length and that the parts of the fragments on either side of the bridge remain larger than 2 Mbytes: this restriction is not, however, applicable in all cases. In the foregoing, we have described means for solving the problem of making frame-accurate edits of audio and/or video in MPEG-compliant and similar program streams where, owing to the temporal dependencies and the buffering models used in MPEG and similar coding and multiplexing techniques, simple cut-and-paste editing cannot be done at an arbitrary frame boundary.
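The up-to-24 ms figure follows from the mismatch between audio and video frame durations. A minimal sketch, assuming MPEG Layer II audio at 48 kHz (1152-sample, 24 ms frames) and 25 Hz video (40 ms frames); these parameter values are illustrative assumptions consistent with the 24 ms figure in the text:

```python
import math

AUDIO_FRAME = 1152      # samples per audio frame at 48 kHz (24 ms)
VIDEO_FRAME = 1920      # audio samples spanning one 25 Hz video frame (40 ms)

def audio_gap_ms(n_video_frames):
    """Gap between the video cut point and the end of the last whole audio
    frame covering it; always less than one audio frame (24 ms)."""
    cut = n_video_frames * VIDEO_FRAME                    # cut position, samples
    covered = math.ceil(cut / AUDIO_FRAME) * AUDIO_FRAME  # whole audio frames
    return (covered - cut) / 48                           # samples -> ms

# A splice after 7 video frames (280 ms) leaves an 8 ms audio gap, while
# 3 video frames (120 ms = exactly 5 audio frames) leave no gap at all.
```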
To facilitate editing, we generate bridge sequences, that is, short sequences of compliant coded data specially constructed (in the manner described above) to link two original recordings of MPEG or similar data. From reading the present disclosure, other variations will be apparent to those skilled in the art. Such variations may involve other features which are already known in methods and apparatus for editing audio and/or video signals and component parts thereof, and which may be used instead of, or in addition to, features already described herein.
Claims (23)
1. A data processing apparatus, characterized in that it comprises means operable to read frame-based data sequences from a storage device and to edit them, so as to link from a first edit point in a first frame sequence to a second edit point in a second sequence, wherein for each of the stored frame sequences a number of frames are intra-coded (hereinafter "I frames") without reference to any other frame of the sequence, a number are encoded (hereinafter "P frames") each with reference to one further frame of the sequence, and the remainder are encoded (hereinafter "B frames") each with reference to two or more further frames of the sequence; the apparatus including bridge-generating means configured to create a bridge sequence of frames linking the first and second edit points, by selective incorporation of frames from the first and second stored frame sequences and selective re-encoding of one or more of the frames within the bridge sequence, as determined by the coding type (I, P, B) of the frames of the first and second sequences indicated by the respective edit points.
2. The apparatus according to claim 1, characterized in that the frames of the sequences are video image frames, and the bridge-generating means are configured to construct the edited sequence with the jump from the first sequence to the bridge, and from the bridge to the second sequence, occurring at frame boundaries.
3. The apparatus according to claim 1, characterized in that the sequences comprise multiplexed arrangements of video image and audio data frames, and the bridge-generating means are arranged to present in the bridge sequence all contributing video frames of the first sequence before the contributing video frames of the second sequence.
4. The apparatus according to claim 3, characterized in that, in the bridge sequence, at the junction between the audio frames of the first and second sequences there is a gap of up to one audio frame in duration, and the bridge-generating means are arranged to insert a superimposed audio frame in this gap.
5. The apparatus according to claim 1, characterized in that the bridge-generating means are arranged to detect the respective time stamps in the first and second sequences and include means operable to derive a value specifying a discontinuity between the time stamps, to calculate an offset to be applied to the time stamps of the second sequence to remove the discontinuity, and to apply the offset to the second sequence.
6. The apparatus according to any of claims 1 to 5, characterized in that the bridge-generating means, on receiving the specification of a target length for the bridge sequence, are arranged to vary the number of frames extracted from the first and/or second sequences to satisfy the target length.
7. The apparatus according to claim 6, characterized in that the bridge-generating means are arranged to shift the first and/or second edit points so as to achieve the target length for the bridge sequence.
8. The apparatus according to claim 6, characterized in that the bridge-generating means are arranged to selectively remove frames from the first sequence before the first edit point and/or frames from the second sequence after the second edit point so as to achieve the target length for the bridge sequence.
9. The apparatus according to claim 1, characterized in that the storage device is writable, the apparatus further comprising a recording subsystem operable to write one or more frame-based data sequences to storage locations on or within the storage device.
10. A method for editing frame-based data sequences, characterized in that it comprises linking from a first edit point in a first frame sequence to a second edit point in a second sequence, wherein for each of the stored frame sequences a number of frames are intra-coded (hereinafter "I frames") without reference to any other frame of the sequence, a number are encoded (hereinafter "P frames") each with reference to one further frame of the sequence, and the remainder are encoded (hereinafter "B frames") each with reference to two or more further frames of the sequence; the method comprising creating a bridge sequence of frames linking the first and second edit points, by selective incorporation of frames from the first and second stored frame sequences and selective re-encoding of one or more of the frames within the bridge sequence, as determined by the coding type (I, P, B) of the frames of the first and second sequences indicated by the respective edit points.
11. The method according to claim 10, characterized in that the frames of the sequences are video image frames, and the edited sequence is constructed with the jump from the first sequence to the bridge, and from the bridge to the second sequence, occurring at frame boundaries.
12. The method according to claim 10, characterized in that the sequences comprise multiplexed arrangements of video image and audio data frames, with all contributing video frames of the first sequence being presented in the bridge sequence before the contributing video frames of the second sequence.
13. The method according to claim 12, characterized in that, in the bridge sequence, at the junction between the audio frames of the first and second sequences, there is a gap of up to one audio frame in duration, which gap is filled by the insertion of a superimposed audio frame.
14. The method according to claim 10, characterized in that it includes the steps of detecting the respective time stamps in the first and second sequences, deriving a value specifying a discontinuity between the time stamps, calculating an offset to be applied to the time stamps of the second sequence to remove such discontinuity, and applying such offset to the second sequence.
15. The method according to any of claims 10 to 14, characterized in that the storage device is an optical disk and the location of the data sequences thereon is indicated by a table of contents held on the disk.
16. The method according to claim 10, characterized in that the frame indicated by the first edit point is a B frame and the jump to the first frame of the bridge sequence is made at the closest preceding P frame in the presentation order of the first sequence.
17. The method according to claim 16, characterized in that the first frames of the bridge sequence after the jump comprise a reference frame extracted from the second sequence, followed by those B frames of the first sequence up to the edit point, which B frames have been re-encoded with reference to the reference frame.
18. The method according to claim 10, characterized in that the frame indicated by the first edit point is an I frame or a P frame and the jump to the first frame of the bridge sequence is made following the closest preceding B frame in the presentation order of the first sequence.
19. The method according to claim 10, characterized in that the frame indicated by the second edit point is a B frame and the frames of the bridge sequence preceding the jump to the second sequence comprise those frames of the second sequence from the closest P frame preceding the indicated frame in bitstream order, together with any intervening B frames.
20. The method according to claim 10, characterized in that the frame indicated by the second edit point is a P frame and the frame of the bridge sequence preceding the jump to the second sequence comprises the indicated P frame.
21. The method according to claim 19 or claim 20, characterized in that the content of a P frame included before a jump from the bridge sequence to the second sequence is re-encoded in the bridge sequence as an I frame.
22. The method according to claim 10, characterized in that the frame indicated by the second edit point is an I frame and the frame of the bridge sequence preceding the jump to the second sequence comprises the indicated I frame.
23. A storage device containing a plurality of frame sequences together with one or more bridge sequences linking respective pairs of sequences at specified edit points, and a table of contents identifying the respective storage address of each frame sequence and bridge sequence, wherein each bridge sequence has been generated by the method according to any of claims 10 to 22.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP9813831.6 | 1998-06-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA00001917A true MXPA00001917A (en) | 2001-03-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6584273B1 (en) | Frame-accurate editing of encoded A/V sequences | |
JP4837868B2 (en) | Method and apparatus for editing digital video recordings, and recordings produced by such methods | |
JP4328989B2 (en) | REPRODUCTION DEVICE, REPRODUCTION METHOD, AND RECORDING MEDIUM | |
US8528027B2 (en) | Method and apparatus for synchronizing data streams containing audio, video and/or other data | |
JP3778985B2 (en) | Information recording medium, recording apparatus, recording method, reproducing apparatus, and reproducing method | |
EP0784848B1 (en) | Information carrier, device for reading and device for providing the information carrier and method of transmitting picture information | |
KR100606150B1 (en) | Recording apparatus and method, reproducing apparatus and method, recording / reproducing apparatus and method, and recording medium | |
KR100996369B1 (en) | Changing a playback speed for a video presentation recorded in a progressive frame structure format | |
EP1995731B1 (en) | Method to guarantee seamless playback of data streams | |
KR19980081054A (en) | Coding apparatus and method, decoding apparatus and method, editing method | |
JP4781600B2 (en) | Information processing apparatus and method, program, and recording medium | |
JPH08340507A (en) | Data recording medium provided with reproduction timing information and system reproducing recording data by using the reproduction timing information | |
MXPA00001917A (en) | Frame-accurate editing of encoded a/v sequences | |
JP2004120099A (en) | Information processing apparatus and method, program, and recording medium | |
KR100335413B1 (en) | Disc storing edit control information apparatus and method for seamless playback after editing | |
JP4135109B2 (en) | Recording apparatus, recording method, and recording medium | |
JP4114137B2 (en) | Information processing apparatus and method, recording medium, and program | |
Kelly et al. | Virtual editing of MPEG-2 streams | |
JP4044113B2 (en) | Information recording apparatus, information recording method, information reproducing apparatus, and information reproducing method | |
JP2006229767A (en) | Recording method and recording and reproducing method of multiplexed streams | |
JP2005302100A (en) | Information reproducing device, information reproducing method, and information reproducing program |