MXPA00012615A - Video encoder and encoding method with buffer control - Google Patents

Video encoder and encoding method with buffer control

Info

Publication number
MXPA00012615A
Authority
MX
Mexico
Prior art keywords
vop
further characterized
regulator
vol
decoder
Application number
MXPA/A/2000/012615A
Other languages
Spanish (es)
Inventor
Xuemin Chen
Robert O Eifrig
Original Assignee
General Instrument Corporation
Motorola Inc
Application filed by General Instrument Corporation, Motorola Inc
Publication of MXPA00012615A

Abstract

A technique is provided for enabling data, such as video, to be broadcast using a push dataflow scenario without causing a data rate buffer (32) for the pushed data at a decoder (30) to overflow or underflow. At an encoder (20), data are encoded for communication to the decoder (30) to provide an output bitstream. The data rate buffer (32) of the decoder is simulated at the encoder (22). The simulation is used to control the output bitstream to preclude overflow or underflow of the decoder buffer (32). For example, a complementary encoder buffer (22), which operates in a manner opposite to the decoder buffer (32), can be monitored and inverted to provide the simulation. Various different techniques are disclosed for controlling the amount of data produced at the encoder to maintain the data within the confines of the decoder buffer (32).

Description

VIDEO ENCODER AND ENCODING METHOD WITH BUFFER CONTROL
Field of the Invention.
The present invention relates to the buffering of video data, and more particularly to the buffering of video data delivered to a viewer using a push dataflow scenario. Push dataflow is a technique in which data, such as video, text and/or graphic information, is transmitted to a viewer without interaction (except, possibly, for the advance setup of an information profile by the recipient).
Background of the Invention.
In a push dataflow communication system, a video rate buffer model is required in order to bound the memory requirements of the video decoder. With a rate buffer model, the video encoder can be constrained to produce bitstreams that are decodable with a predetermined buffer size at the decoder. The MPEG-4 Visual Final Committee Draft (FCD), document No. N2202, published by the Moving Picture Experts Group (MPEG) and incorporated herein by reference, does not normatively specify a video rate buffer model relating the access unit size (for example, the size of a coded "video object plane" (VOP)), the decoding time, and the bit rate of a video data stream to the size of the buffer that holds the data. Annex D of the FCD, concerning the video buffering verifier (VBV), is an empty placeholder for this information. The MPEG-4 Systems FCD (N2201), also incorporated herein by reference, defines a buffer model; however, a normative definition of the relevant fields is not provided in a manner consistent with video. It would be advantageous to provide a buffer model, compatible with the above-mentioned Annex D, that explicitly states the relationship between the syntax in the Visual FCD and the Systems FCD. Such a buffer model should be simple to implement using appropriate rate buffer management techniques, in order to bound the memory requirements of a video decoder and thereby enable the successful delivery of push dataflow streams. The present invention provides a video rate buffer model and associated management techniques having the aforementioned and other advantages.
Summary of the Invention.
In accordance with the present invention, a method and apparatus are provided for enabling data, such as video, to be transmitted using a push dataflow scenario without causing a data rate buffer for the pushed data at a decoder to overflow or underflow. At an encoder, data are encoded for communication to the decoder to provide an output bitstream. The data rate buffer of the decoder is simulated at the encoder. The simulation is used to control the output bitstream to preclude overflow or underflow of the decoder buffer. For example, a complementary encoder buffer, which operates in a manner opposite to the decoder buffer, can be monitored and inverted to provide the simulation. Various techniques are disclosed for controlling the amount of data produced at the encoder to keep the data within the confines of the decoder buffer. These include reducing the quantization level to generate larger VOPs, or outputting stuffing bits at the end of a VOP, when the simulation at the encoder indicates that the decoder buffer is or will become too full. When the simulation indicates that the decoder buffer is or will become too empty, the encoder can increase the quantization level to generate fewer bits, delay the generation of the next VOP, or zero the high-frequency coefficients to reduce the number of bits generated per VOP.
Brief Description of the Drawings.
Figure 1 is a graph plotting the buffer occupancy (b(t) < B) against the decoding time (t); and Figure 2 is a block diagram illustrating an encoder and decoder apparatus in accordance with the present invention.
Detailed Description of the Invention.
The MPEG-4 video buffering verifier (VBV) is an algorithm for checking a bitstream, together with its delivery rate function R(t), to verify that the amount of rate buffer memory required in a push dataflow scenario is less than the stated buffer size. If a visual bitstream consists of multiple Video Objects (VOs), each with one or more Video Object Layers (VOLs), the rate buffer model is applied independently to each VOL (using the buffer size and rate functions particular to that VOL). The present invention applies a buffer verification technique to natural video coded as a combination of I-, P- and B-VOPs, and can be extended to cover the full visual syntax, including sprites and synthetic video objects. In accordance with the present invention, the video encoder controls its output bitstream to comply with the requirements of a video buffering verifier (VBV). The VBV is defined as follows:
1. The size of the VBV buffer is specified in units of 16384 bits by the vbv_buffer_size field provided in the VOL header. A vbv_buffer_size of 0 is forbidden. "B" is defined as 16384 x vbv_buffer_size and is the buffer size in bits. Note that in the exemplary embodiment described herein, the maximum size of the VBV buffer is 4 Gbits. The value of vbv_buffer_size can be changed only after a visual_object_sequence_end_code.
2. The instantaneous video object layer channel bit rate seen by the encoder is denoted by $R_{vol}(t)$, in bits per second. If the bit_rate field is present in the VOL header, it defines a peak rate (in units of 400 bits per second; a value of 0 is forbidden) such that $R_{vol}(t) \le 400 \times \text{bit\_rate}$. Note that $R_{vol}(t)$ counts only the visual syntax for the current VOL (see the definition of $d_i$ below). If the channel is a serial time multiplex containing other VOLs, or as defined by ISO/IEC 14496-1, with a total instantaneous channel rate seen by the encoder of $R(t)$, then
$$R_{vol}(t) = \begin{cases} R(t) & \text{if } t \in \{\text{channel bit duration of a bit from VOL } vol\} \\ 0 & \text{otherwise} \end{cases}$$
3. The VBV buffer is initially empty. After finding the first VOL header, the vbv_occupancy field is examined to determine the initial occupancy of the VBV buffer, in units of 64 bits, before decoding the initial VOP immediately following the VOL header. The first bit in the VBV buffer is the first bit of the VOP (as defined in paragraph 4 below) that includes the VOL header containing the vbv_occupancy field. The difference between later vbv_occupancy fields in subsequent VOL headers and the running cumulative buffer occupancy ($b_i + d_i$, as defined below) just before the removal of the VOP containing the VOL header shall be less than 64 bits.
4. Define $d_i$ as the size in bits of the i-th VOP, where i is the VOP index, which is incremented by 1 in decoding order.
The parameter $d_i$ is illustrated in Figure 1, which plots the buffer occupancy (b(t) < B) against the decoding time (t). More precisely, $d_i$ is the number of bits of visual syntax either (1) from the last bit of the previous video object, still texture object, mesh object or face object, exclusive (and excluding any stuffing code words immediately following this bit), or (2) from the first bit of the visual_object_sequence_start_code, inclusive (in the case of the first VOP of a visual bitstream), to the last bit of the current VOP, inclusive (including any stuffing code words at the end of the VOP), including the video object headers, video object layer headers, and group of VOP headers which precede the VOP itself. Note that the size of a coded VOP ($d_i$) is always a multiple of 8 bits due to start code alignment.
5. Let $t_i$ (Figure 1) be the decoding time associated with VOP i in decoding order. All bits ($d_i$) of VOP i are removed from the rate buffer instantaneously at $t_i$. This instantaneous removal property distinguishes the VBV buffer model from a real rate buffer.
6. $\tau_i$ is the composition time (or presentation time in a no-compositor decoder) of VOP i. For a video object plane, $\tau_i$ is defined by vop_time_increment (in units of 1/vop_time_increment_resolution of a second) plus the cumulative number of whole seconds specified by modulo_time_base. In the case of interlaced video, a VOP consists of lines from two fields, and $\tau_i$ is the composition time of the first field. The relation between the composition time and the decoding time of a VOP is given by
$$t_i = \tau_i - \big(((\text{vop\_coding\_type} == \text{B-VOP}) \;||\; \text{low\_delay}) \;?\; 0 : m_i\big)$$
where low_delay is true (1) if the VOL contains no B-VOPs.
If B-VOPs are present, then the composition of an anchor VOP is delayed until all immediately subsequent B-VOPs have been composed. This delay period is $m_i = \tau_f - \tau_p$, where f is the index of the nearest future anchor VOP of VOP i, while p is the index of the current (or nearest past) anchor VOP of VOP i.
The following example demonstrates how $m_i$ is determined for a sequence with variable numbers of B-VOPs:
Decoding order: I0 P1 P2 P3 B4 P5 B6 P7 B8 B9 P10 B11 B12
Presentation order: I0 P1 P2 B4 P3 B6 P5 B8 B9 P7 B11 B12 P10
In this example, assume that vop_time_increment = 1 and modulo_time_base = 0. The subscript i is in decoding order.
i     τ_i    t_i            m_i
0     0      0 - 1 = -1     1
1     1      1 - 1 = 0      1
2     2      2 - 1 = 1      1
3     4      4 - 2 = 2      2
4     3      3              2
5     6      6 - 2 = 4      2
6     5      5              2
7     9      9 - 3 = 6      3
8     7      7              3
9     8      8              3
10    12     12 - 3 = 9     3
11    10     10             3
12    11     11             3
7. Define $b_i$ as the buffer occupancy in bits immediately following the removal of VOP i from the rate buffer. The parameter $b_i$ is illustrated in Figure 1. Using the above definitions, $b_i$ can be defined iteratively:
$$b_0 = 64 \times \text{vbv\_occupancy} - d_0$$
$$b_{i+1} = b_i + \int_{t_i}^{t_{i+1}} R_{vol}(t)\,dt - d_{i+1} \quad \text{for } i \ge 0$$
8. The rate buffer model requires that the VBV buffer never overflow or underflow, that is:
$$0 < b_i \quad \text{and} \quad b_i + d_i < B \quad \text{for all } i$$
Real-valued arithmetic is used to compute $b_i$ so that errors are not accumulated. A coded VOP size must always be less than the VBV buffer size, that is, $d_i < B$ for all i.
It is a requirement on the encoder to produce a bitstream that does not overflow or underflow the decoder VBV buffer. This means that the encoder must know $R_{vol,decoder}(t)$, the instantaneous channel bit rate seen by the decoder. A channel has constant delay if the encoder bit rate at the time t when a particular bit enters the channel, $R_{vol,encoder}(t)$, is equal to $R_{vol,decoder}(t + L)$, where the bit is received at (t + L) and L is constant. In the case of constant delay channels, the encoder can use its locally estimated $R_{vol,encoder}(t)$ to simulate the VBV occupancy and control the number of bits per VOP, $d_i$, in order to prevent overflow or underflow. The VBV model assumes a constant delay channel. This allows the encoder to produce a VOL bitstream that does not overflow or underflow the buffer using $R_{vol,encoder}(t)$; note that $R_{vol}(t)$ is defined as $R_{vol,encoder}(t)$ in paragraph 2 above.
Figure 2 illustrates the encoder and decoder in simplified block diagram form. The data to be encoded are input to the encoder processor 20, which is coupled to an encoder data rate buffer 22.
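By way of illustration only, the VBV definition in paragraphs 1 to 8 above can be sketched in Python as follows. The sketch assumes a constant channel rate equal to the peak rate 400 x bit_rate, so that the integral of the rate reduces to a product; this assumption, and the names Vop, decoding_time and vbv_occupancies, are hypothetical and are not part of the MPEG-4 syntax. The strict inequalities follow the constraints as stated in paragraph 8.

```python
from dataclasses import dataclass


@dataclass
class Vop:
    size_bits: int      # d_i: coded size of VOP i, including preceding headers
    decode_time: float  # t_i: decoding time in seconds


def decoding_time(tau, m, is_b_vop, low_delay):
    """Paragraph 6: t_i = tau_i - ((B-VOP or low_delay) ? 0 : m_i)."""
    return tau if (is_b_vop or low_delay) else tau - m


def vbv_occupancies(vops, vbv_buffer_size, vbv_occupancy, bit_rate):
    """Return [b_0, b_1, ...] or raise ValueError on overflow or underflow.

    Assumption (not from the text): the channel runs at a constant rate
    of 400 * bit_rate bits per second.
    """
    buffer_bits = 16384 * vbv_buffer_size  # B, paragraph 1
    rate = 400 * bit_rate                  # R_vol, assumed constant
    result = []
    occupancy = 0.0
    previous_time = None
    for i, vop in enumerate(vops):
        if vop.size_bits >= buffer_bits:
            raise ValueError(f"VOP {i}: d_i must be smaller than B")
        if i == 0:
            # Initial fullness just before removing the first VOP (paragraph 3).
            before_removal = 64 * vbv_occupancy
        else:
            # b_(i-1) plus the bits delivered since the previous removal.
            before_removal = occupancy + rate * (vop.decode_time - previous_time)
        if before_removal >= buffer_bits:           # requires b_i + d_i < B
            raise ValueError(f"VOP {i}: VBV overflow")
        occupancy = before_removal - vop.size_bits  # b_i, paragraph 7
        if occupancy <= 0:                          # requires 0 < b_i
            raise ValueError(f"VOP {i}: VBV underflow")
        result.append(occupancy)
        previous_time = vop.decode_time
    return result


# Composition times, delays and VOP types of the thirteen-VOP example.
TAU = [0, 1, 2, 4, 3, 6, 5, 9, 7, 8, 12, 10, 11]
M = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3]
TYPES = "IPPPBPBPBBPBB"
DECODE_TIMES = [decoding_time(t, m, k == "B", low_delay=False)
                for t, m, k in zip(TAU, M, TYPES)]
```

With these inputs, DECODE_TIMES reproduces the decoding-time column of the example, running from -1 to 11 in steps of one. The delays m_i are supplied as tabulated rather than derived, since the sketch is only concerned with the relation between composition time and decoding time.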
The buffer 22 operates in a manner complementary to a data rate buffer 32 of the decoder 30. By monitoring the data rate buffer 22, the encoder processor 20 is able to simulate the decoder data rate buffer 32. The encoder processor provides an output bitstream to a transmitter 24, which transmits the bitstream over a communication channel 26 to a receiver 28. The receiver provides the received bitstream to the decoder 30 in a conventional manner. The decoder 30 decodes the bitstream to provide the desired data output.
The following describes how real-time video can be handled in a non-constant delay network environment. This procedure is strictly a hypothetical model; it is not a requirement or recommendation of how MPEG-4 bitstreams should interface with non-constant delay channels. Suppose the channel does not have a constant delay, in that:
1. unknown, variable, packet-by-packet queuing delays are present at network interfaces and intermediate nodes (e.g., switches or routers as used by ATM or IP networks);
2. the information is delivered in time-stamped packets; and
3. there is a bound on the difference between the minimum and maximum channel latency of a packet (as determined, for example, through a quality of service negotiation).
A constant delay channel can then be approximated using a de-jittering buffer ahead of the decoder. The de-jittering buffer holds each variable-latency packet until the maximum channel latency has elapsed (the hold duration is based on the time stamp of the packet) before the packet is released to the decoder. The resulting channel then has a constant delay equal to the maximum channel latency.
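The de-jittering buffer described above might, purely as an example, be sketched as follows. The class name DejitterBuffer, its methods, and the use of integer millisecond time stamps are all assumptions made for illustration; the text does not prescribe any particular implementation.

```python
import heapq


class DejitterBuffer:
    """Hold each packet until its send time stamp plus the maximum latency."""

    def __init__(self, max_latency):
        self.max_latency = max_latency  # bound on channel latency (assumed known)
        self._heap = []                 # entries: (release_time, sequence, payload)
        self._sequence = 0              # tie-breaker so payloads are never compared

    def push(self, send_timestamp, payload):
        """Store a received packet; the release time depends only on its time stamp."""
        release_time = send_timestamp + self.max_latency
        heapq.heappush(self._heap, (release_time, self._sequence, payload))
        self._sequence += 1

    def pop_ready(self, now):
        """Return (release_time, payload) for every packet due at or before now."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            release_time, _, payload = heapq.heappop(self._heap)
            ready.append((release_time, payload))
        return ready
```

Because the release time is computed from the time stamp of the packet and not from its arrival time, every packet sees the same end-to-end delay, equal to max_latency, regardless of how long it actually spent in the network.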
Syntax Modifications: The present invention modifies the syntax of the MPEG-4 standard by adding a vbv_occupancy field (26 bits) to the VOL header. The value of this integer is the VBV occupancy, in units of 64 bits, just before the removal of the first VOP following the VOL header. The purpose of this quantity is to provide the initial condition for the VBV buffer fullness.
To avoid duplication of information between MPEG-4 Systems (ISO/IEC 14496-1) and MPEG-4 Visual (ISO/IEC 14496-2), and to allow a visual elementary stream to specify a buffer model as a stand-alone entity, a vbv_parameters flag is added to control the inclusion of VOP_rate_code, bit_rate, low_delay, vbv_size, and vbv_occupancy in the VOL header. The value of vbv_parameters must be "1" for a push dataflow visual bitstream when the equivalent information is not present in an encapsulating systems multiplex. The vol_control_parameters bit remains in the syntax to control the inclusion of the chroma_format and aspect_ratio_info fields in the VOL header. The FCD VOL syntax has potential start code emulation problems when bit_rate and vbv_size are present (runs of 23 or more consecutive 0 bits can occur). Marker bits (which always have the value "1") have been added to avoid this problem. The field splits made by the marker bits are defined as:
bit_rate = (bit_rate_msbs << 12) | bit_rate_lsbs;
vbv_size = (vbv_size_msbs << 10) | vbv_size_lsbs;
vbv_occupancy = (vbv_occupancy_msbs << 15) | vbv_occupancy_lsbs;
The resulting syntax is shown in Table 1:
TABLE 1
Notes for Table 1:
1. The coding of aspect_ratio_info and VOP_rate_code is not defined in the MPEG-4 Visual FCD.
2. In order to use the coded bitstream with a push dataflow model, it is a normative requirement that vbv_parameters be set to "1", or that the equivalent information, as defined in paragraph 4 below, be included in the systems layer.
3. If VOP_rate_code is provided, then the difference between the composition times specified by VOP_time_increment and the cumulative modulo_time_base must be an exact integer multiple of the frame period associated with VOP_rate_code. In this case, the width of VOP_time_increment_resolution must be increased by one bit in order to represent 59.94 Hz exactly (that is, 60000/1001 Hz).
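As an illustrative sketch only, the field splits around the marker bits can be written as a pair of helper functions. The least-significant part widths (12, 10 and 15 bits) are taken from the shift amounts in the three expressions above; the function names and the dictionary are hypothetical, and the widths of the most-significant parts are deliberately left open because they are given in Table 1 rather than in the text.

```python
# Width in bits of the least-significant part of each split field,
# taken from the shift amounts in the field-split expressions.
LSB_WIDTH = {"bit_rate": 12, "vbv_size": 10, "vbv_occupancy": 15}


def split_field(name, value):
    """Split a field value into (msbs, lsbs) on either side of the marker bit."""
    width = LSB_WIDTH[name]
    return value >> width, value & ((1 << width) - 1)


def join_field(name, msbs, lsbs):
    """Reassemble a field: value = (msbs << width) | lsbs."""
    return (msbs << LSB_WIDTH[name]) | lsbs
```

For instance, a bit_rate value of 5000 splits into a most-significant part of 1 and a least-significant part of 904, and joining the two parts returns 5000. The marker bit itself is not modeled here; in the bitstream it would sit between the two parts so that a long run of zero bits cannot form.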
Relationship to MPEG-4 Systems: The following defines the relationship between the terminology, semantics and syntax of the MPEG-4 Systems (ISO/IEC 14496-1) elementary stream interface and the visual decoder (or encoder), such that the System Decoder Model (SDM) is consistent with the video buffering verifier. In this case, the visual VBV buffer and the SDM decoder buffer (DB) have identical semantics. These buffers are one and the same in an integrated visual/systems decoder model.
1. A natural video access unit is a coded VOP. The size ($d_i$) and precise composition of a coded VOP were defined above with reference to Figure 1.
2. The object time base (OTB) used to determine the object clock reference (OCR), decoding time stamp (DTS), and composition time stamp (CTS) is the same time base used to determine VOP_time_increment and modulo_time_base. The timeStampResolution and OCRResolution of the Sync Layer must be integer multiples of VOP_time_increment_resolution, so that no temporal precision is lost and all temporal calculations are exact in integer arithmetic.
3. The composition time stamp is equal to $\tau_i$ plus a constant (K). That is:
$$CTS_i = n_i \times \text{timeStampResolution} + \frac{\text{timeStampResolution} \times \text{VOP\_time\_increment}_i}{\text{VOP\_time\_increment\_resolution}} + K$$
where $n_i$ is the accumulation of the modulo_time_base values since the initial VOL header.
4. The decoding time stamp is determined from CTS in the same way that $t_i$ is calculated from $\tau_i$, that is:
$$DTS_i = CTS_i - \big(((\text{vop\_coding\_type} == \text{B-VOP}) \;||\; \text{low\_delay}) \;?\; 0 : m_i\big)$$
This equation specifies that decoding is instantaneous and that the only difference between DTS and CTS reflects the reordering of anchor VOPs. Note that DTS is present only in anchor VOPs when low_delay is 0 (that is, when the conditional expression above selects $m_i$).
5. The relationship between the value of the encoder's local object time base and DTS is defined here. Let the first bit of the access unit containing a VOL header be stored in the VBV buffer (or SDM DB) at time $OCR_i$, with vbv_occupancy specified in the same VOL header; then vbv_occupancy = [expression not legible]. It is an implicit requirement on OCRResolution and timeStampResolution that the calculation of vbv_occupancy be exact to the nearest integer.
6. If the Sync Layer (SL) randomAccessPointFlag is set to "1", it denotes the presence of a VOL header in the access unit starting in this SL packet.
7. The DecoderConfigDescriptor value for bufferSizeDB is equal to 2048 x vbv_buffer_size. The maxBitrate field must be 400 x bit_rate.
The correspondence between the VOL header information controlled by the vbv_parameters bit and the various syntax entities specified in the MPEG-4 Systems layer is given below. In the event that information is duplicated by the two parts of the MPEG-4 standard, no disagreement is allowed.
1. VOP_rate_code is represented by compositionUnitDuration and timeScale of the Sync Layer. In this case, timeScale must be an integer multiple of VOP_time_increment_resolution.
2. bit_rate and vbv_size are indicated by maxBitrate and bufferSizeDB in the DecoderConfigDescriptor.
3. vbv_occupancy is indicated indirectly by the difference between OCR and DTS, as defined above.
4. low_delay is implicitly specified by the DTS of the first I-VOP. If DTS is present (and is not equal to PTS), then low_delay = 0; otherwise low_delay = 1.
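The mapping between the visual fields and the Systems layer described in this section can be illustrated with the following sketch. It assumes integer time stamp ticks and a timeStampResolution that is an exact multiple of VOP_time_increment_resolution, as the text requires; the function names and the dictionary representation of the descriptor are hypothetical and do not correspond to any real library API.

```python
def cts_ticks(n_seconds, vop_time_increment, vop_time_increment_resolution,
              time_stamp_resolution, k=0):
    """CTS = n * timeStampResolution
           + timeStampResolution * VOP_time_increment / VOP_time_increment_resolution
           + K, computed exactly in integer arithmetic."""
    if time_stamp_resolution % vop_time_increment_resolution != 0:
        raise ValueError("timeStampResolution must be an integer multiple of "
                         "VOP_time_increment_resolution")
    scale = time_stamp_resolution // vop_time_increment_resolution
    return n_seconds * time_stamp_resolution + scale * vop_time_increment + k


def dts_ticks(cts, m_ticks, is_b_vop, low_delay):
    """DTS = CTS - ((B-VOP or low_delay) ? 0 : m), with m in time stamp ticks."""
    return cts if (is_b_vop or low_delay) else cts - m_ticks


def decoder_config_values(vbv_buffer_size, bit_rate):
    """Systems-layer values implied by the VOL header fields (item 7)."""
    return {
        "bufferSizeDB": 2048 * vbv_buffer_size,  # 16384 bits is 2048 bytes
        "maxBitrate": 400 * bit_rate,            # bits per second
    }
```

For example, with a timeStampResolution of 90000, a VOP_time_increment_resolution of 30000 and a VOP_time_increment of 1001 in the third second (n = 2), cts_ticks returns 2 x 90000 + 3 x 1001 = 183003, with no rounding involved.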
Comparison between the MPEG-4 VBV and the MPEG-2 VBV Model: The MPEG-2 and MPEG-4 VBV models both specify that the rate buffer may not overflow or underflow, and that coded pictures (VOPs) are removed from the buffer instantaneously. In both models, a coded picture/VOP is defined to include all higher-level syntax immediately preceding the picture/VOP. MPEG-2 video has a constant frame period (although the bitstream can contain both frame and field pictures, and frame pictures can use explicit 3:2 pulldown via the repeat_first_field flag). In MPEG-4 terms, this frame rate would be the output of the compositor (the MPEG-2 terminology is the output of the display process, which is not defined normatively by MPEG-2). This output frame rate, together with the MPEG-2 picture_structure and repeat_first_field flags, precisely defines the time intervals between consecutive decoded pictures (either frames or fields) passed between the decoding process and the display process. In general, an MPEG-2 bitstream contains B pictures (we assume MPEG-2 low_delay = 0; see the next section for the case where low_delay = 1). This means that the coding order and the display order of the pictures differ (since both reference pictures used by a B picture must precede the B picture in coding order). The MPEG-2 VBV (and the MPEG-2 Systems T-STD) specifies that a B picture is decoded and presented (instantaneously) at the same time, and the anchor pictures are reordered to make this possible. This is the same reordering model specified in the definition of the composition time $\tau_i$. A hypothetical MPEG-4 decoder using the proposed MPEG-4 VBV buffer model exactly emulates a hypothetical MPEG-2 decoder using the MPEG-2 VBV buffer model if the MPEG-4 VOP time stamps given by vop_time_increment and the cumulative modulo_time_base agree with the sequence of MPEG-2 picture presentation times.
Here we assume that both coded VOPs/pictures use the common subset of both standards (frame-structured pictures and no 3:2 pulldown in the decoder, that is, repeat_first_field = 0). For example, if the MPEG-2 sequence is coded at 29.97 Hz (the NTSC picture rate), vop_time_increment_resolution must be 30000 and the change in vop_time_increment between consecutive VOPs in presentation order must be 1001, because picture skipping is not permitted in MPEG-2 (when low_delay = 0).
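The arithmetic behind the 29.97 Hz example can be checked with a short sketch. The function names are hypothetical, and the composition times are assumed to be given as cumulative tick counts in presentation order.

```python
from fractions import Fraction


def vop_rate_hz(vop_time_increment_resolution, delta_vop_time_increment):
    """Exact VOP rate implied by the time base and the per-VOP increment."""
    return Fraction(vop_time_increment_resolution, delta_vop_time_increment)


def has_constant_period(composition_ticks, delta):
    """True if consecutive composition times always differ by exactly delta ticks."""
    return all(later - earlier == delta
               for earlier, later in zip(composition_ticks, composition_ticks[1:]))
```

With a resolution of 30000 and an increment of 1001, vop_rate_hz yields the exact ratio 30000/1001, which rounds to 29.97. A sequence of composition times such as 0, 1001, 2002, 3003 satisfies has_constant_period, whereas a sequence with a skipped picture does not.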
H.263-like Buffer Model: In H.263, there are no B-VOPs and no reordering of composition units between decoding and presentation. The H.263 Hypothetical Reference Decoder (HRD) can be made equivalent to the MPEG-4 VBV. In the H.263-like buffer model, the VBV buffer size vbv_buffer_size is computed as vbv_buffer_size = A + BPPmaxKb x 1024 bits, where (BPPmaxKb x 1024) is the maximum number of bits per picture that has been negotiated for use in the bitstream, and A = 4 x Rmax / P, where Rmax is the maximum video bit rate during the connection in bits per second, and the picture frequency, P, is 29.97 Hz as specified for the Common Intermediate Format (CIF), which corresponds to vop_time_increment_resolution = 30000 and Δvop_time_increment = 1001. Initially, the VBV is empty. The VBV is examined at CIF intervals. If at least one complete coded VOP is in the buffer, then all data for the earliest VOP in bitstream (or decoding) order are removed instantaneously. Immediately after removing the above data, the buffer occupancy must be less than A. In this case, the number of bits $d_{i+1}$ of the (i+1)-th coded picture must satisfy:
$$d_{i+1} \ge b_i + \int_{t_i}^{t_{i+1}} R(t)\,dt - A$$
Real-valued arithmetic is used in this inequality. Here $b_i$ is the buffer occupancy just after the i-th coded picture has been removed from the buffer; $t_i$ is the time at which the i-th coded picture is removed from the VBV buffer; and R(t) is the video bit rate at time t.
The important difference between the MPEG-4 VBV model and the H.263 model is that the encoder specifies the composition time $\tau_i$ for each VOP in the bitstream, which again means that the encoder must know R(t), the instantaneous bit rate as seen by the decoder, and A. Again, this assumption is valid if a constant delay channel is assumed.
When low_delay = 1, the MPEG-2 VBV model has several similarities with the HRD.
The first is that B pictures are not used, so the decoding order and the presentation order are the same. The second is that there is a specific picture period (though not necessarily 29.97 Hz) used to examine the buffer. If the next picture to be decoded has not been completely received by the next picture period (such a picture is called a "big picture"), the buffer is re-examined at multiples of the picture period until the coded picture resides completely in the buffer. The big picture is then decoded and displayed instantaneously. The previous picture remains at the decoder output during the picture periods in which the big picture has not yet been completely received. The encoder is still responsible for preventing overflow and underflow, and the difference between the MPEG-2 temporal_reference fields of the big picture and its predecessor is the display duration of the picture preceding the big picture (in frame periods).
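The H.263-like buffer sizing and picture-size constraint described in this section can be sketched as follows. The function names are hypothetical, the bits received between two removal times are passed in directly rather than integrated from R(t), and the sample values are arbitrary; this is an illustration of the two formulas, not an HRD implementation.

```python
def hrd_buffer_bits(r_max, bpp_max_kb, picture_rate=30000 / 1001):
    """Return (A, A + BPPmaxKb * 1024), with A = 4 * Rmax / P.

    picture_rate defaults to the CIF picture frequency of about 29.97 Hz.
    """
    a = 4 * r_max / picture_rate
    return a, a + bpp_max_kb * 1024


def min_next_picture_bits(b_i, bits_received, a):
    """Smallest d_(i+1) satisfying d_(i+1) >= b_i + (bits received) - A.

    b_i is the occupancy after removing picture i; a picture size cannot be
    negative, so the bound is clamped at zero.
    """
    return max(0, b_i + bits_received - a)
```

For instance, with Rmax = 60000 bits per second and a picture rate of 30 Hz chosen for round numbers, A is 8000 bits; if 7000 bits remain after a removal and 3000 more arrive before the next examination, the next picture must be at least 2000 bits so that the occupancy falls back to A.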
Extensions for the Full Visual Syntax: Covering the full visual syntax requires extension to sprites, still texture objects, mesh objects and face objects. In this case, VBV stands for visual buffering verifier.
Sprites: The basic sprite, the low-latency sprite, and the scalable sprite are specified in MPEG-4. There is no conceptual problem with generating bitstreams from sprites. However, a large vbv_size can be used to take advantage of the larger sprite memory in the decoder.
Still Texture Objects: A still texture object is a single access unit; however, it cannot be composed directly. Still texture objects are used as input to a further decoder (for example, the mesh decoder). Still texture objects have no buffer or time stamp parameters to control this visual object in a push dataflow scenario. The syntax of Table 2 below would have to be added to StillTextureObject():
TABLE 2
The number of time bits_frame_loop is specified by time_so_fraction_b_ts, which cannot be zero.
Mesh and Face Objects: The access unit of the mesh visual object is the mesh object plane. The access unit of the face visual object is the face object plane. Since no reordering is required, $t_i = \tau_i$. Both objects share a common temporal information specification, temporal_header(). When the mesh/face object is intra coded, temporal_header() can optionally specify a frame rate and a time stamp (an IEC 461 time code specifying hours, minutes, seconds and frames) giving the composition time of the mesh/face object plane. The origin of the time code (00:00:00:00) must agree with the temporal origin used by MPEG-4 Systems (ISO/IEC 14496-1) for DTS and CTS. The time between object planes is (1 + Σ number_of_frames_to_skip) times the frame period. This allows an absolute CTS to be constructed from the CTS of a preceding intra mesh/face object. In order to apply the buffer model to the visual bitstream for mesh/face objects, the VBV parameters controlled by vbv_parameters (except for low_delay and VOP_rate_code) need to be added to MeshObject and FaceObject, as shown in Table 3:
TABLE 3
Rate Buffer Management: Because it is the responsibility of the encoder to prevent the decoder VBV buffer from overflowing or underflowing, the encoder must simulate the decoder VBV buffer. The simulated decoder VBV buffer may be neither too full nor too empty. In order to prevent the decoder VBV from underflowing, all data of a coded VOP must be completely transmitted to the decoder buffer before its decoding time. Assume that the encoding of the i-th VOP starts at time $t_i^e$ and that its decoding time is $t_i$. After the i-th VOP has been encoded, the amount of data to be transmitted is given by the encoder buffer fullness at $t_i^e$ (denoted by $eb_i^e$) plus the coded size of this VOP ($d_i$).
This must be less than or equal to the data received from the channel:
$$eb_i^e + d_i \le \int_{t_i^e}^{t_i} R_{vol,decoder}(t)\,dt$$
where the decoding time $t_i = t_i^e + L$. For a constant delay channel, $R_{vol,decoder}(t) = R_{vol,encoder}(t - L)$. Therefore,
$$eb_i^e + d_i \le \int_{t_i^e - L}^{t_i^e} R_{vol,encoder}(t)\,dt$$
Therefore, the encoder buffer fullness at $t_i^e$ is upper bounded by
$$eb_i^e \le \int_{t_i^e - L}^{t_i^e} R_{vol,encoder}(t)\,dt - d_i \equiv T_2$$
For channels with a known minimum transmission rate $R_{vol,min}$, $T_2$ can be set to a lower bound of this expression, $R_{vol,min} L - d_i$.
In order to prevent the decoder buffer from overflowing, the decoder buffer fullness must be less than the decoder buffer size B at time $t_i$, immediately before the removal of VOP i. This amount can be expressed as the decoder buffer fullness at $t_i^e$ (denoted by $b_i^e$), plus the number of bits that enter the decoder VBV buffer between $t_i^e$ and $t_i$, minus the number of bits removed from the decoder buffer between $t_i^e$ and $t_i$. The number of bits removed is the sum of the encoder buffer occupancy at $t_i^e$ immediately before adding VOP i ($eb_i^e$) and the decoder buffer occupancy at $t_i^e$ ($b_i^e$), because all bitstream data prior to VOP i must be consumed before VOP i can be decoded. These last two quantities represent the bitstream data prior to VOP i, since VOP i has not yet been added to the encoder buffer. Therefore, the total bits in the decoder buffer are bounded by
$$b_i^e + \int_{t_i^e}^{t_i^e + L} R_{vol,decoder}(t)\,dt - (eb_i^e + b_i^e) < B$$
which yields
$$\int_{t_i^e}^{t_i^e + L} R_{vol,decoder}(t)\,dt - eb_i^e < B$$
Therefore, the encoder buffer fullness at $t_i^e$ is lower bounded by
$$eb_i^e > \int_{t_i^e}^{t_i^e + L} R_{vol,decoder}(t)\,dt - B \equiv T_1$$
The same arguments given above regarding the constant delay channel apply here. Likewise, for channels with a known maximum transmission rate $R_{vol,max}$, $T_1$ can be set to an upper bound of this expression, $R_{vol,max} L - B$.
The bounds $T_1$ and $T_2$ are checked in the rate control algorithm, and corrective action is taken through the bit allocation of the VOPs and the adjustment of the quantization levels of the coding units (e.g., VOPs, macroblocks). The encoder must take the following corrective action if the simulated decoder VBV buffer is too full or too empty:
1. If the simulated decoder VBV buffer becomes too full (that is, the encoder VBV buffer is too empty), the encoder can correct the problem by: (a) reducing the quantization level to generate larger VOPs, or (b) outputting stuffing bits at the end of the VOP. Note that generating larger VOPs reduces the decoder VBV occupancy.
2. If the simulated decoder VBV buffer becomes too empty (that is, the encoder VBV buffer is too full), the encoder can correct the problem by: (a) increasing the quantization level to generate fewer bits, (b) delaying the generation of the next VOP (often called VOP skipping), or (c) setting the high-frequency coefficients to zero to reduce the number of bits generated per VOP.
It should now be appreciated that the present invention provides a video rate buffer model for bounding the memory requirements of a video decoder in a push dataflow scenario. The rate buffer model of the present invention constrains the video encoder to produce bitstreams that are decodable with a predetermined decoder buffer size. Push dataflow applications are therefore accommodated efficiently.
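As a non-limiting illustration, the check of the encoder buffer fullness against the two bounds can be sketched for the special case of a constant channel rate R, in which the lower bound reduces to T1 = R x L - B and the upper bound to T2 = R x L - d_i. The constant-rate assumption, the function names, and the action strings are hypothetical; the mapping from each violated bound to the available corrective actions follows the two numbered cases above.

```python
def encoder_buffer_bounds(rate, delay, vop_bits, decoder_buffer_bits):
    """Return (T1, T2) for a constant-rate channel (an assumption of this sketch).

    T1 = R * L - B    lower bound on encoder fullness (prevents decoder overflow)
    T2 = R * L - d_i  upper bound on encoder fullness (prevents decoder underflow)
    """
    bits_in_delay = rate * delay
    return bits_in_delay - decoder_buffer_bits, bits_in_delay - vop_bits


def corrective_actions(encoder_fullness, t1, t2):
    """List the corrective actions available for the current encoder fullness."""
    if encoder_fullness <= t1:
        # Encoder buffer too empty, so the simulated decoder buffer is too full.
        return ["reduce quantization level", "output stuffing bits"]
    if encoder_fullness > t2:
        # Encoder buffer too full, so the simulated decoder buffer is too empty.
        return ["increase quantization level", "skip next VOP",
                "zero high-frequency coefficients"]
    return []
```

For example, with R = 4000 bits per second, L = 2 seconds, a 3000-bit VOP and a 6000-bit decoder buffer, the bounds are T1 = 2000 and T2 = 5000; an encoder fullness of 3000 bits requires no action, whereas 1000 bits calls for a larger VOP or stuffing and 6000 bits calls for fewer bits or a skipped VOP.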

Claims (1)

CLAIMS

Having described the present invention, it is considered a novelty and, therefore, the content of the following claims is claimed as property:

1. An encoder apparatus for enabling a push dataflow bitstream without causing a modeled data buffer with a predetermined memory size for pushed data at a decoder to overflow or underflow, said apparatus comprising: a processor adapted to encode data to provide the bitstream for communication to a decoder; wherein: the encoded data comprises at least one video or visual object (VO) with at least one video or visual object layer (VOL), including an associated header, followed by at least one video or visual object plane (VOP); a field in the VOL header designates an occupancy of the buffer just before the first VOP following the VOL header is removed from the buffer; and said processor uses a simulation to simulate the decoder buffer and controls the bitstream in response to the simulation, to preclude overflow or underflow of the decoder buffer.
2. The apparatus as described in Claim 1, further characterized in that, when the buffer is initially empty, the occupancy field is examined to determine an initial occupancy of the buffer before decoding the initial VOP.
3. The apparatus as described in Claim 1 or 2, further characterized in that the processor provides a signal to control the inclusion of at least one field in the VOL header when equivalent information is not present in a multiplex of the encapsulating system.
4. The apparatus as described in Claim 3, further characterized in that the signal enables a visual elementary stream of the bitstream, as an independent entity, to specify a buffer model.
5. The apparatus as described in Claim 3 or 4, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates a VOP rate of the bitstream.
6. The apparatus as described in any of Claims 3 to 5, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates a peak bit rate of the bitstream.
7. The apparatus as described in any of Claims 3 to 6, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates whether the VOL contains at least one B-VOP.
8. The apparatus as described in any of Claims 3 to 7, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates the size of the modeled buffer.
9. The apparatus as described in any of Claims 3 to 8, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal comprises said field designating the occupancy of the buffer.
10. The apparatus as described in any of the preceding Claims, further characterized in that the video or visual object comprises a video object.
11. The apparatus as described in any of Claims 3 to 9, further characterized in that the video or visual object comprises a still texture object.
12. The apparatus as described in any one of Claims 1 to 9, further characterized in that the video or visual object comprises a mesh object.
13. The apparatus as described in any one of Claims 1 to 9, further characterized in that the video or visual object comprises a face object.
14.
The apparatus as described in any of the preceding Claims, further characterized in that the encoded data comprises a plurality of VOLs, and the decoder buffer model is applied independently to each VOL, using buffer size and rate functions particular to each VOL.
15. The apparatus as described in any of the preceding Claims, further characterized in that the bitstream is compatible with an MPEG-4 video coding standard.
19. [...] (a) $t_i = \tau_i$ if the VOL does not contain B-VOPs, where $\tau_i$ is a composition time of the ith VOP, and (b) $t_i = \tau_i - m_i$ when the ith VOP is an anchor VOP, where $m_i$ accounts for a delay of at least one immediately subsequent B-VOP that is to be composed or presented.
20. The apparatus as described in any one of Claims 1 to 18, further characterized in that the processor determines a decoding time $t_i$ of an ith VOP, where: (a) $t_i = \tau_i$ if the VOL does not contain B-VOPs, where $\tau_i$ is a presentation time of the ith VOP when the decoder is a non-compositing decoder, and (b) $t_i = \tau_i - m_i$ when the ith VOP is an anchor VOP, where $m_i$ accounts for a delay of at least one immediately subsequent B-VOP that is to be composed or presented.
21. The apparatus as described in any one of the preceding Claims, further characterized in that said pushed data stream comprises video data including intra-coded (I), predicted (P) and bidirectionally predicted (B) video object planes (VOPs).
22. The apparatus as described in Claim 21, further characterized in that said processor controls the bitstream by at least one of the following: allocating bits among different VOPs, and adjusting quantization levels of the coding units that form said VOPs.
23. The apparatus as described in Claim 22, further characterized in that the modeled data buffer comprises a video buffering verifier (VBV) buffer.
24.
The apparatus as described in Claim 23, further characterized in that said processor monitors the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too full, a quantization level for said coding units is reduced.
25. The apparatus as described in Claim 23 or 24, further characterized in that said processor monitors the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too empty, a quantization level for said coding units is increased.
26. The apparatus as described in Claim 23 or 24, further characterized in that said processor monitors the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too empty, the generation of the next VOP is delayed.
27. The apparatus as described in Claim 23 or 24, further characterized in that said processor monitors the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too empty, the high-frequency coefficients of said coding units are set to zero, to reduce the number of bits generated per VOP.
28. The apparatus as described in any of Claims 23 or 25 to 27, further characterized in that said processor monitors the VBV buffer and, when it determines that the VBV buffer is or will be too full, stuffing bits are added at the end of at least one VOP.
29.
The apparatus as described in any of the preceding Claims, further characterized in that the processor includes an encoder buffer for receiving the encoded data before providing the bitstream therefrom; and the processor controls the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is upper bounded by $\int_{t_i^e}^{t_i^e+L} R_{vol,decoder}(t)\,dt - d_i$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $d_i$ is the amount of data encoded for the ith VOP, and $R_{vol,decoder}(t)$ is the instantaneous channel bit rate seen by the decoder.
30. The apparatus as described in any of Claims 1 to 28, further characterized in that the processor includes an encoder buffer for receiving the encoded data before providing the bitstream therefrom; and the processor controls the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is upper bounded by $\int_{t_i^e-L}^{t_i^e} R_{vol,encoder}(t)\,dt - d_i$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $d_i$ is the amount of data encoded for the ith VOP, and $R_{vol,encoder}(t)$ is the instantaneous channel bit rate seen by the encoder.
31. The apparatus as described in any one of Claims 1 to 28, further characterized in that the processor includes an encoder buffer for receiving the encoded data before providing the bitstream therefrom; and the processor controls the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is lower bounded by $\int_{t_i^e}^{t_i^e+L} R_{vol,decoder}(t)\,dt - B$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $B$ is the decoder buffer size, and $R_{vol,decoder}(t)$ is the instantaneous channel bit rate seen by the decoder.
32.
The apparatus as described in any one of Claims 1 to 28, further characterized in that the processor includes an encoder buffer for receiving the encoded data before providing the bitstream therefrom; and the processor controls the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is lower bounded by $\int_{t_i^e-L}^{t_i^e} R_{vol,encoder}(t)\,dt - B$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $B$ is the decoder buffer size, and $R_{vol,encoder}(t)$ is the instantaneous channel bit rate seen by the encoder.
33. An encoding method for enabling a push dataflow bitstream without causing a modeled data buffer with a predetermined memory size for pushed data at a decoder to overflow or underflow, said method comprising the steps of: encoding data to provide the bitstream for communication to a decoder; wherein: the encoded data comprises at least one video or visual object (VO) with at least one video or visual object layer (VOL), including an associated header, followed by at least one video or visual object plane (VOP); and a field in the VOL header designates an occupancy of the buffer just before the first VOP following the VOL header is removed from the buffer; and using a simulation to simulate the decoder buffer and controlling the bitstream in response to the simulation, to preclude overflow or underflow of the decoder buffer.
34. The method as described in Claim 33, further characterized in that, when the buffer is initially empty, the occupancy field is examined to determine an initial occupancy of the buffer before decoding the initial VOP.
35.
The method as described in Claim 33 or 34, further characterized in that it comprises the additional step of providing a signal to control the inclusion of at least one field in the VOL header when equivalent information is not present in a multiplex of the encapsulating system.
36. The method as described in Claim 35, further characterized in that the signal enables a visual elementary stream of the bitstream, as an independent entity, to specify a buffer model.
37. The method as described in Claim 35 or 36, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates a VOP rate of the bitstream.
38. The method as described in any of Claims 35 to 37, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates a peak bit rate of the bitstream.
39. The method as described in any of Claims 35 to 38, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates whether the VOL contains at least one B-VOP.
40. The method as described in any of Claims 35 to 39, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates the size of the modeled buffer.
41. The method as described in any of Claims 35 to 40, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal comprises said field designating the occupancy of the buffer.
42. The method as described in any of Claims 33 to 41, further characterized in that the video or visual object comprises a video object.
43. The method as described in any of Claims 33 to 41, further characterized in that the video or visual object comprises a still texture object.
44.
The method as described in any of Claims 33 to 41, further characterized in that the video or visual object comprises a mesh object.
45. The method as described in any of Claims 33 to 41, further characterized in that the video or visual object comprises a face object.
46. The method as described in any of Claims 33 to 45, further characterized in that the encoded data comprises a plurality of VOLs, and a decoder buffer model is applied independently to each VOL, using buffer size and rate functions particular to each VOL.
47. The method as described in any of Claims 33 to 46, further characterized in that the bitstream is compatible with an MPEG-4 video coding standard.
48. The method as described in any of Claims 33 to 47, further characterized in that additional fields are provided in respective subsequent VOL headers to designate respective subsequent buffer occupancy levels.
49. The method as described in Claim 48, further characterized in that it comprises the additional step of maintaining, within a tolerance, a difference between the additional fields in the subsequent VOL headers and a running cumulative buffer occupancy just before the removal of a VOP from the buffer.
50. The method as described in any of Claims 33 to 49, further characterized in that it comprises the additional step of: determining a size ($d_i$) of a current VOP as a number of bits extending to the last bit of the current VOP, starting from either the last bit of the previous VOP or the first bit of a start code for the first VOP of the encoded data.
51. The method as described in any of Claims 33 to 50, further characterized in that it comprises the additional step of: determining a decoding time $t_i$ of an ith VOP, where: (a) $t_i = \tau_i$ if the VOL does not contain B-VOPs, where $\tau_i$ is a composition time of the ith VOP, and (b) $t_i = \tau_i - m_i$ when the ith VOP is an anchor VOP, where $m_i$ accounts for a delay of at least one immediately subsequent B-VOP that is to be composed or presented.
52. The method as described in any of Claims 33 to 50, further characterized in that it comprises the additional step of: determining a decoding time $t_i$ of an ith VOP, where: (a) $t_i = \tau_i$ if the VOL does not contain B-VOPs, where $\tau_i$ is a presentation time of the ith VOP when the decoder is a non-compositing decoder, and (b) $t_i = \tau_i - m_i$ when the ith VOP is an anchor VOP, where $m_i$ accounts for a delay of at least one immediately subsequent B-VOP that is to be composed or presented.
53. The method as described in any of Claims 33 to 50, further characterized in that said pushed data stream comprises video data including intra-coded (I), predicted (P) and bidirectionally predicted (B) video object planes (VOPs).
54. The method as described in Claim 53, further characterized in that it comprises the additional step of controlling said bitstream by at least one of the following steps: allocating bits among different VOPs, and adjusting the quantization levels of the coding units that form said VOPs.
55. The method as described in Claim 54, further characterized in that the modeled data buffer comprises a video buffering verifier (VBV) buffer.
56. The method as described in Claim 55, further characterized in that it comprises the additional step of monitoring the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too full, reducing a quantization level for said coding units.
57. The method as described in Claim 55 or 56, further characterized in that it comprises the additional step of monitoring the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too empty, increasing a quantization level for said coding units.
58. The method as described in Claim 55 or 56, further characterized in that it comprises the additional step of monitoring the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too empty, delaying the generation of the next VOP.
59. The method as described in Claim 55 or 56, further characterized in that it comprises the additional step of monitoring the VBV buffer and, when the simulation indicates that the VBV buffer is or will be too empty, setting the high-frequency coefficients of said coding units to zero, to reduce the number of bits generated per VOP.
60. The method as described in any of Claims 55 or 57 to 59, further characterized in that it comprises the additional step of: monitoring the VBV buffer and, when it is determined that the VBV buffer is or will be too full, adding stuffing bits at the end of at least one VOP.
61. The method as described in any of Claims 33 to 60, further characterized in that an encoder buffer receives the encoded data before providing the bitstream therefrom, comprising the additional step of: controlling the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is upper bounded by $\int_{t_i^e}^{t_i^e+L} R_{vol,decoder}(t)\,dt - d_i$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $d_i$ is the amount of data encoded for the ith VOP, and $R_{vol,decoder}(t)$ is the instantaneous channel bit rate seen by the decoder.
62.
The method as described in any of Claims 33 to 60, further characterized in that an encoder buffer receives the encoded data before providing the bitstream therefrom, comprising the additional step of: controlling the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is upper bounded by $\int_{t_i^e-L}^{t_i^e} R_{vol,encoder}(t)\,dt - d_i$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $d_i$ is the amount of data encoded for the ith VOP, and $R_{vol,encoder}(t)$ is the instantaneous channel bit rate seen by the encoder.
63. The method as described in any of Claims 33 to 60, further characterized in that an encoder buffer receives the encoded data before providing the bitstream therefrom, comprising the additional step of: controlling the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is lower bounded by $\int_{t_i^e}^{t_i^e+L} R_{vol,decoder}(t)\,dt - B$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $B$ is the decoder buffer size, and $R_{vol,decoder}(t)$ is the instantaneous channel bit rate seen by the decoder.
64. The method as described in any of Claims 33 to 60, further characterized in that an encoder buffer receives the encoded data before providing the bitstream therefrom, comprising the additional step of: controlling the rate of the bitstream so that the encoder buffer fullness $eb_i^e$, after encoding an ith VOP, is lower bounded by $\int_{t_i^e-L}^{t_i^e} R_{vol,encoder}(t)\,dt - B$, where $t_i^e$ is the start time for encoding the ith VOP, $L$ is the time difference between the encoding time $t_i^e$ and the decoding time of the ith VOP, $B$ is the decoder buffer size, and $R_{vol,encoder}(t)$ is the instantaneous channel bit rate seen by the encoder.
65.
A decoding apparatus, comprising: a data buffer with a predetermined memory size; and means for receiving a push dataflow bitstream obtained by encoding data in accordance with a buffer model, such that the modeled buffer does not overflow or underflow; wherein: the encoded data comprises at least one video or visual object (VO) with at least one video or visual object layer (VOL), including an associated header, followed by at least one video or visual object plane (VOP); and a field in the VOL header designates an occupancy of the modeled buffer just before the first VOP following the VOL header is removed from the modeled buffer.
66. The apparatus as described in Claim 65, further characterized in that the encoded data comprises a signal that controls the inclusion of at least one field in the VOL header when equivalent information is not present in a multiplex of the encapsulating system.
67. The apparatus as described in Claim 66, further characterized in that the signal enables a visual elementary stream of the bitstream, as an independent entity, to specify a buffer model.
68. The apparatus as described in Claim 66 or 67, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates a VOP rate of the bitstream.
69. The apparatus as described in any of Claims 66 to 68, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates a peak bit rate of the bitstream.
70. The apparatus as described in any of Claims 66 to 69, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates whether the VOL contains at least one B-VOP.
71. The apparatus as described in any of Claims 66 to 70, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal designates the size of the modeled buffer.
72. The apparatus as described in any of Claims 66 to 71, further characterized in that the at least one field whose inclusion in the VOL header is controlled by the signal comprises said field designating the occupancy of the buffer.
73. The apparatus as described in any of Claims 65 to 72, further characterized in that the video or visual object comprises a video object.
74. The apparatus as described in any of Claims 65 to 72, further characterized in that the video or visual object comprises a still texture object.
75. The apparatus as described in any of Claims 65 to 72, further characterized in that the video or visual object comprises a mesh object.
76. The apparatus as described in any of Claims 65 to 72, further characterized in that the video or visual object comprises a face object.
77. The apparatus as described in any of Claims 65 to 76, further characterized in that the encoded data comprises a plurality of VOLs, and a decoder buffer model is applied independently to each VOL, using buffer size and rate functions particular to each VOL.
78. The apparatus as described in any of Claims 65 to 77, further characterized in that the bitstream is compatible with an MPEG-4 video coding standard.
79. The apparatus as described in any of Claims 65 to 78, further characterized in that additional fields are provided in respective subsequent VOL headers to designate respective subsequent buffer occupancy levels.
80.
The apparatus as described in any of Claims 65 to 79, further characterized in that said pushed data stream comprises video data including intra-coded (I), predicted (P) and bidirectionally predicted (B) video object planes (VOPs).
81. A method for providing a push dataflow bitstream at a decoder, said method comprising the steps of: providing a data buffer with a predetermined memory size at the decoder; and receiving the push dataflow bitstream at the decoder, said bitstream being obtained by encoding data in accordance with a buffer model, such that the modeled buffer does not overflow or underflow; wherein: the encoded data comprises at least one video or visual object (VO) with at least one video or visual object layer (VOL), including an associated header, followed by at least one video or visual object plane (VOP); and a field in the VOL header designates an occupancy of the modeled buffer just before the first VOP following the VOL header is removed from the modeled buffer.

SUMMARY

A technique is provided for transmitting data, such as video, using a push dataflow scenario without causing a data rate buffer (32) for the pushed data at a decoder (30) to overflow or underflow. At an encoder (20), the data are encoded for communication to the decoder (30) to provide an output bitstream. The data rate buffer (32) of the decoder is simulated at the encoder (22). The simulation is used to control the output bitstream, to preclude overflow or underflow of the decoder buffer (32). For example, a complementary encoder buffer (22), which operates in a manner opposite to the decoder buffer (32), can be monitored and inverted to provide the simulation. Various different techniques are disclosed for controlling the amount of data produced at the encoder, to keep the data within the confines of the decoder buffer (32).
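The inversion of the complementary encoder buffer mentioned in the summary can be sketched in a few lines of Python. The sketch assumes a channel of constant rate and a fixed end-to-end delay, in which case the encoder buffer fullness at a given time and the decoder buffer fullness one delay later add up to the product of rate and delay. This relation, and the names `simulated_decoder_fullness` and `decoder_buffer_status`, are assumptions made for illustration and are not taken from the specification.

```python
def simulated_decoder_fullness(encoder_fullness_bits, rate_bps, delay_s):
    """Invert the encoder buffer fullness to estimate the decoder buffer fullness.

    Assumes a constant-rate channel and a fixed end-to-end delay, so that
    encoder fullness at time t plus decoder fullness at t + delay equals
    rate * delay.
    """
    return rate_bps * delay_s - encoder_fullness_bits


def decoder_buffer_status(encoder_fullness_bits, rate_bps, delay_s, buffer_size_bits):
    """Classify the simulated decoder buffer as overflowing, underflowing or ok."""
    fullness = simulated_decoder_fullness(encoder_fullness_bits, rate_bps, delay_s)
    if fullness > buffer_size_bits:
        return "overflow"
    if fullness < 0:
        return "underflow"
    return "ok"


if __name__ == "__main__":
    # 1 Mbit/s channel, 0.5 s delay, 400 kbit decoder buffer (illustrative values)
    for encoder_bits in (50_000, 150_000, 520_000):
        print(encoder_bits, decoder_buffer_status(encoder_bits, 1_000_000, 0.5, 400_000))
```

An empty encoder buffer maps to a full decoder buffer and vice versa, which is why monitoring one buffer is enough to control both.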
MXPA/A/2000/012615A 1998-06-19 2000-12-15 Video encoder and encoding method with buffer control MXPA00012615A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US60/090,023 1998-06-19
US09219913 1998-12-23

Publications (1)

Publication Number Publication Date
MXPA00012615A (en) 2002-02-26
