MXPA98000246A - Method for dividing and coding data - Google Patents

Method for dividing and coding data

Info

Publication number
MXPA98000246A
MXPA98000246A, MXPA/A/1998/000246A, MX9800246A
Authority
MX
Mexico
Prior art keywords
data
resolution
segments
definition television
region
Prior art date
Application number
MXPA/A/1998/000246A
Other languages
Spanish (es)
Other versions
MX9800246A (en)
Inventor
Huifang Sun
Joel Walter Zdepski
Tihao Chiang
Original Assignee
Thomson Multimedia Sa
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/572,844 (US5828788A)
Application filed by Thomson Multimedia Sa
Publication of MX9800246A
Publication of MXPA98000246A

Links

Abstract

The present invention relates to a video signal processing system that can be dynamically configured to divide and encode data using a variable number of data segments and a variable data resolution. The system divides the data among a variable number of data segments by predicting, as a function of the data rate, first and second distortion factors for the data divided between first and second numbers of data segments (515-530). The first and second distortion factors are mutually compared, and the data is divided into the number of data segments that exhibits the lowest distortion factor value. First and second distortion factors are also predicted for the data encoded with first and second data resolutions (515-530). These first and second distortion factors are also compared (540), and the data is encoded with the resolution that exhibits the lowest distortion factor value.

Description

METHOD FOR DIVIDING AND CODING DATA

This invention relates to the field of digital image signal processing, and more particularly to a system for processing hierarchical video data.

One objective in the development of digital video encoding and decoding formats has been to provide a standard that accommodates different video transmission and reception systems. Another objective has been to promote interoperability and backward compatibility between generations and different types of video encoding and decoding equipment. In order to promote this interoperability and compatibility, it is desirable to define coding and decoding strategies that can accommodate different types of video image scanning (e.g., interlaced/progressive), frame rate, image resolution, frame size, chrominance coding, and transmission bandwidth.

A strategy that is used to achieve interoperability involves separating the video data into one or more levels of a data hierarchy (layers) organized as an ordered set of bit streams for encoding and transmission. The bit streams vary from a base layer, i.e., a data stream representing the simplest video representation (e.g., lowest resolution), through successive enhancement layers representing incremental video image refinements. The video data is reconstructed from the ordered bit streams by means of a decoder in a receiver. This strategy allows the complexity of the decoder to be tailored to achieve the desired video image quality. A decoder can vary from the most sophisticated configuration, which decodes the entire complement of bit streams, i.e., all the enhancement layers, to the simplest, which decodes only the base layer.

A widely adopted standard that uses this data hierarchy is the MPEG (Moving Pictures Expert Group) coding standard (ISO/IEC 13818-2, May 10, 1994), which is referred to hereafter as the "MPEG standard". The MPEG standard details how base layer and enhancement data can be derived, and how video data can be reconstructed from the layers using a decoder. It is herein recognized that it is desirable to provide a system that incorporates encoder and decoder architectures able to rationally divide the data between the different layers, and to dynamically configure this system for this purpose.
In accordance with the principles of the present invention, a video signal processing system that can be dynamically configured divides and encodes the data using a variable number of data segments and a variable data resolution. A method that is described, in accordance with the present invention, divides the data among a variable number of data segments. The method involves predicting, as a function of the data rate, first and second distortion factors for the data divided between first and second numbers of data segments. The first and second distortion factors are mutually compared, and the data is divided into the number of data segments that exhibits the lowest distortion factor value. In accordance with a feature of the invention, a method for determining the data resolution at which the input data is encoded is also described. The method involves predicting, as a function of the data rate, first and second distortion factors for the data encoded with first and second data resolutions. The first and second distortion factors are mutually compared, and the data is encoded with the resolution that exhibits the lowest distortion factor value.
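As a rough illustration of this selection method (not part of the patent text), the following Python sketch predicts a distortion factor for each candidate number of segments and keeps the minimum; the Gaussian distortion model and the per-segment overhead term are assumptions standing in for whatever predictor an implementation would use:

    def predicted_distortion(data_rate, n_segments, variance=1.0, overhead=0.05):
        # Hypothetical predictor: Gaussian D(R) = variance * 2**(-2R), with an
        # assumed fractional rate overhead for each additional segment (layer).
        usable_rate = max(data_rate * (1.0 - overhead * n_segments), 0.0)
        return variance * 2.0 ** (-2.0 * usable_rate)

    def choose_segment_count(data_rate, candidates=(1, 2)):
        # Mutually compare the predicted distortion factors and divide the data
        # into the number of segments exhibiting the lowest value.
        return min(candidates, key=lambda n: predicted_distortion(data_rate, n))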
BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings:

Figure 1 shows an exemplary architecture for coding and decoding of video signals that can be configured dynamically, in accordance with the invention.
Figure 2 illustrates an exemplary graph of the Peak Signal-to-Noise Ratio (PSNR) plotted against the bit rate, indicating different regions of coding strategy, in accordance with the invention.
Figure 3 depicts a flow chart of a control function that is used to determine the architecture of Figure 1, according to the invention.
Figure 4 shows the coding and decoding system of Figure 1 in the context of an MPEG compatible coding and decoding system.
Figure 5 illustrates the encoder and decoder architecture, according to the invention, for coding and decoding of the type A region.
Figure 6 shows the encoder and decoder architecture, in accordance with the invention, for coding and decoding of the type B region.
Figure 7 shows the encoder and decoder architecture, according to the invention, for coding and decoding of the type C region.
Figure 8 is a variation of Figure 1, with an additional architecture configuration for region A decoding, according to the invention.
Figure 9 is a variation of Figure 1, with an additional architecture configuration for region C decoding, according to the invention.
Figure 10 presents a flow diagram of a method for identifying the region type of the input data, according to the invention.

The MPEG standard refers to the processing of hierarchically ordered bit stream layers in terms of "scalability". One form of MPEG scalability, called "spatial scalability", allows data in different layers to have different frame sizes, frame rates and chrominance coding. Another form of MPEG scalability, called "temporal scalability", allows data in different layers to have different frame rates, but requires identical frame size and chrominance coding. In addition, "temporal scalability" allows an enhancement layer to contain data formed by motion dependent predictions, while "spatial scalability" does not. These types of scalability, and an additional type called "SNR scalability" (SNR stands for Signal to Noise Ratio), are further defined in section 3 of the MPEG standard.

One embodiment of the invention employs "spatial" and "temporal" MPEG scalability in a 2-layer hierarchy (base layer and single enhancement layer). The enhancement layer data accommodates different frame sizes, but a single frame rate and a single chrominance encoding format. Two exemplary frame sizes correspond to the HDTV (High Definition Television) and SDTV (Standard Definition Television) signal formats, as proposed by the Grand Alliance High Definition Television specification in the United States of America, for example. The high definition television frame size is 1080 lines with 1920 samples per line (giving 1080 x 1920 pixels per image), and the standard definition television frame size is 720 lines with 1280 samples per line (giving 720 x 1280 pixels per image). Both the high definition television and standard definition television signals employ a 30 Hz interlaced frame rate and the same chrominance encoding format.

Although the disclosed system is described in the context of an MPEG compatible, two layer (high definition television and standard definition television), spatially and temporally scalable application, this is exemplary only. One skilled in the art can readily extend the disclosed system to more than two layers of video data hierarchy, and to other video data resolutions (not just resolutions of 720 and 1080 lines). Additionally, the principles of the invention can be applied to other forms of scalability, such as SNR scalability, and can also be used to determine a fixed optimal encoder and decoder architecture. The principles of the invention have particular application in television coding (high definition television or standard definition television), Very Low Bit Rate Coding (for example, video conferencing) and digital terrestrial transmission, to optimize the encoder and decoder apparatus for a desired communication bandwidth.

Figure 1 shows a video coding and decoding architecture that can be configured dynamically. In general, an input video data stream is compressed and distributed between a base data layer (standard definition television) and an enhancement data layer (high definition television) by means of an encoder 100. The distribution is carried out in accordance with the principles of the invention, under the control of the bandwidth and architecture control unit 120.
The resulting compressed data of the encoder 100, in the form of single or double bit streams, are formed into data packets including identification headers by the formatter 110. The formatted data exiting the unit 110, after transmission over a data channel, are received by the transport processor 115. The transmission and reception process is described below in connection with the coding and decoding system illustrated in Figure 4. The transport processor 115 (Figure 1) separates the data of the compressed formatted bit stream according to layer type, i.e., base layer or enhancement layer data, based on an analysis of the header information. The data exiting the transport processor 115 is decompressed by the decoder 105. The architecture of the decoder 105 is determined in accordance with the principles of the invention, under the control of the bandwidth and architecture control unit 145. An output of decompressed data resulting from the decoder 105, in the form of uncompressed single or double bit streams, is suitable for encoding as an NTSC format signal and for subsequent visual display.

Considering the dynamically configurable architecture of Figure 1 in detail, an input video data stream is compressed and distributed between a base SDTV data layer and an HDTV enhancement layer, using the encoder 100. The bandwidth and architecture control unit 120 configures the architecture of the encoder 100 to appropriately distribute the data between the high definition television and standard definition television output layers of units 125 and 135, respectively. The proper distribution of data depends on many system factors, including bandwidth, system output data rate constraints, the data rate and image resolution (number of pixels per image) of the input video data, and the image quality and resolution (number of pixels per image) required in each layer. In the described system, the image resolution between the input and the output of both the encoder 100 and the decoder 105 varies by means of changing the number of pixels per image, as described in greater detail below.

The data distribution and coding strategy is derived by determining the minimum number of bits per unit of time required to represent the video input sequence at the output of the encoder 100 for a specified distortion. This is the Rate Distortion Function for the encoder 100. The Rate Distortion Function is evaluated assuming that the input sequence is a Gaussian distribution source signal of mean μ and standard deviation σ. Applying a squared error criterion to the Rate Distortion Function, R, of the Gaussian input sequence, in accordance with the theory presented in section 13.3.2 of "Elements of Information Theory" by T. M. Cover and J. A. Thomas, published by J. Wiley and Sons, 1991, gives

    R(D) = max(0, (1/2) log2(σ²/D))  (bits per second),

that is, R = (1/2) log2(σ²/D) if 0 ≤ D ≤ σ², and R = 0 if D > σ². Therefore, the Distortion Rate Function, D, is given by

    D(R) = σ² · 2^(−2R)

which, represented as a Peak Signal-to-Noise Ratio (PSNR), is

    DPSNR = 10 log10(255²/σ²) + 20 log10(2) · R.

Figure 2 is a graphical representation of the PSNR distortion measure DPSNR in decibels (dB), plotted against the Enhancement layer bit rate (bits per second) for a two-layer spatially encoded system.
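A minimal Python sketch of these Gaussian rate-distortion relations (an illustration only; sigma2 denotes the source variance σ², D is assumed positive, and the rate R is treated here as bits per sample):

    import math

    def rate_for_distortion(D, sigma2):
        # R(D) = max(0, (1/2) * log2(sigma^2 / D)); R = 0 once D exceeds sigma^2.
        return max(0.0, 0.5 * math.log2(sigma2 / D))

    def distortion_for_rate(R, sigma2):
        # D(R) = sigma^2 * 2**(-2R)
        return sigma2 * 2.0 ** (-2.0 * R)

    def psnr_for_rate(R, sigma2):
        # DPSNR = 10*log10(255^2 / sigma^2) + 20*log10(2)*R  (about 6.02 dB per bit)
        return 10.0 * math.log10(255.0 ** 2 / sigma2) + 20.0 * math.log10(2.0) * R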
The curves are plotted for a base layer distortion function, an enhancement layer distortion function, and a distortion function for an exemplary upsampled base layer, for a 1080 line interpolation of a 720 line image. The curves of the base layer and the upsampled base layer have a negative slope, because as the bit rate of the Enhancement layer increases, the bit rate of the base layer decreases. The composite distortion curve for the 2-layer system is shown by the thick black line in Figure 2. This composite distortion curve is a linearized approximation to the minimum distortion that can be obtained for the 2-layer system using an upsampled base layer. A coding and decoding strategy is derived from the results of the two-layer system illustrated in Figure 2. In particular, three regions A, B and C are identified in which gains can be obtained by adopting different coding and decoding approaches. The boundaries of these regions may vary depending on the bandwidth of the system, the system data rate constraints, the data rate and image resolution of the input video data, and the image quality and resolution required in each layer. The regions are identified as follows.
Region A. In region A there is insufficient system bandwidth to achieve the required image quality using either two-layer coding or high-resolution single-layer coding. In this region the video quality of a decoded upsampled base layer equals or exceeds the quality of a decoded image derived from combined base layer and enhancement layer data. This region is bounded at its upper end by a point X on the enhancement layer curve, which gives an image quality (DPSNR value) equivalent to that of the upsampled base layer curve at point Y, at zero Enhancement layer bit rate. In region A there is an advantage in distributing the entire available system bandwidth to the coding and compression of a single layer (the base layer) at a reduced spatial resolution with a reduced number of pixels per image. This strategy can be implemented in different ways. One way, for example, is to downsample an input data stream to provide a single base layer (SDTV) for transmission, and then decode the corresponding received base layer to provide a decoded standard definition television output at the receiver. A higher resolution, high definition decoded television output may be produced in the receiver, in addition to the decoded standard definition television output, by upsampling (oversampling) the decoded standard definition television output. The advantage of this strategy arises because the low bandwidth is used more efficiently when it is distributed to encode a bit stream of the lower resolution layer than when it is used to encode either two layers or a single high resolution layer. This is because these latter approaches typically incur higher coding overhead associated with additional required error protection and data management code, for example, when the available system bandwidth is insufficient to support full resolution coding. The advantage of the region A coding approach may also arise in other situations, for example, when an input data stream to be encoded contains significant non-translational movement. Then, the downward and upward spatial sampling of region A can provide better image quality in a bandwidth restricted system than can be provided by motion compensated prediction coding. This is due to the overhead associated with such motion compensation. The operation of region A is discussed in more detail in connection with Figure 5.
Region B. In region B there is sufficient system bandwidth to meet the required output image quality using a two-layer coding strategy. In this region, the available system bandwidth is distributed between the layers so that the quality requirements of both the high and low resolution decoded outputs are met. This region lies between region A and region C. In region B, the bandwidth of the system is distributed between the high resolution and low resolution signal output layers in accordance with the image quality requirements. The two output layers can be encoded for transmission in different ways. One way, for example, is to downsample and encode the high resolution input data stream to provide a low resolution layer (SDTV) for transmission, and to decode this low resolution layer when it is received, to provide a low resolution standard definition television signal. The high resolution enhancement layer (HDTV) to be transmitted can be derived from a combination of an upsampled version of the encoded standard definition television layer and earlier frames of the high definition television layer. The decoded high definition television output can be derived from a combination of an upsampled version of the decoded standard definition television output and the received encoded high definition television layer. This operation is discussed in more detail in connection with Figure 6.
Region C. In region C, the required image quality cannot be achieved by distributing the bandwidth of the system either to code two layers or to code a single (low resolution) layer. In this region, a high quality output video signal can be achieved, given the system bandwidth restriction, by coding a single high resolution layer. This region is bounded by a point V on the enhancement layer curve, which provides the minimum level of image quality required for the base layer alone (equal to the DPSNR value W of Figure 2). In region C there is an advantage in distributing the entire bandwidth of the system to the coding and compression of a single layer (the enhancement layer) at full spatial resolution, with a full number of pixels per image. This strategy can be implemented in different ways. One way, for example, is to encode the input data stream at full spatial resolution as a single high resolution enhancement layer (HDTV) for transmission, and to decode the corresponding received enhancement layer to provide the high resolution high definition television output. In the receiver, a low resolution output (SDTV) can be derived from the received high resolution signal by means of downsampling in the compressed or decompressed domain, as described below. The advantage of this region C strategy arises because, given the required level of output image quality, the available bandwidth is used more efficiently when it is distributed to encode a single high resolution layer than when it is used to encode two layers for transmission. This is because two-layer encoding requires additional error protection and general data management information. The operation of region C is discussed in more detail in connection with Figure 7. The three regions (A, B and C) identified for the 2-layer system of Figure 2 may not all be present in every two-layer system. For example, only one or two regions may be identifiable, depending on the bandwidth of the system, the system data rate constraints, and the image quality and resolution required in each layer. Conversely, in systems that involve more than two layers, more than three regions can be identified, in accordance with the principles of the invention. However, regardless of the number of data regions that can be identified in a system, adequate decoded image quality can be achieved using coding and decoding architectures that can be configured for only a limited number of identifiable regions. The different coding and decoding strategies associated with regions A, B and C are implemented in the dynamically configurable architecture of Figure 1. In the encoder 100, the appropriate strategy and architecture for distributing data between the HDTV and SDTV output layers are determined by means of the control unit 120. The control unit 120, which includes a microprocessor, for example, configures the architecture of the encoder 100 using the process shown in the flow diagram of Figure 3. The control unit 120 first identifies the region type of the input data in step 315 of Figure 3, after the start in step 310. The region type is determined in accordance with the principles discussed previously, based on factors that include the available system bandwidth, the data rate of the input data stream, and the required image quality of each decompressed output layer.
These factors can be pre-programmed and indicated by data held in memory within the control unit 120, or can be determined from inputs to the control unit 120. For example, the data rate can be detected directly from the input data stream. In addition, externally originated inputs can originate from operator selection, for example, and be entered into the control unit 120 by means of a computer interface. In one implementation, for example, the control unit 120 may derive input data rate threshold values establishing the boundaries between the regions A, B and C, based on previously programmed values indicating the system bandwidth and the required image quality of each decompressed output layer. Then, the control unit 120 adopts the coding strategy of the appropriate region A, B or C based on the data rate of the input data stream reaching particular thresholds. Alternatively, the input data rate threshold values may themselves be pre-programmed within the unit 120. The region type of the input data is identified in step 315 of Figure 3, using the method shown in the flow diagram of Figure 10. In step 515 of Figure 10, after the start in step 510, a single hierarchical layer and an image resolution of 1080 lines are initially selected as the coding configuration. In step 525, the predicted distortion factor for the input data is calculated when it is encoded as a single layer for transmission with a resolution of 1080 lines. Step 530 indicates that steps 515 and 525 are repeated to calculate the distortion factor for a single-layer encoding implementation with a resolution of 720 lines. In addition, step 530 indicates that steps 515 and 525 are repeated again to calculate the distortion factor for a two-layer encoding implementation with resolutions of both 720 and 1080 lines. In step 540 the resulting distortion factors are compared, and the image resolution and the number of hierarchical layers to be used for coding are determined. In step 550 the selection process ends. In step 540, the number of layers and the image resolution are selected to give the minimum distortion factor. This layer and resolution selection process implements the coding region identification function of step 315 (Figure 3), as sketched after this paragraph. It should be noted that this encoded input data division method can also be used for a variety of applications in which data is to be prepared for transmission, and is not restricted to image processing. For example, the process can be used for telephony, satellite or terrestrial communication, including microwave and fiber optic communication. In addition, this process can encompass other types of data and the division of data into other types of data segments or data packets, not just hierarchical layers of encoded data. The process may also encompass different numbers of data segments and data resolutions, not only the two layers and the two data resolutions described with respect to the preferred embodiment. If region A is selected, step 320 (Figure 3) indicates that step 325 is performed, and the encoder 100 is configured for a type A architecture. In addition, the formatter 110 encodes the transmitted bit stream to indicate the region type of the data and the appropriate decoding architecture, using the information provided by the control unit 120.
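The Figure 10 layer and resolution selection referenced above can be sketched as follows (a simplified illustration; predict_distortion is an assumed callable implementing a rate-distortion model such as the Gaussian one given earlier):

    CANDIDATE_CONFIGS = (
        (1, (1080,)),      # step 525: single layer, 1080 line resolution
        (1, (720,)),       # step 530: single layer, 720 line resolution
        (2, (720, 1080)),  # step 530: two layers, 720 line base + 1080 line enhancement
    )

    def identify_coding_region(data_rate, predict_distortion):
        # Step 540: choose the layer count and resolution(s) that exhibit the
        # minimum predicted distortion factor at the measured input data rate.
        return min(CANDIDATE_CONFIGS,
                   key=lambda cfg: predict_distortion(data_rate, cfg))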
The decoder 105 is configured in a compatible manner to decode the transmitted type A region data, in response to the coded architecture information. If the data is from the type C region, step 330 indicates that step 335 is performed. Step 335 causes the encoder 100 to be configured for a region C architecture, and the transmitted bit stream to be updated to indicate the data and decoding architecture type, in the manner described for region A. If the data is not from the type C region, step 330 indicates that step 340 is performed. Step 340 causes the encoder 100 to be configured for a type B region architecture, and the transmitted bit stream to be updated to indicate the data and decoding architecture type, in the manner described for region A. The control unit 120 configures the encoder 100 by a Configuration signal C1 that is provided to each of the constituent elements of the encoder 100. The control unit 120 updates the configuration of the encoder 100 for individual input data packets, wherein a data packet consists of sequences of code words and represents a group of images, for example a Group of Pictures in accordance with the MPEG standard. However, the control unit 120 may update the configuration of the encoder 100 for different data packet lengths, as appropriate for a particular system. For example, the configuration can be performed at power up, for each image, for each image stream (e.g., program), for each block of pixels (e.g., macroblock), or at varying time intervals. In the region A operating mode, the control unit 120 disables, by means of the Configuration signal, both the high definition television compressor 125 and the 2:3 upsampler 130. In the resulting configuration of the encoder 100, a single standard definition television output layer is provided to the formatter 110 by the unit 135 of the unit 100 for transmission. This configuration is shown and discussed in connection with Figure 5. Continuing with Figure 1, to produce the standard definition television layer output, the 3:2 downsampler 140 reduces the spatial resolution of the 1080 line input data stream by a factor of 2/3, to provide a 720 line output. This can be achieved by a variety of known methods including, for example, simply discarding every third line, or preferably performing an interpolation and averaging process to provide two interpolated lines for every three original lines (one such approach is sketched after this paragraph). The 720 line output from the downsampler 140 is compressed by the standard definition television compressor 135 to provide compressed standard definition television layer data to the formatter 110. The compression performed by the unit 135 employs a temporal prediction process using previous standard definition television layer frames stored inside the compressor 135. This compression process, involving temporal prediction and Discrete Cosine Transform (DCT) compression, is known and described, for example, in Chapter 3 of the Grand Alliance HDTV System Specification of April 14, 1994, published by the Office of Science and Technology of the National Association of Broadcasters (NAB) in its 1994 Proceedings of the 48th Annual Conference. The resulting standard definition television bit stream is formed into data packets including identification headers and architecture information by the formatter 110.
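One possible form of the interpolation and averaging mentioned above is sketched below (an assumption for illustration; a practical system would likely use better polyphase filters):

    def downsample_3_2(lines):
        # lines: list of rows of luma samples; row count assumed divisible by 3.
        # Produces two interpolated lines for every three original lines
        # (e.g., 1080 lines -> 720 lines).
        out = []
        for i in range(0, len(lines) - 2, 3):
            a, b, c = lines[i], lines[i + 1], lines[i + 2]
            out.append([(2 * x + y) / 3.0 for x, y in zip(a, b)])
            out.append([(y + 2 * z) / 3.0 for y, z in zip(b, c)])
        return out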
The architecture information is provided by the control unit 120, and is encoded by the formatter 110 within the transmitted bit stream using the "Hierarchy Descriptor" described in sections 2.6.6 and 2.6.7 of the MPEG systems standard (ISO/IEC 13818-1, June 10, 1994). The decoder 105 subsequently uses the architecture information to compatibly configure the decoder 105 for the appropriate decoding mode (e.g., region A, B or C mode). The configuration of the decoder 105, like that of the encoder 100, is updated for each transmitted data packet. A data packet contains a group of pictures in this preferred embodiment. Although use of the MPEG "Hierarchy Descriptor" is the preferred method of ensuring that the encoder 100 and the decoder 105 are configured compatibly, other methods are possible. The architecture information can be encoded, for example, in MPEG syntax in the "User Data" field defined in section 6.2.2.2.2 of the MPEG standard. Alternatively, the decoder 105 can deduce the appropriate decoding mode from the bit rate of the received coded data stream, determined from the bit rate field of the sequence header per section 6.2.2.1 of the MPEG standard. The decoder may use this bit rate information, together with previously programmed data giving details of the bandwidth and video quality requirements of the decoded output, to deduce the appropriate decoding mode in accordance with the principles of the invention described above. The decoding mode can be changed, for example, when the received bit rate reaches pre-programmed thresholds. The stream of formatted compressed data exiting the unit 110 is transported over a transmission channel before being input to the transport processor 115. Figure 4 shows the overall system, including the elements of Figure 1 as well as the transmission and reception elements 410-435. These transmission and reception elements are known and described, for example, in the reference text Digital Communication, Lee and Messerschmidt (Kluwer Academic Press, Boston, MA, USA, 1988). The transmission encoder 410 encodes the formatted output of the unit 110 (Figures 1 and 4) for transmission. The encoder 410 typically sequentially scrambles, error encodes and interleaves the formatted data to condition the data for transmission prior to modulation by the modulator 415. The modulator 415 then modulates a carrier frequency with the output of the encoder 410 in a particular modulation format, for example Quadrature Amplitude Modulation (QAM). The modulated carrier output of the modulator 415 is then frequency shifted and transmitted by the upconverter and transmitter 420, which may be, for example, a local area broadcast transmitter. It should be noted that, although described as a single channel transmission system, the bit stream information can be transmitted equally well in a multi-channel transmission system, for example where one channel is allocated for each bit stream layer. The transmitted signal is received and processed by the antenna and input processor 425 in a receiver. The unit 425 typically includes a radio frequency (RF) tuner and intermediate frequency (IF) mixer and amplification stages for downconverting the received input signal to a lower frequency band suitable for further processing.
The output from the unit 425 is demodulated by the unit 430, which tracks the carrier frequency and recovers the transmitted data as well as associated timing data (e.g., a clock frequency). The transmission decoder 435 performs the inverse of the operations performed by the encoder 410. The decoder 435 sequentially de-interleaves, error decodes and descrambles the demodulated data output from the unit 430, using the timing data derived by the unit 430. Additional information regarding these functions may be found in the aforementioned text by Lee and Messerschmidt, for example. The transport processor 115 (Figures 1 and 4) extracts synchronization and error indication information from the compressed data output from the unit 435. This information is used in the subsequent decompression of the compressed video data performed by the decoder 105. The processor 115 also extracts decoding architecture information from the MPEG Hierarchy Descriptor field within the compressed data from the unit 435. This architecture information is provided to the decoder bandwidth and architecture control unit 145 (Figure 1). The unit 145 uses this information to compatibly configure the decoder 105 for the appropriate decoding mode (e.g., region A, B or C mode). The control unit 145 configures the decoder 105 by means of a second Configuration signal C2, which is provided to each constituent element of the decoder 105. In the region A mode, the control unit 145 of Figure 1 disables, by the second Configuration signal, both the high definition television decompressor 150 and the adaptation unit 165. In the resulting configuration of the decoder 105, the compressed video output of the standard definition television layer from the processor 115 is decompressed by the standard definition television decompressor 160, to provide a standard definition television output sequence with a resolution of 720 lines. The decompression process is known and defined in the MPEG standard mentioned above. In addition, the upsampler 155 oversamples the 720 line standard definition television output by a factor of 3/2, to provide a decompressed high definition television output with a resolution of 1080 lines. This can be achieved by a variety of known methods including, for example, interpolation and averaging to provide three interpolated lines for every two original lines. The decompressed 1080 line output of the upsampler 155 is selected by the multiplexer 180, in response to the second Configuration signal, as the decompressed high definition television output sequence. The decompressed standard definition television and high definition television data outputs from the decoder 105 are suitable for encoding as an NTSC format signal by unit 440 of Figure 4, for example, and for subsequent visual display. Figure 5 shows the coding and decoding apparatus of Figure 1 configured for coding and decoding of the type A region. The functions of the elements shown are as described above. The upsampler 130 and the high definition television compressor 125, which are shown in the encoder 100 of Figure 1, are absent in Figure 5, since these elements are disabled in the region A mode as described above. Similarly, the high definition television decompressor 150 and the adaptation unit 165, which are shown in the decoder 105 of Figure 1, are absent in Figure 5, since these elements are also disabled in the region A mode as described above.
If the input data in Figure 1 is in the type B region, the control unit 120 configures the encoder 100 for a region B architecture. This is done using the Configuration signal, in a manner similar to that described above for region A. However, in region B, the encoder 100 compresses both high resolution and low resolution output layers for transmission, in contrast to only the low resolution output for region A. This configuration is shown and discussed in connection with Figure 6. Continuing with Figure 1, the control unit 120 distributes the system bandwidth between the high resolution and low resolution output layers by configuring the encoder 100 to compress enhancement data as a high resolution high definition television output layer, in addition to a low resolution standard definition television output. This high definition television layer provides image refinement data enabling the decoder 105 to produce an image output with a resolution of 1080 lines from the 720 line standard definition television layer. The standard definition television layer output in region B is produced in the same manner as described for region A: the 720 line output of the downsampler 140 is compressed by the standard definition television compressor 135 to provide compressed standard definition television layer data to the formatter 110. However, in region B, the high resolution high definition television enhancement layer for transmission is derived by the high definition television compressor 125. The compressor 125 derives the high definition television output by combining and compressing an upsampled decompressed version of the standard definition television layer, produced by the upsampler/decompressor 130, and previous high definition television layer frames stored inside the compressor 125. This combination and compression process, involving temporal prediction, performed by the compressor 125 is known and described, for example, in the spatial scalability section (section 7.7) of the MPEG standard. The resulting high definition television and standard definition television outputs of the encoder 100 are provided to the formatter 110. The high definition television and standard definition television bit streams from the encoder 100 are formed by the formatter 110 into data packets including identification headers and architecture information in the "Hierarchy Descriptor" field. As described for region A, the formatted data of the unit 110 is transported to the transport processor 115, which provides the architecture information to the decompressor control unit 145 to configure the decoder 105 (here for region B). At the receiver, in region B mode, the control unit 145 disables the adaptation unit 165 using the second Configuration signal. In the resulting configuration of the decoder 105, the compressed standard definition television output of the processor 115 is decompressed by the unit 160 to give a standard definition television output with a resolution of 720 lines, as in region A. The high definition television decompressor 150 derives a decompressed high definition television output with a resolution of 1080 lines by combining and decompressing an upsampled version of this decoded standard definition television output, produced by the upsampler 155, and previous frames of the high definition television layer stored inside the decompressor 150.
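The region B encoder path described above might be outlined as follows (a sketch under stated assumptions: sd_codec, hd_codec and upsample_2_3 are hypothetical helpers, downsample_3_2 is the earlier sketch, and the temporal prediction from prior high definition television frames is omitted for brevity):

    def encode_two_layers(hd_frame, sd_codec, hd_codec, upsample_2_3):
        sd_frame = downsample_3_2(hd_frame)       # unit 140: 1080 -> 720 lines
        sd_bits = sd_codec.compress(sd_frame)     # unit 135: base SDTV layer
        prediction = upsample_2_3(sd_codec.decompress(sd_bits))  # unit 130
        # The enhancement layer carries the refinement between the input and
        # the upsampled base layer reconstruction.
        residual = [[h - p for h, p in zip(hr, pr)]
                    for hr, pr in zip(hd_frame, prediction)]
        hd_bits = hd_codec.compress(residual)     # unit 125: HDTV layer
        return sd_bits, hd_bits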
The process of combining the upsampled and stored data, and forming a decompressed output, as performed by the decompressor 150, is known and described, for example, in the spatial scalability section (section 7.7) of the MPEG standard. The decompressed high resolution 1080 line output from the decompressor 150 is selected as the decompressed high definition television output by the multiplexer 180, in response to the second Configuration signal. The decompressed high definition television and standard definition television data outputs from the decoder 105 are suitable for further processing and subsequent visual display as described above. Figure 6 shows the encoder and decoder apparatus of Figure 1 configured for coding and decoding of the type B region. The functions of the elements shown are as described above. The adaptation unit 165, which is shown in the decoder 105 of Figure 1, is absent in Figure 6, since this element is disabled in the region B mode, as also described above. If the input data in Figure 1 is in the type C region, the control unit 120 configures the encoder 100 for a region C architecture. This is done using the Configuration signal in a manner similar to that described above for region A. However, in region C, the encoder 100 encodes a single high resolution output, rather than a low resolution output as for region A or two outputs as for region B. The control unit 120 distributes the entire system bandwidth, if necessary, to encode a high resolution output, and configures the unit 100, by means of the Configuration signal, to encode the enhancement layer at full spatial high definition television resolution (1080 lines). In the region C mode, the control unit 120 disables the downsampler 140, the standard definition television compressor 135 and the upsampler 130, by means of the Configuration signal. In the resulting configuration of the encoder 100, the input sequence is compressed by the high definition television compressor 125 using the full system bandwidth, as required, to provide a high definition television output with a resolution of 1080 lines to the formatter 110. This configuration is shown and discussed in connection with Figure 7. Continuing with Figure 1, the compressor 125 derives the high definition television output using previous high definition television layer frames stored inside the compressor 125. The compression process performed by the compressor 125 in region C is like that described for regions A and B, and is also known. The high definition television bit stream from the unit 100 is formed by the formatter 110 into data packets including identification headers and architecture information in the "Hierarchy Descriptor" field. As described for region A, the formatted data of the unit 110 is transported to the transport processor 115, which provides the architecture information to the decompressor control unit 145 to configure the decoder 105 (here for region C). At the receiver, in the region C mode, the control unit 145 disables the upsampler 155 using the second Configuration signal. In the resulting configuration of the decoder 105, the compressed high definition television output of the processor 115 is decompressed by the unit 150, to give a high resolution high definition television output of 1080 lines.
This decompressed 1080 line output from the decompressor 150 is selected as the decoded high definition television output of the decoder 105 by the multiplexer 180, in response to the second Configuration signal. In addition, the compressed high definition television output from the processor 115 is adapted to meet the input requirements of the standard definition television decompressor 160 via the adaptation unit 165. This is done by reducing the spatial resolution of the compressed high definition television output from the processor 115 to an effective resolution of 720 lines in the compressed (frequency) domain. This can be done, for example, by discarding the higher frequency coefficients of the Discrete Cosine Transform (DCT) coefficients representing the video information of the compressed high definition television output from the processor 115 (a sketch of this coefficient truncation follows this passage). This process is known and described, for example, in "Manipulation and Composition of MC-DCT Compressed Video" by S. Chang et al., published in the IEEE Journal of Selected Areas in Communications (JSAC), January 1995. The spatially reduced compressed output from the adaptation unit is decompressed by the unit 160 to give a standard definition television output with a resolution of 720 lines. The decompression processes performed by the units 160 and 150 are like those described for region A, and are also known. The resulting decoded high definition television and standard definition television data outputs from the decoder 105 are suitable for further processing and subsequent visual display, as described above. Figure 7 shows the encoder and decoder apparatus of Figure 1 configured for coding and decoding of the type C region. The functions of the elements shown are as described above. The downsampler 140, the standard definition television compressor 135 and the upsampler 130, which are shown in the encoder of Figure 1, are absent in Figure 7, since these elements are disabled in the region C mode, as described above. Similarly, the upsampler 155, which is shown in the decoder 105 of Figure 1, is absent in Figure 7, since this element is disabled in the region C mode. Figure 8 is a variation of Figure 1, and shows an additional architecture configuration for region A decoding. The functions performed by the encoder 100, the formatter 110 and the transport processor 115 of Figure 8 are as described for Figure 1. In addition, the functions of the decoder 109 of Figure 8 are the same as those of the decoder 105 of Figure 1, except that in region A decoding, the decompressed high definition television output with a resolution of 1080 lines is provided in a different way. In the region A mode, the control unit 149 of the decoder of Figure 8 disables, by the second Configuration signal, both the upsampler 155 and the adaptation unit 165. In the resulting configuration of the decoder 109, the compressed video output of the standard definition television layer from the processor 115 is decompressed by the standard definition television decompressor 160 to provide the standard definition television output of the decoder 109. This is done in the same manner as described for Figure 1. However, the decompressed high definition television output from the decoder 109 is produced by upsampling the standard definition television layer in the frequency domain, in contrast to the spatial (pixel) domain upsampling performed in the decoder 105 of Figure 1.
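A sketch of the coefficient truncation used by the adaptation unit 165 (an illustration only; the 8x8 block size is the MPEG transform size, but the kept dimension is an assumption, since the exact 2/3 ratio is not an integer fraction of 8):

    def truncate_dct_block(block, keep=5):
        # block: 8x8 list of lists of DCT coefficients. Keep only the top-left
        # keep x keep low-frequency corner, discarding the higher order
        # coefficients, for an approximately keep/8 resolution reduction.
        return [row[:keep] for row in block[:keep]]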
The compressed output from the processor 115 in Figure 8 is upsampled in the compressed (frequency) domain by the adaptation unit 168 (not shown in Figure 1). This can be done, for example, by "zeroing" the highest order Discrete Cosine Transform (DCT) frequency coefficients representing the video information in the compressed standard definition television output from the processor 115. In effect, the selected highest order Discrete Cosine Transform coefficients are assigned zero values (a sketch of this zero-padding follows this passage). The theory behind this process is known and described, for example, in "Manipulation and Composition of MC-DCT Compressed Video" by S. Chang et al., published in the IEEE Journal of Selected Areas in Communications (JSAC), January 1995, mentioned above. The resulting upsampled output of the adaptation unit 168 is decompressed by the high definition television decompressor 152 to provide the high definition television output of the decoder 109. The resulting decompressed high definition television and standard definition television data outputs of the decoder 109 are suitable for processing and subsequent visual display, as described in connection with Figure 1. Figure 9 is a variation of Figure 1, and shows an additional architecture configuration for region C decoding. The functions performed by the encoder 100, the formatter 110 and the transport processor 115 of Figure 9 are as described for Figure 1. In addition, the functions of the decoder 107 of Figure 9 are the same as those of the decoder 105 of Figure 1, except that in region C decoding, the decompressed standard definition television output with a resolution of 720 lines is provided in a different way. In the region C mode, the control unit 147 of Figure 9 disables, by the second Configuration signal, both the upsampler 155 and the standard definition television decompressor 162. In the resulting configuration of the decoder 107, the compressed video output of the high definition television layer from the processor 115 is decompressed by the high definition television decompressor 150 to provide the high definition television output of the decoder 107. This is performed in the same manner as described for Figure 1. However, the decompressed standard definition television output of the decoder 107 is produced by downsampling the high definition television layer in the spatial (pixel) domain, in contrast to the frequency domain downsampling performed in the decoder 105 of Figure 1. The decompressed high definition television output of the multiplexer 180 in Figure 9 is downsampled by the downsampler 170 (not shown in Figure 1) by a factor of 2/3, to provide a 720 line output. This can be done by a variety of known methods, as discussed with respect to the downsampler 140 of the encoder 100 in Figure 1. The decompressed 720 line output of the downsampler 170 is selected by the multiplexer 175 (not present in Figure 1), in response to the second Configuration signal, as the decoded standard definition television output of the decoder 107. The decompressed high definition television and standard definition television data outputs of the decoder 107 are suitable for processing and subsequent visual display, as described in connection with Figure 1. The encoder and decoder architectures discussed with respect to Figures 1-9 are not exclusive. Other architectures can be derived for the individual regions (A, B and C) that can achieve the same goals.
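Conversely, the "zeroing" performed by the adaptation unit 168 can be pictured as embedding each block in a larger one whose added high order coefficients are zero (sizes are illustrative; normalization factors required by a DCT size change are omitted):

    def zero_pad_dct_block(block, out_size=12):
        # block: NxN DCT coefficients. Return an out_size x out_size block with
        # the original coefficients in the low-frequency corner and zeros
        # elsewhere (8 -> 12 matches the 2:3 increase in line count).
        n = len(block)
        padded = [[0.0] * out_size for _ in range(out_size)]
        for i in range(n):
            for j in range(n):
                padded[i][j] = block[i][j]
        return padded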
In addition, the functions of the elements of the different architectures can be implemented, in whole or in part, within the programmed instructions of a microprocessor.

Claims (12)

1. A method for dividing data, comprising the steps of: (a) predicting a first distortion factor for a first number of segments of the data, the prediction being made as a function of the data rate of that data; (b) predicting a second distortion factor for a second number of segments of the data, the prediction being made as a function of the data rate of that data; (c) mutually comparing the first and second distortion factors; (d) determining which of the first and second numbers of segments exhibits the lowest distortion factor, based on the mutual comparison; and (e) dividing the data into the determined number of data segments.
2. A method, in accordance with claim 1, wherein the prediction step (a) includes the step of predictively arranging the data in the form of the first number of segments; and the prediction step (b) includes the step of predictively arranging the data in the form of the second number of segments.
3. A method, according to claim 2, characterized in that it also includes the step of (f) compressing the divided data.
4. A method, according to claim 1, wherein the prediction step (a) includes the steps of: selecting a first number of data segments; predictively forming that data into hierarchically ordered data segments of the first number; and calculating a first distortion factor for the data in the form of the first number of hierarchically ordered data segments, as a function of the data rate of that data; and the prediction step (b) includes the steps of: selecting a second number of data segments; predictively forming that data into hierarchically ordered data segments of the second number; and calculating a second distortion factor for the data in the form of the second number of hierarchically ordered data segments, as a function of the data rate of that data.
5. A method, according to claim 4, wherein the data is image data.
6. A method, according to claim 5, wherein the forming steps predictively form the image data into one or more hierarchical layers of compressed data.
7. A method, according to claim 5, wherein the calculating steps predictively calculate the distortion factors for decompressed hierarchical layers of the compressed data.
8. A method for encoding input data, comprising the steps of: (a) predicting a first distortion factor for the input data having a first data resolution, the prediction being made as a function of the data rate of the input data; (b) predicting a second distortion factor for the input data having a second data resolution, the prediction being made as a function of the data rate of the input data; (c) mutually comparing the first and second distortion factors; (d) determining which of the first and second data resolutions exhibits the lowest distortion factor value, based on the mutual comparison; and (e) encoding the input data at the determined data resolution.
9. A method, according to claim 8, wherein the prediction step (a) includes the steps of: selecting the first data resolution; predictively converting the input data to the first data resolution; and calculating a first distortion factor for the data having the first data resolution, as a function of the data rate of that input data; and the prediction step (b) includes the steps of: selecting a second data resolution; predictively converting the input data to the second data resolution; and calculating a second distortion factor for the data having the second data resolution, as a function of the data rate of that input data.
10. A method, according to claim 8 or 9, wherein the input data is image data.
11. A method, according to claim 10, wherein the coding step compresses the image data at the determined data resolution.
12. A method, according to claim 10, wherein in steps (a) and (b) the prediction is a function of a previously determined quality requirement of the encoded image data.
MXPA/A/1998/000246A 1995-06-29 1998-01-07 Method for dividing and coding data

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US67595P 1995-06-29 1995-06-29
US000675 1995-06-29
US08572844 1995-12-14
US08/572,844 US5828788A (en) 1995-06-29 1995-12-14 System for processing data in variable segments and with variable data resolution
PCT/IB1996/000722 WO1997001935A1 (en) 1995-06-29 1996-06-04 Method for partitioning and encoding data

Publications (2)

Publication Number Publication Date
MX9800246A MX9800246A (en) 1998-07-31
MXPA98000246A true MXPA98000246A (en) 1998-11-09


Similar Documents

Publication Publication Date Title
KR100471583B1 (en) Method for partitioning and encoding data
AU713904B2 (en) System for encoding and decoding layered compressed video data
JP2795420B2 (en) Method and apparatus and system for compressing digitized video signal
JP4240554B2 (en) Image encoding apparatus, image encoding method, image decoding apparatus, and image decoding method
EP0644695A2 (en) Spatially scalable video encoding and decoding
EP0671102B1 (en) Picture-in-picture tv with insertion of a mean only frame into a full size frame
US20050129130A1 (en) Color space coding framework
WO2006061794A1 (en) System and method for real-time transcoding of digital video for fine-granular scalability
US20060072667A1 (en) Transcoder for a variable length coded data stream
JP2542025B2 (en) System including an apparatus for encoding a broadcast quality television signal enabling transmission as an embedded code and an apparatus for decoding the encoded signal
US6040875A (en) Method to compensate for a fade in a digital video input sequence
Petajan the HDTV grand alliance system
MXPA98000246A Method for dividing and coding data
JP2006518561A (en) Method and apparatus for preventing error propagation in a video sequence
Challapali et al. Video compression for digital television applications
KR0171749B1 (en) A compatible encoder
EP1711016A2 (en) Coding data
TW309690B (en)
KR100192778B1 (en) A compatible encoder and decoder using ptsvq
JP4674593B2 (en) Image encoding device
JP4674613B2 (en) ISDB transmission device, ISDB transmission method, ISDB reception device, and ISDB reception method