JP4622121B2 - Data conversion apparatus and method, and encoding apparatus and method - Google Patents


Info

Publication number
JP4622121B2
JP4622121B2 (application number JP2001065077A)
Authority
JP
Japan
Prior art keywords
information
encoding
transcoding
output
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2001065077A
Other languages
Japanese (ja)
Other versions
JP2002044622A (en)
Inventor
Peter Kuhn
Teruhiko Suzuki
Original Assignee
Sony Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to JP2000-068719 priority Critical
Priority to JP2000-147768 priority
Application filed by Sony Corporation
Priority to JP2001065077A priority patent/JP4622121B2/en
Publication of JP2002044622A publication Critical patent/JP2002044622A/en
Application granted granted Critical
Publication of JP4622121B2 publication Critical patent/JP4622121B2/en
Application status: Active

Description

[0001]
BACKGROUND OF THE INVENTION
  The present invention relates to a data conversion apparatus and method, and an encoding apparatus and method. In particular, it relates to a data conversion apparatus and method, and an encoding apparatus and method, suitable for use in systems that record a moving image signal on a recording medium such as a magneto-optical disk or magnetic tape and reproduce and display it on a display or the like, or that transmit a moving image signal from a transmitting side to a receiving side via a transmission path and receive and display it on the receiving side, or edit and record it, as in video conference systems, video phone systems, broadcasting equipment, and multimedia database search systems.
[0002]
[Prior art]
For example, in systems that transmit a moving image signal to a remote location, such as video conference and video phone systems, the image signal is compressed and encoded using the line correlation within the video signal and the correlation between frames in order to use the transmission path efficiently.
[0003]
A representative high-efficiency encoding method for moving images is the Moving Picture Experts Group (MPEG) method, a moving image encoding method for storage. It was discussed in ISO/IEC JTC1/SC2/WG11 and proposed as a standard, and adopts a hybrid method combining motion-compensated predictive coding and DCT (Discrete Cosine Transform) coding.
[0004]
In MPEG, several profiles and levels are defined to support various applications and functions. The most basic is the main profile at main level (MP@ML).
[0005]
A configuration example of an MPEG MP@ML (Main Profile at Main Level) encoder will be described with reference to FIG.
[0006]
The input image signal is first input to the frame memory group 1 and encoded in a predetermined order.
[0007]
Image data to be encoded is input to the motion vector detection circuit 2 in units of macroblocks. The motion vector detection circuit 2 processes the image data of each frame as an I picture, P picture, or B picture according to a preset sequence; the picture type of each sequentially input frame is predetermined (for example, the frames are processed in the order I, B, P, B, P, ... B, P).
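The fixed picture-type ordering described above can be sketched as follows; the GOP length and B-picture spacing are illustrative assumptions, not values fixed by this description.

```python
def picture_types(num_frames, gop_size=9, b_between=1):
    """Assign I/P/B picture types in a fixed, predetermined sequence,
    e.g. I, B, P, B, P, ... as described for the motion vector
    detection circuit.  gop_size and b_between are illustrative."""
    types = []
    for i in range(num_frames):
        pos = i % gop_size
        if pos == 0:
            types.append('I')            # GOP starts with an I picture
        elif pos % (b_between + 1) == 0:
            types.append('P')            # anchor pictures
        else:
            types.append('B')            # bidirectionally predicted
    return types
```

With the defaults this reproduces the I, B, P, B, P, ... pattern given in the text.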
[0008]
The motion vector detection circuit 2 performs motion compensation with reference to a predetermined reference frame and detects the motion vector. There are three types of motion compensation (interframe prediction): forward prediction, backward prediction, and bidirectional prediction. The only prediction mode for P pictures is forward prediction, while B pictures have all three prediction modes. The motion vector detection circuit 2 selects the prediction mode that minimizes the prediction error and outputs that prediction mode.
[0009]
At this time, the prediction error is compared with, for example, the variance of the macroblock to be encoded. When the variance of the macroblock is smaller, no prediction is performed on that macroblock, and intraframe encoding is performed instead. In this case, the prediction mode is intra-picture coding (intra). The motion vector and the prediction mode are input to the variable length coding circuit 6 and the motion compensation circuit 12.
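The intra/inter decision above, comparing the best prediction error against the macroblock's own variance, might be sketched like this; the sum-of-squared-differences error measure and the flat-list data layout are assumptions for illustration.

```python
def select_mode(mb, candidates):
    """Choose the prediction mode with the smallest prediction error,
    then fall back to intra coding when the macroblock's own variance
    is smaller than that error.  `mb` is a flat list of pixel values;
    `candidates` maps mode name -> predicted block (illustrative)."""
    def sse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    mean = sum(mb) / len(mb)
    variance = sum((x - mean) ** 2 for x in mb)
    best_mode, best_err = min(
        ((mode, sse(mb, pred)) for mode, pred in candidates.items()),
        key=lambda t: t[1])
    # Flat macroblocks are cheaper to code directly than to predict.
    if variance < best_err:
        return 'intra'
    return best_mode
```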
[0010]
The motion compensation circuit 12 generates a predicted image based on a predetermined motion vector and inputs it to the arithmetic circuit 3. The arithmetic circuit 3 outputs a differential signal between the value of the macroblock to be encoded and the value of the predicted image to the DCT circuit 4. In the case of an intra macroblock, the arithmetic circuit 3 outputs the macroblock signal to be encoded to the DCT circuit 4 as it is.
[0011]
The DCT circuit 4 performs DCT (discrete cosine transform) processing on the input data and converts it into DCT coefficients. This DCT coefficient is input to the quantization circuit 5, quantized at a quantization step corresponding to the data storage amount (buffer storage amount) of the transmission buffer 7, and then input to the variable length encoding circuit 6.
[0012]
In accordance with the quantization step (scale) supplied from the quantization circuit 5, the variable length coding circuit 6 converts the image data supplied from the quantization circuit 5 (in this case, I picture data) into a variable length code such as a Huffman code, and outputs it to the transmission buffer 7.
[0013]
The variable length coding circuit 6 also receives the quantization step (scale) set by the quantization circuit 5, the prediction mode set by the motion vector detection circuit 2 (intra-picture prediction, forward prediction, backward prediction, or bidirectional prediction), and the motion vector, and variable-length encodes these as well.
[0014]
The transmission buffer 7 temporarily accumulates the input data and outputs data corresponding to the accumulation amount to the quantization circuit 5.
[0015]
When the remaining amount of data increases to the allowable upper limit, the transmission buffer 7 uses the quantization control signal to increase the quantization scale of the quantization circuit 5, thereby reducing the amount of quantized data. Conversely, when the remaining amount of data decreases to the allowable lower limit, the transmission buffer 7 uses the quantization control signal to reduce the quantization scale of the quantization circuit 5, thereby increasing the amount of quantized data. In this way, overflow or underflow of the transmission buffer 7 is prevented.
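The buffer-feedback rate control described above can be sketched as follows; the thresholds and step size are illustrative assumptions.

```python
def adjust_quant_scale(q_scale, buffer_fullness, upper, lower, step=2):
    """Feedback rate control: when the transmission buffer approaches
    its allowable upper limit, raise the quantization scale (coarser
    quantization, fewer bits generated); near the lower limit, reduce
    it (finer quantization, more bits).  Thresholds and step size are
    illustrative, not taken from the text."""
    if buffer_fullness >= upper:
        return q_scale + step          # coarser -> less data generated
    if buffer_fullness <= lower:
        return max(1, q_scale - step)  # finer -> more data generated
    return q_scale                     # buffer in safe range: no change
```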
[0016]
The data stored in the transmission buffer 7 is read at a predetermined timing and output to the transmission path.
[0017]
On the other hand, the data output from the quantization circuit 5 is input to the inverse quantization circuit 8 and inversely quantized corresponding to the quantization step supplied from the quantization circuit 5. The output of the inverse quantization circuit 8 is input to an IDCT (inverse DCT) circuit 9, subjected to inverse DCT processing, and then stored in the frame memory group 11 via the arithmetic unit 10.
[0018]
Next, a configuration example of an MPEG MP@ML decoder will be described with reference to FIG. The encoded image data transmitted through the transmission path is received by a receiving circuit (not shown), reproduced by a reproducing device, temporarily stored in the reception buffer 31, and then supplied to the variable length decoding circuit 32. The variable length decoding circuit 32 performs variable length decoding on the data supplied from the reception buffer 31, outputs the motion vector and prediction mode to the motion compensation circuit 37 and the quantization step to the inverse quantization circuit 33, and outputs the decoded image data to the inverse quantization circuit 33.
[0019]
The inverse quantization circuit 33 inversely quantizes the image data supplied from the variable length decoding circuit 32 according to the quantization step supplied from the variable length decoding circuit 32 and outputs the result to the inverse DCT circuit 34. The data (DCT coefficient) output from the inverse quantization circuit 33 is subjected to inverse DCT processing by the inverse DCT circuit 34 and supplied to the computing unit 35.
[0020]
When the image data supplied from the inverse DCT circuit 34 is I picture data, it is output from the computing unit 35 and supplied to and stored in the frame memory group 36 in order to generate predicted image data for the image data (P or B picture data) input to the computing unit 35 later. This data is also output to the outside as a reproduced image as it is.
[0021]
When the input bit stream is a P or B picture, the motion compensation circuit 37 generates a prediction image according to the motion vector and the prediction mode supplied from the variable length decoding circuit 32 and outputs the prediction image to the calculator 35. The computing unit 35 adds the image data input from the inverse DCT circuit 34 and the predicted image data supplied from the motion compensation circuit 37 to obtain an output image. In the case of a P picture, the output of the computing unit 35 is input to and stored in the frame memory group 36 and is used as a reference image of an image signal to be decoded next.
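The decoder-side reconstruction performed by computing unit 35, outputting the inverse-DCT result directly for I pictures and adding the motion-compensated prediction for P/B pictures, might look like this in outline; the flat pixel lists and 8-bit clipping are assumptions.

```python
def reconstruct(residual, prediction=None):
    """Decoder-side reconstruction: for an I picture the inverse-DCT
    output is the image itself; for P/B pictures the computing unit
    adds the motion-compensated prediction to the decoded residual.
    Blocks are flat lists of pixel values; 0..255 clipping assumed."""
    if prediction is None:  # intra: output the IDCT result directly
        return [min(255, max(0, x)) for x in residual]
    return [min(255, max(0, r + p)) for r, p in zip(residual, prediction)]
```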
[0022]
In MPEG, various profiles and levels are defined in addition to MP@ML, and various tools are provided. Scalability is one such MPEG tool.
[0023]
In MPEG, a scalable encoding method that realizes scalability corresponding to different image sizes and frame rates is introduced. For example, in the case of spatial scalability, when only a lower layer bit stream is decoded, an image signal with a small image size is decoded, and when a lower layer and an upper layer bit stream are decoded, an image signal with a large image size is decoded.
[0024]
The spatial scalability encoder will be described with reference to FIG. In the case of spatial scalability, the lower layer corresponds to an image signal having a small image size, and the upper layer corresponds to an image signal having a large image size.
[0025]
The lower layer image signal is first input to the frame memory group 1 and encoded in the same manner as in MP@ML. However, the output of the computing unit 10, which is supplied to the frame memory group 11, is used not only as a prediction reference image for the lower layer but also, after being enlarged by the image enlargement circuit 41 to the same image size as the upper layer, as a prediction reference image for the upper layer.
[0026]
The upper layer image signal is first input to the frame memory group 51. The motion vector detection circuit 52 determines a motion vector and a prediction mode, similarly to MP @ ML.
[0027]
The motion compensation circuit 62 generates a prediction image according to the motion vector and the prediction mode determined by the motion vector detection circuit 52 and outputs the prediction image to the weight addition circuit 44. The weight addition circuit 44 multiplies the predicted image by a weight (coefficient) W and outputs the result to the calculator 43.
[0028]
The output of the arithmetic unit 10 is input to the frame memory group 11 and the image enlargement circuit 41 as described above. The image enlarging circuit 41 enlarges the image signal generated by the arithmetic unit 10, makes it the same size as the image size of the upper layer, and outputs it to the weight addition circuit 42. The weight addition circuit 42 multiplies the output of the image enlargement circuit 41 by the weight (1-W) and outputs the result to the calculator 43.
[0029]
The computing unit 43 adds the outputs of the weight addition circuits 42 and 44 and outputs the result to the computing unit 53 as a predicted image. The output of the arithmetic unit 43 is also input to the arithmetic unit 60, added with the output of the inverse DCT circuit 59, input to the frame memory group 61, and used as a prediction reference frame of the image signal to be encoded thereafter.
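The weighted combination performed by the weight addition circuits 42 and 44 and summed by computing unit 43 can be written directly; pixel blocks as flat lists are an assumption for illustration.

```python
def spatial_scalable_prediction(upper_pred, upsampled_lower, w):
    """Upper-layer prediction in spatial scalability: combine the
    upper layer's motion-compensated prediction (weight W) with the
    enlarged lower-layer image (weight 1 - W), as in the weight
    addition circuits 42/44 and computing unit 43.  Sketch only."""
    return [w * u + (1.0 - w) * l
            for u, l in zip(upper_pred, upsampled_lower)]
```

W = 1 uses only the upper-layer temporal prediction; W = 0 uses only the upsampled lower layer.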
[0030]
The computing unit 53 calculates and outputs the difference between the image signal to be encoded and the output of the computing unit 43. However, in the case of an intra-frame encoded macro block, the arithmetic unit 53 outputs the image signal to be encoded to the DCT circuit 54 as it is.
[0031]
The DCT circuit 54 performs DCT (discrete cosine transform) processing on the output of the computing unit 53, generates a DCT coefficient, and outputs it to the quantization circuit 55. The quantization circuit 55 quantizes the DCT coefficient according to the quantization scale determined from the data accumulation amount of the transmission buffer 57 and outputs the result to the variable length coding circuit 56 as in the case of MP @ ML. The variable length coding circuit 56 performs variable length coding on the quantized DCT coefficient, and then outputs it as a higher layer bit stream via the transmission buffer 57.
[0032]
The output of the quantization circuit 55 is also inversely quantized by the inverse quantization circuit 58 at the quantization scale used in the quantization circuit 55, subjected to inverse DCT by the inverse DCT circuit 59, and then input to the computing unit 60. The arithmetic unit 60 adds the outputs of the arithmetic unit 43 and the inverse DCT circuit 59 and inputs the result to the frame memory group 61.
[0033]
The variable length coding circuit 56 also receives the motion vector and prediction mode detected by the motion vector detection circuit 52, the quantization scale used by the quantization circuit 55, and the weight W used by the weight addition circuits 42 and 44, each of which is encoded and transmitted.
[0034]
[Problems to be solved by the invention]
Conventional video encoding and decoding devices are premised on a one-to-one correspondence. For example, in a video conference system, the transmitting side and the receiving side are always one-to-one, and the processing capabilities and specifications of the transmitting and receiving terminals are determined in advance. Furthermore, for a storage medium such as a DVD, the specifications and processing capabilities of the decoding device are strictly determined in advance, and the encoding device encodes the video signal on the premise of decoding devices that satisfy those specifications. Therefore, if the encoding device encodes the image signal so that optimum image quality is obtained by a decoding device of the predetermined specification, an image can always be transmitted with optimum image quality.
[0035]
However, when a moving image is transmitted over a transmission path whose capacity is not constant but changes with time and route, such as the Internet, or when an unspecified number of terminals whose specifications are not determined in advance are connected, so that the moving image is sent to receiving terminals with various processing capabilities, it is difficult even to know what the optimum image quality is, and the moving image is therefore difficult to transmit efficiently.
[0036]
Furthermore, since the terminal specifications are not unique, the encoding methods of the encoding device and the decoding device may differ. In this case, the encoded bit stream must be converted efficiently into a predetermined format, but such a conversion method has not yet been established.
[0037]
The present invention has been made in view of such circumstances, and makes it possible to transmit a video signal efficiently via transmission paths of various capacities and to transmit an optimal moving image to receiving terminals of various processing capabilities.
[0038]
[Means for Solving the Problems]
  The data conversion apparatus of the present invention is a data conversion apparatus that transcodes a first bit stream encoded in an input format into a second bit stream encoded in an output format, and is characterized by comprising: receiving means for receiving content information, which includes the encoding parameters used for transcoding the first bit stream and is used for determining a transcoding method for transcoding the first bit stream into the second bit stream, and client information indicating the processing capability of the client that receives the second bit stream; control means for determining the transcoding method based on the content information and client information received by the receiving means, and generating transcoding type information indicating the output format and the transcoding method; and conversion means for transcoding the first bit stream into the second bit stream using the transcoding type information generated by the control means.
[0039]
  The conversion means may include a decoder that decodes the first bit stream to generate image data and transmits the encoding parameters decoded when the first bit stream was decoded, and an encoder that encodes the image data decoded by the decoder using the transcoding type information generated by the control means and the encoding parameters transmitted by the decoder.
[0040]
  The content information may include hint information that serves as a hint when performing transcoding.
[0041]
  The content information may include identification information for identifying the hint information.
[0042]
  The hint information may include parameter set information indicating the range in which the hint information itself is valid and a combination of encoding parameters used when performing transcoding.
[0043]
  The parameter set information can include an encoding parameter corresponding to a bit rate for encoding content.
[0044]
  The parameter set information can include an image frame size and a frame rate corresponding to the bit rate.
[0045]
  The parameter set information may include minimum bit rate information indicating the minimum bit rate for which the hint information is valid, and maximum bit rate information indicating the maximum bit rate for which the hint information is valid.
[0046]
  The parameter set information may include frame rate information indicating the frame rate for obtaining optimum image quality when encoding between the minimum bit rate and the maximum bit rate.
[0047]
  The parameter set information may include image frame information indicating the image frame size for obtaining optimum image quality when encoding between the minimum bit rate and the maximum bit rate.
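Taken together, the parameter set information described in the last few paragraphs might be modeled as a small record plus a lookup by target bit rate; all field names here are illustrative assumptions, since the text does not fix a syntax.

```python
from dataclasses import dataclass

@dataclass
class ParameterSet:
    """One entry of the hint information's parameter set: the bit-rate
    range over which it is valid, and the frame rate / image frame
    size that give optimum quality inside that range.  Field names
    are illustrative, not taken from the patent."""
    min_bitrate: int    # minimum bit rate for which the hint is valid
    max_bitrate: int    # maximum bit rate for which the hint is valid
    frame_rate: float   # optimal frame rate within the range
    frame_width: int    # optimal image frame size within the range
    frame_height: int

def select_parameter_set(sets, target_bitrate):
    """Pick the parameter set whose validity range covers the target
    bit rate; None when no hint applies."""
    for ps in sets:
        if ps.min_bitrate <= target_bitrate <= ps.max_bitrate:
            return ps
    return None
```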
[0048]
  The hint information may include encoding difficulty level information indicating the range in which the hint information itself is valid and the difficulty level at the time of encoding.
[0049]
  The encoding difficulty level information may include head position information indicating the start position of the bit stream for which the hint information is valid, and end position information indicating the end position of the bit stream for which the hint information is valid.
[0050]
  The encoding difficulty level information may include partial encoding difficulty level information indicating the difficulty of encoding the bit stream between the start position and the end position.
[0051]
  The first bit stream may be encoded at a fixed bit rate, and the conversion means may perform encoding at a variable bit rate using the encoding difficulty level information.
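One way the conversion means could use the encoding difficulty hints to turn a fixed-bit-rate stream into a variable-bit-rate one is to distribute the same total bit budget across segments in proportion to each segment's difficulty; this proportional rule is an assumption for illustration, not a method fixed by the text.

```python
def allocate_vbr_bits(segment_difficulties, total_bits):
    """Variable-bit-rate re-encoding guided by encoding difficulty
    hints: distribute a fixed bit budget across bit-stream segments
    in proportion to each segment's difficulty, instead of the
    constant allocation of the original fixed-rate stream."""
    total_difficulty = sum(segment_difficulties)
    return [total_bits * d / total_difficulty
            for d in segment_difficulties]
```

Hard segments thus receive more bits and easy segments fewer, at the same overall rate.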
[0052]
  The data conversion method of the present invention is a data conversion method for a data conversion apparatus that transcodes a first bit stream encoded in an input format into a second bit stream encoded in an output format, and is characterized in that: the receiving means receives content information, which includes the encoding parameters used for transcoding the first bit stream and is used for determining a transcoding method for transcoding the first bit stream into the second bit stream, and client information indicating the processing capability of the client that receives the second bit stream; the control means determines the transcoding method based on the received content information and client information, and generates transcoding type information indicating the output format and the transcoding method; and the conversion means transcodes the first bit stream into the second bit stream using the generated transcoding type information.
[0053]
  The encoding apparatus of the present invention is an encoding apparatus that encodes image data obtained by decoding a first bit stream encoded in an input format and generates a second bit stream encoded in an output format, and is characterized by comprising: receiving means for receiving content information, which includes the encoding parameters used when encoding and is used for determining a transcoding method for transcoding the first bit stream into the second bit stream, and client information indicating the processing capability of the client that receives the second bit stream; control means for determining the transcoding method based on the content information and client information received by the receiving means, and generating transcoding type information indicating the output format and the transcoding method; and encoding means for generating the second bit stream by encoding the image data using the transcoding type information generated by the control means.
[0054]
  The encoding method of the present invention is an encoding method for an encoding apparatus that encodes image data obtained by decoding a first bit stream encoded in an input format and generates a second bit stream encoded in an output format, and is characterized in that: the receiving means receives content information, which includes the encoding parameters used when encoding and is used for determining a transcoding method for transcoding the first bit stream into the second bit stream, and client information indicating the processing capability of the client that receives the second bit stream; the control means determines the transcoding method based on the received content information and client information, and generates transcoding type information indicating the output format and the transcoding method; and the encoding means encodes the image data using the generated transcoding type information to generate the second bit stream.
[0065]
  In the data conversion apparatus and method of the present invention, content information, which includes the encoding parameters used when transcoding the first bit stream and is used for determining a transcoding method for transcoding the first bit stream into the second bit stream, and client information indicating the processing capability of the client that receives the second bit stream are received; based on the received content information and client information, the transcoding method is determined and transcoding type information indicating the output format and the transcoding method is generated; and the first bit stream is transcoded into the second bit stream using the generated transcoding type information.
[0066]
  In the encoding apparatus and method of the present invention, content information, which includes the encoding parameters used for encoding and is used for determining a transcoding method for transcoding the first bit stream into the second bit stream, and client information indicating the processing capability of the client that receives the second bit stream are received; based on the received content information and client information, the transcoding method is determined and transcoding type information indicating the output format and the transcoding method is generated; and the second bit stream is generated by encoding the image data using the generated transcoding type information.
[0069]
DETAILED DESCRIPTION OF THE INVENTION
(First embodiment)
FIG. 4 shows the configuration of the first embodiment of the present invention.
[0070]
The multimedia content server 101 records and holds multimedia content including moving images in a storage medium such as a hard disk (for example, the content recording device 112 in FIG. 5, described later). The multimedia content is recorded, for example, uncompressed or in a compressed bit stream format such as MPEG-1, MPEG-2, or MPEG-4 (hereinafter abbreviated as MPEG-1/2/4).
[0071]
A receiving terminal (client) 103 is a terminal that requests, receives, and displays multimedia content; the user acquires content using the receiving terminal 103. The receiving terminal 103 transmits a content request signal 1 requesting predetermined content, together with a client information signal indicating the processing capability of the terminal itself, for example, its memory size, the resolution of its image display device, its computational capability, its buffer size, and the formats of bit streams it can decode.
[0072]
The content request signal 1 is information including the semantic content of the requested content, such as a movie title, and is encoded by the MPEG-7 encoding method.
[0073]
The data access server 102 receives the content request signal 1 and the client information signal from the receiving terminal 103 via the network or a predetermined transmission path. The data access server 102 transmits a content information request signal for requesting content information requested based on the content request signal 1 to the multimedia content server 101 via a network or a predetermined transmission path.
[0074]
In the multimedia content server 101, multimedia content and information of the recorded multimedia content are recorded in a built-in storage medium. Upon receiving the content information request signal, the multimedia content server 101 transmits a predetermined content information signal to the data access server 102 based thereon.
[0075]
The content information signal is a signal indicating information on multimedia content recorded in the multimedia content server 101, and includes information such as a file name, content title, author, performer, and the like. The content information signal includes both semantic information and physical information, and is encoded by, for example, the MPEG-7 method. The physical information is, for example, a file name recorded on the storage medium, a pointer indicating a predetermined position in the bit stream, and the like. Semantic information is, for example, content titles, performers, and the like.
[0076]
The data access server 102 determines a predetermined content from the content information signal, the content request signal 1 and the client information signal, and transmits a content request signal 2 for requesting the content to the multimedia content server 101.
[0077]
The content request signal 2 is physical information, for example, a file name or a pointer indicating a predetermined position in the bit stream, and is encoded by, for example, the MPEG-7 method.
[0078]
The multimedia content server 101 transmits the multimedia (MM) content requested by the content request signal 2 to the data access server 102.
[0079]
The data access server 102 receives the content information signal and the multimedia content from the multimedia content server 101, converts (transcodes) the multimedia content into an optimal format based on the client information signal and the content information signal, and transmits the converted multimedia content to the receiving terminal 103.
[0080]
In FIG. 4, the data access server 102 and the receiving terminal 103, and the data access server 102 and the multimedia content server 101, are separated by transmission paths and described as independent, but they may also be implemented in the same terminal. For example, the multimedia content server 101, the data access server 102, and the receiving terminal 103 may all be in the same terminal; or the multimedia content server 101 and the data access server 102 may be in the same terminal, with the receiving terminal 103 as another terminal separated via a network. Similarly, the multimedia content server 101 may be a separate terminal on the network while the data access server 102 and the receiving terminal 103 are in the same terminal. Hereinafter, for simplicity, the case where each is independent will be described; however, the following description also applies when they are in the same terminal.
[0081]
FIG. 5 shows a configuration example of the multimedia content server 101 in FIG. Metadata describing the content information signal and other content information is recorded in the metadata recording device 111. Multimedia content including moving images is recorded in the content recording device 112.
[0082]
Content information signals and other content-related metadata consist of semantic and physical information. Semantic information is, for example, a movie title or a director's name. Physical information is, for example, a file name, a URL, or a pointer indicating a predetermined position in the bit stream. Such content information signals and metadata are encoded and recorded by, for example, the MPEG-7 method.
[0083]
The multimedia content itself is encoded and recorded in the content recording device 112 in various formats, such as MPEG-1 / 2/4.
[0084]
The content information request signal input from the data access server 102 is input to the metadata manager 113. The metadata manager 113 manages metadata and content information signals recorded in the metadata recording device 111. The metadata manager 113 supplies a content information request signal to the metadata recording device 111.
[0085]
The metadata recording device 111 searches for predetermined metadata or a content information signal based on the supplied content information request signal, and supplies it to the metadata manager 113. The metadata manager 113 outputs the content information signal to the data access server 102 in FIG.
[0086]
The content request signal 2 input from the data access server 102 is input to the multimedia content manager 114. The multimedia content manager 114 manages multimedia content recorded on the content recording device 112. The multimedia content manager 114 supplies the content request signal 2 to the content recording device 112.
[0087]
The content recording device 112 searches for predetermined multimedia (MM) content based on the supplied content request signal 2 and outputs it to the multimedia content manager 114. The multimedia content manager 114 outputs the multimedia content to the data access server 102 in FIG.
[0088]
FIG. 6 shows a configuration example of the data access server 102 in FIG. The data access server 102 includes a transcoding manager 121, a transcoding device 122, and a transcoding library 123.
[0089]
The client information signal input from the receiving terminal 103 in FIG. 4 and the content information signal input from the multimedia content server 101 in FIG. 4 are both input to the transcoding manager 121.
[0090]
The transcoding manager 121 determines the output format of the multimedia content based on the client information signal and the content information signal. The transcoding manager 121 outputs the transcoding type information to the transcoding device 122. The transcoding type information is information indicating a multimedia content output format and a transcoding method in the transcoding device 122.
[0091]
The transcoding manager 121 also outputs the content availability information and the content information signal to the receiving terminal 103 in FIG. 4. The transcoding manager 121 sets the content availability information to "0" when the requested content is not in the multimedia content server 101, and to "1" when it is.
[0092]
The transcoding device 122 converts the input content based on the transcoding type information.
[0093]
The transcoding device 122 can also be implemented as a software module that operates on a CPU, DSP, or the like. In this case, the transcoding device 122 performs transcoding (content conversion) using a predetermined transcoding tool recorded in the transcoding library 123 based on the transcoding type information. The transcoding device 122 outputs a tool request signal to the transcoding library 123 based on the transcoding type information. The transcoding library 123 outputs the requested software module (transcoding tool) to the transcoding device 122. The transcoding device 122 secures a memory or the like necessary for executing the software module, and performs transcoding using the software module.
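As a sketch of this tool-request flow (the format pairs and tool names below are illustrative, not from the patent), the transcoding library can be modeled as a mapping from (input format, output format) pairs to software modules:

```python
# Hypothetical sketch of the transcoding-library lookup; formats and
# tool names are illustrative only.
TRANSCODING_LIBRARY = {
    ("MPEG-2", "MPEG-4"): "mpeg2_to_mpeg4_tool",
    ("MPEG-1", "MPEG-4"): "mpeg1_to_mpeg4_tool",
}

def request_tool(transcoding_type):
    """Return the transcoding tool named by the transcoding type
    information, as the tool request signal would select it."""
    key = (transcoding_type["input_format"], transcoding_type["output_format"])
    if key not in TRANSCODING_LIBRARY:
        raise KeyError(f"no transcoding tool for {key}")
    return TRANSCODING_LIBRARY[key]

tool = request_tool({"input_format": "MPEG-2", "output_format": "MPEG-4"})
```

The device would then allocate the memory the returned module needs and run it on the input content.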
[0094]
A configuration example of the transcoding device 122 will be described with reference to FIG. The simplest method for realizing the transcoding device 122 is to decode the content (bit stream) and then re-encode it using an encoder of a predetermined format.
[0095]
In the transcoding device 122 in FIG. 7, the bit stream supplied from the multimedia content server 101 is first input to the decoder 131 and decoded. The decoded image signal is supplied to the encoder 132 and encoded in a format that the receiving terminal 103 can receive.
[0096]
Coding parameters recovered when the decoder 131 decodes the bit stream, for example motion vectors, quantization coefficients, and coding modes, are supplied to the encoder 132 and used when the encoder 132 encodes the image signal. The encoder 132 encodes the decoded image based on the encoding parameters supplied from the decoder 131 and the transcoding type information supplied from the transcoding manager 121, and generates and outputs a bit stream of a predetermined format.
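A toy sketch of this decode-then-re-encode structure (the frame and parameter representations are invented for illustration): the decoder returns both the frames and the coding parameters it recovered, and the encoder reuses those parameters instead of re-estimating them.

```python
def decode(bitstream):
    # Toy decoder: recovers the frames plus the coding parameters
    # (motion vectors, quantization, coding modes) carried in the stream.
    params = {"motion_vectors": bitstream["mv"],
              "quantization": bitstream["q"],
              "modes": bitstream["modes"]}
    return bitstream["frames"], params

def encode(frames, params, output_format):
    # Toy encoder: reuses the decoded parameters, avoiding a fresh
    # (and costly) motion-estimation pass.
    return {"format": output_format, "frames": frames,
            "mv": params["motion_vectors"], "q": params["quantization"]}

def transcode(bitstream, output_format):
    frames, params = decode(bitstream)
    return encode(frames, params, output_format)

out = transcode({"frames": ["f0", "f1"], "mv": [(1, 0)], "q": 8,
                 "modes": ["I", "P"]}, "MPEG-4")
```

Reusing the parameters this way is what makes transcoding cheaper than a full decode plus blind re-encode.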
[0097]
With reference to FIG. 8, an example of a method in which transcoding apparatus 122 performs transcoding using a content information signal will be described.
[0098]
When predetermined content is encoded, the image quality varies with the image frame size, the frame rate, and so on, even at the same bit rate. An example is shown in FIG. 8B, which shows the relationship between bit rate (horizontal axis in FIG. 8B) and image quality (vertical axis in FIG. 8B) when the same image is encoded with three different combinations of frame size and frame rate. When the bit rate is sufficiently high, the image quality is best when the image is encoded with a large image frame (ITU-R Rec. 601) and a high frame rate (30 Hz); as the bit rate falls, the image quality obtained with that combination deteriorates.
[0099]
Below a predetermined bit rate RB2, the image quality is better when the Rec. 601 image size is halved vertically and horizontally (SIF) and the frame rate is lowered (10 Hz). Below a further predetermined bit rate RB1, the image quality is better still if the SIF image is encoded at half that size again (QSIF). However, which image size and frame rate should be used at each bit rate depends on the nature of the image: the relationship shown in FIG. 8B differs from content to content.
[0100]
The content information signal in the present embodiment is, for example, a list of the optimum encoding parameters for encoding the content at each bit rate. An example is shown in FIG. 8A. In the content information signal in this case, below bit rate RA1 encoding is performed with an image frame of 1/4 vertical and horizontal size at a frame rate of 10 Hz; between bit rates RA1 and RA2 encoding is performed with an image frame of 1/2 size; and above bit rate RA2 encoding is performed at Rec. 601 size and a frame rate of 30 Hz.
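The parameter list described above can be sketched as a table scanned by target bit rate. The thresholds RA1 and RA2, and the frame rate used in the middle band, are placeholder values invented for illustration; the actual values depend on the content.

```python
RA1 = 200_000    # bits/s, hypothetical threshold
RA2 = 1_000_000  # bits/s, hypothetical threshold

# (upper bit-rate bound, frame size, frame rate in Hz)
PARAMETER_LIST = [
    (RA1,          "QSIF (1/4 size)", 10),
    (RA2,          "SIF (1/2 size)",  10),   # mid-band frame rate assumed
    (float("inf"), "Rec. 601",        30),
]

def optimal_parameters(bitrate):
    """Return the (frame size, frame rate) listed as optimal for a bit rate."""
    for upper_bound, size, fps in PARAMETER_LIST:
        if bitrate <= upper_bound:
            return size, fps

params = optimal_parameters(500_000)
```

A transcoder given a target bit rate can look up these parameters instead of probing several encodings itself.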
[0101]
Details of the description method of the content information signal in this case will be described later.
[0102]
Next, a modification of the transcoding method performed by the transcoding device 122 using the content information signal will be described with reference to FIG. 9. Transmission paths for transmitting multimedia content include variable-bit-rate paths, in which the bit rate may change over time, and fixed-bit-rate paths, in which it is constant. Likewise, some encoding methods can encode at a variable bit rate, while others can encode only at a fixed bit rate.
[0103]
For example, in a video conference system or in broadcasting over a wireless transmission path, the bit stream is encoded at a fixed bit rate, whereas on a DVD or the like, encoding is performed at a variable bit rate. In addition, MPEG-1 and H.263 support only fixed-bit-rate encoding, while MPEG-2 and MPEG-4 support variable-bit-rate encoding.
[0104]
Between fixed-bit-rate and variable-bit-rate encoding of the same content, image quality is generally better at a variable bit rate. The encoding efficiency of content depends on the nature of the image: different content encodes with different efficiency, and even within the same content the efficiency varies over time. FIG. 9A shows an example of the temporal change in encoding difficulty; the horizontal axis represents time and the vertical axis the encoding difficulty. In a scene with low encoding difficulty, high image quality can be obtained at a low bit rate, whereas in a scene with high encoding difficulty it is hard to obtain sufficient image quality even at a high bit rate.
[0105]
FIG. 9B shows the temporal change in image quality when this moving image is encoded at a fixed bit rate. As is apparent from a comparison of FIG. 9A and FIG. 9B, with fixed-bit-rate encoding the image quality is good in scenes of low encoding difficulty but deteriorates considerably in scenes of high encoding difficulty, so the image quality varies greatly over time.
[0106]
FIG. 9C shows the temporal change in encoding bit rate when the moving image of FIG. 9A is encoded at a variable bit rate. More bits are assigned to scenes with high encoding difficulty and relatively few to scenes with low encoding difficulty. As a result, the image quality changes as shown in FIG. 9D. Compared with fixed-bit-rate encoding (FIG. 9B), even when the total amount of generated bits for the content is the same, the average image quality is better at a variable bit rate, and the temporal variation in image quality is smaller.
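The allocation behaviour described above can be sketched as distributing a fixed total bit budget across scenes in proportion to their encoding difficulty. This is a deliberate simplification of real rate control, with invented difficulty values:

```python
def allocate_variable_bitrate(difficulties, total_bits):
    """Assign each scene a share of the total budget proportional to its
    encoding difficulty, so hard scenes get more bits than easy ones
    while the overall generated bit amount stays fixed."""
    total_difficulty = sum(difficulties)
    return [total_bits * d / total_difficulty for d in difficulties]

# Three scenes of increasing then decreasing difficulty, 6 Mbit budget.
bits = allocate_variable_bitrate([1.0, 3.0, 2.0], total_bits=6_000_000)
```

Note that this requires the per-scene difficulties for the whole content up front, which is exactly what the content information signal provides.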
[0107]
However, to perform variable-bit-rate encoding efficiently, the encoding difficulty of the entire moving image must be analyzed in advance to obtain characteristics such as those in FIG. 9A. There is a method that uses a buffer of a certain capacity and measures the encoding difficulty within the range the buffer allows, but this, too, optimizes only within that capacity; the content as a whole is not optimized.
[0108]
Therefore, to solve this problem, in another embodiment of the present invention, encoding difficulty information such as that shown in FIG. 9A is described in the content information signal output from the multimedia content server 101, and the transcoding device 122 uses this encoding difficulty information to re-encode, at a variable bit rate, a bit stream that was encoded at a fixed bit rate, and outputs the result.
[0109]
That is, in the transcoding device 122 of the embodiment of FIG. 7, the encoder 132 encodes the bit stream based on the content information signal supplied from the multimedia content server 101 and outputs the bit stream.
[0110]
On the other hand, in the embodiment shown in FIG. 10, when predetermined multimedia content is recorded on the multimedia content server 101 in FIG. 4, the bit stream supplied to the multimedia content server 101 from the outside is first input to the encoding difficulty analysis circuit 141. In this embodiment a bit stream is input, but an uncompressed moving image may be input directly.
[0111]
The encoding difficulty level analysis circuit 141 analyzes the encoding difficulty level of the content, obtains the characteristics of the encoding difficulty level as shown in FIG. 9A, and outputs the characteristics to the metadata recording device 111 as a content information signal. The bit stream of the input content is output to the content recording device 112.
[0112]
FIG. 11 shows configuration examples of the encoding difficulty level analysis circuit 141. In the example of FIG. 11A, the input bit stream is first input to a syntax analysis circuit (parser) 151, and encoding parameters (quantization coefficients, bit amounts, and so on) are extracted from the bit stream. The hint generator 152 then obtains the average value Q of the quantization coefficients in each frame and the generated bit amount B of that frame, and obtains their product Q×B as the encoding difficulty level of the frame, which is supplied to the metadata recording device 111 as the content information signal and recorded.
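Under the reading that the per-frame hint is the product of the frame's average quantization scale Q and its generated bit amount B (an assumption; the translated description is unclear on the exact operator), the hint generator's computation can be sketched as:

```python
def frame_encoding_difficulty(quantization_scales, generated_bits):
    """Hint-generator sketch: average quantization scale Q of the
    macroblocks in a frame, multiplied by the generated bit amount B
    of that frame."""
    q_avg = sum(quantization_scales) / len(quantization_scales)
    return q_avg * generated_bits

# Invented example: three macroblock quantizers, 1000 bits generated.
difficulty = frame_encoding_difficulty([2, 4, 6], 1000)
```

Intuitively, a frame that needed both a coarse quantizer and many bits was hard to encode, so the product rises with difficulty.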
[0113]
FIG. 11B shows a modification of the encoding difficulty analysis circuit 141. In this example, the input bit stream is once decoded by the decoder 161. The decoded image is input to the encoder 162. The encoder 162 performs encoding with a fixed quantization scale, for example, Q = 1. The generated bit amount of each frame when encoded with Q = 1 is the encoding difficulty level of the frame, and is supplied to the metadata recording apparatus 111 as content information and recorded.
[0114]
An example of a format for describing the content information signal will be described with reference to FIG. 12. In the example of FIG. 12, the content information signal is described in TranscodingHint, shown in FIG. 12A, a descriptor containing information that serves as a hint when performing transcoding. In the example of FIG. 12A, TranscodingHint includes an ID and the descriptors TranscodingParameterSet and TranscodingComplexityHint. The ID is an identification number that identifies the descriptor.
[0115]
As shown in FIG. 12B, TranscodingParameterSet is a descriptor that describes the optimum encoding parameters for encoding and transcoding at each bit rate, and is composed of ID, MinBitRate, MaxBitRate, FrameRate, and FrameSize.
[0116]
MinBitRate is a flag indicating the lowest bit rate for which the information of this descriptor is valid.
[0117]
MaxBitRate is a flag indicating the maximum bit rate for which the information of this descriptor is valid.
[0118]
FrameRate is a flag indicating the frame rate that yields optimal image quality when encoding at a bit rate between MinBitRate and MaxBitRate.
[0119]
FrameSize is a flag indicating the image frame size that yields optimal image quality when encoding at a bit rate between MinBitRate and MaxBitRate.
[0120]
TranscodingComplexityHint is a descriptor that describes the difficulty of encoding and transcoding the content, and is configured as shown in FIG. 12C. StartMediaLocator is a pointer indicating the start position of the bit stream over which the information of this descriptor is valid.
[0121]
EndMediaLocator is a pointer indicating the end position of the bit stream over which the information of this descriptor is valid. Complexity is a flag indicating the encoding difficulty of the portion of the bit stream between StartMediaLocator and EndMediaLocator.
[0122]
TranscodingComplexityHint can also be configured as shown in FIG. 12D. StartFrameNumber is a pointer indicating the first frame number for which the information of this descriptor is valid.
[0123]
EndFrameNumber is a pointer indicating the last frame number for which the information of this descriptor is valid.
[0124]
Complexity is a flag indicating the encoding difficulty of the portion of the bit stream between StartFrameNumber and EndFrameNumber.
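The two variants of TranscodingComplexityHint can be sketched as simple records. The field names mirror the descriptor fields (Python-ized); the concrete values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class TranscodingComplexityHintByLocator:
    """Bit-stream-position variant of the descriptor."""
    id: int
    start_media_locator: int  # position where the hint becomes valid
    end_media_locator: int    # position where the hint stops being valid
    complexity: float         # encoding difficulty over that range

@dataclass
class TranscodingComplexityHintByFrame:
    """Frame-number variant of the descriptor."""
    id: int
    start_frame_number: int
    end_frame_number: int
    complexity: float

hint = TranscodingComplexityHintByFrame(id=1, start_frame_number=0,
                                        end_frame_number=299, complexity=0.8)
```

A transcoder receiving a list of such records can look up the difficulty of whichever segment it is currently re-encoding.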
[0125]
When the data structure of the TranscodingHint descriptor shown in FIG. 12A is expressed in UML (Unified Modeling Language), it becomes as shown in FIG. 13. TranscodingHint is composed of TranscodingParameterSet and TranscodingComplexityHint descriptors: TranscodingParameterSet is repeated zero or more times, and similarly TranscodingComplexityHint is repeated zero or more times.
[0126]
MPEG-7 is a metadata standard for describing information about content and consists of multiple descriptors. Details of the MPEG-7 specification are described in ISO/IEC SC29/WG11 N3112, N3113, and N3114. TranscodingHint can be configured as one kind of MPEG-7 metadata.
[0127]
FIG. 14 shows an example of the data structure when the above TranscodingHint is added to the MPEG-7 MediaInformation descriptor (composed of the MediaIdentification, MediaFormat, MediaCoding, MediaTranscodingHint, and MediaInstance descriptors). The MediaInformation descriptor describes the media of the content, for example its encoding method. Zero or one TranscodingHint descriptor is described in this MediaInformation descriptor.
[0128]
The MediaInformation descriptor is added to the entire content or a part of the content. Therefore, in this example, the TranscodingHint descriptor is also added to the entire content or a part of the content.
[0129]
FIG. 15 shows an example of the data structure when a TranscodingHint descriptor is described in the MPEG-7 Segment descriptor. The Segment descriptor describes information about each part when content is divided into a plurality of parts, such as scenes. In this example, zero or one TranscodingHint descriptor is described in each VisualSegment descriptor and AudioSegment descriptor.
[0130]
The Segment descriptor is added to a part of the content. Therefore, in this example, the TranscodingHint descriptor is also added to a part of the content.
[0131]
As shown in FIG. 14, when a TranscodingHint descriptor is added to the MPEG-7 MediaInformation descriptor, the data structure of the whole of MPEG-7 is as shown in FIG. 16.
(Second Embodiment)
Next, a second embodiment will be described. In the second embodiment, the encoding difficulty level in the content information signal is composed of information indicating the difficulty of motion compensation and information indicating the difficulty of intra coding. From these two pieces of information as well, the encoding difficulty of a predetermined scene of the content, as in FIG. 9A of the first embodiment, can be determined. In this case, the encoding difficulty level analysis circuit 141 shown in FIG. 10 is configured, for example, as shown in FIG. 17.
[0132]
If necessary, the input bit stream is decoded by the decoder 201 and then supplied to the encoders 202 to 205. The encoder 202 encodes the image data input from the decoder 201 using only intra coding, with a fixed quantization scale, for example Q = 1. The generated bit amount of each frame when encoded with Q = 1 is the intra coding difficulty level of that frame, and is supplied to the content information signal generation circuit 208.
[0133]
The encoder 203 performs encoding at a fixed quantization scale, for example Q = 1, with m = 1 (encoding only I and P pictures). The encoder 204 performs encoding at a fixed quantization scale, for example Q = 1, with m = 2 (inserting one B-picture frame between two adjacent P pictures). The encoder 205 performs encoding at a fixed quantization scale, for example Q = 1, with m = 3 (inserting two B-picture frames between two adjacent P pictures).
[0134]
The average value circuit 206 calculates the average value of the outputs of the encoders 203 to 205. This average value is supplied to the difference circuit 207 as the motion compensation encoding difficulty level.
[0135]
The difference circuit 207 subtracts the output of the average value circuit 206 from the output of the encoder 202 and supplies the difference value to the content information signal generation circuit 208. The content information signal generation circuit 208 is supplied with a segment start time and end time from an external device (not shown). The content information signal generation circuit 208 generates the content information signal of the segment specified by the start time and the end time from the output of the encoder 202 and the output of the difference circuit 207, and supplies it to the metadata recording device 111.
[0136]
Next, the operation will be described. If necessary, the decoder 201 decodes the input bit stream and supplies it to the encoders 202 to 205. The encoder 202 performs only intra coding with Q = 1. The amount of bits generated in each frame at this time represents the intra coding difficulty level of the frame, and is supplied to the content information signal generation circuit 208 and the difference circuit 207.
[0137]
This intra coding difficulty level is described in a TextureHint descriptor (FIG. 19C) described later.
[0138]
The encoder 203 encodes the image data supplied from the decoder 201 with Q = 1 and m = 1. The encoder 204 encodes the image data output from the decoder 201 with Q = 1 and m = 2, and the encoder 205 encodes the image data from the decoder 201 with Q = 1 and m = 3. The encoders 203 to 205 output the generated bit amount of each frame to the average value circuit 206.
[0139]
The average value circuit 206 calculates the average value of the generated bit amount of each frame supplied from the encoders 203 to 205. This average value is supplied to the difference circuit 207 as the motion compensation encoding difficulty level.
[0140]
The difference circuit 207 subtracts the motion compensation coding difficulty level supplied from the average value circuit 206 from the intra coding difficulty level supplied from the encoder 202, and supplies the result, as the motion compensation difficulty level, to the content information signal generation circuit 208.
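The arithmetic performed by the encoders 202 to 205, the average value circuit 206, and the difference circuit 207 reduces to a subtraction of averages; a sketch follows, with invented bit counts:

```python
def motion_compensation_difficulty(intra_bits, mc_bits_per_m):
    """Difference-circuit sketch: the intra-only generated bit amount
    (encoder 202, Q = 1) minus the average generated bit amount over the
    motion-compensated encodings with m = 1, 2, 3 (encoders 203-205,
    also Q = 1, averaged by the average value circuit 206)."""
    mc_average = sum(mc_bits_per_m) / len(mc_bits_per_m)
    return intra_bits - mc_average

# Invented per-frame bit counts: intra-only vs. the three m settings.
d = motion_compensation_difficulty(1200, [300, 400, 500])
```

The three m settings probe how much the inter-frame prediction structure helps; averaging them gives a single motion-compensated reference point.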
[0141]
This motion compensation difficulty level is described in a MotionHint descriptor (FIG. 19B) described later.
[0142]
The content information signal generation circuit 208 generates a content information signal based on the intra coding difficulty level supplied from the encoder 202 and the motion compensation difficulty level supplied from the difference circuit 207 and supplies the content information signal to the metadata recording device 111. And record.
[0143]
Next, content information signal generation processing performed by the content information signal generation circuit 208 will be described with reference to the flowchart of FIG.
[0144]
First, in step S1, the content information signal generation circuit 208 calculates the sum of the intra coding difficulty levels of all the frames in the segment specified by the start time and the end time.
[0145]
A segment means a predetermined section in the time axis direction of a video signal, and one video content is composed of one or a plurality of segments. A specific example of this segment will be described later with reference to FIG.
[0146]
Next, in step S2, the content information signal generation circuit 208 calculates the total value of the intra coding difficulty levels of all frames over the entire sequence.
[0147]
Next, in step S3, the content information signal generation circuit 208 performs normalization processing according to the following equation, and calculates TextureHint Difficulty described later.
[0148]
Difficulty = (sum of intra coding difficulty in the segment ÷ number of frames in the segment) ÷ (sum of intra coding difficulty of the entire sequence ÷ number of frames in the entire sequence)
This Difficulty is obtained for each segment.
[0149]
Next, in step S4, the content information signal generation circuit 208 calculates the total motion compensation difficulty in the segment. Further, in step S5, it calculates the total motion compensation difficulty of the entire sequence. In step S6, the content information signal generation circuit 208 executes normalization processing for each segment according to the following equation, and calculates the Motion_uncompensability of the MotionHint described later.
[0150]
Motion_uncompensability = (Sum of motion compensation difficulty in segment ÷ Number of frames in segment) ÷ (Sum of motion compensation difficulty of entire sequence ÷ Number of frames in entire sequence)
This Motion_uncompensability is also obtained for each segment.
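The two normalizations above (steps S3 and S6) have the same shape, a segment's per-frame average divided by the sequence's per-frame average, and can be sketched with one helper (the difficulty values are illustrative):

```python
def normalize_per_segment(segment_values, sequence_values):
    """Shared normalization for TextureHint's Difficulty (intra coding
    difficulty, step S3) and MotionHint's Motion_uncompensability
    (motion compensation difficulty, step S6): the segment's per-frame
    mean relative to the whole sequence's per-frame mean."""
    segment_mean = sum(segment_values) / len(segment_values)
    sequence_mean = sum(sequence_values) / len(sequence_values)
    return segment_mean / sequence_mean

# A 2-frame segment drawn from a 4-frame sequence (invented values).
difficulty = normalize_per_segment([2.0, 2.0], [1.0, 1.0, 2.0, 2.0])
```

A value above 1 marks a segment that is harder than the sequence average; below 1, easier.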
[0151]
Next, in step S7, the content information signal generation circuit 208 generates a MediaTranscodingHint descriptor as a content information signal based on the calculation results of steps S3 and S6.
[0152]
This MediaTranscodingHint descriptor is a descriptor that describes an optimal encoding parameter when transcoding is performed, and is described as shown in FIG. 19 in the present invention.
[0153]
As shown in FIG. 19A, the MediaTranscodingHint descriptor is composed of ID, UtilityScaling (), MotionHint (), and TextureHint ().
[0154]
The UtilityScaling descriptor is a descriptor that describes the image quality at each bit rate of content.
[0155]
The MotionHint descriptor describes the degree of motion compensation difficulty of the content, and as shown in FIG. 19B is composed of ID, Motion_uncompensability, Motion_range_x_left, Motion_range_x_right, Motion_range_y_left, and Motion_range_y_right.
[0156]
When the inter-frame correlation is low, the coding efficiency gained by motion compensation is small, and more bits must be allocated to the portions where the inter-frame correlation is low. Motion_uncompensability is a parameter that takes a value from 0 to 1: 0 indicates that the frames are exactly identical, and 1 indicates that there is no inter-frame correlation. This Motion_uncompensability describes the motion compensation difficulty level output from the difference circuit 207 in FIG. 17.
[0157]
Motion_range_x_left and Motion_range_x_right represent the maximum amount of change in the horizontal direction of the motion amount in motion compensation. Similarly, Motion_range_y_left and Motion_range_y_right indicate the maximum change amount in the vertical direction of the motion amount in motion compensation. These represent the maximum search range in the horizontal direction and the vertical direction in motion vector detection. By designating the maximum value of the motion vector in advance, it is possible to reduce the amount of computation in transcoding while maintaining the image quality.
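A sketch of how a transcoder might use these fields to bound its motion search (the dictionary keys mirror the descriptor field names; the range values are invented):

```python
def clamp_motion_vector(mv_x, mv_y, motion_hint):
    """Clip a candidate motion vector to the maximum horizontal and
    vertical ranges declared in the MotionHint descriptor, shrinking
    the transcoder's search space without hurting image quality."""
    x = max(-motion_hint["Motion_range_x_left"],
            min(motion_hint["Motion_range_x_right"], mv_x))
    y = max(-motion_hint["Motion_range_y_left"],
            min(motion_hint["Motion_range_y_right"], mv_y))
    return x, y

hint = {"Motion_range_x_left": 16, "Motion_range_x_right": 16,
        "Motion_range_y_left": 8, "Motion_range_y_right": 8}
clamped = clamp_motion_vector(100, -50, hint)
```

Because the hint guarantees no true motion exceeds these bounds, the encoder never needs to search outside them.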
[0158]
The TextureHint descriptor describes the difficulty of compressing the content in the spatial direction, and describes the intra coding difficulty level output by the encoder 202 in FIG. 17. As shown in FIG. 19C, the TextureHint descriptor is composed of ID, Difficulty, and DifficultyType.
[0159]
Difficulty is a flag indicating the intra coding difficulty level of the content, and indicates the coding difficulty level when coding without using motion compensation.
[0160]
DifficultyType is a flag indicating Difficulty processing, that is, how Difficulty described in the descriptor is measured. As shown in FIG. 20, the value “0” of DifficultyType represents EncodingDifficulty.
[0161]
When the data structure of the MediaTranscodingHint descriptor shown in FIG. 19A is expressed in UML, it becomes as shown in FIG. 21.
[0162]
Each MediaTranscodingHint descriptor is composed of zero or one each of the UtilityScaling, MotionHint, and TextureHint descriptors.
[0163]
As shown in FIG. 22, the MediaTranscodingHint descriptor of FIG. 21, together with the MediaIdentification, MediaFormat, MediaCoding, and MediaInstance descriptors, constitutes the MediaInformation descriptor, which describes the media of the content, for example its encoding method.
[0164]
The MediaInformation descriptor is added to the entire content or a part of the content. Therefore, the MediaTranscodingHint descriptor is also added to the entire content or a part of the content.
[0165]
FIG. 23 schematically shows the relationship between the MediaTranscodingHint descriptor and the video data. The video content 211 is composed of at least one sequence, and a partial scene (segment) 212 is defined by a start time (Start Time) and an end time (End Time). Information about the segment 212 (start time, end time, and so on) is described in the Segment descriptor 213. One MediaInformation descriptor may be defined for one piece of content, or one may be defined per Segment descriptor. When the MediaInformation descriptor 214 is defined as a child of the Segment descriptor 213, the MediaTranscodingHint descriptor 215, being a child of the MediaInformation descriptor 214, is consequently specified for each segment (scene). The MediaTranscodingHint descriptor 215 includes the UtilityScaling descriptor 216, the MotionHint descriptor 217, and the TextureHint descriptor 218 as child descriptors.
[0166]
The MediaInformation descriptor 214 and its child descriptors are all child descriptors of the Segment descriptor 213, and their description content is valid only between the start time and the end time specified by the parent Segment descriptor 213.
[0167]
The series of processes described above can be executed by hardware, but can also be executed by software. When the series of processes is executed by software, the program constituting the software is installed, from a network or a recording medium, into a computer built into dedicated hardware or into, for example, a general-purpose personal computer capable of executing various functions when various programs are installed.
[0168]
FIG. 24 shows an example of the configuration of a personal computer that executes the above processing. A CPU (Central Processing Unit) 221 executes various processes according to a program stored in a ROM (Read Only Memory) 222 or a program loaded from a storage unit 228 to a RAM (Random Access Memory) 223. The RAM 223 also appropriately stores data necessary for the CPU 221 to execute various processes.
[0169]
The CPU 221, ROM 222, and RAM 223 are connected to each other via a bus 224. An input / output interface 225 is also connected to the bus 224.
[0170]
Connected to the input/output interface 225 are an input unit 226 consisting of a keyboard and a mouse, an output unit 227 consisting of a display such as a CRT or an LCD and a speaker, a storage unit 228 consisting of a hard disk, and a communication unit 229 consisting of a modem, a terminal adapter, and the like. The communication unit 229 performs communication processing via a network.
[0171]
A drive 230 is connected to the input/output interface 225 as necessary; a magnetic disk 241, an optical disk 242, a magneto-optical disk 243, a semiconductor memory 244, or the like is mounted as appropriate, and a computer program read from it is installed in the storage unit 228 as necessary.
[0172]
The recording medium on which the program is recorded may be a package medium distributed separately from the computer to provide the program to the user, such as the magnetic disk 241 (including a floppy disk), the optical disk 242 (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), the magneto-optical disk 243 (including an MD (Mini-Disc)), or the semiconductor memory 244; alternatively, it may be the ROM 222 or the hard disk included in the storage unit 228, in which the program is recorded and which is provided to the user already installed in the computer.
[0173]
In this specification, the steps describing the program recorded on the recording medium include not only processing performed in chronological order according to the described order, but also processing executed in parallel or individually rather than necessarily in chronological order.
[0174]
Further, in this specification, the term "system" refers to the entire apparatus constituted by a plurality of devices.
[0175]
Further, in this specification the content has been described mainly using an image signal as an example, but the content is not limited to image signals and includes audio signals, programs, text signals, and the like.
[0176]
【The invention's effect】
As described above, according to the content supply device and method and the recording-medium program of the present invention, second information related to the content is acquired in correspondence with first information related to the functions of another device, and the content is converted based on the second information. This makes it possible to transmit content efficiently over transmission paths of various capacities and to transmit optimal content to other devices of various processing capabilities.
[0177]
According to the signal generation device and method of the present invention, the encoding difficulty of content is analyzed and output as a content information signal, and the content and the content information signal are retained. When the content is converted into another format, the format conversion can therefore be performed so as to obtain optimal image quality by referring to the content information signal.
[0178]
In addition, according to the conversion device and method of the present invention, information about the terminal that reproduces the content is acquired, and the content is converted into another format based on the content's encoding difficulty and the terminal information, so the format conversion can be performed to yield optimal image quality for the reproducing terminal.
[0179]
Also, according to the playback terminal and playback method of the present invention, content converted into another format suitable for the terminal, based on the content's encoding difficulty information, is played back, so optimal content can be reproduced.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a conventional MPEG encoder.
FIG. 2 is a block diagram showing a configuration of a conventional MPEG decoder.
FIG. 3 is a block diagram showing the configuration of another conventional MPEG encoder.
FIG. 4 is a block diagram showing a configuration of a system to which the present invention is applied.
FIG. 5 is a block diagram showing the configuration of the multimedia content server of FIG. 4.
FIG. 6 is a block diagram showing the configuration of the data access server of FIG. 4.
FIG. 7 is a block diagram showing the configuration of the transcoding apparatus of FIG. 6.
FIG. 8 is a diagram illustrating transcoding.
FIG. 9 is a diagram illustrating transcoding.
FIG. 10 is a diagram for explaining recording in the multimedia content server of FIG. 4.
FIG. 11 is a block diagram showing the configuration of the encoding difficulty analysis circuit of FIG. 10.
FIG. 12 is a diagram illustrating the configuration of a content information signal.
FIG. 13 is a diagram illustrating the structure of a transcoding hint descriptor.
FIG. 14 is a diagram illustrating the structure of a media information descriptor.
FIG. 15 is a diagram illustrating a structure of a segment descriptor.
FIG. 16 is a diagram illustrating the overall structure of MPEG-7.
FIG. 17 is a block diagram showing another configuration example of the encoding difficulty analysis circuit.
FIG. 18 is a flowchart for explaining the operation of the content information signal generating circuit.
FIG. 19 is a diagram illustrating the structure of a MediaTranscodingHint descriptor.
FIG. 20 is a diagram for explaining DifficultyType.
FIG. 21 is a diagram illustrating the structure of a MediaTranscodingHints descriptor.
FIG. 22 is a diagram illustrating the structure of a MediaInformation descriptor.
FIG. 23 is a diagram for explaining the relationship between video data and a Segment descriptor.
FIG. 24 is a block diagram illustrating a configuration example of a personal computer.
[Explanation of symbols]
101 multimedia content server, 102 data access server, 103 receiving terminal, 111 metadata recording device, 112 content recording device, 113 metadata manager, 114 multimedia content manager, 121 transcoding manager, 122 transcoding device, 123 transcoding library, 131 decoder, 132 encoder, 141 encoding difficulty analysis circuit, 151 parser, 152 hint generator, 161 decoder, 162 encoder

Claims (9)

  1. A data conversion apparatus for transcoding an input bitstream encoded in an input format into an output bitstream encoded in an output format, the apparatus comprising:
    receiving means for receiving content information, described in hint information serving as a hint for the transcoding, which includes an encoding parameter obtained from the generated bit amount of frames included in the input bitstream, and client information indicating the processing capability of a client that receives the output bitstream;
    control means for determining the output format on the basis of the content information and the client information received by the receiving means, and for generating transcoding type information indicating the output format and the manner of the transcoding; and
    conversion means for transcoding the input bitstream from the input format into the output format indicated by the transcoding type information generated by the control means, using the encoding parameter included in the content information, to generate the output bitstream.
  2. The data conversion apparatus according to claim 1, wherein the encoding parameter is encoding difficulty information of the input bitstream.
  3. The data conversion apparatus according to claim 2, wherein:
    the client information includes information indicating bitstream formats that the client is able to decode;
    the control means determines, as the output format and on the basis of the client information, a variable bit rate encoding scheme for the output bitstream; and
    the conversion means transcodes the input bitstream at the variable bit rate, using the encoding difficulty information, to generate the output bitstream.
  4. The data conversion apparatus according to claim 1, wherein:
    the content information includes parameter set information, described in the hint information, indicating values of the encoding parameters used for the transcoding that are appropriate at each bit rate;
    the client information includes information indicating bitstream formats that the client is able to decode;
    the control means determines, as the output format, the value of an encoding parameter used for the transcoding on the basis of the client information and the parameter set information; and
    the conversion means transcodes the input bitstream into the output bitstream using the encoding parameter whose value has been determined by the control means.
  5. The data conversion apparatus according to claim 4, wherein the encoding parameter used for the transcoding whose value is indicated by the parameter set information is the frame size of the output bitstream.
  6. The data conversion apparatus according to claim 4, wherein the encoding parameter used for the transcoding whose value is indicated by the parameter set information is the frame rate of the output bitstream.
  7. A data conversion method for a data conversion apparatus that transcodes an input bitstream encoded in an input format into an output bitstream encoded in an output format, the method comprising:
    receiving, by receiving means, content information, described in hint information serving as a hint for the transcoding, which includes an encoding parameter obtained from the generated bit amount of frames included in the input bitstream, and client information indicating the processing capability of a client that receives the output bitstream;
    determining, by control means, the output format on the basis of the received content information and client information, and generating transcoding type information indicating the output format and the manner of the transcoding; and
    transcoding, by conversion means, the input bitstream from the input format into the output format indicated by the generated transcoding type information, using the encoding parameter included in the content information, to generate the output bitstream.
  8. An encoding apparatus for encoding input image data, obtained by decoding an input bitstream encoded in an input format, to generate an output bitstream encoded in an output format, the apparatus comprising:
    receiving means for receiving content information, described in hint information serving as an encoding hint, which includes an encoding parameter obtained from the generated bit amount of frames included in the input image data, and client information indicating the processing capability of a client that receives the output bitstream;
    control means for determining the output format on the basis of the content information and the client information received by the receiving means, and for generating transcoding type information indicating the output format and the manner of the transcoding; and
    encoding means for encoding the input image data in the output format indicated by the transcoding type information generated by the control means, using the encoding parameter included in the content information, to generate the output bitstream.
  9. An encoding method for an encoding apparatus that encodes input image data, obtained by decoding an input bitstream encoded in an input format, to generate an output bitstream encoded in an output format, the method comprising:
    receiving, by receiving means, content information, described in hint information serving as an encoding hint, which includes an encoding parameter calculated from the generated bit amount of frames included in the input image data, and client information indicating the processing capability of a client that receives the output bitstream;
    determining, by control means, the output format on the basis of the received content information and client information, and generating transcoding type information indicating the output format and the manner of the transcoding; and
    encoding, by encoding means, the input image data in the output format indicated by the generated transcoding type information, using the encoding parameter included in the content information, to generate the output bitstream.
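The three means recited in claims 1 and 7 (receiving, control, conversion) can be sketched as follows. The function names, the decision policy, and the 256 kbps base rate are all hypothetical assumptions for illustration; the claims specify only the structure, not any particular policy.

```python
# Minimal sketch of the claimed structure: receiving means take in content
# information (an encoding parameter derived from per-frame generated bits,
# carried in transcoding-hint metadata) and client information; control means
# determine the output format and transcoding type; conversion means apply them.

def receiving_means(hints, client):
    """Extract the content information and client information (claim 1)."""
    content_info = {
        # Encoding parameter obtained from the generated bit amount of frames.
        "difficulty": hints["avg_generated_bits"] / hints["reference_bits"],
    }
    client_info = {
        "formats": client["formats"],
        "max_bitrate_kbps": client["max_bitrate_kbps"],
        "vbr_capable": client.get("vbr_capable", False),
    }
    return content_info, client_info

def control_means(content_info, client_info):
    """Determine the output format and transcoding type information."""
    fmt = "MPEG-4" if "MPEG-4" in client_info["formats"] else "MPEG-1"
    mode = "VBR" if client_info["vbr_capable"] else "CBR"  # cf. claim 3
    # Scale an assumed 256 kbps base rate by the hinted difficulty,
    # capped by the client's capability.
    target = min(client_info["max_bitrate_kbps"],
                 int(256 * content_info["difficulty"]))
    return {"format": fmt, "mode": mode, "bitrate_kbps": target}

def conversion_means(input_bitstream, ttype):
    """Stand-in transcoder: tags the payload with the decided output format."""
    return {"format": ttype["format"], "mode": ttype["mode"],
            "bitrate_kbps": ttype["bitrate_kbps"],
            "payload": b"transcoded:" + input_bitstream[:8]}

content, client = receiving_means(
    {"avg_generated_bits": 240_000, "reference_bits": 160_000},
    {"formats": ["MPEG-4", "MPEG-1"], "max_bitrate_kbps": 350,
     "vbr_capable": True})
ttype = control_means(content, client)
out = conversion_means(b"\x00\x00\x01\xb3....", ttype)
```

Here the difficulty hint (1.5) would call for 384 kbps, but the control step clamps the target to the client's 350 kbps capability before the conversion step runs, which is the division of labor the claims describe.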
JP2001065077A 2000-03-13 2001-03-08 Data conversion apparatus and method, and encoding apparatus and method Active JP4622121B2 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2000068719 2000-03-13
JP2000147768 2000-05-19
JP2000-147768 2000-05-19
JP2000-68719 2000-05-19
JP2001065077A JP4622121B2 (en) 2000-03-13 2001-03-08 Data conversion apparatus and method, and encoding apparatus and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2001065077A JP4622121B2 (en) 2000-03-13 2001-03-08 Data conversion apparatus and method, and encoding apparatus and method

Publications (2)

Publication Number Publication Date
JP2002044622A (en) 2002-02-08
JP4622121B2 (en) 2011-02-02

Family

ID=27342643

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2001065077A Active JP4622121B2 (en) 2000-03-13 2001-03-08 Data conversion apparatus and method, and encoding apparatus and method

Country Status (1)

Country Link
JP (1) JP4622121B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101336585B1 * 2012-04-16 2013-12-05 Galaxia Communications Co., Ltd. System and method for providing adaptive streaming service

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4600875B2 * 2000-08-28 2010-12-22 The Regents of the University of California Multimedia information processing apparatus and method
EP1455504B1 (en) * 2003-03-07 2014-11-12 Samsung Electronics Co., Ltd. Apparatus and method for processing audio signal and computer readable recording medium storing computer program for the method
KR100798162B1 2003-04-10 2008-01-28 NEC Corporation Moving picture compression/encoding method conversion device and moving picture communication system
EP1665567A4 (en) 2003-09-15 2010-08-25 Directv Group Inc Method and system for adaptive transcoding and transrating in a video network
JP2007235185A (en) * 2004-04-16 2007-09-13 Matsushita Electric Ind Co Ltd Information recording medium appropriate to random access, and recording/reproducing apparatus and recording/reproducing method thereof
CN101273637B * 2005-09-28 2013-04-03 Telefonaktiebolaget LM Ericsson (publ) Media manager, media contents management method and system and communication unit containing media manager
JP4941507B2 (en) * 2009-05-27 2012-05-30 沖電気工業株式会社 Load distribution control device, program and method, load distribution device, and information processing device
JP5553140B2 (en) 2009-10-02 2014-07-16 ソニー株式会社 Information processing apparatus and method
KR102013461B1 (en) * 2011-01-21 2019-08-22 인터디지탈 매디슨 페이튼트 홀딩스 System and method for enhanced remote transcoding using content profiling
CN105519117A * 2013-09-06 2016-04-20 Mitsubishi Electric Corporation Video encoding device, video transcoding device, video encoding method, video transcoding method and video stream transmission system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003523024A * 2000-02-10 2003-07-29 Telefonaktiebolaget LM Ericsson (publ) Method and apparatus for intelligent transcoding of multimedia data




Similar Documents

Publication Publication Date Title
US10257443B2 (en) Multimedia distribution system for multimedia files with interleaved media chunks of varying types
US6532593B1 (en) Transcoding for consumer set-top storage application
CN1227909C (en) System for editing compressed image sequences
KR101354833B1 (en) Techniques for variable resolution encoding and decoding of digital video
CN1089527C Method and apparatus for coding image, and image recording medium
US6535559B2 (en) Image encoder, image encoding method, image decoder, image decoding method, and distribution media
Vetro et al. Video transcoding architectures and techniques: an overview
US6934334B2 (en) Method of transcoding encoded video data and apparatus which transcodes encoded video data
AU2005202313B2 (en) Method and apparatus for generating compact transcoding hints metadata
US6895052B2 (en) Coded signal separating and merging apparatus, method and computer program product
US5305113A (en) Motion picture decoding system which affords smooth reproduction of recorded motion picture coded data in forward and reverse directions at high speed
US6400768B1 (en) Picture encoding apparatus, picture encoding method, picture decoding apparatus, picture decoding method and presentation medium
US6563954B2 (en) Method for computational graceful degradation in an audiovisual compression system
JP4529933B2 (en) Multiplexing apparatus and method, and synthesizing apparatus and method
KR100681168B1 (en) Encoding and decoding system and method of the residual signal for the fine granular scalable video
US6925120B2 (en) Transcoder for scalable multi-layer constant quality video bitstreams
US7822118B2 (en) Method and apparatus for control of rate-distortion tradeoff by mode selection in video encoders
CN100531388C (en) Multi-rate transcoder for digital streams
US6031575A (en) Method and apparatus for encoding an image signal, method and apparatus for decoding an image signal, and recording medium
US7738550B2 (en) Method and apparatus for generating compact transcoding hints metadata
JP4601889B2 (en) Apparatus and method for converting a compressed bitstream
US20030215012A1 (en) Method and apparatus for transforming moving picture coding system
US6608935B2 (en) Picture encoding method and apparatus, picture decoding method and apparatus and furnishing medium
KR100305941B1 (en) A real-time single pass variable bit rate control strategy and encoder
EP1747673B1 (en) Multiple interoperability points for scalable media coding and transmission

Legal Events

Date        Code  Description
2008-01-16  A621  Written request for application examination (JAPANESE INTERMEDIATE CODE: A621)
2009-08-18  A131  Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131)
2009-10-14  A521  Written amendment (JAPANESE INTERMEDIATE CODE: A523)
2009-12-08  A131  Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131)
2010-02-08  A521  Written amendment (JAPANESE INTERMEDIATE CODE: A523)
2010-03-02  A131  Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131)
2010-05-06  A521  Written amendment (JAPANESE INTERMEDIATE CODE: A523)
2010-08-12  A131  Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131)
2010-09-15  A521  Written amendment (JAPANESE INTERMEDIATE CODE: A523)
(no date)   TRDD  Decision of grant or rejection written
2010-10-05  A01   Written decision to grant a patent or to grant a registration (utility model) (JAPANESE INTERMEDIATE CODE: A01)
2010-10-18  A61   First payment of annual fees during grant procedure (JAPANESE INTERMEDIATE CODE: A61)
(no date)   FPAY  Renewal fee payment (payment until 2013-11-12; year of fee payment: 3)
(no date)   R250  Receipt of annual fees (JAPANESE INTERMEDIATE CODE: R250), recorded five times