WO2013076991A1 - Image encoding method, image encoding device, image decoding method, and image decoding device - Google Patents

Image encoding method, image encoding device, image decoding method, and image decoding device

Info

Publication number: WO2013076991A1
Authority: WO (WIPO, PCT)
Prior art keywords: data, extension, unit, NAL unit, encoding
Application number: PCT/JP2012/007528
Other languages: English (en), Japanese (ja)
Inventors: 大作 小宮, 西 孝啓, 陽司 柴原, 寿郎 笹井, 敏康 杉尾, 京子 谷川, 徹 松延, 健吾 寺田
Original Assignee: パナソニック株式会社 (Panasonic Corporation)
Application filed by パナソニック株式会社 (Panasonic Corporation)
Publication of WO2013076991A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • The present invention relates to an image encoding method and an image decoding method.
  • The multi-view video coding (MVC) standard and the scalable video coding (SVC) standard, for example, are defined as extensions of the ISO/IEC 14496-10 Advanced Video Coding (AVC) standard (Non-Patent Document 1).
  • The present invention provides an image encoding method, an image decoding method, and the like that can realize encoding combining multi-view video coding (MVC) and scalable video coding (SVC).
  • An image decoding method according to one aspect of the present invention is an image decoding method for decoding encoded base image data and extended image data encoded by a combination of scalable video coding and multi-view video coding, and includes a base data decoding step of decoding the base image data, a parsing step of parsing a NAL unit header that includes, as parameters, information specifying a view and information specifying a layer, a specifying step of specifying the extended image data from the information specifying the view and the information specifying the layer, and an extended data decoding step of decoding the extended image data.
  • An image encoding method according to one aspect of the present invention is an image encoding method for encoding base image data and extended image data by at least one of scalable video coding and multi-view video coding, and includes a base data encoding step of encoding the base image data, an extension parameter specifying step of specifying, depending on whether both scalable video coding and multi-view video coding are performed or only one of them is performed, an extension parameter used for encoding the extended image data, an extension data encoding step of encoding the extended image data using the extension parameter, and a header generation step of generating a NAL unit header including the extension parameter.
  • FIG. 1 is a diagram relating to the storage locations defined by the MVC extension syntax of the NAL unit header.
  • FIG. 2 is a diagram relating to the storage locations defined by the SVC extension syntax of the NAL unit header.
  • FIG. 3 is a diagram illustrating an example of a prediction structure of an input image when encoding is performed by combining multi-view video encoding (MVC) and scalable video encoding (SVC).
  • FIG. 4 is a diagram relating to storage locations in the extended syntax of the NAL unit header used in the image encoding method and the image decoding method according to the first embodiment.
  • FIG. 5 is a diagram relating to storage locations in the extended syntax of the NAL unit header used in the image encoding method and the image decoding method according to the first embodiment.
  • FIG. 6 is a diagram relating to storage locations in the extended syntax of the NAL unit header used in the image encoding method and the image decoding method according to the first embodiment.
  • FIG. 7 is a block diagram showing a configuration of the image coding apparatus according to Embodiment 1.
  • FIG. 8 is a flowchart showing processing of the image coding apparatus according to Embodiment 1.
  • FIG. 9A is a diagram illustrating an example of the configuration of a bitstream generated by the image encoding device according to Embodiment 1, namely a bitstream generated when only scalable video coding (SVC) is performed.
  • FIG. 9B is a diagram illustrating an example of the configuration of a bitstream generated by the image encoding device according to Embodiment 1, namely a bitstream generated when only multi-view video coding (MVC) is performed.
  • FIG. 9C is a diagram illustrating an example of the configuration of a bitstream generated by the image encoding device according to Embodiment 1, namely a bitstream generated when both multi-view video coding (MVC) and scalable video coding (SVC) are performed.
  • FIG. 10 is a block diagram showing the configuration of the image decoding device according to Embodiment 1.
  • FIG. 11 is a flowchart showing NAL unit header extension parameter acquisition processing by the prefix analysis unit or extension data analysis unit according to the first embodiment.
  • FIG. 12 is a diagram illustrating an example of the NAL unit header / extension part map showing the configuration of the extension part of the NAL unit header.
  • FIG. 13 is a diagram illustrating a NAL unit header corresponding to the NAL unit header / extension part map.
  • FIG. 14 is a block diagram showing a configuration of an image coding apparatus according to Embodiment 2.
  • FIG. 15 is a flowchart showing processing of the image coding apparatus according to Embodiment 2.
  • FIG. 16A is a diagram illustrating an example of a configuration of a bitstream encoded by the image encoding device according to Embodiment 2, and is an example in which an enhancement layer (HEVC) is added to the base layer (AVC).
  • FIG. 16B is a diagram illustrating an example of a configuration of a bitstream encoded by the image encoding device according to the second embodiment, and is an example in which an extended view (HEVC) is added to a base view (AVC).
  • FIG. 16C is a diagram illustrating an example of the configuration of a bitstream encoded by the image encoding device according to Embodiment 2, namely an example in which a base layer / extended view (HEVC) and an extension layer / extended view (HEVC) are added as extended views to a base view composed of a base layer / base view (AVC) and an extension layer / base view (AVC).
  • FIG. 16D is a diagram illustrating an example of the configuration of a bitstream encoded by the image encoding device according to Embodiment 2, namely an example in which an extension layer / base view (HEVC) and an extension layer / extended view (HEVC) are added as extension layers to a base layer composed of a base layer / base view (AVC) and a base layer / extended view (AVC).
  • FIG. 17 is a block diagram illustrating a configuration of the image decoding apparatus according to the second embodiment.
  • FIG. 18 is a flowchart showing processing of the image decoding apparatus according to Embodiment 2.
  • FIG. 19 is a flowchart illustrating processing for calculating the NAL unit header extension parameter of the base data by the extension parameter calculation unit according to the second embodiment.
  • FIG. 20 is a flowchart illustrating processing for calculating the NAL unit header MVC extension parameter of the base view by the extension parameter calculation unit according to the second embodiment.
  • FIG. 21 is an overall configuration diagram of a content supply system that realizes a content distribution service.
  • FIG. 22 is an overall configuration diagram of a digital broadcasting system.
  • FIG. 23 is a block diagram illustrating a configuration example of a television.
  • FIG. 24 is a block diagram illustrating a configuration example of an information reproducing / recording unit that reads and writes information from and on a recording medium that is an optical disk.
  • FIG. 25 is a diagram illustrating a structure example of a recording medium that is an optical disk.
  • FIG. 26A illustrates an example of a mobile phone.
  • FIG. 26B is a block diagram illustrating a configuration example of a mobile phone.
  • FIG. 27 is a diagram showing a structure of multiplexed data.
  • FIG. 28 is a diagram schematically showing how each stream is multiplexed in the multiplexed data.
  • FIG. 29 is a diagram showing in more detail how the video stream is stored in the PES packet sequence.
  • FIG. 30 is a diagram illustrating the structure of TS packets and source packets in multiplexed data.
  • FIG. 31 is a diagram illustrating a data structure of the PMT.
  • FIG. 32 shows the internal structure of multiplexed data information.
  • FIG. 33 shows the internal structure of stream attribute information.
  • FIG. 34 is a diagram showing steps for identifying video data.
  • FIG. 35 is a block diagram illustrating a configuration example of an integrated circuit that realizes the moving image encoding method and the moving image decoding method according to each embodiment.
  • FIG. 36 is a diagram showing a configuration for switching the driving frequency.
  • FIG. 37 is a diagram showing steps for identifying video data and switching between driving frequencies.
  • FIG. 38 is a diagram showing an example of a look-up table in which video data standards are associated with drive frequencies.
  • FIG. 39A is a diagram illustrating an example of a configuration for sharing a module of a signal processing unit.
  • FIG. 39B is a diagram illustrating another example of a configuration for sharing a module of a signal processing unit.
  • In the MVC standard, the encoded base view is required to be compatible with a profile defined in the AVC standard; therefore, a conventional decoder conforming to the High Profile of the AVC standard must be able to decode the base view in an MVC bitstream conforming to an MVC profile. Similarly, legacy decoders that conform to the AVC standard High Profile must be able to decode the base layer in an SVC bitstream that conforms to an SVC profile.
  • Encoded data is stored in NAL (Network Abstraction Layer) units, and different types of NAL units are distinguished by the value of the NAL unit type.
  • The extended view defined by the MVC standard and the extension layer defined by the SVC standard are included in NAL units whose NAL unit type value was reserved in previous versions of the AVC standard (a NAL unit type value of “20”), so these NAL units are ignored by conventional High Profile decoders.
  • The NAL unit header of a NAL unit containing an extended view or an extension layer includes additional parameters. Additional parameters used for decoding an MVC bitstream are arranged in the NAL unit header MVC extension part, and additional parameters used for decoding an SVC bitstream are arranged in the NAL unit header SVC extension part.
  • In the MVC standard and the SVC standard, a special NAL unit called a prefix NAL unit is arranged immediately before the NAL unit containing the encoded base view or base layer and is transmitted together with it. The NAL unit type value of this prefix NAL unit is “14”, which was reserved in previous versions of the AVC standard.
  • On the other hand, the NAL unit type value of the NAL unit containing the encoded base view or base layer is “5” or “1”.
  • The prefix NAL unit in the MVC standard has no payload and consists only of a NAL unit header.
  • the prefix NAL unit also includes additional parameters that are placed in the MVC extension or SVC extension of the NAL unit header. These parameters are associated with the base view or base layer, and are used in the process of encoding or decoding the enhancement view or enhancement layer.
  • The MVC standard stipulates that the values of the additional parameters to be placed in the MVC extension part of the prefix NAL unit are inferred when the prefix NAL unit is not attached to the bitstream. If the prefix NAL unit can be omitted from the bitstream, the following advantages are obtained.
  • The first advantage is improved backward compatibility with conventional AVC decoders.
  • legacy AVC decoders should ignore NAL units that have NAL unit type values defined as reserved values in previous versions of the AVC standard.
  • a legacy AVC decoder should only decode the NAL unit containing the base view and reconstruct only that base view.
  • However, not all decoders on the market ignore NAL units with reserved values.
  • a compressed base view and a compressed extended view can be distinguished using different stream identifiers.
  • When the prefix NAL unit is included in association with the compressed base view, some decoders on the market cannot decode the base view because of the prefix NAL unit. Omitting the prefix NAL unit solves this problem.
  • The second advantage is improved reusability of bitstreams generated based on the conventional standard. The prefix NAL unit must be placed before each NAL unit of a bitstream (base view) already generated based on the conventional standard, and this task is difficult. For example, in a service in which a base view already sold in the form of a DVD or BD is combined with an extended view distributed via a network such as the Internet to display 3D video, a prefix NAL unit cannot be added before the NAL units of the base view. If omission of the prefix NAL unit is allowed, a bitstream already generated based on the conventional standard need not be modified, and this problem is solved.
  • FIG. 1 is a diagram relating to the storage locations defined by the MVC extension syntax of the NAL unit header.
  • the NAL unit header includes a so-called basic part common to all NAL unit headers and an MVC extension part of the NAL unit header.
  • a 1-bit svc_extension_flag is placed between the basic part of the NAL unit header and the MVC extension, and when this value is false, it indicates that the subsequent extension is the MVC extension.
  • The NAL unit header MVC extension part includes a non-IDR flag (non_idr_flag), a priority ID (priority_id), a view ID (view_id), a time ID (temporal_id), an anchor picture flag (anchor_pic_flag), an inter-view prediction flag (inter_view_flag), and a reserved 1 bit (reserved_one_bit). The reserved 1 bit (reserved_one_bit) takes a single fixed value and is not used in the encoding and decoding of the extended view.
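  • As an illustration only, the MVC extension part just described can be modeled as a plain C struct. This is a hypothetical sketch, not code from the patent; the field names follow the parameter list above, and the bit widths follow the AVC/MVC nal_unit_header_mvc_extension syntax.
```c
/* Hypothetical model of the NAL unit header MVC extension part (FIG. 1).
 * Bit widths follow the AVC/MVC nal_unit_header_mvc_extension syntax. */
typedef struct {
    unsigned non_idr_flag;     /* 1 bit: 0 indicates an IDR picture */
    unsigned priority_id;      /* 6 bits */
    unsigned view_id;          /* 10 bits */
    unsigned temporal_id;      /* 3 bits */
    unsigned anchor_pic_flag;  /* 1 bit */
    unsigned inter_view_flag;  /* 1 bit: usable for inter-view prediction */
    unsigned reserved_one_bit; /* 1 bit: fixed value, unused in decoding */
} NalUnitHeaderMvcExtension;
```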
  • FIG. 2 is a diagram relating to the storage locations defined by the SVC extension syntax of the NAL unit header.
  • the NAL unit header is composed of a so-called basic part common to all NAL unit headers and an SVC extension part of the NAL unit header.
  • a 1-bit svc_extension_flag is placed between the basic part of the NAL unit header and the SVC extension. When this value is true, it indicates that the subsequent extension is the SVC extension.
  • The NAL unit header SVC extension part includes an IDR flag (idr_flag), a priority ID (priority_id), an inter-layer prediction flag (no_inter_layer_pred_flag), a dependency ID (dependency_id), a quality ID (quality_id), a time ID (temporal_id), a reference base picture use flag (use_ref_base_pic_flag), a discardable flag (discardable_flag), an output flag (output_flag), and reserved 2 bits (reserved_three_2bits). The reserved 2 bits (reserved_three_2bits) take a single fixed value and are not used in the encoding and decoding of the extension layer.
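  • The SVC extension part can be sketched the same way; again a hypothetical model whose bit widths follow the AVC/SVC nal_unit_header_svc_extension syntax, not code from the patent.
```c
/* Hypothetical model of the NAL unit header SVC extension part (FIG. 2).
 * Bit widths follow the AVC/SVC nal_unit_header_svc_extension syntax. */
typedef struct {
    unsigned idr_flag;                 /* 1 bit */
    unsigned priority_id;              /* 6 bits */
    unsigned no_inter_layer_pred_flag; /* 1 bit */
    unsigned dependency_id;            /* 3 bits */
    unsigned quality_id;               /* 4 bits */
    unsigned temporal_id;              /* 3 bits */
    unsigned use_ref_base_pic_flag;    /* 1 bit */
    unsigned discardable_flag;         /* 1 bit */
    unsigned output_flag;              /* 1 bit */
    unsigned reserved_three_2bits;     /* 2 bits: fixed value, unused in decoding */
} NalUnitHeaderSvcExtension;
```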
  • FIG. 3 is a diagram illustrating an example of a prediction structure of an input image when encoding is performed by combining multi-view video encoding (MVC) and scalable video encoding (SVC).
  • In this example, there are three layers and two views (3D).
  • the left view is the base layer, and the right view is the extended view.
  • the base layer (Layer 1 (L)) of the base view can only perform inter prediction.
  • the extension layer (Layer 2 (L), Layer 3 (L)) of the base view can perform inter-layer prediction and inter prediction.
  • the base layer (Layer 1 (R)) of the extended view can perform inter-view prediction or inter prediction.
  • the extended layer (Layer2 (R), Layer3 (R)) of the extended view can perform inter-view prediction, inter-layer prediction, or inter prediction.
  • When multi-view video coding (MVC) and scalable video coding (SVC) are combined, the NAL unit header needs to include both the MVC extension part and the SVC extension part at the same time. However, as shown in FIGS. 1 and 2, the NAL unit header can include only one of the MVC extension part and the SVC extension part.
  • Likewise, the prefix NAL unit needs to include information corresponding to both the MVC extension part and the SVC extension part, and the prefix NAL unit cannot be omitted. Furthermore, when the prefix NAL unit is omitted, it cannot be determined which of the information corresponding to the MVC extension part, the information corresponding to the SVC extension part, or the information corresponding to both has been omitted, and the omitted information cannot be calculated.
  • In order to solve such problems, an image decoding method according to one aspect of the present invention is an image decoding method for decoding encoded base image data and extended image data encoded by a combination of scalable video coding and multi-view video coding, and includes a base data decoding step of decoding the base image data, a parsing step of parsing a NAL unit header that includes, as parameters, information specifying a view and information specifying a layer, a specifying step of specifying the extended image data from the information specifying the view and the information specifying the layer, and an extended data decoding step of decoding the extended image data.
  • For example, the base image data may be encoded by a first video coding scheme and not include an extension parameter of the base image data, the extended image data may be encoded by a second video coding scheme, and the extended data decoding step may decode the extended image data using the extension parameter of the extended image data and the extension parameter of the base image data.
  • For example, the image decoding method may further include an extension parameter determination step of detecting, based on a flag value included in the NAL unit header of the extended image data, whether both scalable video coding and multi-view video coding are performed on the extended image data or only one of them is performed, and of determining the extension parameter of the base image data to be calculated.
  • For example, the parsing may be performed based on map information indicating the arrangement of the parameters included in the NAL unit header.
  • For example, the map information may be included in sequence parameter set information or picture parameter set information.
  • An image encoding method according to one aspect of the present invention is an image encoding method for encoding base image data and extended image data by at least one of scalable video coding and multi-view video coding, and includes a base data encoding step of encoding the base image data, an extension parameter specifying step of specifying, depending on whether both scalable video coding and multi-view video coding are performed or only one of them is performed, an extension parameter used for encoding the extended image data, an extension data encoding step of encoding the extended image data using the extension parameter, and a header generation step of generating a NAL unit header including the extension parameter.
  • For example, in the header generation step, information indicating whether both scalable video coding and multi-view video coding are performed or only one of them is performed may be included in the generated NAL unit header.
  • map information indicating an arrangement of the extension parameters included in the NAL unit header may be included in the generated NAL unit header.
  • FIG. 4 is a diagram relating to storage locations in the extended syntax of the NAL unit header used in the image encoding method and the image decoding method according to the first embodiment.
  • the NAL unit header is composed of a basic part of the NAL unit header and an MVC extension part of the NAL unit header.
  • This configuration differs from that shown in FIG. 1 in that a 2-bit svc_mvc_extension_flag is placed between the basic part of the NAL unit header and the MVC extension part, and in that the reserved 1 bit (reserved_one_bit) is deleted from the MVC extension part. When the value of svc_mvc_extension_flag is “2”, it indicates that the subsequent extension part is the MVC extension part.
  • FIG. 5 is a diagram relating to storage locations in the extended syntax of the NAL unit header used in the image encoding method and the image decoding method according to the first embodiment.
  • the NAL unit header is composed of a basic part of the NAL unit header and an SVC extension part of the NAL unit header.
  • This configuration differs from that shown in FIG. 2 in that a 2-bit svc_mvc_extension_flag is placed between the basic part of the NAL unit header and the SVC extension part, and in that the reserved 2 bits (reserved_three_2bits) of the SVC extension part are changed to a reserved 1 bit (reserved_one_bit).
  • When the value of svc_mvc_extension_flag is “1”, it indicates that the subsequent extension part is the SVC extension part.
  • FIG. 6 is a diagram relating to the storage locations defined by the extended syntax of the NAL unit header used in the image encoding method and the image decoding method according to Embodiment 1.
  • the NAL unit header includes a basic part of the NAL unit header and an MVC extension part and an SVC extension part of the NAL unit header.
  • The MVC extension part differs from that shown in FIG. 1 in that the reserved 1 bit (reserved_one_bit) is deleted.
  • The SVC extension part differs from that shown in FIG. 2 in that the reserved 2 bits (reserved_three_2bits) are changed to reserved 3 bits (reserved_seven_3bits).
  • Further, a 2-bit svc_mvc_extension_flag is placed between the basic part of the NAL unit header and the MVC extension part; when this value is “3”, it indicates that the subsequent extension parts are the MVC extension part and the SVC extension part.
  • Parameters that overlap between the MVC extension part and the SVC extension part (for example, priority_id and temporal_id) are included only once, and byte alignment is adjusted with the reserved bits.
  • the NAL unit header shown in FIG. 6 is used in an encoded bitstream that combines multi-view video coding (MVC) and scalable video coding (SVC).
  • the NAL unit header of the present embodiment can include information corresponding to the MVC extension unit and the SVC extension unit.
  • The configuration of the subsequent extension part differs depending on the value of svc_mvc_extension_flag. That is, the NAL unit header shown in FIG. 4 is used in a bitstream encoded with only multi-view video coding (MVC), and the NAL unit header shown in FIG. 5 is used in a bitstream encoded with only scalable video coding (SVC).
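  • As a sketch of how a decoder could represent the three cases, the following hypothetical C layout pairs the 2-bit svc_mvc_extension_flag with the structs sketched above (ignoring the reserved-bit differences noted above); the flag values “1”, “2”, and “3” correspond to FIGS. 5, 4, and 6, respectively.
```c
/* Hypothetical in-memory form of the Embodiment 1 extended NAL unit
 * header: svc_mvc_extension_flag selects which extension parts follow
 * ("2" = MVC only, "1" = SVC only, "3" = MVC then SVC). */
typedef struct {
    unsigned svc_mvc_extension_flag; /* 2 bits */
    int has_mvc;                     /* flag is "2" or "3" */
    int has_svc;                     /* flag is "1" or "3" */
    NalUnitHeaderMvcExtension mvc;   /* valid when has_mvc is set */
    NalUnitHeaderSvcExtension svc;   /* valid when has_svc is set */
} ExtendedNalUnitHeader;
```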
  • FIG. 7 is a block diagram showing a configuration of the image coding apparatus according to the first embodiment.
  • As shown in FIG. 7, the image encoding device 100 includes a base data encoding unit 101, an extension parameter specifying unit 102, an extension data encoding unit 103, a NAL unit header generation unit 104, and a NAL unit generation unit 105.
  • The base data encoding unit 101 encodes the base view in multi-view video coding (MVC), the base layer in scalable video coding (SVC), and the base view and base layer in coding that combines multi-view video coding (MVC) and scalable video coding (SVC). In any of these cases, there is no difference in the encoded data output as base data.
  • The extension parameter specifying unit 102 detects whether both multi-view video coding (MVC) and scalable video coding (SVC) are performed or only one of them is performed, and specifies the extension parameters of the NAL unit header according to the coding to be performed.
  • The extension data encoding unit 103 encodes an extended view (for example, Layer1(R) in FIG. 3), an extension layer (for example, Layer2(L) and Layer3(L) in FIG. 3), or an extended view and extension layer (for example, Layer2(R) and Layer3(R) in FIG. 3).
  • the NAL unit header generation unit 104 generates a NAL unit header (basic unit), a NAL unit header including an extension unit, or a prefix NAL unit.
  • The NAL unit generation unit 105 inserts the base data encoded by the base data encoding unit 101 or the extension data encoded by the extension data encoding unit 103 into the payload part of a NAL unit, inserts the NAL unit header generated by the NAL unit header generation unit 104 into the header part of the NAL unit, and thereby generates a NAL unit.
  • When a prefix NAL unit is generated, it is arranged immediately before the NAL unit containing the base data.
  • FIG. 8 is a flowchart showing processing of the image coding apparatus according to the first embodiment.
  • the base data encoding unit 101 encodes base data (step S101).
  • the NAL unit header generation unit 104 generates a NAL unit header (basic part) of base data (step S102).
  • Next, the NAL unit generation unit 105 inserts the base data encoded by the base data encoding unit 101 into the payload part of the NAL unit and inserts the NAL unit header generated by the NAL unit header generation unit 104 into the header part of the NAL unit, thereby generating a NAL unit including the base data (step S103).
  • Next, the extension parameter specifying unit 102 detects whether both multi-view video coding (MVC) and scalable video coding (SVC) are performed or only one of them is performed, and specifies the NAL unit header extension parameters of the base data (step S104).
  • When adding a prefix NAL unit to the NAL unit including the base data, the NAL unit generation unit 105 generates the prefix NAL unit and places it immediately before the NAL unit including the base data.
  • Next, depending on whether both multi-view video coding (MVC) and scalable video coding (SVC) are performed or only one of them is performed (step S105), the extension parameter specifying unit 102 specifies the NAL unit header extension parameters of the extension data as follows.
  • When only multi-view video coding (MVC) is performed, the extension parameter specifying unit 102 specifies the NAL unit header MVC extension parameters (step S106).
  • When neither multi-view video coding (MVC) nor scalable video coding (SVC) is performed, the process ends.
  • As shown in FIG. 4, the NAL unit header MVC extension parameters include a non-IDR flag (non_idr_flag), a priority ID (priority_id), a view ID (view_id), a time ID (temporal_id), an anchor picture flag (anchor_pic_flag), and an inter-view prediction flag (inter_view_flag).
  • Next, the extension data encoding unit 103 encodes the extended view based on the multi-view video coding (MVC) standard, using the specified extension parameters of the base data and the extension data (step S107).
  • the NAL unit header generation unit 104 sets the value of svc_mvc_extension_flag to “2”, and generates a NAL unit header including the NAL unit header MVC extension unit (step S108). At this time, NAL_unit_type is set to “20”.
  • When only scalable video coding (SVC) is performed, the extension parameter specifying unit 102 specifies the NAL unit header SVC extension parameters (step S109).
  • As shown in FIG. 5, the NAL unit header SVC extension parameters include an IDR flag (idr_flag), a priority ID (priority_id), an inter-layer prediction flag (no_inter_layer_pred_flag), a dependency ID (dependency_id), a quality ID (quality_id), a time ID (temporal_id), a reference base picture use flag (use_ref_base_pic_flag), a discardable flag (discardable_flag), an output flag (output_flag), a reserved 1 bit (reserved_one_bit), and the like.
  • Next, the extension data encoding unit 103 encodes the extension layer based on the scalable video coding (SVC) standard, using the specified extension parameters of the base data and the extension data (step S110).
  • the NAL unit header generation unit 104 sets the value of svc_mvc_extension_flag to “1”, and generates a NAL unit header including the NAL unit header SVC extension unit (step S111). At this time, NAL_unit_type is set to “20”.
  • When both multi-view video coding (MVC) and scalable video coding (SVC) are performed, the extension parameter specifying unit 102 specifies the NAL unit header MVC extension parameters and the NAL unit header SVC extension parameters (step S112).
  • Next, the extension data encoding unit 103 encodes the extended view and the extension layer based on the multi-view video coding (MVC) standard and the scalable video coding (SVC) standard, using the specified extension parameters of the base data and the extension data (step S113).
  • the NAL unit header generation unit 104 sets the value of svc_mvc_extension_flag to “3”, and generates a NAL unit header including the NAL unit header MVC extension unit and the NAL unit header SVC extension unit (step S114). At this time, NAL_unit_type is set to “20”.
  • Finally, the NAL unit generation unit 105 inserts the extension data encoded by the extension data encoding unit 103 into the payload part of the NAL unit and inserts the NAL unit header generated by the NAL unit header generation unit 104 into the header part of the NAL unit, thereby generating a NAL unit (step S115).
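  • A minimal sketch of the branch in steps S105 to S114, assuming two booleans that indicate which codings are performed (the function and parameter names are illustrative, not from the patent); the returned value is the svc_mvc_extension_flag written into the generated NAL unit header.
```c
/* Sketch of the svc_mvc_extension_flag selection of FIG. 8
 * (steps S105-S114); inputs are assumed booleans. */
unsigned select_svc_mvc_extension_flag(int mvc_performed, int svc_performed)
{
    if (mvc_performed && svc_performed)
        return 3; /* steps S112-S114: MVC and SVC extension parts */
    if (mvc_performed)
        return 2; /* steps S106-S108: MVC extension part only */
    if (svc_performed)
        return 1; /* steps S109-S111: SVC extension part only */
    return 0;     /* neither: no extension part is generated */
}
```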
  • FIG. 9 is a diagram illustrating an example of a configuration of a bitstream generated by the image encoding device according to the first embodiment.
  • FIG. 9A shows a bit stream generated when only scalable video coding (SVC) is performed in step S105 of FIG. 8, and is composed of a base layer (HEVC) and an enhancement layer (HEVC).
  • FIG. 9B shows a bit stream generated when only multi-view video coding (MVC) is performed in step S105 of FIG. 8, and is composed of a base view (HEVC) and an extended view (HEVC).
  • the NAL unit header shown in FIG. 4 is used for the base view (HEVC) and the extended view (HEVC).
  • FIG. 9C shows a bitstream generated when both multi-view video coding (MVC) and scalable video coding (SVC) are performed in step S105 of FIG. 8; it is composed of a base layer / base view (HEVC), a base layer / extended view (HEVC), an extension layer / base view (HEVC), and an extension layer / extended view (HEVC).
  • the NAL unit header shown in FIG. 6 is used for the base layer / base view (HEVC), the base layer / enhanced view (HEVC), the enhancement layer / base view (HEVC), and the enhancement layer / enhanced view (HEVC).
  • As described above, the image encoding device according to Embodiment 1 generates a NAL unit header that includes the MVC extension part and the SVC extension part at the same time, or a NAL unit header that includes information corresponding to the MVC extension part and the SVC extension part at the same time, and can thereby realize encoding that combines multi-view video coding (MVC) and scalable video coding (SVC).
  • FIG. 10 is a block diagram showing a configuration of the image decoding apparatus according to the first embodiment.
  • As shown in FIG. 10, the image decoding device 200 includes a base data analysis unit 201, a base data decoding unit 202, a prefix analysis unit 203, an extension data analysis unit 204, and an extension data decoding unit 205.
  • the base data analysis unit 201 performs syntax analysis of NAL units of base data (base view NAL unit shown in FIG. 4, base layer NAL unit shown in FIG. 5, base NAL unit shown in FIG. 6, etc.).
  • the base data decoding unit 202 performs a base data decoding process.
  • The prefix analysis unit 203 parses the NAL unit header of the prefix NAL unit and thereby acquires the NAL unit header extension parameters of the base data.
  • the extension data analysis unit 204 performs syntax analysis of an extension data NAL unit (an extension view NAL unit shown in FIG. 4, an extension layer NAL unit shown in FIG. 5, an extension NAL unit shown in FIG. 6, etc.). Then, the NAL unit header extension parameter of the extension data is acquired.
  • the extension data decoding unit 205 performs extension data decoding processing using the NAL unit header extension parameters of the base data and extension data.
  • Note that the NAL unit header extension parameters of the base data are the NAL unit header extension parameters included in the prefix NAL unit.
  • Next, the NAL unit header extension parameter acquisition processing by the prefix analysis unit 203 and the extension data analysis unit 204 will be described. In both units, the processing for acquiring the NAL unit header extension parameters is the same.
  • FIG. 11 is a flowchart showing NAL unit header extension parameter acquisition processing by the prefix analysis unit 203 or the extension data analysis unit 204 in the first embodiment.
  • First, the prefix analysis unit 203 or the extension data analysis unit 204 acquires svc_mvc_extension_flag from the NAL unit header (step S201).
  • Next, the prefix analysis unit 203 or the extension data analysis unit 204 checks the value of svc_mvc_extension_flag (step S202). If the value of svc_mvc_extension_flag is “2” (MVC decoding in step S202), the MVC extension parameters are acquired (step S203). If the value of svc_mvc_extension_flag is “1” (SVC decoding in step S202), the SVC extension parameters are acquired (step S204).
  • If the value of svc_mvc_extension_flag is “3” (MVC/SVC decoding in step S202), the prefix analysis unit 203 or the extension data analysis unit 204 acquires the MVC extension parameters and the SVC extension parameters (step S205).
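  • The dispatch of FIG. 11 could look like the following hedged C sketch; BitReader, read_bits(), and the two parse helpers are assumed stand-ins for the field-by-field parsing of the extension parts, not names from the patent.
```c
typedef struct BitReader BitReader;            /* assumed bit reader */
unsigned read_bits(BitReader *br, unsigned n); /* assumed helper */
void parse_mvc_extension(BitReader *br, NalUnitHeaderMvcExtension *out);
void parse_svc_extension(BitReader *br, NalUnitHeaderSvcExtension *out);

/* Sketch of the extension parameter acquisition of FIG. 11 (S201-S205). */
void acquire_extension_params(BitReader *br, ExtendedNalUnitHeader *hdr)
{
    hdr->svc_mvc_extension_flag = read_bits(br, 2);        /* step S201 */
    hdr->has_mvc = (hdr->svc_mvc_extension_flag & 2) != 0; /* "2" or "3" */
    hdr->has_svc = (hdr->svc_mvc_extension_flag & 1) != 0; /* "1" or "3" */
    if (hdr->has_mvc)
        parse_mvc_extension(br, &hdr->mvc); /* S203, or first part of S205 */
    if (hdr->has_svc)
        parse_svc_extension(br, &hdr->svc); /* S204, or second part of S205 */
}
```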
  • In this way, according to the value of svc_mvc_extension_flag, the image decoding device according to Embodiment 1 can detect whether both multi-view video coding (MVC) and scalable video coding (SVC) are performed on the bitstream or only one of them is performed, and can acquire the appropriate NAL unit header extension parameters from the NAL unit header.
  • From the NAL unit header extension parameters of the base data and the extension data, the extension data decoding unit 205 can know how the payload (extension data) of the input NAL unit is encoded.
  • That is, it can know whether the extension data is the enhancement layer (HEVC) in FIG. 9A, the extended view (HEVC) in FIG. 9B, or the base layer / extended view (HEVC), the extension layer / base view (HEVC), or the extension layer / extended view (HEVC) in FIG. 9C.
  • As described above, by using a NAL unit header that includes the MVC extension part and the SVC extension part at the same time, or a NAL unit header that includes information corresponding to the MVC extension part and the SVC extension part at the same time, the image decoding device according to Embodiment 1 can appropriately decode data encoded by a combination of multi-view video coding (MVC) and scalable video coding (SVC).
  • In the present embodiment, the NAL unit headers shown in FIGS. 4, 5, and 6 have been described as NAL unit headers that include the MVC extension part and the SVC extension part at the same time; however, other NAL unit headers may be used.
  • FIGS. 12 and 13 show a modification of the NAL unit header that includes the MVC extension part and the SVC extension part at the same time.
  • This NAL unit header is characterized in that it does not include all of the parameters contained in the extension parts of the NAL unit headers shown in FIGS. 4, 5, and 6. That is, the extension part of the NAL unit header includes only the parameters necessary for encoding and decoding, and the NAL unit header has a variable-length configuration. As a result, the coding efficiency is higher than that of the NAL unit headers shown in FIGS. 4, 5, and 6.
  • FIG. 12 is a diagram showing an example of the NAL unit header / extension part map showing the configuration of the extension part of the NAL unit header.
  • The NAL unit header / extension part map indicates the arrangement of the parameters included in the extension part of the NAL unit header, that is, which parameters are included in the extension part of the NAL unit header and the bit length of each parameter.
  • The NAL unit header / extension part map in FIG. 12 includes dependency_id_len, quality_id_len, temporal_id_len, and view_id_len. This indicates that the extension part of the NAL unit header includes the parameters dependency ID (dependency_id), quality ID (quality_id), view ID (view_id), and time ID (temporal_id), and the values of dependency_id_len, quality_id_len, temporal_id_len, and view_id_len indicate the bit length of each parameter.
  • The NAL unit header / extension part map is included in, for example, a sequence parameter set (SPS) or a picture parameter set (PPS) and sent to the decoder side, so that the decoder side can grasp the parameters included in the extension part of the NAL unit header.
  • FIG. 13 is a diagram showing a NAL unit header corresponding to the NAL unit header / extension part map shown in FIG.
  • the extension part of the NAL unit header includes parameters such as dependency ID (dependency_id), quality ID (quality_id), view ID (view_id), and time ID (temporal_id).
  • For example, while the dependency_id_len shown in FIG. 12 is “1”, the dependency ID (dependency_id) in the NAL unit header shown in FIG. 13 becomes “0” or “1” depending on whether the data included in the NAL unit is the base layer or an extension layer.
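  • A hedged sketch of map-driven parsing: the lengths from FIG. 12 tell the decoder how many bits to read for each parameter of the variable-length extension part of FIG. 13. BitReader and read_bits() are the assumed helpers from the previous sketch, the other names are illustrative, and treating a length of 0 as "parameter absent" is an assumption.
```c
typedef struct { /* NAL unit header / extension part map (FIG. 12) */
    unsigned dependency_id_len, quality_id_len, temporal_id_len, view_id_len;
} NalHeaderExtensionMap;

typedef struct { /* variable-length extension part (FIG. 13) */
    unsigned dependency_id, quality_id, view_id, temporal_id;
} VariableLengthExtension;

/* Read each parameter with the bit length given by the map; a length
 * of 0 is taken to mean the parameter is not present in the header. */
void parse_variable_extension(BitReader *br, const NalHeaderExtensionMap *m,
                              VariableLengthExtension *out)
{
    out->dependency_id = m->dependency_id_len ? read_bits(br, m->dependency_id_len) : 0;
    out->quality_id    = m->quality_id_len    ? read_bits(br, m->quality_id_len)    : 0;
    out->view_id       = m->view_id_len       ? read_bits(br, m->view_id_len)       : 0;
    out->temporal_id   = m->temporal_id_len   ? read_bits(br, m->temporal_id_len)   : 0;
}
```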
  • Although a description of an image encoding device or an image decoding device corresponding to the modification of the NAL unit header shown in FIGS. 12 and 13 is omitted, it is obvious that the image encoding device or the image decoding device described in the present embodiment can be adapted to this modification of the NAL unit header with appropriate changes.
  • As described above, the image encoding device or the image decoding device according to Embodiment 1 can process a NAL unit header that includes the MVC extension part and the SVC extension part at the same time, or a NAL unit header that includes information corresponding to the MVC extension part and the SVC extension part at the same time, and can therefore realize encoding and decoding that combine multi-view video coding (MVC) and scalable video coding (SVC).
  • The image encoding device according to Embodiment 1 generates a prefix NAL unit arranged before the base NAL unit, whereas the image encoding device according to Embodiment 2 differs in that it does not generate a prefix NAL unit. Further, the image decoding device according to Embodiment 1 parses the prefix NAL unit arranged before the base NAL unit to acquire the NAL unit header extension parameters of the base data, whereas the image decoding device according to Embodiment 2 differs in that it calculates the NAL unit header extension parameters of the base data without parsing a prefix NAL unit.
  • The first advantage is improved backward compatibility with conventional decoders.
  • This is because, when a prefix NAL unit as shown in FIGS. 4, 5, 6, and 13 is placed in front of the base NAL unit, some decoders on the market cannot decode the base NAL unit due to the prefix NAL unit.
  • The second advantage is improved reusability of bitstreams generated based on the conventional standard. If the prefix NAL unit can be omitted, extension data encoded based on a standard different from the conventional standard can be added without any modification to a bitstream generated based on the conventional standard.
  • Here, the conventional standard is, for example, the AVC standard, including the MVC standard shown in FIG. 1 and the SVC standard shown in FIG. 2. A standard different from the conventional standard is, for example, the standard used by the image encoding device or the image decoding device according to Embodiment 1, shown in FIGS. 4, 5, 6, and 13.
  • FIG. 14 is a block diagram illustrating a configuration of an image encoding device according to the second embodiment.
  • the image encoding device 300 includes a base data encoding unit 101, an extended parameter specifying unit 301, an extended parameter calculating unit 302, an extended data encoding unit 103, a NAL unit header generating unit 104, and a NAL unit generating unit 105. .
  • The base data encoding unit 101, the extension data encoding unit 103, the NAL unit header generation unit 104, and the NAL unit generation unit 105 are the same as in the image encoding device 100 of Embodiment 1; the extension parameter specifying unit 301 and the extension parameter calculation unit 302 are different. Note that when the base data has already been encoded and only the extension data is encoded later, the base data encoding unit 101 is unnecessary. Components that are the same as those of the image encoding device 100 according to Embodiment 1 are denoted by the same reference numerals, and their description is omitted.
  • The extension parameter specifying unit 301 detects whether both multi-view video coding (MVC) and scalable video coding (SVC) are performed or only one of them is performed, and specifies the extension parameters of the NAL unit header accordingly.
  • While the extension parameter specifying unit 102 of Embodiment 1 specifies the NAL unit header extension parameters of both the base data and the extension data, the extension parameter specifying unit 301 of Embodiment 2 specifies only the NAL unit header extension parameters of the extension data.
  • the extension parameter calculation unit 302 calculates the NAL unit header extension parameter of the base data.
  • FIG. 15 is a flowchart showing processing of the image coding apparatus according to the second embodiment.
  • the base data encoding unit 101 encodes base data (step S301).
  • the NAL unit header generation unit 104 generates a NAL unit header (basic part) of base data (step S302).
  • Next, the NAL unit generation unit 105 inserts the base data encoded by the base data encoding unit 101 into the payload part of the NAL unit and inserts the NAL unit header generated by the NAL unit header generation unit 104 into the header part of the NAL unit, thereby generating a NAL unit including the base data (step S303). When base data is not encoded, the processing from step S301 to step S303 is unnecessary.
  • the extension parameter specifying unit 301 specifies the NAL unit header extension parameter of the extension data (step S304).
  • the extension parameter calculation unit 302 calculates the NAL unit header extension parameter of the base data based on the specified NAL unit header extension parameter of the extension data (step S305).
  • the extension data encoding unit 103 encodes extension data using the calculated extension parameter of the base data and the specified extension parameter of the extension data (step S306).
  • the NAL unit header generation unit 104 generates a NAL unit header (step S307).
  • the NAL unit header generation process is the same as that of the NAL unit header generation unit 104 of the first embodiment.
  • Finally, the NAL unit generation unit 105 inserts the extension data encoded by the extension data encoding unit 103 into the payload part of the NAL unit and inserts the NAL unit header generated by the NAL unit header generation unit 104 into the header part of the NAL unit, thereby generating a NAL unit including the extension data (step S308).
  • FIGS. 16A to 16D are diagrams illustrating an example of a configuration of a bit stream encoded by the image encoding device according to the second embodiment.
  • Some data including the base layer is generated based on the conventional standard, and other data is encoded based on a standard different from the conventional standard.
  • some data including the base layer is not necessarily encoded by the image encoding apparatus according to the second embodiment.
  • the image coding apparatus according to Embodiment 2 can be configured to encode only the extension data and add it to a bitstream that has already been encoded by another image encoding apparatus.
  • FIG. 16A is an example in which an enhancement layer (HEVC) is added to a base layer (AVC) generated based on a conventional standard.
  • FIG. 16B is an example in which an extended view (HEVC) is added to a base view (AVC) generated based on a conventional standard.
  • FIG. 16C is an example in which a base layer / extended view (HEVC) and an extension layer / extended view (HEVC) are added as extended views to a base view composed of a base layer / base view (AVC) and an extension layer / base view (AVC) generated based on the conventional standard (SVC standard).
  • An AVC decoder compliant with the SVC standard can decode a NAL unit including a base layer base view (AVC) and an enhancement layer base view (AVC) generated based on a conventional standard (SVC standard).
  • FIG. 16D is an example in which an extension layer / base view (HEVC) and an extension layer / extended view (HEVC) are added as extension layers to a base layer composed of a base layer / base view (AVC) and a base layer / extended view (AVC) generated based on the conventional standard (MVC standard).
  • the AVC decoder conforming to the MVC standard can decode a NAL unit including a base layer base view (AVC) and a base layer extended view (AVC) generated based on a conventional standard (MVC standard).
  • FIG. 17 is a block diagram showing the configuration of the image decoding apparatus according to the second embodiment.
  • As shown in FIG. 17, the image decoding device 400 includes a base data analysis unit 201, a base data decoding unit 202, an extension data analysis unit 204, an extension parameter calculation unit 401, and an extension data decoding unit 205.
  • The configuration other than the extension parameter calculation unit 401 is the same as that of the image decoding device 200 according to Embodiment 1 shown in FIG. 10; the prefix analysis unit 203 is replaced with the extension parameter calculation unit 401.
  • In Embodiment 1, the prefix analysis unit 203 parses the NAL unit header of the prefix NAL unit included in the bitstream and acquires the NAL unit header extension parameters of the base data.
  • In contrast, the extension parameter calculation unit 401 calculates the NAL unit header extension parameters of the base data when no prefix NAL unit is included in the bitstream.
  • FIG. 18 is a flowchart showing processing of the image decoding apparatus according to the second embodiment.
  • the base data analysis unit 201 performs syntax analysis on the NAL unit of the base data (step S401).
  • the base data decoding unit 202 decodes the base data (step S402).
  • the extended data analysis unit 204 parses the extended data NAL unit (step S403). Thereby, the extension data analysis unit 204 acquires the NAL unit header extension parameter of the extension data.
  • the extension parameter calculation unit 401 calculates a NAL unit header extension parameter of base data based on the acquired NAL unit header extension parameter of extension data (step S404).
  • the extension data decoding unit 205 decodes extension data using the calculated extension parameter of the base data and the acquired extension parameter of the extension data (step S405).
  • FIG. 19 is a flowchart showing a process of calculating the NAL unit header extension parameter of the base data by the extension parameter calculation unit 401 in the second embodiment.
  • the extended parameter calculation unit 302 of the image encoding device 300 shown in FIG. 14 also executes the same processing.
  • First, the extension parameter calculation unit 401 refers to the value of svc_mvc_extension_flag obtained by the parsing of the extension data analysis unit 204 (step S501) and determines the extension parameters to be calculated (step S502). If the value of svc_mvc_extension_flag is “2” (MVC decoding in step S502), the MVC extension parameters are calculated (step S503). If the value of svc_mvc_extension_flag is “1” (SVC decoding in step S502), the SVC extension parameters are calculated (step S504). If the value of svc_mvc_extension_flag is “3” (MVC/SVC decoding in step S502), the MVC extension parameters and the SVC extension parameters are calculated (step S505).
  • Note that the extension parameters to be calculated can also be determined by referring to the NAL unit header extension parameters of the extension data obtained by the parsing of the extension data analysis unit 204. Specifically, when only parameters unique to the NAL unit header MVC extension part (such as view_id) are present among the NAL unit header extension parameters, it is determined that the MVC extension parameters are to be calculated.
  • Next, the calculation processing of the MVC extension parameters in step S503 of FIG. 19 will be specifically described.
  • It is assumed that the NAL unit header MVC extension parameters of the extended view have already been acquired by the parsing of the extension data analysis unit 204.
  • FIG. 20 is a flowchart showing processing for calculating the NAL unit header MVC extension parameter of the base view by the extension parameter calculation unit 401 in the second embodiment.
  • these parameters include a non-IDR flag (non_idr_flag), a priority ID (priority_id), a view ID (view_id), a time ID (temporal_id), an anchor picture flag (anchor_pic_flag), and an inter-view prediction flag (inter_view_flag).
  • the extension parameter calculation unit 401 reads the value of the non-IDR flag (non_idr_flag) from the NAL unit header MVC extension parameter of the extension view (step S601).
  • the extended parameter calculation unit 401 assigns the value of the non-IDR flag (non_idr_flag) of the extended view to the non-IDR flag (non_idr_flag) of the base view (step S602).
  • Next, the extension parameter calculation unit 401 assigns a predetermined value to the priority ID (priority_id) of the base view (step S603). For example, the predetermined value of the priority ID (priority_id) is “0”.
  • Next, the extension parameter calculation unit 401 assigns a predetermined value to the view ID (view_id) of the base view (step S604). For example, the predetermined value of the view ID (view_id) is also “0”.
  • the extended parameter calculation unit 401 acquires the value of the extended view time ID (temporal_id) from the extended view NAL unit header MVC extended parameter (step S605).
  • the extended parameter calculation unit 401 assigns the value of the acquired time ID (temporal_id) of the extended view to the time ID (temporal_id) of the base view (step S606).
  • the extension parameter calculation unit 401 acquires the value of the anchor picture flag (anchor_pic_flag) from the NAL unit header MVC extension parameter of the extension view (step S607).
  • the extended parameter calculation unit 401 assigns the value of the anchor picture flag (anchor_pic_flag) acquired for the extended view to the anchor picture flag (anchor_pic_flag) of the base view (step S608).
  • Finally, the extension parameter calculation unit 401 sets a predetermined value in the inter-view prediction flag (inter_view_flag) of the base view (step S609). For example, the predetermined value of the inter-view prediction flag (inter_view_flag) of the base view is “1”.
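  • Collecting steps S601 to S609 into one place, the inference can be sketched as the following C function over the MVC extension struct sketched earlier; the function name is an illustrative assumption, and the constants 0, 0, and 1 are the predetermined values named above.
```c
/* Sketch of FIG. 20: derive the base view's NAL unit header MVC
 * extension parameters from those of the extended view when the
 * prefix NAL unit is absent from the bitstream. */
void infer_base_view_mvc_params(const NalUnitHeaderMvcExtension *ext_view,
                                NalUnitHeaderMvcExtension *base_view)
{
    base_view->non_idr_flag    = ext_view->non_idr_flag;    /* S601-S602 */
    base_view->priority_id     = 0;                         /* S603: predetermined */
    base_view->view_id         = 0;                         /* S604: predetermined */
    base_view->temporal_id     = ext_view->temporal_id;     /* S605-S606 */
    base_view->anchor_pic_flag = ext_view->anchor_pic_flag; /* S607-S608 */
    base_view->inter_view_flag = 1;                         /* S609: predetermined */
}
```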
  • As described above, even if a prefix NAL unit is not included in the bitstream, the image decoding device 400 can calculate the NAL unit header MVC extension parameters of the base view based on the NAL unit header MVC extension parameters of the extended view.
  • The above is an example of calculating the NAL unit header MVC extension parameters of the base view (AVC) in the bitstream shown in FIG. 16B; based on the same idea, in the bitstream shown in FIG. 16A, the NAL unit header SVC extension parameters of the base layer (AVC) can be calculated based on the NAL unit header SVC extension parameters of the enhancement layer (HEVC).
  • Similarly, the NAL unit header SVC extension parameters and the NAL unit header MVC extension parameters of the base layer / base view (AVC) in the bitstreams shown in FIG. 16C and FIG. 16D can be calculated based on the same concept.
  • In FIG. 16C, the NAL unit header extension parameters of the extension layer / base view (AVC) do not include the NAL unit header MVC extension parameters; however, the NAL unit header MVC extension parameters of the extension layer / base view (AVC) can be calculated based on the NAL unit header MVC extension parameters of the extension layer / extended view (HEVC).
  • Likewise, in FIG. 16D, the NAL unit header extension parameters of the base layer / extended view (AVC) do not include the NAL unit header SVC extension parameters; however, the NAL unit header SVC extension parameters of the base layer / extended view (AVC) can be calculated based on the NAL unit header SVC extension parameters of the extension layer / extended view (HEVC).
  • Note that even when a prefix NAL unit is included in the bitstream, the image decoding device according to Embodiment 2 operates appropriately. Specifically, even if a prefix NAL unit is present, it can operate as described in Embodiment 2 on the assumption that the prefix NAL unit does not exist.
  • Alternatively, the prefix analysis unit 203 of the image decoding device according to Embodiment 1 may additionally be provided; when a prefix NAL unit is present, the NAL unit header extension parameters of the base data can be acquired by having the prefix analysis unit 203 parse the NAL unit header of the prefix NAL unit, without operating the extension parameter calculation unit 401.
  • As described above, when performing encoding that combines multi-view video coding (MVC) and scalable video coding (SVC), the image coding apparatus can omit the prefix NAL unit and generate a bitstream that does not include the prefix NAL unit.
  • Accordingly, the bitstream generated based on the conventional standard is not modified at all; only the extension data is encoded and added based on a different standard.
  • Even when decoding a bitstream without a prefix NAL unit, the image decoding apparatus according to the second embodiment can determine which of the information corresponding to the MVC extension unit and the information corresponding to the SVC extension unit has been omitted, and can calculate the omitted information.
  • The image encoding device and the image decoding device according to one or more aspects of the present invention have been described above based on the embodiments, but the present invention is not limited to these embodiments. Unless they depart from the gist of the present invention, forms obtained by applying various modifications conceived by those skilled in the art to the present embodiments, and forms constructed by combining components of different embodiments, may also be included within the scope of one or more aspects of the present invention.
  • each component may be configured by dedicated hardware or may be realized by executing a software program suitable for each component.
  • Each component may be realized by a program execution unit such as a CPU or a processor reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory.
  • the software that realizes the image encoding device or the image decoding device according to each of the above embodiments is the following program.
  • That is, this program causes a computer to execute an image encoding method for encoding base image data and extended image data by at least one of scalable video encoding and multi-view video encoding, the method including: a base data encoding step of encoding the base image data; an extension parameter specifying step of specifying an extension parameter to be used for encoding the extended image data, according to whether both the scalable video encoding and the multi-view video encoding are performed or only one of them is performed; an extension data encoding step of encoding the extended image data using the extension parameter by the encoding that is performed; and a header generation step of generating a NAL unit header including the extension parameter.
  • Alternatively, the program causes a computer to execute an image decoding method for decoding encoded base image data and extended image data that have been encoded by combining scalable video encoding and multi-view video encoding, the method including: a base data decoding step of decoding the base image data; a parsing step of parsing a NAL unit header that includes, as parameters, information for specifying a view and information for specifying a layer; a specifying step of specifying the extended image data from the information specifying the view and the information specifying the layer; and an extended data decoding step of decoding the extended image data.
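As a rough illustration of the parsing and specifying steps, the sketch below models a NAL unit header extension carrying both a layer identifier and a view identifier. The field names (dependency_id for the layer, view_id for the view) and the convention that the base data uses the value 0 for both are assumptions made for illustration; the actual bit layout is defined by the respective standards.

```python
from dataclasses import dataclass

@dataclass
class NalUnitHeaderExtension:
    view_id: int        # information specifying the view (MVC side)
    dependency_id: int  # information specifying the layer (SVC side)
    temporal_id: int

def is_extended_image_data(hdr: NalUnitHeaderExtension) -> bool:
    """Extended image data is identified when the view or the layer
    differs from the base (assumed here to be view 0, layer 0)."""
    return hdr.view_id != 0 or hdr.dependency_id != 0
```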
  • the storage medium may be any medium that can record a program, such as a magnetic disk, an optical disk, a magneto-optical disk, an IC card, and a semiconductor memory.
  • The system includes an image encoding/decoding device composed of an image encoding device using the image encoding method and an image decoding device using the image decoding method.
  • Other configurations in the system can be changed as appropriate according to circumstances.
  • FIG. 21 is a diagram showing an overall configuration of a content supply system ex100 that realizes a content distribution service.
  • a communication service providing area is divided into desired sizes, and base stations ex106, ex107, ex108, ex109, and ex110, which are fixed wireless stations, are installed in each cell.
  • In this content supply system ex100, devices such as a computer ex111, a PDA (Personal Digital Assistant) ex112, a camera ex113, a mobile phone ex114, and a game machine ex115 are connected to the Internet ex101 via an Internet service provider ex102, a telephone network ex104, and the base stations ex106 to ex110.
  • Each device may be directly connected to the telephone network ex104 without going through the base stations ex106 to ex110, which are fixed wireless stations.
  • the devices may be directly connected to each other via short-range wireless or the like.
  • the camera ex113 is a device that can shoot moving images such as a digital video camera
  • the camera ex116 is a device that can shoot still images and movies such as a digital camera.
  • The mobile phone ex114 may be any of a GSM (registered trademark) (Global System for Mobile Communications) system, a CDMA (Code Division Multiple Access) system, a W-CDMA (Wideband-Code Division Multiple Access) system, an LTE (Long Term Evolution) system, an HSPA (High Speed Packet Access) mobile phone, a PHS (Personal Handyphone System), or the like.
  • the camera ex113 and the like are connected to the streaming server ex103 through the base station ex109 and the telephone network ex104, thereby enabling live distribution and the like.
  • In live distribution, content shot by a user using the camera ex113 (for example, video of a live music performance) is encoded as described in each of the above embodiments (that is, the camera functions as an image encoding device according to one aspect of the present invention) and transmitted to the streaming server ex103.
  • Meanwhile, the streaming server ex103 distributes the transmitted content data as a stream to clients that have made requests. Examples of the client include a computer ex111, a PDA ex112, a camera ex113, a mobile phone ex114, and a game machine ex115 that can decode the encoded data.
  • Each device that receives the distributed data decodes the received data and reproduces it (that is, functions as an image decoding device according to one embodiment of the present invention).
  • The encoding processing of the captured data may be performed by the camera ex113, by the streaming server ex103 that performs the data transmission processing, or may be shared between them.
  • Similarly, the decoding processing of the distributed data may be performed by the client, by the streaming server ex103, or may be shared between them.
  • still images and / or moving image data captured by the camera ex116 may be transmitted to the streaming server ex103 via the computer ex111.
  • the encoding process in this case may be performed by any of the camera ex116, the computer ex111, and the streaming server ex103, or may be performed in a shared manner.
  • These encoding/decoding processes are generally performed in the LSI ex500 included in the computer ex111 or in each device.
  • the LSI ex500 may be configured as a single chip or a plurality of chips.
  • Alternatively, software for moving image encoding/decoding may be incorporated into some recording medium (a CD-ROM, a flexible disk, a hard disk, or the like) readable by the computer ex111 or the like, and the encoding/decoding processing may be performed using that software.
  • Moving image data acquired by the camera of the mobile phone ex114 may also be transmitted; the moving image data at this time is data encoded by the LSI ex500 included in the mobile phone ex114.
  • the streaming server ex103 may be a plurality of servers or a plurality of computers, and may process, record, and distribute data in a distributed manner.
  • In this way, the encoded data can be received and reproduced by the client.
  • Thus, the information transmitted by the user can be received, decoded, and reproduced by the client in real time, and even a user who does not have special rights or facilities can realize personal broadcasting.
  • At least the moving image encoding device (image encoding device) or the moving image decoding device (image decoding device) according to each of the above embodiments can also be incorporated into the digital broadcast system ex200.
  • Specifically, at the broadcast station ex201, multiplexed data obtained by multiplexing music data or the like with video data is transmitted via radio waves to a communication or broadcasting satellite ex202.
  • This video data is data encoded by the moving image encoding method described in each of the above embodiments (that is, data encoded by the image encoding apparatus according to one aspect of the present invention).
  • the broadcasting satellite ex202 transmits a radio wave for broadcasting, and this radio wave is received by a home antenna ex204 capable of receiving satellite broadcasting.
  • the received multiplexed data is decoded and reproduced by an apparatus such as the television (receiver) ex300 or the set top box (STB) ex217 (that is, functions as an image decoding apparatus according to one embodiment of the present invention).
  • Further, the moving picture decoding apparatus or the moving picture encoding apparatus described in each of the above embodiments can be mounted in a reader/recorder ex218 that reads and decodes multiplexed data recorded on a recording medium ex215 such as a DVD or a BD, or that encodes a video signal onto the recording medium ex215 and, in some cases, multiplexes it with a music signal and writes it. In this case, the reproduced video signal is displayed on a monitor ex219, and the video signal can be reproduced by another device or system using the recording medium ex215 on which the multiplexed data is recorded.
  • The moving picture decoding apparatus may also be mounted in a set-top box ex217 connected to a cable ex203 for cable television or to an antenna ex204 for satellite/terrestrial broadcasting, and the video may be displayed on the monitor ex219 of the television.
  • the moving picture decoding apparatus may be incorporated in the television instead of the set top box.
  • FIG. 23 is a diagram illustrating a television (receiver) ex300 that uses the video decoding method and the video encoding method described in each of the above embodiments.
  • The television ex300 includes: a unit that obtains or outputs, via the antenna ex204 or the cable ex203 that receives the broadcast, multiplexed data in which audio data is multiplexed with video data; a modulation/demodulation unit ex302 that demodulates the received multiplexed data or modulates multiplexed data to be transmitted to the outside; and a multiplexing/demultiplexing unit ex303 that demultiplexes the demodulated multiplexed data into video data and audio data, or multiplexes video data and audio data encoded by a signal processing unit ex306.
  • The television ex300 also includes: the signal processing unit ex306, which has an audio signal processing unit ex304 and a video signal processing unit ex305 that decode audio data and video data, respectively, or encode the respective information (the signal processing unit ex306 functions as the image encoding device or the image decoding device according to one aspect of the present invention); and an output unit ex309, which has a speaker ex307 that outputs the decoded audio signal and a display unit ex308, such as a display, that displays the decoded video signal. Furthermore, the television ex300 includes an interface unit ex317 having an operation input unit ex312 that receives input of user operations, a control unit ex310 that performs overall control of each unit, and a power supply circuit unit ex311 that supplies power to each unit.
  • The interface unit ex317 may also include a bridge unit ex313 connected to an external device such as the reader/recorder ex218, a slot for attaching a recording medium ex216 such as an SD card, a driver ex315 for connecting to an external recording medium such as a hard disk, a modem ex316 for connecting to a telephone network, and the like.
  • The recording medium ex216 is capable of electrically recording information by means of a nonvolatile/volatile semiconductor memory element stored in it.
  • Each part of the television ex300 is connected to each other via a synchronous bus.
  • the television ex300 receives a user operation from the remote controller ex220 or the like, and demultiplexes the multiplexed data demodulated by the modulation / demodulation unit ex302 by the multiplexing / demultiplexing unit ex303 based on the control of the control unit ex310 having a CPU or the like. Furthermore, in the television ex300, the separated audio data is decoded by the audio signal processing unit ex304, and the separated video data is decoded by the video signal processing unit ex305 using the decoding method described in each of the above embodiments.
  • the decoded audio signal and video signal are output from the output unit ex309 to the outside. At the time of output, these signals may be temporarily stored in the buffers ex318, ex319, etc. so that the audio signal and the video signal are reproduced in synchronization. Also, the television ex300 may read multiplexed data from recording media ex215 and ex216 such as a magnetic / optical disk and an SD card, not from broadcasting. Next, a configuration in which the television ex300 encodes an audio signal or a video signal and transmits the signal to the outside or to a recording medium will be described.
  • Further, the television ex300 receives a user operation from the remote controller ex220 or the like, and, based on the control of the control unit ex310, encodes an audio signal with the audio signal processing unit ex304 and encodes a video signal with the video signal processing unit ex305 using the encoding method described in each of the above embodiments.
  • the encoded audio signal and video signal are multiplexed by the multiplexing / demultiplexing unit ex303 and output to the outside. When multiplexing, these signals may be temporarily stored in the buffers ex320, ex321, etc. so that the audio signal and the video signal are synchronized.
  • A plurality of buffers ex318, ex319, ex320, and ex321 may be provided as illustrated, or one or more buffers may be shared. Further, in addition to the illustrated example, data may be stored in a buffer between, for example, the modulation/demodulation unit ex302 and the multiplexing/demultiplexing unit ex303, as a buffering mechanism that prevents system overflow and underflow.
  • The television ex300 may also have a configuration for receiving AV input from a microphone or a camera and may perform encoding processing on the data acquired from them.
  • Here, the television ex300 has been described as a configuration capable of the above-described encoding processing, multiplexing, and external output; however, it may be a configuration in which these processes cannot be performed and only the above-described reception, decoding processing, and external output are possible.
  • The decoding processing or the encoding processing may be performed by either the television ex300 or the reader/recorder ex218, or the television ex300 and the reader/recorder ex218 may share the processing with each other.
  • FIG. 24 shows the configuration of the information reproducing / recording unit ex400 when data is read from or written to the optical disk.
  • the information reproducing / recording unit ex400 includes elements ex401, ex402, ex403, ex404, ex405, ex406, and ex407 described below.
  • the optical head ex401 irradiates a laser spot on the recording surface of the recording medium ex215 that is an optical disk to write information, and detects information reflected from the recording surface of the recording medium ex215 to read the information.
  • the modulation recording unit ex402 electrically drives a semiconductor laser built in the optical head ex401 and modulates the laser beam according to the recording data.
  • The reproduction demodulation unit ex403 amplifies a reproduction signal obtained by electrically detecting, with a photodetector built into the optical head ex401, the light reflected from the recording surface, separates and demodulates the signal component recorded on the recording medium ex215, and reproduces the necessary information.
  • the buffer ex404 temporarily holds information to be recorded on the recording medium ex215 and information reproduced from the recording medium ex215.
  • the disk motor ex405 rotates the recording medium ex215.
  • the servo control unit ex406 moves the optical head ex401 to a predetermined information track while controlling the rotational drive of the disk motor ex405, and performs a laser spot tracking process.
  • the system control unit ex407 controls the entire information reproduction / recording unit ex400.
  • For example, the system control unit ex407 performs recording and reproduction by using various types of information held in the buffer ex404, generating and adding new information as necessary, and recording/reproducing information through the optical head ex401 while operating the modulation recording unit ex402, the reproduction demodulation unit ex403, and the servo control unit ex406 in a coordinated manner.
  • the system control unit ex407 includes, for example, a microprocessor, and executes these processes by executing a read / write program.
  • In the above description, the optical head ex401 has been described as irradiating a laser spot, but a configuration that performs higher-density recording using near-field light may be used.
  • FIG. 25 shows a schematic diagram of a recording medium ex215 that is an optical disk.
  • Guide grooves (grooves) are formed on the recording surface of the recording medium ex215, and address information indicating the absolute position on the disc is recorded in advance on an information track ex230 by changes in the shape of the groove.
  • This address information includes information for specifying the position of a recording block ex231, which is a unit for recording data; an apparatus that performs recording or reproduction can specify the recording block by reproducing the information track ex230 and reading the address information.
  • the recording medium ex215 includes a data recording area ex233, an inner peripheral area ex232, and an outer peripheral area ex234.
  • The area used for recording user data is the data recording area ex233; the inner circumference area ex232 and the outer circumference area ex234, arranged on the inner and outer circumferences of the data recording area ex233, are used for specific purposes other than user data recording.
  • the information reproducing / recording unit ex400 reads / writes encoded audio data, video data, or multiplexed data obtained by multiplexing these data with respect to the data recording area ex233 of the recording medium ex215.
  • In the above, an optical disk such as a single-layer DVD or BD has been described as an example, but the present invention is not limited to these; an optical disk that has a multilayer structure and can record on areas other than the surface may be used.
  • An optical disc with a multi-dimensional recording/reproducing structure may also be used, such as one that records information using light of various different wavelengths at the same place on the disc, or that records layers of different information from various angles.
  • the car ex210 having the antenna ex205 can receive data from the satellite ex202 and the like, and the moving image can be reproduced on a display device such as the car navigation ex211 that the car ex210 has.
  • the configuration of the car navigation ex211 may be, for example, a configuration in which a GPS receiving unit is added in the configuration illustrated in FIG. 23, and the same may be considered for the computer ex111, the mobile phone ex114, and the like.
  • FIG. 26A is a diagram showing the mobile phone ex114 using the moving picture decoding method and the moving picture encoding method described in the above embodiment.
  • The mobile phone ex114 includes an antenna ex350 for transmitting and receiving radio waves to and from the base station ex110, a camera unit ex365 capable of capturing video and still images, and a display unit ex358 such as a liquid crystal display that displays data obtained by decoding the video captured by the camera unit ex365, the video received by the antenna ex350, and the like.
  • The mobile phone ex114 further includes: a main body unit having an operation key unit ex366; an audio output unit ex357 such as a speaker for outputting audio; an audio input unit ex356 such as a microphone for inputting audio; a memory unit ex367 that stores encoded or decoded data of captured video, still images, recorded audio, received video, still images, mails, and the like; and a slot unit ex364 serving as an interface unit with a recording medium that likewise stores data.
  • In the mobile phone ex114, a power supply circuit unit ex361, an operation input control unit ex362, a video signal processing unit ex355, a camera interface unit ex363, an LCD (Liquid Crystal Display) control unit ex359, a modulation/demodulation unit ex352, a multiplexing/demultiplexing unit ex353, an audio signal processing unit ex354, the slot unit ex364, and the memory unit ex367 are connected to one another via a bus ex370, together with a main control unit ex360 that comprehensively controls each unit of the main body including the display unit ex358 and the operation key unit ex366.
  • the power supply circuit unit ex361 starts up the mobile phone ex114 in an operable state by supplying power from the battery pack to each unit.
  • the cellular phone ex114 converts the audio signal collected by the audio input unit ex356 in the voice call mode into a digital audio signal by the audio signal processing unit ex354 based on the control of the main control unit ex360 having a CPU, a ROM, a RAM, and the like. Then, this is subjected to spectrum spread processing by the modulation / demodulation unit ex352, digital-analog conversion processing and frequency conversion processing are performed by the transmission / reception unit ex351, and then transmitted via the antenna ex350.
  • The mobile phone ex114 also amplifies the received data received via the antenna ex350 in the voice call mode, performs frequency conversion processing and analog-digital conversion processing on it, performs spectrum despreading processing in the modulation/demodulation unit ex352, converts it into an analog audio signal in the audio signal processing unit ex354, and then outputs it from the audio output unit ex357.
  • the text data of the e-mail input by operating the operation key unit ex366 of the main unit is sent to the main control unit ex360 via the operation input control unit ex362.
  • the main control unit ex360 performs spread spectrum processing on the text data in the modulation / demodulation unit ex352, performs digital analog conversion processing and frequency conversion processing in the transmission / reception unit ex351, and then transmits the text data to the base station ex110 via the antenna ex350.
  • When an e-mail is received, substantially the reverse processing is performed on the received data, and the result is output to the display unit ex358.
  • When transmitting video, still images, or the like in the data communication mode, the video signal processing unit ex355 compresses and encodes the video signal supplied from the camera unit ex365 by the moving image encoding method described in each of the above embodiments (that is, it functions as an image encoding device according to one aspect of the present invention), and sends the encoded video data to the multiplexing/demultiplexing unit ex353.
  • At the same time, the audio signal processing unit ex354 encodes the audio signal picked up by the audio input unit ex356 while the camera unit ex365 is capturing the video, still images, or the like, and sends the encoded audio data to the multiplexing/demultiplexing unit ex353.
  • The multiplexing/demultiplexing unit ex353 multiplexes the encoded video data supplied from the video signal processing unit ex355 and the encoded audio data supplied from the audio signal processing unit ex354 by a predetermined method; the resulting multiplexed data is subjected to spread spectrum processing by the modulation/demodulation unit (modulation/demodulation circuit unit) ex352 and to digital-analog conversion processing and frequency conversion processing by the transmission/reception unit ex351, and is then transmitted via the antenna ex350.
  • When receiving data of a moving image file linked to a home page or the like, the multiplexing/demultiplexing unit ex353 demultiplexes the multiplexed data into a bit stream of video data and a bit stream of audio data, supplies the encoded video data to the video signal processing unit ex355 via the synchronization bus ex370, and supplies the encoded audio data to the audio signal processing unit ex354.
  • The video signal processing unit ex355 decodes the video signal using a video decoding method corresponding to the video encoding method described in each of the above embodiments (that is, it functions as an image decoding device according to one aspect of the present invention), and video and still images included in the moving image file linked to the home page are displayed on the display unit ex358 via the LCD control unit ex359.
  • the audio signal processing unit ex354 decodes the audio signal, and the audio is output from the audio output unit ex357.
  • Like the television ex300, a terminal such as the mobile phone ex114 can take three implementation forms: a transmission/reception terminal having both an encoder and a decoder, a transmission terminal having only an encoder, and a receiving terminal having only a decoder.
  • In the digital broadcasting system ex200, multiplexed data in which music data or the like is multiplexed with video data has been described as being received and transmitted; however, the data may be data in which character data or the like related to the video is multiplexed in addition to the audio data, or may be the video data itself instead of multiplexed data.
  • As described above, the moving picture encoding method or the moving picture decoding method described in each of the above embodiments can be used in any of the above-described devices and systems, and by doing so, the effects described in each of the above embodiments can be obtained.
  • (Embodiment 4) It is also possible to generate video data by appropriately switching, as necessary, between the moving picture encoding method or apparatus described in each of the above embodiments and a moving picture encoding method or apparatus compliant with a different standard such as MPEG-2, MPEG4-AVC, or VC-1.
  • Here, multiplexed data obtained by multiplexing audio data or the like with video data is configured to include identification information indicating which standard the video data conforms to.
  • FIG. 27 is a diagram showing a structure of multiplexed data.
  • multiplexed data is obtained by multiplexing one or more of a video stream, an audio stream, a presentation graphics stream (PG), and an interactive graphics stream.
  • The video stream indicates the main video and sub-video of the movie, the audio stream indicates the main audio part of the movie and the sub-audio to be mixed with the main audio, and the presentation graphics stream indicates the subtitles of the movie.
  • Here, the main video indicates the normal video displayed on the screen, and the sub-video is video displayed on a smaller screen within the main video.
  • the interactive graphics stream indicates an interactive screen created by arranging GUI components on the screen.
  • The video stream is encoded by the moving image encoding method or apparatus described in the above embodiments, or by a moving image encoding method or apparatus compliant with a conventional standard such as MPEG-2, MPEG4-AVC, or VC-1.
  • the audio stream is encoded by a method such as Dolby AC-3, Dolby Digital Plus, MLP, DTS, DTS-HD, or linear PCM.
  • Each stream included in the multiplexed data is identified by a PID. For example:
  • 0x1011 is assigned to the video stream used for the main video of the movie;
  • 0x1100 to 0x111F are assigned to the audio streams;
  • 0x1200 to 0x121F are assigned to the presentation graphics streams;
  • 0x1400 to 0x141F are assigned to the interactive graphics streams;
  • 0x1B00 to 0x1B1F are assigned to video streams used for the sub-video;
  • 0x1A00 to 0x1A1F are assigned to audio streams used for the sub-audio to be mixed with the main audio.
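The assignment above amounts to a simple range check. A minimal sketch (the ranges are those listed; anything else is treated here as system information such as the PAT, PMT, or PCR):

```python
def classify_pid(pid: int) -> str:
    """Map a PID to one of the stream categories listed above."""
    if pid == 0x1011:
        return "video (main video)"
    if 0x1100 <= pid <= 0x111F:
        return "audio (main audio)"
    if 0x1200 <= pid <= 0x121F:
        return "presentation graphics"
    if 0x1400 <= pid <= 0x141F:
        return "interactive graphics"
    if 0x1B00 <= pid <= 0x1B1F:
        return "video (sub-video)"
    if 0x1A00 <= pid <= 0x1A1F:
        return "audio (sub-audio mixed with main audio)"
    return "other (e.g., PAT/PMT/PCR)"
```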
  • FIG. 28 is a diagram schematically showing how multiplexed data is multiplexed.
  • a video stream ex235 composed of a plurality of video frames and an audio stream ex238 composed of a plurality of audio frames are converted into PES packet sequences ex236 and ex239, respectively, and converted into TS packets ex237 and ex240.
  • Similarly, the data of the presentation graphics stream ex241 and the interactive graphics stream ex244 are converted into PES packet sequences ex242 and ex245, respectively, and further converted into TS packets ex243 and ex246.
  • the multiplexed data ex247 is configured by multiplexing these TS packets into one stream.
  • FIG. 29 shows in more detail how the video stream is stored in the PES packet sequence.
  • the first row in FIG. 29 shows a video frame sequence of the video stream.
  • the second level shows a PES packet sequence.
  • I pictures, B pictures, and P pictures, which are the Video Presentation Units in the video stream, are each divided picture by picture and stored in the payloads of PES packets.
  • Each PES packet has a PES header, and a PTS (Presentation Time-Stamp) that is a display time of a picture and a DTS (Decoding Time-Stamp) that is a decoding time of a picture are stored in the PES header.
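The relationship between the two time stamps can be sketched as follows. This is illustrative only; the 90 kHz clock mentioned in the comments is the customary MPEG systems time base, and the check simply encodes the fact that a picture cannot be displayed before it is decoded.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PesPacket:
    pts: Optional[int]  # Presentation Time-Stamp (customarily in 90 kHz ticks)
    dts: Optional[int]  # Decoding Time-Stamp (may be absent when equal to the PTS)
    payload: bytes      # picture data stored in the PES payload

def timestamps_consistent(packets: list[PesPacket]) -> bool:
    """A picture must be decoded no later than it is displayed: DTS <= PTS."""
    return all(p.dts is None or p.pts is None or p.dts <= p.pts
               for p in packets)
```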
  • FIG. 30 shows the format of the TS packet that is finally written in the multiplexed data.
  • the TS packet is a 188-byte fixed-length packet composed of a 4-byte TS header having information such as a PID for identifying a stream and a 184-byte TS payload for storing data.
  • the PES packet is divided and stored in the TS payload.
  • Further, a 4-byte TP_Extra_Header is attached to each TS packet to form a 192-byte source packet, which is written into the multiplexed data.
  • Information such as an ATS (Arrival_Time_Stamp) is described in the TP_Extra_Header.
  • The ATS indicates the transfer start time of the TS packet to the PID filter of the decoder.
  • Source packets are arranged in the multiplexed data as shown in the lower part of FIG. 30, and the number incrementing from the head of the multiplexed data is called the SPN (source packet number).
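A sketch of the source packet framing described above. The internal bit layout of the TP_Extra_Header assumed here (the ATS in the low 30 bits, the remaining top bits left zero) is an illustration, not the normative definition.

```python
TS_PACKET_SIZE = 188      # 4-byte TS header + 184-byte TS payload
SOURCE_PACKET_SIZE = 192  # 4-byte TP_Extra_Header + 188-byte TS packet

def make_source_packet(ats: int, ts_packet: bytes) -> bytes:
    """Prepend a TP_Extra_Header carrying the ATS to a TS packet."""
    assert len(ts_packet) == TS_PACKET_SIZE
    return (ats & 0x3FFF_FFFF).to_bytes(4, "big") + ts_packet

def spn_of(byte_offset: int) -> int:
    """SPN: the source packet number counted from the head of the data."""
    return byte_offset // SOURCE_PACKET_SIZE
```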
  • TS packets included in the multiplexed data include PAT (Program Association Table), PMT (Program Map Table), PCR (Program Clock Reference), and the like in addition to each stream such as video / audio / caption.
  • PAT indicates what the PID of the PMT used in the multiplexed data is, and the PID of the PAT itself is registered as 0.
  • the PMT has the PID of each stream such as video / audio / subtitles included in the multiplexed data and the attribute information of the stream corresponding to each PID, and has various descriptors related to the multiplexed data.
  • the descriptor includes copy control information for instructing permission / non-permission of copying of multiplexed data.
  • The PCR has information on the STC (System Time Clock) time corresponding to the ATS at which the PCR packet is transferred to the decoder.
  • FIG. 31 is a diagram for explaining the data structure of the PMT in detail.
  • A PMT header describing the length of the data included in the PMT is arranged at the head of the PMT.
  • Behind the PMT header, a plurality of descriptors related to the multiplexed data are arranged.
  • The copy control information and the like are described as such descriptors.
  • After the descriptors, a plurality of pieces of stream information regarding each stream included in the multiplexed data are arranged.
  • the stream information includes a stream descriptor in which a stream type, a stream PID, and stream attribute information (frame rate, aspect ratio, etc.) are described to identify a compression codec of the stream.
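The PMT layout just described (header, then descriptors, then per-stream information) can be modeled roughly as follows; the concrete field set is an illustrative simplification, not the normative syntax.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class StreamInfo:
    stream_type: int     # identifies the compression codec of the stream
    elementary_pid: int  # PID of the stream
    attributes: dict = field(default_factory=dict)  # e.g., frame rate, aspect ratio

@dataclass
class Pmt:
    header_length: int   # length of the data included in the PMT
    descriptors: list = field(default_factory=list)  # e.g., copy control information
    streams: list = field(default_factory=list)      # one StreamInfo per stream

def find_stream_by_pid(pmt: Pmt, pid: int) -> Optional[StreamInfo]:
    """Look up the stream information registered for a given PID."""
    return next((s for s in pmt.streams if s.elementary_pid == pid), None)
```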
  • the multiplexed data is recorded together with the multiplexed data information file.
  • the multiplexed data information file is management information of multiplexed data, has one-to-one correspondence with the multiplexed data, and includes multiplexed data information, stream attribute information, and an entry map.
  • the multiplexed data information includes a system rate, a reproduction start time, and a reproduction end time as shown in FIG.
  • the system rate indicates a maximum transfer rate of multiplexed data to a PID filter of a system target decoder described later.
  • the ATS interval included in the multiplexed data is set to be equal to or less than the system rate.
  • The playback start time is the PTS of the first video frame of the multiplexed data, and the playback end time is set by adding the playback interval of one frame to the PTS of the last video frame of the multiplexed data.
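The playback end time thus reduces to adding one frame interval to the last video PTS. A sketch, assuming the customary 90 kHz PTS time base:

```python
def playback_end_time(last_video_pts: int, frame_rate: float,
                      clock_hz: int = 90_000) -> int:
    """End time = PTS of the last video frame + the interval of one frame."""
    return last_video_pts + round(clock_hz / frame_rate)

# e.g., 29.97 fps material: one frame adds about 3003 ticks of the 90 kHz clock
print(playback_end_time(last_video_pts=1_000_000, frame_rate=29.97))  # 1003003
```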
  • attribute information about each stream included in the multiplexed data is registered for each PID.
  • the attribute information has different information for each video stream, audio stream, presentation graphics stream, and interactive graphics stream.
  • The video stream attribute information includes information such as the compression codec used to compress the video stream, and the resolution, aspect ratio, and frame rate of the individual picture data constituting the video stream.
  • The audio stream attribute information includes information such as the compression codec used to compress the audio stream, the number of channels included in the audio stream, the supported language, and the sampling frequency. These pieces of information are used, for example, for initialization of the decoder before playback by the player.
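These per-stream attributes can be modeled as simple records; the field sets below follow the attributes listed above and are purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class VideoStreamAttributes:
    codec: str          # compression codec used to compress the video stream
    width: int          # resolution of the individual picture data
    height: int
    aspect_ratio: str   # e.g., "16:9"
    frame_rate: float

@dataclass
class AudioStreamAttributes:
    codec: str          # e.g., "Dolby AC-3", "DTS", "linear PCM"
    channels: int       # number of channels included in the audio stream
    language: str       # supported language
    sampling_hz: int    # sampling frequency

# attribute information is registered per PID, for example:
attributes = {0x1011: VideoStreamAttributes("HEVC", 1920, 1080, "16:9", 29.97)}
```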
  • In the present embodiment, the stream type included in the PMT is used among the multiplexed data; when the multiplexed data is recorded on a recording medium, the video stream attribute information included in the multiplexed data information is used. Specifically, the stream type included in the PMT or the video stream attribute information is set so as to indicate that the video data was generated by the moving picture encoding method or apparatus described in each of the above embodiments.
  • FIG. 34 shows the steps of the moving picture decoding method according to the present embodiment.
  • First, in step exS100, the stream type included in the PMT or the video stream attribute information included in the multiplexed data information is acquired from the multiplexed data.
  • Next, in step exS101, it is determined whether or not the stream type or the video stream attribute information indicates multiplexed data generated by the moving picture encoding method or apparatus described in each of the above embodiments.
  • When it does, decoding is performed in step exS102 by the moving picture decoding method described in each of the above embodiments.
  • When the stream type or the video stream attribute information indicates conformity with a conventional standard, decoding is performed by a moving picture decoding method compliant with that conventional standard.
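The decision flow of steps exS100 to exS102 amounts to a dispatch on the identification information. The sketch below is an outline under stated assumptions: the "present-method" tag and the two decoder callbacks are hypothetical placeholders, not names from the standard or from this description.

```python
def decode(multiplexed_data: dict, present_decoder, legacy_decoder) -> None:
    # exS100: acquire the stream type (or the video stream attribute
    # information) from the multiplexed data.
    info = multiplexed_data.get("stream_type")
    # exS101: does it indicate data generated by the moving picture
    # encoding method or apparatus described in the above embodiments?
    if info == "present-method":
        present_decoder(multiplexed_data)   # exS102: decode with that method
    else:
        legacy_decoder(multiplexed_data)    # decode per MPEG-2 / MPEG4-AVC / VC-1

# usage sketch:
# decode({"stream_type": "present-method"}, new_decoder, avc_decoder)
```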
  • FIG. 35 shows the configuration of an LSI ex500 that is made into one chip.
  • the LSI ex500 includes elements ex501, ex502, ex503, ex504, ex505, ex506, ex507, ex508, and ex509 described below, and each element is connected via a bus ex510.
  • the power supply circuit unit ex505 is activated to an operable state by supplying power to each unit when the power supply is on.
  • For example, when performing encoding processing, the LSI ex500 receives an AV signal from the microphone ex117, the camera ex113, and the like via the AV I/O ex509, based on the control of the control unit ex501, which includes the CPU ex502, the memory controller ex503, the stream controller ex504, the drive frequency control unit ex512, and the like.
  • The input AV signal is temporarily stored in an external memory ex511 such as an SDRAM.
  • Based on this control, the accumulated data is divided into a plurality of portions as appropriate according to the processing amount and the processing speed and sent to the signal processing unit ex507, where encoding of the audio signal and/or encoding of the video signal is performed.
  • the encoding process of the video signal is the encoding process described in the above embodiments.
  • The signal processing unit ex507 further performs processing such as multiplexing the encoded audio data and the encoded video data, depending on the case, and outputs the result to the outside from the stream I/O ex506.
  • The output multiplexed data is transmitted toward the base station ex107 or written to the recording medium ex215. Note that when multiplexing, the data may be temporarily stored in the buffer ex508 so that it is synchronized.
  • In the above description, the memory ex511 has been described as being external to the LSI ex500; however, it may be included inside the LSI ex500.
  • the number of buffers ex508 is not limited to one, and a plurality of buffers may be provided.
  • the LSI ex500 may be made into one chip or a plurality of chips.
  • control unit ex501 includes the CPU ex502, the memory controller ex503, the stream controller ex504, the drive frequency control unit ex512, and the like, but the configuration of the control unit ex501 is not limited to this configuration.
  • the signal processing unit ex507 may further include a CPU.
  • Conversely, the CPU ex502 may be configured to include the signal processing unit ex507 or, for example, an audio signal processing unit that is a part of the signal processing unit ex507. In such a case, the control unit ex501 is configured to include the signal processing unit ex507, or the CPU ex502 that has a part of it.
  • Although referred to here as an LSI, it may also be called an IC, a system LSI, a super LSI, or an ultra LSI depending on the degree of integration.
  • the method of circuit integration is not limited to LSI, and implementation with a dedicated circuit or a general-purpose processor is also possible.
  • An FPGA (Field Programmable Gate Array) that can be programmed after LSI manufacturing, or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may be used.
  • FIG. 36 shows a configuration ex800 in the present embodiment.
  • the drive frequency switching unit ex803 sets the drive frequency high when the video data is generated by the moving image encoding method or apparatus described in the above embodiments.
  • the decoding processing unit ex801 that executes the moving picture decoding method described in each of the above embodiments is instructed to decode the video data.
  • On the other hand, when the video data is video data compliant with a conventional standard, the drive frequency switching unit ex803 sets the drive frequency lower than in the case where the video data is generated by the moving picture encoding method or apparatus described in the above embodiments, and instructs the decoding processing unit ex802 compliant with the conventional standard to decode the video data.
  • More specifically, the drive frequency switching unit ex803 includes the CPU ex502 and the drive frequency control unit ex512 shown in FIG. 35.
  • the decoding processing unit ex801 that executes the moving picture decoding method shown in each of the above embodiments and the decoding processing unit ex802 that complies with the conventional standard correspond to the signal processing unit ex507 in FIG.
  • the CPU ex502 identifies which standard the video data conforms to. Then, based on the signal from the CPU ex502, the drive frequency control unit ex512 sets the drive frequency. Further, based on the signal from the CPU ex502, the signal processing unit ex507 decodes the video data.
  • For identifying the video data, for example, the identification information described in Embodiment 4 may be used.
  • The identification information is not limited to that described in Embodiment 4, and any information that can identify which standard the video data conforms to may be used. For example, when it is possible to identify which standard the video data conforms to based on an external signal that identifies whether the video data is used for a television or for a disk, identification may be performed based on such an external signal.
  • The selection of the drive frequency in the CPU ex502 may be performed based on, for example, a lookup table in which the standards of the video data and the drive frequencies are associated with each other. The lookup table is stored in the buffer ex508 or in an internal memory of the LSI, and the CPU ex502 can select the drive frequency by referring to this lookup table.
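A sketch of the lookup-table approach; the table contents below (500 MHz versus 350 MHz) are hypothetical example values chosen for illustration, not values taken from this description.

```python
# Hypothetical lookup table associating video data standards with drive
# frequencies; in practice it would live in the buffer ex508 or in an
# internal memory of the LSI.
DRIVE_FREQ_MHZ = {
    "present-method": 500,  # data generated by the above embodiments
    "MPEG-2": 350,
    "MPEG4-AVC": 350,
    "VC-1": 350,
}

def select_drive_frequency(standard: str) -> int:
    """CPU ex502 refers to the table and passes the chosen frequency on
    to the drive frequency control unit ex512."""
    return DRIVE_FREQ_MHZ.get(standard, min(DRIVE_FREQ_MHZ.values()))
```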
  • FIG. 37 shows steps for executing the method of the present embodiment.
  • First, in step exS200, the signal processing unit ex507 acquires the identification information from the multiplexed data.
  • Next, in step exS201, the CPU ex502 identifies, based on the identification information, whether or not the video data was generated by the encoding method or apparatus described in each of the above embodiments.
  • When the video data was generated by the encoding method or apparatus described in each of the above embodiments, in step exS202, the CPU ex502 sends a signal for setting the drive frequency high to the drive frequency control unit ex512, and the drive frequency control unit ex512 sets a high drive frequency.
  • On the other hand, when the identification information indicates video data compliant with a conventional standard, in step exS203, the CPU ex502 sends a signal for setting the drive frequency low to the drive frequency control unit ex512, and the drive frequency control unit ex512 sets a drive frequency lower than in the case where the video data was generated by the encoding method or apparatus described in the above embodiments.
  • the power saving effect can be further enhanced by changing the voltage applied to the LSI ex500 or the device including the LSI ex500 in conjunction with the switching of the driving frequency. For example, when the drive frequency is set low, it is conceivable that the voltage applied to the LSI ex500 or the device including the LSI ex500 is set low as compared with the case where the drive frequency is set high.
  • The method of setting the drive frequency is not limited to the setting method described above; for example, the drive frequency may be set high when the processing amount for decoding is large and set low when the processing amount for decoding is small.
  • For example, when the processing amount for decoding video data compliant with the MPEG4-AVC standard is larger than the processing amount for decoding video data generated by the moving picture encoding method or apparatus described in each of the above embodiments, it is conceivable to reverse the drive frequency settings from the case described above.
  • Furthermore, the method of setting the drive frequency is not limited to a configuration that lowers the drive frequency.
  • For example, depending on the identification information, the voltage applied to the LSI ex500 or the device including the LSI ex500 may be set high or low.
  • As another example, when the identification information indicates video data compliant with a conventional standard, the driving of the CPU ex502 may be temporarily stopped because there is a margin in processing.
  • Even when the identification information indicates video data generated by the moving image encoding method or apparatus described in each of the above embodiments, the driving of the CPU ex502 can be temporarily stopped if there is a margin in processing; in this case, it is conceivable to set the stop time shorter than in the case where the video data conforms to a conventional standard such as MPEG-2, MPEG4-AVC, or VC-1.
  • a plurality of video data that conforms to different standards may be input to the above-described devices and systems such as a television and a mobile phone.
  • the signal processing unit ex507 of the LSI ex500 needs to support a plurality of standards in order to be able to decode even when a plurality of video data complying with different standards is input.
  • the signal processing unit ex507 corresponding to each standard is used individually, there is a problem that the circuit scale of the LSI ex500 increases and the cost increases.
  • In order to solve this problem, a configuration is adopted in which a decoding processing unit for executing the moving picture decoding method described in each of the above embodiments and a decoding processing unit compliant with a standard such as MPEG-2, MPEG4-AVC, or VC-1 are partly shared.
  • An example of this configuration is shown as ex900 in FIG. 39A.
  • For example, the moving picture decoding method described in each of the above embodiments and a moving picture decoding method compliant with the MPEG4-AVC standard share some processing contents in processes such as entropy coding, inverse quantization, deblocking filtering, and motion compensation.
  • For the common processing contents, the decoding processing unit ex902 corresponding to the MPEG4-AVC standard is shared, and for other processing contents specific to one aspect of the present invention that do not correspond to the MPEG4-AVC standard, a configuration using a dedicated decoding processing unit ex901 is conceivable.
  • Conversely, the decoding processing unit for executing the moving picture decoding method described in each of the above embodiments may be shared for the common processing contents, and a dedicated decoding processing unit may be used for processing contents specific to the MPEG4-AVC standard.
  • ex1000 in FIG. 39B shows another example in which processing is partially shared.
  • That is, the configuration may include a dedicated decoding processing unit ex1001 corresponding to processing contents specific to one aspect of the present invention, a dedicated decoding processing unit ex1002 corresponding to processing contents specific to another conventional standard, and a common decoding processing unit ex1003 corresponding to processing contents common to the moving image decoding method according to one aspect of the present invention and the moving image decoding method of the other conventional standard.
  • The dedicated decoding processing units ex1001 and ex1002 are not necessarily specialized for one aspect of the present invention or for processing contents specific to the other conventional standard, respectively, and may be capable of executing other general-purpose processing.
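The sharing arrangement ex1000 of FIG. 39B can be outlined as follows. The class and method names are hypothetical; only the structure, one common unit ex1003 combined with exactly one dedicated unit, reflects the description above.

```python
class CommonDecodingUnit:
    """ex1003: processing contents shared by both decoding methods."""
    def inverse_quantize(self, block): ...
    def motion_compensate(self, reference, motion_vector): ...

class PresentMethodUnit:
    """ex1001: processing specific to one aspect of the present invention."""
    def decode_specific(self, stream): ...

class ConventionalStandardUnit:
    """ex1002: processing specific to a conventional standard."""
    def decode_specific(self, stream): ...

def build_decoder(standard: str):
    """Share ex1003 and attach only the dedicated unit that is needed."""
    common = CommonDecodingUnit()
    specific = (PresentMethodUnit() if standard == "present-method"
                else ConventionalStandardUnit())
    return common, specific
```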
  • The configuration of the present embodiment can also be implemented by the LSI ex500.
  • As described above, by sharing a decoding processing unit for the processing contents common to the moving picture decoding method according to one aspect of the present invention and the moving picture decoding method of a conventional standard, the circuit scale of the LSI can be reduced and the cost can be lowered.
  • The image coding method and the image decoding method according to the present invention can be applied to any multimedia data and realize coding that combines multi-view video coding (MVC) and scalable video coding (SVC); for example, they are useful as an image coding method and an image decoding method in storage, transmission, communication, and the like using a mobile phone, a DVD device, a personal computer, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to an image coding method and an image decoding method capable of performing coding that combines the multi-view video coding (MVC) standard and the scalable video coding (SVC) standard. An image decoding device according to the invention comprises: a base image data decoding unit (202) that decodes base image data; an extended image data analysis unit (204) that parses a NAL unit header containing, as parameters, view data identifying a view and layer data identifying a layer, and that identifies extended image data from the view data and the layer data; and an extended image data decoding unit (205) that decodes the extended image data.
PCT/JP2012/007528 2011-11-25 2012-11-22 Image coding method, image coding device, image decoding method, and image decoding device WO2013076991A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161563675P 2011-11-25 2011-11-25
US61/563,675 2011-11-25

Publications (1)

Publication Number Publication Date
WO2013076991A1 true WO2013076991A1 (fr) 2013-05-30

Family

ID=48469454

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/007528 WO2013076991A1 (fr) 2012-11-22 Image coding method, image coding device, image decoding method, and image decoding device

Country Status (1)

Country Link
WO (1) WO2013076991A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016539526A (ja) * 2013-09-30 2016-12-15 Apple Inc. Backward-compatible extended image formats

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010507961A (ja) * 2006-10-25 2010-03-11 Electronics and Telecommunications Research Institute Scalable coding and decoding method for multi-view video, and coding and decoding apparatus

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010507961A (ja) * 2006-10-25 2010-03-11 Electronics and Telecommunications Research Institute Scalable coding and decoding method for multi-view video, and coding and decoding apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MISKA M. HANNUKSELA ET AL.: "Scalable multiview video coding (SMVC)", JOINT VIDEO TEAM (JVT) OF ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 AND ITU-T SG16 Q.6) 27TH MEETING, 24 April 2008 (2008-04-24), GENEVA, CH *
YING CHEN ET AL.: "Unified NAL unit header design for HEVC and its extensions", JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG16 WP3 AND ISO/IEC JTC1/SC29/WG11 7TH MEETING [JCTVC-G336], 9 November 2011 (2011-11-09), GENEVA, CH, XP030110320 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016539526A (ja) * 2013-09-30 2016-12-15 Apple Inc. Backward-compatible extended image formats

Similar Documents

Publication Publication Date Title
JP6210248B2 (ja) Moving picture coding method and moving picture coding apparatus
JP6213753B2 (ja) Encoding/decoding device
JP6562369B2 (ja) Encoding/decoding method and encoding/decoding device
JP6156648B2 (ja) Moving picture coding method, moving picture coding apparatus, moving picture decoding method, and moving picture decoding apparatus
WO2012117722A1 (fr) Coding method, decoding method, coding device, and decoding device
WO2014010192A1 (fr) Image coding method, image decoding method, image coding device, and image decoding device
JP6414712B2 (ja) Moving picture coding method, moving picture decoding method, moving picture coding apparatus, and moving picture decoding method using numerous reference pictures
WO2013128832A1 (fr) Image coding method, image decoding method, image coding device, image decoding device, and image coding/decoding device
JP2021093769A (ja) Image coding method, image decoding method, image coding apparatus, and image decoding apparatus
JP6483028B2 (ja) Image coding method and image coding apparatus
KR102125930B1 (ko) Video encoding method and video decoding method
JP2014039252A (ja) Image decoding method and image decoding device
WO2013076991A1 (fr) Image coding method, image coding device, image decoding method, and image decoding device
JP6167906B2 (ja) Image coding method, image decoding method, image coding device, and image decoding device
WO2012096157A1 (fr) Image encoding method, image decoding method, image encoding device, and image decoding device
WO2012124300A1 (fr) Video image encoding method, video image decoding method, video image encoding device, and video image decoding device
WO2013153808A1 (fr) Image decoding method and image decoding device
WO2012014458A1 (fr) Image encoding method and image decoding method
WO2013046616A1 (fr) Appareil de codage d'image, appareil de décodage d'image, procédé de codage d'image et procédé de décodage d'image

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12851738

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12851738

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP