WO2014203505A1 - Image decoding apparatus, image encoding apparatus, and image processing system


Info

Publication number
WO2014203505A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
unit
weight coefficient
residual signal
prediction
Prior art date
Application number
PCT/JP2014/003143
Other languages
English (en)
Inventor
Kazushi Sato
Original Assignee
Sony Corporation
Priority date
Filing date
Publication date
Application filed by Sony Corporation filed Critical Sony Corporation
Publication of WO2014203505A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: using adaptive coding
    • H04N 19/134: characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136: Incoming video signal characteristics or properties
    • H04N 19/102: characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103: Selection of coding mode or of prediction mode
    • H04N 19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N 19/124: Quantisation
    • H04N 19/169: characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: the unit being an image region, e.g. an object
    • H04N 19/176: the region being a block, e.g. a macroblock
    • H04N 19/30: using hierarchical techniques, e.g. scalability
    • H04N 19/70: characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N 19/42: characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/423: characterised by memory arrangements
    • H04N 19/426: using memory downsizing methods
    • H04N 19/90: using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N 19/98: Adaptive-dynamic-range coding [ADRC]

Definitions

  • The present disclosure relates to an image decoding apparatus, an image encoding apparatus, and an image processing system, and more particularly, to an image decoding apparatus, an image encoding apparatus, and an image processing system which are capable of suppressing an increase in the storage capacity necessary for encoding or decoding.
  • In recent years, apparatuses have become widespread which handle image information as digital data and which, in order to transmit and accumulate the information with high efficiency, compress and encode images by employing an encoding scheme that exploits redundancy specific to image information and performs compression by orthogonal transform, such as discrete cosine transform, and motion compensation.
  • Examples of such encoding schemes include Moving Picture Experts Group (MPEG) and H.264/MPEG-4 Part 10 (Advanced Video Coding; hereinafter referred to as AVC).
  • In addition, standardization of an encoding scheme called high efficiency video coding (HEVC) has been carried out by the Joint Collaboration Team on Video Coding (JCTVC) with the aim of further improving encoding efficiency over AVC.
  • In scalable encoding, image data to be processed is hierarchized and has a base layer, on which encoding and decoding are performed with no reference to another layer, and an enhancement layer, on which encoding and decoding are performed with reference to another layer (a base layer or another enhancement layer).
  • A technique is known in which a prediction error signal (also referred to as a residual signal) of a base layer is used to reduce the amount of information in an enhancement layer, and it has been suggested that a similar process also be performed in scalable encoding processing based on HEVC encoding processing (see, for example, NPL 1).
  • In NPL 1, transmitting a weight coefficient indicating whether to use a residual signal of a base layer at the time of generating a final prediction image in an enhancement layer is suggested; in a case where the residual signal is used, the weight coefficient also indicates how much of the residual signal is used.
  • The present disclosure can suppress an increase in the storage capacity necessary for encoding or decoding.
  • An image decoding apparatus embodiment comprising: circuitry configured to decode encoded data in which a weight coefficient of a residual signal had been set to 0.5 when an input signal of image data constituted by a plurality of layers was 8 bits and a residual signal was 9 bits, wherein the residual signal is a prediction error of inter-frame prediction in another layer different from a current layer of the image data.
  • An image encoding apparatus embodiment comprising: circuitry configured to set a weight coefficient of a residual signal to be 0.5 when an input signal of image data constituted by a plurality of layers is 8 bits and a residual signal is 9 bits, wherein the residual signal is a prediction error of inter-frame prediction in another layer different from a current layer of the image data.
  • An image processing system embodiment comprising: an image encoding apparatus including circuitry configured to produce encoded image data from image data, the circuitry being configured to set a weight coefficient of a residual signal to be 0.5 when an input signal of the image data constituted by a plurality of layers is 8 bits and the residual signal is 9 bits, wherein the residual signal is a prediction error of inter-frame prediction in another layer different from a current layer of the image data; and a decoder configured to decode the encoded image data.
  • In an image encoding method according to the present disclosure, when an input signal of image data constituted by a plurality of layers is 8 bits and a residual signal, which is a prediction error of inter-frame prediction in another layer different from a current layer of the image data, is 9 bits, the value of a weight coefficient of the residual signal is determined to be 0.5. The current layer of the image data is then encoded using the residual signal having the determined value of the weight coefficient applied thereto.
  • Likewise, in an image decoding method according to the present disclosure, when an input signal of image data constituted by a plurality of layers is 8 bits and a residual signal, which is a prediction error of inter-frame prediction in another layer different from a current layer of the image data, is 9 bits, the value of a weight coefficient of the residual signal is determined to be 0.5. The current layer of encoded data of the image data is then decoded using the residual signal having the determined value of the weight coefficient applied thereto.
  • According to the embodiments of the present disclosure, an image can be encoded and decoded. In particular, it is possible to suppress an increase in the storage capacity necessary for encoding or decoding; a minimal sketch of the underlying rule follows below.
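  • As an illustration only, the core rule described above can be sketched as follows; the function name and the use of plain integers for bit depths are assumptions, not the patent's implementation.

```python
def determine_weight_coefficient(input_bit_depth: int, residual_bit_depth: int) -> float:
    """Illustrative sketch of the disclosed rule: when the input signal is
    8 bits but the residual signal needs 9 bits, halving the residual
    (w = 0.5) lets it be held in an 8-bit buffer."""
    if input_bit_depth == 8 and residual_bit_depth == 9:
        return 0.5
    return 1.0
```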
  • Fig. 1 is a diagram illustrating a configuration example of a coding unit.
  • Fig. 2 is a diagram illustrating an example of a hierarchical image encoding scheme.
  • Fig. 3 is a diagram illustrating an example of spatial scalable encoding.
  • Fig. 4 is a diagram illustrating an example of temporal scalable encoding.
  • Fig. 5 is a diagram illustrating an example of scalable encoding of a signal to noise ratio.
  • Fig. 6 is a diagram illustrating an example of an interpolation filter for motion compensation.
  • Fig. 7 is a diagram illustrating prediction of a residual signal in scalable encoding.
  • Fig. 8 is a diagram illustrating an example of syntax of a sequence parameter set.
  • Fig. 9 is a diagram, following Fig. 8, illustrating the example of syntax of a sequence parameter set.
  • Fig. 10 is a block diagram illustrating a main configuration example of an image encoding apparatus.
  • Fig. 11 is a block diagram illustrating a main configuration example of a base layer image encoding unit.
  • Fig. 12 is a block diagram illustrating a main configuration example of an enhancement layer image encoding unit.
  • Fig. 13 is a block diagram illustrating a main configuration example of a residual prediction unit.
  • Fig. 14 is a flow chart illustrating an example of a flow of image encoding processing.
  • Fig. 15 is a flow chart illustrating an example of a flow of base layer encoding processing.
  • Fig. 16 is a flow chart illustrating an example of a flow of enhancement layer encoding processing.
  • Fig. 17 is a flow chart illustrating an example of a flow of inter prediction processing.
  • Fig. 18 is a flow chart illustrating another example of a flow of inter prediction processing.
  • Fig. 19 is a block diagram illustrating a main configuration example of an image decoding apparatus.
  • Fig. 20 is a block diagram illustrating a main configuration example of a base layer image decoding unit.
  • Fig. 21 is a block diagram illustrating a main configuration example of an enhancement layer image decoding unit.
  • Fig. 22 is a block diagram illustrating another configuration example of a residual prediction unit.
  • Fig. 23 is a flow chart illustrating an example of a flow of image decoding processing.
  • Fig. 24 is a flow chart illustrating an example of a flow of base layer decoding processing.
  • Fig. 25 is a flow chart illustrating an example of a flow of enhancement layer decoding processing.
  • Fig. 26 is a flow chart illustrating an example of a flow of inter prediction processing.
  • Fig. 27 is a flow chart illustrating another example of a flow of inter prediction processing.
  • Fig. 28 is a diagram illustrating an example of a multi-view image encoding scheme.
  • Fig. 29 is a diagram illustrating a main configuration example of a multi-view image encoding apparatus to which the present disclosure is applied.
  • Fig. 30 is a diagram illustrating a main configuration example of a multi-view image decoding apparatus to which the present disclosure is applied.
  • Fig. 31 is a block diagram illustrating a main configuration example of a computer.
  • Fig. 32 is a block diagram illustrating an example of a schematic configuration of a television device.
  • Fig. 33 is a block diagram illustrating an example of a schematic configuration of a mobile phone.
  • Fig. 34 is a block diagram illustrating an example of a schematic configuration of a recording and reproducing device.
  • Fig. 35 is a block diagram illustrating an example of a schematic configuration of an imaging device.
  • Fig. 36 is a block diagram illustrating an example of using scalable encoding.
  • Fig. 37 is a block diagram illustrating another example of using scalable encoding.
  • Fig. 38 is a block diagram illustrating another example of using scalable encoding.
  • Fig. 39 is a block diagram illustrating an example of a schematic configuration of a video set.
  • Fig. 40 is a block diagram illustrating an example of a schematic configuration of a video processor.
  • Fig. 41 is a block diagram illustrating another example of a schematic configuration of a video processor.
  • Fig. 42 is an explanatory diagram illustrating a configuration of a content reproduction system.
  • Fig. 43 is an explanatory diagram illustrating a flow of data in the content reproduction system.
  • Fig. 44 is an explanatory diagram illustrating a specific example of MPD.
  • Fig. 45 is a functional block diagram illustrating a configuration of a content server of a content reproduction system.
  • Fig. 46 is a functional block diagram illustrating a configuration of a content reproduction apparatus of the content reproduction system.
  • Fig. 47 is a functional block diagram illustrating a configuration of a content server of the content reproduction system.
  • Fig. 48 is a sequence chart illustrating an example of communication processing based on devices of a wireless communication system.
  • Fig. 49 is a sequence chart illustrating an example of communication processing using the devices of the wireless communication system.
  • Fig. 50 is a schematic diagram illustrating a configuration example of a frame format that is transmitted and received in communication processing based on the devices of the wireless communication system.
  • Fig. 51 is a sequence chart illustrating an example of communication processing based on the devices of the wireless communication system.
  • <Coding Unit> In the advanced video coding (AVC) format, a hierarchical structure based on a macroblock and a sub-macroblock is specified. However, a macroblock of 16 x 16 pixels is not optimal for a large picture frame such as ultra high definition (UHD; 4000 pixels x 2000 pixels), which is a target of next-generation encoding schemes.
  • In the HEVC format, by contrast, a coding unit (CU) is specified, as illustrated in Fig. 1.
  • the CU is also referred to as a coding tree block (CTB), and is a partial region of an image of each picture which plays a role similar to a macroblock in an AVC format.
  • However, the size of the latter (the macroblock) is fixed to 16 x 16 pixels, whereas the size of the former (the CU) is not fixed and is designated in the image compression information of each sequence.
  • a CU having the largest size (largest coding unit (LCU)) and a CU having the smallest size (smallest coding unit (SCU)) are specified.
  • For example, the size of an LCU may be specified as 128, with a maximum hierarchical depth of 5. When division is indicated, a CU having a size of 2N x 2N is divided into CUs each having a size of N x N, which are one hierarchical level lower (as in the sketch below).
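  • A minimal sketch of the resulting quadtree sizes (a hypothetical helper; the patent provides no code): with an LCU of 128 and a maximum depth of 5, each split halves the CU size one level at a time.

```python
def cu_sizes(lcu_size: int = 128, max_depth: int = 5) -> list[int]:
    """Enumerate CU sizes from the LCU down to the SCU: each split divides
    a 2N x 2N CU into four N x N CUs one hierarchical level lower."""
    return [lcu_size >> depth for depth in range(max_depth)]

print(cu_sizes())  # [128, 64, 32, 16, 8]
```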
  • a CU is divided into prediction units (PUs), each being a region serving as a unit of processing for intra or inter prediction (partial region of an image of each picture).
  • In addition, the CU is divided into transform units (TUs), each being a region (a partial region of an image of each picture) serving as a unit of processing for orthogonal transform.
  • a macroblock in an AVC format is equivalent to an LCU and a block (sub-block) is equivalent to a CU.
  • a motion compensation block in the AVC format is equivalent to a PU.
  • the size of the LCU in the top hierarchy is generally set to be larger than the macroblock in the AVC format, for example, 128 x 128 pixels.
  • an LCU includes a macroblock in an AVC format
  • a CU includes a block (sub-block) in the AVC format.
  • the term "block” used in the following description indicates any partial region within a picture, and the size, shape, and characteristics thereof are not limited.
  • the "block” includes any regions such as a TU, a PU, an SCU, a CU, an LCU, a sub-block, a macroblock, or a slice.
  • the block also includes other partial regions (units of processing). When it is necessary to limit the size, units of processing, or the like, a description will be given appropriately.
  • A coding tree unit (CTU) is a unit including a coding tree block (CTB) of an LCU (the maximum CU size) and a parameter used when processing is performed on an LCU basis (level). A coding unit (CU) constituting the CTU is a unit including a coding block (CB) and a parameter used when processing is performed on a CU basis (level).
  • In the reference software for the AVC format, called the Joint Model (JM), the following two mode determination methods can be selected: a high complexity mode and a low complexity mode. In either of them, cost function values for the individual prediction modes are calculated, and the prediction mode having the smallest cost function value is selected as the optimal mode for the relevant block or macroblock.
  • the cost function in the high complexity mode is expressed as the following Expression (1).
  • Here, Omega denotes the universal set of candidate modes for encoding the relevant block or macroblock, D denotes the differential energy between the decoded image and the input image in a case where encoding is performed in the prediction mode, lambda denotes a Lagrange undetermined multiplier given as a function of the quantization parameter, and R denotes the total code amount, including the orthogonal transform coefficients, in a case where encoding is performed in the mode.
  • the cost function in the low complexity mode is expressed as the following Expression (2).
  • In this case, D denotes the differential energy between the prediction image and the input image, QP2Quant(QP) is given as a function of the quantization parameter QP, and HeaderBit denotes the code amount of information belonging to the header, such as the motion vector and the mode, which does not include orthogonal transform coefficients.
  • In the low complexity mode, it is necessary to perform prediction processing for the individual candidate modes, but a decoded image is not necessary, and thus encoding processing need not be performed. For this reason, the low complexity mode can be realized with a smaller amount of computation than the high complexity mode. Both expressions are reconstructed below.
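  • The two expressions referenced above are not reproduced in this text. Reconstructed from the surrounding definitions (these are the well-known JM cost functions, so the reconstruction should be safe), they read:

$$\mathrm{Cost}(\mathrm{Mode} \in \Omega) = D + \lambda \cdot R \qquad (1)$$

$$\mathrm{Cost}(\mathrm{Mode} \in \Omega) = D + \mathrm{QP2Quant}(QP) \cdot \mathrm{HeaderBit} \qquad (2)$$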
  • Scalable encoding is a scheme of making an image into a plurality of layers (hierarchizing) and encoding the image for each layer.
  • Fig. 2 is a diagram illustrating an example of a hierarchical image encoding scheme.
  • a hierarchized image includes images of a plurality of hierarchies (layers) having different predetermined parameter values.
  • The plurality of layers of the hierarchical image are constituted by a base layer, for which encoding and decoding are performed using only its own image without using images of other layers, and non-base layers (also referred to as enhancement layers), for which encoding and decoding are performed using images of other layers.
  • the non-base layer may use an image of the base layer or may use an image of another non-base layer.
  • Generally, the non-base layer is constituted by data (difference data) of a difference image between its own image and an image of another layer so that redundancy is reduced.
  • For example, using only the data of the base layer, an image having a lower quality than the original image is obtained; by adding the data of the enhancement layer, the original image (that is, a high-quality image) can be obtained.
  • By hierarchizing an image in this manner, it is possible to transmit image compression information according to the capability of a terminal or a network from a server without performing transcoding processing: to a terminal having a low processing capability, for example, a mobile phone, image compression information of only the base layer is transmitted so as to reproduce a moving image having low spatial and temporal resolution or a low image quality; to a terminal having a high processing capability, for example, a television or a personal computer, image compression information of the enhancement layer in addition to the base layer is transmitted so as to reproduce a moving image having high spatial and temporal resolution or a high image quality.
  • a parameter having a scalability function is arbitrary.
  • For example, a spatial resolution, as illustrated in Fig. 3, may be used as the parameter (spatial scalability). In the case of spatial scalability, the resolution of the image is different in each layer. In the example of Fig. 3, each picture is hierarchized into two hierarchies: a base layer having a lower spatial resolution than the original image, and an enhancement layer that is synthesized with the image of the base layer to obtain the original image (original spatial resolution).
  • the number of hierarchies is an example, and hierarchization into any number of hierarchies can be performed.
  • In addition, a temporal resolution, as illustrated in Fig. 4, may be used as the parameter (temporal scalability). In the case of temporal scalability, the frame rate is different in each layer. That is, hierarchization into layers having different frame rates is performed; a layer having a high frame rate can be added to a layer having a low frame rate to obtain a moving image having a higher frame rate, and all layers can be added together to obtain the original moving image (original frame rate).
  • the number of hierarchies is an example, and hierarchization into any number of hierarchies can be performed.
  • Furthermore, a signal-to-noise ratio may be used as the parameter (SNR scalability). In the case of SNR scalability, the SN ratio is different in each layer. In this case, each picture is hierarchized into two hierarchies: a base layer having a lower SNR than the original image, and an enhancement layer that is synthesized with the image of the base layer to obtain the original image (original SNR). That is, in base layer image compression information, information on an image having a low PSNR is transmitted, and by adding enhancement layer image compression information thereto, an image having a high PSNR can be reconstructed.
  • the number of hierarchies is an example, and hierarchization into any number of hierarchies can be performed.
  • a parameter for providing a scalability property may be a parameter other than the examples stated above.
  • For example, there is bit-depth scalability, in which the base layer is constituted by an 8-bit image and the enhancement layer is added thereto to obtain a 10-bit image. There is also chroma scalability, in which the base layer is constituted by a component image having a 4:2:0 format and the enhancement layer is added thereto to obtain a component image having a 4:2:2 format.
  • In the HEVC format, an interpolation filter for motion compensation, illustrated in Fig. 6, is specified. For the luminance signal, motion compensation with 1/4-pixel accuracy is performed using an 8-tap filter. For the color-difference signal, motion compensation with 1/8-pixel accuracy is performed using a 4-tap filter. Both are specified so that the processing falls within 16-bit accuracy (a worked check follows below). This interpolation technique is called DCT-IF.
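  • As a worked check of the 16-bit claim (assuming the HEVC half-sample luminance filter coefficients (-1, 4, -11, 40, 40, -11, 4, -1), which are not listed in this text), the largest intermediate magnitude for an 8-bit input is

$$255 \times (1 + 4 + 11 + 40 + 40 + 11 + 4 + 1) = 255 \times 112 = 28560 < 2^{15} = 32768,$$

so the first interpolation stage indeed fits within 16-bit signed arithmetic.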
  • In residual prediction for scalable encoding, illustrated in Fig. 7, a residual signal [ResE] in the enhancement layer is calculated as the following Expression (3), using image data [CurE] of a current block of the enhancement layer and image data [RefE] of a reference block of the enhancement layer. Similarly, a residual signal [ResB] in the base layer is calculated as the following Expression (4), using image data [CurB] of a current block of the base layer and image data [RefB] of a reference block of the base layer. Here, the base layer and the enhancement layer are different from each other in resolution. Consequently, the residual signal [ResB] in the base layer is up-sampled to the resolution of the enhancement layer, and the amount of information of the residual signal in the enhancement layer is reduced using the up-sampled residual signal in the base layer. That is, a residual signal [ResE'] in the enhancement layer after the arithmetic computation is calculated as the following Expression (5), using the residual signal [ResE] in the enhancement layer before the arithmetic computation and the up-sampled residual signal UP[ResB] in the base layer. Calculating the residual signal in the enhancement layer in this manner makes it possible to enhance encoding efficiency in the enhancement layer.
  • Furthermore, NPL 1 suggests that a final prediction image P be generated as the following Expression (6), using a prediction image Pe0 in the enhancement layer, a pixel Bb of the relevant block in the base layer, and a prediction image Pb0 in the base layer. Here, w denotes a weight coefficient of the residual signal and takes any one of the values 0, 0.5, and 1; information on which value is taken is transmitted for each CU in the image compression information to be output. Expressions (3) to (6) are reconstructed below.
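  • Reconstructed from the definitions above (the expressions themselves are not reproduced in this text):

$$\mathrm{Res}_E = \mathrm{Cur}_E - \mathrm{Ref}_E \qquad (3)$$

$$\mathrm{Res}_B = \mathrm{Cur}_B - \mathrm{Ref}_B \qquad (4)$$

$$\mathrm{Res}_E' = \mathrm{Res}_E - \mathrm{UP}[\mathrm{Res}_B] \qquad (5)$$

$$P = P_{e0} + w \cdot (B_b - P_{b0}) \qquad (6)$$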
  • When the input signal is 8 bits and the value of the weight coefficient w of the residual signal (also referred to as residual information) is 1, the value of w(Bb - Pb0) is 9 bits; holding this 9-bit residual signal increases the storage capacity necessary for encoding and decoding.
  • the term "current layer” is a layer on which encoding and decoding processing is to be performed, and indicates, for example, an enhancement layer.
  • the term “another layer” is a layer other than the current layer that acquires a residual signal that is used for the processing of the current layer, and indicates, for example, a base layer or another enhancement layer.
  • In the present disclosure, it is determined whether a residual signal has a value falling within 8 bits (that is, within the signed 8-bit range of -128 to 127) or a value falling outside that range. When the residual signal falls within 8 bits, the value of the weight coefficient w is set to 1. When the residual signal falls outside 8 bits, the value of the weight coefficient w is set to 0.5.
  • Cost function values are then calculated for the non-zero value of w determined in this manner and for the case where the value of w is zero, and mode determination processing decides whether encoding is performed with the zero or the non-zero weight coefficient (see the sketch below).
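  • A minimal encoder-side sketch of this decision, assuming signed sample residuals held in a NumPy array (the names, and the realization of w = 0.5 as an integer halving, are assumptions):

```python
import numpy as np

def residual_weight_and_shift(res_b_up: np.ndarray) -> tuple[float, np.ndarray]:
    """If the up-sampled base layer residual fits in signed 8 bits, use w = 1
    and buffer the signal as-is; otherwise use w = 0.5 and halve the signal
    so that it again fits in an 8-bit buffer."""
    if res_b_up.min() >= -128 and res_b_up.max() <= 127:
        return 1.0, res_b_up
    return 0.5, res_b_up // 2  # integer halving brings a 9-bit signal within 8 bits

def use_nonzero_weight(cost_w_zero: float, cost_w_nonzero: float) -> bool:
    """Mode determination: encode with the non-zero w only if it costs less."""
    return cost_w_nonzero < cost_w_zero
```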
  • Information on whether zero or non-zero is used as the weight coefficient w (hereinafter referred to as weight coefficient usage information) is transmitted for each CU.
  • In NPL 1, it is necessary to transmit, for each CU, information on which of 0, 0.5, and 1 is used as the weight coefficient w, and 2 bits are required for this.
  • In contrast, in the present disclosure, the weight coefficient usage information indicating whether zero or non-zero is used as the weight coefficient w may be transmitted for each CU. Alternatively, the weight coefficient usage information may be transmitted for each LCU or in each slice header.
  • When the weight coefficient w is non-zero, whether 0.5 or 1 is used is independently set (determined) for each PU included in the relevant CU. Therefore, residual prediction can be performed according to the characteristics of the signals of the individual PUs, as compared with the suggestion in NPL 1, and the encoding efficiency of the image compression information to be output can be enhanced. Alternatively, when the weight coefficient w is non-zero, whether 0.5 or 1 is used may be set (determined) for each CU.
  • Information on whether 0.5 or 1 is used as the weight coefficient w is hereinafter referred to as weight coefficient value information, in order to distinguish it from the weight coefficient usage information on whether zero or non-zero is used.
  • The above-described processing for reducing the amount of information of the residual signal in the base layer can be applied to each of the luminance signal and the color-difference signal. That is, the processing may be performed only for encoding and decoding of the luminance signal, only for encoding and decoding of the color-difference signal, or for both.
  • The present technique is particularly effective in a case where the bit depth of the input signal in the base layer is 8 bits, as described above. Whether the input signal in the base layer is 8 bits can be determined from the syntax in the following manner.
  • bit_depth_luma_minus8 is a parameter for defining bit depth of a luminance signal, and a value obtained by subtracting 8 from the bit depth is set.
  • bit_depth_chroma_minus8 is a parameter for defining bit depth of a color-difference signal, and a value obtained by subtracting 8 from the bit depth is set.
  • When the value of bit_depth_luma_minus8 is 0, processing according to the present disclosure is applied to the luminance signal. In addition, when the value of bit_depth_chroma_minus8 is 0, processing according to the present disclosure is applied to the color-difference signal (see the sketch below).
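  • A small sketch of this syntax check (a hypothetical function; only the two parameter names come from the text):

```python
def residual_processing_applies(bit_depth_luma_minus8: int,
                                bit_depth_chroma_minus8: int) -> dict[str, bool]:
    """The parameters store (bit depth - 8), so a value of 0 means the input
    signal is 8 bits, which is the case the present disclosure targets."""
    return {
        "luma": bit_depth_luma_minus8 == 0,      # apply to the luminance signal
        "chroma": bit_depth_chroma_minus8 == 0,  # apply to the color-difference signal
    }
```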
  • Fig. 10 is a diagram illustrating an image encoding apparatus which is an example of an image processing apparatus to which the present disclosure is applied.
  • An image encoding apparatus 100 illustrated in Fig. 10 is an apparatus that performs hierarchical image encoding. As illustrated in Fig. 10, the image encoding apparatus 100 includes a base layer image encoding unit 101, an enhancement layer image encoding unit 102, and a multiplexer 103.
  • the base layer image encoding unit 101 encodes a base layer image to generate a base layer image encoding stream.
  • the enhancement layer image encoding unit 102 encodes an enhancement layer image to generate an enhancement layer image encoding stream.
  • the multiplexer 103 multiplexes the base layer image encoding stream generated by the base layer image encoding unit 101 and the enhancement layer image encoding stream generated by the enhancement layer image encoding unit 102 to generate a hierarchical image encoding stream.
  • the multiplexer 103 transmits the generated hierarchical image encoding stream to the decoding side.
  • the base layer image encoding unit 101 supplies a residual signal in a base layer to the enhancement layer image encoding unit 102, for a block on which inter prediction is performed.
  • The enhancement layer image encoding unit 102 acquires the residual signal in the base layer from the base layer image encoding unit 101. When the input signal in the base layer is 8 bits and the residual signal in the base layer is 9 bits, the enhancement layer image encoding unit 102 determines the value of the weight coefficient of the residual signal to be 0.5. The enhancement layer image encoding unit 102 then performs prediction processing in encoding of the enhancement layer, using the residual signal in the base layer with the determined value of the weight coefficient applied thereto.
  • the enhancement layer image encoding unit 102 transmits weight coefficient usage information on whether the value of the weight coefficient of the residual signal is non-zero or zero, to the decoding side through the multiplexer 103 (as the hierarchical image encoding stream).
  • Fig. 11 is a block diagram illustrating a main configuration example of the base layer image encoding unit 101 of Fig. 10.
  • the base layer image encoding unit 101 includes an A/D conversion unit 111, a screen rearrangement buffer 112, an arithmetic computation unit 113, an orthogonal transform unit 114, a quantization unit 115, a lossless encoding unit 116, an accumulation buffer 117, an inverse quantization unit 118, and an inverse orthogonal transform unit 119.
  • the base layer image encoding unit 101 includes an arithmetic computation unit 120, a deblocking filter 121-1, an adaptive offset filter 121-2, a frame memory 122, and a selection unit 123. Furthermore, the base layer image encoding unit 101 includes an intra prediction unit 124, a motion prediction and compensation unit 125, a prediction image selection unit 126, and a rate control unit 127.
  • the A/D conversion unit 111 performs A/D conversion on input image data (base layer image information), supplies the converted image data (digital data) to the screen rearrangement buffer 112, and stores the data therein.
  • the screen rearrangement buffer 112 rearranges, in accordance with a group of pictures (GOP), frame images stored therein which are arranged in a display order, in a frame order for encoding, and supplies the images rearranged in a frame order to the arithmetic computation unit 113.
  • the screen rearrangement buffer 112 also supplies the images rearranged in a frame order to the intra prediction unit 124 and the motion prediction and compensation unit 125.
  • the arithmetic computation unit 113 subtracts, from an image read out from the screen rearrangement buffer 112, a prediction image supplied from the intra prediction unit 124 or the motion prediction and compensation unit 125 through the prediction image selection unit 126, and outputs difference information thereof to the orthogonal transform unit 114. For example, in the case of an image on which intra encoding is performed, the arithmetic computation unit 113 subtracts, from the image read out from the screen rearrangement buffer 112, a prediction image supplied from the intra prediction unit 124. In addition, for example, in the case of an image on which inter-encoding is performed, the arithmetic computation unit 113 subtracts, from the image read out from the screen rearrangement buffer 112, a prediction image supplied from the motion prediction and compensation unit 125.
  • the orthogonal transform unit 114 performs orthogonal transform, such as discrete cosine transform or Karhunen-Loeve transform, on the difference information supplied from the arithmetic computation unit 113.
  • the orthogonal transform unit 114 supplies a transform coefficient thereof to the quantization unit 115.
  • the quantization unit 115 quantizes the transform coefficient supplied from the orthogonal transform unit 114.
  • At this time, the quantization unit 115 sets a quantization parameter on the basis of information on a target value of the code amount supplied from the rate control unit 127, and performs the quantization.
  • the quantization unit 115 supplies the quantized transform coefficient to the lossless encoding unit 116.
  • The lossless encoding unit 116 encodes the transform coefficients quantized by the quantization unit 115, using any encoding scheme. Since the coefficient data is quantized under the control of the rate control unit 127, the code amount thereof becomes the target value set by the rate control unit 127 (or approximates the target value).
  • the lossless encoding unit 116 acquires information on a mode of intra prediction from the intra prediction unit 124, and acquires information on a mode of inter prediction and difference motion vector information from the motion prediction and compensation unit 125. Furthermore, the lossless encoding unit 116 appropriately generates an NAL unit of a base layer which includes a sequence parameter set (SPS), a picture parameter set (PPS), and the like.
  • the lossless encoding unit 116 encodes these various pieces of information using any encoding scheme, and uses (multiplexes) the pieces of information as a portion of encoded data (also referred to as an encoding stream).
  • the lossless encoding unit 116 supplies the encoded data obtained by the encoding to the accumulation buffer 117 and accumulates the encoded data therein.
  • Examples of the encoding scheme of the lossless encoding unit 116 include variable-length encoding, arithmetic encoding, and the like.
  • Examples of the variable-length encoding include context-adaptive variable length coding (CAVLC) determined in an H.264/AVC format, and the like.
  • Examples of the arithmetic encoding include context-adaptive binary arithmetic coding (CABAC), and the like.
  • the accumulation buffer 117 temporarily holds the encoded data (base layer encoded data) which is supplied from the lossless encoding unit 116.
  • The accumulation buffer 117 outputs the held base layer encoded data at a predetermined timing to, for example, a recording device (recording medium) or a transmission path, not shown in the drawing, in a subsequent stage. That is, the accumulation buffer 117 is also a transmission unit that transmits the encoded data.
  • the transform coefficient that is quantized by the quantization unit 115 is supplied to the inverse quantization unit 118.
  • the inverse quantization unit 118 performs inverse quantization on the quantized transform coefficient by a method corresponding to the quantization using the quantization unit 115.
  • the inverse quantization unit 118 supplies the obtained transform coefficient to the inverse orthogonal transform unit 119.
  • the inverse orthogonal transform unit 119 performs inverse orthogonal transform on the transform coefficient supplied from the inverse quantization unit 118 by a method corresponding to the orthogonal transform process using the orthogonal transform unit 114.
  • An output (restored difference information) which has been subjected to the inverse orthogonal transform is supplied to the arithmetic computation unit 120.
  • the arithmetic computation unit 120 adds the prediction image supplied from the intra prediction unit 124 or the motion prediction and compensation unit 125 through the prediction image selection unit 126 to the restored difference information as a result of the inverse orthogonal transform which is supplied from the inverse orthogonal transform unit 119, to obtain a locally-decoded image (decoded image).
  • the decoded image is supplied to the deblocking filter 121-1 or the frame memory 122.
  • the deblocking filter 121-1 performs deblocking filter processing on a reconfiguration image supplied from the arithmetic computation unit 120, to remove block distortion of the reconfiguration image.
  • the deblocking filter 121-1 supplies the filter-processed image to the adaptive offset filter 121-2.
  • the adaptive offset filter 121-2 performs adaptive offset filter (sample adaptive offset (SAO)) processing of mainly removing ringing, on a result (the reconfiguration image from which the block distortion is removed) of the deblocking filter processing which is supplied from the deblocking filter 121-1.
  • the adaptive offset filter 121-2 determines the type of adaptive offset filter processing for each largest coding unit (LCU) which is a largest unit of encoding, and obtains an offset which is used in the adaptive offset filter processing.
  • the adaptive offset filter 121-2 performs the determined type of adaptive offset filter processing on an image subjected to adaptive deblocking filter processing, using the obtained offset.
  • the adaptive offset filter 121-2 supplies the image subjected to the adaptive offset filter processing (hereinafter, referred to as a decoded image) to the frame memory 122.
  • the deblocking filter 121-1 and the adaptive offset filter 121-2 can also supply information, such as a filter coefficient used for filter processing to the lossless encoding unit 116 if necessary, to encode the information.
  • Meanwhile, an adaptive loop filter may be provided in a stage after the adaptive offset filter 121-2.
  • the frame memory 122 stores a decoded image that is supplied, and supplies the stored decoded image as a reference image to the selection unit 123 at a predetermined timing.
  • the frame memory 122 stores the reconfiguration image supplied from the arithmetic computation unit 120 and the decoded image supplied from the adaptive offset filter 121-2.
  • the frame memory 122 supplies the stored reconfiguration image to the intra prediction unit 124 through the selection unit 123 at a predetermined timing or on the basis of a request from the outside such as the intra prediction unit 124.
  • the frame memory 122 supplies the stored decoded image to the motion prediction and compensation unit 125 through the selection unit 123 at a predetermined timing or on the basis of a request from the outside such as the motion prediction and compensation unit 125.
  • the selection unit 123 selects a supply destination of the reference image supplied from the frame memory 122. For example, in the case of intra prediction, the selection unit 123 supplies the reference image (a pixel value within a current picture) which is supplied from the frame memory 122, to the intra prediction unit 124. In addition, for example, in the case of inter prediction, the selection unit 123 supplies the reference image (a pixel value outside a current picture) which is supplied from the frame memory 122, to the motion prediction and compensation unit 125.
  • the intra prediction unit 124 performs prediction processing on the current picture which is an image of a frame to be processed, to generate a prediction image.
  • the intra prediction unit 124 performs the prediction processing on each predetermined block (using a block as a unit of processing). In other words, the intra prediction unit 124 generates a prediction image of a current block to be processed of the current picture.
  • the intra prediction unit 124 performs prediction processing (in-screen prediction (also referred to as intra prediction)) using the reconfiguration image supplied as the reference image from the frame memory 122 through the selection unit 123. In other words, the intra prediction unit 124 generates a prediction image using a peripheral pixel value, included in the reconfiguration image, of the current block.
  • the peripheral pixel value used for the intra prediction is a pixel value of a pixel of the current picture which is processed in the past.
  • a plurality of methods (also referred to as intra prediction modes) are prepared in advance as candidates, for the intra prediction (that is, for a method of generating the prediction image).
  • the intra prediction unit 124 performs the intra prediction in the plurality of intra prediction modes which are prepared in advance.
  • the intra prediction unit 124 generates prediction images in all the intra prediction modes serving as candidates, evaluates a cost function value of each prediction image by using the input image supplied from the screen rearrangement buffer 112, and selects an optimal mode.
  • the intra prediction unit 124 selects the optimal intra prediction mode, the intra prediction unit supplies a prediction image generated in the optimal mode to the prediction image selection unit 126.
  • the intra prediction unit 124 appropriately supplies intra prediction mode information on the adopted intra prediction mode to the lossless encoding unit 116 and encodes the information.
  • the motion prediction and compensation unit 125 performs prediction processing on a current picture to generate a prediction image.
  • the motion prediction and compensation unit 125 performs the prediction processing on each predetermined block (using a block as a unit of processing). In other words, the motion prediction and compensation unit 125 generates a prediction image of a current block of the current picture which is to be processed.
  • the motion prediction and compensation unit 125 performs prediction processing using image data of the input image supplied from the screen rearrangement buffer 112 and image data of the decoded image supplied as the reference image from the frame memory 122.
  • the decoded image is an image (another picture which is not a current picture) of a frame processed prior to the current picture.
  • the motion prediction and compensation unit 125 performs prediction processing (inter-screen prediction (also referred to as inter prediction)) of generating a prediction image, using an image of another picture.
  • the inter prediction is constituted by motion prediction and motion compensation. More specifically, the motion prediction and compensation unit 125 performs motion prediction on a current block using an input image and a reference image, to detect a motion vector. The motion prediction and compensation unit 125 performs motion compensation processing in accordance with the detected motion vector, using the reference image to generate a prediction image (inter prediction image information) of the current block. A plurality of methods (also referred to as inter prediction modes) are prepared in advance as candidates for the inter prediction (that is, for a method of generating the prediction image). The motion prediction and compensation unit 125 performs the inter prediction in the plurality of inter prediction modes which are prepared in advance.
  • the motion prediction and compensation unit 125 generates prediction images in all the inter prediction modes serving as candidates.
  • the motion prediction and compensation unit 125 evaluates a cost function value of each prediction image, using the input image supplied from the screen rearrangement buffer 112, information of the generated difference motion vector, and the like, and selects an optimal mode.
  • the motion prediction and compensation unit 125 selects the optimal inter prediction mode, the motion prediction and compensation unit supplies a prediction image generated in the optimal mode to the prediction image selection unit 126.
  • In addition, the motion prediction and compensation unit 125 supplies information on the adopted inter prediction mode, or information necessary to perform processing in the inter prediction mode at the time of decoding the encoded data, to the lossless encoding unit 116 to encode the information. Examples of the necessary information include information of the generated difference motion vector, a flag indicating an index of a prediction motion vector as prediction motion vector information, and the like.
  • the prediction image selection unit 126 selects a supply source of a prediction image to be supplied to the arithmetic computation unit 113 or the arithmetic computation unit 120. For example, in the case of intra encoding, the prediction image selection unit 126 selects the intra prediction unit 124 as a supply source of a prediction image, and supplies the prediction image supplied from the intra prediction unit 124 to the arithmetic computation unit 113 or the arithmetic computation unit 120.
  • the prediction image selection unit 126 selects the motion prediction and compensation unit 125 as a supply source of a prediction image, and supplies the prediction image supplied from the motion prediction and compensation unit 125 to the arithmetic computation unit 113 or the arithmetic computation unit 120.
  • the rate control unit 127 controls a rate of a quantizing operation of the quantization unit 115 on the basis of a code amount of the encoded data accumulated in the accumulation buffer 117 so that overflow or underflow does not occur.
  • the base layer image encoding unit 101 performs encoding with no reference to another layer.
  • the intra prediction unit 124 and the motion prediction and compensation unit 125 do not refer to information on encoding of another layer.
  • However, the base layer image encoding unit 101 performs the process described above in <0. Outline>. That is, the motion prediction and compensation unit 125 supplies a residual signal of an inter block, which is encoded by inter-frame prediction encoding in the base layer, to the enhancement layer image encoding unit 102. Meanwhile, although not shown in the drawing, a decoded image in the base layer is also supplied from the frame memory 122 to the enhancement layer image encoding unit 102.
  • Fig. 12 is a block diagram illustrating a main configuration example of the enhancement layer image encoding unit 102 of Fig. 10. As illustrated in Fig. 12, the enhancement layer image encoding unit 102 has a configuration that is basically similar to that of the base layer image encoding unit 101 of Fig. 11.
  • the enhancement layer image encoding unit 102 includes an A/D conversion unit 131, a screen rearrangement buffer 132, an arithmetic computation unit 133, an orthogonal transform unit 134, a quantization unit 135, a lossless encoding unit 136, an accumulation buffer 137, an inverse quantization unit 138, and an inverse orthogonal transform unit 139.
  • the enhancement layer image encoding unit 102 includes an arithmetic computation unit 140, a deblocking filter 141-1, an adaptive offset filter 141-2, a frame memory 142, a selection unit 143, an intra prediction unit 144, a motion prediction and compensation unit 145, a prediction image selection unit 146, and a rate control unit 147.
  • the A/D conversion unit 131 to the rate control unit 147 correspond to the A/D conversion unit 111 to the rate control unit 127 of Fig. 11, and perform similar processing to the corresponding processing units.
  • the units of the enhancement layer image encoding unit 102 perform processing concerning encoding of image information of an enhancement layer rather than a base layer. Therefore, as a description of processing of the A/D conversion unit 131 to the rate control unit 147, the description concerning the above-described A/D conversion unit 111 to rate control unit 127 of Fig. 11 can be applied.
  • In this case, however, the data to be processed is data of the enhancement layer rather than data of the base layer. In addition, the input sources and output destinations of data must be read as the corresponding processing units among the A/D conversion unit 131 to the rate control unit 147.
  • the enhancement layer image encoding unit 102 performs encoding with reference to information of another layer (for example, a base layer).
  • In addition, the enhancement layer image encoding unit 102 performs the processing described above in <0. Outline>.
  • the enhancement layer image encoding unit 102 includes a residual prediction unit 148 and an up-sampling unit 149.
  • The residual prediction unit 148 determines the range of a residual signal (hereinafter also simply referred to as a range), indicating whether a base layer residual signal from the up-sampling unit 149 falls within 8 bits or is larger than 8 bits. The residual prediction unit 148 determines a weight coefficient w according to the range information which is the determination result, and supplies the determined weight coefficient to the motion prediction and compensation unit 145. In addition, the residual prediction unit 148 shifts the base layer residual signal from the up-sampling unit 149 according to the range information, and supplies the shifted base layer residual signal to the motion prediction and compensation unit 145.
  • the up-sampling unit 149 acquires, from the base layer image encoding unit 101, a base layer residual signal of an interblock which is encoded by inter-frame prediction encoding in the base layer, and up-samples the acquired base layer residual signal.
  • the up-sampling unit 149 supplies the up-sampled base layer residual signal to the residual prediction unit 148.
  • the up-sampling unit 149 is supplied with a base layer decoded image from the base layer image encoding unit 101.
  • the up-sampling unit 149 up-samples the supplied base layer decoded image to the resolution of the enhancement layer.
  • the up-sampling unit 149 supplies the up-sampled base layer decoded image information to the frame memory 142 of the enhancement layer image encoding unit 102.
  • the motion prediction and compensation unit 145 performs motion prediction on a current block, using an input image and a reference image to detect a motion vector.
  • the motion prediction and compensation unit 145 generates a prediction image using all prediction modes serving as candidates.
  • The motion prediction and compensation unit 145 evaluates a cost function value of the prediction image of each prediction mode, using the input image supplied from the screen rearrangement buffer 132, information of the generated difference motion vector, and the like, and selects an optimal mode. At this time, the motion prediction and compensation unit 145 calculates a cost function value of each prediction mode for the cases where the weight coefficient is zero and non-zero, using the weight coefficient from the residual prediction unit 148 and the shifted base layer residual signal. When the motion prediction and compensation unit 145 selects an optimal inter prediction mode on the basis of the calculated cost function values, it supplies a prediction image generated in the optimal mode to the prediction image selection unit 146.
  • In addition, the motion prediction and compensation unit 145 supplies information on the adopted inter prediction mode, or information necessary to perform processing in the inter prediction mode at the time of decoding the encoded data, to the lossless encoding unit 136 to encode the information. Examples of the necessary information include not only information of the generated difference motion vector and a flag indicating an index of a prediction motion vector as prediction motion vector information, but also information on the weight coefficient of the base layer residual signal (weight coefficient usage information and weight coefficient value information).
  • Fig. 13 is a block diagram illustrating a main configuration example of the residual prediction unit 148 of Fig. 12. Meanwhile, the flow of data between the units in the example of Fig. 13 shows a case where the base layer residual signal is 9 bits.
  • the residual prediction unit 148 is configured to include a range determination unit 161, a weight coefficient determination unit 162, a shifting unit 163, and a residual buffer 164.
  • the range determination unit 161 determines a range of a residual signal indicating whether a base layer residual signal from the up-sampling unit 149 falls within or is larger than 8 bits.
  • the range determination unit 161 supplies range information which is the determination result thereof to the weight coefficient determination unit 162.
  • the range determination unit 161 supplies the base layer residual signal and the range information to the shifting unit 163.
  • the weight coefficient determination unit 162 determines whether the value of a weight coefficient w is 0.5 or 1 according to the range information from the range determination unit 161, and supplies weight coefficient information which is the determined value of the weight coefficient w to the motion prediction and compensation unit 145. Specifically, when the base layer residual signal does not fall within 8 bits, the weight coefficient w is determined to be 0.5; when it falls within 8 bits, the weight coefficient w is determined to be 1.
  • the shifting unit 163 shifts the base layer residual signal from the range determination unit 161 according to the range information from the range determination unit 161, and supplies the shifted base layer residual signal to the residual buffer 164. For example, when the base layer residual information does not fall within 8 bits, the base layer residual signal is shifted to be accumulated in the residual buffer 164. When the base layer residual signal falls within 8 bits, the base layer residual signal is accumulated in the residual buffer 164 as it is.
  • Alternatively, both the shifted base layer residual signal and the intact base layer residual signal may be accumulated, and either residual signal is used by the motion prediction and compensation unit 145 in accordance with the weight coefficient w.
  • the residual buffer 164 accumulates the supplied base layer residual signal, reads out the signal at a predetermined timing, and supplies the signal to the motion prediction and compensation unit 145.
  • the image encoding apparatus 100 (the enhancement layer image encoding unit 102) can suppress an increase in storage capacity necessary for encoding or decoding.
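  • The cooperation of the range determination unit, the weight coefficient determination unit, and the shifting unit can be modeled with the following sketch, assuming a signed 8-bit range and a one-bit shift for the 9-bit case of Fig. 13 (a simplified illustration, not the actual implementation):

```python
# Simplified model of the residual prediction path of Fig. 13.
# Assumptions: "falls within 8 bits" means every sample fits in a signed
# 8-bit range, and an out-of-range signal is shifted by one bit
# (matching the 9-bit example and the weight coefficient of 0.5).

def fits_in_8_bits(residual):
    # Range determination (range determination unit 161).
    return all(-128 <= sample <= 127 for sample in residual)

def residual_prediction(residual):
    # Returns (weight coefficient w, signal accumulated in the buffer).
    if fits_in_8_bits(residual):
        return 1.0, list(residual)           # stored as it is
    return 0.5, [s >> 1 for s in residual]   # shifted to fit within 8 bits

w, buffered = residual_prediction([300, -42, 7])   # a 9-bit residual
print(w, buffered)                                 # 0.5 [150, -21, 3]
```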
  • step S101 the base layer image encoding unit 101 of the image encoding apparatus 100 encodes image data of a base layer.
  • the base layer encoding processing will be described later in detail with reference to Fig. 15.
  • step S102 the enhancement layer image encoding unit 102 encodes image data of an enhancement layer.
  • the enhancement layer encoding processing will be described later in detail with reference to Fig. 16.
  • step S103 the multiplexer 103 multiplexes a base layer image encoding stream generated by the processing of step S101 and an enhancement layer image encoding stream generated by the processing of step S102 (that is, bit streams of layers) to generate one hierarchical image encoding stream.
  • When the process of step S103 is terminated, the image encoding apparatus 100 terminates the image encoding processing.
  • One picture is processed by the image encoding processing. Therefore, the image encoding apparatus 100 repeatedly performs the image encoding processing on each picture of hierarchized moving image data.
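  • In outline, one iteration of steps S101 to S103 can be sketched as follows (the encoders and the multiplexer are toy stand-ins so the flow is runnable; they are not the real processing of the units 101 to 103):

```python
# Illustrative top-level flow of the image encoding processing
# (steps S101-S103). The "encoders" below are toy stand-ins.

def encode_base_layer(picture):
    stream = b"BL:" + bytes(picture)
    residual = [v % 8 for v in picture]      # pretend inter residual
    return stream, residual

def encode_enhancement_layer(picture, base_residual):
    return b"EL:" + bytes((p + r) % 256
                          for p, r in zip(picture, base_residual))

def multiplex(base_stream, enh_stream):
    return base_stream + b"|" + enh_stream   # one hierarchical stream

base_stream, residual = encode_base_layer([1, 2, 3])          # step S101
enh_stream = encode_enhancement_layer([4, 5, 6], residual)    # step S102
print(multiplex(base_stream, enh_stream))                     # step S103
```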
  • step S121 the A/D conversion unit 111 of the base layer image encoding unit 101 performs A/D conversion on an image of each of frames (pictures) of an input moving image.
  • step S122 the screen rearrangement buffer 112 stores the image that is A/D converted in step S121, and performs rearrangement of the pictures from a display order to an encoding order.
  • step S123 the intra prediction unit 124 performs intra prediction processing of an intra prediction mode.
  • step S124 the motion prediction and compensation unit 125 performs inter prediction processing of performing motion prediction or motion compensation in an inter prediction mode.
  • step S125 the prediction image selection unit 126 selects a prediction image on the basis of a cost function value and the like. In other words, the prediction image selection unit 126 selects any one of a prediction image generated by the intra prediction of step S123 and a prediction image generated by the inter prediction of step S124.
  • step S126 the arithmetic computation unit 113 arithmetically calculates a difference between an input image that is rearranged in a frame order by the processing of step S122 and the prediction image selected by the processing of step S125.
  • the arithmetic computation unit 113 generates image data of a difference image between the input image and the prediction image.
  • the amount of image data of the difference image which is obtained in this manner is reduced as compared with original image data. Therefore, it is possible to compress the amount of data as compared with a case where an image is encoded as it is.
  • step S127 the orthogonal transform unit 114 performs orthogonal transform on the image data of the difference image which is generated by the processing of step S126.
  • step S128 the quantization unit 115 quantizes an orthogonal transform coefficient that is obtained by the processing of step S127, using a quantization parameter that is calculated by the rate control unit 127.
  • step S129 the inverse quantization unit 118 performs inverse quantization on the quantized coefficient (also referred to as a quantization coefficient) which is generated by the processing of step S128, using characteristics corresponding to the characteristics of the quantization unit 115.
  • step S130 the inverse orthogonal transform unit 119 performs inverse orthogonal transform on the orthogonal transform coefficient which is obtained by the processing of step S129.
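  • The quantization of step S128 and the inverse quantization of step S129 form a lossy round trip, which the following toy sketch illustrates (a plain uniform quantizer with an assumed step size; the actual quantization derived from the quantization parameter is considerably more elaborate):

```python
# Toy uniform quantization (step S128) and inverse quantization (step
# S129). qstep is an assumed step size standing in for the value derived
# from the quantization parameter of the rate control unit 127.

def quantize(coefficients, qstep):
    return [round(c / qstep) for c in coefficients]

def dequantize(levels, qstep):
    return [level * qstep for level in levels]

levels = quantize([100.0, -37.0, 4.0], qstep=8)
print(levels)                       # [12, -5, 0]
print(dequantize(levels, qstep=8))  # [96, -40, 0] -- lossy round trip
```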
  • step S131 the arithmetic computation unit 120 adds the prediction image selected by the processing of step S125 to the difference image which is restored by the processing of step S130, to generate image data of a reconfiguration image.
  • step S132 the deblocking filter 121-1 performs deblocking filter processing on the image data of the reconfiguration image which is generated by the processing of step S131. Thus, block distortion of the reconfiguration image is removed.
  • step S133 the adaptive offset filter 121-2 performs adaptive offset filter processing of mainly removing ringing, on a result of the deblocking filter processing from the deblocking filter 121-1.
  • step S134 the frame memory 122 stores data such as a decoded image obtained by the processing of step S133 or the reconfiguration image obtained by the processing of step S131.
  • step S135 the lossless encoding unit 116 encodes the quantized coefficient which is obtained by the processing of step S128. That is, lossless encoding, such as variable-length encoding or arithmetic encoding, is performed on data corresponding to the difference image.
  • the lossless encoding unit 116 encodes information on a prediction mode of the prediction image which is selected by the processing of step S125, and adds the information to encoded data that is obtained by encoding the difference image.
  • the lossless encoding unit 116 also encodes optimal intra prediction mode information supplied from the intra prediction unit 124 or information according to an optimal inter prediction mode supplied from the motion prediction and compensation unit 125 and adds the information to the encoded data.
  • the lossless encoding unit 116 also sets syntax elements of various units, encodes the syntax elements, and adds the syntax elements to the encoded data.
  • step S136 the accumulation buffer 117 accumulates the encoded data obtained by the processing of step S135.
  • the encoded data accumulated in the accumulation buffer 117 is appropriately read out and is transmitted to the decoding side through a transmission path or a recording medium.
  • step S137 the rate control unit 127 controls a rate of a quantizing operation of the quantization unit 115, on the basis of a code amount (generated code amount) of the encoded data that is accumulated in the accumulation buffer 117 by the processing of step S136, so that overflow or underflow does not occur.
  • the rate control unit 127 supplies information on the quantization parameter to the quantization unit 115.
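  • The feedback of step S137 can be illustrated with a toy rate controller (the thresholds and the 0 to 51 parameter range are assumptions made for the example; the actual control algorithm is not specified here):

```python
# Toy rate control in the spirit of step S137: raise the quantization
# parameter when the accumulation buffer fills (fewer bits, coarser
# quantization), lower it when the buffer drains.

def update_qp(qp, buffer_fullness, target=0.5, deadzone=0.1):
    if buffer_fullness > target + deadzone:
        return min(51, qp + 1)   # risk of overflow: spend fewer bits
    if buffer_fullness < target - deadzone:
        return max(0, qp - 1)    # risk of underflow: spend more bits
    return qp

print(update_qp(30, 0.8), update_qp(30, 0.2), update_qp(30, 0.5))  # 31 29 30
```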
  • step S138 the motion prediction and compensation unit 125 supplies a residual signal in a base layer which is obtained by the above-described base layer encoding processing to encoding processing of an enhancement layer.
  • When the process of step S138 is terminated, the base layer encoding processing is terminated, and the processing returns to Fig. 14.
  • step S151 the up-sampling unit 149 of the enhancement layer image encoding unit 102 acquires a base layer residual signal from the base layer image encoding unit 101 and up-samples the base layer residual signal.
  • the up-sampling unit 149 supplies the up-sampled base layer residual signal to the residual prediction unit 148.
  • step S152 the A/D conversion unit 131 performs A/D conversion on an image of each of frames (pictures) of an input moving image of the enhancement layer.
  • step S153 the screen rearrangement buffer 132 stores the image that is A/D converted in step S152, and performs rearrangement of the pictures from a display order to an encoding order.
  • step S154 the intra prediction unit 144 performs intra prediction processing.
  • step S155 the motion prediction and compensation unit 145 performs inter prediction processing.
  • the inter prediction process will be described later with reference to Fig. 17.
  • the value of a weight coefficient is determined in accordance with a range of a base layer residual signal.
  • a cost function value of each prediction mode in a case where the value of the weight coefficient of the base layer residual signal is zero or non-zero is calculated, and thus an optimal prediction mode and the value of the weight coefficient are determined.
  • Pieces of processing of step S156 to step S168 correspond to the pieces of processing of step S125 to step S137 of Fig. 15, respectively, and are performed in a similar manner to the pieces of processing.
  • When a prediction image of inter prediction is selected in step S156, information of the optimal prediction mode determined in step S155 and weight coefficient usage information on whether the weight coefficient is zero or non-zero are supplied to the lossless encoding unit 136 and are encoded in step S166.
  • When the process of step S168 is terminated, the enhancement layer encoding processing is terminated, and the processing returns to Fig. 14.
  • Fig. 17 illustrates an example in which the value of w is determined to be 1 when a residual signal falls within 8 bits.
  • step S181 the motion prediction and compensation unit 145 of the enhancement layer image encoding unit 102 performs motion searching processing for each mode.
  • step S182 the range determination unit 161 determines the range of the residual signal, that is, whether the base layer residual signal from the up-sampling unit 149 falls within or is larger than 8 bits. Meanwhile, the determination processing is performed for each CU or PU.
  • the range determination unit 161 supplies range information which is the determination result thereof to the weight coefficient determination unit 162.
  • the range determination unit 161 supplies the base layer residual signal and the range information to the shifting unit 163.
  • step S183 the weight coefficient determination unit 162 determines whether the value of a weight coefficient w is 0.5 or 1, in accordance with the range information from the range determination unit 161. That is, when the base layer residual information does not fall within 8 bits, the value of the weight coefficient w is determined to be 0.5, and when the base layer residual information falls within 8 bits, the value of the weight coefficient w is determined to be 1.
  • the weight coefficient determination unit 162 supplies weight coefficient information which is the determined value of the weight coefficient w to the motion prediction and compensation unit 145.
  • step S184 the shifting unit 163 shifts the base layer residual signal from the range determination unit 161 in accordance with the range information from the range determination unit 161, and supplies the shifted base layer residual signal to the residual buffer 164.
  • For example, when the base layer residual signal does not fall within 8 bits, the base layer residual signal is shifted and then accumulated in the residual buffer 164.
  • When the base layer residual signal falls within 8 bits, the base layer residual signal is accumulated in the residual buffer 164 as it is.
  • the residual buffer 164 accumulates the supplied base layer residual signal, reads out the signal at a predetermined timing, and supplies the signal to the motion prediction and compensation unit 145.
  • step S185 the motion prediction and compensation unit 145 calculates a cost function value for each mode in a case where the weight coefficient is zero and non-zero, using the motion information searched for in step S181, the weight coefficient from the residual prediction unit 148, and the shifted base layer residual signal.
  • step S186 the motion prediction and compensation unit 145 performs mode determination on the basis of the cost function value for each mode which is calculated in step S185. As a result, the motion prediction and compensation unit 145 determines an optimal inter prediction mode and the value (non-zero or zero) of the weight coefficient w.
  • step S187 the motion prediction and compensation unit 145 performs motion compensation in the optimal inter prediction mode that is selected in step S186 to generate a prediction image.
  • the generated prediction image is supplied to the prediction image selection unit 146 together with information on the optimal inter prediction mode.
  • step S188 the motion prediction and compensation unit 145 supplies weight coefficient usage information on whether the determined value of the weight coefficient is zero or non-zero to the lossless encoding unit 136 in order to transmit the information to the decoding side, and encodes the information.
  • In this manner, when the base layer residual signal does not fall within 8 bits, the value of the weight coefficient w is determined to be 0.5, and thus the base layer residual signal is shifted to fit within 8 bits and is accumulated in the residual buffer 164.
  • Fig. 18 illustrates an example in which either 0.5 or 1 is used as the value of w in a case where a residual signal falls within 8 bits.
  • step S191 the motion prediction and compensation unit 145 of the enhancement layer image encoding unit 102 performs motion searching processing for each mode.
  • step S192 the range determination unit 161 determines the range of the residual signal, that is, whether the base layer residual signal from the up-sampling unit 149 falls within or is larger than 8 bits. Meanwhile, the determination processing is performed for each CU or PU.
  • the range determination unit 161 supplies range information which is the determination result thereof to the weight coefficient determination unit 162.
  • the range determination unit 161 supplies the base layer residual signal and the range information to the shifting unit 163.
  • step S193 the weight coefficient determination unit 162 determines whether the base layer residual signal is equal to or less than 8 bits, with reference to the range information from the range determination unit 161. When it is determined in step S193 that the signal is equal to or less than 8 bits, the processing proceeds to step S194.
  • step S194 the weight coefficient determination unit 162 determines whether to set a weight coefficient to 0.5 or 1.
  • the weight coefficient determination unit 162 supplies weight coefficient information which is the determined value of the weight coefficient w to the motion prediction and compensation unit 145. Meanwhile, the weight coefficient determination unit may supply both weight coefficients, 0.5 and 1, to the motion prediction and compensation unit 145; in that case, when the weight coefficient is non-zero, cost function values for both values may be obtained for each mode to determine which value is used (a sketch of this selection follows step S195 below).
  • On the other hand, when it is determined in step S193 that the signal is larger than 8 bits, the processing proceeds to step S195.
  • step S195 the weight coefficient determination unit 162 determines the value of the weight coefficient w to be 0.5, and supplies weight coefficient information which is the determined value of the weight coefficient w to the motion prediction and compensation unit 145.
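  • The weight selection of steps S193 to S195 can be pictured with the following sketch (sad() is a stand-in cost measure and the inputs are illustrative; when the signal falls within 8 bits both weights are candidates, otherwise only 0.5 remains):

```python
# Sketch of the weight selection of Fig. 18: both w = 0.5 and w = 1 are
# candidates when the base layer residual fits within 8 bits; only
# w = 0.5 remains when it does not.

def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def pick_nonzero_weight(current, mc_pred, residual, fits_8_bits):
    weights = (0.5, 1.0) if fits_8_bits else (0.5,)
    return min(weights,
               key=lambda w: sad(current,
                                 [p + w * r
                                  for p, r in zip(mc_pred, residual)]))

print(pick_nonzero_weight([12, 13], [8, 11], [4, 2], fits_8_bits=True))  # 1.0
```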
  • step S196 the shifting unit 163 shifts the base layer residual signal from the range determination unit 161 in accordance with the range information from the range determination unit 161, and supplies the shifted base layer residual signal to the residual buffer 164.
  • For example, when the base layer residual signal does not fall within 8 bits, the base layer residual signal is shifted and then accumulated in the residual buffer 164.
  • When the base layer residual signal falls within 8 bits, for example, both the intact base layer residual signal and the shifted base layer residual signal are accumulated in the residual buffer 164.
  • the residual buffer 164 accumulates the supplied base layer residual signal, reads out the shifted base layer residual signal and the base layer residual signal that is not shifted, and supplies the signals to the motion prediction and compensation unit 145.
  • step S197 the motion prediction and compensation unit 145 calculates a cost function value for each mode in a case where the weight coefficient is zero and non-zero, using the motion information searched for in step S191, the weight coefficient from the residual prediction unit 148, and the shifted base layer residual signal.
  • step S198 the motion prediction and compensation unit 145 performs mode determination on the basis of the cost function value for each mode which is calculated in step S197, to determine an optimal inter prediction mode and the value (non-zero or zero) of the weight coefficient w.
  • step S199 the motion prediction and compensation unit 145 performs motion compensation in the optimal inter prediction mode that is selected in step S198, to generate a prediction image.
  • the generated prediction image is supplied to the prediction image selection unit 146 together with information on the optimal inter prediction mode.
  • step S200 the motion prediction and compensation unit 145 supplies weight coefficient usage information on whether the determined value of the weight coefficient is zero or non-zero to the lossless encoding unit 136 in order to transmit the weight coefficient usage information to the decoding side, and encodes the information.
  • the motion prediction and compensation unit also supplies weight coefficient value information on whether the weight coefficient w is 0.5 or 1 to the lossless encoding unit 136.
  • When the base layer residual signal is larger than 8 bits, the value of the weight coefficient w is determined to be 0.5, and thus the base layer residual signal is shifted to fit within 8 bits and is accumulated in the residual buffer 164.
  • Thus, it is possible to suppress an increase in the storage capacity of the residual buffer 164 and to suppress an increase in storage capacity necessary for encoding and decoding.
  • In addition, residual prediction according to the characteristics of the signal of each PU can be performed, and thus it is possible to enhance the encoding efficiency of image compression information to be output.
  • Fig. 19 is a block diagram illustrating a main configuration example of an image decoding apparatus corresponding to the image encoding apparatus 100 of Fig. 10, which is an example of an image processing apparatus to which the present disclosure is applied.
  • An image decoding apparatus 200 illustrated in Fig. 19 decodes encoded data that is generated by the image encoding apparatus 100 by a decoding method corresponding to the encoding method thereof (that is, hierarchically decodes encoded data that is hierarchically encoded).
  • the image decoding apparatus 200 includes a demultiplexer 201, a base layer image decoding unit 202, and an enhancement layer image decoding unit 203.
  • the demultiplexer 201 receives a hierarchical image encoding stream in which a base layer image encoding stream and an enhancement layer image encoding stream, which are transmitted from the encoding side, are multiplexed, and demultiplexes the received hierarchical image encoding stream to extract the base layer image encoding stream and the enhancement layer image encoding stream.
  • the base layer image decoding unit 202 decodes the base layer image encoding stream extracted by the demultiplexer 201 to obtain a base layer image.
  • the enhancement layer image decoding unit 203 decodes the enhancement layer image encoding stream extracted by the demultiplexer 201 to obtain an enhancement layer image.
  • the base layer image decoding unit 202 supplies a residual signal in a base layer to the enhancement layer image decoding unit 203, for a block on which inter prediction is performed.
  • the enhancement layer image decoding unit 203 acquires the residual signal in the base layer from the base layer image decoding unit 202.
  • the enhancement layer image decoding unit 203 determines whether the value of a weight coefficient of the residual signal in the base layer, which is indicated by weight coefficient usage information received from the encoding side, is non-zero or zero, and in a case of non-zero, performs the subsequent processing.
  • For example, when the residual signal in the base layer does not fall within 8 bits, the enhancement layer image decoding unit 203 performs processing of determining the value of the weight coefficient of the residual signal to be 0.5.
  • the enhancement layer image decoding unit 203 performs prediction processing in the decoding of an enhancement layer, using the residual signal in the base layer having the determined value of the weight coefficient applied thereto.
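  • A minimal sketch of this decoder-side behavior follows (illustrative names; a simplified model of the weighted residual addition rather than the actual implementation):

```python
# Decoder-side use of the base layer residual: when weight coefficient
# usage information signals non-zero, the up-sampled base layer residual,
# weighted by w, is added to the motion compensated prediction.

def predict_enhancement(mc_pred, base_residual, w_is_nonzero, w=0.5):
    if not w_is_nonzero:              # weight coefficient is zero
        return list(mc_pred)
    return [p + w * r for p, r in zip(mc_pred, base_residual)]

print(predict_enhancement([8, 11], [4, 2], w_is_nonzero=True))   # [10.0, 12.0]
print(predict_enhancement([8, 11], [4, 2], w_is_nonzero=False))  # [8, 11]
```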
  • Fig. 20 is a block diagram illustrating a main configuration example of the base layer image decoding unit 202 of Fig. 19.
  • the base layer image decoding unit 202 includes an accumulation buffer 211, a lossless decoding unit 212, an inverse quantization unit 213, an inverse orthogonal transform unit 214, an arithmetic computation unit 215, a deblocking filter 216-1, an adaptive offset filter 216-2, a screen rearrangement buffer 217, and a D/A conversion unit 218.
  • the base layer image decoding unit 202 includes a frame memory 219, a selection unit 220, an intra prediction unit 221, a motion prediction and compensation unit 222, and a prediction image selection unit 223.
  • the accumulation buffer 211 is also a reception unit that receives transmitted encoded data.
  • the accumulation buffer 211 receives and accumulates the transmitted encoded data, and supplies the encoded data to the lossless decoding unit 212 at a predetermined timing.
  • Information necessary for decoding, such as prediction mode information, is added to the encoded data.
  • the lossless decoding unit 212 decodes information encoded by the lossless encoding unit 116 of Fig. 11, which is supplied from the accumulation buffer 211, using a decoding scheme corresponding to the encoding scheme thereof.
  • the lossless decoding unit 212 supplies quantized coefficient data of a difference image that is obtained by the decoding, to the inverse quantization unit 213.
  • the lossless decoding unit 212 determines which of an intra prediction mode and an inter prediction mode is selected as the optimal prediction mode, and supplies the information on the optimal prediction mode to whichever of the intra prediction unit 221 and the motion prediction and compensation unit 222 corresponds to the selected mode.
  • That is, when intra prediction is selected as the optimal prediction mode, the information on the optimal prediction mode is supplied to the intra prediction unit 221.
  • When inter prediction is selected as the optimal prediction mode, the information on the optimal prediction mode is supplied to the motion prediction and compensation unit 222.
  • the lossless decoding unit 212 supplies information necessary for inverse quantization, for example, a quantization matrix or a quantization parameter, to the inverse quantization unit 213.
  • the inverse quantization unit 213 performs inverse quantization on the quantized coefficient data which is obtained by the decoding of the lossless decoding unit 212, using a scheme corresponding to the quantization scheme of the quantization unit 115 of Fig. 11. Meanwhile, the inverse quantization unit 213 is a processing unit that is similar to the inverse quantization unit 118 of Fig. 11.
  • the inverse quantization unit 213 supplies the obtained coefficient data to the inverse orthogonal transform unit 214.
  • the inverse orthogonal transform unit 214 performs inverse orthogonal transform on the orthogonal transform coefficient supplied from the inverse quantization unit 213, using a scheme corresponding to the orthogonal transform scheme of the orthogonal transform unit 114 of Fig. 11 if necessary. Meanwhile, the inverse orthogonal transform unit 214 is a processing unit that is similar to the inverse orthogonal transform unit 119 of Fig. 11.
  • Image data of the difference image is restored by the inverse orthogonal transform processing.
  • the restored image data of the difference image corresponds to image data of the difference image before the orthogonal transform is performed thereon in the image encoding apparatus.
  • Hereinafter, the restored image data of the difference image, which is obtained by the inverse orthogonal transform processing of the inverse orthogonal transform unit 214, will also be referred to as decoded residual data.
  • the inverse orthogonal transform unit 214 supplies the decoded residual data to the arithmetic computation unit 215.
  • the arithmetic computation unit 215 is supplied with image data of a prediction image from the intra prediction unit 221 or the motion prediction and compensation unit 222 through the prediction image selection unit 223.
  • the arithmetic computation unit 215 obtains image data of a reconfiguration image in which the difference image and the prediction image are added together, using the decoded residual data and the image data of the prediction image.
  • the reconfiguration image corresponds to the input image before a prediction image is subtracted therefrom by the arithmetic computation unit 113 of Fig. 11.
  • the arithmetic computation unit 215 supplies the reconfiguration image to the deblocking filter 216-1.
  • the deblocking filter 216-1 performs deblocking filter processing on the supplied reconfiguration image to remove block distortion.
  • the deblocking filter 216-1 supplies an image on which the filter processing is performed, to the adaptive offset filter 216-2.
  • the adaptive offset filter 216-2 performs adaptive offset filter (sample adaptive offset (SAO)) processing of mainly removing ringing, on a result of the deblocking filter processing (a decoded image from which block distortion is removed) which is transmitted from the deblocking filter 216-1.
  • the adaptive offset filter 216-2 receives the type of adaptive offset filter processing and an offset for each largest coding unit (LCU) which is a largest unit of encoding, from the lossless decoding unit 212.
  • the adaptive offset filter 216-2 performs the received type of adaptive offset filter processing on the image after the adaptive deblocking filter processing is performed thereon, using the received offset.
  • the adaptive offset filter 216-2 supplies the image after the adaptive offset filter processing is performed thereon (hereafter, referred to as a decoded image) to the screen rearrangement buffer 217 and the frame memory 219.
  • the decoded image that is output from the arithmetic computation unit 215 can be supplied to the screen rearrangement buffer 217 or the frame memory 219 without going through the deblocking filter 216-1 or the adaptive offset filter 216-2.
  • a portion or all of the filter processing using the deblocking filter 216-1 can be omitted.
  • an adaptive loop filter may be provided in a latter part of the adaptive offset filter 216-2.
  • the adaptive offset filter 216-2 supplies the decoded image (or the reconfiguration image) which is a result of the filter processing to the screen rearrangement buffer 217 and the frame memory 219.
  • the screen rearrangement buffer 217 performs rearrangement on the decoded image in a frame order. That is, the screen rearrangement buffer 217 rearranges images of the frames, which are rearranged in an encoding order by the screen rearrangement buffer 112 of Fig. 11, in the original display order. In other words, the screen rearrangement buffer 217 stores image data of the decoded images of the frames, which are supplied in the encoding order, in the encoding order, reads out the image data of the decoded images of the frames, which are stored in the encoding order, in a display order, and supplies the image data to the D/A conversion unit 218.
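  • In miniature, the reordering looks like this (frame names and display indices are illustrative):

```python
# Frames arrive in encoding (decoding) order, each with a display index,
# and are emitted in display order -- the job of the screen rearrangement
# buffer 217 in miniature.
decoded_order = [("I0", 0), ("P3", 3), ("B1", 1), ("B2", 2)]
for name, _ in sorted(decoded_order, key=lambda frame: frame[1]):
    print(name)   # I0, B1, B2, P3
```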
  • the D/A conversion unit 218 performs D/A conversion on the decoded images (digital data) of the frames which are supplied from the screen rearrangement buffer 217, and outputs the converted decoded images as analog data to a display not shown in the drawing to display the images thereon.
  • the frame memory 219 stores the supplied decoded image, and supplies the stored decoded image as a reference image to the intra prediction unit 221 or the motion prediction and compensation unit 222 through the selection unit 220, at a predetermined timing or on the basis of a request from the outside such as the intra prediction unit 221 or the motion prediction and compensation unit 222.
  • the intra prediction unit 221 is appropriately supplied with intra prediction mode information or the like from the lossless decoding unit 212.
  • the intra prediction unit 221 performs intra prediction in an intra prediction mode (optimal intra prediction mode) which is used in the intra prediction unit 124 to generate a prediction image.
  • the intra prediction unit 221 performs intra prediction using image data of a reconfiguration image which is supplied from the frame memory 219 through the selection unit 220. That is, the intra prediction unit 221 uses the reconfiguration image as a reference image (peripheral pixel).
  • the intra prediction unit 221 supplies the generated prediction image to the prediction image selection unit 223.
  • the motion prediction and compensation unit 222 is appropriately supplied with optimal prediction mode information or motion information from the lossless decoding unit 212.
  • the motion prediction and compensation unit 222 performs inter prediction using the decoded image (reference image) acquired from the frame memory 219 in an inter prediction mode (optimal inter prediction mode) indicated by the optimal prediction mode information that is acquired from the lossless decoding unit 212, to generate a prediction image.
  • the prediction image selection unit 223 supplies the prediction image supplied from the intra prediction unit 221 or the prediction image supplied from the motion prediction and compensation unit 222 to the arithmetic computation unit 215.
  • the arithmetic computation unit 215 adds the prediction image and decoded residual data (difference image information) from the inverse orthogonal transform unit 214 to obtain a reconfiguration image.
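  • In miniature, this addition is the following (clipping to an 8-bit sample range is an assumption made for the example):

```python
# Reconfiguration image = decoded residual data + prediction image,
# clipped to the assumed 8-bit sample range.
def reconstruct(residual, prediction):
    return [max(0, min(255, r + p)) for r, p in zip(residual, prediction)]

print(reconstruct([-3, 5, 200], [100, 100, 100]))  # [97, 105, 255]
```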
  • the base layer image decoding unit 202 performs decoding with no reference to another layer.
  • the intra prediction unit 221 and the motion prediction and compensation unit 222 do not refer to information on encoding of another layer.
  • the base layer image decoding unit 202 performs the processing described above in <0. Outline>. That is, the motion prediction and compensation unit 222 supplies a residual signal of an interblock which is encoded by inter-frame prediction encoding in a base layer, to the enhancement layer image decoding unit 203.
  • Fig. 21 is a block diagram illustrating a main configuration example of the enhancement layer image decoding unit 203 of Fig. 19. As illustrated in Fig. 21, the enhancement layer image decoding unit 203 has a configuration that is basically similar to that of the base layer image decoding unit 202 of Fig. 20.
  • the enhancement layer image decoding unit 203 includes an accumulation buffer 231, a lossless decoding unit 232, an inverse quantization unit 233, an inverse orthogonal transform unit 234, an arithmetic computation unit 235, a deblocking filter 236-1, an adaptive offset filter 236-2, a screen rearrangement buffer 237, and a D/A conversion unit 238.
  • the enhancement layer image decoding unit 203 includes a frame memory 239, a selection unit 240, an intra prediction unit 241, a motion prediction and compensation unit 242, and a prediction image selection unit 243.
  • the accumulation buffer 231 to the prediction image selection unit 243 correspond to the accumulation buffer 211 to the prediction image selection unit 223 of Fig. 20, and perform similar processing to the corresponding processing units.
  • the units of the enhancement layer image decoding unit 203 perform processing concerning decoding of image information of an enhancement layer rather than a base layer. Therefore, as a description of processing of the accumulation buffer 231 to the prediction image selection unit 243, the description of the above-described accumulation buffer 211 to prediction image selection unit 223 of Fig. 20 can be applied.
  • In that case, however, the data to be processed has to be data of the enhancement layer rather than data of the base layer.
  • the enhancement layer image decoding unit 203 performs decoding with reference to information of another layer (for example, a base layer).
  • the enhancement layer image decoding unit 203 performs the processing described above in <0. Outline>.
  • the enhancement layer image decoding unit 203 includes a residual prediction unit 244 and an up-sampling unit 245.
  • Motion information, prediction mode information, or the like is supplied to the motion prediction and compensation unit 242 from the lossless decoding unit 232.
  • the motion prediction and compensation unit 242 determines whether the value of the weight coefficient is non-zero or zero, on the basis of the weight coefficient usage information of the residual signal from the lossless decoding unit 232. When the value of the weight coefficient is zero, prediction processing in the decoding of the enhancement layer is performed using motion information or prediction mode information from the lossless decoding unit 232.
  • On the other hand, when the value of the weight coefficient is non-zero, the motion prediction and compensation unit 242 causes the residual prediction unit 244 to perform the subsequent processing.
  • the residual prediction unit 244 determines a range of a residual signal indicating whether a base layer residual signal from the up-sampling unit 245 falls within or is larger than 8 bits.
  • the residual prediction unit 244 determines a weight coefficient w in accordance with range information which is the determination result thereof, and supplies the determined weight coefficient w to the motion prediction and compensation unit 242.
  • the residual prediction unit 244 shifts the base layer residual signal from the up-sampling unit 245 in accordance with the range information, and supplies the shifted base layer residual signal to the motion prediction and compensation unit 242.
  • the motion prediction and compensation unit 242 performs the prediction processing in the decoding of the enhancement layer, using the motion information or the prediction mode information from the lossless decoding unit 232 and the residual signal in the base layer to which the value of the weight coefficient which is determined by the residual prediction unit 244 is applied.
  • the up-sampling unit 245 acquires, from the base layer image decoding unit 202, a base layer residual signal of an interblock which is encoded by inter-frame prediction encoding in the base layer, and up-samples the acquired base layer residual signal.
  • the up-sampling unit 245 supplies the up-sampled base layer residual signal to the residual prediction unit 244.
  • the up-sampling unit 245 is also supplied with a base layer decoded image from the base layer image decoding unit 202.
  • the up-sampling unit 245 up-samples the supplied base layer decoded image to the resolution of the enhancement layer.
  • the up-sampling unit 245 supplies the up-sampled base layer decoded image information to the frame memory 239 of the enhancement layer image decoding unit 203.
  • Meanwhile, when the encoding using the image encoding apparatus 100 (that is, the decoding using the image decoding apparatus 200) is another type of scalable encoding that is not hierarchical encoding, up-sampling from the base layer is not necessary, and thus the up-sampling unit 245 may be omitted.
  • Fig. 22 is a block diagram illustrating a main configuration example of the residual prediction unit 244 of Fig. 21. Meanwhile, the example of Fig. 22 shows the flow of data between units in a case where the base layer residual signal is 9 bits.
  • the residual prediction unit 244 includes a range determination unit 261, a weight coefficient determination unit 262, a shifting unit 263, and a residual buffer 264.
  • the motion prediction and compensation unit 242 receives weight coefficient usage information on whether the value of a weight coefficient of the residual signal from the lossless decoding unit 232 is non-zero or zero. When the value of the weight coefficient is non-zero, the motion prediction and compensation unit 242 supplies a control signal for performing an operation, to the weight coefficient determination unit 262 and the residual buffer 264. On the other hand, when the value of the weight coefficient is zero, a control signal is not supplied to the weight coefficient determination unit 262 and the residual buffer 264. Alternatively, a control signal for prohibiting an operation may be supplied.
  • the range determination unit 261 determines a range of a residual signal indicating whether a base layer residual signal from the up-sampling unit 245 falls within or is larger than 8 bits.
  • the range determination unit 261 supplies range information which is the determination result thereof to the weight coefficient determination unit 262.
  • the range determination unit 261 supplies the base layer residual signal and the range information to the shifting unit 263.
  • When the weight coefficient determination unit 262 receives the control signal from the motion prediction and compensation unit 242, the weight coefficient determination unit determines whether the value of a weight coefficient w is 0.5 or 1 in accordance with the range information from the range determination unit 261. The weight coefficient determination unit 262 supplies weight coefficient information which is the determined value of the weight coefficient w to the motion prediction and compensation unit 242.
  • the shifting unit 263 shifts the base layer residual signal from the range determination unit 261 in accordance with the range information from the range determination unit 261, and supplies the shifted base layer residual signal to the residual buffer 264. For example, when the base layer residual information does not fall within 8 bits, the base layer residual signal is shifted to be accumulated in the residual buffer 264. When the base layer residual signal falls within 8 bits, the base layer residual signal is accumulated in the residual buffer 264 as it is.
  • Alternatively, both the shifted base layer residual signal and the intact base layer residual signal may be accumulated, and either residual signal is used by the motion prediction and compensation unit 242 in accordance with the weight coefficient w when it is non-zero.
  • the residual buffer 264 accumulates the supplied base layer residual signal.
  • When the residual buffer 264 receives a control signal from the motion prediction and compensation unit 242, the residual buffer reads out the accumulated base layer residual signal at a predetermined timing and supplies the signal to the motion prediction and compensation unit 242.
  • In a case where weight coefficient value information is also transmitted, the motion prediction and compensation unit 242 receives the weight coefficient value information, and supplies a control signal indicating which of the two values the weight coefficient takes, to the weight coefficient determination unit 262 and the residual buffer 264.
  • the weight coefficient determination unit 262 determines the value of the weight coefficient w in response to the control signal, and supplies the value to the motion prediction and compensation unit 242.
  • the residual buffer 264 reads out either the shifted base layer residual signal or the intact base layer residual signal, and supplies the read-out signal to the motion prediction and compensation unit 242.
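  • The read-out behavior can be modeled as follows (a simplified sketch in which the control signal is reduced to the selected weight value):

```python
# Simplified model of the residual buffer 264: both the shifted and the
# intact base layer residual signals are held, and the control signal
# (reduced here to the chosen weight) selects which copy is read out.

class ResidualBuffer:
    def __init__(self, residual):
        self.intact = list(residual)
        self.shifted = [s >> 1 for s in residual]   # the w = 0.5 copy

    def read(self, w):
        return self.shifted if w == 0.5 else self.intact

buffer = ResidualBuffer([300, -42])
print(buffer.read(0.5), buffer.read(1.0))   # [150, -21] [300, -42]
```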
  • the image decoding apparatus 200 (the enhancement layer image decoding unit 203) can suppress an increase in storage capacity necessary for encoding and decoding.
  • step S201 the demultiplexer 201 of the image decoding apparatus 200 demultiplexes a hierarchical image encoding stream transmitted from the encoding side for each layer.
  • step S202 the base layer image decoding unit 202 decodes a base layer image encoding stream that is extracted by the processing of step S201.
  • the base layer image decoding unit 202 outputs data of a base layer image that is generated by the decoding.
  • the base layer decoding processing will be described later in detail with reference to Fig. 24.
  • step S203 the enhancement layer image decoding unit 203 decodes an enhancement layer image encoding stream that is extracted by the processing of step S201.
  • the enhancement layer image decoding unit 203 outputs data of an enhancement layer image that is generated by the decoding.
  • the enhancement layer decoding processing will be described later in detail with reference to Fig. 25.
  • When the process of step S203 is terminated, the image decoding apparatus 200 terminates the image decoding processing. One picture is processed by the image decoding processing. Therefore, the image decoding apparatus 200 repeatedly performs the image decoding processing on each picture of hierarchized moving image data.
  • the accumulation buffer 211 accumulates a transmitted bit stream (encoded data) in step S221.
  • step S222 the lossless decoding unit 212 decodes the bit stream (encoded data) supplied from the accumulation buffer 211. That is, pieces of image data such as an I picture, a P picture, and a B picture, which are encoded by the lossless encoding unit 116 of Fig. 11, are decoded. At this time, various pieces of information other than the image data, such as header information, included in the bit stream are also decoded.
  • step S223 the inverse quantization unit 213 performs inverse quantization on a quantized coefficient, which is obtained by the processing of step S222.
  • step S224 the inverse orthogonal transform unit 214 performs inverse orthogonal transform on the coefficient that is inversely quantized in step S223.
  • step S225 the intra prediction unit 221 or the motion prediction and compensation unit 222 performs prediction processing to generate a prediction image.
  • the prediction processing is performed in the prediction mode, determined by the lossless decoding unit 212, which is applied at the time of encoding. More specifically, for example, when intra prediction is applied at the time of encoding, the intra prediction unit 221 generates a prediction image in an intra prediction mode that is optimized at the time of encoding. In addition, for example, when inter prediction is applied at the time of encoding, the motion prediction and compensation unit 222 generates a prediction image in an inter prediction mode that is optimized at the time of encoding.
  • step S226 the arithmetic computation unit 215 adds the prediction image generated in step S225 to the difference image that is obtained by the inverse orthogonal transform performed in step S224.
  • image data of a reconfiguration image is obtained.
  • step S227 the deblocking filter 216-1 performs deblocking filter processing on the image data of the reconfiguration image which is obtained by the processing of step S226. Thus, block distortion or the like is removed therefrom.
  • step S228 the adaptive offset filter 216-2 performs adaptive offset filter processing of mainly removing ringing, on a result of the deblocking filter processing of step S227.
  • step S229 the screen rearrangement buffer 217 rearranges frames of the reconfiguration image that has been subjected to the adaptive offset filter processing in step S228. That is, the order of the frames rearranged at the time of encoding is rearranged to the original display order.
  • step S230 the D/A conversion unit 218 performs D/A conversion on the image in which the order of frames is rearranged in step S229.
  • the image is output to a display, not shown in the drawing, to be displayed.
  • step S231 the frame memory 219 stores data such as the decoded image obtained by the processing of step S228 or the reconfiguration image which is obtained by the processing of step S227.
  • step S232 the motion prediction and compensation unit 222 supplies a residual signal in a base layer which is obtained in the above-described decoding processing of the base layer, to decoding processing of an enhancement layer.
  • When the process of step S232 is terminated, the base layer decoding processing is terminated, and the processing returns to Fig. 23.
  • step S251 the up-sampling unit 245 of the enhancement layer image decoding unit 203 acquires a base layer residual signal that is obtained in the decoding processing of the base layer, and up-samples the signal.
  • the up-sampling unit 245 supplies the up-sampled base layer residual signal to the residual prediction unit 244.
  • step S252 the accumulation buffer 231 accumulates a transmitted bit stream (encoded data).
  • step S253 the lossless decoding unit 232 decodes the bit stream (encoded data) supplied from the accumulation buffer 231. That is, the pieces of image data such as the I picture, the P picture, and the B picture which are encoded by the lossless encoding unit 136 are decoded. At this time, various pieces of information other than the image data, such as header information, which is included in the bit stream are also decoded. For example, information on a weight coefficient (weight coefficient usage information or weight coefficient value information) is also decoded.
  • step S254 the inverse quantization unit 233 performs inverse quantization on the quantized coefficient that is obtained by the processing of step S253.
  • step S255 the inverse orthogonal transform unit 234 performs inverse orthogonal transform on the coefficient that is inversely quantized in step S254.
  • step S256 the motion prediction and compensation unit 242 determines whether a prediction mode is inter prediction. When it is determined that the prediction mode is inter prediction, the processing proceeds to step S257.
  • step S257 the motion prediction and compensation unit 242 performs inter prediction processing to generate a prediction image of a current block.
  • the inter prediction process will be described later in detail with reference to Fig. 26.
  • When it is determined in step S256 that the prediction mode is not inter prediction, the processing proceeds to step S258.
  • step S258 the intra prediction unit 241 generates a prediction image in an optimal intra prediction mode which is an intra prediction mode adopted at the time of encoding.
  • Pieces of processing of step S259 to step S264 correspond to the pieces of processing of step S226 to step S231 of Fig. 24, respectively, and are performed in a similar manner to the pieces of processing.
  • Fig. 26 is an example of a case where the value of w is determined to be 1 when a residual signal falls within 8 bits, and is an example of processing corresponding to the example of Fig. 17.
  • step S281 the lossless decoding unit 232 receives motion information, prediction mode information, or the like from the encoding side.
  • step S282 the lossless decoding unit 232 receives weight coefficient usage information (information on whether the value of the weight coefficient is zero or non-zero) from the encoding side, for example, in units of CUs.
  • step S283 the motion prediction and compensation unit 242 determines whether the value of the weight coefficient is zero, with reference to the weight coefficient usage information from the lossless decoding unit 232.
  • When it is determined in step S283 that the value of the weight coefficient is non-zero, the motion prediction and compensation unit 242 outputs a control signal to the weight coefficient determination unit 262 and the residual buffer 264, and the processing proceeds to step S284.
  • step S284 the range determination unit 261 determines a range of a residual signal indicating whether the base layer residual signal from the up-sampling unit 245 falls within or is larger than 8 bits. Meanwhile, the determination processing is performed for each CU or PU.
  • the range determination unit 261 supplies range information which is the determination result thereof to the weight coefficient determination unit 262. In addition, the range determination unit 261 supplies the base layer residual information and the range information to the shifting unit 263.
  • step S285 the weight coefficient determination unit 262 determines whether the value of a weight coefficient w is 0.5 or 1 in accordance with the range information from the range determination unit 261, on the basis of the control signal from the motion prediction and compensation unit 242. That is, when the base layer residual information does not fall within 8 bits, the value of the weight coefficient w is determined to be 0.5, and when the base layer residual information falls within 8 bits, the value of the weight coefficient w is determined to be 1.
  • the weight coefficient determination unit 262 supplies weight coefficient information which is the determined value of the weight coefficient w, to the motion prediction and compensation unit 242.
  • step S286 the shifting unit 263 shifts the base layer residual signal from the range determination unit 261 in accordance with the range information from the range determination unit 261, and supplies the shifted base layer residual signal to the residual buffer 264.
  • For example, when the base layer residual signal does not fall within 8 bits, the base layer residual signal is shifted and then accumulated in the residual buffer 264.
  • When the base layer residual signal falls within 8 bits, the base layer residual signal is accumulated in the residual buffer 264 as it is.
  • the residual buffer 264 accumulates the supplied base layer residual signal, reads out the base layer residual signal at a predetermined timing on the basis of the control signal from the motion prediction and compensation unit 242, and supplies the base layer residual signal to the motion prediction and compensation unit 242.
  • step S287 the motion prediction and compensation unit 242 generates a prediction image using the motion information or the prediction mode information which is received in step S281, the weight coefficient from the residual prediction unit 244, and the shifted base layer residual signal.
  • When it is determined in step S283 that the value of the weight coefficient is zero, the motion prediction and compensation unit 242 does not output a control signal to the weight coefficient determination unit 262 and the residual buffer 264, and the processing skips step S284 to step S286 and proceeds to step S287.
  • In this case, in step S287 the motion prediction and compensation unit 242 generates a prediction image using the motion information or the prediction mode information which is received in step S281.
  • In this manner, when the base layer residual signal is larger than 8 bits, the value of the weight coefficient w is determined to be 0.5, and thus the base layer residual signal is shifted to fit within 8 bits and is accumulated in the residual buffer 264.
  • Fig. 27 is an example of a case where either 0.5 or 1 is used as the value of w when a residual signal falls within 8 bits, and is an example of processing corresponding to Fig. 18 described above.
  • step S291 the lossless decoding unit 232 receives motion information, prediction mode information, or the like from the encoding side.
  • step S292 the lossless decoding unit 232 receives weight coefficient usage information (information on whether the value of the weight coefficient is zero or non-zero) in units of CUs from the encoding side.
  • step S293 the motion prediction and compensation unit 242 determines whether the value of the weight coefficient is zero, with reference to the weight coefficient usage information from the lossless decoding unit 232.
  • When it is determined in step S293 that the value of the weight coefficient is non-zero, the motion prediction and compensation unit 242 outputs a control signal to the weight coefficient determination unit 262 and the residual buffer 264, and the processing proceeds to step S294.
  • step S294 the range determination unit 261 determines a range of a residual signal indicating whether the base layer residual signal from the up-sampling unit 245 falls within or is larger than 8 bits. Meanwhile, the determination processing is performed for each CU or PU.
  • the range determination unit 261 supplies range information which is the determination result thereof to the weight coefficient determination unit 262. In addition, the range determination unit 261 supplies the base layer residual signal and the range information to the shifting unit 263.
  • step S295 the weight coefficient determination unit 262 determines whether the base layer residual signal is equal to or less than 8 bits with reference to the range information from the range determination unit 261, on the basis of the control signal from the motion prediction and compensation unit 242. When it is determined in step S295 that the base layer residual signal is equal to or less than 8 bits, the processing proceeds to step S296.
  • step S296 the motion prediction and compensation unit 242 receives, from the lossless decoding unit 232, weight coefficient value information indicating whether the value of the weight coefficient is set to 0.5 or 1, and supplies a control signal indicating the value to the weight coefficient determination unit 262 and the residual buffer 264.
  • the weight coefficient determination unit 262 determines the value of a weight coefficient w in response to the control signal, and supplies the value to the motion prediction and compensation unit 242.
  • On the other hand, when it is determined in step S295 that the base layer residual signal is larger than 8 bits, the weight coefficient determination unit 262 skips step S296, determines the value of the weight coefficient w to be 0.5, and supplies weight coefficient information which is the determined value of the weight coefficient w to the motion prediction and compensation unit 242.
  • step S297 the shifting unit 263 shifts the base layer residual signal from the range determination unit 261 in accordance with the range information from the range determination unit 261, and supplies the shifted base layer residual signal to the residual buffer 264.
  • For example, when the base layer residual signal does not fall within 8 bits, the base layer residual signal is shifted and then accumulated in the residual buffer 264.
  • When the base layer residual signal falls within 8 bits, for example, both the intact base layer residual signal and the shifted base layer residual signal are accumulated in the residual buffer 264.
  • the residual buffer 264 accumulates the supplied base layer residual signal, reads out the shifted base layer residual signal or the base layer residual signal that is not shifted, in response to the control signal supplied from the motion prediction and compensation unit 242, and supplies the signal to the motion prediction and compensation unit 242.
  • step S298 the motion prediction and compensation unit 242 generates a prediction image using the motion information or the prediction mode information which is received in step S291, the weight coefficient, and the shifted base layer residual signal.
  • When it is determined in step S293 that the value of the weight coefficient is zero, the motion prediction and compensation unit 242 does not output a control signal to the weight coefficient determination unit 262 and the residual buffer 264, and the processing skips step S294 to step S297 and proceeds to step S298.
  • In this case, in step S298 the motion prediction and compensation unit 242 generates a prediction image using the motion information or the prediction mode information which is received in step S291.
  • In this manner, when the base layer residual signal is larger than 8 bits, the value of the weight coefficient w is determined to be 0.5, and the base layer residual signal is shifted to fit within 8 bits and is accumulated in the residual buffer 264.
  • Pieces of processing are performed as described above, and thus the image decoding apparatus 200 can suppress an increase in the storage capacity of a storage unit that is used to store a residual signal in a base layer and can suppress an increase in storage capacity necessary for decoding. In addition, it is possible to enhance encoding efficiency of image compression information to be output.
  • In the above description, image data is hierarchized into a plurality of layers by scalable encoding, but the number of layers is arbitrary.
  • an enhancement layer is processed using a base layer residual signal in encoding and decoding.
  • the present disclosure is not limited thereto, and the enhancement layer may be processed using a residual signal of another processed enhancement layer.
  • For example, the motion prediction and compensation unit 145 (Fig. 12) of the enhancement layer image encoding unit 102 may supply a residual signal of an inter block of the enhancement layer to the enhancement layer image encoding unit 102 of another enhancement layer.
  • Similarly, the motion prediction and compensation unit 242 (Fig. 21) of the enhancement layer image decoding unit 203 may supply a residual signal of an inter block of the enhancement layer to the enhancement layer image decoding unit 203 of another enhancement layer.
  • The present disclosure can be applied to all image encoding apparatuses and image decoding apparatuses based on scalable encoding and decoding schemes.
  • For example, the present disclosure can be applied to an image encoding apparatus and an image decoding apparatus which are used to receive image information (a bit stream), for example, of MPEG or H.26x, which is compressed by an orthogonal transform such as the discrete cosine transform and by motion compensation, via a network medium such as satellite broadcasting, cable television, the Internet, or a mobile phone.
  • the present disclosure can be applied to an image encoding apparatus and an image decoding apparatus which are used to perform processing on a storage medium such as an optical disc, a magnetic disk, and a flash memory.
  • Fig. 28 illustrates an example of a multi-view image encoding scheme.
  • a multi-view image includes an image having a plurality of views.
  • the plurality of views of the multi-view image are constituted by a base view for performing encoding and decoding using only an image of its own view without using information of another view, and a non-base view for performing encoding and decoding using information of another view.
  • the encoding and decoding of the non-base view may use information of the base view, or may use information of another non-base view.
  • a reference relationship between views in multi-view image encoding and decoding is similar to a reference relationship between layers in hierarchical image encoding and decoding. Therefore, in the encoding and decoding of a multi-view image as illustrated in Fig. 28, the above-described method may be applied.
  • Also in this case, a weight coefficient w of the residual signal may be set to 0.5 and stored. In this manner, similarly to the case of a hierarchical image, it is possible to suppress an increase in the storage capacity necessary for encoding or decoding.
  • In addition, only weight coefficient usage information indicating whether the weight coefficient w is non-zero or zero has to be transmitted, and thus it is possible to enhance the encoding efficiency of the image compression information to be output.
  • Fig. 29 is a diagram illustrating a multi-view image encoding apparatus that performs the above-described multi-view image encoding.
  • A multi-view image encoding apparatus 600 includes an encoding unit 601, an encoding unit 602, and a multiplexer 603, all of which employ circuitry to perform the described operations, the circuitry being in a programmable form, a fixed form, or a hybrid form.
  • the encoding unit 601 encodes a base view image to generate a base view image encoding stream.
  • the encoding unit 602 encodes a non-base view image to generate a non-base view image encoding stream.
  • the multiplexer 603 multiplexes the base view image encoding stream generated by the encoding unit 601 and the non-base view image encoding stream generated by the encoding unit 602 to generate a multi-view image encoding stream.
  • the base layer image encoding unit 101 (Fig. 11) may be applied as the encoding unit 601 of the multi-view image encoding apparatus 600, and the enhancement layer image encoding unit 102 (Fig. 12) may be applied as the encoding unit 602.
  • In this case, a weight coefficient w of the residual signal may be set to 0.5 and stored. In this manner, it is possible to suppress an increase in the storage capacity necessary for encoding.
  • In addition, only weight coefficient usage information indicating whether the weight coefficient w is non-zero or zero has to be transmitted, and thus it is possible to enhance the encoding efficiency of the image compression information to be output.
  • Fig. 30 is a diagram illustrating a multi-view image decoding apparatus that performs the above-described multi-view image decoding.
  • A multi-view image decoding apparatus 610 includes a demultiplexer 611, a decoding unit 612, and a decoding unit 613, all of which employ circuitry to perform the described operations, the circuitry being in a programmable form, a fixed form, or a hybrid form.
  • the demultiplexer 611 demultiplexes the multi-view image encoding stream in which the base view image encoding stream and the non-base view image encoding stream are multiplexed, to extract the base view image encoding stream and the non-base view image encoding stream.
  • the decoding unit 612 decodes the base view image encoding stream extracted by the demultiplexer 611 to obtain a base view image.
  • the decoding unit 613 decodes the non-base view image encoding stream extracted by the demultiplexer 611 to obtain a non-base view image.
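  • As a rough, non-normative illustration of the multiplexing by the multiplexer 603 and the corresponding demultiplexing by the demultiplexer 611, the following hedged sketch interleaves two view streams using a length-prefixed packet layout; that layout is an assumption made purely for illustration and is not the container format used by the apparatuses.

```python
import struct

def multiplex(base_stream: bytes, nonbase_stream: bytes) -> bytes:
    """Interleave two elementary streams, tagging each with a view id and a
    length prefix (illustrative layout; cf. the multiplexer 603)."""
    out = bytearray()
    for view_id, payload in ((0, base_stream), (1, nonbase_stream)):
        out += struct.pack(">BI", view_id, len(payload)) + payload
    return bytes(out)

def demultiplex(muxed: bytes) -> dict:
    """Inverse of multiplex(): recover each view's stream (cf. the demultiplexer 611)."""
    streams, pos = {}, 0
    while pos < len(muxed):
        view_id, size = struct.unpack_from(">BI", muxed, pos)
        pos += 5
        streams[view_id] = muxed[pos:pos + size]
        pos += size
    return streams

assert demultiplex(multiplex(b"base-view", b"non-base-view")) == {0: b"base-view", 1: b"non-base-view"}
```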
  • the base layer image decoding unit 202 (Fig. 20) may be applied as the decoding unit 612 of the multi-view image decoding apparatus 610, and the enhancement layer image decoding unit 203 (Fig. 21) may be applied as the decoding unit 613.
  • In this case, a weight coefficient w of the residual signal may be set to 0.5 and stored. In this manner, it is possible to suppress an increase in the storage capacity necessary for decoding.
  • Fig. 31 is a block diagram illustrating a configuration example of hardware of a computer that executes the above-described series of pieces of processing by a program.
  • a central processing unit (CPU) 801, a read only memory (ROM) 802, and a random access memory (RAM) 803 are connected to each other through a bus 804.
  • an input and output interface 805 is also connected to the bus 804.
  • An input unit 806, an output unit 807, a storage unit 808, a communication unit 809, and a drive 810 are connected to the input and output interface 805.
  • the input unit 806 is constituted by, for example, a keyboard, a mouse, a microphone, a touch panel or an input terminal.
  • the output unit 807 is constituted by, for example, a display, a speaker or an output terminal.
  • the storage unit 808 is constituted by, for example, a hard disk, a RAM disk or a non-volatile memory.
  • the communication unit 809 is constituted by, for example, a network interface.
  • the drive 810 drives a removable medium 811 such as a magnetic disk, an optical disc, a magneto-optical disc, or semiconductor memory.
  • the CPU 801 loads a program stored in the storage unit 808 to the RAM 803 through the input and output interface 805 and the bus 804 to execute the program, and thus the above-described series of pieces of processing are performed.
  • data that is necessary for the CPU 801 to execute various pieces of processing is stored in the RAM 803 as appropriate.
  • The program executed by the computer (CPU 801) can be provided by being recorded in, for example, the removable medium 811 as a package medium.
  • the program can be installed in the storage unit 808 through the input and output interface 805 by mounting the removable medium 811 to the drive 810.
  • the program can also be provided through a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
  • the program can be received in the communication unit 809 through a wired or wireless transmission medium, and can be installed in the storage unit 808.
  • the program can be installed in advance in the ROM 802 or the storage unit 808.
  • the program executed by the computer may be a program that is processed in time series in accordance with the sequence described in this specification, or may be a program that is processed in parallel or at a necessary timing such as at the time of being called.
  • In this specification, the steps describing a program recorded in a recording medium include not only pieces of processing that are performed in time series in accordance with the described sequence, but also pieces of processing that are executed in parallel or individually without necessarily being processed in time series.
  • a "system” means a set of a plurality of components (devices, modules (parts), or the like), and it does not matter whether all components are included in one and the same housing. Therefore, a plurality of devices that are accommodated in individual housings and are connected to each other through a network, and one device including a plurality of modules accommodated in one housing thereof are both systems.
  • The configuration described above as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units).
  • On the contrary, the configuration described above as a plurality of devices (or processing units) may be collectively configured as one device (or processing unit).
  • a configuration other than those described above may be, of course, added to the configuration of each device (or each processing unit).
  • a portion of the configuration of a certain device (or processing unit) may be included in the configuration of another device (or another processing unit).
  • the present disclosure can adopt a configuration of cloud computing in which one function is shared and is jointly processed by a plurality of devices through a network.
  • steps described in the above-described flow charts can be executed by one device, or can be shared and executed by a plurality of devices.
  • Furthermore, when one step includes a plurality of pieces of processing, the plurality of pieces of processing included in the one step can be executed by one device or can be shared and executed by a plurality of devices.
  • The image encoding apparatus and the image decoding apparatus can be applied to various electronic equipment, for example, a transmitter or a receiver used in satellite broadcasting, cable broadcasting such as a cable TV, distribution on the Internet, and distribution to a terminal based on cellular communication; a recording device that records images in a medium such as an optical disc, a magnetic disk, or a flash memory; or a reproducing device that reproduces images from these recording media.
  • a television device 900 includes an antenna 901, a tuner 902, a demultiplexer 903, a decoder 904, a video signal processing unit 905, a display unit 906, an audio signal processing unit 907, a speaker 908, an external interface (I/F) unit 909, a control unit 910, a user interface (I/F) unit 911, and a bus 912.
  • the tuner 902 extracts a signal of a desired channel from a broadcast signal that is received through the antenna 901, and demodulates the extracted signal.
  • the tuner 902 outputs an encoding bit stream that is obtained by the demodulation, to the demultiplexer 903. That is, the tuner 902 takes on a role as a transmission unit in the television device 900, which receives an encoding stream in which an image is encoded.
  • the demultiplexer 903 separates a video stream and an audio stream of a program to be watched from the encoding bit stream, and outputs the separated streams to the decoder 904. In addition, the demultiplexer 903 extracts auxiliary data such as an electronic program guide (EPG) from the encoding bit stream, and supplies the extracted data to the control unit 910. Meanwhile, when the encoding bit stream is scrambled, the demultiplexer 903 may perform descrambling thereon.
  • the decoder 904 decodes the video stream and the audio stream which are input from the demultiplexer 903.
  • the decoder 904 outputs video data generated by decoding processing to the video signal processing unit 905.
  • the decoder 904 outputs audio data generated by decoding processing to the audio signal processing unit 907.
  • the video signal processing unit 905 reproduces the video data that is input from the decoder 904, and causes the display unit 906 to display a video.
  • the video signal processing unit 905 may cause the display unit 906 to display an application screen that is supplied through a network.
  • the video signal processing unit 905 may perform additional processing, for example, noise rejection, on the video data, in accordance with setting.
  • the video signal processing unit 905 may generate an image of a graphical user interface (GUI) such as a menu, a button, or a cursor, and may superimpose the generated image on an output image.
  • the display unit 906 is driven in response to a driving signal supplied from the video signal processing unit 905, and displays a video or an image on a video screen of a display device (for example, a liquid crystal display, a plasma display, or an organic electroluminescence display (OELD)).
  • the audio signal processing unit 907 performs reproduction processing, such as D/A conversion or amplification, on the audio data that is input from the decoder 904, and causes the speaker 908 to output audio. In addition, the audio signal processing unit 907 may perform additional processing, such as noise rejection, on the audio data.
  • the external interface unit 909 is an interface for connecting the television device 900 to an external device or a network.
  • a video stream or an audio stream which is received through the external interface unit 909 may be decoded by the decoder 904. That is, the external interface unit 909 also takes on a role as a transmission unit in the television device 900 which receives an encoding stream in which an image is encoded.
  • the control unit 910 includes a processor such as a CPU, and a memory such as a RAM or a ROM.
  • The memory stores a program executed by the CPU, program data, EPG data, data acquired through a network, and the like.
  • the program stored in the memory is read by the CPU, for example, at the time of startup of the television device 900, and is executed.
  • the CPU executes the program to control an operation of the television device 900 in response to, for example, an operation signal that is input from the user interface unit 911.
  • the user interface unit 911 is connected to the control unit 910.
  • the user interface unit 911 includes, for example, a button and a switch for operating the television device 900 by a user, and a reception unit of a remote control signal.
  • the user interface unit 911 detects a user's operation through these components to generate an operation signal, and outputs the generated operation signal to the control unit 910.
  • the bus 912 connects the tuner 902, the demultiplexer 903, the decoder 904, the video signal processing unit 905, the audio signal processing unit 907, the external interface unit 909, and the control unit 910 to each other.
  • the decoder 904 has a function of the image decoding apparatus 200 (Fig. 19) according to the above-described embodiment. Thus, it is possible to suppress an increase in storage capacity necessary for decoding of an image in the television device 900.
  • Fig. 33 illustrates an example of a schematic configuration of a mobile phone to which the above-described embodiment is applied.
  • a mobile phone 920 includes an antenna 921, a communication unit 922, an audio codec 923, a speaker 924, a microphone 925, a camera unit 926, an image processing unit 927, a multiple separation unit 928, a recording and reproducing unit 929, a display unit 930, a control unit 931, an operation unit 932, and a bus 933.
  • the antenna 921 is connected to the communication unit 922.
  • the speaker 924 and the microphone 925 are connected to the audio codec 923.
  • the operation unit 932 is connected to the control unit 931.
  • the bus 933 connects the communication unit 922, the audio codec 923, the camera unit 926, the image processing unit 927, the multiple separation unit 928, the recording and reproducing unit 929, the display unit 930, and the control unit 931 to each other.
  • the mobile phone 920 performs operations such as transmission and reception of an audio signal, transmission and reception of an e-mail or image data, image capturing, or data recording in various operation modes including a voice call mode, a data communication mode, an image capturing mode, and a videophone mode.
  • In the voice call mode, an analog audio signal generated by the microphone 925 is supplied to the audio codec 923.
  • The audio codec 923 performs A/D conversion on the analog audio signal to convert it into audio data, and compresses the converted audio data.
  • the audio codec 923 outputs the compressed audio data to the communication unit 922.
  • the communication unit 922 encodes and modulates the audio data to generate a transmission signal.
  • the communication unit 922 transmits the generated transmission signal to a base station (not shown) through the antenna 921.
  • the communication unit 922 amplifies a radio signal that is received through the antenna 921, and frequency-converts the radio signal to acquire a reception signal.
  • the communication unit 922 demodulates and decodes the reception signal to generate audio data, and outputs the generated audio data to the audio codec 923.
  • the audio codec 923 expands the audio data, and performs D/A conversion on the audio data to generate an analog audio signal.
  • the audio codec 923 supplies the generated audio signal to the speaker 924 to output audio.
  • In the data communication mode, for example, the control unit 931 generates character data constituting an e-mail in accordance with a user's operation through the operation unit 932. In addition, the control unit 931 causes the display unit 930 to display the characters. In addition, the control unit 931 generates e-mail data in accordance with a transmission instruction received from a user through the operation unit 932, and outputs the generated e-mail data to the communication unit 922. The communication unit 922 encodes and modulates the e-mail data to generate a transmission signal. The communication unit 922 transmits the generated transmission signal to a base station (not shown) through the antenna 921.
  • the communication unit 922 amplifies a radio signal that is received through the antenna 921, and frequency-converts the radio signal to acquire a reception signal.
  • the communication unit 922 demodulates and decodes the reception signal to restore the e-mail data, and outputs the restored e-mail data to the control unit 931.
  • the control unit 931 causes the display unit 930 to display contents of the e-mail, and stores the e-mail data in a storage medium of the recording and reproducing unit 929.
  • the recording and reproducing unit 929 includes any storage medium which is capable of reading and writing.
  • the storage medium may be a built-in storage medium such as a RAM or a flash memory, or may be an externally-installed storage medium such as a hard disk, a magnetic disk, a magneto-optical disc, an optical disc, a universal serial bus (USB) memory, or a memory card.
  • In the image capturing mode, for example, the camera unit 926 captures an image of a subject to generate image data, and outputs the generated image data to the image processing unit 927.
  • the image processing unit 927 encodes the image data that is input from the camera unit 926, and stores an encoding stream in a storage medium of the recording and reproducing unit 929.
  • In the videophone mode, for example, the multiple separation unit 928 multiplexes a video stream that is encoded by the image processing unit 927 and an audio stream that is input from the audio codec 923, and outputs the multiplexed streams to the communication unit 922.
  • the communication unit 922 encodes and modulates the streams to generate a transmission signal.
  • the communication unit 922 transmits the generated transmission signal to a base station (not shown) through the antenna 921.
  • the communication unit 922 amplifies a radio signal that is received through the antenna 921 and frequency-converts the radio signal to acquire a reception signal.
  • the transmission signal and the reception signal can include an encoding bit stream.
  • the communication unit 922 demodulates and decodes the reception signal to restore the stream, and outputs the restored stream to the multiple separation unit 928.
  • the multiple separation unit 928 separates the video stream and the audio stream from the stream that is input, and outputs the video stream and the audio stream to the image processing unit 927 and the audio codec 923, respectively.
  • the image processing unit 927 decodes the video stream to generate video data.
  • the video data is supplied to the display unit 930, and a series of images are displayed on the display unit 930.
  • the audio codec 923 expands the audio stream, and performs D/A conversion on the audio stream to generate an analog audio signal.
  • the audio codec 923 supplies the generated audio signal to the speaker 924 to output audio.
  • the image processing unit 927 has a function of the image encoding apparatus 100 (Fig. 10) or the image decoding apparatus 200 (Fig. 19) according to the above-described embodiments.
  • Thus, it is possible to suppress an increase in the storage capacity necessary for encoding and decoding of an image in the mobile phone 920.
  • Fig. 34 illustrates an example of a schematic configuration of a recording and reproducing device to which the above-described embodiment is applied.
  • a recording and reproducing device 940 encodes audio data and video data of a received broadcast program, and records the pieces of data in a recording medium.
  • the recording and reproducing device 940 may encode audio data and video data which are acquired from another device, and may record the pieces of data in the recording medium.
  • the recording and reproducing device 940 reproduces data recorded in the recording medium on a monitor and a speaker in accordance with a user's instruction. At this time, the recording and reproducing device 940 decodes the audio data and the video data.
  • The recording and reproducing device 940 includes a tuner 941, an external interface (I/F) unit 942, an encoder 943, a hard disk drive (HDD) 944, a disc drive 945, a selector 946, a decoder 947, an on-screen display (OSD) 948, a control unit 949, and a user interface (I/F) unit 950.
  • the tuner 941 extracts a signal of a desired channel from a broadcast signal that is received through an antenna (not shown), and demodulates the extracted signal.
  • the tuner 941 outputs an encoding bit stream that is obtained by the demodulation to the selector 946. That is, the tuner 941 takes on a role as a transmission unit in the recording and reproducing device 940.
  • the external interface unit 942 is an interface for connecting the recording and reproducing device 940 to an external device or a network.
  • the external interface unit 942 may be, for example, an IEEE1394 interface, a network interface, a USB interface, or a flash memory interface.
  • the video data and the audio data which are received through the external interface unit 942 are input to the encoder 943. That is, the external interface unit 942 takes on a role as a transmission unit in the recording and reproducing device 940.
  • the encoder 943 encodes the video data and the audio data.
  • the encoder 943 outputs the encoding bit stream to the selector 946.
  • the HDD 944 records, in an internal hard disk, an encoding bit stream in which pieces of content data such as a video and audio are compressed, various programs, and other pieces of data. In addition, the HDD 944 reads out these pieces of data from the hard disk at the time of reproduction of a video and audio.
  • the disc drive 945 records and reads out data in and from a recording medium installed therein.
  • The recording medium installed in the disc drive 945 may be, for example, a DVD disc (DVD-Video, DVD-RAM, DVD-R, DVD-RW, DVD+R, DVD+RW, or the like) or a Blu-ray (registered trademark) disc.
  • the selector 946 selects an encoding bit stream that is input from the tuner 941 or the encoder 943, and outputs the selected encoding bit stream to the HDD 944 or the disc drive 945. In addition, at the time of reproducing a video and audio, the selector 946 outputs an encoding bit stream that is input from the HDD 944 or the disc drive 945 to the decoder 947.
  • the decoder 947 decodes the encoding bit stream to generate video data and audio data.
  • the decoder 947 outputs the generated video data to the OSD 948.
  • the decoder 947 outputs the generated audio data to an external speaker.
  • The OSD 948 reproduces the video data that is input from the decoder 947 to display a video.
  • the OSD 948 may superimpose an image of a GUI, such as a menu, a button, or a cursor, on an image to be displayed.
  • the control unit 949 includes a processor such as a CPU, and a memory such as a RAM and a ROM.
  • the memory stores a program, program data, and the like which are executed by the CPU.
  • the program stored in the memory is read by the CPU, for example, at the time of starting up of the recording and reproducing device 940, and is executed.
  • the CPU executes the program to control an operation of the recording and reproducing device 940 in response to, for example, an operation signal that is input from the user interface unit 950.
  • the user interface unit 950 is connected to the control unit 949.
  • the user interface unit 950 includes, for example, a button and a switch for operating the recording and reproducing device 940 by a user, and a reception unit of a remote control signal.
  • the user interface unit 950 detects a user's operation through these components to generate an operation signal, and outputs the generated operation signal to the control unit 949.
  • the encoder 943 has a function of the image encoding apparatus 100 (Fig. 10) according to the above-described embodiment.
  • the decoder 947 has a function of the image decoding apparatus 200 (Fig. 19) according to the above-described embodiment.
  • Fig. 35 illustrates an example of a schematic configuration of an imaging device to which the above-described embodiment is applied.
  • The imaging device 960 captures an image of a subject to generate image data, encodes the image data, and records the encoded data in a recording medium.
  • the imaging device 960 includes an optical block 961, an imaging unit 962, a signal processing unit 963, an image processing unit 964, a display unit 965, an external interface (I/F) unit 966, a memory 967, a medium drive 968, an OSD 969, a control unit 970, a user interface (I/F) unit 971, and a bus 972.
  • the optical block 961 is connected to the imaging unit 962.
  • the imaging unit 962 is connected to the signal processing unit 963.
  • the display unit 965 is connected to the image processing unit 964.
  • the user interface unit 971 is connected to the control unit 970.
  • the bus 972 connects the image processing unit 964, the external interface unit 966, the memory 967, the medium drive 968, the OSD 969, and the control unit 970 to each other.
  • the optical block 961 includes a focus lens, a diaphragm mechanism, and the like.
  • the optical block 961 forms an optical image of a subject on an imaging surface of the imaging unit 962.
  • the imaging unit 962 includes an image sensor such as a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS), and converts the optical image formed on the imaging surface to an image signal as an electrical signal by photoelectric conversion.
  • the imaging unit 962 outputs the image signal to the signal processing unit 963.
  • the signal processing unit 963 performs various pieces of camera signal processing, such as knee correction, gamma correction, or color correction, on the image signal that is input from the imaging unit 962.
  • the signal processing unit 963 outputs the image data after the camera signal processing is performed thereon, to the image processing unit 964.
  • the image processing unit 964 encodes the image data that is input from the signal processing unit 963 to generate encoded data.
  • the image processing unit 964 outputs the generated encoded data to the external interface unit 966 or the medium drive 968.
  • the image processing unit 964 decodes the encoded data that is input from the external interface unit 966 or the medium drive 968 to generate image data.
  • the image processing unit 964 outputs the generated image data to the display unit 965.
  • the image processing unit 964 may output the image data that is input from the signal processing unit 963 to the display unit 965, to display an image.
  • the image processing unit 964 may superimpose data for a display which is acquired from the OSD 969 on an image to be output to the display unit 965.
  • the OSD 969 generates an image of a GUI, such as a menu, a button, or a cursor, and outputs the generated image to the image processing unit 964.
  • the external interface unit 966 is configured as, for example, a USB input and output terminal.
  • the external interface unit 966 connects the imaging device 960 and a printer, for example, at the time of printing an image.
  • a drive is connected to the external interface unit 966 when necessary.
  • a removable medium such as a magnetic disk or an optical disc is installed in the drive, and thus a program that is read out from the removable medium can be installed in the imaging device 960.
  • the external interface unit 966 may be configured as a network interface that is connected to a network such as a LAN or the Internet. That is, the external interface unit 966 takes on a role as a transmission unit in the imaging device 960.
  • a recording medium installed in the medium drive 968 may be any removable medium, such as a magnetic disk, a magneto-optical disc, an optical disc, or a semiconductor memory, which is capable of reading and writing.
  • The recording medium may instead be fixedly installed in the medium drive 968, and may be configured as, for example, a built-in hard disk drive or a non-transportable storage unit such as a solid state drive (SSD).
  • the control unit 970 includes a processor such as a CPU, and a memory such as a RAM and a ROM.
  • the memory stores a program executed by a CPU, program data, and the like.
  • the program stored in the memory is read by the CPU, for example, at the time of starting up of the imaging device 960, and is executed.
  • the CPU executes the program to control an operation of the imaging device 960 in response to, for example, an operation signal that is input from the user interface unit 971.
  • the user interface unit 971 is connected to the control unit 970.
  • the user interface unit 971 includes, for example, a button and a switch for operating the imaging device 960 by a user.
  • the user interface unit 971 detects a user's operation through these components to generate an operation signal, and outputs the generated operation signal to the control unit 970.
  • the image processing unit 964 has a function of the image encoding apparatus 100 (Fig. 10) or the image decoding apparatus 200 (Fig. 19) according to the above-described embodiments.
  • Thus, it is possible to suppress an increase in the storage capacity necessary for encoding and decoding of an image in the imaging device 960.
  • In a data transmission system 1000, a distribution server 1002 reads out scalable encoded data that is stored in a scalable encoded data storage unit 1001, and distributes the data to terminal devices, such as a personal computer 1004, an AV device 1005, a tablet device 1006, and a mobile phone 1007, through a network 1003.
  • At this time, the distribution server 1002 selects encoded data having an appropriate quality in accordance with the capability of the terminal device or the communication environment, and transmits the selected data. Even if the distribution server 1002 transmits unnecessarily high-quality data, the terminal device does not necessarily obtain a high-quality image, and there is a concern that a delay or an overflow may occur. In addition, there is a concern that a communication band may be unnecessarily occupied or that a load on the terminal device may be unnecessarily increased. Conversely, even if the distribution server 1002 transmits unnecessarily low-quality data, there is a concern that the terminal device may not obtain an image with a sufficient image quality. For this reason, the distribution server 1002 reads out the scalable encoded data stored in the scalable encoded data storage unit 1001 as encoded data with a quality appropriate to the capability of the terminal device and the communication environment, and transmits the read-out data.
  • the scalable encoded data storage unit 1001 stores scalable encoded data (BL+EL) 1011 that is scalably encoded.
  • the scalable encoded data (BL+EL) 1011 is encoded data including both a base layer and an enhancement layer, and is data capable of obtaining both an image of the base layer and an image of the enhancement layer through decoding.
  • The distribution server 1002 selects an appropriate layer in accordance with the capability of the terminal device to which data is transmitted or the communication environment, and reads out data of that layer. For example, the distribution server 1002 reads out the high-quality scalable encoded data (BL+EL) 1011 from the scalable encoded data storage unit 1001, and transmits it as it is to the personal computer 1004 or the tablet device 1006, which have high processing capabilities.
  • In contrast, the distribution server 1002 extracts data of the base layer from the scalable encoded data (BL+EL) 1011, and transmits it, as scalable encoded data (BL) 1012 having the same contents as the scalable encoded data (BL+EL) 1011 but a lower quality, to the AV device 1005 or the mobile phone 1007, which have low processing capabilities.
  • the amount of data can be easily adjusted by using the scalable encoded data in this manner, and thus it is possible to suppress the occurrence of a delay or overflow and to suppress an unnecessary increase in a load of the terminal device or a communication medium.
  • the scalable encoded data (BL+EL) 1011 has a reduced redundancy between layers thereof, and thus it is possible to further reduce the amount of data as compared with a case where encoded data of the layers are set as individual data pieces. Therefore, a storage region of the scalable encoded data storage unit 1001 can be used more efficiently.
  • As with the personal computer 1004 to the mobile phone 1007, various devices can be applied as the terminal device, and the performance of the hardware of the terminal device varies from device to device. In addition, since the applications executed by the terminal devices vary, the capability of the software also varies. Furthermore, any communication line network, including a wired network, a wireless network, or a combination of both, such as the Internet or a local area network (LAN), can be applied as the network 1003 serving as a communication medium, and the data transmission capability thereof varies. Moreover, there is a concern that the data transmission capability may also vary according to other communications.
  • the distribution server 1002 may communicate with a terminal device which is a transmission destination of data before data transmission is started, to obtain information on the capability of the terminal device, such as hardware performance of the terminal device or performance of an application (software) which is executed by the terminal device, and information on a communication environment such as an available bandwidth of the network 1003.
  • the distribution server 1002 may select an appropriate layer on the basis of the information obtained here.
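  • As a rough, non-normative sketch of such a selection, the following assumes that the capability and bandwidth information has already been obtained; the threshold values, the dictionary representation of the layers, and the function name are assumptions made purely for illustration.

```python
# Hypothetical sketch of layer selection in the distribution server 1002.
# The thresholds and the dict layout are illustrative assumptions.

def select_layers(stream, capability, bandwidth_mbps):
    """Return the layers to transmit to one terminal device.

    `stream` maps layer names to encoded bytes, e.g. {"BL": ..., "EL": ...}.
    """
    if capability >= 1.0 and bandwidth_mbps >= 8.0:        # assumed thresholds
        return {"BL": stream["BL"], "EL": stream["EL"]}    # BL+EL 1011 as it is
    return {"BL": stream["BL"]}                            # BL 1012 only

# A high-capability PC gets both layers; a low-capability phone gets the base layer.
stream_1011 = {"BL": b"base-layer-bytes", "EL": b"enhancement-layer-bytes"}
assert "EL" in select_layers(stream_1011, capability=1.0, bandwidth_mbps=10.0)
assert "EL" not in select_layers(stream_1011, capability=0.3, bandwidth_mbps=2.0)
```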
  • the extraction of a layer may be performed by the terminal device.
  • the personal computer 1004 may decode the transmitted scalable encoded data (BL+EL) 1011 to display an image of the base layer and to display an image of the enhancement layer.
  • the personal computer 1004 may extract the scalable encoded data (BL) 1012 of the base layer from the transmitted scalable encoded data (BL+EL) 1011 to store the scalable encoded data (BL) or to transmit the scalable encoded data (BL) to another device, or may decode the scalable encoded data (BL) to display an image of the base layer.
  • The numbers of scalable encoded data storage units 1001, distribution servers 1002, networks 1003, and terminal devices are arbitrary.
  • The example in which the distribution server 1002 transmits data to the terminal device has been described so far, but the usage example is not limited thereto.
  • Any system can be applied as the data transmission system 1000, insofar as it is a system that selects and transmits an appropriate layer according to the capability of the terminal device or a communication environment at the time of transmitting scalably encoded data to the terminal device.
  • the scalable encoding is used for transmission through a plurality of communication media, for example, as in an example illustrated in Fig. 37.
  • a broadcasting station 1101 transmits scalable encoded data (BL) 1121 of a base layer by terrestrial broadcasting 1111.
  • In addition, the broadcasting station 1101 transmits (for example, packetizes and transmits) scalable encoded data (EL) 1122 of an enhancement layer through an arbitrary network 1112 constituted by a wired communication network, a wireless communication network, or a combination of both.
  • a terminal device 1102 has a reception function of the terrestrial broadcasting 1111 that is broadcasted by the broadcasting station 1101, and receives the scalable encoded data (BL) 1121 of the base layer which is transmitted through the terrestrial broadcasting 1111.
  • the terminal device 1102 further has a communication function of performing communication through the network 1112, and receives the scalable encoded data (EL) 1122 of the enhancement layer which is transmitted through the network 1112.
  • the terminal device 1102 decodes the scalable encoded data (BL) 1121 of the base layer which is acquired through the terrestrial broadcasting 1111 in accordance with, for example, a user's instruction to obtain an image of the base layer, stores the data, or transmits the data to another device.
  • In addition, the terminal device 1102 synthesizes the scalable encoded data (BL) 1121 of the base layer acquired through the terrestrial broadcasting 1111 and the scalable encoded data (EL) 1122 of the enhancement layer acquired through the network 1112, in accordance with, for example, a user's instruction, to obtain scalable encoded data (BL+EL), decodes the data to obtain an image of the enhancement layer, stores the data, or transmits the data to another device.
  • the scalable encoded data can be transmitted through, for example, a communication medium that is different for each layer. Therefore, a load can be dispersed, and thus it is possible to suppress the occurrence of a delay or overflow.
  • a communication medium used for transmission may be selected for each layer depending on situations.
  • For example, the scalable encoded data (BL) 1121 of the base layer, which has a relatively large amount of data, may be transmitted through a communication medium having a wide bandwidth, and the scalable encoded data (EL) 1122 of the enhancement layer, which has a relatively small amount of data, may be transmitted through a communication medium having a narrow bandwidth.
  • switching between using the network 1112 and using the terrestrial broadcasting 1111 as a communication medium that transmits the scalable encoded data (EL) 1122 of the enhancement layer may be performed in accordance with an available bandwidth of the network 1112.
  • the same is true of data of any layer.
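  • The following toy sketch illustrates this kind of per-layer medium selection; the bandwidth threshold and the medium name strings are arbitrary illustrative assumptions and are not part of the disclosed system.

```python
# Hedged sketch of per-layer medium selection in the data transmission
# system 1100. The 4.0 Mbps threshold is an arbitrary illustrative value.

def choose_medium(layer: str, network_bandwidth_mbps: float) -> str:
    if layer == "BL":
        return "terrestrial_broadcasting_1111"   # base layer over broadcast
    # Enhancement layer: use the network 1112 while it has available
    # bandwidth, otherwise switch to terrestrial broadcasting, as described above.
    if network_bandwidth_mbps >= 4.0:
        return "network_1112"
    return "terrestrial_broadcasting_1111"

print(choose_medium("BL", 10.0))   # terrestrial_broadcasting_1111
print(choose_medium("EL", 2.0))    # terrestrial_broadcasting_1111 (fallback)
```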
  • In this manner, the data transmission system 1100 divides scalably encoded data into a plurality of pieces in units of layers and transmits the pieces through a plurality of lines; any system that does so can be applied as the data transmission system 1100.
  • scalable encoding is used to store encoded data, for example, as in an example illustrated in Fig. 38.
  • an imaging device 1201 scalably encodes image data that is obtained by imaging a subject 1211, and supplies the image data as scalable encoded data (BL+EL) 1221 to a scalable encoded data storage device 1202.
  • the scalable encoded data storage device 1202 stores the scalable encoded data (BL+EL) 1221 that is supplied from the imaging device 1201 with a quality according to situations. For example, in a normal occasion, the scalable encoded data storage device 1202 extracts data of a base layer from the scalable encoded data (BL+EL) 1221, and stores the data as scalable encoded data (BL) 1222 of the base layer having a low quality and a small amount of data. On the other hand, for example, in a notice occasion, the scalable encoded data storage device 1202 stores the scalable encoded data (BL+EL) 1221 having a high quality and a large amount of data as it is.
  • the scalable encoded data storage device 1202 can save an image with a high image quality only when necessary, and thus it is possible to suppress an increase in the amount of data while suppressing a reduction in the value of the image due to image degradation and to enhance utilization efficiency of a storage region.
  • the imaging device 1201 is assumed to be a monitoring camera.
  • When a monitoring object (for example, an intruder) is not shown in a captured image (in a normal occasion), a reduction in the amount of data is given priority, and thus the image data (scalable encoded data) is stored with a low quality. On the other hand, when a monitoring object is shown in a captured image (in a notice occasion), the image quality is given priority, and thus the image data (scalable encoded data) is stored with a high quality.
  • the scalable encoded data storage device 1202 may determine whether it is a normal occasion or a notice occasion, by analyzing an image.
  • the imaging device 1201 may perform determination, and may transmit the determination result thereof to the scalable encoded data storage device 1202.
  • The criteria for determining whether it is a normal occasion or a notice occasion are arbitrary, and the contents of an image which are used as the criteria for determination are arbitrary. Naturally, conditions other than the contents of the image can also be used as the criteria for determination.
  • For example, the criteria for determination may be switched in accordance with the magnitude or waveform of recorded audio, may be switched at predetermined time intervals, or may be switched in accordance with an instruction from the outside such as a user's instruction.
  • the imaging device 1201 may determine the number of layers of scalable encoding depending on states. For example, in a normal occasion, the imaging device 1201 may generate the scalable encoded data (BL) 1222 of the base layer having a low quality and a small amount of data, and may supply the scalable encoded data (BL) to the scalable encoded data storage device 1202. In addition, for example, in a notice occasion, the imaging device 1201 may generate the scalable encoded data (BL+EL) 1221 having a high quality and a large amount of data, and may supply the scalable encoded data (BL+EL) to the scalable encoded data storage device 1202.
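  • As a purely illustrative sketch of this switching, the following stores the base layer alone in a normal occasion and the full scalable data in a notice occasion; the dictionary layout and the boolean flag standing in for the (arbitrary) determination are assumptions.

```python
# Toy sketch of quality switching in the imaging system 1200. The dict layout
# and the externally supplied notice flag are illustrative assumptions; the
# disclosure leaves the normal/notice determination itself arbitrary.

def store_captured_data(archive, scalable_data, is_notice_occasion):
    """Append BL+EL 1221 in a notice occasion, otherwise BL 1222 only."""
    key = "BL+EL" if is_notice_occasion else "BL"
    archive.append(scalable_data[key])

archive = []
data = {"BL": b"low-quality-base", "BL+EL": b"high-quality-base-plus-enhancement"}
store_captured_data(archive, data, is_notice_occasion=False)  # normal occasion
store_captured_data(archive, data, is_notice_occasion=True)   # e.g. intruder shown
```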
  • Although the monitoring camera has been described so far as an example, the use of the imaging system 1200 is arbitrary and is not limited to the monitoring camera.
  • Fig. 39 illustrates an example of a schematic configuration of a video set to which the present disclosure is applied.
  • A video set 1300 illustrated in Fig. 39 is a multi-functional component, and is a combination of a device having a function regarding encoding or decoding (either one or both of them) of an image and devices having other functions related to that function.
  • the video set 1300 has a module group including a video module 1311, an external memory 1312, a power management module 1313, a front end module 1314, and the like, and devices such as a connectivity port 1321, a camera 1322, and a sensor 1323 which have related functions.
  • a module is a part having a united function in which some component functions related to each other are put together.
  • The specific physical configuration thereof is arbitrary; for example, an integrated module is conceivable in which a plurality of processors having individual functions, electronic circuit elements such as resistors and capacitors, and other devices are disposed on a wiring substrate or the like.
  • It is also conceivable to form a new module by combining another module, a processor, or the like with the module.
  • the video module 1311 is a combination of configurations having functions regarding image processing, and includes an application processor 1331, a video processor 1332, a broadband modem 1333, and an RF module 1334.
  • Each of the above processors is one in which a configuration having a predetermined function is integrated on a semiconductor chip as a system on a chip (SoC); some are referred to as, for example, a system large scale integration (LSI).
  • the configuration having a predetermined function may be a logic circuit (hardware configuration), may be a program (software configuration) which is executed using a CPU, a ROM, or a RAM, or may be a combination of the logic circuit and the program.
  • A processor may include a logic circuit as well as a CPU, a ROM, a RAM, and the like; a portion of its function may be realized by the logic circuit (hardware configuration), and the other portions may be realized by a program (software configuration) executed in the CPU.
  • the application processor 1331 of Fig. 39 is a processor that executes an application regarding image processing.
  • the application executed by the application processor 1331 can not only perform an arithmetic computation process but also control configurations inside and outside the video module 1311, for example, the video processor 1332, when necessary, in order to realize a predetermined function.
  • the video processor 1332 is a processor having a function regarding encoding and decoding (either one or both of them) of an image.
  • the broadband modem 1333 is a processor (or a module) that performs processing regarding wired or wireless (or both of them) broadband communication which is performed through a broadband line such as the Internet or a public telephone line network.
  • the broadband modem 1333 converts data (digital signal) to be transmitted into an analog signal by, for example, digital modulation, or demodulates a received analog signal to convert the analog signal into data (digital signal).
  • the broadband modem 1333 can perform digital modulation and demodulation on any information such as image data processed by the video processor 1332, a stream in which image data is encoded, an application program, or setting data.
  • the RF module 1334 is a module that performs frequency conversion, modulation and demodulation, amplification, filter processing, or the like on a radio frequency (RF) signal that is transmitted and received through an antenna. For example, the RF module 1334 frequency-converts a baseband signal generated by the broadband modem 1333, to generate an RF signal. In addition, for example, the RF module 1334 frequency-converts an RF signal that is received through the front end module 1314, to generate a baseband signal.
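  • As a minimal numerical illustration of such frequency conversion, the following mixes a baseband sample sequence with a cosine carrier; the carrier frequency and sample rate are arbitrary illustrative values, and the actual RF processing in the module is analog and far more involved.

```python
import math

# Toy illustration of the frequency conversion performed by the RF module 1334:
# up-convert baseband samples by mixing them with a carrier. The 1 MHz carrier
# and 10 MHz sample rate are illustrative assumptions.

def upconvert(baseband, carrier_hz=1e6, sample_rate=1e7):
    return [s * math.cos(2 * math.pi * carrier_hz * n / sample_rate)
            for n, s in enumerate(baseband)]

rf_signal = upconvert([0.0, 0.5, 1.0, 0.5, 0.0])
```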
  • the application processor 1331 and the video processor 1332 may be integrated into one processor.
  • the external memory 1312 is a module, provided outside the video module 1311, which includes a storage device used by the video module 1311.
  • the storage device of the external memory 1312 may be realized by any physical configuration. However, in general, the storage device is often used to store large-volume data such as image data in units of frames, and thus it is preferable that the storage device be realized by a semiconductor memory, for example, a dynamic random access memory (DRAM), which is relatively low in price and has a large capacity.
  • the power management module 1313 manages and controls power supply to the video module 1311 (configurations within the video module 1311).
  • the front end module 1314 is a module that provides a front end function (a circuit of a transmission and reception end on the antenna side) to the RF module 1334. As illustrated in Fig. 39, the front end module 1314 includes, for example, an antenna unit 1351, a filter 1352, and an amplification unit 1353.
  • the antenna unit 1351 includes an antenna that transmits and receives a radio signal and the peripheral configurations thereof.
  • the antenna unit 1351 transmits a signal, as a radio signal, which is supplied from the amplification unit 1353, and supplies the received radio signal as an electrical signal (RF signal) to the filter 1352.
  • the filter 1352 performs filter processing on an RF signal that is received through the antenna unit 1351, and supplies the processed RF signal to the RF module 1334.
  • the amplification unit 1353 amplifies the RF signal supplied from the RF module 1334, and supplies the amplified signal to the antenna unit 1351.
  • the connectivity port 1321 is a module having a function regarding connection with the outside.
  • a physical configuration of the connectivity port 1321 is arbitrary.
  • For example, the connectivity port 1321 includes a configuration having a communication function complying with a standard other than the communication standard supported by the broadband modem 1333, an external input and output terminal, and the like.
  • the connectivity port 1321 may include a module having a communication function complying with a wireless communication standard such as Bluetooth (registered trademark), IEEE 802.11 (for example, wireless fidelity (Wi-Fi, registered trademark)), near field communication (NFC), or infrared data association (IrDA), or an antenna that transmits and receives a signal complying with the standard.
  • the connectivity port 1321 may include a module having a communication function complying with a wired communication standard such as a universal serial bus (USB) or high-definition multimedia interface (HDMI, registered trademark), or a terminal complying with the standard.
  • the connectivity port 1321 may have other data (signal) transmission functions such as an analog input and output terminal.
  • the connectivity port 1321 may include a device which is a transmission destination of data (signal).
  • the connectivity port 1321 may include a drive (including a hard disk, a solid state drive (SSD), a network attached storage (NAS), and the like, in addition to a drive of a removable medium) which reads out or writes data with respect to a recording medium such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory.
  • the connectivity port 1321 may include an output device (a monitor, a speaker, or the like) of an image or audio.
  • the camera 1322 is a module that images a subject and has a function of obtaining image data of the subject.
  • the image data obtained by the imaging of the camera 1322 is supplied to, for example, the video processor 1332 and is encoded.
  • the sensor 1323 is a module having a function of any sensor, for example, an audio sensor, an ultrasound wave sensor, an optical sensor, an illuminance sensor, an infrared sensor, an image sensor, a rotation sensor, an angle sensor, an angular velocity sensor, a velocity sensor, an acceleration sensor, an inclination sensor, a magnetic identification sensor, an impact sensor, or a temperature sensor.
  • Data detected by the sensor 1323 is supplied to, for example, the application processor 1331, and is used by an application or the like.
  • the configuration described above as a module may be realized as a processor, and on the contrary, the configuration described above as a processor may be realized as a module.
  • the present disclosure can be applied to the video processor 1332. Therefore, the video set 1300 can be implemented as a set to which the present disclosure is applied.
  • Fig. 40 illustrates an example of a schematic configuration of the video processor 1332 (Fig. 39) to which the present disclosure is applied.
  • the video processor 1332 has a function of receiving inputs of a video signal and an audio signal and encoding the signals using a predetermined scheme, and a function of decoding the encoded video data and audio data and reproducing and outputting the video signal and the audio signal.
  • the video processor 1332 includes a video input processing unit 1401, a first image enlargement and reduction unit 1402, a second image enlargement and reduction unit 1403, a video output processing unit 1404, a frame memory 1405, and a memory control unit 1406.
  • the video processor 1332 includes an encoder and decoder engine 1407, video elementary stream (ES) buffers 1408A and 1408B, and audio ES buffers 1409A and 1409B.
  • the video processor 1332 includes an audio encoder 1410, an audio decoder 1411, a multiplexer (MUX) 1412, a demultiplexer (DMUX) 1413, and a stream buffer 1414.
  • the video input processing unit 1401 acquires a video signal that is input from, for example, the connectivity port 1321 (Fig. 39), and converts the input video signal to digital image data.
  • the first image enlargement and reduction unit 1402 performs format conversion, image enlargement and reduction processing, or the like on image data.
  • the second image enlargement and reduction unit 1403 performs image enlargement and reduction processing on image data, in accordance with a format in a destination of outputting through the video output processing unit 1404, or performs format conversion, image enlargement and reduction processing, or the like on image data, which is similar to that in the first image enlargement and reduction unit 1402.
  • the video output processing unit 1404 performs format conversion, conversion into an analog signal, or the like on image data to output the image data as a reproduced video signal to, for example, the connectivity port 1321 (Fig. 39).
  • the frame memory 1405 is a memory for image data which is shared by the video input processing unit 1401, the first image enlargement and reduction unit 1402, the second image enlargement and reduction unit 1403, the video output processing unit 1404, and the encoder and decoder engine 1407.
  • the frame memory 1405 is realized as a semiconductor memory, for example, a DRAM.
  • the memory control unit 1406 receives a synchronization signal from the encoder and decoder engine 1407 to control access of writing and read-out to the frame memory 1405 in accordance with an access schedule to the frame memory 1405 which is written in an access management table 1406A.
  • the access management table 1406A is updated by the memory control unit 1406 in accordance with processing performed by the encoder and decoder engine 1407, the first image enlargement and reduction unit 1402, the second image enlargement and reduction unit 1403, and the like.
  • the encoder and decoder engine 1407 performs encoding processing of image data, and decoding processing of a video stream which is data obtained by encoding image data. For example, the encoder and decoder engine 1407 encodes image data that is read out from the frame memory 1405, and sequentially writes the encoded image data as a video stream in the video ES buffer 1408A. In addition, for example, the encoder and decoder engine sequentially reads out video streams from the video ES buffer 1408B, decodes the read-out video streams, and sequentially writes the video streams as image data in the frame memory 1405. The encoder and decoder engine 1407 uses the frame memory 1405 as a work area in the encoding and the decoding. In addition, the encoder and decoder engine 1407 outputs a synchronization signal to the memory control unit 1406, for example, at a timing when processing for each macroblock is started.
  • the video ES buffer 1408A buffers a video stream generated by the encoder and decoder engine 1407 and supplies the buffered video stream to the multiplexer (MUX) 1412.
  • the video ES buffer 1408B buffers a video stream supplied from the demultiplexer (DMUX) 1413 and supplies the buffered video stream to the encoder and decoder engine 1407.
  • the audio ES buffer 1409A buffers an audio stream generated by the audio encoder 1410 and supplies the buffered audio stream to the multiplexer (MUX) 1412.
  • the audio ES buffer 1409B buffers an audio stream supplied from the demultiplexer (DMUX) 1413 and supplies the buffered audio stream to the audio decoder 1411.
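  • as a rough illustration of this buffered encode/decode flow, the following is a minimal Python sketch; the encode() and decode() stubs and the deque-based buffers are hypothetical stand-ins for the encoder and decoder engine 1407 and the ES buffers, not the actual implementation.

    from collections import deque

    # Hypothetical stand-ins for the encoder and decoder engine 1407;
    # the names and the "bitstream" format are illustrative only.
    def encode(frame):
        return b"ES:" + frame        # pretend one encoded access unit

    def decode(es_packet):
        return es_packet[3:]         # inverse of the stub above

    frame_memory = deque([b"frame0", b"frame1"])  # stands in for frame memory 1405
    video_es_buffer = deque()                     # stands in for video ES buffer 1408A

    # Encoding path: read image data out of the frame memory and
    # sequentially write the resulting video stream into the ES buffer.
    while frame_memory:
        video_es_buffer.append(encode(frame_memory.popleft()))

    # Decoding path: read the ES buffer and produce decoded frames.
    decoded = [decode(pkt) for pkt in video_es_buffer]
    print(decoded)  # [b'frame0', b'frame1']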
  • the audio encoder 1410 digitally converts an audio signal that is input from, for example, the connectivity port 1321 (Fig. 39), and encodes the converted audio signal in a predetermined format, for example, an MPEG audio format or an AudioCode number 3 (AC3) format.
  • the audio encoder 1410 sequentially writes, in the audio ES buffer 1409A, an audio stream which is data obtained by encoding an audio signal.
  • the audio decoder 1411 decodes an audio stream that is supplied from the audio ES buffer 1409B, performs conversion into, for example, an analog signal, and supplies the analog signal as a reproduced audio signal to, for example, the connectivity port 1321 (Fig. 39).
  • the multiplexer (MUX) 1412 multiplexes a video stream and an audio stream.
  • the multiplexing method (that is, a format of a bit stream that is generated by multiplexing) is arbitrary.
  • the multiplexer (MUX) 1412 can add predetermined header information or the like to a bit stream.
  • the multiplexer (MUX) 1412 can convert the format of the stream by multiplexing.
  • the multiplexer (MUX) 1412 multiplexes a video stream and an audio stream to convert the multiplexed stream to a transport stream which is a bit stream having a format for transportation.
  • the multiplexer (MUX) 1412 multiplexes a video stream and an audio stream to convert the multiplexed stream to data (file data) having a file format for recording.
  • the demultiplexer (DMUX) 1413 demultiplexes a bit stream in which a video stream and an audio stream are multiplexed, using a method corresponding to the multiplexing using the multiplexer (MUX) 1412. In other words, the demultiplexer (DMUX) 1413 extracts the video stream and the audio stream (separates the video stream and the audio stream) from the bit stream that is read out from the stream buffer 1414. In other words, the demultiplexer (DMUX) 1413 can convert a format of a stream by demultiplexing (inverse conversion of conversion using the multiplexer (MUX) 1412).
  • the demultiplexer (DMUX) 1413 can acquire a transport stream that is supplied from, for example, the connectivity port 1321 or the broadband modem 1333 (both shown in Fig. 39) through the stream buffer 1414, and can convert the acquired transport stream to a video stream and an audio stream by demultiplexing.
  • the demultiplexer (DMUX) 1413 can acquire file data that is read out from various recording media by, for example, the connectivity port 1321 (Fig. 39) through the stream buffer 1414, and can convert the acquired file data to a video stream and an audio stream by demultiplexing.
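  • the following is a minimal Python sketch of the idea of multiplexing and demultiplexing; the toy tag-plus-length container used here is an assumption for illustration, not the transport stream or file format that the multiplexer (MUX) 1412 and the demultiplexer (DMUX) 1413 actually produce.

    import struct

    # Toy container: each packet is a 1-byte stream tag (b"V" or b"A"),
    # a 4-byte big-endian payload length, then the payload itself.
    def mux(video_chunks, audio_chunks):
        out = bytearray()
        for tag, chunks in ((b"V", video_chunks), (b"A", audio_chunks)):
            for chunk in chunks:
                out += tag + struct.pack(">I", len(chunk)) + chunk
        return bytes(out)

    def demux(bitstream):
        video, audio, pos = [], [], 0
        while pos < len(bitstream):
            tag = bitstream[pos:pos + 1]
            (length,) = struct.unpack(">I", bitstream[pos + 1:pos + 5])
            payload = bitstream[pos + 5:pos + 5 + length]
            (video if tag == b"V" else audio).append(payload)
            pos += 5 + length
        return video, audio

    stream = mux([b"v0", b"v1"], [b"a0"])   # combine two streams into one
    print(demux(stream))                    # ([b'v0', b'v1'], [b'a0'])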
  • the stream buffer 1414 buffers a bit stream.
  • the stream buffer 1414 buffers the transport stream that is supplied from the multiplexer (MUX) 1412, and supplies the buffered transport stream to, for example, the connectivity port 1321 or the broadband modem 1333 (both shown in Fig. 39) at a predetermined timing or on the basis of a request from the outside.
  • the stream buffer 1414 buffers the file data that is supplied from the multiplexer (MUX) 1412, supplies the buffered file data to, for example, the connectivity port 1321 (Fig. 39) at a predetermined timing or on the basis of a request from the outside, and causes various recording media to record the buffered file data.
  • the stream buffer 1414 buffers the transport stream that is acquired through, for example, the connectivity port 1321 or the broadband modem 1333 (both shown in Fig. 39), and supplies the buffered transport stream to the demultiplexer (DMUX) 1413 at a predetermined timing or on the basis of a request from the outside.
  • the stream buffer 1414 buffers file data that is read out from various recording media in, for example, the connectivity port 1321 (Fig. 39), and supplies the buffered file data to the demultiplexer (DMUX) 1413 at a predetermined timing or on the basis of a request from the outside.
  • a video signal that is input to the video processor 1332 from the connectivity port 1321 is converted to digital image data having a predetermined format such as a 4:2:2Y/Cb/Cr format in the video input processing unit 1401, and is sequentially written in the frame memory 1405.
  • the digital image data is read out to the first image enlargement and reduction unit 1402 or the second image enlargement and reduction unit 1403, is subjected to format conversion and enlargement and reduction processing in a predetermined format such as a 4:2:0Y/Cb/Cr format, and is written again in the frame memory 1405.
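  • the conversion from a 4:2:2 to a 4:2:0 Y/Cb/Cr format halves the vertical chroma resolution; a minimal NumPy sketch follows, assuming a simple average of vertically adjacent chroma rows (the actual unit may use a different filter).

    import numpy as np

    def yuv422_to_yuv420(y, cb, cr):
        """Halve vertical chroma resolution (4:2:2 -> 4:2:0) by averaging
        vertically adjacent chroma rows; luma is left untouched."""
        cb420 = ((cb[0::2].astype(np.uint16) + cb[1::2]) // 2).astype(np.uint8)
        cr420 = ((cr[0::2].astype(np.uint16) + cr[1::2]) // 2).astype(np.uint8)
        return y, cb420, cr420

    y  = np.zeros((8, 8), dtype=np.uint8)       # luma at full resolution
    cb = np.full((8, 4), 128, dtype=np.uint8)   # 4:2:2 chroma: half width, full height
    cr = np.full((8, 4), 128, dtype=np.uint8)
    _, cb420, _ = yuv422_to_yuv420(y, cb, cr)
    print(cb420.shape)                          # (4, 4): height halved as well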
  • the image data is encoded by the encoder and decoder engine 1407 and is written as a video stream in the video ES buffer 1408A.
  • an audio signal that is input to the video processor 1332 from the connectivity port 1321 (Fig. 39) or the like is encoded by the audio encoder 1410, and is written as an audio stream in the audio ES buffer 1409A.
  • the video stream of the video ES buffer 1408A and the audio stream of the audio ES buffer 1409A are read out to the multiplexer (MUX) 1412 and are multiplexed, and are converted to a transport stream, file data, or the like.
  • the transport stream generated by the multiplexer (MUX) 1412 is buffered by the stream buffer 1414, and is then output to an external network through, for example, the connectivity port 1321 or the broadband modem 1333 (both shown in Fig. 39).
  • the file data generated by the multiplexer (MUX) 1412 is buffered by the stream buffer 1414, and is then output to, for example, the connectivity port 1321 (Fig. 39), and is recorded in various recording media.
  • a transport stream that is input to the video processor 1332 from an external network through, for example, the connectivity port 1321 or the broadband modem 1333 (both shown in Fig. 39) is buffered by the stream buffer 1414, and is then demultiplexed by the demultiplexer (DMUX) 1413.
  • the file data that is read out from various recording media in, for example, the connectivity port 1321 (Fig. 39) and is then input to the video processor 1332 is buffered by the stream buffer 1414 and is then demultiplexed by the demultiplexer (DMUX) 1413.
  • the transport stream or the file data that is input to the video processor 1332 is divided into a video stream and an audio stream by the demultiplexer (DMUX) 1413.
  • the audio stream is supplied to the audio decoder 1411 through the audio ES buffer 1409B and is decoded, and thus an audio signal is reproduced.
  • the video stream is written in the video ES buffer 1408B, is then sequentially read out by the encoder and decoder engine 1407 to be decoded, and is written in the frame memory 1405.
  • the decoded image data is subjected to enlargement and reduction processing by the second image enlargement and reduction unit 1403, and is written in the frame memory 1405.
  • the decoded image data is read out to the video output processing unit 1404, is subjected to format conversion in a predetermined format such as a 4:2:2Y/Cb/Cr format, and is converted to an analog signal, and thus a video signal is reproduced and output.
  • when the present disclosure is applied to the video processor 1332 that is configured in such a manner, the present disclosure according to the above-described embodiments may be applied to the encoder and decoder engine 1407.
  • the encoder and decoder engine 1407 may have a function of the image encoding apparatus 100 (Fig. 10) according to the first embodiment or the image decoding apparatus 200 (Fig. 19) according to the second embodiment.
  • the encoder and decoder engine 1407 may have a function of the multi-view image encoding apparatus 600 (Fig. 29) or the multi-view image decoding apparatus 610 (Fig. 30) according to the third embodiment.
  • the video processor 1332 can obtain effects similar to the effects described above with reference to Fig. 1 to Fig. 27.
  • the present disclosure (that is, a function of the image encoding apparatus or the image decoding apparatus according to the above-described embodiments) may be realized by hardware such as a logic circuit, may be realized by software such as an embedded program, or may be realized by both of them.
  • Fig. 41 illustrates another example of a schematic configuration of the video processor 1332 (Fig. 39) to which the present disclosure is applied.
  • the video processor 1332 has a function of encoding and decoding video data in a predetermined format.
  • the video processor 1332 includes a control unit 1511, a display interface 1512, a display engine 1513, an image processing engine 1514, and an internal memory 1515.
  • the video processor 1332 includes a codec engine 1516, a memory interface 1517, a multiplexer and demultiplexer (MUX/DMUX) 1518, a network interface 1519, and a video interface 1520.
  • the control unit 1511 controls operations of processing units within the video processor 1332, such as the display interface 1512, the display engine 1513, the image processing engine 1514, and the codec engine 1516.
  • the control unit 1511 includes, for example, a main CPU 1531, a sub CPU 1532, and a system controller 1533.
  • the main CPU 1531 executes a program or the like for controlling operations of the processing units within the video processor 1332.
  • the main CPU 1531 generates a control signal in accordance with the program or the like, and supplies the control signal to the processing units (in other words, controls the operations of the processing units).
  • the sub CPU 1532 plays a supplementary role to the main CPU 1531.
  • the sub CPU 1532 executes a child process or a subroutine of the program or the like that is executed by the main CPU 1531.
  • the system controller 1533 controls operations of the main CPU 1531 and the sub CPU 1532, for example, designation of programs executed by the main CPU 1531 and the sub CPU 1532.
  • the display interface 1512 outputs image data to, for example, the connectivity port 1321 (Fig. 39) under the control of the control unit 1511.
  • the display interface 1512 converts digital image data to an analog signal and outputs it as a reproduced video signal, or outputs the digital image data as it is, to a monitor device of the connectivity port 1321 (Fig. 39).
  • the display engine 1513 performs various pieces of processing such as format conversion, size conversion, or color gamut conversion on image data under the control of the control unit 1511 in conformity with hardware specifications of a monitor device or the like which displays the image.
  • the image processing engine 1514 performs predetermined image processing, for example, filter processing for improving an image quality on the image data under the control of the control unit 1511.
  • the internal memory 1515 is a memory, shared by the display engine 1513, the image processing engine 1514, and the codec engine 1516, which is provided within the video processor 1332.
  • the internal memory 1515 is used to transmit and receive data, for example, between the display engine 1513, the image processing engine 1514, and the codec engine 1516.
  • the internal memory 1515 stores data that is supplied from the display engine 1513, the image processing engine 1514, or the codec engine 1516, and supplies the data to the display engine 1513, the image processing engine 1514, or the codec engine 1516 when necessary (for example, in accordance with a request).
  • the internal memory 1515 may be realized by any storage device.
  • the internal memory is often used to store data having a small size such as image data in units of blocks or parameters, and thus it is preferable that the internal memory be realized by a semiconductor memory, for example, a static random access memory (SRAM), which has a relatively (as compared with, for example, the external memory 1312) small capacity but has a high response speed.
  • the codec engine 1516 performs processing regarding encoding or decoding of image data.
  • Encoding and decoding schemes corresponding to the codec engine 1516 are arbitrary, and the number of schemes may be one or two or more.
  • the codec engine 1516 has codec functions of a plurality of encoding and decoding schemes, and may perform encoding of image data or decoding of encoded data using a codec function selected from among them.
  • the codec engine 1516 includes, for example, an MPEG-2 Video 1541, an AVC/H.264 1542, an HEVC/H.265 1543, an HEVC/H.265 (scalable) 1544, an HEVC/H.265 (multi-view) 1545, and an MPEG-DASH 1551, as functional blocks of processing regarding a codec.
  • the MPEG-2 Video 1541 is a functional block that encodes or decodes image data in an MPEG-2 format.
  • the AVC/H.264 1542 is a functional block that encodes or decodes image data in an AVC format.
  • the HEVC/H.265 1543 is a functional block that encodes or decodes image data in an HEVC format.
  • the HEVC/H.265 (Scalable) 1544 is a functional block that scalably encodes or decodes image data in an HEVC format.
  • the HEVC/H.265 (Multi-view) 1545 is a functional block that performs multi-view encoding or multi-view decoding on image data in an HEVC format.
  • the MPEG-DASH 1551 is a functional block that transmits and receives image data in an MPEG-Dynamic Adaptive Streaming over HTTP (MPEG-DASH) format.
  • MPEG-DASH is a technique of performing streaming of a video using hypertext transfer protocol (HTTP), and has a feature of selecting appropriate pieces of encoded data in units of segments from a plurality of pieces of encoded data, having different resolutions, which are prepared in advance, and of transmitting the selected data.
  • the MPEG-DASH 1551 generates a stream complying with a standard and controls the transmission of the stream, and the MPEG-2 Video 1541 to the HEVC/H.265 (Multi-view) 1545 described above are used for encoding and decoding of image data.
  • the memory interface 1517 is an interface for the external memory 1312. Data supplied from the image processing engine 1514 or the codec engine 1516 is supplied to the external memory 1312 through the memory interface 1517. In addition, data read out from the external memory 1312 is supplied to the video processor 1332 (the image processing engine 1514 or the codec engine 1516) through the memory interface 1517.
  • the multiplexer and demultiplexer (MUX/DMUX) 1518 performs multiplexing and demultiplexing on various pieces of data regarding an image, such as a bit stream of encoded data, image data, or a video signal. Methods of the multiplexing and the demultiplexing are arbitrary. For example, at the time of multiplexing, the multiplexer and demultiplexer (MUX/DMUX) 1518 can not only combine a plurality of pieces of data into one, but also add predetermined header information or the like to the data.
  • the multiplexer and demultiplexer (MUX/DMUX) 1518 can not only divide one piece of data into a plurality of pieces of data, but also add predetermined header information or the like to each of the divided pieces of data.
  • the multiplexer and demultiplexer (MUX/DMUX) 1518 can convert a format of data by multiplexing and demultiplexing.
  • the multiplexer and demultiplexer (MUX/DMUX) 1518 can multiplex a bit stream to convert the multiplexed bit stream to a transport stream which is a bit stream having a format for transmission, or to data (file data) having a file format for recording.
  • the inverse conversion thereof can also be performed by demultiplexing.
  • the network interface 1519 is an interface for the broadband modem 1333 or the connectivity port 1321 (both shown in Fig. 39).
  • the video interface 1520 is an interface for the connectivity port 1321 or the camera 1322 (both shown in Fig. 39).
  • the transport stream is supplied to the multiplexer and demultiplexer (MUX/DMUX) 1518 through the network interface 1519 to be demultiplexed, and is decoded by the codec engine 1516.
  • image data obtained by the decoding of the codec engine 1516 is subjected to predetermined image processing by the image processing engine 1514, is subjected to predetermined conversion by the display engine 1513, and is supplied to, for example, the connectivity port 1321 (Fig. 39) through the display interface 1512, and thus the image thereof is displayed on a monitor.
  • image data obtained by the decoding of the codec engine 1516 is encoded again by the codec engine 1516, is multiplexed by the multiplexer and demultiplexer (MUX/DMUX) 1518 to be converted to file data, is output to, for example, the connectivity port 1321 (Fig. 39) through the video interface 1520, and is recorded in various recording media.
  • file data of encoded data obtained by encoding image data, which is read out from a recording medium not shown in the drawing by the connectivity port 1321 (Fig. 39), is supplied to the multiplexer and demultiplexer (MUX/DMUX) 1518 through the video interface 1520 to be demultiplexed, and is decoded by the codec engine 1516.
  • image data obtained by the decoding of the codec engine 1516 is subjected to predetermined image processing by the image processing engine 1514, is subjected to predetermined conversion by the display engine 1513, and is supplied to, for example, the connectivity port 1321 (Fig. 39) through the display interface 1512, and thus the image thereof is displayed on a monitor.
  • image data obtained by the decoding of the codec engine 1516 is encoded again by the codec engine 1516, is multiplexed by the multiplexer and demultiplexer (MUX/DMUX) 1518 to be converted to a transport stream, is supplied to, for example, the connectivity port 1321 or the broadband modem 1333 (both shown in Fig. 39) through the network interface 1519, and is transmitted to another device not shown in the drawing.
  • transmission and reception of image data or other pieces of data between the processing units within the video processor 1332 are performed using, for example, the internal memory 1515 or the external memory 1312.
  • the power management module 1313 controls power supply to, for example, the control unit 1511.
  • when the present disclosure is applied to the video processor 1332 that is configured in this manner, the present disclosure according to the above-described embodiments may be applied to the codec engine 1516.
  • the codec engine 1516 may include a functional block that realizes the image encoding apparatus 100 (Fig. 10) according to the first embodiment or the image decoding apparatus 200 (Fig. 19) according to the second embodiment.
  • the codec engine 1516 may include a functional block that realizes the multi-view image encoding apparatus 600 (Fig. 29) or the multi-view image decoding apparatus 610 (Fig. 30) according to the third embodiment.
  • the video processor 1332 can obtain effects similar to the effects described above with reference to Fig. 1 to Fig. 27.
  • the present disclosure (that is, functions of the image encoding apparatus or the image decoding apparatus according to the above-described embodiments) may be realized by hardware such as a logic circuit, may be realized by software such as an embedded program, or may be realized by both of them.
  • the configuration of the video processor 1332 is arbitrary, and configurations other than the above-described two examples may be used.
  • the video processor 1332 may be configured as one semiconductor chip, or may be configured as a plurality of semiconductor chips.
  • the video processor may be a three-dimensional stacked LSI in which a plurality of semiconductors are stacked.
  • the video processor may be realized by a plurality of LSIs.
  • the video set 1300 can be embedded into various devices that process image data.
  • the video set 1300 can be embedded into the television device 900 (Fig. 32), the mobile phone 920 (Fig. 33), the recording and reproducing device 940 (Fig. 34), or the imaging device 960 (Fig. 35).
  • by the video set 1300 being embedded, the device can obtain effects similar to the effects described above with reference to Fig. 1 to Fig. 27.
  • the video set 1300 can also be embedded into, for example, the terminal devices such as the personal computer 1004, the AV device 1005, the tablet device 1006, and the mobile phone 1007 in the data transmission system 1000 of Fig. 36, the broadcasting station 1101 and the terminal device 1102 in the data transmission system 1100 of Fig. 37, and the imaging device 1201 and the scalable encoded data storage device 1202 in the imaging system 1200 of Fig. 38.
  • by the video set 1300 being embedded, the device can obtain effects similar to the effects described above with reference to Fig. 1 to Fig. 27.
  • the video set can also be embedded into devices in the content reproduction system of Fig. 42 or the wireless communication system of Fig. 48.
  • even a part of each configuration of the video set 1300 described above can be implemented as a configuration to which the present disclosure is applied, as long as the part includes the video processor 1332.
  • for example, the video processor 1332 alone can be implemented as a video processor to which the present disclosure is applied.
  • for example, the processor indicated by the dotted line 1341 or the video module 1311 can be implemented as a processor or a module to which the present disclosure is applied.
  • the video module 1311, the external memory 1312, the power management module 1313, and the front end module 1314 can also be combined with each other to be implemented as a video unit 1361 to which the present disclosure is applied. In any of these configurations, it is possible to obtain effects similar to the effects described above with reference to Fig. 1 to Fig. 27.
  • any configuration including the video processor 1332 can be embedded into various devices that process image data.
  • the video processor 1332, the processor shown as the dotted line 1341, the video module 1311, or the video unit 1361 can be embedded into the television device 900 (Fig. 32), the mobile phone 920 (Fig. 33), the recording and reproducing device 940 (Fig. 34), the imaging device 960 (Fig. 35), terminal devices such as the personal computer 1004, the AV device 1005, the tablet device 1006, and the mobile phone 1007 in the data transmission system 1000 of Fig. 36, the broadcasting station 1101 and the terminal device 1102 in the data transmission system 1100 of Fig. 37, and the like.
  • any configuration having the present disclosure applied thereto may be embedded, and thus the device thereof can obtain effects similar to the effects described above with reference to Fig. 1 to Fig. 27, similar to the case of the video set 1300.
  • the present disclosure can also be applied to a content reproduction system of HTTP streaming, for example, MPEG DASH to be described later, or to a wireless communication system having a Wi-Fi standard, wherein the content reproduction system selects appropriate pieces of encoded data in units of segments from a plurality of pieces of encoded data, prepared in advance, which have different resolutions and uses the selected encoded data.
  • Fig. 42 is an explanatory diagram illustrating a configuration of a content reproduction system.
  • the content reproduction system includes content servers 1610, 1611, a network 1612, and a content reproduction apparatus 1620 (client device).
  • the content servers 1610, 1611 and the content reproduction apparatus 1620 are connected to each other through the network 1612.
  • the network 1612 is a wired or wireless transmission path of information that is transmitted from a device connected to the network 1612.
  • the network 1612 may include a public line network such as the Internet, a telephone line network, or a satellite communication network, various local area networks (LANs) including Ethernet (registered trademark), wide area networks (WANs), and a dedicated line network such as an Internet protocol-virtual private network (IP-VPN).
  • the content server 1610 encodes content data, generates a data file including encoded data and meta-information of the encoded data, and records the generated data file. Meanwhile, when the content server 1610 generates a data file having an MP4 format, the encoded data corresponds to "mdat", and the meta-information corresponds to "moov".
  • content data may be music data (for example, music, a lecture, or a radio program), video data (such as a movie, a television program, a video program, a photo, a document, a painting, or a chart), a game, software, or the like.
  • the content server 1610 generates a plurality of data files at different bit rates with respect to the same content.
  • with respect to a request for reproducing a content from the content reproduction apparatus 1620, the content server 1611 transmits, to the content reproduction apparatus 1620, information of a URL of the content server 1610, including information of a parameter to be added to the URL by the content reproduction apparatus 1620.
  • these matters will be specifically described with reference to Fig. 43.
  • Fig. 43 is an explanatory diagram illustrating a flow of data in the content reproduction system of Fig. 42.
  • the content server 1610 encodes the same content data at different bit rates, and generates, for example, a file A of 2 Mbps, a file B of 1.5 Mbps, and a file C of 1 Mbps, as illustrated in Fig. 43.
  • the file A has a high bit rate, the file B has a standard bit rate, and the file C has a low bit rate.
  • encoded data of each file is divided into a plurality of segments.
  • the encoded data of the file A is divided into segments "A1", "A2", "A3", ..., "An"
  • the encoded data of the file B is divided into segments "B1", "B2", "B3", ..., "Bn"
  • the encoded data of the file C is divided into segments "C1", "C2", "C3", ..., "Cn"
  • each segment may be constituted by samples of one or more pieces of video encoded data and audio encoded data which are independently reproducible and begin with a sync sample of MP4 (for example, an IDR picture in video encoding of AVC/H.264).
  • each segment may be, for example, video and audio encoded data corresponding to 2 seconds, which is equivalent to 4 GOPs, or corresponding to 10 seconds, which is equivalent to 20 GOPs.
  • the reproduction ranges (ranges of temporal positions from the head of a content) of segments having the same arrangement order in the files are the same. For example, the reproduction ranges of the segment "A2", the segment "B2", and the segment "C2" are the same. When each segment is encoded data corresponding to 2 seconds, all the reproduction ranges of the segment "A2", the segment "B2", and the segment "C2" are 2 seconds to 4 seconds of the content.
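  • the mapping between a playback position and a segment index is simple arithmetic; a small Python sketch, assuming the 2-second segments of the example above:

    SEGMENT_SECONDS = 2  # per the example above: each segment holds 2 seconds

    def segment_index(position_seconds):
        """0-based index of the segment covering a playback position."""
        return int(position_seconds // SEGMENT_SECONDS)

    def reproduction_range(index):
        """Time range [start, end) covered by a segment, in seconds."""
        return index * SEGMENT_SECONDS, (index + 1) * SEGMENT_SECONDS

    # The second segment ("A2", "B2", or "C2", index 1) covers 2 s to 4 s,
    # regardless of which file (bit rate) it is taken from.
    print(reproduction_range(1))  # (2, 4)
    print(segment_index(3.5))     # 1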
  • the content server 1610 stores the file A to the file C. As illustrated in Fig. 43, the content server 1610 sequentially transmits segments constituting different files to the content reproduction apparatus 1620, and the content reproduction apparatus 1620 streaming-reproduces the received segments.
  • the content server 1610 transmits a play list file (hereinafter, media presentation description (MPD)) which includes bit rate information and access information of each piece of encoded data, to the content reproduction apparatus 1620.
  • the content reproduction apparatus 1620 selects any bit rate in a plurality of bit rates on the basis of the MPD, and requests the transmission of a segment corresponding to the selected bit rate from the content server 1610.
  • Fig. 42 illustrates only one content server 1610, but it is needless to say that the present disclosure is not limited to such an example.
  • Fig. 44 is an explanatory diagram illustrating a specific example of MPD.
  • the MPD includes access information on a plurality of pieces of encoded data having different bit rates (BANDWIDTH).
  • the MPD illustrated in Fig. 44 indicates that pieces of encoded data of 256 Kbps, 1.024 Mbps, 1.384 Mbps, 1.536 Mbps, and 2.048 Mbps are present, and includes access information on the pieces of encoded data.
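  • the following is a minimal Python sketch of reading such bandwidth entries from an MPD-like document; the XML is deliberately simplified and is not the full MPD schema (which carries periods, adaptation sets, segment information, and namespaces).

    import xml.etree.ElementTree as ET

    # A deliberately minimal MPD-like document for illustration only.
    MPD_XML = """
    <MPD>
      <Representation id="1" bandwidth="256000"/>
      <Representation id="2" bandwidth="1024000"/>
      <Representation id="3" bandwidth="1384000"/>
      <Representation id="4" bandwidth="1536000"/>
      <Representation id="5" bandwidth="2048000"/>
    </MPD>
    """

    root = ET.fromstring(MPD_XML)
    bandwidths = sorted(int(r.get("bandwidth")) for r in root.iter("Representation"))
    print(bandwidths)  # [256000, 1024000, 1384000, 1536000, 2048000]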
  • the content reproduction apparatus 1620 can dynamically change a bit rate of encoded data to be streaming-reproduced on the basis of the MPD.
  • Fig. 42 illustrates a portable terminal as an example of the content reproduction apparatus 1620, but the content reproduction apparatus 1620 is not limited to such an example.
  • the content reproduction apparatus 1620 may be an information processing apparatus such as a personal computer (PC), a home video processing apparatus (a DVD recorder, a video cassette recorder, or the like), personal digital assistants (PDAs), a home-use game machine, or a household electrical appliance.
  • the content reproduction apparatus 1620 may be an information processing apparatus such as a mobile phone, a personal handyphone system (PHS), a portable music playback apparatus, a portable video processing apparatus, or a portable game machine.
  • Fig. 45 is a functional block diagram illustrating a configuration of the content server 1610.
  • the content server 1610 includes a file generation unit 1631, a storage unit 1632, and a communication unit 1633.
  • the file generation unit 1631 includes an encoder 1641 that encodes content data, and generates a plurality of pieces of encoded data having the same contents and different bit rates, and the above-described MPD. For example, when the file generation unit 1631 generates pieces of encoded data of 256 Kbps, 1.024 Mbps, 1.384 Mbps, 1.536 Mbps, and 2.048 Mbps, the file generation unit generates an MPD as illustrated in Fig. 44.
  • the storage unit 1632 stores the plurality of pieces of encoded data, having different bit rates, and the MPD which are generated by the file generation unit 1631.
  • the storage unit 1632 may be a storage medium such as a non-volatile memory, a magnetic disk, an optical disc, and a magneto optical (MO) disc.
  • examples of the non-volatile memory include an electrically erasable programmable read-only memory (EEPROM) and an erasable programmable ROM (EPROM).
  • examples of the magnetic disk include a hard disk, a discoidal magnetic body disk, and the like.
  • examples of the optical disc include a compact disc (CD), a digital versatile disc recordable (DVD-R), Blu-Ray disc (BD; registered trademark), and the like.
  • the communication unit 1633 is an interface with the content reproduction apparatus 1620, and communicates with the content reproduction apparatus 1620 through the network 1612.
  • the communication unit 1633 has a function as an HTTP server that communicates with the content reproduction apparatus 1620 in accordance with HTTP.
  • the communication unit 1633 transmits an MPD to the content reproduction apparatus 1620, extracts, from the storage unit 1632, the encoded data that is requested in accordance with HTTP from the content reproduction apparatus 1620 on the basis of the MPD, and transmits the encoded data to the content reproduction apparatus 1620 as an HTTP response.
  • Fig. 46 is a functional block diagram illustrating a configuration of the content reproduction apparatus 1620.
  • the content reproduction apparatus 1620 includes a communication unit 1651, a storage unit 1652, a reproducing unit 1653, a selection unit 1654, and a present location acquisition unit 1656.
  • the communication unit 1651 which is an interface with the content server 1610, requests data from the content server 1610 and acquires data from the content server 1610.
  • the communication unit 1651 has a function as an HTTP client that communicates with the content server 1610 in accordance with HTTP.
  • the communication unit 1651 can selectively acquire an MPD and segments of encoded data from the content server 1610, using HTTP range requests.
  • the storage unit 1652 stores various pieces of information on the reproduction of contents. For example, the storage unit sequentially buffers the segments acquired from the content server 1610 by the communication unit 1651. The segments of the encoded data which are buffered by the storage unit 1652 are sequentially supplied to the reproducing unit 1653 by first in first out (FIFO).
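  • a minimal Python sketch of this acquisition and buffering follows; the segment URL is hypothetical, and the deque stands in for the FIFO buffering of the storage unit 1652.

    import urllib.request
    from collections import deque

    segment_fifo = deque()  # stands in for the buffer of the storage unit 1652

    def fetch_range(url, start, end):
        """Acquire one byte range of a resource with an HTTP Range request."""
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:  # expect 206 Partial Content
            return resp.read()

    # Hypothetical segment URL; any server honoring Range requests would do.
    # segment_fifo.append(fetch_range("http://example.com/contents/A2.m4s", 0, 65535))

    # First in, first out: the oldest buffered segment is handed to the
    # reproducing unit first.
    # next_segment = segment_fifo.popleft()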
  • on the basis of an instruction for adding a parameter to the URL of a content described in the MPD, which is requested from the content server 1611 to be described later, the storage unit 1652 stores a definition by which the communication unit 1651 adds the parameter to the URL.
  • the reproducing unit 1653 sequentially reproduces segments that are supplied from the storage unit 1652. Specifically, the reproducing unit 1653 performs decoding of segments, DA conversion, rendering, and the like.
  • the selection unit 1654 sequentially selects, within the same content, which bit rate's segments of encoded data, among the bit rates included in the MPD, to acquire, as shown in the sketch below. For example, when the selection unit 1654 sequentially selects the segments "A1", "B2", and "A3" in accordance with a band of the network 1612, the communication unit 1651 sequentially acquires the segments "A1", "B2", and "A3" from the content server 1610, as illustrated in Fig. 43.
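  • a minimal Python sketch of such bit-rate selection follows; the 80% safety margin and the band estimates are assumptions for illustration, chosen so that the sequence "A1", "B2", "A3" of Fig. 43 is reproduced.

    # Bit rate -> file name, per the example of Fig. 43.
    FILES = {2_000_000: "A", 1_500_000: "B", 1_000_000: "C"}

    def select_bitrate(measured_bps, margin=0.8):
        """Pick the highest bit rate that fits the measured network band,
        leaving a safety margin; fall back to the lowest bit rate."""
        usable = [b for b in sorted(FILES) if b <= measured_bps * margin]
        return usable[-1] if usable else min(FILES)

    # Band estimates measured before each segment request (illustrative).
    for order, bps in enumerate([2_600_000, 1_900_000, 2_700_000], start=1):
        print(f"{FILES[select_bitrate(bps)]}{order}")  # A1, B2, A3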
  • the present location acquisition unit 1656 acquires the present location of the content reproduction apparatus 1620, and may be configured as a module, for example, a global positioning system (GPS) receiver, that acquires the present location. In addition, the present location acquisition unit 1656 may acquire the present location of the content reproduction apparatus 1620 using a wireless network.
  • Fig. 47 is an explanatory diagram illustrating a configuration example of the content server 1611. As illustrated in Fig. 47, the content server 1611 includes a storage unit 1671 and a communication unit 1672.
  • the storage unit 1671 stores information of an URL of an MPD.
  • the information of the URL of the MPD is transmitted to the content reproduction apparatus 1620 from the content server 1611 in response to a request for reproducing a content from the content reproduction apparatus 1620.
  • when providing the information of the URL of the MPD to the content reproduction apparatus 1620, the storage unit 1671 stores definition information for the case where the content reproduction apparatus 1620 adds a parameter to the URL described in the MPD.
  • the communication unit 1672 which is an interface with the content reproduction apparatus 1620, communicates with the content reproduction apparatus 1620 through the network 1612. That is, the communication unit 1672 receives a request for the information of the URL of the MPD from the content reproduction apparatus 1620 that requests the reproduction of a content, and transmits the information of the URL of the MPD to the content reproduction apparatus 1620.
  • the URL of the MPD which is transmitted from the communication unit 1672 includes information for adding a parameter in the content reproduction apparatus 1620.
  • the parameter added to the URL of the MPD in the content reproduction apparatus 1620 can be variously set by definition information shared by the content server 1611 and the content reproduction apparatus 1620.
  • information such as the present location of the content reproduction apparatus 1620, a user ID of a user who uses the content reproduction apparatus 1620, a memory size of the content reproduction apparatus 1620, or the capacity of a storage of the content reproduction apparatus 1620 can be added to the URL of the MPD in the content reproduction apparatus 1620.
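  • a minimal Python sketch of adding such parameters to the URL of the MPD follows; the URL and the parameter names are hypothetical, since the actual definition information is shared between the content server 1611 and the content reproduction apparatus 1620.

    from urllib.parse import urlencode, urlsplit, urlunsplit, parse_qsl

    def add_parameters(mpd_url, **params):
        """Append client-side parameters (present location, user ID, ...)
        to the URL of the MPD, keeping any query already present."""
        parts = urlsplit(mpd_url)
        query = dict(parse_qsl(parts.query))
        query.update(params)
        return urlunsplit(parts._replace(query=urlencode(query)))

    url = add_parameters("http://example.com/content.mpd",
                         location="35.68,139.69", user_id="u123", memory="512MB")
    print(url)  # http://example.com/content.mpd?location=35.68%2C139.69&...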
  • the encoder 1641 of the content server 1610 has a function of the image encoding apparatus 100 (Fig. 10) according to the above-described embodiment.
  • the reproducing unit 1653 of the content reproduction apparatus 1620 has a function of the image decoding apparatus 200 (Fig. 19) according to the above-described embodiment.
  • data that is encoded by the present disclosure is transmitted and received, and thus it is possible to enhance encoding efficiency of image compression information to be output.
  • wireless packets are transmitted and received from when a specific application to be used is designated until P2P connection is established and the specific application is operated. Thereafter, after the connection in the second layer, wireless packets for the case of starting up the specific application are transmitted and received.
  • Fig. 48 and Fig. 49 illustrate an example of the above-described wireless packet transmission and reception until peer to peer (P2P) connection is established and a specific application is operated, and are sequence charts illustrating an example of communication processing using devices serving as a basis for wireless communication. Specifically, an example of an establishment procedure of direct connection in the Wi-Fi Direct standard (also referred to as Wi-Fi P2P), which is standardized in the Wi-Fi Alliance, is illustrated.
  • in Wi-Fi Direct, a plurality of wireless communication devices detect mutual presence (device discovery, service discovery). When connection devices are selected, device authentication is performed between the selected devices by Wi-Fi protected setup (WPS), and thus direct connection is established. In addition, in Wi-Fi Direct, it is determined which role each of the wireless communication devices takes on, a parent device (group owner) or a child device (client), thereby forming a communication group.
  • in this example, however, a portion of packet transmission and reception is omitted.
  • for example, at the time of first connection, packet exchange for using WPS is necessary, and packet exchange is also necessary in exchanging an authentication request and response.
  • in Fig. 48 and Fig. 49, the illustration of these packet exchanges is omitted, and only the second and subsequent connections are illustrated.
  • Fig. 48 and Fig. 49 illustrate an example of communication processing between a first wireless communication device 1701 and a second wireless communication device 1702, but the same is true of communication processing between other wireless communication devices.
  • the device discovery is performed between the first wireless communication device 1701 and the second wireless communication device 1702 (1711).
  • the first wireless communication device 1701 transmits a probe request (response request signal), and receives a probe response (response signal) for the probe request from the second wireless communication device 1702.
  • the first wireless communication device 1701 and the second wireless communication device 1702 can discover the mutual presence.
  • the service discovery is performed between the first wireless communication device 1701 and the second wireless communication device 1702 (1712).
  • the first wireless communication device 1701 transmits a service discovery query for inquiring about a service corresponding to the second wireless communication device 1702 discovered by the device discovery.
  • the first wireless communication device 1701 receives a service discovery response from the second wireless communication device 1702 to acquire a service corresponding to the second wireless communication device 1702. That is, it is possible to acquire a service that is executable by the other party, using the service discovery.
  • the service that is executable by the other party is, for example, a service and a protocol (a digital living network alliance (DLNA), a digital media renderer (DMR), or the like).
  • a selection operation of a connection party (connection party selection operation) is performed by a user (1713).
  • the connection party selection operation may occur only in any one of the first wireless communication device 1701 and the second wireless communication device 1702.
  • a connection party selection screen is displayed on a display unit of the first wireless communication device 1701, and the second wireless communication device 1702 is selected as a connection party in the connection party selection screen by a user's operation.
  • Fig. 48 and Fig. 49 illustrate an example in which the first wireless communication device 1701 becomes a group owner 1715 and the second wireless communication device 1702 becomes a client 1716 as a result of group owner negotiation.
  • pieces of processing (1717 to 1720) are performed between the first wireless communication device 1701 and the second wireless communication device 1702, and thus direct connection is established. That is, association (second layer (L2) link establishment) (1717) and secure link establishment (1718) are sequentially performed.
  • thereafter, IP address assignment (1719) and L4 setup (1720) on L3 using a simple service discovery protocol (SSDP) or the like are sequentially performed.
  • the layer 2 (L2) means a second layer (data link layer)
  • the layer 3 (L3) means a third layer (network layer)
  • a layer 4 (L4) means a fourth layer (transport layer).
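  • summarizing the sequence, the following Python sketch enumerates the steps in order; it is a simplified state list under the assumption that every step succeeds, not an implementation of the Wi-Fi Direct protocol.

    from enum import Enum, auto

    class P2PStep(Enum):
        DEVICE_DISCOVERY = auto()    # probe request / probe response (1711)
        SERVICE_DISCOVERY = auto()   # service discovery query / response (1712)
        GO_NEGOTIATION = auto()      # decide group owner vs. client
        WPS_PROVISIONING = auto()    # device authentication by WPS
        ASSOCIATION = auto()         # L2 link establishment (1717)
        SECURE_LINK = auto()         # secure link establishment (1718)
        IP_ASSIGNMENT = auto()       # L3 (1719)
        L4_SETUP = auto()            # e.g. by SSDP (1720)

    def establish_connection():
        """Walk the steps in definition order; failures are not modeled."""
        for step in P2PStep:
            print(f"performing {step.name}")

    establish_connection()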
  • the designation of a specific application or a startup operation is performed by a user (1721).
  • the application designation and startup operation may occur only in any one of the first wireless communication device 1701 and the second wireless communication device 1702.
  • an application designation and startup operation screen is displayed on the display unit of the first wireless communication device 1701, and a specific application is selected in the application designation and startup operation screen by a user's operation.
  • connection between an access point (AP) and a station (STA) is performed within a range of a specification before the Wi-Fi Direct standard (specification standardized in IEEE802.11).
  • here, connection in the second layer means, in IEEE802.11 terminology, connection before association.
  • in the device discovery or the service discovery (option), it is possible to acquire information of a connection party at the time of searching for a connection candidate party.
  • the information of the connection party is, for example, a type of a basic device or a corresponding specific application. It is possible to cause a user to select a connection party on the basis of the acquired information of the connection party.
  • Fig. 51 illustrates an example of a sequence reaching connection in this case.
  • Fig. 50 is a diagram schematically illustrating a configuration example of a frame format that is transmitted and received in the communication processing using devices serving as a basis for the present disclosure. That is, Fig. 50 illustrates a configuration example of a MAC frame for establishing connection in the second layer. Specifically, it is an example of a frame format of an association request and response (1787) for realizing the sequence illustrated in Fig. 51.
  • a section from frame control (1751) to sequence control (1756) is a MAC header.
  • the MAC frame illustrated in Fig. 50 is basically an association request and response frame format described in section 7.2.3.4 and section 7.2.3.5 of the IEEE802.11-2007 specification.
  • the MAC frame is different in that it includes not only an information element (hereinafter, simply referred to as an IE) which is defined in the IEEE802.11 specification but also an IE that is independently expanded.
  • 127 is set, in decimal notation, as the IE Type (Information Element ID (1761)).
  • in accordance with section 7.3.2.26 of the IEEE802.11-2007 specification, a Length field (1762) and an OUI field (1763) follow, and the vendor specific content (1764) is disposed after them.
  • a field indicating the type of the vendor specific IE (IE type (1765)) is first provided as the contents of the vendor specific content (1764). After this, a configuration capable of storing a plurality of subelements (1766) is provided.
  • the contents of the subelements (1766) include a name of a specific application to be used (1767) and a role of a device (1768) at the time of operating the specific application.
  • in addition, the contents may include information such as a port number that is used for the control of the specific application (L4 setup information) (1769), and information on capability within the specific application (capability information).
  • the capability information is, for example, when the specific application to be designated is DLNA, information for specifying the correspondence to sending out and playback of audio, and the correspondence to sending out and playback of video.
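  • a minimal Python sketch of packing such a vendor specific IE follows; the exact byte layout, the OUI value, and the subelement IDs are assumptions for illustration, not the normative encoding.

    def build_vendor_specific_ie(oui, ie_type, subelements):
        """Pack an IE as: Information Element ID 127, Length, 3-byte OUI,
        a vendor specific IE type, then subelements encoded here as
        (subelement ID, length, body) triples."""
        body = bytes([ie_type])
        for sub_id, payload in subelements:
            body += bytes([sub_id, len(payload)]) + payload
        content = bytes(oui) + body
        return bytes([127, len(content)]) + content  # 127 = Information Element ID

    ie = build_vendor_specific_ie(
        oui=b"\x00\x11\x22",                           # hypothetical OUI
        ie_type=1,                                     # hypothetical vendor IE type
        subelements=[(0, b"DLNA"),                     # name of the application to use
                     (1, b"source"),                   # role of the device
                     (2, (8080).to_bytes(2, "big"))])  # port number (L4 setup info)
    print(ie.hex())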
  • the present disclosure described above with reference to Fig. 1 to Fig. 27 is applied, and thus it is possible to obtain effects similar to the effects described above with reference to Fig. 1 to Fig. 27. That is, it is possible to suppress an increase in storage capacity necessary for encoding and decoding of an image in the wireless communication system.
  • data that is encoded by the present disclosure is transmitted and received, and thus it is possible to enhance encoding efficiency of image compression information to be output.
  • the information may be recorded in a recording medium (or another recording area of the same recording medium) which is separate from the image (or the bit stream).
  • the information and the image (or the bit stream) may be associated with each other in arbitrary units of, for example, a plurality of frames, one frame, or a portion within a frame.
  • the present disclosure can also adopt the following configuration.
  • An image decoding apparatus comprising: circuitry configured to decode encoded data in which a weight coefficient of a residual signal had been set to 0.5 when an input signal of image data constituted by a plurality of layers was 8 bits and a residual signal was 9 bits, wherein the residual signal is a prediction error of inter-frame prediction in another layer different from a current layer of the image data.
  • the circuitry is configured to decode the current layer of encoded data of the image data using the residual signal having the weight coefficient set to 0.5.
  • the circuitry is configured to receive the encoded data and weight coefficient usage information regarding whether zero or non-zero is used as the weight coefficient.
  • An image encoding apparatus comprising: circuitry configured to set a weight coefficient of a residual signal to be 0.5 when an input signal of image data constituted by a plurality of layers is 8 bits and a residual signal is 9 bits, wherein the residual signal is a prediction error of inter-frame prediction in another layer different from a current layer of the image data.
  • the circuitry is configured to transmit the encoded data and weight coefficient usage information regarding whether zero or non-zero is used as the weight coefficient.
  • the circuitry is configured to transmit the weight coefficient usage information for each coding unit (CU).
  • the circuitry is configured to transmit the weight coefficient usage information for each largest coding unit (LCU) and/or each slice header.
  • An image processing system comprising: an image encoding apparatus including circuitry configured to produce encoded image data from image data, the circuitry is configured to set a weight coefficient of a residual signal to be 0.5 when an input signal of the image data constituted by a plurality of layers is 8 bits and a residual signal is 9 bits, wherein the residual signal is a prediction error of inter-frame prediction in another layer different from a current layer of the image data; and a decoder configured to decode the encoded image data.
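  • to make the arithmetic of the claims concrete: multiplying a 9-bit residual by the weight coefficient 0.5 corresponds to an arithmetic right shift by one bit, which brings the residual back into the 8-bit range of the input signal. The following NumPy sketch illustrates the idea only; it is not the normative shifting of the encoder or decoder.

    import numpy as np

    def weight_residual(residual, input_bits=8, residual_bits=9):
        """Apply a weight coefficient of 0.5 (arithmetic right shift by 1)
        to the inter-frame prediction residual of another layer, so that
        a 9-bit residual fits the 8-bit range of the input signal."""
        if input_bits == 8 and residual_bits == 9:
            return residual >> 1   # multiply by the weight coefficient 0.5
        return residual            # otherwise leave the residual as-is

    # A 9-bit signed residual lies in [-256, 255]; after weighting, 8 bits.
    residual = np.array([-256, -1, 0, 1, 255], dtype=np.int16)
    print(weight_residual(residual))  # [-128   -1    0    0  127]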
  • Reference signs list: 100 Image encoding apparatus, 101 Base layer image encoding unit, 102 Enhancement layer image encoding unit, 103 Multiplexer, 148 Residual prediction unit, 149 Up-sampling unit, 161 Range determination unit, 162 Weight coefficient determination unit, 163 Shifting unit, 164 Residual buffer, 200 Image decoding apparatus, 201 Demultiplexer, 202 Base layer image decoding unit, 203 Enhancement layer image decoding unit, 244 Residual prediction unit, 245 Up-sampling unit, 261 Range determination unit, 262 Weight coefficient determination unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to an image encoding apparatus that includes circuitry configured to set a weight coefficient of a residual signal to be 0.5 when an input signal of image data constituted by a plurality of layers is 8 bits and a residual signal is 9 bits. The residual signal is a prediction error of inter-frame prediction in another layer different from a current layer of the image data.
PCT/JP2014/003143 2013-06-21 2014-06-12 Appareil de décodage d'image, appareil de codage d'image et système de traitement d'image WO2014203505A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2013130276A JP2015005893A (ja) 2013-06-21 2013-06-21 画像処理装置および方法
JP2013-130276 2013-06-21

Publications (1)

Publication Number Publication Date
WO2014203505A1 true WO2014203505A1 (fr) 2014-12-24

Family

ID=51063756

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/003143 WO2014203505A1 (fr) 2013-06-21 2014-06-12 Appareil de décodage d'image, appareil de codage d'image et système de traitement d'image

Country Status (2)

Country Link
JP (1) JP2015005893A (fr)
WO (1) WO2014203505A1 (fr)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BROSS B ET AL: "High Efficiency Video Coding (HEVC) text specification draft 9 (SoDIS)", 11. JCT-VC MEETING; 102. MPEG MEETING; 10-10-2012 - 19-10-2012; SHANGHAI; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/, no. JCTVC-K1003, 21 October 2012 (2012-10-21), XP030113269 *
JIANLE CHEN; KRISHNA RAPAKA; XIANG LI; VADIM SEREGIN; LIWEI GUO; MARTA KARCZEWICZ; GEERT VAN DER AUWERA; JOEL SOLE; XIANGLIN WANG: "Description of scalable video coding technology proposal by Qualcomm", JCTVC-K0036, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 11th Meeting, 10 October 2012 (2012-10-10)
LI X ET AL: "TE3: Results of Test 4.6.2.1 on Generalized Residual Prediction", 12. JCT-VC MEETING; 103. MPEG MEETING; 14-1-2013 - 23-1-2013; GENEVA; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/, no. JCTVC-L0078, 8 January 2013 (2013-01-08), XP030113566 *
SATO(SONY) K: "Non-SCE3: Quantized GRP", 104. MPEG MEETING; 22-4-2013 - 26-4-2013; INCHEON; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. m28591, 11 April 2013 (2013-04-11), XP030057125 *

Also Published As

Publication number Publication date
JP2015005893A (ja) 2015-01-08

Similar Documents

Publication Publication Date Title
JP6610735B2 (ja) 画像処理装置及び画像処理方法
US11245925B2 (en) Image processing apparatus and image processing method
US11503321B2 (en) Image processing device for suppressing deterioration in encoding efficiency
US10075719B2 (en) Image coding apparatus and method
US20230308646A1 (en) Image encoding device and method and image decoding device and method
WO2014203763A1 (fr) Dispositif de décodage, procédé de décodage, dispositif de codage et procédé de codage
WO2015005132A1 (fr) Dispositif et procédé de codage d'image, et dispositif et procédé de décodage d'image
WO2014203505A1 (fr) Appareil de décodage d'image, appareil de codage d'image et système de traitement d'image

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14735689

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14735689

Country of ref document: EP

Kind code of ref document: A1