MX2013014733A - Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device. - Google Patents

Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device.

Info

Publication number
MX2013014733A
MX2013014733A
Authority
MX
Mexico
Prior art keywords
motion vector
block
list
magnitude
vector
Prior art date
Application number
MX2013014733A
Other languages
Spanish (es)
Inventor
Toshiyasu Sugio
Takahiro Nishi
Youji Shibahara
Hisao Sasai
Kyoko Tanikawa
Toru Matsunobu
Kengo Terada
Original Assignee
Panasonic Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed (see https://patents.darts-ip.com/?family=48612172). "Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.
Application filed by Panasonic Corp filed Critical Panasonic Corp
Publication of MX2013014733A publication Critical patent/MX2013014733A/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/15 Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/567 Motion estimation based on rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video image coding device (100) is provided with an intra/inter prediction unit (107). When a motion vector of a corresponding block is selectively added to a list, for each of one or more corresponding blocks that are either spatially adjacent to the current block to be coded within the picture to be coded, or temporally adjacent co-located blocks contained in a picture different from the picture to be coded, a scaling process is performed on a first motion vector of the temporally adjacent corresponding block. A second motion vector is thereby calculated, and it is determined whether or not this second motion vector falls within a range of a predetermined magnitude. If the second motion vector is within the predetermined range, it is added to the list.

Description

VIDEO IMAGE CODING METHOD, VIDEO IMAGE CODING DEVICE, VIDEO IMAGE DECODING METHOD, VIDEO IMAGE DECODING DEVICE, AND VIDEO IMAGE CODING/DECODING DEVICE

FIELD OF THE INVENTION

The present invention relates to a moving picture coding method for encoding images on a block-by-block basis, and to a moving picture decoding method for decoding images on a block-by-block basis.
BACKGROUND OF THE INVENTION

In H.264 inter-prediction decoding, the image data of a current block is decoded by predicting a bi-predictive reference block included in a B slice, using as references two items of image data from pictures different from the picture that includes the current block.
In the H.264 standard, modes for deriving motion vectors are available for image prediction. These modes are known as direct modes (see 8.4.1.2.1, 3.45, etc., of NPL 1).
The following two modes, (S) and (T), are available as direct modes in H.264.
REF.: 245540
(T): Temporal direct mode (temporal mode). A current block is predicted using a motion vector mvCol of a co-located block (Col_Blk), a block that is spatially identical to the current block (but temporally different), scaled by a certain ratio.
(S): Spatial direct mode. A current block is predicted using motion vector data (motion data) of a block that is spatially different (but that is to be displayed at the same time as the current block).
Citation List
Non Patent Literature
NPL 1: ITU-T H.264, 03/2010
NPL 2: WD4: Working Draft 4 of High-Efficiency Video Coding, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 6th Meeting: Torino, IT, 14-22 July 2011, Document: JCTVC-F803_d2

BRIEF DESCRIPTION OF THE INVENTION

Technical Problem

However, prediction in the temporal direct mode involves multiplication for scaling. This multiplication may increase the processing load in coding or decoding, because the motion vectors used in coding or decoding may have to be handled at a higher bit precision.
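To make the bit-precision concern concrete, the following sketch (illustrative names; the 16-bit storage limit is an assumption for illustration, not taken from NPL 1) shows how scaling by a picture-distance ratio can push a motion vector outside the range in which unscaled vectors are representable:

```python
# Illustrative only: temporal-direct scaling can exceed the bit depth
# used to store motion vectors. The 16-bit limit is an assumption.

INT16_MIN, INT16_MAX = -32768, 32767

def scale_mv(mv_col, tb, td):
    """Scale a co-located motion vector by the picture-distance ratio tb/td."""
    return round(mv_col * tb / td)

# A large co-located vector scaled by a distance ratio > 1 overflows int16.
mv = scale_mv(30000, 4, 2)   # 60000, no longer representable in 16 bits
print(mv > INT16_MAX)        # True
```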
In view of this, a non-limiting and exemplary embodiment provides a moving picture coding method and a moving picture decoding method, each of which can be carried out with reduced processing load and with the same coding efficiency.
Solution to the Problem

A moving picture coding method according to one aspect of the present invention is a method of encoding images on a block-by-block basis, and includes: selectively adding, to a list, a motion vector of each of one or more corresponding blocks, each of which is (i) a block included in a current picture to be coded and spatially adjacent to a current block to be coded or (ii) a block included in a picture other than the current picture and temporally adjacent to the current block; selecting, from among the motion vectors in the list, a motion vector to be used to encode the current block; and encoding the current block using the motion vector selected in the selecting. In the adding, a scaling process is performed on a first motion vector of the temporally adjacent corresponding block to calculate a second motion vector, it is determined whether the calculated second motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within that range, and the second motion vector is added to the list as the motion vector of the corresponding block when it is determined that the second motion vector has a magnitude within the predetermined range.
Further, a moving picture decoding method according to one aspect of the present invention is a method of decoding images on a block-by-block basis, and includes: selectively adding, to a list, a motion vector of each of one or more corresponding blocks, each of which is (i) a block included in a current picture to be decoded and spatially adjacent to a current block to be decoded or (ii) a block included in a picture other than the current picture and temporally adjacent to the current block; selecting, from among the motion vectors in the list, a motion vector to be used to decode the current block; and decoding the current block using the motion vector selected in the selecting. In the adding, a scaling process is performed on a first motion vector of the temporally adjacent corresponding block to calculate a second motion vector, it is determined whether the calculated second motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within that range, and the second motion vector is added to the list as the motion vector of the corresponding block when it is determined that the second motion vector has a magnitude within the predetermined range.
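A minimal sketch of the selective-addition step described above, assuming a symmetric 16-bit magnitude range (the actual predetermined range is defined by the embodiments, not by this sketch; all names are illustrative):

```python
# Sketch of the claimed "selective addition": a temporally adjacent
# candidate's first motion vector is scaled, and the resulting second
# motion vector is appended to the candidate list only if both of its
# components fall inside a predetermined magnitude range.

MV_MIN, MV_MAX = -32768, 32767  # assumed "predetermined range of magnitude"

def add_candidate(cand_list, first_mv, tb, td):
    """Scale first_mv by the distance ratio tb/td; append only if in range."""
    second_mv = tuple(round(c * tb / td) for c in first_mv)
    if all(MV_MIN <= c <= MV_MAX for c in second_mv):
        cand_list.append(second_mv)
    return cand_list

mvs = []
add_candidate(mvs, (100, -40), 2, 4)    # scaled to (50, -20): added
add_candidate(mvs, (30000, 0), 4, 1)    # scaled to (120000, 0): rejected
print(mvs)  # [(50, -20)]
```

Rejecting (rather than clipping) an out-of-range scaled vector is one way to keep all vectors in the list representable at the original bit precision.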
These general and specific aspects can be implemented using a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or any combination of systems, methods, integrated circuits, computer programs, or computer-readable recording media.
Advantageous Effects of the Invention

The moving picture coding methods and moving picture decoding methods according to the present invention each make it possible to encode or decode moving pictures with reduced processing load while not causing a reduction in coding efficiency.
BRIEF DESCRIPTION OF THE FIGURES

Figure 1 illustrates two pictures referenced for the decoding of a current block (Curr_Blk).
Figure 2A illustrates a reference picture list (RefPicList0).
Figure 2B illustrates a reference picture list (RefPicList1).
Figure 3 illustrates picNum in the reference picture lists RefPicList0 and RefPicList1 of the picture that includes CurrBlk.
Figure 4 illustrates information to be used in the temporal mode (T).
Figure 5A illustrates the scaling process in the temporal mode, showing a simplified diagram of a co-located block and a motion vector mvL0Col.
Figure 5B illustrates the scaling process in the temporal mode using a conceptual diagram of the scaling process.
Figure 6 illustrates the relationship between steps 1 to 3 and the equations for deriving motion vectors described in NPL 1.
Figure 7 illustrates the spatial direct mode (S).
Figure 8 is a block diagram illustrating a configuration of a moving picture coding apparatus according to Embodiment 1.
Figure 9 is a flow chart illustrating the operation of the moving picture coding apparatus according to Embodiment 1.
Figure 10 illustrates merge candidate blocks [1 ... 6] established by an intra/inter prediction unit.
Figure 11 illustrates the concept of the merge candidate list (mergeCandList).
Figure 12 illustrates an example of a case where the intra/inter prediction unit determines which motion data is a duplicate.
Figure 13 is a flow chart illustrating a process for obtaining motion data from a merge candidate block [i].
Figure 14 is a flow chart illustrating an example of the scaling process carried out by the intra/inter prediction unit.
Figure 15 is a flow chart illustrating another example of the scaling process carried out by the intra/inter prediction unit.
Figure 16 is a block diagram illustrating a configuration of a moving picture decoding apparatus according to Embodiment 1.
Figure 17 is a flow chart illustrating the operation of the moving picture decoding apparatus according to Embodiment 1.
Figures 18A and 18B illustrate the updating of a merge candidate list (mergeCandList), with Figure 18A showing an initially generated merge candidate list (mergeCandList) and Figure 18B showing the merge candidate list after it has been updated.
Figure 19A illustrates a motion vector predictor mvpLX in HEVC.
Figure 19B illustrates a candidate list mvpListLX (mvpListL0 and mvpListL1) for the motion vector predictor mvpLX.
Figure 20 illustrates candidate predictor blocks or a candidate predictor block.
Figure 21 shows a general configuration of a content provision system for implementing content distribution services.
Figure 22 illustrates a general configuration of a digital broadcasting system.
Figure 23 shows a block diagram illustrating an example of a television configuration.
Figure 24 shows a block diagram illustrating an example of a configuration of an information reproducing/recording unit that reads and writes information from and onto a recording medium that is an optical disc.
Figure 25 shows an example of a configuration of a recording medium that is an optical disc.
Figure 26A shows an example of a cell phone.
Figure 26B is a block diagram showing an example of a cell phone configuration.
Figure 27 illustrates a multiplexed data structure.
Figure 28 schematically shows how each stream is multiplexed into multiplexed data.
Figure 29 shows in more detail how a video stream is stored in a stream of PES packets.
Figure 30 shows a structure of TS packets and source packets in the multiplexed data.
Figure 31 shows a data structure of a PMT.
Figure 32 shows an internal structure of multiplexed data information.
Figure 33 shows an internal structure of flow attribute information.
Figure 34 shows steps to identify video data.
Figure 35 shows an example of a configuration of an integrated circuit for implementing the moving picture coding method and the moving picture decoding method according to each of the embodiments.
Figure 36 shows a configuration for switching between driving frequencies.
Figure 37 shows steps for identifying video data and switching between driving frequencies.
Figure 38 shows an example of a look-up table in which video data standards are associated with driving frequencies.
Figure 39A is a diagram showing an example of a configuration for sharing a module of a signal processing unit.
Figure 39B is a diagram showing another example of a configuration for sharing a module of the signal processing unit.
DETAILED DESCRIPTION OF THE INVENTION

Underlying Knowledge Forming the Basis of the Present Invention

Figure 1 illustrates two pictures referenced for the decoding of a current block (Curr_Blk). In Figure 1, the numbers "300" to "304" are picture numbers (PicNum), and the pictures are arranged in ascending order of display order values (PicOrderCnt). The current block to be decoded is included in the picture numbered 302 (CurrPic). In this example, the current block to be decoded refers to a picture having a PicNum of 301 and a picture having a PicNum of 304. The picture having a PicNum of 301 precedes the picture that includes the current block in display order, and the picture having a PicNum of 304 follows the picture that includes the current block in display order. In the following figures, the starting point of an arrow indicates a referring picture (a picture to be decoded) and the head of the arrow indicates a picture to be used for decoding (a picture to be referenced), as described in the legend of Figure 1.
Current blocks to be decoded are indicated by a filled black block in the following figures, and are simply called Curr_Blk in the figures and the following description. Again, the starting point of an arrow indicates a referring picture (a picture to be decoded) and the head of the arrow indicates a picture to be used for decoding (a picture to be referenced), as described in the legend of Figure 1. The picture having a picNum of 302 is the picture that includes the current block to be decoded (the current picture to be decoded).
Figure 2A and Figure 2B illustrate two reference picture lists, RefPicList0 and RefPicList1, respectively.
Figure 2A illustrates reference picture list 0 (RefPicList0), which is a list for identifying one of two reference pictures. Figure 2B shows reference picture list 1 (RefPicList1), which is a list for identifying the other of the two reference pictures. By using the reference picture lists, it is possible to specify a reference picture referenced by a current picture to be decoded using an index having a small value such as "0" or "1" (refIdxL0 and refIdxL1), instead of a picNum having a large value such as "302". The pictures referenced by current blocks to be decoded (Curr_Blk), which are blocks in a slice, are indicated using the values in these lists.
These lists are initialized (generated) when a B slice that includes a current block is decoded.
The entries in the reference picture lists RefPicList0 and RefPicList1 are reordered in such a way that indexes having smaller values in RefPicList0 and RefPicList1 indicate pictures having different picture numbers PicNum. Each of the reference picture lists is divided into a first half including pictures that precede picNum 302 and a second half including pictures that follow picNum 302. In the first half of reference picture list 0, the picture indexes are assigned picture numbers in descending order (301, 300, ...). In the first half of reference picture list 1, the picture indexes are assigned picture numbers in ascending order (303, 304, ...).
For example, when a code sequence has an index having the minimum value "0" for each of reference picture list 0 and reference picture list 1, the following two reference pictures are determined for picture 302. One of the reference pictures is the picture indicated by RefPicList0[0], which is picture 301 immediately before picture 302. The other reference picture is the picture indicated by RefPicList1[0], which is picture 303 immediately after picture 302.
In the example illustrated in Figure 1, the index refIdxL0 is 0, and therefore the current picture 302 refers to picture 301. The other index refIdxL1 is 1, and therefore the current picture 302 refers to picture 304. Figure 3 illustrates picNum in the case where the values of refIdxL0 and refIdxL1 in each of the reference picture lists RefPicList0 and RefPicList1 of the CurrBlk included in picture 302 are increased from "0". Larger values in a list (the refIdxL0 value and the refIdxL1 value) indicate pictures more distant from the current picture to be decoded (picNum 302).
In particular, RefPicList1, which indicates the other reference, contains indexes under a rule that indexes having smaller values in the list are assigned to pictures that follow the picture CurrPic 302 (that is, pictures having PicOrderCnt greater than PicOrderCnt(CurrPic) that are already decoded and stored in memory) in descending order (this rule is called Rule 1). Under Rule 1, the picture indicated by RefPicList1[0] is the picture with picNum 303, indicated by a dotted circle in Figure 3.
As noted above, one of the reference picture lists is simply called RefPicList0, and the indexes in that list are simply called refIdxL0, in the description and the figures unless otherwise indicated. Similarly, the other reference picture list is simply called RefPicList1 and its indexes are simply called refIdxL1 (see the legends in Figure 3, and NPL 1, subclause 8.2.4.2.3 in 8.2.4, on construction of reference picture lists, for more details).
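The list ordering described above can be sketched as follows (illustrative code, not the normative initialization of NPL 1; picture numbers mirror Figure 3, and the function name is an assumption):

```python
# Sketch of the reference picture list ordering for the B-slice example
# around picture 302: list 0 puts preceding pictures first, closest first
# (descending picNum); list 1 puts following pictures first, closest first
# (ascending picNum).

def init_ref_lists(curr, decoded):
    before = sorted((p for p in decoded if p < curr), reverse=True)
    after = sorted(p for p in decoded if p > curr)
    ref_list0 = before + after   # 301, 300, then 303, 304
    ref_list1 = after + before   # 303, 304, then 301, 300
    return ref_list0, ref_list1

l0, l1 = init_ref_lists(302, [300, 301, 303, 304])
print(l0)  # [301, 300, 303, 304]
print(l1)  # [303, 304, 301, 300]
```

With this ordering, RefPicList0[0] is picture 301 and RefPicList1[0] is picture 303, matching the example given for Figure 1 and Figure 3.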
Next, the temporal mode (T) and the spatial direct mode (S) in H.264 will be described.
Figure 4 illustrates information to be used in the temporal mode (T).
The block with diagonal stripes in Figure 4 represents a co-located block (Col_Blk), which is spatially identical to the current block (but temporally different from the current block). The temporal location of the co-located block is specified by the index having the value "0" in RefPicList1, the other reference image list 1 in Figure 3; that is, the co-located block is located in the image 303. In the list RefPicList1 initialized under Rule 1, the image indicated by the index having the value "0" (that is, RefPicList1[0]) is the temporally closest image, among the images in the reference memory, that follows the current image in display order (with exceptional cases in which, for example, the reference memory does not store an image that temporally follows the current image).
Then, in the temporal mode, motion vectors mvL0 and mvL1 of a current block to be decoded Curr_Blk, represented as a filled black block, are derived using the "motion data" of Col_Blk, represented as a block with diagonal stripes. The "motion data" includes the following: (i) a reference image refIdxL0[refidx] referenced by Col_Blk (in this example, Col_Blk references the image having a picNum of 301, which is indicated by the value of RefPicList0[1]); and (ii) a motion vector mvL0Col on that reference image.
In Figure 4, the dotted arrow on the image having a picNum of 301 indicates the motion vector mvL0Col that is used for the decoding of Col_Blk.
Hereinafter, the dotted lines in the present description and the figures represent motion vectors. The motion vector mvL0Col indicates a predictive image used for decoding Col_Blk.
Figure 5A and Figure 5B illustrate a scaling process in the temporal mode.
The scaling process is a process for deriving the motion vectors mvL0 and mvL1 of a current block to be decoded Curr_Blk by scaling the value of the motion vector mvL0Col using the relationship between the distances from the current block and from the co-located block to their respective reference images.
Figure 5A illustrates the reference structure, the co-located block, and the motion vector mvL0Col of Figures 1 to 4 using a simplified diagram.
Figure 5B illustrates the concept of the scaling process.
The scaling process is based on the idea of the similarity between a triangle DEF and a triangle ABC, as illustrated in Figure 5B.
The triangle DEF is a triangle for Col_Blk.
Point D is in Col_Blk. Point E is in the image referenced by Col_Blk. Point F is the point at which the motion vector mvL0Col starting at point E has its tip.
The triangle ABC is a triangle for Curr_Blk.
Point A is in the current block to be decoded Curr_Blk. Point B is in the image referenced by the block Curr_Blk. Point C is the point at which the vector to be derived has its tip.
First, in STAGE 1, ScaleFactor is derived, which is the ratio of (1) the relative distance (tb) from Curr_Blk to the image referenced by Curr_Blk to (2) the relative distance (tx) from Col_Blk to the image referenced by Col_Blk. For example, with reference to Figure 5B, ScaleFactor is the ratio of tb = 302 - 301 = 1 to tx = 303 - 301 = 2 (tb/tx); that is, the scaling ratio is 0.5 (1/2) (or the homothetic ratio is 1:2). Therefore, the homothetic ratio of triangle ABC to triangle DEF is 1/2.
ScaleFactor = tb / tx = (302 - 301) / (303 - 301) = 1/2 ... (STAGE 1).
In STAGE 2, the vector EF, which has a magnitude equal to the length of the side EF, is multiplied by the scaling ratio to obtain the vector BC. The vector BC is one of the two vectors to be derived, the vector mvL0. mvL0 = ScaleFactor x mvL0Col ... (STAGE 2).
In STAGE 3, the other vector to be derived, the vector mvL1, is derived using the mvL0 derived in STAGE 2 and an inverted mvL0Col. mvL1 = mvL0 - mvL0Col ... (STAGE 3). Figure 6 illustrates the relationship between STAGES 1 to 3 and the equations for deriving motion vectors described in NPL 1, 8.4.1.2.3, derivation process for temporal direct luma motion vector and reference index prediction mode.
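STAGES 1 to 3 can be sketched as follows. This is a simplified illustration using plain floating-point arithmetic with the example numbers from Figure 5B; the normative fixed-point rounding of DistScaleFactor (NPL 1, 8.4.1.2.3) is omitted, and the function name is hypothetical.

```python
# Sketch of the temporal-mode scaling in STAGES 1 to 3.

def temporal_direct_mv(mv_l0_col, poc_curr, poc_ref, poc_col):
    # STAGE 1: ScaleFactor = tb / tx.
    tb = poc_curr - poc_ref          # distance from Curr_Blk to its reference
    tx = poc_col - poc_ref           # distance from Col_Blk to the reference
    scale_factor = tb / tx
    # STAGE 2: mvL0 = ScaleFactor x mvL0Col.
    mv_l0 = (mv_l0_col[0] * scale_factor, mv_l0_col[1] * scale_factor)
    # STAGE 3: mvL1 = mvL0 - mvL0Col.
    mv_l1 = (mv_l0[0] - mv_l0_col[0], mv_l0[1] - mv_l0_col[1])
    return mv_l0, mv_l1

# picNum 302 (current), 301 (reference), 303 (co-located picture):
mv_l0, mv_l1 = temporal_direct_mv((8, -4), 302, 301, 303)
# ScaleFactor = 1/2, so mvL0 = (4, -2) and mvL1 = (-4, 2).
```

Note that STAGE 2 is the multiplication whose result may exceed a given bit precision, which motivates the magnitude check described later.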
Figure 7 illustrates the other of the two direct modes, the spatial direct mode (S).
A current block to be decoded (Curr_Blk) is included in a motion compensation unit block. In this mode, data on a motion vector (that is, motion data including a combination of values (a motion vector mvLXN and a reference index refIdxLXN) as described above; the same applies hereinafter) are obtained for a block N that is adjacent to the motion compensation unit block (the block N is, for example, an adjacent block A, an adjacent block B, or an adjacent block C).
Among the data on motion vectors (hereinafter also referred to as motion data), the motion data element (refIdxL0 and refIdxL1 and the mvL0 and mvL1 corresponding to them, respectively) of a block having the smallest reference index value (refIdxLXN) is used as it is (see equations 8-186 and 8-187 in NPL 1). The reference indices have values of natural numbers including "0" (MinPositive values). Specifically, refIdxL0 and refIdxL1 are derived using the following equations, respectively: refIdxL0 = MinPositive(refIdxL0A, MinPositive(refIdxL0B, refIdxL0C)) and refIdxL1 = MinPositive(refIdxL1A, MinPositive(refIdxL1B, refIdxL1C)). In the spatial direct mode, the elements of "motion data" that include data on the motion vector mvL0 or mvL1, such as the distance from the current image to a reference image (refIdxL0, refIdxL1), are used as a set. Therefore, unlike the temporal mode, the derivation of a motion vector generally involves no scaling of mvL0 or mvL1 but only refers to a reference image used for the adjacent block.
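The MinPositive selection referred to above can be sketched as follows; the behaviour shown (smallest non-negative value wins, -1 marking an unavailable index) follows the description of equations 8-186 and 8-187 in NPL 1, and the function names are hypothetical.

```python
# Sketch of the spatial direct reference index selection: the smallest
# non-negative reference index among adjacent blocks A, B and C is chosen.

def min_positive(x, y):
    # Smaller of the two when both are valid (>= 0); otherwise the valid
    # one, or -1 when neither is valid.
    if x >= 0 and y >= 0:
        return min(x, y)
    return max(x, y)

def spatial_direct_ref_idx(ref_idx_a, ref_idx_b, ref_idx_c):
    return min_positive(ref_idx_a, min_positive(ref_idx_b, ref_idx_c))

# Block A unavailable (-1), block B refers to index 2, block C to index 0:
assert spatial_direct_ref_idx(-1, 2, 0) == 0
```

Because the selected motion vector and reference index are taken over as a set, no scaling multiplication is needed in this mode, in contrast to the temporal mode.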
As described above, the derivation of a motion vector mvL0 using ScaleFactor (DistScaleFactor) in the temporal mode (T) involves the multiplication of mvL0Col by ScaleFactor. Consequently, when a motion vector to be handled in decoding is limited to a magnitude such that the motion vector can be represented at a certain bit precision, it is necessary to control the generation of motion vectors in such a way that the motion vector obtained as a result of the multiplication carried out in temporal-mode coding has such a magnitude. This control increases the processing load in coding.
In addition, according to the conventional H.264 standard, the change between the temporal mode (T) and the spatial direct mode (S) is allowed only up to once per slice.
For the HEVC standard, the use of a merge mode is described in which motion vectors are derived using a more flexible method than when the spatial direct mode or the temporal mode is used for each slice as in H.264. Here, it is desirable to adequately balance a reduction in processing load against maintenance of coding efficiency in the derivation of such motion vectors having a limited magnitude, by using these modes in combination with the merge mode for the new HEVC standard.
A moving image coding method according to one aspect of the present invention is a moving image coding method for coding images on a block-by-block basis, and includes: selectively adding, to a list, a motion vector of each of one or more corresponding blocks, each of which is (i) a block included in a current image to be coded and spatially adjacent to a current block to be coded or (ii) a block included in an image other than the current image and temporally adjacent to the current block; selecting a motion vector from among the motion vectors in the list, the selected motion vector being the one to be used for coding the current block; and coding the current block using the motion vector selected in the selecting, wherein in the adding, a scaling process is performed on a first motion vector of the temporally adjacent corresponding block to calculate a second motion vector, it is determined whether the calculated second motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within the predetermined magnitude range, and the second motion vector is added to the list as the motion vector of the corresponding block when it is determined that the second motion vector has a magnitude that is within the predetermined magnitude range.
In this way, it is possible to limit movement vectors managed in coding and decoding to a certain magnitude in such a way that the motion vectors can be represented at a certain bit precision.
Further, in the adding, when it is determined that the second motion vector has a magnitude that is not within the predetermined magnitude range, the second motion vector is clipped to have a magnitude within the predetermined magnitude range, and a motion vector resulting from the clipping of the second motion vector is added to the list as the motion vector of the corresponding block.
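The clipping variant just described can be sketched as clamping each vector component to the representable range. The 16-bit signed range and all names below are assumptions for illustration; the actual range is the predetermined magnitude range described in the text.

```python
# Sketch of the clipping variant: each component of the scaled motion vector
# is clamped so that it fits the predetermined magnitude range, and the
# clipped vector is added to the list instead of being discarded.

MV_MIN, MV_MAX = -32768, 32767  # assumed range for a 16-bit precision

def clip_mv(mv):
    return tuple(max(MV_MIN, min(MV_MAX, c)) for c in mv)

def add_clipped_candidate(merge_cand_list, scaled_mv):
    merge_cand_list.append(clip_mv(scaled_mv))

cands = []
add_clipped_candidate(cands, (40000, -50000))  # out of range in both components
# The clipped vector (32767, -32768) is listed in place of the original.
```

An in-range vector passes through unchanged, so clipping only affects candidates that would otherwise exceed the bit precision.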
Moreover, in the addition, when it is determined that the second motion vector has a magnitude that is not within the predetermined range of magnitude, the second motion vector is not added to the list.
In addition, the list is a merge candidate list that lists the motion vector of the corresponding block and specification information for specifying an image referenced by the corresponding block; in the adding, the specification information is added to the merge candidate list in addition to the motion vector of the corresponding block; in the selecting, a motion vector and specification information to be used for coding the current block are selected from among the motion vectors in the merge candidate list; and in the coding, the current block is coded by generating a predictive image of the current block using the motion vector and the specification information selected in the selecting.
Moreover, the list is a motion vector predictor candidate list; in the adding, it is further determined whether a fourth motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within the predetermined magnitude range, and the fourth motion vector is added to the motion vector predictor candidate list as a motion vector predictor candidate when it is determined that the fourth motion vector has a magnitude that is within the predetermined magnitude range, the fourth motion vector being calculated by performing a scaling process on a third motion vector of the spatially adjacent corresponding block; in the selecting, a motion vector predictor to be used for coding the current block is selected from the motion vector predictor candidate list; and in the coding, the coding of the current block is carried out, which includes coding a motion vector of the current block using the motion vector predictor selected in the selecting.
Further, in the adding, when it is determined that the fourth motion vector has a magnitude that is not within the predetermined magnitude range, the fourth motion vector is clipped to have a magnitude within the predetermined magnitude range, and a motion vector that results from the clipping of the fourth motion vector is added to the motion vector predictor candidate list as the motion vector predictor candidate.
Moreover, the predetermined magnitude range is determined based on a bit precision of a motion vector, and the bit precision has either a value specified by one of a profile and a level or a value included in a header.
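Deriving the predetermined magnitude range from a signaled bit precision can be sketched as below. Two's-complement representation is assumed here for illustration; how the precision is carried (profile/level or header) is as described above.

```python
# Sketch: the predetermined magnitude range for a motion vector component
# represented at N bits, assuming a two's-complement representation.

def mv_range(bit_precision):
    lo = -(1 << (bit_precision - 1))
    hi = (1 << (bit_precision - 1)) - 1
    return lo, hi

# A 16-bit precision yields the range [-32768, 32767].
```

A decoder reading the precision from a profile, a level, or a header would compute the same range, so encoder and decoder agree on which scaled vectors are representable.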
Further, a moving image decoding method according to one aspect of the present invention is a moving image decoding method for decoding images on a block-by-block basis, and includes: selectively adding, to a list, a motion vector of each of one or more corresponding blocks, each of which is (i) a block included in a current image to be decoded and spatially adjacent to a current block to be decoded or (ii) a block included in an image other than the current image and temporally adjacent to the current block; selecting a motion vector from among the motion vectors in the list, the selected motion vector being the one to be used for decoding the current block; and decoding the current block using the motion vector selected in the selecting, wherein in the adding, a scaling process is performed on a first motion vector of the temporally adjacent corresponding block to calculate a second motion vector, it is determined whether the calculated second motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within the predetermined magnitude range, and the second motion vector is added to the list as the motion vector of the corresponding block when it is determined that the second motion vector has a magnitude that is within the predetermined magnitude range.
In this way, it is possible to limit movement vectors managed in coding and decoding to a certain magnitude in such a way that the motion vectors can be represented at a certain bit precision.
Further, in the adding, when it is determined that the second motion vector has a magnitude that is not within the predetermined magnitude range, the second motion vector is clipped to have a magnitude within the predetermined magnitude range, and a motion vector resulting from the clipping of the second motion vector is added to the list.
Moreover, in the addition, when it is determined that the second motion vector has a magnitude that is not within the predetermined range of magnitude, the second motion vector is not added to the list.
In addition, the list is a merge candidate list that lists the motion vector of the corresponding block and specification information for specifying an image referenced by the corresponding block; in the adding, the specification information is added to the merge candidate list in addition to the motion vector of the corresponding block; in the selecting, a motion vector and specification information to be used for decoding the current block are selected from among the motion vectors in the merge candidate list; and in the decoding, the current block is decoded by generating a predictive image of the current block using the motion vector and the specification information selected in the selecting.
Moreover, the list is a motion vector predictor candidate list; in the adding, it is further determined whether a fourth motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within the predetermined magnitude range, and the fourth motion vector is added to the motion vector predictor candidate list as a motion vector predictor candidate when it is determined that the fourth motion vector has a magnitude that is within the predetermined magnitude range, the fourth motion vector being calculated by performing a scaling process on a third motion vector of the spatially adjacent corresponding block; in the selecting, a motion vector predictor to be used for decoding the current block is selected from the motion vector predictor candidate list; and in the decoding, the decoding of the current block is carried out, which includes decoding a motion vector of the current block using the motion vector predictor selected in the selecting.
Further, in the adding, when it is determined that the fourth motion vector has a magnitude that is not within the predetermined magnitude range, the fourth motion vector is clipped to have a magnitude within the predetermined magnitude range, and a motion vector that results from the clipping of the fourth motion vector is added to the motion vector predictor candidate list as the motion vector predictor candidate.
Moreover, the predetermined magnitude range is determined based on a bit precision of a motion vector, and the bit precision has either a value specified by one of a profile and a level or a value included in a header.
These general and specific aspects can be implemented using a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or any combination of systems, methods, integrated circuits, computer programs, or computer-readable recording media.
Hereinafter, embodiments of the present invention are specifically described with reference to the figures.
Each of the embodiments described below shows a general or specific example. The numerical values, shapes, materials, structural elements, the arrangement and connection of the structural elements, steps, the processing order of the steps, and so on shown in the following embodiments are mere examples, and therefore do not limit the scope of the claims. Therefore, among the structural elements in the following exemplary embodiments, the structural elements not described in any of the independent claims are described as arbitrary structural elements.
Embodiment 1
Figure 8 is a block diagram illustrating a configuration of a moving image coding apparatus according to Embodiment 1.
As illustrated in Figure 8, a moving image coding apparatus 100 includes, as its main part, a subtracting unit 101, a transformation unit 102, a quantization unit 103, an entropy coding unit 110, an inverse quantization unit 104, an inverse transformation unit 105, an adding unit 106, a memory unit 109, an intra-inter prediction unit 107, and a coding control unit 108.
The subtracting unit 101 sends a differential signal which is a difference between an input video signal and a predictive video signal.
The transformation unit 102 transforms the differential signal from the image domain into the frequency domain. The quantization unit 103 quantizes the differential signal in the frequency domain resulting from the transformation and sends the quantized differential signal.
The entropy coding unit 110 entropy-codes the quantized differential signal and a decoding control signal and sends a coded bitstream. The inverse quantization unit 104 inverse-quantizes the quantized differential signal. The inverse transformation unit 105 inverse-transforms the inverse-quantized differential signal from the frequency domain to the image domain and sends a restored differential signal.
The summing unit 106 adds the restored differential signal and a predictive video signal to generate a decoded video signal.
The intra-inter prediction unit 107 stores the decoded video signal on a predetermined unit basis, such as on a per-frame basis or a per-block basis, in the memory 109 and, following instructions from the coding control unit 108, generates and sends a predictive video signal (pixel values derived based on the decoded video signal and motion vectors) to be provided to the subtracting unit 101 and the summing unit 106.
Moreover, the intra-inter prediction unit 107 derives a merge candidate list (mergeCandList), which is a list of candidate motion vectors to be used in coding and decoding carried out in the merge mode. To derive the merge candidate list, the intra-inter prediction unit 107 selectively adds, to the merge candidate list, a motion vector of each corresponding block. Each of the corresponding blocks is (i) a block included in a current image to be coded and spatially adjacent to a current block to be coded or (ii) a block included in an image other than the current image and temporally adjacent to the current block. In addition, the intra-inter prediction unit 107 performs a scaling process on a first motion vector of the temporally adjacent corresponding block to calculate a second motion vector, and determines whether the second motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within the predetermined magnitude range. When it is determined that the second motion vector has a magnitude that is within the predetermined magnitude range, the intra-inter prediction unit 107 adds, to the merge candidate list, the second motion vector as the motion vector of the corresponding block. The intra-inter prediction unit 107 then selects a motion vector to be used for coding the current block from the merge candidate list. In other words, the scaling process according to Embodiment 1 is carried out mainly by the intra-inter prediction unit 107. It should be noted that the intra-inter prediction unit 107 of the moving image coding apparatus 100 according to Embodiment 1 corresponds to an addition unit and a selection unit, and the subtracting unit 101, the transformation unit 102, the quantization unit 103, and the entropy coding unit 110 of the moving image coding apparatus 100 according to Embodiment 1 correspond to a coding unit.
The coding control unit 108 determines control parameters for controlling the processing units in Figure 8 and for controlling the coding of an image based on a result of a test, and provides the parameters particularly to the intra-inter prediction unit 107 (the control parameters correspond to a decoding control signal). The test is carried out using, for example, a function for reducing the bit length of a coded bitstream represented by a dotted line in Figure 8. The control parameters for coding video data (for example, parameters indicating either inter prediction or intra prediction) are thus determined and sent. The sent signal includes indexes of motion vectors, which will be described later.
When the result of the test is affirmative, the coding control unit 108 determines a merge index (merge_idx), which is a value indicating that the scaling process according to Embodiment 1 has been applied to the image, and includes the merge index in a decoding control signal to be sent. In this case, the quantized differential signal has values derived from a predictive video signal generated using the scaling process according to Embodiment 1.
Figure 9 is a flow chart illustrating the operation of the moving image coding apparatus according to Embodiment 1.
Next, the coding operation in the merge mode is described for the case where the coding control unit 108 has determined (1) to inter-code a current block (MODE_INTER) and (2) to use the merge mode (MergeMODE) (or to obtain a result of using the merge mode).
The merge mode in HEVC is conceptually equivalent to the direct mode newly provided in the H.264 standard. As with the direct mode in H.264, a motion vector is derived not by using a sequence of codes but by using a motion vector of a spatially (S) or temporally (T) different block.
The merge mode and the direct mode in H.264 are different in the following points. (a) Processing unit: switching between using and not using the merge mode is possible by changing merge_flag, which can be changed per prediction unit (PU), a unit smaller than a slice. (b) Options: the selection is not a choice between the two alternatives of the spatial direct mode (S) and the temporal mode (T). There are more options, and the selection is indicated by merge_idx. Specifically, a merge candidate list (mergeCandList) is derived, which is a list of candidate motion vectors to be used in coding and decoding in the merge mode. A motion vector to be used is indicated by the value of an index (merge_idx) in the list, selected from a sequence of codes.
When the process for the merge mode is started, the coding control unit 108 sets the values of merge_idx and i to "0" (step S101). The parameter i is conveniently used as a candidate number for distinguishing candidates.
The intra-inter prediction unit 107 establishes candidate blocks [1 ... N], each of which is of either of the following two types (step S102). Suppose that N = 6. (s) Candidate blocks [1 ... (N-1)] are one or more candidate blocks for the spatial direct mode. These candidate blocks [1 ... 5] are distinguished based on the location of each candidate block. (t) The candidate block [N] is a candidate block for the temporal mode. The co-located block appended to the candidate blocks for the spatial direct mode has an entry value of "6", which is used as the index of the co-located block. This will be described later using Figure 10.
In steps S103 and later, the coding control unit 108 performs a loop process, incrementing the value of the parameter i that indicates each candidate (step S103), to determine a way of deriving a motion vector to be sent. The determined motion vector is one suitable in terms of an objective function, to provide high precision.
The intra-inter prediction unit 107 determines whether or not the candidate block [i] is available in memory (step S104). For example, a block located below the current block and yet to be coded (or decoded) is not stored in memory, and is therefore determined to be not available.
When it is determined that a block is not available (step S104, No), the intra-inter prediction unit 107 moves to the next candidate i without changing the value of merge_idx (returns to step S103).
When a block is determined to be available (step S104, Yes), the intra-inter prediction unit 107 proceeds to the next stage.
Then, the intra-inter prediction unit 107 determines whether the motion data (a set of mvL0, mvL1, refIdxL0, and refIdxL1; the same applies hereinafter) of the candidate block [i] is a duplicate of the motion data (mvL0, refIdxL0, mvL1, and refIdxL1) already tested with the previous candidate blocks [1 ... (i-1)] (step S105). This determination will be described later using Figure 12.
When determining that a block is a duplicate (step S105, Yes), the intra-inter prediction unit 107 moves to the next candidate i without changing the value of merge_idx (returns to step S103).
When it is determined that a block is not a duplicate, that is, when the motion data is a new set of motion data elements (step S105, No), the intra-inter prediction unit 107 proceeds to the next stage. A merge candidate list of motion vectors (mergeCandList) is generated as a result of the determinations regarding availability (step S104) and duplication (step S105). This will be described later using Figure 11.
Then, the intra-inter prediction unit 107 obtains or derives the motion data (mvL0, refIdxL0, mvL1, and refIdxL1) of the candidate block [i] (step S106). Here, when the candidate block [i] is a co-located block intended to be used in the temporal mode, the scaling process is carried out. The scaling process will be described later using Figure 14.
Although the scaling process is carried out when the candidate block [i] turns out to be a co-located block intended to be used in the temporal mode in step S106, the operation of the moving image coding apparatus is not limited thereto. For example, in another possible operation, motion data (mvL0, refIdxL0, mvL1, and refIdxL1) already subjected to the scaling process (this will be described later using Figure 14) are obtained when a co-located block is added to the list of candidate blocks in step S102, and the co-located block is not added to the list in step S105 when the motion data of the co-located block is a duplicate of the motion data of any of the previous candidate blocks (Figure 17). In this way, more duplicate motion data of candidate blocks are omitted, so that the processing load can be reduced and the coding efficiency can be improved.
Afterwards, inter coding is carried out as a test by the coding apparatus as a whole, using the motion data determined under the control of the coding control unit 108 (step S107). The coding control unit 108 obtains, for example, a bitstream [i] as an output from the entropy coding unit 110.
The coding control unit 108 determines whether or not the current candidate [i] produces a better result than the results obtained using the previous candidates [1 ... (i-1)] (that is, whether or not the current candidate [i] produces a minimum value of a predetermined objective function, from points of view such as bitstream length (compression efficiency) or processing delay) (step S108).
When it is determined that the current candidate [i] produces a better result than the results produced using the previous candidates [1 ... (i-1)] (step S108, Yes), the current value of merge_idx is stored as the value of merge_idx that will actually be used for coding and decoding (step S109). Briefly, the effective value of merge_idx that produces the better result is stored in a parameter dummy_merge_idx.
The intra-inter prediction unit 107 has thus obtained the result that the current candidate i is an effective entry. Next, the intra-inter prediction unit 107 increments the value of merge_idx to move to the next entry (step S110).
Then, the coding control unit 108 determines whether or not the test has been carried out on all the candidate blocks (step S111).
When it is determined that the process has been carried out on all the blocks (the test has been carried out on the co-located block for the temporal mode (t) established as the last candidate block [N] in step S102) (step S111, Yes), the coding control unit 108 proceeds to the next step.
When it is determined that the process has not been carried out on all the candidate blocks (step S111, No), the candidate number i is incremented and the test is carried out on the next candidate.
Finally, dummy_merge_idx, which produces a maximum value (or a minimum value) of the predetermined objective function, is determined as the merge index (merge_idx) that will actually be included in a sequence of codes (step S112).
This is the coding operation using the merge mode.
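The selection loop of steps S103 to S112 can be sketched as follows. The candidate list and the cost function here are hypothetical stand-ins: in the apparatus described above, each candidate is evaluated by an actual trial encoding (step S107) against a predetermined objective function.

```python
# Sketch of the merge_idx search in Figure 9: every available, non-duplicate
# candidate is test-encoded, and the index producing the best (smallest)
# objective-function value is kept (dummy_merge_idx, then merge_idx).

def choose_merge_idx(merge_cand_list, cost_of):
    best_idx, best_cost = None, float("inf")
    for merge_idx, motion_data in enumerate(merge_cand_list):
        cost = cost_of(motion_data)                  # step S107: trial coding
        if cost < best_cost:                          # step S108
            best_idx, best_cost = merge_idx, cost     # step S109
    return best_idx                                   # step S112

# Hypothetical costs for three candidates; the middle one wins (merge_idx 1):
assert choose_merge_idx(["a", "b", "c"], {"a": 5, "b": 2, "c": 9}.get) == 1
```

Only the winning merge_idx is written to the sequence of codes; the decoder repeats the list construction and uses the index to pick the same motion data.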
Figure 10 illustrates the merge candidate blocks [1 ... 6] established in step S102 by the intra-inter prediction unit 107.
The candidate blocks include (s) one or more spatially adjacent blocks ((s) spatially adjacent blocks [1 ... (N-1)] in Figure 10) and (t) a temporally adjacent block ((t) the co-located block [N] in Figure 10).
In a merge candidate list, the spatially adjacent blocks are listed as candidate entries having smaller values of merge_idx, in other words, as candidate entries near the top of the list. The spatially adjacent blocks are located in a vertical direction (S1) or a horizontal direction (S2) from the current PU and are adjacent to the current PU there, as illustrated in Figure 10.
It should be noted that adjacency is determined on a PU basis, a PU being the unit of motion data to which the same motion vector is applied. In other words, what is determined is whether a PU is adjacent to the current PU that includes the current block Curr_Blk. The blocks B0 to B2 in Figure 10 are examples of a vertically adjacent block. A PU that includes any of these blocks is an adjacent PU, and the motion data (mvL0, refIdxL0, mvL1, and refIdxL1) of the adjacent PU are used. In Figure 10, the blocks A0 and A1 are examples of a horizontally adjacent block.
The candidate entry that has the largest value of merge_idx and is located at the bottom of a merge candidate list, in other words, the candidate entry finally added to the merge candidate list, is a temporally adjacent block. In Figure 10, the block co-located in the image indicated by an index value of 0 in the reference image list L1 (or L0 when there is no reference image list L1 available) of the current block is the temporally adjacent block.
Figure 11 illustrates the concept of the merge candidate list (mergeCandList) generated in the process in steps S103 and later. The "i" (1 ... 6) on the left of Figure 11 corresponds to the candidate number i in step S103 and the other steps.
The entries corresponding to i = [1 ... 5] are (s) one or more spatially adjacent blocks (A0 ... B2 in Figure 10). The entry corresponding to i = 6 is (t) a temporally adjacent block ((t) the co-located block [N] in Figure 10).
The effective entry number for the candidates 1 ... 6 is merge_idx. Referring to Figure 11, the candidates corresponding to i = 3 and 5 are duplicate motion vectors. More specifically, this indicates that the intra-inter prediction unit 107 has determined in step S105 that the motion data (a set of mvL0, mvL1, refIdxL0, and refIdxL1; the same applies hereinafter) of the candidate block [i] are a duplicate of the motion data (mvL0, refIdxL0, mvL1, and refIdxL1) already tested with the previous candidate blocks [1 ... (i-1)].
Figure 12 illustrates an example of the duplication determination in step S105, in which it is determined that the motion data corresponding to an entry of a candidate block is a duplicate of the motion data corresponding to a previous entry.
When the motion data of an adjacent block located at B1, directly above a current PU, are determined for a PU that also includes B0 and BN, the motion data of blocks B0 and BN, corresponding to candidate numbers 3 and 5, respectively, are a duplicate of the motion data of the adjacent block B1 directly above the current PU. Consequently, the entries of blocks B0 and BN are removed from the list. The mergeCandList list is thus compressed to a list in which the largest merge_idx value is "2", as illustrated in Figure 11.
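The list construction with duplicate removal described above can be sketched as follows. This is an illustrative Python sketch, not the apparatus's implementation: the MotionData fields follow the text (mvL0, refIdxL0, mvL1, refIdxL1), and the candidate ordering (spatial entries first, the temporal entry last) follows Figure 11.

```python
# Illustrative sketch of the mergeCandList construction with the
# duplicate check of step S105. Names are ours, not the apparatus's.
from typing import NamedTuple, Tuple

class MotionData(NamedTuple):
    mvL0: Tuple[int, int]
    refIdxL0: int
    mvL1: Tuple[int, int]
    refIdxL1: int

def build_merge_cand_list(candidates):
    """Append each available candidate unless its motion data duplicate
    an earlier entry; merge_idx is the position in the compacted list."""
    merge_cand_list = []
    for cand in candidates:       # None marks an unavailable block
        if cand is not None and cand not in merge_cand_list:
            merge_cand_list.append(cand)
    return merge_cand_list
```

If the blocks at candidate numbers 3 and 5 carry the same motion data as B1, only one entry survives and the largest merge_idx shrinks accordingly, as in the compression to "2" described above.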
Fig. 13 is a flowchart illustrating a process for obtaining the motion data (mvL0, refIdxL0, mvL1 and refIdxL1) of a merge candidate block [i], which is carried out in step S106.
When the process is started, the coding control unit 108 determines whether an adjacent block [i] is a spatially adjacent block or a temporally adjacent block (step S201).
When the coding control unit 108 determines that the adjacent block [i] is a spatially adjacent block (the value of [i] is one of 1 to 5 in the table in Figure 11), the motion data of the PU including the candidate block [i] are determined directly as the motion data of the current block (step S202).
When the coding control unit 108 determines that the adjacent block [i] is a temporally adjacent block (the value of [i] is 6 in the table in Figure 11), mvL0Col of the co-located block (Col_Blk), which is the candidate block [6], is scaled using a temporal direct scaling process including multiplication (step S203).
The scaling process will be described below using figure 14.
Fig. 14 is a flow chart illustrating the scaling process in step S203.
First, the intra-inter prediction unit 107 calculates DistScaleFactor using a current picture CurrPicOrField, a reference picture pic0 referenced by the current block, a picture including the co-located block, and the display order value of a reference picture pic0 referenced by the co-located block, as illustrated by the equation for step 1 in Figure 6 (step S301). Then, the intra-inter prediction unit 107 calculates a motion vector mvL0 by multiplying a motion vector mvCol of the co-located block by DistScaleFactor, as illustrated by the equation for step 2 in Figure 6 (step S302). Then, the intra-inter prediction unit 107 determines whether or not the magnitudes of the horizontal component and the vertical component of the calculated motion vector mvL0 can be represented at a certain bit precision (step S303). When the result of the determination is true (step S303, Yes), the intra-inter prediction unit 107 adds a merge block candidate having the calculated motion vector mvL0 to a merge candidate list mergeCandList (step S304). When the result is false (step S303, No), the intra-inter prediction unit 107 determines that the merge block candidate calculated from the co-located block is not available and does not add it to the merge candidate list mergeCandList (step S305).
In this way, when a motion vector that results from the scaling process has a value too large to be represented at a certain bit precision, a merge block candidate having that motion vector is not added to the merge candidate list. This makes it possible to limit the motion vectors to be handled in coding and decoding to a magnitude that can be represented at the certain bit precision. For example, suppose that the certain bit precision is 16 bits. In this case, a merge block candidate that has a motion vector mvL0 obtained as a result of the scaling process is not added to the merge candidate list when either the horizontal component or the vertical component of the motion vector mvL0 has a value not within the range of -32768 to +32767. In this way, it is possible to limit the motion vectors to be handled in coding and decoding to a magnitude such that the motion vectors can be represented at a bit precision of 16 bits.
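Steps S301 to S305 can be sketched as follows. This is a simplified Python sketch assuming HEVC-style fixed-point arithmetic for the equations of Figure 6 (which are not reproduced in this text): tb is the assumed POC distance from the current picture to its reference picture, td the POC distance from the co-located picture to its reference picture (assumed non-zero), and all helper names are ours.

```python
# Simplified sketch of steps S301-S305 under the assumptions stated
# above; not the exact equations of figure 6.

def clip3(lo, hi, v):
    return max(lo, min(hi, v))

def dist_scale_factor(tb, td):
    # Step S301: derive DistScaleFactor from the two POC distances.
    tx = (16384 + abs(td) // 2) // td
    return clip3(-4096, 4095, (tb * tx + 32) >> 6)

def scale_and_check(mv_col, tb, td, bits=16):
    # Step S302: multiply mvCol by DistScaleFactor (with rounding).
    f = dist_scale_factor(tb, td)
    mv = tuple((f * c + 127 + (f * c < 0)) >> 8 for c in mv_col)
    # Step S303: can both components be represented at `bits` precision?
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if all(lo <= c <= hi for c in mv):
        return mv    # step S304: a candidate with this mvL0 may be added
    return None      # step S305: candidate treated as not available
```

With tb equal to td the factor is 256 (unity in 1/256 units), so the vector is unchanged; a component scaled beyond the -32768..+32767 range makes the candidate unavailable.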
The present invention is not limited to the example described above for embodiment 1, in which both the horizontal component and the vertical component of a motion vector are limited to a magnitude that can be represented at a bit precision of 16 bits. For example, suppose that the horizontal component is limited to a magnitude that can be represented at a bit precision of 16 bits and the vertical component is limited to a magnitude that can be represented at a bit precision of 14 bits. In this case, a merge block candidate having a motion vector mvL0 obtained as a result of the scaling process is not added to the merge candidate list when it is determined that the horizontal component of mvL0 is not within the range of -32768 to +32767 or the vertical component of the motion vector is not within the range of -8192 to +8191. In this way, it is possible to limit the horizontal component of a motion vector to one magnitude and the vertical component of the motion vector to another magnitude.

The present invention is not limited to the example described above for embodiment 1, in which a motion vector mvL0 for a reference picture list L0 is calculated by the scaling process. The scaling process is also applicable to the calculation of a motion vector mvL1 for a reference picture list L1.
The present invention is not limited to embodiment 1 described above, in which a merge block candidate calculated from a co-located block is not added to the merge candidate list when the merge block candidate has a motion vector mvL0 that is calculated by multiplying a motion vector mvCol of the co-located block by DistScaleFactor in step S302 and either of whose horizontal and vertical components has a value too large to be represented at a certain bit precision. For example, when a co-located block is bi-predictive, a merge block candidate can be calculated by carrying out the process of steps S302 to S305 using the other motion vector of the co-located block as mvCol. In this way, an excessive reduction in the number of merge block candidates calculated from co-located blocks can be avoided, whereby coding efficiency can be increased.
The present invention is not limited to embodiment 1 described above, in which a merge block candidate calculated from a co-located block is not added to the merge candidate list in step S305 when either the horizontal component or the vertical component of a motion vector mvL0 has a value too large to be represented at a certain bit precision. For example, as illustrated in step S401 in FIG. 15, the horizontal component or the vertical component of the motion vector mvL0 can be clipped such that its value can be represented at the certain bit precision, and a merge block candidate that has the clipped motion vector can be added to the merge candidate list. For a specific example, suppose that the certain bit precision is 16 bits. In this case, when a motion vector obtained by the scaling process has a horizontal component with a value greater than +32767, a merge block candidate can be calculated using a motion vector that has a horizontal component of +32767 as a result of the clipping. When a motion vector obtained by the scaling process has a horizontal component with a value less than -32768, a merge block candidate can be calculated using a motion vector having a horizontal component of -32768 as a result of the clipping.
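The clipping alternative of step S401 can be sketched as follows; the function name and the tuple representation of a motion vector are illustrative, not part of the apparatus.

```python
# Hypothetical sketch of the clipping alternative of step S401: each
# component is clipped into the range representable at the given bit
# precision (16 bits by default, i.e. -32768..+32767) instead of the
# candidate being dropped.

def clip_mv(mv, bits=16):
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return tuple(max(lo, min(hi, c)) for c in mv)
```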
The present invention is not limited to the example described above for embodiment 1, in which the magnitude of motion vectors is limited based on a fixed bit precision. For example, a flag and a bit precision for limiting motion vectors may be additionally indicated in a header such as a sequence parameter set (SPS), a picture parameter set (PPS) or a slice header, and the limiting values for motion vectors can be changed for each sequence, picture, or slice according to the flag and bit precision. Optionally, the limiting values for motion vectors can be changed according to a profile or a level specifying a bit precision of a motion vector.
Next, a moving picture decoding apparatus will be described which restores a moving picture from a bitstream encoded by the moving picture coding apparatus according to embodiment 1.
Fig. 16 is a block diagram illustrating a configuration of a moving picture decoding apparatus according to embodiment 1.
A moving picture decoding apparatus 200 decodes an input encoded bitstream and outputs the decoded picture signals temporarily stored in a memory (a memory for decoded pictures) in display order with predetermined timing.
As illustrated in Figure 16, the moving picture decoding apparatus 200 includes, as its main part, an entropy decoding unit 201, an inverse quantization unit 202, an inverse transformation unit 203, an addition unit 204, a memory 207, an intra-inter prediction unit 205 and a decoding control unit 206. Each constituent element having the same name as one in the moving picture coding apparatus illustrated in FIG. 8 has corresponding functionality.
The entropy decoding unit 201 decodes an entropy encoded input bitstream and sends a quantized differential signal, a decoding control signal, and others.
The inverse quantization unit 202 inverse-quantizes the quantized differential signal obtained by the entropy decoding. The inverse transformation unit 203 inversely transforms a differential signal obtained by the inverse quantization from a frequency domain into an image domain and sends the restored differential signal.
Addition unit 204 adds the restored differential signal and a predictive video signal to generate a decoded video signal.
The intra-inter prediction unit 205 stores the decoded video signal on a predetermined unit basis, such as per frame or per block, in the memory 207 and, upon receiving instructions from the decoding control unit 206, generates and sends a predictive video signal (pixel values derived based on the decoded video signal and motion vectors) to be provided to the addition unit 204.
As with the moving picture coding apparatus 100, the scaling process according to embodiment 1 is carried out by the intra-inter prediction unit 205. It should be noted that the intra-inter prediction unit 205 of the moving picture decoding apparatus 200 according to embodiment 1 corresponds to an addition unit and a selection unit, and the entropy decoding unit 201, the inverse quantization unit 202, the inverse transformation unit 203, the addition unit 204, etc., collectively correspond to a decoding unit.
The decoding control unit 206 obtains, from the decoding control signal decoded by the entropy decoding unit 201, control parameters that will be used to control the processing units in Figure 16 and the decoding of pictures. The decoding control information in an encoded bitstream includes the merge index (merge_idx) determined in step S112 illustrated in FIG. 9.
Fig. 17 is a flowchart illustrating the operation of the moving picture decoding apparatus according to embodiment 1.
Next described is the operation carried out in the case where the decoding control unit 206 has determined, from information indicated by a decoding control signal, that a current block (Curr_Blk) (or a prediction unit PU block that includes the current block) is inter-coded (MODE_INTER) using the merge mode (MergeMODE).
First, the intra-inter prediction unit 205 locally generates the merge candidate list (mergeCandList) illustrated in FIG. 11. Generating a merge candidate list locally means that the intra-inter prediction unit 205 generates the merge candidate list using the same method as the moving picture coding apparatus 100, without reference to information obtained from a coded bitstream.
The parameter "i = 1 ... 6" has the same definition as "i" in figure 11.
The intra-inter prediction unit 205 carries out the process of steps S501 to S505 for the candidate block number i, which varies from 1 to 6. The intra-inter prediction unit 205 identifies the candidate block number i (step S501). When the candidate block number i is one of 1 to 5, the intra-inter prediction unit 205 obtains the motion data of a spatially adjacent block (step S502).
When the candidate block number i is 6, the intra-inter prediction unit 205 performs the scaling process using the motion data of a co-located block, using the same method as in step S203 in FIG. 13 (step S503).
Then, the intra-inter prediction unit 205 determines whether or not the motion data obtained in step S502 or step S503 is a duplicate of motion data in an earlier entry in mergeCandList (step S504).
When it is determined that the motion data is a duplicate (step S504, Yes), the intra-inter prediction unit 205 moves to the candidate block number i incremented to the next value.
When it is determined that the motion data is not a duplicate (step S504, No), the intra-inter prediction unit 205 appends the motion data obtained to the merge candidate list (mergeCandList) (step S505).
An initial merge candidate list (mergeCandList) is thus generated by the process of steps S501 to S505.
Then, when a predetermined condition is satisfied, the intra-inter prediction unit 205 updates the merge candidate list (mergeCandList) (step S506). Figures 18A and 18B illustrate an example of the updating process, which is carried out under a rule implicitly shared with a corresponding moving picture coding apparatus. Figure 18A illustrates the generated initial merge candidate list (mergeCandList). Figure 18B illustrates the merge candidate list after having been updated. In the example illustrated in Figure 18B, a candidate having a merge index (merge_idx) of "0" (mvL0_A, ref0) and a candidate having a merge index of "1" (mvL1_B, ref0) are combined to generate a candidate that has a merge index (merge_idx) of "2" (mvL0_A, ref0, mvL1_B, ref0).
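The list update of Figures 18A and 18B, in which the L0 motion data of one candidate and the L1 motion data of another are combined into a new bi-predictive candidate, can be sketched as follows; the dictionary layout is illustrative and not the apparatus's internal format.

```python
# Illustrative sketch of the step S506 update: combine the L0 motion
# data of the candidate at merge_idx i0 with the L1 motion data of the
# candidate at merge_idx i1 and append the result to the list.

def add_combined_candidate(merge_cand_list, i0, i1):
    c0, c1 = merge_cand_list[i0], merge_cand_list[i1]
    combined = {"mvL0": c0["mvL0"], "refIdxL0": c0["refIdxL0"],
                "mvL1": c1["mvL1"], "refIdxL1": c1["refIdxL1"]}
    if combined not in merge_cand_list:   # keep the list duplicate-free
        merge_cand_list.append(combined)
    return merge_cand_list
```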
Then, a selection is made in merge mode for the motion vectors mvL0 and mvL1 using the list.
The entropy decoding unit 201 entropy-decodes merge_idx, and the intra-inter prediction unit 205 receives the merge_idx value (step S507).
Then, the intra-inter prediction unit 205 selects, from the candidates in the merge candidate list, the motion data to be used in the merge mode indicated by the merge_idx value (step S508).
Finally, the intra-inter prediction unit 205 obtains the pixel data (pixelsL0 and pixelsL1) of the pixels at the positions indicated by the motion vectors mvL0 and mvL1 in the selected motion data (mvL0, refIdxL0, mvL1, refIdxL1), and derives a predictive video signal using the pixel data (step S509).
In this way, when a motion vector resulting from the scaling process has a value too large to be represented at a certain bit precision, a merge block candidate having that motion vector is not added to the merge candidate list. This makes it possible to limit the motion vectors that will be handled in coding and decoding to a magnitude that can be represented at the certain bit precision.
The present invention is not limited to embodiment 1 described above, in which, after the scaling process in step S302 in FIG. 14, it is determined whether or not the magnitude of the calculated motion vector can be represented at a certain bit precision. Alternatively, for example, it may be determined whether or not the magnitude of the motion vector mvL0 selected according to merge_idx in step S508 in FIG. 17 can be represented within a certain bit length. Furthermore, when it is determined that the magnitude cannot be represented at the certain bit precision, the motion vector can be clipped to have a magnitude that can be represented at the certain bit precision.
Moreover, the technique described in embodiment 1 is applicable not only to the case where the magnitude of a motion vector after the scaling process using the merge mode specified in HEVC described in NPL 2 is limited so that it can be represented at a certain bit precision. It is also applicable to the case where a motion vector predictor candidate is derived using the AMVP specified in HEVC described in NPL 2.
Figure 19A illustrates a motion vector predictor mvpLX in HEVC described in NPL 2. Figure 19B illustrates a candidate list mvpListLX (mvpListL0 and mvpListL1) for the motion vector predictor mvpLX.
The motion vector predictor mvpLX is used to derive a difference motion vector mvdLX, which is the difference from a motion vector mvLX derived by motion estimation, as illustrated in Figure 19A. Then, the difference motion vector mvdLX is encoded. The value of mvp_idx_lX in Figure 19B corresponds to the value of mvp_idx_lX which is encoded (or extracted by a corresponding decoding apparatus). The motion data of mvpListLX[mvp_idx_lX] identified by an index value (0, 1 or 2) is a motion vector predictor mvp (predictor). N in Figure 19A and Figure 19B indicates a spatial or temporal position of a block whose motion vector has a value that will be used as a predicted value of a motion vector.
Figure 20 illustrates the candidate predictor blocks or a candidate predictor block indicated by the value of N (A, B or Col) shown in Figure 19B. The filled black block in Figure 20 is a current block Curr_Blk that will be coded (or decoded). The block is included in a picture that has a picture number picNum 302. The hatched block in Figure 20 is located at the position indicated by approximately identical spatial coordinates (x, y) as the current block to be decoded Curr_Blk (or a prediction unit PU block that includes the current block) but in a picture that has a different picNum (temporally different); that is, it is a so-called co-located block (Col_Blk). In this example, suppose that Col_Blk is located not in the picture that has a picture number picNum 302 but in a picture that has a picture number picNum 303. In HEVC, the motion vectors mvL0A, mvL0B and mvL0Col (or mvL1A, mvL1B and mvL1Col) of the blocks N_Blk (A_Blk, B_Blk, Col_Blk) at positions A, B and Col, respectively, are multiplied by DistScaleFactor, and the resulting motion vector predictors mvpL0 and mvpL1 are used as predictor candidates.
In embodiment 1, it is determined whether or not the magnitude of each of the motion vector predictors calculated by the multiplication can be represented at a certain bit precision. When the result of the determination is false, the motion vector predictor is not added to the motion vector predictor candidate list. In this way, it is possible to limit a motion vector predictor, or a difference motion vector calculated from a motion vector and a motion vector predictor of a current block to be coded, to a magnitude that can be represented at a certain bit precision. When the motion vector predictor calculated by the multiplication has a magnitude that cannot be represented at the certain bit precision, a motion vector predictor obtained by clipping the motion vector predictor so as to have a magnitude that can be represented at the certain bit precision can be added instead to the motion vector predictor candidate list.
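Applying the same magnitude check to an AMVP predictor candidate can be sketched as follows; the choice between dropping a non-representable predictor and clipping it corresponds to the two variants just described. The function name and signature are illustrative.

```python
# Sketch of the magnitude check on an AMVP predictor candidate: a
# scaled predictor enters mvpListLX only when both components are
# representable at the chosen precision, or (in the clipping variant)
# after being clipped into range.

def add_mvp_candidate(mvp_list, mvp, bits=16, clip=False):
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if all(lo <= c <= hi for c in mvp):
        mvp_list.append(mvp)
    elif clip:
        mvp_list.append(tuple(max(lo, min(hi, c)) for c in mvp))
    # otherwise the predictor candidate is not added (unavailable)
    return mvp_list
```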
Embodiment 1 has been described by way of example, and the scope of the claims of the present application is not limited to embodiment 1. Those skilled in the art will readily appreciate that various modifications can be made to these exemplary embodiments and that other embodiments may be obtained by arbitrarily combining the constituent elements of the embodiments without materially departing from the novel teachings and advantages of the subject matter described in the appended claims. Accordingly, all such modifications and other embodiments are included in the present invention.
Each of the constituent elements in each of the embodiments described above can be configured in the form of an exclusive hardware product, or can be realized by executing a software program suitable for the constituent element. The constituent elements can be implemented by a program execution unit such as a CPU or a processor that reads and executes a software program recorded in a recording medium such as a hard disk or a semiconductor memory. Here, the software program for realizing the moving picture coding apparatus or the moving picture decoding apparatus according to embodiment 1 is a program described below.
Specifically, the program causes a computer to execute a method for encoding pictures on a block-by-block basis, the method including: selectively adding, to a list, a motion vector of each of one or more corresponding blocks, each of which is (i) a block included in a current picture to be coded and spatially adjacent to a current block to be coded or (ii) a block included in a picture other than the current picture and temporally adjacent to the current block; selecting a motion vector from among the motion vectors in the list, the selected motion vector being to be used for the coding of the current block; and coding the current block using the motion vector selected in the selecting, wherein, in the adding, a scaling process is carried out on a first motion vector of the corresponding temporally adjacent block to calculate a second motion vector, it is determined whether the calculated second motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within the predetermined range, and the second motion vector is added to the list as the motion vector of the corresponding block when it is determined that the second motion vector has a magnitude that is within the predetermined magnitude range.
In addition, the program causes a computer to execute a method for decoding pictures on a block-by-block basis, the method including: selectively adding, to a list, a motion vector of each of one or more corresponding blocks, each of which is (i) a block included in a current picture to be decoded and spatially adjacent to a current block to be decoded or (ii) a block included in a picture other than the current picture and temporally adjacent to the current block; selecting a motion vector from among the motion vectors in the list, the selected motion vector being to be used in the decoding of the current block; and decoding the current block using the motion vector selected in the selecting, wherein, in the adding, a scaling process is carried out on a first motion vector of the corresponding temporally adjacent block to calculate a second motion vector, it is determined whether the calculated second motion vector has a magnitude that is within a predetermined magnitude range or a magnitude that is not within the predetermined range, and the second motion vector is added to the list as the motion vector of the corresponding block when it is determined that the second motion vector has a magnitude that is within the predetermined magnitude range.
Embodiment 2

The processing described in each of the embodiments can be simply implemented in an independent computer system by recording, in a recording medium, a program for implementing the configurations of the moving picture coding method (picture coding method) and the moving picture decoding method (picture decoding method) described in each of the embodiments. The recording medium can be any recording medium as long as the program can be recorded on it, such as a magnetic disk, an optical disk, a magneto-optical disk, an IC card or a semiconductor memory.
Hereinafter, applications of the moving picture coding method (picture coding method) and the moving picture decoding method (picture decoding method) described in each of the embodiments, and systems using them, will be described. The system is characterized by having a picture coding and decoding apparatus that includes a picture coding apparatus using the picture coding method and a picture decoding apparatus using the picture decoding method. Other configurations in the system can be changed as appropriate depending on the case.
Figure 21 illustrates a general configuration of a content providing system ex100 for implementing content distribution services. The area for providing communication services is divided into cells of desired size, and base stations ex106, ex107, ex108, ex109 and ex110, which are fixed wireless stations, are placed in each of the cells.
The content providing system ex100 is connected to devices, such as a computer ex111, a personal digital assistant (PDA) ex112, a camera ex113, a cellular phone ex114 and a game machine ex115, via the Internet ex101, an Internet service provider ex102, a telephone network ex104, as well as the base stations ex106 to ex110, respectively.
However, the configuration of the content providing system ex100 is not limited to the configuration shown in Figure 21, and a combination in which any of the elements are connected is acceptable. In addition, each device can be connected directly to the telephone network ex104, rather than via the base stations ex106 to ex110, which are the fixed wireless stations. Moreover, the devices can be interconnected to each other via short-distance wireless communication and others.
The camera ex113, such as a digital video camera, is capable of capturing video. A camera ex116, such as a digital camera, is capable of capturing both still images and video. Moreover, the cellular phone ex114 can be one that complies with any of the standards such as Global System for Mobile Communications (GSM) (registered trademark), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (W-CDMA), Long Term Evolution (LTE) and High Speed Packet Access (HSPA). Alternatively, the cellular phone ex114 can be a Personal Handyphone System (PHS).
In the content providing system ex100, a streaming server ex103 is connected to the camera ex113 and others via the telephone network ex104 and the base station ex109, which makes it possible to distribute images of a live show and others. In such a distribution, a content (for example, video of a live music show) captured by the user using the camera ex113 is encoded as described above in each of the embodiments (i.e., the camera functions as the picture coding apparatus according to an aspect of the present invention), and the encoded content is transmitted to the streaming server ex103. On the other hand, the streaming server ex103 carries out stream distribution of the transmitted content data to the clients upon their requests. The clients include the computer ex111, the PDA ex112, the camera ex113, the cellular phone ex114 and the game machine ex115 that are capable of decoding the above-mentioned coded data. Each of the devices that has received the distributed data decodes and reproduces the coded data (i.e., functions as the picture decoding apparatus according to an aspect of the present invention).
The captured data can be encoded by the camera ex113 or the streaming server ex103 that transmits the data, or the encoding process can be shared between the camera ex113 and the streaming server ex103. Similarly, the distributed data can be decoded by the clients or the streaming server ex103, or the decoding processes can be shared between the clients and the streaming server ex103. Furthermore, the data of the still images and video captured not only by the camera ex113 but also by the camera ex116 can be transmitted to the streaming server ex103 through the computer ex111. The encoding processes can be carried out by the camera ex116, the computer ex111, or the streaming server ex103, or shared among them.
In addition, the encoding and decoding processes can be carried out by an LSI ex500 generally included in each of the computer ex111 and the devices. The LSI ex500 can be configured from a single chip or a plurality of chips. Software for encoding and decoding video can be integrated into some type of recording medium (such as a CD-ROM, a flexible disk and a hard disk) that is readable by the computer ex111 and others, and the encoding and decoding processes can be carried out using the software. In addition, when the cellular phone ex114 is equipped with a camera, the image data obtained by the camera can be transmitted. The video data is data encoded by the LSI ex500 included in the cellular phone ex114.
Furthermore, the streaming server ex103 can be composed of servers and computers, and can decentralize data and process the decentralized data, or record or distribute data.
As described above, the clients can receive and reproduce the coded data in the content providing system ex100. In other words, the clients can receive and decode information transmitted by the user, and reproduce the decoded data in real time in the content providing system ex100, such that a user who does not have any particular right or equipment can implement personal broadcasting.
Apart from the example of the content providing system ex100, at least one of the moving picture coding apparatus (picture coding apparatus) and the moving picture decoding apparatus (picture decoding apparatus) described in each of the embodiments may be implemented in a digital broadcasting system ex200 illustrated in Fig. 22. More specifically, a broadcast station ex201 communicates or transmits, via radio waves to a broadcast satellite ex202, multiplexed data obtained by multiplexing audio data and others onto video data. The video data are data encoded by the moving picture coding method described in each of the embodiments (i.e., data encoded by the picture coding apparatus according to one aspect of the present invention). Upon reception of the multiplexed data, the broadcast satellite ex202 transmits radio waves for broadcasting. Then, a home-use antenna ex204 with a satellite broadcast reception function receives the radio waves. Then, a device such as a television (receiver) ex300 and a set top box (STB) ex217 decodes the received multiplexed data and reproduces the decoded data (i.e., functions as the picture decoding apparatus according to an aspect of the present invention).
Moreover, a reader/writer ex218 (i) reads and decodes the multiplexed data recorded on a recording medium ex215, such as a DVD and a BD, or (ii) encodes video signals on the recording medium ex215, and in some cases, writes data obtained by multiplexing an audio signal onto the encoded data. The reader/writer ex218 may include the moving picture decoding apparatus or the moving picture coding apparatus as shown in each of the embodiments. In this case, the reproduced video signals are displayed on the monitor ex219, and can be reproduced by another device or system using the recording medium ex215 on which the multiplexed data is recorded. It is also possible to implement the moving picture decoding apparatus in the set top box ex217 connected to the cable ex203 for a cable television or to the antenna ex204 for satellite and/or terrestrial broadcasting, so as to display the video signals on the monitor ex219 of the television ex300. The moving picture decoding apparatus can be implemented not in the set top box but in the television ex300.
Figure 23 illustrates the television (receiver) ex300 that uses the moving picture coding method and the moving picture decoding method described in each of the embodiments. The television ex300 includes: a tuner ex301 that obtains or provides multiplexed data obtained by multiplexing audio data onto video data, through the antenna ex204 or the cable ex203, etc., that receives a broadcast; a modulation/demodulation unit ex302 that demodulates the received multiplexed data or modulates data into multiplexed data to be supplied outside; and a multiplexing/demultiplexing unit ex303 that demultiplexes the modulated multiplexed data into video data and audio data, or multiplexes video data and audio data encoded by a signal processing unit ex306 into data.
The television ex300 further includes: a signal processing unit ex306 including an audio signal processing unit ex304 and a video signal processing unit ex305 that decode audio data and video data and encode audio data and video data, respectively (and that function as the image coding apparatus and the image decoding apparatus according to aspects of the present invention); and an output unit ex309 including a speaker ex307 that provides the decoded audio signal, and a display unit ex308 that displays the decoded video signal, such as a display. In addition, the television ex300 includes an interface unit ex317 that includes an operation input unit ex312 that receives an input of a user operation. In addition, the television ex300 includes a control unit ex310 that controls overall each constituent element of the television ex300, and a power supply circuit unit that supplies power to each of the elements. Other than the operation input unit ex312, the interface unit ex317 may include: a bridge ex313 that is connected to an external device, such as the reader/writer ex218; a slot unit ex314 for enabling attachment of a recording medium ex216, such as an SD card; a driver ex315 to be connected to an external recording medium, such as a hard disk; and a modem ex316 to be connected to a telephone network. Here, the recording medium ex216 can electrically record information using a non-volatile/volatile semiconductor memory element for storage. The constituent elements of the television ex300 are connected to each other through a synchronous bus.
First, the configuration in which the television ex300 decodes multiplexed data obtained from outside through the antenna ex204 and others and reproduces the decoded data will be described. In the television ex300, upon a user operation through a remote controller ex220 and others, the multiplexer/demultiplexer unit ex303 demultiplexes the multiplexed data demodulated by the modulation/demodulation unit ex302, under control of the control unit ex310 including a CPU. Furthermore, the audio signal processing unit ex304 decodes the demultiplexed audio data, and the video signal processing unit ex305 decodes the demultiplexed video data, using the decoding method described in each of the embodiments, in the television ex300. The output unit ex309 provides the decoded video signal and audio signal to the outside, respectively. When the output unit ex309 provides the video signal and the audio signal, the signals may be temporarily stored in the temporary storage memories ex318 and ex319, and others so that the signals are reproduced in synchronization with each other. Moreover, the television ex300 may read multiplexed data not through a broadcast and others but from the recording media ex215 and ex216, such as a magnetic disk, an optical disk, and an SD card. Next, a configuration in which the television ex300 encodes an audio signal and a video signal, and transmits the data to the outside or writes the data on a recording medium will be described. In the television ex300, upon a user operation through the remote controller ex220 and others, the audio signal processing unit ex304 encodes an audio signal, and the video signal processing unit ex305 encodes a video signal, under control of the control unit ex310 using the coding method described in each of the embodiments. The multiplexer/demultiplexer unit ex303 multiplexes the encoded video signal and audio signal, and provides the resulting signal to the outside.
When the multiplexer/demultiplexer unit ex303 multiplexes the video signal and the audio signal, the signals may be temporarily stored in the temporary storage memories ex320 and ex321, and others so that the signals are reproduced in synchronization with each other. Here, the temporary storage memories ex318, ex319, ex320 and ex321 may be plural as illustrated, or at least one temporary storage memory may be shared in the television ex300. In addition, data may be stored in a temporary storage memory so that system overflow and underflow may be avoided between the modulation/demodulation unit ex302 and the multiplexer/demultiplexer unit ex303, for example.
Moreover, the television ex300 may include a configuration for receiving an AV input from a microphone or a camera other than the configuration for obtaining audio and video data from a broadcast or a recording medium, and may encode the obtained data. Although the television ex300 can encode, multiplex, and provide data to the outside in the description, it may be capable of only receiving, decoding, and providing data to the outside but not of encoding, multiplexing, and providing data to the outside.
Further, when the reader/writer ex218 reads or writes multiplexed data from or on a recording medium, one of the television ex300 and the reader/writer ex218 may decode or encode the multiplexed data, and the television ex300 and the reader/writer ex218 may share the decoding or encoding.
As an example, Figure 24 illustrates a configuration of an information reproduction/recording unit ex400 when data is read or written from or on an optical disk. The information reproduction/recording unit ex400 includes constituent elements ex401, ex402, ex403, ex404, ex405, ex406 and ex407 to be described hereinafter. The optical head ex401 irradiates a laser spot on a recording surface of the recording medium ex215 that is an optical disk, to write information, and detects the light reflected from the recording surface of the recording medium ex215 to read the information. The modulation recording unit ex402 electrically drives a semiconductor laser included in the optical head ex401, and modulates the laser light according to the recorded data. The reproduction demodulation unit ex403 amplifies a reproduction signal obtained by electrically detecting the light reflected from the recording surface using a photodetector included in the optical head ex401, and demodulates the reproduction signal by separating a signal component recorded on the recording medium ex215 to reproduce the necessary information. The temporary storage memory ex404 temporarily holds the information to be recorded on the recording medium ex215 and the information reproduced from the recording medium ex215. The disk motor ex405 rotates the recording medium ex215.
The servo control unit ex406 moves the optical head ex401 to a predetermined information track while controlling the rotation drive of the disk motor ex405 so as to follow the laser spot. The system control unit ex407 controls overall the information reproduction/recording unit ex400. The reading and writing processes can be implemented by the system control unit ex407 using various information stored in the temporary storage memory ex404 and generating and adding new information as necessary, and by the modulation recording unit ex402, the reproduction demodulation unit ex403 and the servo control unit ex406 that record and reproduce information through the optical head ex401 while being operated in a coordinated manner. The system control unit ex407 includes, for example, a microprocessor, and executes processing by causing a computer to execute a program for reading and writing.
Although the optical head ex401 irradiates a laser spot in the description, it may perform high-density recording using near-field light.
Figure 25 illustrates the recording medium ex215 that is the optical disk. On the recording surface of the recording medium ex215, guide grooves are spirally formed, and an information track ex230 records, in advance, address information indicating an absolute position on the disk according to a change in the shape of the guide grooves. The address information includes information for determining positions of recording blocks ex231 that are a unit for recording data. Reproducing the information track ex230 and reading the address information in an apparatus that records and reproduces data can lead to determination of the positions of the recording blocks. Furthermore, the recording medium ex215 includes a data recording area ex233, an inner circumference area ex232, and an outer circumference area ex234. The data recording area ex233 is an area for use in recording the user data. The inner circumference area ex232 and the outer circumference area ex234 that are inside and outside of the data recording area ex233, respectively, are for specific use except for recording the user data. The information reproduction/recording unit ex400 reads and writes encoded audio data, encoded video data, or multiplexed data obtained by multiplexing the encoded audio and video data, from and on the data recording area ex233 of the recording medium ex215.
Although an optical disk having a single layer, such as a DVD and a BD, is described as an example in the description, the optical disk is not limited to such, and may be an optical disk having a multilayer structure and capable of being recorded on a part other than the surface. Moreover, the optical disk may have a structure for multidimensional recording/reproduction, such as recording of information using light of colors with different wavelengths in the same portion of the optical disk, and recording of information having different layers from various angles.
In addition, a car ex210 having an antenna ex205 can receive data from the satellite ex202 and others, and reproduce video on a display device such as a car navigation system ex211 set in the car ex210, in the digital broadcasting system ex200. Here, a configuration of the car navigation system ex211 will be a configuration, for example, including a GPS receiving unit in the configuration illustrated in Figure 23. The same will be true for the configuration of the computer ex111, the cell phone ex114, and others.
Figure 26A illustrates the cell phone ex114 that uses the moving image coding method and the moving image decoding method described in the embodiments. The cell phone ex114 includes: an antenna ex350 for transmitting and receiving radio waves through the base station ex110; a camera unit ex365 capable of capturing moving and still images; and a display unit ex358 such as a liquid crystal display for displaying data such as decoded video captured by the camera unit ex365 or received by the antenna ex350. The cell phone ex114 further includes: a main body unit including an operation key unit ex366; an audio output unit ex357 such as a speaker for output of audio; an audio input unit ex356 such as a microphone for input of audio; a memory unit ex367 for storing captured video or still images, recorded audio, encoded or decoded data of the received video, the still images, e-mails, or others; and a slot unit ex364 that is an interface unit for a recording medium that stores data in the same manner as the memory unit ex367.
Next, an example of a configuration of the cell phone ex114 will be described with reference to Fig. 26B. In the cell phone ex114, a main control unit ex360 designed to control overall each unit of the main body including the display unit ex358 as well as the operation key unit ex366 is connected mutually, via a synchronous bus ex370, to a power supply circuit unit ex361, an operation input control unit ex362, a video signal processing unit ex355, a camera interface unit ex363, a liquid crystal display (LCD) control unit ex359, a modulation/demodulation unit ex352, a multiplexer/demultiplexer unit ex353, an audio signal processing unit ex354, the slot unit ex364, and the memory unit ex367.
When a call-end key or a power key is turned on by a user's operation, the power supply circuit unit ex361 supplies the respective units with power from a battery so as to activate the cell phone ex114.
In the cell phone ex114, the audio signal processing unit ex354 converts the audio signals collected by the audio input unit ex356 in voice conversation mode into digital audio signals under the control of the main control unit ex360 including a CPU, a ROM, and a RAM. Then, the modulation/demodulation unit ex352 performs spread spectrum processing on the digital audio signals, and the transmitting and receiving unit ex351 performs digital-to-analog conversion and frequency conversion on the data, so as to transmit the resulting data via the antenna ex350. Also, in the cell phone ex114, the transmitting and receiving unit ex351 amplifies the data received by the antenna ex350 in voice conversation mode and performs frequency conversion and analog-to-digital conversion on the data. Then, the modulation/demodulation unit ex352 performs inverse spread spectrum processing on the data, and the audio signal processing unit ex354 converts it into analog audio signals, so as to output them via the audio output unit ex357.
In addition, when an e-mail in data communication mode is transmitted, text data of the e-mail entered by operating the operation key unit ex366 and others of the main body is sent to the main control unit ex360 via the operation input control unit ex362. The main control unit ex360 causes the modulation/demodulation unit ex352 to perform spread spectrum processing on the text data, and the transmitting and receiving unit ex351 performs digital-to-analog conversion and frequency conversion on the resulting data to transmit the data to the base station ex110 via the antenna ex350. When an e-mail is received, processing that is approximately the inverse of the processing for transmitting an e-mail is performed on the received data, and the resulting data is provided to the display unit ex358.
When video, still images, or video and audio in data communication mode is or are transmitted, the video signal processing unit ex355 compresses and encodes video signals supplied from the camera unit ex365 using the moving image coding method shown in each of the embodiments (that is, it functions as the image coding apparatus according to one aspect of the present invention), and transmits the encoded video data to the multiplexer/demultiplexer unit ex353. In contrast, while the camera unit ex365 captures video, still images, and others, the audio signal processing unit ex354 encodes audio signals collected by the audio input unit ex356, and transmits the encoded audio data to the multiplexer/demultiplexer unit ex353.
The multiplexer/demultiplexer unit ex353 multiplexes the encoded video data supplied from the video signal processing unit ex355 and the encoded audio data supplied from the audio signal processing unit ex354, using a predetermined method. Then, the modulation/demodulation unit (modulation/demodulation circuit unit) ex352 performs spread spectrum processing on the multiplexed data, and the transmitting and receiving unit ex351 performs digital-to-analog conversion and frequency conversion on the data so as to transmit the resulting data via the antenna ex350.
When data of a video file that is linked to a Web page and others in data communication mode is received, or when an e-mail with video and/or audio attached is received, in order to decode the multiplexed data received via the antenna ex350, the multiplexer/demultiplexer unit ex353 demultiplexes the multiplexed data into a video data bit stream and an audio data bit stream, and supplies the video signal processing unit ex355 with the encoded video data and the audio signal processing unit ex354 with the encoded audio data, through the synchronous bus ex370. The video signal processing unit ex355 decodes the video signal using a moving image decoding method corresponding to the moving image coding method shown in each of the embodiments (that is, it functions as the image decoding apparatus according to one aspect of the present invention), and then the display unit ex358 displays, for example, the video and still images included in the video file linked to the Web page via the LCD control unit ex359. Furthermore, the audio signal processing unit ex354 decodes the audio signal, and the audio output unit ex357 provides the audio.
In addition, similarly to the television ex300, a terminal such as the cell phone ex114 probably has three types of implementation configurations including not only (i) a transmitting and receiving terminal including both a coding apparatus and a decoding apparatus, but also (ii) a transmitting terminal including only a coding apparatus and (iii) a receiving terminal including only a decoding apparatus. Although the digital broadcasting system ex200 receives and transmits the multiplexed data obtained by multiplexing audio data onto video data in the description, the multiplexed data may be data obtained by multiplexing not audio data but character data related to video onto video data, and may be not multiplexed data but video data itself.
As such, the moving image coding method and the moving image decoding method in each of the embodiments can be used in any of the devices and systems described. Thus, the advantages described in each of the embodiments can be obtained.
Moreover, the present invention is not limited to the embodiments, and various modifications and revisions are possible without departing from the scope of the present invention.

Embodiment 3

Video data can be generated by switching, as necessary, between (i) the moving image coding method or the moving image coding apparatus shown in each of the embodiments and (ii) a moving image coding method or a moving image coding apparatus in conformity with a different standard, such as MPEG-2, MPEG4-AVC, and VC-1.
Here, when a plurality of video data that conforms to the different standards is generated and is then decoded, the decoding methods need to be selected to conform to the different standards. However, since to which standard each of the plurality of the video data to be decoded conforms cannot be detected, there is a problem that an appropriate decoding method cannot be selected.
In order to solve the problem, multiplexed data obtained by multiplexing audio data and others onto video data has a structure including identification information indicating to which standard the video data conforms. The specific structure of the multiplexed data including the video data generated in the moving image coding method and by the moving image coding apparatus shown in each of the embodiments will be described hereinafter. The multiplexed data is a digital stream in the MPEG-2 Transport Stream format.
Figure 27 illustrates a structure of the multiplexed data. As illustrated in Figure 27, the multiplexed data can be obtained by multiplexing at least one of a video stream, an audio stream, a presentation graphics (PG) stream, and an interactive graphics (IG) stream. The video stream represents primary video and secondary video of a movie, the audio stream represents a primary audio part and a secondary audio part to be mixed with the primary audio part, and the presentation graphics stream represents subtitles of the movie. Here, the primary video is normal video to be displayed on a screen, and the secondary video is video to be displayed in a smaller window within the primary video. Furthermore, the interactive graphics stream represents an interactive screen to be generated by arranging GUI components on a screen. The video stream is encoded in the moving image coding method or by the moving image coding apparatus shown in each of the embodiments, or in a moving image coding method or by a moving image coding apparatus in conformity with a conventional standard, such as MPEG-2, MPEG4-AVC, and VC-1. The audio stream is encoded in accordance with a standard, such as Dolby-AC-3, Dolby Digital Plus, MLP, DTS, DTS-HD, and linear PCM.
Each stream included in the multiplexed data is identified by a PID. For example, 0x1011 is allocated to the video stream to be used for video of a movie, 0x1100 to 0x111F are allocated to the audio streams, 0x1200 to 0x121F are allocated to the presentation graphics streams, 0x1400 to 0x141F are allocated to the interactive graphics streams, 0x1B00 to 0x1B1F are allocated to the video streams to be used for secondary video of the movie, and 0x1A00 to 0x1A1F are allocated to the audio streams to be used for the secondary audio to be mixed with the primary audio.
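The PID allocation above amounts to a simple range lookup. A minimal sketch in Python, assuming the ranges as listed in the description (the function name is illustrative, and the secondary-video upper bound 0x1B1F is an assumption where the text is garbled):

```python
def classify_pid(pid: int) -> str:
    """Map a 13-bit PID to the stream category it is allocated to."""
    if pid == 0x1011:
        return "primary video"
    if 0x1100 <= pid <= 0x111F:
        return "audio"
    if 0x1200 <= pid <= 0x121F:
        return "presentation graphics"
    if 0x1400 <= pid <= 0x141F:
        return "interactive graphics"
    if 0x1B00 <= pid <= 0x1B1F:
        return "secondary video"
    if 0x1A00 <= pid <= 0x1A1F:
        return "secondary audio"
    return "other"

print(classify_pid(0x1011))  # primary video
print(classify_pid(0x1205))  # presentation graphics
```

A demultiplexer can use such a lookup to route each TS packet to the appropriate elementary-stream decoder.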
Figure 28 schematically illustrates how data is multiplexed. First, a video stream ex235 composed of video frames and an audio stream ex238 composed of audio frames are transformed into a stream of PES packets ex236 and a stream of PES packets ex239, and further into TS packets ex237 and TS packets ex240, respectively. Similarly, data of a presentation graphics stream ex241 and data of an interactive graphics stream ex244 are transformed into a stream of PES packets ex242 and a stream of PES packets ex245, and further into TS packets ex243 and TS packets ex246, respectively. These TS packets are multiplexed into a stream to obtain multiplexed data ex247.
Figure 29 illustrates in more detail how a video stream is stored in a stream of PES packets. The first bar in Figure 29 shows a video frame stream in a video stream. The second bar shows the stream of PES packets. As indicated by the arrows denoted yy1, yy2, yy3 and yy4 in Figure 29, the video stream is divided into pictures, such as I pictures, B pictures, and P pictures, each of which is a video presentation unit, and the pictures are stored in a payload of each of the PES packets. Each of the PES packets has a PES header, and the PES header stores a presentation time stamp (PTS) indicating a display time of the picture, and a decoding time stamp (DTS) indicating a decoding time of the picture.
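The 33-bit PTS and DTS values are carried in five-byte fields of the PES header. A sketch of that packing, following the standard MPEG-2 layout (helper names are illustrative, not from the document):

```python
def encode_ts(value: int, prefix: int) -> bytes:
    """Pack a 33-bit timestamp into the 5-byte PES header field.
    prefix is 0b0010 for a lone PTS; a PTS/DTS pair uses 0b0011 and 0b0001."""
    return bytes([
        (prefix << 4) | (((value >> 30) & 0x7) << 1) | 1,  # top 3 bits + marker
        (value >> 22) & 0xFF,                              # next 8 bits
        (((value >> 15) & 0x7F) << 1) | 1,                 # next 7 bits + marker
        (value >> 7) & 0xFF,                               # next 8 bits
        ((value & 0x7F) << 1) | 1,                         # low 7 bits + marker
    ])

def decode_ts(b: bytes) -> int:
    """Recover the 33-bit timestamp from the 5-byte field."""
    return (((b[0] >> 1) & 0x7) << 30) | (b[1] << 22) | \
           (((b[2] >> 1) & 0x7F) << 15) | (b[3] << 7) | ((b[4] >> 1) & 0x7F)

pts = 900_000  # 10 s at the 90 kHz PTS clock
assert decode_ts(encode_ts(pts, 0b0010)) == pts
```

The interleaved marker bits keep the field free of start-code emulation; a decoder strips them when reassembling the 33-bit value.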
Figure 30 illustrates a format of TS packets to be finally written on the multiplexed data. Each of the TS packets is a 188-byte fixed-length packet including a 4-byte TS header having information such as a PID for identifying a stream, and a 184-byte TS payload for storing data. The PES packets are divided and stored in the TS payloads, respectively. When a BD-ROM is used, each of the TS packets is given a 4-byte TP_Extra_Header, thus resulting in 192-byte source packets. The source packets are written on the multiplexed data. The TP_Extra_Header stores information such as an Arrival_Time_Stamp (ATS). The ATS shows a transfer start time at which each of the TS packets is to be transferred to a PID filter. The source packets are arranged in the multiplexed data as shown at the bottom of Figure 30. The numbers incremented from the head of the multiplexed data are called source packet numbers (SPNs).
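A minimal parse of these fixed-length packets, assuming the standard MPEG-2 TS header layout (0x47 sync byte, 13-bit PID) and a 30-bit ATS in the low bits of the TP_Extra_Header; the function names are illustrative:

```python
def parse_ts_header(pkt: bytes) -> dict:
    """Extract the PID and flags from a 188-byte TS packet's 4-byte header."""
    assert len(pkt) == 188 and pkt[0] == 0x47  # 0x47 is the TS sync byte
    return {
        "payload_unit_start": bool(pkt[1] & 0x40),
        "pid": ((pkt[1] & 0x1F) << 8) | pkt[2],
        "continuity_counter": pkt[3] & 0x0F,
    }

def split_source_packet(src: bytes):
    """A 192-byte source packet = 4-byte TP_Extra_Header + 188-byte TS packet.
    The low 30 bits of the extra header hold the Arrival_Time_Stamp (ATS)."""
    assert len(src) == 192
    ats = int.from_bytes(src[:4], "big") & 0x3FFFFFFF
    return ats, src[4:]

ts = bytes([0x47, 0x50, 0x11, 0x17]) + bytes(184)  # PID 0x1011, counter 7
print(hex(parse_ts_header(ts)["pid"]))  # 0x1011
```

The PID filter mentioned in the text would apply `parse_ts_header` to each arriving packet and pass on only those whose PID matches the selected stream.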
Each of the TS packets included in the multiplexed data includes not only streams of audio, video, subtitles and others, but also a Program Association Table (PAT), a Program Map Table (PMT), and a Program Clock Reference (PCR). The PAT shows what a PID in a PMT used in the multiplexed data indicates, and a PID of the PAT itself is registered as zero. The PMT stores PIDs of the streams of video, audio, subtitles and others included in the multiplexed data, and attribute information of the streams corresponding to the PIDs. The PMT also has various descriptors relating to the multiplexed data. The descriptors have information such as copy control information showing whether copying of the multiplexed data is permitted or not. The PCR stores STC time information corresponding to an ATS showing when the PCR packet is transferred to a decoder, in order to achieve synchronization between an Arrival Time Clock (ATC) that is a time axis of ATSs, and a System Time Clock (STC) that is a time axis of PTSs and DTSs.
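In the MPEG-2 system layer the PCR samples a 27 MHz clock as a 33-bit base counting 90 kHz ticks plus a 9-bit extension; a sketch of that standard arithmetic (the helper names are illustrative):

```python
def pcr_to_27mhz(base: int, extension: int) -> int:
    """Combine the 33-bit PCR base (90 kHz ticks) and the 9-bit
    extension (0-299) into a 27 MHz System Time Clock sample."""
    assert 0 <= extension < 300
    return base * 300 + extension

def stc_seconds(stc: int) -> float:
    """Convert a 27 MHz STC sample to seconds."""
    return stc / 27_000_000

stc = pcr_to_27mhz(base=90_000, extension=150)  # one second plus 150 ticks
assert stc == 27_000_150
```

Because PTSs and DTSs tick at the same 90 kHz rate as the PCR base, a decoder can compare them directly against the reconstructed STC to schedule decoding and display.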
Figure 31 illustrates the data structure of the PMT in detail. A PMT header is disposed at the top of the PMT. The PMT header describes the length of data included in the PMT and others. A plurality of descriptors relating to the multiplexed data is disposed after the PMT header. Information such as the copy control information is described in the descriptors. After the descriptors, a plurality of pieces of stream information relating to the streams included in the multiplexed data is disposed. Each piece of stream information includes stream descriptors each describing information, such as a stream type for identifying a compression codec of a stream, a stream PID, and stream attribute information (such as a frame rate or an aspect ratio). The stream descriptors are equal in number to the number of streams in the multiplexed data.
When the multiplexed data is recorded on a recording medium and others, it is recorded together with multiplexed data information files.
Each of the multiplexed data information files is management information of the multiplexed data as shown in Figure 32. The multiplexed data information files are in one-to-one correspondence with the multiplexed data, and each of the files includes multiplexed data information, stream attribute information, and an entry map.
As illustrated in Figure 32, the multiplexed data information includes a system rate, a reproduction start time, and a reproduction end time. The system rate indicates the maximum transfer rate at which a system target decoder to be described later transfers the multiplexed data to a PID filter. The intervals of the ATSs included in the multiplexed data are set to not higher than the system rate. The reproduction start time indicates a PTS in a video frame at the head of the multiplexed data. An interval of one frame is added to a PTS in a video frame at the end of the multiplexed data, and the PTS is set to the reproduction end time.
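The reproduction end time rule above (the PTS of the last video frame plus one frame interval) can be sketched as follows; the 90 kHz PTS clock is standard, while the 30 fps figure is only an illustrative assumption:

```python
PTS_CLOCK_HZ = 90_000  # PES/PTS time base

def reproduction_end_pts(last_frame_pts: int, frame_rate: float) -> int:
    """Add one frame interval to the PTS of the final video frame."""
    return last_frame_pts + round(PTS_CLOCK_HZ / frame_rate)

# e.g. last frame displayed at 10 s in 30 fps video:
assert reproduction_end_pts(900_000, 30.0) == 903_000
```

Adding the extra interval ensures the end time covers the display duration of the final frame rather than only its start.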
As shown in Figure 33, a piece of attribute information is registered in the stream attribute information, for each PID of each stream included in the multiplexed data. Each piece of attribute information has different information depending on whether the corresponding stream is a video stream, an audio stream, a presentation graphics stream, or an interactive graphics stream. Each piece of video stream attribute information carries information including what kind of compression codec is used for compressing the video stream, and the resolution, aspect ratio and frame rate of the pieces of picture data that are included in the video stream. Each piece of audio stream attribute information carries information including what kind of compression codec is used for compressing the audio stream, how many channels are included in the audio stream, which language the audio stream supports, and how high the sampling frequency is. The video stream attribute information and the audio stream attribute information are used for initialization of a decoder before the player plays back the information.
In the present embodiment, the multiplexed data to be used is of a stream type included in the PMT. Furthermore, when the multiplexed data is recorded on a recording medium, the video stream attribute information included in the multiplexed data information is used. More specifically, the moving image coding method or the moving image coding apparatus described in each of the embodiments includes a step or a unit for allocating unique information indicating video data generated by the moving image coding method or the moving image coding apparatus in each of the embodiments, to the stream type included in the PMT or the video stream attribute information. With the configuration, the video data generated by the moving image coding method or the moving image coding apparatus described in each of the embodiments can be distinguished from video data that conforms to another standard.
Furthermore, Figure 34 illustrates steps of the moving image decoding method according to the present embodiment. In step exS100, the stream type included in the PMT or the video stream attribute information is obtained from the multiplexed data. Next, in step exS101, it is determined whether or not the stream type or the video stream attribute information indicates that the multiplexed data is generated by the moving image coding method or the moving image coding apparatus in each of the embodiments. When it is determined that the stream type or the video stream attribute information indicates that the multiplexed data is generated by the moving image coding method or the moving image coding apparatus in each of the embodiments, in step exS102, decoding is performed by the moving image decoding method in each of the embodiments. Furthermore, when the stream type or the video stream attribute information indicates conformance with the conventional standards, such as MPEG-2, MPEG4-AVC, and VC-1, in step exS103, decoding is performed by a moving image decoding method in conformity with the conventional standards.
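The decision described in these steps reduces to a dispatch on the identification information; a sketch, in which the unique identifier value and the returned descriptions are hypothetical:

```python
CONVENTIONAL = {"MPEG-2", "MPEG4-AVC", "VC-1"}
OWN_STANDARD = "embodiment-codec"  # hypothetical unique identifier

def select_decoder(stream_type: str) -> str:
    """Pick a decoding method from the stream type or attribute information."""
    if stream_type == OWN_STANDARD:
        return "decode with the moving image decoding method in the embodiments"
    if stream_type in CONVENTIONAL:
        return f"decode with a conventional {stream_type} decoder"
    raise ValueError(f"unknown stream type: {stream_type}")

print(select_decoder("MPEG-2"))  # decode with a conventional MPEG-2 decoder
```

Raising an error on an unrecognized identifier corresponds to the case the text describes as undecodable, where no appropriate decoding method can be selected.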
As such, allocating a new unique value to the stream type or the video stream attribute information makes it possible to determine whether or not the moving image decoding method or the moving image decoding apparatus described in each of the embodiments can perform the decoding. Even when multiplexed data that conforms to a different standard is input, an appropriate decoding method or apparatus can be selected. Thus, it becomes possible to decode information without any error. Furthermore, the method or apparatus for coding moving images, or the method or apparatus for decoding moving images in the present embodiment can be used in the devices and systems described above.
Embodiment 4

Each of the moving image coding method, the moving image coding apparatus, the moving image decoding method, and the moving image decoding apparatus in each of the embodiments is typically achieved in the form of an integrated circuit or a Large Scale Integrated (LSI) circuit. As an example of the LSI, Figure 35 illustrates a configuration of the LSI ex500 that is made into one chip. The LSI ex500 includes elements ex501, ex502, ex503, ex504, ex505, ex506, ex507, ex508 and ex509 to be described below, and the elements are connected to each other through a bus ex510. The power supply circuit unit ex505 is activated by supplying each of the elements with power when the power supply circuit unit ex505 is turned on.
For example, when coding is performed, the LSI ex500 receives an AV signal from a microphone ex117, a camera ex113, and others through an AV IO ex509 under control of a control unit ex501 including a CPU ex502, a memory controller ex503, a stream controller ex504, and an excitation frequency control unit ex512. The received AV signal is temporarily stored in an external memory ex511, such as an SDRAM. Under control of the control unit ex501, the stored data is segmented into data portions according to the processing amount and speed to be transmitted to a signal processing unit ex507. Then, the signal processing unit ex507 encodes an audio signal and/or a video signal. Here, the coding of the video signal is the coding described in each of the embodiments. Furthermore, the signal processing unit ex507 sometimes multiplexes the encoded audio data and the encoded video data, and a stream IO ex506 provides the multiplexed data to the outside. The provided multiplexed data is transmitted to the base station ex107, or written on the recording medium ex215. When data sets are multiplexed, the data should be temporarily stored in the temporary storage memory ex508 so that the data sets are synchronized with each other.
Although the memory ex511 is an element outside the LSI ex500, it may be included in the LSI ex500. The temporary storage memory ex508 is not limited to one temporary storage memory, but may be composed of a plurality of temporary storage memories. Furthermore, the LSI ex500 may be made into one chip or a plurality of chips.
Moreover, although the control unit ex501 includes the CPU ex502, the memory controller ex503, the stream controller ex504 and the excitation frequency control unit ex512, the configuration of the control unit ex501 is not limited thereto. For example, the signal processing unit ex507 may further include a CPU. The inclusion of another CPU in the signal processing unit ex507 can improve the processing speed. Furthermore, as another example, the CPU ex502 may serve as or be a part of the signal processing unit ex507, and, for example, may include an audio signal processing unit. In such a case, the control unit ex501 includes the signal processing unit ex507 or the CPU ex502 that includes a part of the signal processing unit ex507.
The name used here is LSI, but it may also be called IC, system LSI, super LSI or ultra LSI depending on the degree of integration.
In addition, ways to achieve integration are not limited to the LSI, and a special circuit or a general-purpose processor and so on can also achieve the integration. A field programmable gate array (FPGA) that can be programmed after the manufacture of LSIs, or a reconfigurable processor that allows reconfiguration of the connection or configuration of an LSI, can be used for the same purpose.
In the future, with the advance of semiconductor technology, a brand-new technology may replace LSI. The functional blocks can be integrated using such technology. One such possibility is that the present invention is applied to biotechnology.
Modality 5 When video data generated by the moving image coding method or by the moving image coding apparatus described in each of the modalities is decoded, the amount of processing is likely to increase compared to when video data that conforms to a conventional standard, such as MPEG-2, MPEG4-AVC and VC-1, is decoded. Thus, the LSI ex500 has to be set at an excitation frequency higher than that of the CPU ex502 used when video data that conforms to the conventional standard is decoded. However, when the excitation frequency is set higher, there is the problem that the power consumption increases.
To solve the problem, the moving image decoding apparatus, such as the television ex300 and the LSI ex500, is configured to determine to which standard the video data conforms, and to switch between the excitation frequencies according to the determined standard. Figure 36 illustrates a configuration ex800 in the present modality. An excitation frequency change unit ex803 establishes an excitation frequency at a higher excitation frequency when the video data is generated by the moving image coding method or the moving image coding apparatus described in each of the modalities. Then, the excitation frequency change unit ex803 instructs a decoding processing unit ex801 that executes the moving image decoding method described in each of the modalities to decode the video data. When the video data conforms to the conventional standard, the excitation frequency change unit ex803 establishes an excitation frequency at an excitation frequency lower than that for the video data generated by the moving image coding method or the moving image coding apparatus described in each of the modalities. Then, the excitation frequency change unit ex803 instructs the decoding processing unit ex802 that conforms to the conventional standard to decode the video data.
More specifically, the excitation frequency change unit ex803 includes the CPU ex502 and the excitation frequency control unit ex512 in Figure 35. Here, each of the decoding processing unit ex801 that executes the moving image decoding method described in each of the modalities and the decoding processing unit ex802 that conforms to the conventional standard corresponds to the signal processing unit ex507 in Figure 35. The CPU ex502 determines to which standard the video data conforms. Then, the excitation frequency control unit ex512 determines an excitation frequency based on a signal from the CPU ex502. Furthermore, the signal processing unit ex507 decodes the video data based on the signal from the CPU ex502. For example, the identification information described in Modality B is preferably used to identify the video data. The identification information is not limited to the one described in Modality B, but can be any information as long as the information indicates to which standard the video data conforms. For example, when to which standard the video data conforms can be determined based on an external signal indicating that the video data is used for a television or a disc, etc., the determination can be made based on this external signal. Moreover, the CPU ex502 selects an excitation frequency based on, for example, a look-up table in which the standards of the video data are associated with the excitation frequencies, as shown in Figure 38.
The excitation frequency can be selected by storing the look-up table in the temporary storage memory ex508 and in an internal memory of an LSI, and by the CPU ex502 referring to the look-up table.
Figure 37 illustrates steps to execute a method in the present modality. First, in step exS200, the signal processing unit ex507 obtains identification information from the multiplexed data. Then, in step exS201, the CPU ex502 determines whether or not the video data is generated by the coding method and the coding apparatus described in each of the modalities, based on the identification information. When the video data is generated by the moving image coding method and the moving image coding apparatus described in each of the modalities, in step exS202, the CPU ex502 transmits a signal for setting the excitation frequency at a higher excitation frequency to the excitation frequency control unit ex512. Then, the excitation frequency control unit ex512 sets the excitation frequency at the higher excitation frequency. On the other hand, when the identification information indicates that the video data conforms to the conventional standard, such as MPEG-2, MPEG4-AVC and VC-1, in step exS203, the CPU ex502 transmits a signal for setting the excitation frequency at a lower excitation frequency to the excitation frequency control unit ex512. Then, the excitation frequency control unit ex512 sets the excitation frequency at an excitation frequency lower than that in the case when the video data is generated by the moving image coding method and the moving image coding apparatus described in each of the modalities.
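The decision procedure of steps exS200 to exS203 can be sketched as follows. This is only an illustrative sketch: the standard names, the frequency values and the function names are hypothetical placeholders, not values from the disclosure.

```python
# Illustrative sketch of the excitation frequency selection in steps
# exS200-exS203. The frequency values and standard names below are
# hypothetical placeholders, not values from the disclosure.

# Look-up table associating video standards with excitation frequencies
# (cf. the look-up table of Figure 38): video data generated by the
# coding method of the modalities gets a higher frequency than video
# data conforming to a conventional standard.
FREQUENCY_TABLE_MHZ = {
    "modality": 500,    # generated by the coding method of the modalities
    "MPEG-2": 350,
    "MPEG4-AVC": 350,
    "VC-1": 350,
}

def select_excitation_frequency(identification_info: str) -> int:
    """Return the excitation frequency for the standard named in the
    identification information obtained from the multiplexed data."""
    # step exS201: determine whether the video data was generated by the
    # coding method of the modalities or conforms to a conventional standard
    if identification_info == "modality":
        return FREQUENCY_TABLE_MHZ["modality"]              # step exS202: higher
    return FREQUENCY_TABLE_MHZ.get(identification_info, 350)  # step exS203: lower

print(select_excitation_frequency("modality"))   # 500
print(select_excitation_frequency("MPEG-2"))     # 350
```

As the surrounding text notes, the same table-driven selection also supports voltage switching along with frequency switching.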
In addition, along with the change of the excitation frequencies, the energy conservation effect can be improved by changing the voltage that will be applied to the LSI ex500 or to an apparatus that includes the LSI ex500. For example, when the excitation frequency is set lower, the voltage that will be applied to the LSI ex500 or to the apparatus that includes the LSI ex500 will probably be set at a voltage lower than that in the case where the excitation frequency is set higher.
Moreover, as the method for setting the excitation frequency, when the amount of processing for decoding is larger, the excitation frequency may be set higher, and when the amount of processing for decoding is smaller, the excitation frequency may be set lower. Thus, the setting method is not limited to those described above. For example, when the amount of processing for decoding video data that conforms to MPEG4-AVC is greater than the amount of processing for decoding video data generated by the moving image coding method and the moving image coding apparatus described in each of the modalities, the excitation frequency is probably set in the reverse order to the setting described above.
Moreover, the method for establishing the excitation frequency is not limited to the method for establishing a lower excitation frequency. For example, when the identification information indicates that the video data is generated by the moving image coding method and the moving image coding apparatus described in each of the modalities, the voltage that will be applied to the LSI ex500 or to the apparatus that includes the LSI ex500 will probably be set higher. When the identification information indicates that the video data conforms to the conventional standard, such as MPEG-2, MPEG4-AVC and VC-1, the voltage that will be applied to the LSI ex500 or to the apparatus that includes the LSI ex500 will probably be set lower. As another example, when the identification information indicates that the video data is generated by the moving image coding method and the moving image coding apparatus described in each of the modalities, the excitation of the CPU ex502 probably does not have to be suspended. When the identification information indicates that the video data conforms to the conventional standard, such as MPEG-2, MPEG4-AVC and VC-1, the excitation of the CPU ex502 will probably be suspended at a given time because the CPU ex502 has extra processing capacity. Even when the identification information indicates that the video data is generated by the moving image coding method and the moving image coding apparatus described in each of the modalities, in the case where the CPU ex502 has extra processing capacity, the excitation of the CPU ex502 is likely to be suspended at a given time. In such a case, the suspension time is probably set shorter than that in the case when the identification information indicates that the video data conforms to the conventional standard, such as MPEG-2, MPEG4-AVC and VC-1.
Accordingly, the energy conservation effect can be improved by switching between the excitation frequencies according to the standard to which the video data conforms. In addition, when the LSI ex500 or the apparatus that includes the LSI ex500 is powered by a battery, the battery life can be extended with the energy conservation effect.
Modality 6 There are cases in which a plurality of video data that conform to different standards are provided to the devices and systems, such as a television and a mobile telephone. In order to make it possible to decode the plurality of video data that conform to the different standards, the signal processing unit ex507 of the LSI ex500 has to conform to the different standards. However, the problems of an increase in the scale of the circuit of the LSI ex500 and an increase in cost arise with the individual use of signal processing units ex507 that conform to the respective standards.
To solve the problem, what is conceived is a configuration in which the decoding processing unit for implementing the moving image decoding method described in each of the modalities and the decoding processing unit that conforms to the conventional standard, such as MPEG-2, MPEG4-AVC and VC-1, are partially shared. Ex900 in Figure 39A shows an example of the configuration. For example, the moving image decoding method described in each of the modalities and the moving image decoding method that conforms to MPEG4-AVC have, partially in common, the details of processing, such as entropy coding, inverse quantization, deblocking filtering and motion compensated prediction. The details of processing to be shared probably include use of a decoding processing unit ex902 that conforms to MPEG4-AVC for the common processing operations, while a dedicated decoding processing unit ex901 is used for processing unique to an aspect of the present invention. In particular, since the aspect of the present invention is characterized by inter prediction, it is possible, for example, that the dedicated decoding processing unit ex901 is used for inter prediction, and that the decoding processing unit is shared for any or all of the other processing, such as entropy decoding, inverse quantization, deblocking filtering and motion compensation. The decoding processing unit for implementing the moving image decoding method described in each of the modalities may be shared for the processing to be shared, and a dedicated decoding processing unit may be used for processing unique to MPEG4-AVC.
Moreover, ex1000 in Figure 39B shows another example in which processing is partially shared. This example uses a configuration that includes a dedicated decoding processing unit ex1001 that supports processing unique to the present invention, a dedicated decoding processing unit ex1002 that supports processing unique to the other, conventional standard, and a decoding processing unit ex1003 that supports the processing to be shared between the moving image decoding method of the present invention and the conventional moving image decoding method. Here, the dedicated decoding processing units ex1001 and ex1002 are not necessarily specialized for the processing of the present invention and the processing of the conventional standard, respectively, and may be units capable of implementing general processing. In addition, the configuration of the present modality can be implemented in the LSI ex500.
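The configuration of Figure 39B (dedicated units for processing unique to each method, plus one unit for the shared processing) can be illustrated as a simple dispatch. This is a hypothetical sketch: the task names and function names are assumptions for illustration, not part of the disclosed apparatus.

```python
# Illustrative dispatch for the ex1000-style configuration: processing
# unique to the present invention and processing unique to the
# conventional standard go to dedicated units, while common processing
# goes to one shared unit. All names below are hypothetical.

def shared_unit(task):             # decoding processing unit for shared processing
    return f"shared:{task}"

def dedicated_invention(task):     # dedicated unit for the present invention
    return f"invention:{task}"

def dedicated_conventional(task):  # dedicated unit for the conventional standard
    return f"conventional:{task}"

# Processing details named in the text as candidates for sharing.
COMMON_TASKS = {"entropy_decoding", "inverse_quantization",
                "deblocking_filtering", "motion_compensation"}

def dispatch(task: str, method: str) -> str:
    """Route a decoding task to the shared unit or the matching dedicated unit."""
    if task in COMMON_TASKS:
        return shared_unit(task)       # processing shared between both methods
    if method == "invention":
        return dedicated_invention(task)
    return dedicated_conventional(task)

print(dispatch("inverse_quantization", "invention"))  # shared:inverse_quantization
print(dispatch("inter_prediction", "invention"))      # invention:inter_prediction
```

Sharing the common path is what yields the circuit-scale and cost reduction described in the text, since only the method-specific tasks need duplicated hardware.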
In this way, reducing the scale of the circuit of an LSI and reducing the cost are possible by sharing the decoding processing unit for the processing shared between the moving image decoding method of the present invention and the moving image decoding method that conforms to the conventional standard.
Industrial application The moving image coding method and the moving image decoding method according to the present invention are applicable to any type of multimedia data, and make it possible to carry out the methods with a reduced load and the same coding efficiency, using motion vectors having limited magnitude. For example, the moving image coding method and the moving image decoding method can be useful for storage, transmission, data communication, etc. using mobile telephones, DVD devices and personal computers.
List of reference signs
100 Moving image coding apparatus
101 Subtracting unit
102 Transformation unit
103 Quantization unit
104, 202 Inverse quantization unit
105, 203 Inverse transformation unit
106, 204 Adding unit
107, 205 Intra/inter prediction unit
108 Coding control unit
109, 207 Memory unit
Entropy coding unit
200 Moving image decoding apparatus
201 Entropy decoding unit
206 Decoding control unit
It is noted that, with regard to this date, the best method known to the applicant to carry out the present invention is that which is clear from the present description of the invention.

Claims (10)

CLAIMS Having described the invention as above, the content of the following claims is claimed as property:
1. A moving image coding method for coding images on a block-by-block basis, characterized in that it comprises: selectively adding, to a list, a motion vector of each of one or more corresponding blocks each of which is (i) a spatially adjacent block spatially adjacent to a current block in a current image to be coded or (ii) a corresponding temporally adjacent block temporally adjacent to the current block and included in an image that is not the current image; selecting a motion vector used to code the current block, from among the motion vectors in the list; and coding the current block using the selected motion vector, wherein the adding of the motion vector further includes: calculating a second motion vector by scaling a first motion vector of the corresponding temporally adjacent block; determining whether or not a magnitude of the second motion vector is within a predetermined fixed magnitude range; adding the second motion vector to the list, when the magnitude of the second motion vector is within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block; and adding a third motion vector to the list, when the magnitude of the second motion vector is not within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block, the third motion vector being generated by clipping the second motion vector to have a magnitude within the fixed magnitude range.
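The addition step of claim 1 (scale the temporal motion vector, then cut it into a fixed magnitude range before it enters the candidate list) can be sketched as below. The 16-bit range follows the bit precision of claim 4; the scaling ratio and all function names are illustrative assumptions, not part of the claimed method.

```python
# Sketch of the candidate-addition step of claim 1: the first motion
# vector of the temporally adjacent block is scaled into a second
# motion vector, which is cut (clipped) into a fixed magnitude range
# before being added to the list. The 16-bit range follows claim 4;
# the scaling ratio and function names are hypothetical.

INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1  # fixed magnitude range (16-bit precision)

def clip_component(v: int) -> int:
    """Cut one vector component so its magnitude lies in the fixed range."""
    return max(INT16_MIN, min(INT16_MAX, v))

def add_temporal_candidate(candidate_list, first_mv, scale_num, scale_den):
    """Scale the temporally adjacent block's motion vector and add it to
    the list, clipping it when it leaves the fixed magnitude range."""
    # second motion vector: scaling, e.g. by a ratio of temporal distances
    second_mv = tuple((c * scale_num) // scale_den for c in first_mv)
    if all(INT16_MIN <= c <= INT16_MAX for c in second_mv):
        candidate_list.append(second_mv)        # within range: add as-is
    else:
        # third motion vector: the second vector cut into the fixed range
        third_mv = tuple(clip_component(c) for c in second_mv)
        candidate_list.append(third_mv)
    return candidate_list

print(add_temporal_candidate([], (100, -200), 2, 1))  # [(200, -400)]
print(add_temporal_candidate([], (30000, 0), 4, 1))   # [(32767, 0)]
```

Keeping every listed candidate inside a fixed range is what allows the motion vectors to be handled at a fixed bit precision, which is the reduced-load property the description attributes to the invention.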
2. The moving image coding method according to claim 1, characterized in that the list is a merging candidate list that lists the motion vector of the corresponding block and specification information for specifying an image referenced by the corresponding block, in the adding, the specification information is added to the merging candidate list in addition to the motion vector of the corresponding block, in the selecting, a motion vector and specification information used for the coding of the current block are selected from among the motion vectors in the merging candidate list, and in the coding, the current block is coded by generating a predictive image of the current block using the selected motion vector and the selected specification information.
3. The moving image coding method according to claim 1 or 2, characterized in that the list is a motion vector predictor candidate list, and wherein the adding of the motion vector further includes: determining whether or not a magnitude of a fifth motion vector is within the fixed magnitude range, the fifth motion vector being calculated by performing a scaling process on a fourth motion vector of the corresponding spatially adjacent block; adding the fifth motion vector to the list, when the magnitude of the fifth motion vector is within the fixed magnitude range; and adding a sixth motion vector to the list, when the magnitude of the fifth motion vector is not within the fixed magnitude range, the sixth motion vector being generated by clipping the fifth motion vector to have a magnitude within the fixed magnitude range, in the selecting, a motion vector predictor used to code the current block is selected from the motion vector predictor candidate list, and in the coding, the coding of the current block is carried out, which includes coding a motion vector of the current block using the motion vector predictor selected in the selecting.
4. The moving image coding method according to any of claims 1 to 3, characterized in that the fixed magnitude range is determined based on a bit precision of a motion vector, and the bit precision is 16 bits.
5. A moving image decoding method for decoding images on a block-by-block basis, characterized in that it comprises: selectively adding, to a list, a motion vector of each of one or more corresponding blocks each of which is (i) a spatially adjacent block spatially adjacent to a current block in a current image to be decoded or (ii) a corresponding temporally adjacent block temporally adjacent to the current block and included in an image that is not the current image; selecting a motion vector used to decode the current block, from among the motion vectors in the list; and decoding the current block using the selected motion vector, wherein the adding of the motion vector further includes: calculating a second motion vector by scaling a first motion vector of the corresponding temporally adjacent block; determining whether or not a magnitude of the second motion vector is within a predetermined fixed magnitude range; adding the second motion vector to the list, when the magnitude of the second motion vector is within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block; and adding a third motion vector to the list, when the magnitude of the second motion vector is not within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block, the third motion vector being generated by clipping the second motion vector to have a magnitude within the fixed magnitude range.
6. The moving image decoding method according to claim 5, characterized in that the list is a merging candidate list that lists the motion vector of the corresponding block and specification information for specifying an image referenced by the corresponding block, in the adding, the specification information is added to the merging candidate list in addition to the motion vector of the corresponding block, in the selecting, a motion vector and specification information used for the decoding of the current block are selected from among the motion vectors in the merging candidate list, and in the decoding, the current block is decoded by generating a predictive image of the current block using the selected motion vector and the selected specification information.
7. The moving image decoding method according to claim 5 or 6, characterized in that the list is a motion vector predictor candidate list, and wherein the adding of the motion vector further includes: determining whether or not a magnitude of a fifth motion vector is within the fixed magnitude range, the fifth motion vector being calculated by performing a scaling process on a fourth motion vector of the corresponding spatially adjacent block; adding the fifth motion vector to the list, when the magnitude of the fifth motion vector is within the fixed magnitude range; and adding a sixth motion vector to the list, when the magnitude of the fifth motion vector is not within the fixed magnitude range, the sixth motion vector being generated by clipping the fifth motion vector to have a magnitude within the fixed magnitude range, in the selecting, a motion vector predictor used to decode the current block is selected from the motion vector predictor candidate list, and in the decoding, the decoding of the current block is carried out, which includes decoding a motion vector of the current block using the motion vector predictor selected in the selecting.
8. The moving image decoding method according to any of claims 5 to 7, characterized in that the fixed magnitude range is determined based on a bit precision of a motion vector, and the bit precision is 16 bits.
9. A moving image coding apparatus that codes images on a block-by-block basis, characterized in that it comprises: an addition unit configured to selectively add, to a list, a motion vector of each of one or more corresponding blocks each of which is (i) a spatially adjacent block spatially adjacent to a current block in a current image to be coded or (ii) a corresponding temporally adjacent block temporally adjacent to the current block and included in an image that is not the current image; a selection unit configured to select a motion vector used to code the current block, from among the motion vectors in the list; and a coding unit configured to code the current block using the selected motion vector, wherein the addition unit is configured to: calculate a second motion vector by scaling a first motion vector of the corresponding temporally adjacent block; determine whether or not a magnitude of the second motion vector is within a predetermined fixed magnitude range; add the second motion vector to the list, when the magnitude of the second motion vector is within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block; and add a third motion vector to the list, when the magnitude of the second motion vector is not within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block, the third motion vector being generated by clipping the second motion vector to have a magnitude within the fixed magnitude range.
10. A moving image decoding apparatus that decodes images on a block-by-block basis, characterized in that it comprises: an addition unit configured to selectively add, to a list, a motion vector of each of one or more corresponding blocks each of which is (i) a spatially adjacent block spatially adjacent to a current block in a current image to be decoded or (ii) a corresponding temporally adjacent block temporally adjacent to the current block and included in an image that is not the current image; a selection unit configured to select a motion vector used to decode the current block, from among the motion vectors in the list; and a decoding unit configured to decode the current block using the selected motion vector, wherein the addition unit is configured to: calculate a second motion vector by scaling a first motion vector of the corresponding temporally adjacent block; determine whether or not a magnitude of the second motion vector is within a predetermined fixed magnitude range; add the second motion vector to the list, when the magnitude of the second motion vector is within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block; and add a third motion vector to the list, when the magnitude of the second motion vector is not within the fixed magnitude range, as the motion vector of the corresponding temporally adjacent block, the third motion vector being generated by clipping the second motion vector to have a magnitude within the fixed magnitude range.
MX2013014733A 2011-12-16 2012-12-11 Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device. MX2013014733A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161576501P 2011-12-16 2011-12-16
PCT/JP2012/007895 WO2013088697A1 (en) 2011-12-16 2012-12-11 Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device

Publications (1)

Publication Number Publication Date
MX2013014733A true MX2013014733A (en) 2014-02-19

Family

ID=48612172

Family Applications (2)

Application Number Title Priority Date Filing Date
MX2015006553A MX340433B (en) 2011-12-16 2012-12-11 Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device.
MX2013014733A MX2013014733A (en) 2011-12-16 2012-12-11 Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device.

Family Applications Before (1)

Application Number Title Priority Date Filing Date
MX2015006553A MX340433B (en) 2011-12-16 2012-12-11 Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device.

Country Status (10)

Country Link
US (10) US8867620B2 (en)
EP (1) EP2793466A4 (en)
JP (4) JP5364219B1 (en)
KR (2) KR102211673B1 (en)
CN (1) CN103650507B (en)
CA (1) CA2841058C (en)
MX (2) MX340433B (en)
RU (1) RU2628226C2 (en)
TW (1) TWI581619B (en)
WO (1) WO2013088697A1 (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9762904B2 (en) 2011-12-22 2017-09-12 Qualcomm Incorporated Performing motion vector prediction for video coding
US9838685B2 (en) * 2012-06-15 2017-12-05 Google Technology Holdings LLC Method and apparatus for efficient slice header processing
US9392268B2 (en) * 2012-09-28 2016-07-12 Qualcomm Incorporated Using base layer motion information
US9300965B2 (en) * 2013-01-16 2016-03-29 Telefonaktiebolaget L M Ericsson (Publ) Decoder and encoder and methods for coding of a video sequence using least significant bits of picture order count
JP5983430B2 (en) * 2013-01-25 2016-08-31 富士通株式会社 Moving picture coding apparatus, moving picture coding method, moving picture decoding apparatus, and moving picture decoding method
US10009629B2 (en) 2013-10-11 2018-06-26 Sony Corporation Video coding system with search range and method of operation thereof
US10003818B2 (en) 2013-10-11 2018-06-19 Sony Corporation Video coding system with intra prediction mechanism and method of operation thereof
US9749642B2 (en) 2014-01-08 2017-08-29 Microsoft Technology Licensing, Llc Selection of motion vector precision
US9774881B2 (en) * 2014-01-08 2017-09-26 Microsoft Technology Licensing, Llc Representing motion vectors in an encoded bitstream
EP3085085A4 (en) * 2014-01-29 2017-11-08 MediaTek Inc. Method and apparatus for adaptive motion vector precision
KR102425577B1 (en) 2014-06-18 2022-07-26 삼성전자주식회사 Inter-layer video encoding method for compensating for luminance difference and device therefor, and video decoding method and device therefor
US10666936B2 (en) * 2015-12-17 2020-05-26 Samsung Electronics Co., Ltd. Video decoding method and video decoding apparatus using merge candidate list
CN108605138B (en) 2016-02-01 2022-08-09 Oppo广东移动通信有限公司 Predictive image generation device, moving image decoding device, and moving image encoding device
CN116567226A (en) * 2016-08-11 2023-08-08 Lx 半导体科技有限公司 Image encoding/decoding apparatus and image data transmitting apparatus
CN110651474B (en) 2017-05-18 2022-02-18 联发科技股份有限公司 Motion vector limiting method and device for video coding and decoding
US11272207B2 (en) 2017-06-12 2022-03-08 Futurewei Technologies, Inc. Selection and signaling of motion vector (MV) precisions
WO2020060351A1 (en) * 2018-09-21 2020-03-26 엘지전자 주식회사 Method and apparatus for deriving motion vector
WO2020084475A1 (en) 2018-10-22 2020-04-30 Beijing Bytedance Network Technology Co., Ltd. Utilization of refined motion vector
EP3857879A4 (en) * 2018-11-12 2022-03-16 Beijing Bytedance Network Technology Co., Ltd. Simplification of combined inter-intra prediction
KR20210091161A (en) 2018-11-20 2021-07-21 베이징 바이트댄스 네트워크 테크놀로지 컴퍼니, 리미티드 Difference calculation based on partial location
CN113170203A (en) * 2018-12-10 2021-07-23 夏普株式会社 System and method for signaling reference pictures in video coding
WO2020125754A1 (en) 2018-12-21 2020-06-25 Beijing Bytedance Network Technology Co., Ltd. Motion vector derivation using higher bit-depth precision
CN113170179A (en) * 2018-12-21 2021-07-23 松下电器(美国)知识产权公司 Encoding device, decoding device, encoding method, and decoding method
WO2020137850A1 (en) 2018-12-28 2020-07-02 株式会社Jvcケンウッド Dynamic-image encoding device, dynamic-image encoding method, dynamic-image encoding program, dynamic-image decoding device, dynamic-image decoding method, and dynamic-image decoding program
BR112021012418A2 (en) 2018-12-28 2021-09-08 Jvckenwood Corporation IMAGE ENCODING DEVICE, IMAGE ENCODING METHOD, IMAGE DECODING DEVICE AND IMAGE DECODING METHOD
US11089325B2 (en) * 2019-02-08 2021-08-10 Qualcomm Incorporated Constrained affine motion inheritance for video coding
KR102635518B1 (en) 2019-03-06 2024-02-07 베이징 바이트댄스 네트워크 테크놀로지 컴퍼니, 리미티드 Use of converted single prediction candidates
KR102641302B1 (en) * 2019-03-08 2024-02-27 가부시키가이샤 제이브이씨 켄우드 Moving picture coding device, moving picture coding method, moving picture coding program, moving picture decoding device, moving picture decoding method, moving picture decoding program, and bit stream
AR118250A1 (en) * 2019-03-08 2021-09-22 Jvckenwood Corp MOTION PICTURE CODING AND DECODING DEVICES, METHODS AND PROGRAMS

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2914448B2 (en) * 1997-06-25 1999-06-28 日本電信電話株式会社 Motion vector prediction encoding method and motion vector decoding method, prediction encoding device and decoding device, and recording medium recording motion vector prediction encoding program and decoding program
US7206346B2 (en) * 1997-06-25 2007-04-17 Nippon Telegraph And Telephone Corporation Motion vector predictive encoding method, motion vector decoding method, predictive encoding apparatus and decoding apparatus, and storage media storing motion vector predictive encoding and decoding programs
US6539059B1 (en) * 2000-03-02 2003-03-25 Sun Microsystems, Inc. Apparatus and method for efficiently scalable digital video decoding
US6965645B2 (en) * 2001-09-25 2005-11-15 Microsoft Corporation Content-based characterization of video frame sequences
CN1917641A (en) * 2002-01-18 2007-02-21 株式会社东芝 Video encoding method and apparatus and video decoding method and apparatus
KR100506864B1 (en) * 2002-10-04 2005-08-05 엘지전자 주식회사 Method of determining motion vector
US8064520B2 (en) 2003-09-07 2011-11-22 Microsoft Corporation Advanced bi-directional predictive coding of interlaced video
RU2368095C1 (en) * 2005-07-22 2009-09-20 Mitsubishi Electric Corporation Image coder and image decoder, image coding method and image decoding method, image coding program and image decoding program, and computer-readable recording medium on which the image coding program is recorded, and computer-readable recording medium on which the image decoding program is recorded
US9113194B2 (en) * 2007-12-19 2015-08-18 Arris Technology, Inc. Method and system for interleaving video and data for transmission over a network at a selected bit rate
JP5213964B2 (en) 2008-11-07 2013-06-19 三菱電機株式会社 Video encoding apparatus and video decoding apparatus
US8885728B2 (en) * 2009-10-13 2014-11-11 General Instrument Corporation Decoding apparatus for a set-top box
US9137544B2 (en) * 2010-11-29 2015-09-15 Mediatek Inc. Method and apparatus for derivation of mv/mvp candidate for inter/skip/merge modes
US9485517B2 (en) * 2011-04-20 2016-11-01 Qualcomm Incorporated Motion vector prediction with motion vectors from multiple views in multi-view video coding
US9762904B2 (en) * 2011-12-22 2017-09-12 Qualcomm Incorporated Performing motion vector prediction for video coding

Also Published As

Publication number Publication date
KR102072831B1 (en) 2020-02-03
RU2013158874A (en) 2016-02-10
US20170054985A1 (en) 2017-02-23
MX340433B (en) 2016-07-08
CN103650507B (en) 2018-01-02
US8885722B2 (en) 2014-11-11
CN103650507A (en) 2014-03-19
CA2841058A1 (en) 2013-06-20
US20140044188A1 (en) 2014-02-13
JP2014017845A (en) 2014-01-30
JP5364219B1 (en) 2013-12-11
US20230328254A1 (en) 2023-10-12
JP2014003704A (en) 2014-01-09
CA2841058C (en) 2021-03-23
US20150003534A1 (en) 2015-01-01
US20200389652A1 (en) 2020-12-10
EP2793466A4 (en) 2015-07-01
US10757418B2 (en) 2020-08-25
US9094682B2 (en) 2015-07-28
US20140341295A1 (en) 2014-11-20
US20150271519A1 (en) 2015-09-24
US12015782B2 (en) 2024-06-18
US11711521B2 (en) 2023-07-25
JP5432410B2 (en) 2014-03-05
KR102211673B1 (en) 2021-02-03
JP6210423B2 (en) 2017-10-11
JP2017063424A (en) 2017-03-30
JP6025061B2 (en) 2016-11-16
TW201340720A (en) 2013-10-01
US11356669B2 (en) 2022-06-07
KR20200015798A (en) 2020-02-12
KR20140110831A (en) 2014-09-17
TWI581619B (en) 2017-05-01
JPWO2013088697A1 (en) 2015-04-27
US20190253715A1 (en) 2019-08-15
EP2793466A1 (en) 2014-10-22
US20220377348A1 (en) 2022-11-24
RU2628226C2 (en) 2017-08-15
WO2013088697A1 (en) 2013-06-20
US8867620B2 (en) 2014-10-21
US20130177082A1 (en) 2013-07-11
US10321133B2 (en) 2019-06-11
US8917773B2 (en) 2014-12-23

Similar Documents

Publication Publication Date Title
AU2016202666B2 (en) Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
AU2018201049C1 (en) Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
MX2013014733A (en) Video image coding method, video image coding device, video image decoding method, video image decoding device and video image coding/decoding device.
US20230105128A1 (en) Video encoding method, video encoding apparatus, video decoding method, video decoding apparatus, and video encoding/decoding apparatus
AU2012329550B2 (en) Image encoding method, image decoding method, image encoding apparatus, and image decoding apparatus
AU2012260302B2 (en) Image coding method, image coding apparatus, image decoding method, image decoding apparatus, and image coding and decoding apparatus
AU2011353415B2 (en) Moving picture decoding method, moving picture coding method, moving picture decoding apparatus, moving picture coding apparatus, and moving picture coding and decoding apparatus
AU2012264031B2 (en) Image encoding method, image encoding device, image decoding method, image decoding device, and image encoding/decoding device
AU2012274765B2 (en) Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
AU2012294053B2 (en) Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus
AU2012277160B2 (en) Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding/decoding device
EP2782341B1 (en) Image encoding method, image decoding method, image encoding device, and image decoding device
EP2773111B1 (en) Image encoding method, image decoding method, image encoding device, and image decoding device
MX2013012124A (en) Video encoding method, video encoding device, video decoding method, video decoding device, and video encoding/decoding device.
MX2013010231A (en) Motion-video encoding method, motion-video encoding apparatus, motion-video decoding method, motion-video decoding apparatus, and motion-video encoding/decoding apparatus.
AU2017201384B2 (en) Moving picture coding method, moving picture decoding method, moving picture coding apparatus, moving picture decoding apparatus, and moving picture coding and decoding apparatus
AU2013273044B2 (en) Video image encoding method, video image encoding device, video image decoding method, and video image decoding device
MX2014003728A (en) Image encoding method, image encoding device, image decoding method, image decoding device, and image encoding/decoding device.
AU2013254214B2 (en) Encoding method, decoding method, encoding apparatus, decoding apparatus, and encoding and decoding apparatus
MX2013013909A (en) Image decoding method, image encoding method, image decoding device, image encoding device, and image encoding/decoding device.
MX2013001773A (en) Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, and image encoding/decoding apparatus.
MX2013009864A (en) Video image encoding method, video image decoding method, video image encoding device, video image decoding device, and video image encoding/decoding device.
MX2012012443A (en) Filtering mode for intra prediction inferred from statistics of surrounding blocks.
MX2013001652A (en) Image encoding method, image decoding method, memory management method, image encoding device, image decoding device, memory management device, and image encoding/decoding device.
MX2013012132A (en) Image encoding method, image encoding device, image decoding method, image decoding device, and image encoding/decoding device.

Legal Events

Date Code Title Description
GB Transfer of rights

Owner name: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AME

FG Grant or registration