WO2017051493A1 - Video encoding device and video decoding device - Google Patents

Video encoding device and video decoding device

Info

Publication number
WO2017051493A1
WO2017051493A1 PCT/JP2016/003322 JP2016003322W WO2017051493A1 WO 2017051493 A1 WO2017051493 A1 WO 2017051493A1 JP 2016003322 W JP2016003322 W JP 2016003322W WO 2017051493 A1 WO2017051493 A1 WO 2017051493A1
Authority
WO
WIPO (PCT)
Prior art keywords
pictures
picture
temporal
video
motion vector
Application number
PCT/JP2016/003322
Other languages
French (fr)
Japanese (ja)
Inventor
貴之 石田
慶一 蝶野
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 日本電気株式会社
Priority to JP2017541222A (patent JP6489227B2)
Publication of WO2017051493A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: using adaptive coding
    • H04N 19/102: characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103: Selection of coding mode or of prediction mode
    • H04N 19/114: Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
    • H04N 19/134: characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136: Incoming video signal characteristics or properties
    • H04N 19/137: Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N 19/139: Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N 19/146: Data rate or code amount at the encoder output
    • H04N 19/157: Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N 19/169: characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/177: the unit being a group of pictures [GOP]
    • H04N 19/50: using predictive coding
    • H04N 19/503: involving temporal prediction
    • H04N 19/51: Motion estimation or motion compensation
    • H04N 19/55: Motion estimation with spatial constraints, e.g. at image or region borders

Definitions

  • the present invention relates to a video encoding device, a video decoding device, a video system, a video encoding method, and a video encoding program based on an encoding method in which a video screen is divided and then compressed.
  • full HD (High Definition) video content of 1920 (horizontal) × 1080 (vertical) pixels is supplied.
  • 4K (3840 × 2160 pixel) high-definition video
  • commercial broadcasting of 8K (7680 × 4320 pixel) high-definition video
  • video signals are generally encoded on the transmission side based on the H.264/AVC (Advanced Video Coding) standard or the HEVC (High Efficiency Video Coding) standard, and the video signal is reproduced on the reception side through a decoding process; in the case of 8K, however, the large number of pixels makes the processing load of both encoding and decoding high.
  • As a method for reducing the processing load in the case of 8K, there is, for example, the four-way screen division coding using slices described in Non-Patent Document 1 (see FIG. 10). As illustrated in FIG. 11, in Non-Patent Document 1, when four-slice coding is used and inter prediction is performed, blocks near a slice boundary are subject to the restriction that the vertical component of a motion vector for motion compensation (MC) must be 128 pixels or less. Blocks that do not lie near a slice boundary are not subject to this restriction on the vertical motion vector range across the slice boundary (hereinafter referred to as the motion vector restriction).
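As an illustration only (this sketch is not part of the patent text), the following Python fragment shows one way an encoder could apply this kind of restriction. The assumptions that the four slices are equal horizontal stripes, that "near the boundary" means within 128 pixels of it, and all function names are illustrative.

```python
# Illustrative sketch only (not from the patent): enforcing a vertical motion
# vector restriction for blocks near horizontal slice boundaries.  The margin,
# the stripe layout and every name here are assumptions made for illustration.

MV_LIMIT = 128          # vertical MV component limit (pixels) near a boundary
BOUNDARY_MARGIN = 128   # distance from a boundary within which a block counts as "near"


def slice_boundaries(frame_height, num_slices=4):
    """y coordinates of the internal slice boundaries (equal horizontal stripes assumed)."""
    slice_height = frame_height // num_slices
    return [slice_height * i for i in range(1, num_slices)]


def is_near_boundary(block_y, block_h, boundaries, margin=BOUNDARY_MARGIN):
    """True if the block overlaps the +/- margin band around any slice boundary."""
    return any(b - margin <= block_y + block_h and block_y <= b + margin
               for b in boundaries)


def mv_allowed(block_y, block_h, mv_y, boundaries):
    """Reject a candidate vertical MV component only for near-boundary blocks
    and only when it exceeds the 128-pixel restriction; other blocks are free."""
    if not is_near_boundary(block_y, block_h, boundaries):
        return True
    return abs(mv_y) <= MV_LIMIT


if __name__ == "__main__":
    bounds = slice_boundaries(4320)                 # 8K frame, four slices
    print(bounds)                                   # [1080, 2160, 3240]
    print(mv_allowed(1024, 64, 200, bounds))        # False: near boundary, MV too large
    print(mv_allowed(0, 64, 200, bounds))           # True: far from any boundary
```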
  • optimal motion vector means an original (normal) motion vector selected by a predictor that performs inter-screen prediction (inter prediction) processing in the video encoding device.
  • when the M value is small, the inter-frame distance is small, so motion vector values tend to be small.
  • code amount distribution according to the hierarchy layer
  • when the M value is large, the inter-frame distance is large, so motion vector values tend to be large.
  • when the number of temporal layers increases, the restriction on code amount distribution according to the layers is relaxed, so coding efficiency improves.
  • as an example, when the M value is changed from 8 to 4, motion vector values are roughly halved, and when the M value is changed from 4 to 8, they are roughly doubled.
  • in Non-Patent Document 1, the concept of an SOP (Set of Pictures) is introduced.
  • SOP is a unit for describing the coding order and reference relationship of each AU (Access Unit) when performing temporal direction hierarchical encoding.
  • the temporal direction hierarchical coding is coding that enables partial extraction of a frame from a plurality of frames of video.
  • Structure with L = 0: an SOP structure composed only of pictures with Temporal ID 0 (that is, the number of picture levels in the SOP is 1; L, which indicates the maximum Temporal ID, is 0).
  • Structure with L = 1: an SOP structure composed of pictures with Temporal ID 0 and pictures with Temporal ID 1 (that is, the number of picture levels in the SOP is 2; L indicating the maximum Temporal ID is 1).
  • Structure with L = 2: an SOP structure composed of pictures with Temporal ID 0, 1, and 2 (that is, the number of picture levels in the SOP is 3; L indicating the maximum Temporal ID is 2).
  • Structure with L = 3: an SOP structure composed of pictures with Temporal ID 0, 1, 2, and 3 (that is, the number of picture levels in the SOP is 4; L indicating the maximum Temporal ID is 3).
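For reference, a small illustrative mapping (not part of the patent text itself) summarizing these four SOP structures together with the M values this specification associates with them (M = 1 for L = 0, M = 2 or 3 for L = 1, M = 4 for L = 2, M = 8 for L = 3):

```python
# Illustrative only: the four SOP structures and the M values the specification
# maps onto them.  The structure/M mapping follows the text; the code is a sketch.

SOP_STRUCTURES = {
    0: {"temporal_ids": (0,),         "levels": 1, "m_values": (1,)},
    1: {"temporal_ids": (0, 1),       "levels": 2, "m_values": (2, 3)},
    2: {"temporal_ids": (0, 1, 2),    "levels": 3, "m_values": (4,)},
    3: {"temporal_ids": (0, 1, 2, 3), "levels": 4, "m_values": (8,)},
}


def l_for_m(m_value):
    """Return the L value (maximum Temporal ID) whose SOP structure corresponds to M."""
    for l_value, info in SOP_STRUCTURES.items():
        if m_value in info["m_values"]:
            return l_value
    raise ValueError(f"no SOP structure listed for M = {m_value}")


if __name__ == "__main__":
    for m in (1, 2, 3, 4, 8):
        l_value = l_for_m(m)
        print(f"M = {m} -> L = {l_value} ({SOP_STRUCTURES[l_value]['levels']} picture levels)")
```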
  • when the M value is increased, motion vector values tend to increase. Therefore, particularly in scenes in which an object on the screen, or the entire screen, moves fast in the vertical direction, image quality deteriorates because of the motion vector restriction: an optimal motion vector may not be selectable at a slice boundary.
  • the present invention concerns an encoding method that divides a video screen and then compresses it, and its object is to suppress image quality deterioration when an encoding method that restricts motion vector selection in the vicinity of slice boundaries is used.
  • a video encoding device according to the present invention divides a video into a predetermined number of slices and performs encoding under a motion vector restriction near slice boundaries. It includes: analysis means for analyzing coding statistical information; estimation means for estimating, based on the analysis result of the analysis means, whether an optimal motion vector can be selected near a slice boundary; and coding structure determination means for adaptively setting the coding structure, based on the estimation result of the estimation means, to one of an SOP structure composed only of pictures with Temporal ID 0, an SOP structure composed of pictures with Temporal ID 0 and 1, an SOP structure composed of pictures with Temporal ID 0, 1, and 2, and an SOP structure composed of pictures with Temporal ID 0, 1, 2, and 3.
  • a video decoding device according to the present invention includes decoding means for decoding video encoded with any one of an SOP structure composed only of pictures with Temporal ID 0, an SOP structure composed of pictures with Temporal ID 0 and 1, an SOP structure composed of pictures with Temporal ID 0, 1, and 2, and an SOP structure composed of pictures with Temporal ID 0, 1, 2, and 3.
  • a video encoding method according to the present invention divides a video into a predetermined number of slices and performs encoding under a motion vector restriction near slice boundaries; coding statistical information is analyzed, whether an optimal motion vector can be selected near a slice boundary is estimated based on the analysis result, and the coding structure is adaptively set, based on the estimation result, to one of the four SOP structures listed above.
  • a video decoding method according to the present invention decodes video encoded with any one of the four SOP structures listed above.
  • a video encoding program according to the present invention causes a computer to execute a process of analyzing coding statistical information, a process of estimating, based on the analysis result, whether an optimal motion vector can be selected near a slice boundary, and a process of adaptively setting the coding structure, based on the estimation result, to one of the four SOP structures listed above.
  • a video decoding program according to the present invention causes a computer to execute a process of decoding video encoded with any one of the four SOP structures listed above.
  • according to the present invention, image quality deterioration can be suppressed.
  • FIG. 1 is a block diagram illustrating a configuration example of an embodiment of a video encoding device.
  • a video encoding apparatus 100 illustrated in FIG. 1 includes an encoding unit 101, an analysis unit 111, a determination unit 112, and an M value determination unit 113. Note that the video encoding apparatus 100 executes the encoding process based on the HEVC standard, but may execute the encoding process based on another standard, for example, the H.264 / AVC standard. Hereinafter, an example in which 8K video is input will be described.
  • the encoding unit 101 includes a screen divider 102 that divides an input image into a plurality of screens, a frequency transformer/quantizer 103, an inverse quantizer/inverse frequency transformer 104, a buffer 105, a predictor 106, and an entropy encoder 107.
  • the screen divider 102 divides the input video screen into four screens (see FIG. 10).
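A minimal sketch of this four-way division (not from the patent; FIG. 10 is not reproduced here, so the assumption of equal horizontal stripes is illustrative):

```python
# Illustrative sketch of the four-way screen division performed by the screen
# divider 102.  Equal horizontal stripes are assumed for illustration only.

def divide_into_slices(width, height, num_slices=4):
    """Return (x, y, w, h) rectangles for equal horizontal slice stripes."""
    slice_height = height // num_slices
    return [(0, i * slice_height, width, slice_height) for i in range(num_slices)]


if __name__ == "__main__":
    for rect in divide_into_slices(7680, 4320):    # 8K input
        print(rect)
    # (0, 0, 7680, 1080), (0, 1080, 7680, 1080), (0, 2160, 7680, 1080), (0, 3240, 7680, 1080)
```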
  • the frequency converter / quantizer 103 performs frequency conversion on the prediction error image obtained by subtracting the prediction signal from the input video signal.
  • the frequency transformer / quantizer 103 further quantizes the frequency-converted prediction error image (frequency transform coefficient).
  • the quantized frequency transform coefficient is referred to as a transform quantization value.
  • the entropy encoder 107 entropy encodes the prediction parameter and the transform quantization value, and outputs a bit stream.
  • the prediction parameters are information related to CTU (Coding Tree Unit) and block prediction, such as the prediction mode (intra prediction or inter prediction), the intra prediction block size, the intra prediction direction, the inter prediction block size, and motion vectors.
  • the predictor 106 generates a prediction signal for the input video signal.
  • the prediction signal is generated based on intra prediction or inter-frame prediction.
  • the inverse quantization / inverse frequency converter 104 inversely quantizes the transform quantization value. Further, the inverse quantization / inverse frequency converter 104 performs inverse frequency conversion on the inversely quantized frequency conversion coefficient.
  • the prediction signal is added to the inverse-frequency-transformed reconstructed prediction error image, and the result is supplied to the buffer 105.
  • the buffer 105 stores the reconstructed image.
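As a toy illustration of this local reconstruction path (inverse quantization, inverse transform, prediction add, storage), under the simplifying assumptions of a scalar quantizer and an identity stand-in for the inverse transform:

```python
# Toy illustration only: the local reconstruction path 103 -> 104 -> buffer 105.
# Real HEVC uses block transforms; plain lists and an identity transform are
# used here purely to show the order of operations.

def reconstruct(transform_quantized, qstep, prediction):
    """Inverse quantization, stand-in inverse transform, then prediction add."""
    dequantized = [c * qstep for c in transform_quantized]    # inverse quantization (104)
    residual = dequantized                                    # identity stand-in for the inverse transform
    return [p + r for p, r in zip(prediction, residual)]      # reconstructed samples go to buffer 105


if __name__ == "__main__":
    print(reconstruct([2, -1, 0, 3], qstep=4, prediction=[100, 100, 100, 100]))
    # [108, 96, 100, 112]
```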
  • the analysis unit 111 analyzes the encoded statistical information. Based on the analysis result of the analysis unit 111, the determination unit 112 determines whether or not an optimal motion vector can be selected near the slice boundary with the above-described motion vector restriction.
  • the encoding statistical information is information on the encoding result of a past frame (for example, a frame immediately before the current encoding target frame), and a specific example of the encoding statistical information will be described later.
  • strictly speaking, the vicinity of the slice boundary is the area in which an optimal motion vector could not be selected; however, for convenience when implementing the control described below, a range of, for example, ±128 pixels or ±256 pixels from the slice boundary may be treated as the vicinity of the slice boundary.
  • the range of “near the slice boundary” may be appropriately changed according to the state of the video (such as large / small motion). For example, when the generation ratio of a motion vector having a large value is high, the range “near the slice boundary” may be set wide.
  • FIG. 2 is a block diagram illustrating a configuration example of an embodiment of the video decoding apparatus.
  • the video decoding apparatus 200 shown in FIG. 2 includes an entropy decoder 202, an inverse quantization / inverse frequency converter 203, a predictor 204, and a buffer 205.
  • the entropy decoder 202 entropy decodes the video bitstream.
  • the entropy decoder 202 supplies the transform quantization value subjected to entropy decoding to the inverse quantization / inverse frequency converter 203.
  • the inverse quantization / inverse frequency converter 203 obtains a frequency conversion coefficient by inversely quantizing the converted quantization values of luminance and chrominance with a quantization step width. Further, the inverse quantization / inverse frequency converter 203 performs inverse frequency conversion on the inversely quantized frequency conversion coefficient.
  • after the inverse frequency transform, the predictor 204 generates a prediction signal using the image of a reconstructed picture stored in the buffer 205 (this prediction is also called motion-compensated prediction, or MC reference).
  • the reconstructed prediction error image subjected to inverse frequency conversion by the inverse quantization / inverse frequency converter 203 is added with the prediction signal supplied from the predictor 204 and supplied to the buffer 205 as a reconstructed picture. Then, the reconstructed picture stored in the buffer 205 is output as decoded video.
  • FIG. 3 is a flowchart showing the operation of the first embodiment of the video encoding apparatus 100 shown in FIG. 1.
  • in the first embodiment, an 8K video is divided into four slices (see FIG. 10) and the motion vector restriction applies near the slice boundaries; ±128 is taken as an example of the restriction. The four-way division of the 8K video and the motion vector restriction are the same in the other embodiments. The initial value of the M value is 8 (M = 8).
  • the analysis unit 111 analyzes past encoding results stored in the buffer 105 (for example, the encoding result of the immediately preceding frame). Specifically, the analysis unit 111 calculates the average or median value of the motion vectors in the blocks other than those near the slice boundary (hereinafter, this average or median is referred to as Mavg) (step S101).
  • in the first embodiment, the coding statistical information is the motion vector values, and the analysis result is the average or median value of the motion vectors (Mavg).
  • the determination unit 112 determines how large Mavg is, taking ±128 (the motion vector restriction) as the reference (step S102).
  • the M value determination unit 113 determines the M value based on this determination result (step S103), for example as follows.
  • when the M value is currently set to 8 and, as in the cases referred to as (1) and (2), it is estimated that motion vector values near the slice boundary will stay within ±128, the M value determination unit 113 keeps (or returns) the M value at 8. In other words, the M value determination unit 113 sets the M value to 8 when it can be estimated that an optimal motion vector can be selected near the slice boundary under the motion vector restriction. In the other cases, the M value is determined in accordance with Mavg so that motion vector values near the slice boundary fall within ±128.
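The following sketch illustrates this S101 to S103 flow. The patent's own threshold table (the cases (1), (2), and so on) is not reproduced in this text, so the thresholds and the proportional-scaling rule below are illustrative placeholders; only the overall behavior, keeping M = 8 when near-boundary motion vectors are expected to stay within ±128 and shrinking M in accordance with Mavg otherwise, follows the description.

```python
# Sketch of the first embodiment's decision flow (steps S101-S103).  Thresholds
# are purely illustrative; only the overall rule (keep M = 8 when motion vectors
# near the boundary are expected to stay within +/-128, shrink M otherwise, with
# MV magnitude scaling roughly in proportion to M) follows the text.

from statistics import median

MV_LIMIT = 128


def analyze_mavg(motion_vectors_outside_boundary, use_median=False):
    """Step S101: average (or median) vertical MV magnitude in blocks away
    from the slice-boundary vicinity of the previously encoded frame."""
    magnitudes = [abs(v) for v in motion_vectors_outside_boundary]
    if not magnitudes:
        return 0.0
    return median(magnitudes) if use_median else sum(magnitudes) / len(magnitudes)


def decide_m_value(mavg, current_m=8):
    """Steps S102-S103: pick M so that MVs near the boundary can be expected
    to fit within +/-128 (MV values scale roughly with M)."""
    if mavg <= MV_LIMIT:
        return 8                         # optimal MVs should still be selectable
    for candidate in (4, 3, 2, 1):       # shrink M until the expected MV fits
        if mavg * candidate / current_m <= MV_LIMIT:
            return candidate
    return 1


if __name__ == "__main__":
    calm_scene_mvs = [30, -45, 60, 120, -90]
    fast_scene_mvs = [250, -300, 280, 310, -260]
    print(decide_m_value(analyze_mavg(calm_scene_mvs)))   # 8
    print(decide_m_value(analyze_mavg(fast_scene_mvs)))   # 3 (a smaller M)
```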
  • the above threshold setting is merely an example; the thresholds may be changed, or a finer case division may be used.
  • the control of the video encoding device of the first embodiment is based on the following concept.
  • the determination unit 112 uses the motion vectors generated in regions away from the slice boundary (where the motion vector restriction does not apply, and thus normal, in other words optimal, motion vectors are obtained) as coding statistical information to estimate whether the picture being coded belongs to a fast-moving scene. When the determination unit 112 estimates that the video is a fast-moving scene, the M value determination unit 113 changes the M value so that an optimal motion vector can be selected in the vicinity of the slice boundary.
  • estimating that the scene is fast-moving is equivalent to estimating that an optimal motion vector may not be selected in the vicinity of the slice boundary.
  • FIG. 4 is a flowchart showing the operation of the second embodiment of the video encoding apparatus 100 shown in FIG. 1.
  • the analysis unit 111 analyzes past encoding results stored in the buffer 105 (for example, the encoding result of the immediately preceding frame). Specifically, among all blocks (for example, PUs: Prediction Units) outside the slice boundary vicinity, the analysis unit 111 calculates the ratio P1 of blocks for which intra-picture prediction (intra prediction) was used (step S201), and among all blocks in the vicinity of a slice boundary, it calculates the ratio P2 of blocks for which intra prediction was used (step S202).
  • in the second embodiment, the coding statistical information is the prediction mode of the blocks near a slice boundary (specifically, the number of blocks using intra-picture prediction), and the analysis results are the ratio P1 and the ratio P2.
  • the determination unit 112 compares the ratio P1 and the ratio P2 and determines the degree of deviation between them. Specifically, it determines whether the ratio P2 is considerably larger than the ratio P1, for example by checking whether the difference between P2 and P1 exceeds a predetermined value (step S203).
  • when the difference between the ratio P2 and the ratio P1 exceeds the predetermined value, the M value determination unit 113 decreases the M value (step S204).
  • a plurality of predetermined values may be provided. For example, when the difference exceeds a first predetermined value, the M value is decreased by several levels, and when the difference exceeds a second predetermined value (< first predetermined value), the M value is decreased by one level.
  • when the difference between the ratio P2 and the ratio P1 is equal to or less than the predetermined value, the M value determination unit 113 either maintains or increases the M value (step S205). For example, the M value determination unit 113 increases the M value when the difference is equal to or smaller than a third predetermined value (< second predetermined value), and maintains the M value when the difference exceeds the third predetermined value.
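A sketch of this S201 to S205 flow follows. The concrete thresholds T1 > T2 > T3 and the step sizes are illustrative assumptions; the text only states that M is decreased (by one or several steps) when P2 exceeds P1 by more than a predetermined value, and maintained or increased otherwise.

```python
# Sketch of the second embodiment (steps S201-S205).  Thresholds and step sizes
# are illustrative placeholders, not values taken from the patent.

def intra_ratio(prediction_modes):
    """Fraction of blocks (e.g. PUs) coded with intra prediction."""
    if not prediction_modes:
        return 0.0
    return sum(1 for m in prediction_modes if m == "intra") / len(prediction_modes)


def decide_m_value(p1, p2, current_m, t1=0.30, t2=0.15, t3=0.05):
    """p1: intra ratio away from slice boundaries, p2: intra ratio near them."""
    diff = p2 - p1
    if diff > t1:                        # large divergence: drop M several steps
        return max(1, current_m // 4)
    if diff > t2:                        # moderate divergence: drop M one step
        return max(1, current_m // 2)
    if diff <= t3:                       # boundary behaves like the rest: raise M
        return min(8, current_m * 2)
    return current_m                     # otherwise keep the current M


if __name__ == "__main__":
    non_boundary_modes = ["inter"] * 90 + ["intra"] * 10    # P1 = 0.10
    boundary_modes     = ["inter"] * 40 + ["intra"] * 60    # P2 = 0.60
    print(decide_m_value(intra_ratio(non_boundary_modes),
                         intra_ratio(boundary_modes), current_m=8))   # 2
```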
  • the control of the video encoding device of the second embodiment is based on the following concept.
  • the encoding unit 101 can use either intra-picture prediction (intra prediction) or inter-picture prediction (inter prediction) as the prediction mode when encoding each block in the picture.
  • when the video shows a scene in which the entire picture moves fast, the occurrence rate of large-valued motion vectors under inter-picture prediction is considered to be high even near the slice boundary (if the motion vector restriction did not apply).
  • because the optimal (large) motion vector cannot be used there under the restriction, intra prediction tends to be used often near the slice boundary, whereas away from the slice boundary intra prediction is used less than in the boundary vicinity.
  • therefore, the possibility that an optimal motion vector cannot be selected near the slice boundary corresponds to the ratio P1 and the ratio P2 deviating greatly from each other.
  • as the predetermined value for judging whether the deviation is large, a threshold is chosen empirically or experimentally such that, when it is exceeded, it can be estimated that an optimal motion vector may not be selected near the slice boundary.
  • FIG. 5 is a flowchart showing the operation of the third embodiment of the video encoding apparatus 100 shown in FIG. 1.
  • the analysis unit 111 analyzes past encoding results stored in the buffer 105 (for example, the encoding result of the immediately preceding frame). Specifically, the analysis unit 111 calculates the generated code amount C1 of the blocks near the slice boundary in an earlier frame (for example, the frame two frames before the current encoding target frame) (step S301), and the generated code amount C2 of the blocks near the slice boundary in the immediately preceding frame (step S302). In the third embodiment, the coding statistical information is the generated code amount of blocks near the slice boundary, and the analysis results are the generated code amounts C1 and C2.
  • the determination unit 112 compares the generated code amount C1 and the generated code amount C2 and determines the degree of deviation between them. Specifically, it determines whether C2 is considerably larger than C1, for example by checking whether the difference between C2 and C1 exceeds a predetermined amount (step S303).
  • when the difference between C2 and C1 exceeds the predetermined amount, the M value determination unit 113 decreases the M value (step S304).
  • a plurality of predetermined amounts may be provided. For example, when the difference exceeds a first predetermined amount, the M value is decreased by several levels, and when the difference exceeds a second predetermined amount (< first predetermined amount), the M value is decreased by one level.
  • when the difference between C2 and C1 is equal to or less than the predetermined amount, the M value determination unit 113 either maintains or increases the M value (step S305). For example, the M value determination unit 113 increases the M value when the difference is equal to or smaller than a third predetermined amount (< second predetermined amount), and maintains the M value when the difference exceeds the third predetermined amount.
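A sketch of this S301 to S305 flow, analogous to the second embodiment but driven by generated code amounts; the byte thresholds and step sizes are illustrative assumptions, not values from the patent.

```python
# Sketch of the third embodiment (steps S301-S305).  C1 is the code amount of
# the boundary-vicinity blocks in an earlier frame, C2 the same quantity in the
# immediately preceding frame.  Thresholds are illustrative placeholders.

def boundary_code_amount(block_sizes_near_boundary):
    """Generated code amount (e.g. in bytes) of blocks near slice boundaries."""
    return sum(block_sizes_near_boundary)


def decide_m_value(c1, c2, current_m, a1=50_000, a2=20_000, a3=5_000):
    """Decrease M when the boundary code amount jumps (a sign that intra blocks
    are replacing the restricted inter blocks); otherwise keep or raise M."""
    diff = c2 - c1
    if diff > a1:                        # large jump: decrease M by several steps
        return max(1, current_m // 4)
    if diff > a2:                        # moderate jump: decrease M by one step
        return max(1, current_m // 2)
    if diff <= a3:                       # stable boundary cost: raise M
        return min(8, current_m * 2)
    return current_m


if __name__ == "__main__":
    c1 = boundary_code_amount([1200] * 40)   # 48 000 bytes, earlier frame
    c2 = boundary_code_amount([2800] * 40)   # 112 000 bytes, previous frame
    print(decide_m_value(c1, c2, current_m=8))   # difference 64 000 -> M drops to 2
```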
  • the control of the video encoding device of the third embodiment is based on the following concept.
  • when the video shows a scene in which the entire picture moves fast, the occurrence rate of large-valued motion vectors under inter-picture prediction is considered to be high even near the slice boundary (if the motion vector restriction did not apply).
  • because the optimal (large) motion vector cannot be used there under the restriction, intra prediction tends to be used often near the slice boundary, and in general the generated code amount is larger when intra prediction is used than when inter prediction is used.
  • therefore, the possibility that an optimal motion vector cannot be selected near the slice boundary corresponds to the generated code amount C2 increasing greatly.
  • as the predetermined amount used for the determination, a threshold is chosen empirically or experimentally such that, when it is exceeded, it can be estimated that an optimal motion vector may not be selected near the slice boundary.
  • in each of the above embodiments, the M value is adaptively switched based on past encoding results (coding statistical information). Based on the coding statistical information, it is estimated whether an optimal motion vector (in other words, one that may lie outside the motion vector restriction) can be selected near the slice boundary under the restriction. If it is estimated that it cannot be selected, the M value is changed to a smaller value. If it is estimated that it can be selected, an optimal motion vector is considered selectable near the slice boundary under the restriction even with the current M value, so the M value is maintained or changed to a larger value.
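Putting the pieces together, a high-level sketch of this adaptive switching loop (illustrative only; the three callables stand in for the analysis unit 111, the determination unit 112, and the M value determination unit 113, whose concrete behavior would follow one of the embodiments above):

```python
# Illustrative orchestration of the adaptive M switching described above.

from typing import Callable


def adapt_m_per_frame(frames, analyze: Callable, estimate: Callable,
                      decide: Callable, initial_m: int = 8):
    """Yield the M value chosen before encoding each frame, based on the
    statistics collected while encoding the previous frame."""
    m = initial_m
    stats = None
    for frame in frames:
        if stats is not None:                    # nothing to analyse before frame 0
            result = analyze(stats)              # e.g. Mavg, P1/P2 or C1/C2
            if estimate(result, m):              # optimal MV selectable near boundary?
                m = initial_m                    # keep or restore the large M
            else:
                m = decide(result, m)            # otherwise shrink M
        yield m
        stats = encode_frame_stub(frame, m)      # encode and collect statistics


def encode_frame_stub(frame, m):
    """Placeholder for the real encoder; returns per-frame statistics."""
    return {"frame": frame, "m": m}


if __name__ == "__main__":
    ms = adapt_m_per_frame(
        frames=range(4),
        analyze=lambda stats: 0.0,               # stand-in analysis result
        estimate=lambda result, m: True,         # always "selectable" in this stub
        decide=lambda result, m: max(1, m // 2),
    )
    print(list(ms))                              # [8, 8, 8, 8]
```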
  • the coding statistical information may also be obtained by pre-analysis, that is, analysis processing executed as pre-processing when encoding the current frame.
  • the analysis unit 111, the determination unit 112, and the M value determination unit 113 may be configured so that any two, or all, of the first to third embodiments are incorporated.
  • the video decoding apparatus shown in FIG. 2 decodes a bitstream that was encoded using an M value set, as exemplified in the first to third embodiments, within a range that satisfies the motion vector restriction.
  • FIG. 6 is a block diagram illustrating an example of a video system.
  • the video system shown in FIG. 6 is a system in which the video encoding device 100 of each of the above embodiments and the video decoding device 200 shown in FIG. 2 are connected by a wireless transmission line or a wired transmission line 300.
  • the video encoding device 100 is the video encoding device 100 of any one of the first to third embodiments described above; alternatively, its analysis unit 111, determination unit 112, and M value determination unit 113 may be configured to execute the processes of any two, or all, of the first to third embodiments.
  • although each of the above embodiments can be configured by hardware, it can also be realized by a computer program.
  • the information processing system shown in FIG. 7 includes a processor 1001, a program memory 1002, a storage medium 1003 for storing video data, and a storage medium 1004 for storing a bitstream.
  • the storage medium 1003 and the storage medium 1004 may be separate storage media, or may be storage areas composed of the same storage medium.
  • a magnetic storage medium such as a hard disk can be used as the storage medium.
  • the program memory 1002 stores a program (a video encoding program or a video decoding program) for realizing the functions of the blocks (except the buffer blocks) shown in FIG. 1 and FIG. 2.
  • the processor 1001 realizes the functions of the video encoding device or the video decoding device shown in FIG. 1 or FIG. 2 by executing processing according to the program stored in the program memory 1002.
  • FIG. 8 is a block diagram showing the main part of the video encoding device.
  • the video encoding device 10 includes an analysis unit 11 (corresponding to the analysis unit 111 in the embodiments) that analyzes coding statistical information, an estimation unit 12 (realized by the determination unit 112 in the embodiments) that estimates, based on the analysis result of the analysis unit 11, whether an optimal motion vector can be selected near a slice boundary, and a coding structure determination unit 13 (realized by the M value determination unit 113 in the embodiments) that adaptively determines the coding structure based on the estimation result of the estimation unit 12.
  • FIG. 9 is a block diagram showing the main part of the video decoding apparatus.
  • the video decoding apparatus 20 decodes a bitstream encoded based on an encoding structure that is set so that an optimal motion vector can be selected near a slice boundary under motion vector restriction.
  • for this purpose, the video decoding apparatus 20 includes a decoding unit 21.
  • the decoding unit 21 can decode a bitstream whose coding structure is set to any of an SOP structure composed only of pictures with Temporal ID 0, an SOP structure composed of pictures with Temporal ID 0 and 1, an SOP structure composed of pictures with Temporal ID 0, 1, and 2, and an SOP structure composed of pictures with Temporal ID 0, 1, 2, and 3.
  • the decoding unit 21 can also decode an encoded bitstream in which the picture is divided into four slices as shown in FIG. 10 and in which, when a PU of one slice refers to another slice by motion compensation (MC) as shown in FIG. 11, the MC reference of that PU across the slice boundary is limited to pixels within 128 lines from the slice boundary.
  • furthermore, the following SOP structures, as shown in FIG. 12, can be used on the video encoding side and the video decoding side.
  • Structure with L = 2: an SOP structure composed of pictures with Temporal ID 0, 1, and 2 (or M) (that is, the number of picture levels in the SOP is 3; L, which indicates the maximum Temporal ID, is 2 (or M)).
  • Structure with L = 4: an SOP structure composed of pictures with Temporal ID 0, 1, 2, 3, and 4 (or M) (that is, the number of picture levels in the SOP is 5; L, which indicates the maximum Temporal ID, is 4 (or M)).
  • Reference signs: 10 Video encoding device; 11 Analysis unit; 12 Estimation unit; 13 Coding structure determination unit; 20 Video decoding device; 21 Decoding unit; 100 Video encoding device; 101 Encoding unit; 102 Screen divider; 103 Frequency transformer/quantizer; 104 Inverse quantizer/inverse frequency transformer; 105 Buffer; 106 Predictor; 107 Entropy encoder; 111 Analysis unit; 112 Determination unit; 113 M value determination unit; 200 Video decoding device; 202 Entropy decoder; 203 Inverse quantizer/inverse frequency transformer; 204 Predictor; 205 Buffer; 1001 Processor; 1002 Program memory; 1003, 1004 Storage medium

Abstract

A video encoding device equipped with: an analysis means that analyzes encoding statistics information; an estimation means that, on the basis of the analysis result from the analysis means, estimates whether it is possible to select an optimal motion vector near a slice boundary; and an encoding structure determination means that, on the basis of the estimation result from the estimation means, adaptively sets the encoding structure to one of the following: an SOP structure formed only with pictures for which the temporal ID is 0; an SOP structure formed with pictures for which the temporal ID is 0 and pictures for which the temporal ID is 1; an SOP structure formed with pictures for which the temporal ID is 0, 1, and 2; or an SOP structure formed with pictures for which the temporal ID is 0, 1, 2, and 3.

Description

Video encoding device and video decoding device
 The present invention relates to a video encoding device, a video decoding device, a video system, a video encoding method, and a video encoding program based on an encoding method in which a video screen is divided and then compressed.
 In response to the demand for higher-definition video, full HD (High Definition) video content of 1920 (horizontal) × 1080 (vertical) pixels is supplied. In addition, test broadcasts and commercial broadcasts of 3840 × 2160 pixel high-definition video (hereinafter, 4K) have started. Furthermore, commercial broadcasting of 7680 × 4320 pixel high-definition video (hereinafter, 8K) is planned.
 In video content distribution systems, video signals are generally encoded on the transmission side based on the H.264/AVC (Advanced Video Coding) standard or the HEVC (High Efficiency Video Coding) standard, and the video signal is reproduced on the reception side through a decoding process. In the case of 8K, however, the large number of pixels makes the processing load of both encoding and decoding high.
 As a method for reducing the processing load in the case of 8K, there is, for example, the four-way screen division coding using slices described in Non-Patent Document 1 (see FIG. 10). As shown in FIG. 11, in Non-Patent Document 1, when four-slice coding is used and inter prediction is performed, blocks near a slice boundary are subject to the restriction that the vertical component of a motion vector for motion compensation (MC) must be 128 pixels or less. Blocks that do not lie near a slice boundary are not subject to this restriction on the vertical motion vector range across the slice boundary (hereinafter, the motion vector restriction).
 When the motion vector restriction applies and a scene in which an object on the screen, or the entire screen, moves fast in the vertical direction is encoded, an optimal motion vector may not be selectable at a slice boundary. As a result, local image quality degradation may occur. For fast motion, the degree of degradation increases as the M value increases. The M value is the reference picture interval. Note that "optimal motion vector" means the original (normal) motion vector that the predictor performing inter-picture prediction (inter prediction) in the video encoding device would select.
 FIG. 13 illustrates the reference picture intervals for M = 4 and M = 8. In general, when the M value is small, the inter-frame distance is small, so motion vector values tend to be small; however, particularly in a stationary scene, the number of temporal layers decreases and the code amount distribution according to the layers is constrained, so coding efficiency falls. Conversely, when the M value is large, the inter-frame distance is large, so motion vector values tend to be large; however, particularly in a stationary scene, the number of temporal layers increases and the constraint on code amount distribution according to the layers is relaxed, so coding efficiency improves. As an example, when the M value is changed from 8 to 4, motion vector values are roughly halved, and when the M value is changed from 4 to 8, they are roughly doubled.
 In Non-Patent Document 1, the concept of an SOP (Set of Pictures) is introduced. An SOP is the unit that describes the coding order and reference relationship of each AU (Access Unit) when temporal hierarchical coding is performed. Temporal hierarchical coding is coding that allows frames to be partially extracted from a multi-frame video.
 The SOP structure includes a structure with L = 0, a structure with L = 1, a structure with L = 2, and a structure with L = 3. As shown in FIG. 14, Lx (x = 0, 1, 2, 3) is defined as follows.
・Structure with L = 0: an SOP structure composed only of pictures with Temporal ID 0 (the number of picture levels in the SOP is 1; L, which indicates the maximum Temporal ID, is 0).
・Structure with L = 1: an SOP structure composed of pictures with Temporal ID 0 and 1 (the number of picture levels in the SOP is 2; L indicating the maximum Temporal ID is 1).
・Structure with L = 2: an SOP structure composed of pictures with Temporal ID 0, 1, and 2 (the number of picture levels in the SOP is 3; L indicating the maximum Temporal ID is 2).
・Structure with L = 3: an SOP structure composed of pictures with Temporal ID 0, 1, 2, and 3 (the number of picture levels in the SOP is 4; L indicating the maximum Temporal ID is 3).
 In the description of this specification, M = 1 corresponds to an SOP with the L = 0 structure, M = 2 corresponds to an SOP with the L = 1 structure when N = 1 (see FIG. 14), M = 3 corresponds to an SOP with the L = 1 structure when N = 2 (see FIG. 14), M = 4 corresponds to an SOP with the L = 2 structure, and M = 8 corresponds to an SOP with the L = 3 structure.
 For a stationary scene (for example, a scene in which objects on the screen, or the screen as a whole, do not move fast), coding efficiency is better the larger the reference picture interval (M value) is, as described above. Therefore, in order to encode high-definition video such as 8K at a low rate, it is preferable that the video encoding device basically operate at M = 8.
 However, as described above, increasing the M value tends to increase motion vector values, so image quality deteriorates because of the motion vector restriction, particularly in scenes in which an object on the screen, or the entire screen, moves fast in the vertical direction. This is because an optimal motion vector may not be selectable at a slice boundary under the motion vector restriction.
 The present invention concerns an encoding method that divides a video screen and then compresses it, and its object is to suppress image quality deterioration when an encoding method that restricts motion vector selection in the vicinity of slice boundaries is used.
 A video encoding device according to the present invention is a video encoding device that divides a video into a predetermined number of slices and performs encoding under a motion vector restriction near slice boundaries, and includes: analysis means for analyzing coding statistical information; estimation means for estimating, based on the analysis result of the analysis means, whether an optimal motion vector can be selected near a slice boundary; and coding structure determination means for adaptively setting the coding structure, based on the estimation result of the estimation means, to one of an SOP structure composed only of pictures with Temporal ID 0, an SOP structure composed of pictures with Temporal ID 0 and 1, an SOP structure composed of pictures with Temporal ID 0, 1, and 2, and an SOP structure composed of pictures with Temporal ID 0, 1, 2, and 3.
 A video decoding device according to the present invention includes decoding means for decoding video encoded with any one of an SOP structure composed only of pictures with Temporal ID 0, an SOP structure composed of pictures with Temporal ID 0 and 1, an SOP structure composed of pictures with Temporal ID 0, 1, and 2, and an SOP structure composed of pictures with Temporal ID 0, 1, 2, and 3.
 A video encoding method according to the present invention is a video encoding method that divides a video into a predetermined number of slices and performs encoding under a motion vector restriction near slice boundaries, in which coding statistical information is analyzed, whether an optimal motion vector can be selected near a slice boundary is estimated based on the analysis result, and the coding structure is adaptively set, based on the estimation result, to one of the four SOP structures listed above.
 A video decoding method according to the present invention decodes video encoded with any one of the four SOP structures listed above.
 A video encoding program according to the present invention causes a computer to execute a process of analyzing coding statistical information, a process of estimating, based on the analysis result, whether an optimal motion vector can be selected near a slice boundary, and a process of adaptively setting the coding structure, based on the estimation result, to one of the four SOP structures listed above.
 A video decoding program according to the present invention causes a computer to execute a process of decoding video encoded with any one of the four SOP structures listed above.
 According to the present invention, image quality deterioration can be suppressed.
 FIG. 1 is a block diagram showing a configuration example of an embodiment of a video encoding device.
 FIG. 2 is a block diagram showing a configuration example of an embodiment of a video decoding device.
 FIG. 3 is a flowchart showing the operation of the first embodiment of the video encoding device.
 FIG. 4 is a flowchart showing the operation of the second embodiment of the video encoding device.
 FIG. 5 is a flowchart showing the operation of the third embodiment of the video encoding device.
 FIG. 6 is a block diagram showing an example of a video system.
 FIG. 7 is a block diagram showing a configuration example of an information processing system capable of realizing the functions of the video encoding device and the video decoding device.
 FIG. 8 is a block diagram showing the main part of the video encoding device.
 FIG. 9 is a block diagram showing the main part of the video decoding device.
 FIG. 10 is an explanatory diagram showing an example of screen division.
 FIG. 11 is an explanatory diagram for explaining the motion vector restriction.
 FIG. 12 is an explanatory diagram showing SOP structures.
 FIG. 13 is an explanatory diagram showing an example of reference picture intervals.
 FIG. 14 is an explanatory diagram showing SOP structures.
 Hereinafter, embodiments of the present invention will be described with reference to the drawings.
 FIG. 1 is a block diagram showing a configuration example of an embodiment of a video encoding device. The video encoding apparatus 100 shown in FIG. 1 includes an encoding unit 101, an analysis unit 111, a determination unit 112, and an M value determination unit 113. The video encoding apparatus 100 executes encoding based on the HEVC standard, but it may instead execute encoding based on another standard, for example the H.264/AVC standard. In the following, a case in which 8K video is input is taken as an example.
 The encoding unit 101 includes a screen divider 102 that divides an input image into a plurality of screens, a frequency transformer/quantizer 103, an inverse quantizer/inverse frequency transformer 104, a buffer 105, a predictor 106, and an entropy encoder 107.
 The screen divider 102 divides the input video screen into four screens (see FIG. 10). The frequency transformer/quantizer 103 frequency-transforms the prediction error image obtained by subtracting the prediction signal from the input video signal, and further quantizes the frequency-transformed prediction error image (the frequency transform coefficients). Hereinafter, the quantized frequency transform coefficients are referred to as transform quantization values.
 The entropy encoder 107 entropy-encodes the prediction parameters and the transform quantization values, and outputs a bitstream. The prediction parameters are information related to CTU (Coding Tree Unit) and block prediction, such as the prediction mode (intra prediction or inter prediction), the intra prediction block size, the intra prediction direction, the inter prediction block size, and motion vectors.
 The predictor 106 generates a prediction signal for the input video signal. The prediction signal is generated based on intra prediction or inter-frame prediction.
 The inverse quantizer/inverse frequency transformer 104 inverse-quantizes the transform quantization values and then inverse-frequency-transforms the inverse-quantized frequency transform coefficients. The prediction signal is added to the inverse-frequency-transformed reconstructed prediction error image, and the result is supplied to the buffer 105. The buffer 105 stores the reconstructed image.
 The analysis unit 111 analyzes coding statistical information. Based on the analysis result of the analysis unit 111, the determination unit 112 determines whether an optimal motion vector can be selected near the slice boundary under the motion vector restriction described above. The coding statistical information is information on the encoding result of a past frame (for example, the frame immediately before the current encoding target frame); specific examples of the coding statistical information are described later.
 Strictly speaking, the vicinity of the slice boundary is the area in which an optimal motion vector could not be selected; however, for convenience when implementing the control described below, a range of, for example, ±128 pixels or ±256 pixels from the slice boundary may be treated as the vicinity of the slice boundary. The range regarded as "near the slice boundary" may also be changed as appropriate according to the state of the video (for example, large or small motion); for example, when the occurrence ratio of large-valued motion vectors is high, the range may be set wider.
 The M value determination unit 113 adaptively determines the M value based on the determination result of the determination unit 112. As described above, determining the M value is equivalent to determining the Lx (x = 0, 1, 2, 3) structure of the SOP. The coding statistical information is described later.
 図2は、映像復号装置の実施形態の構成例を示すブロック図である。図2に示す映像復号装置200は、エントロピー復号器202、逆量子化/逆周波数変換器203、予測器204、及びバッファ205を含む。 FIG. 2 is a block diagram illustrating a configuration example of an embodiment of the video decoding apparatus. The video decoding apparatus 200 shown in FIG. 2 includes an entropy decoder 202, an inverse quantization / inverse frequency converter 203, a predictor 204, and a buffer 205.
 エントロピー復号器202は、映像のビットストリームをエントロピー復号する。エントロピー復号器202は、エントロピー復号した変換量子化値を逆量子化/逆周波数変換器203に供給する。 The entropy decoder 202 entropy decodes the video bitstream. The entropy decoder 202 supplies the transform quantization value subjected to entropy decoding to the inverse quantization / inverse frequency converter 203.
 逆量子化/逆周波数変換器203は、量子化ステップ幅で、輝度及び色差の変換量子化値を逆量子化して周波数変換係数を得る。さらに、逆量子化/逆周波数変換器203は、逆量子化した周波数変換係数を逆周波数変換する。 The inverse quantization / inverse frequency converter 203 obtains a frequency conversion coefficient by inversely quantizing the converted quantization values of luminance and chrominance with a quantization step width. Further, the inverse quantization / inverse frequency converter 203 performs inverse frequency conversion on the inversely quantized frequency conversion coefficient.
 逆周波数変換後、予測器204は、バッファ205に格納された再構築ピクチャの画像を用いて予測信号を生成する(前記予測は、動き補償予測、または、MC参照とも呼ぶ)。逆量子化/逆周波数変換器203で逆周波数変換された再構築予測誤差画像は、予測器204から供給される予測信号が加えられて、再構築ピクチャとしてバッファ205に供給される。そして、バッファ205に格納された再構築ピクチャが復号映像として出力される。 After the inverse frequency transform, the predictor 204 generates a prediction signal using the images of the reconstructed pictures stored in the buffer 205 (this prediction is also referred to as motion compensation prediction or MC reference). The prediction signal supplied from the predictor 204 is added to the reconstructed prediction error image that has been inverse-frequency-transformed by the inverse quantization / inverse frequency converter 203, and the result is supplied to the buffer 205 as a reconstructed picture. The reconstructed pictures stored in the buffer 205 are then output as decoded video.
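As a conceptual sketch only, the reconstruction path described above (inverse quantization, inverse frequency transform, addition of the prediction signal) can be summarized as follows; the actual HEVC process involves per-block transform sizes, bit-depth handling and in-loop filtering that are omitted here, and all names are hypothetical.

import numpy as np

def reconstruct_block(quantized_coeffs, qstep, inverse_transform, prediction):
    # Hypothetical sketch of the reconstruction data flow, not the normative process.
    coeffs = quantized_coeffs * qstep              # inverse quantization
    residual = inverse_transform(coeffs)           # inverse frequency transform
    return np.clip(residual + prediction, 0, 255)  # add prediction, clip to an assumed 8-bit range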
 次に、映像符号化装置100における解析部111、判定部112及びM値決定部113の動作を説明する。 Next, operations of the analysis unit 111, the determination unit 112, and the M value determination unit 113 in the video encoding device 100 will be described.
実施形態1.
 図3は、図1に示された映像符号化装置100の第1の実施形態の動作を示すフローチャートである。第1の実施形態では、8Kの映像は4分割され(図10参照)、スライス境界付近において動きベクトル制限があるとする。また、動きベクトル制限として、±128を例にする。8Kの映像は4分割され、かつ、動きベクトル制限があることは、他の実施形態でも同様である。なお、M値の初期値は8(M=8)である。
Embodiment 1.
FIG. 3 is a flowchart showing the operation of the first embodiment of the video encoding apparatus 100 shown in FIG. 1. In the first embodiment, it is assumed that an 8K video is divided into four slices (see FIG. 10) and that there is a motion vector restriction near the slice boundary. As the motion vector restriction, ±128 is taken as an example. The assumptions that the 8K video is divided into four slices and that there is a motion vector restriction also apply to the other embodiments. Note that the initial value of the M value is 8 (M = 8).
 解析部111は、バッファ105に格納されている過去の符号化結果(例えば、直前フレームの符号化結果)を解析する。具体的には、解析部111は、スライス境界以外のブロックにおける動きベクトルの平均値又は中央値(以下、平均値又は中央値をMavgとする。)を算出する(ステップS101)。なお、第1の実施形態では、符号化統計情報は、動きベクトルの値であり、解析結果は、動きベクトルの平均値又は中央値である。 The analysis unit 111 analyzes past encoding results stored in the buffer 105 (for example, the encoding result of the immediately preceding frame). Specifically, the analysis unit 111 calculates the average value or median value of the motion vectors in blocks other than those at the slice boundary (hereinafter, this average or median value is referred to as Mavg) (step S101). In the first embodiment, the encoded statistical information is the motion vector values, and the analysis result is the average value or median value of the motion vectors.
 判定部112は、Mavgが、動きベクトル制限としての±128を基準として、どの程度の大きさになっているかを判定する(ステップS102)。 The determination unit 112 determines how large Mavg is with reference to ±128 as the motion vector restriction (step S102).
 そして、M値決定部113は、Mavgがどの程度の大きさになっているかの判定結果に基づいて、M値を決定する(ステップS103)。 Then, the M value determination unit 113 determines the M value based on the determination result of how large Mavg is (step S103).
 M値決定部113は、判定結果に基づいて、例えば、以下のようにM値を決定する。 The M value determination unit 113 determines the M value based on the determination result as follows, for example.
(1)M=8である場合:
     |Mavg|≦128 → M=8を維持
 128<|Mavg|≦256 → M=4(M=8の1/2)に決定
 256<|Mavg|≦512 → M=2(M=8の1/4)に決定
 512<|Mavg|     → M=1(M=8の1/8)に決定
(1) When M = 8:
      |Mavg| ≦ 128 → maintain M = 8
 128 < |Mavg| ≦ 256 → set M = 4 (1/2 of M = 8)
 256 < |Mavg| ≦ 512 → set M = 2 (1/4 of M = 8)
 512 < |Mavg|       → set M = 1 (1/8 of M = 8)
(2)M=4である場合:
     |Mavg|≦64  → M=8に決定
  64<|Mavg|≦128 → M=4を維持
 128<|Mavg|≦256 → M=2に決定
 256<|Mavg|     → M=1に決定
(2) When M = 4:
      |Mavg| ≦ 64  → set M = 8
  64 < |Mavg| ≦ 128 → maintain M = 4
 128 < |Mavg| ≦ 256 → set M = 2
 256 < |Mavg|       → set M = 1
 M値決定部113は、M値がその他の値であるときにも、上記の(1),(2)の場合と同様に、M値を8にしたときに、動きベクトル制限の下で、スライス境界付近での動きベクトルの値が±128以内に収まると推定できたときには、M値を8に戻す。換言すれば、M値決定部113は、動きベクトル制限の下で、スライス境界付近で最適な動きベクトルを選択できると推定できた場合には、M値を8に戻す。その他の場合にも、Mavgに応じて、スライス境界付近での動きベクトルの値が±128以内に収まるようにM値を決定する。 Even when the M value is some other value, as in cases (1) and (2) above, the M value determination unit 113 returns the M value to 8 when it can be estimated that, with the M value set to 8, the motion vector values near the slice boundary would stay within ±128 under the motion vector restriction. In other words, the M value determination unit 113 returns the M value to 8 when it can be estimated that an optimal motion vector can be selected near the slice boundary under the motion vector restriction. In the other cases as well, the M value is determined in accordance with Mavg so that the motion vector values near the slice boundary fall within ±128.
 なお、上記の場合分け(閾値の設定)は一例であって、閾値を変えたり、より細かな場合分けをしてもよい。 Note that the above case division (threshold value setting) is an example, and the threshold value may be changed or a finer case division may be performed.
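The decision rule of cases (1) and (2) can be sketched as follows. This is only one reading of the embodiment: it assumes that motion vector magnitudes scale roughly in proportion to the reference picture distance (that is, to the M value), which is what makes the thresholds halve when M is halved; the function and variable names are hypothetical.

def decide_m_from_mv(m_current, mv_avg, mv_limit=128, candidates=(8, 4, 2, 1)):
    # Choose the largest candidate M for which motion vectors near the slice
    # boundary are expected to stay within the +/-mv_limit restriction, assuming
    # magnitudes scale in proportion to M. Reproduces cases (1) and (2) above.
    magnitude = abs(mv_avg)
    for m in candidates:                           # try the largest M first
        if magnitude * m / m_current <= mv_limit:
            return m
    return candidates[-1]                          # fall back to the smallest M

# Example matching case (1): current M = 8 and |Mavg| = 300 gives 256 < 300 <= 512, so M = 2.
print(decide_m_from_mv(8, 300))  # 2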
 第1の実施形態の映像符号化装置の制御は、以下のような考え方に基づく。 The control of the video encoding device of the first embodiment is based on the following concept.
 映像が、画面全体が速く動くシーンの映像であるときには、発生した全ての動きベクトルに対して、スライス境界付近でもスライス境界付近以外でも、値が大きい動きベクトルの数の比率が高い。しかし、動きベクトル制限があるので、スライス境界付近では、最適な動きベクトルが選択されていない可能性がある。そこで、判定部112は、スライス境界以外の領域において発生した符号化統計情報としての動きベクトル(動きベクトル制限はないので、正規の、換言すれば最適な動きベクトルである。)にもとづいて、符号化対象の画面が速く動くシーンの映像の画面であるか否かを推定する。M値決定部113は、速く動くシーンの映像であると判定部112が推定した場合には、スライス境界付近において最適な動きベクトルを選択可能になるようにM値を変える。 When the video is of a scene in which the entire picture moves quickly, the proportion of motion vectors with large values among all generated motion vectors is high, both near the slice boundary and elsewhere. However, because of the motion vector restriction, an optimal motion vector may not have been selected near the slice boundary. The determination unit 112 therefore estimates whether the picture to be encoded belongs to a fast-moving scene, based on the motion vectors generated in regions other than near the slice boundary as the encoded statistical information (since no motion vector restriction applies there, these are regular, in other words optimal, motion vectors). When the determination unit 112 estimates that the video is of a fast-moving scene, the M value determination unit 113 changes the M value so that an optimal motion vector can be selected near the slice boundary.
 なお、速く動くシーンの映像である場合には、スライス境界付近において最適な動きベクトルが選択されていない可能性があるので、速く動くシーンの映像であると推定されたことは、動きベクトル制限の下で、スライス境界付近において最適な動きベクトルが選択されていないと推定されたことと等価である。 Note that, in the case of a video of a fast-moving scene, an optimal motion vector may not have been selected near the slice boundary; therefore, estimating that the video is of a fast-moving scene is equivalent to estimating that, under the motion vector restriction, an optimal motion vector has not been selected near the slice boundary.
 また、上述したように、M値とSOP構造とは相関している。よって、M値決定部113がM値を決定することは、SOP構造(すなわち、Lx(x=0,1,2,3)構造)を決定することと等価である。 Further, as described above, the M value and the SOP structure are correlated. Therefore, the M value determining unit 113 determining the M value is equivalent to determining the SOP structure (that is, the Lx (x = 0, 1, 2, 3) structure).
実施形態2.
 図4は、図1に示された映像符号化装置100の第2の実施形態の動作を示すフローチャートである。
Embodiment 2.
FIG. 4 is a flowchart showing the operation of the second embodiment of the video encoding apparatus 100 shown in FIG. 1.
 解析部111は、バッファ105に格納されている過去の符号化結果(例えば、直前フレームの符号化結果)を解析する。具体的には、解析部111は、スライス境界以外の範囲における全てのブロック(例えば、PU:Prediction Unit )に対して、画面内予測(イントラ予測)が用いられたブロックの割合P1を算出し(ステップS201)、スライス境界付近の全てのブロックに対して、画面内予測が用いられたブロックの割合P2を算出する(ステップS202)。なお、第2の実施形態では、符号化統計情報は、スライス境界付近のブロックの予測モード(具体的には、画面内予測のブロックの数)であり、解析結果は、割合P1及び割合P2である。 The analysis unit 111 analyzes past encoding results stored in the buffer 105 (for example, the encoding result of the immediately preceding frame). Specifically, the analysis unit 111 calculates the ratio P1 of blocks for which intra-picture prediction (intra prediction) was used among all blocks (for example, PUs: Prediction Units) in the range other than the slice boundary (step S201), and calculates the ratio P2 of blocks for which intra-picture prediction was used among all blocks near the slice boundary (step S202). In the second embodiment, the encoded statistical information is the prediction mode of blocks near the slice boundary (specifically, the number of intra-picture prediction blocks), and the analysis results are the ratio P1 and the ratio P2.
 判定部112は、割合P1と割合P2とを比較し、それらの乖離の程度を判定する。具体的には、割合P1と比較して、割合P2がかなり大きいか否かを判定する。判定部112は、例えば、割合P1と割合P2との差が所定値を越えているか否か判定する(ステップS203)。 The determination unit 112 compares the ratio P1 with the ratio P2 and determines the degree of divergence between them. Specifically, it determines whether the ratio P2 is considerably larger than the ratio P1. For example, the determination unit 112 determines whether the difference between the ratio P2 and the ratio P1 exceeds a predetermined value (step S203).
 M値決定部113は、割合P1と割合P2との差が所定値を越えている場合には、M値を小さくする(ステップS204)。なお、複数の所定値を設け、例えば、差が第1の所定値を越えているときにはM値を複数段階小さくし、差が第2の所定値(<第1の所定値)を越えているときにはM値を1段階小さくするようにしてもよい。 When the difference between the ratio P2 and the ratio P1 exceeds a predetermined value, the M value determination unit 113 decreases the M value (step S204). A plurality of predetermined values may be provided; for example, when the difference exceeds a first predetermined value, the M value may be decreased by several steps, and when the difference exceeds a second predetermined value (< the first predetermined value), the M value may be decreased by one step.
 また、M値決定部113は、割合P1と割合P2との差が所定値以下である場合には、M値を維持するか、又は、M値を大きくする(ステップS205)。例えば、M値決定部113は、差が第3の所定値(<第2の所定値)以下であるときにはM値を大きくし、差が第3の所定値を越えているときにはM値を維持する。 When the difference between the ratio P2 and the ratio P1 is equal to or smaller than the predetermined value, the M value determination unit 113 maintains the M value or increases it (step S205). For example, the M value determination unit 113 increases the M value when the difference is equal to or smaller than a third predetermined value (< the second predetermined value), and maintains the M value when the difference exceeds the third predetermined value.
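A minimal sketch of this rule, with a single decrease step and placeholder thresholds (the text only speaks of first to third predetermined values without fixing them), might look as follows; all names and numbers are hypothetical.

def decide_m_from_intra_ratio(m_current, p1, p2, thr_decrease=0.30, thr_increase=0.05,
                              m_values=(1, 2, 4, 8)):
    # p1: intra-prediction ratio away from slice boundaries, p2: ratio near them.
    # The 0.30 / 0.05 thresholds are illustrative placeholders only.
    diff = p2 - p1
    idx = m_values.index(m_current)
    if diff > thr_decrease:                                 # large divergence: shrink M
        return m_values[max(idx - 1, 0)]
    if diff <= thr_increase:                                # little divergence: grow (or keep) M
        return m_values[min(idx + 1, len(m_values) - 1)]
    return m_current                                        # otherwise keep the current M

# Example: 10% intra away from the boundary vs. 55% near it -> decrease M from 8 to 4.
print(decide_m_from_intra_ratio(8, 0.10, 0.55))  # 4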
 第2の実施形態の映像符号化装置の制御は、以下のような考え方に基づく。 The control of the video encoding device of the second embodiment is based on the following concept.
 符号化部101は、画面内の各ブロックを符号化する際に、予測モードとして画面内予測と画面間予測(インター予測)とのいずれかを使用できる。映像が、画面全体が速く動くシーンの映像であるときには、スライス境界付近においても、画面間予測が使用されるときに値が大きい動きベクトルの数の発生率が高いと考えられる(動きベクトル制限がない場合)。動きベクトル制限があるので、スライス境界付近では、最適な動きベクトル(大きな動きベクトル)を発生することができず、その結果、スライス境界付近では、画面内予測が使用されることが多いと考えられる。スライス境界付近以外では、動きベクトル制限はないので、スライス境界付近に比べて、画面内予測が使用されることは少ないと考えられる。 When encoding each block in a picture, the encoding unit 101 can use either intra-picture prediction or inter-picture prediction (inter prediction) as the prediction mode. When the video is of a scene in which the entire picture moves quickly, the occurrence rate of motion vectors with large values when inter-picture prediction is used is considered to be high even near the slice boundary (if there were no motion vector restriction). Because of the motion vector restriction, optimal motion vectors (large motion vectors) cannot be generated near the slice boundary, and as a result intra-picture prediction is considered to be used frequently near the slice boundary. Away from the slice boundary there is no motion vector restriction, so intra-picture prediction is considered to be used less often than near the slice boundary.
 よって、割合P1と割合P2とが大きく乖離している場合には、速く動くシーンの映像の信号が符号化部101に入力されていると推定される。 Therefore, when the ratio P1 and the ratio P2 diverge greatly, it is estimated that a video signal of a fast-moving scene is being input to the encoding unit 101.
 なお、速く動くシーンの映像である場合には、スライス境界付近において最適な動きベクトルが選択されていない可能性があるので、速く動くシーンの映像であると推定されたことは、動きベクトル制限の下で、割合P1と割合P2とが大きく乖離していることと等価である。 Note that, in the case of a video of a fast-moving scene, an optimal motion vector may not have been selected near the slice boundary; therefore, estimating that the video is of a fast-moving scene is equivalent to the ratio P1 and the ratio P2 diverging greatly under the motion vector restriction.
 大きく乖離しているか否か判定するための所定値として、一例として、経験的又は実験的に、そのような値を閾値として使用すれば、スライス境界付近において最適な動きベクトルが選択されていない可能性があることを推定可能な値が選択される。 As an example, the predetermined value for determining whether there is a large divergence is chosen empirically or experimentally as a value such that, when used as a threshold, it makes it possible to infer that an optimal motion vector may not have been selected near the slice boundary.
実施形態3.
 図5は、図1に示された映像符号化装置100の第3の実施形態の動作を示すフローチャートである。
Embodiment 3.
FIG. 5 is a flowchart showing the operation of the third embodiment of the video encoding apparatus 100 shown in FIG. 1.
 解析部111は、バッファ105に格納されている過去の符号化結果(例えば、直前フレームの符号化結果)を解析する。具体的には、解析部111は、以前のフレーム(例えば、現在の符号化対象のフレームの2フレーム前)のスライス境界付近のブロックにおける発生符号量C1を算出する(ステップS301)。また、解析部111は、直前のフレームのスライス境界付近のブロックにおける発生符号量C2を算出する(ステップS302)。なお、第3の実施形態では、符号化統計情報は、スライス境界付近のブロックの発生符号量であり、解析結果は、発生符号量C1及び発生符号量C2である。 The analysis unit 111 analyzes past encoding results stored in the buffer 105 (for example, the encoding result of the immediately preceding frame). Specifically, the analysis unit 111 calculates the generated code amount C1 in blocks near the slice boundary of an earlier frame (for example, the frame two frames before the current encoding target frame) (step S301). The analysis unit 111 also calculates the generated code amount C2 in blocks near the slice boundary of the immediately preceding frame (step S302). In the third embodiment, the encoded statistical information is the generated code amount of blocks near the slice boundary, and the analysis results are the generated code amount C1 and the generated code amount C2.
 判定部112は、発生符号量C1と発生符号量C2とを比較し、それらの乖離の程度を判定する。具体的には、発生符号量C1と比較して、発生符号量C2がかなり大きいか否かを判定する。判定部112は、例えば、発生符号量C1と発生符号量C2との差が所定量を越えているか否か判定する(ステップS303)。 The determination unit 112 compares the generated code amount C1 with the generated code amount C2 and determines the degree of divergence between them. Specifically, it determines whether the generated code amount C2 is considerably larger than the generated code amount C1. For example, the determination unit 112 determines whether the difference between the generated code amount C2 and the generated code amount C1 exceeds a predetermined amount (step S303).
 M値決定部113は、発生符号量C1と発生符号量C2との差が所定量を越えている場合には、M値を小さくする(ステップS304)。なお、複数の所定量を設け、例えば、差が第1の所定量を越えているときにはM値を複数段階小さくし、差が第2の所定量(<第1の所定量)を越えているときにはM値を1段階小さくするようにしてもよい。 When the difference between the generated code amount C2 and the generated code amount C1 exceeds a predetermined amount, the M value determination unit 113 decreases the M value (step S304). A plurality of predetermined amounts may be provided; for example, when the difference exceeds a first predetermined amount, the M value may be decreased by several steps, and when the difference exceeds a second predetermined amount (< the first predetermined amount), the M value may be decreased by one step.
 また、M値決定部113は、発生符号量C1と発生符号量C2との差が所定量以下である場合には、M値を維持するか、又は、M値を大きくする(ステップS305)。例えば、M値決定部113は、差が第3の所定量(<第2の所定量)以下であるときにはM値を大きくし、差が第3の所定量を越えているときにはM値を維持する。 When the difference between the generated code amount C2 and the generated code amount C1 is equal to or smaller than the predetermined amount, the M value determination unit 113 maintains the M value or increases it (step S305). For example, the M value determination unit 113 increases the M value when the difference is equal to or smaller than a third predetermined amount (< the second predetermined amount), and maintains the M value when the difference exceeds the third predetermined amount.
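The same shape of rule can be sketched for the code-amount comparison. The predetermined amount below is a placeholder, and a real encoder would likely normalize the code amounts by picture type and quantization parameter, which is ignored here; all names are hypothetical.

def decide_m_from_code_amount(m_current, c1, c2, predetermined_amount=50_000,
                              m_values=(1, 2, 4, 8)):
    # c1: code amount generated near slice boundaries in an earlier frame,
    # c2: the same quantity for the immediately preceding frame.
    # The 50 kbit threshold is an illustrative placeholder only.
    idx = m_values.index(m_current)
    if c2 - c1 > predetermined_amount:   # sharp increase near the boundary: shrink M
        return m_values[max(idx - 1, 0)]
    return m_current                     # otherwise keep (or, with further thresholds, grow) M

# Example: boundary-area bits grew from 40 kbit to 120 kbit between frames -> M goes from 4 to 2.
print(decide_m_from_code_amount(4, 40_000, 120_000))  # 2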
 第3の実施形態の映像符号化装置の制御は、以下のような考え方に基づく。 The control of the video encoding device of the third embodiment is based on the following concept.
 上述したように、画面全体が速く動くシーンの映像であるときには、スライス境界付近においても、画面間予測が使用されるときに値が大きい動きベクトルの数の比率が高いと考えられる(動きベクトル制限がない場合)。しかし、動きベクトル制限があるので、スライス境界付近では、最適な動きベクトル(大きな動きベクトル)を発生することができず、その結果、スライス境界付近では、画面内予測が使用されることが多いと考えられる。一般に、画面間予測が使用されるときに比べて、画面内予測が使用されるときには、発生符号量は多くなる。 As described above, when the video is of a scene in which the entire picture moves quickly, the proportion of motion vectors with large values when inter-picture prediction is used is considered to be high even near the slice boundary (if there were no motion vector restriction). However, because of the motion vector restriction, optimal motion vectors (large motion vectors) cannot be generated near the slice boundary, and as a result intra-picture prediction is considered to be used frequently near the slice boundary. In general, the generated code amount is larger when intra-picture prediction is used than when inter-picture prediction is used.
 よって、発生符号量C1と比較して、発生符号量C2がかなり多い場合には、速く動くシーンの映像の信号が符号化部101に入力される状況に変化したと推定される。 Therefore, when the generated code amount C2 is considerably larger than the generated code amount C1, it is estimated that the situation has changed to one in which a video signal of a fast-moving scene is being input to the encoding unit 101.
 なお、速く動くシーンの映像になった場合には、スライス境界付近において最適な動きベクトルが選択されない可能性があるので、速く動くシーンの映像になったと推定されたことは、動きベクトル制限の下で、発生符号量C2が大きく増えたことと等価である。 Note that, when the video has become that of a fast-moving scene, an optimal motion vector may not be selected near the slice boundary; therefore, estimating that the video has become that of a fast-moving scene is equivalent to the generated code amount C2 having increased greatly under the motion vector restriction.
 大きく増えたか否か判定するための所定量として、一例として、経験的又は実験的に、そのような量を閾値として使用すれば、スライス境界付近において最適な動きベクトルが選択されない可能性があることを推定可能な値が選択される。 As an example, the predetermined amount for determining whether the code amount has increased greatly is chosen empirically or experimentally as an amount such that, when used as a threshold, it makes it possible to infer that an optimal motion vector may not be selected near the slice boundary.
 以上に説明したように、上記の各実施形態では、過去の符号化結果(符号化統計情報)に基づいてM値が適応的に切り替えられる。符号化統計情報に基づいて動きベクトル制限の下で、スライス境界付近で最適な動きベクトル(換言すれば、動きベクトル制限を外れる動きベクトル)を選択できるか否かが推定される。選択できないと推定された場合には、M値はより小さな値に変更される。選択できると判定された場合、そのときのM値でも動きベクトル制限の下でスライス境界付近で最適な動きベクトルを選択できると考えられるので、M値は、維持されるか、又は、より大きな値に変更される。 As described above, in each of the above embodiments, the M value is adaptively switched based on past encoding results (encoded statistical information). Based on the encoded statistical information, it is estimated whether or not an optimal motion vector (in other words, a motion vector that falls outside the motion vector restriction) can be selected near the slice boundary under the motion vector restriction. If it is estimated that it cannot be selected, the M value is changed to a smaller value. If it is determined that it can be selected, it is considered that an optimal motion vector can be selected near the slice boundary under the motion vector restriction even with the current M value, so the M value is maintained or changed to a larger value.
 その結果、動きベクトル制限によってスライス境界付近で最適な動きベクトルを選択できない状態になることをできるだけ回避でき、局所的な画質劣化が生ずる可能性を低減できる。すなわち、動きの速さに応じてM値が適応的に切り替えられるので、好適な画質を得ることができる。 As a result, a state in which an optimal motion vector cannot be selected near the slice boundary due to the motion vector restriction can be avoided as much as possible, and the possibility of local image quality degradation can be reduced. That is, since the M value is adaptively switched according to the speed of motion, a suitable image quality can be obtained.
 また、符号化結果(例えば、直前のフレームの符号化結果)に基づいてM値を切り替えることができるので、事前解析(現在のフレームを符号化する際に前処理として実行される解析処理)を行う必要がなく、事前解析を行う場合と比較して、符号化のための処理時間が延びてしまうことが防止される。 In addition, since the M value can be switched based on encoding results (for example, the encoding result of the immediately preceding frame), there is no need to perform pre-analysis (analysis processing executed as pre-processing when encoding the current frame), and the processing time required for encoding is prevented from being extended as it would be if pre-analysis were performed.
 なお、映像符号化装置100において、第1~第3の実施形態のうちの任意の2つ又は全ての形態が組み込まれるように、解析部111、判定部112及びM値決定部113が構成されていてもよい。 Note that, in the video encoding device 100, the analysis unit 111, the determination unit 112, and the M value determination unit 113 may be configured so that any two or all of the first to third embodiments are incorporated.
 また、図2に示された映像復号装置は、第1~第3の実施形態において例示されたような、動きベクトル制限を満たす範囲で設定されたM値を用いて符号化されたビットストリームを復号する。 The video decoding device shown in FIG. 2 decodes a bitstream encoded using an M value set within a range that satisfies the motion vector restriction, as exemplified in the first to third embodiments.
 図6は、映像システムの一例を示すブロック図である。図6に示す映像システムは、上記の各実施形態の映像符号化装置100と図2に示された映像復号装置200とが、無線伝送路又は有線伝送路300で接続されるシステムである。映像符号化装置100は、上記の第1~第3の実施形態のいずれかの映像符号化装置100であるが、映像符号化装置100において、第1~第3の実施形態のうちの任意の2つ又は全ての処理を実行するように、解析部111、判定部112及びM値決定部113が構成されていてもよい。 FIG. 6 is a block diagram illustrating an example of a video system. The video system shown in FIG. 6 is a system in which the video encoding device 100 of any of the above embodiments and the video decoding device 200 shown in FIG. 2 are connected via a wireless or wired transmission path 300. The video encoding device 100 is the video encoding device 100 of any one of the first to third embodiments described above; in the video encoding device 100, the analysis unit 111, the determination unit 112, and the M value determination unit 113 may also be configured to execute the processes of any two or all of the first to third embodiments.
 また、上記の各実施形態を、ハードウェアで構成することも可能であるが、コンピュータプログラムにより実現することも可能である。 Further, although each of the above embodiments can be configured by hardware, it can also be realized by a computer program.
 図7に示す情報処理システムは、プロセッサ1001、プログラムメモリ1002、映像データを格納するための記憶媒体1003およびビットストリームを格納するための記憶媒体1004を備える。記憶媒体1003と記憶媒体1004とは、別個の記憶媒体であってもよいし、同一の記憶媒体からなる記憶領域であってもよい。記憶媒体として、ハードディスク等の磁気記憶媒体を用いることができる。 The information processing system shown in FIG. 7 includes a processor 1001, a program memory 1002, a storage medium 1003 for storing video data, and a storage medium 1004 for storing a bitstream. The storage medium 1003 and the storage medium 1004 may be separate storage media, or may be storage areas composed of the same storage medium. A magnetic storage medium such as a hard disk can be used as the storage medium.
 図7に示された情報処理システムにおいて、プログラムメモリ1002には、図1,図2のそれぞれに示された各ブロック(バッファのブロックを除く)の機能を実現するためのプログラム(映像符号化プログラム又は映像復号プログラム)が格納される。そして、プロセッサ1001は、プログラムメモリ1002に格納されているプログラムに従って処理を実行することによって、図1,図2のそれぞれに示された映像符号化装置または映像復号装置の機能を実現する。 In the information processing system shown in FIG. 7, the program memory 1002 stores a program (a video encoding program or a video decoding program) for realizing the functions of the blocks (excluding the buffer blocks) shown in FIG. 1 and FIG. 2. The processor 1001 executes processing according to the program stored in the program memory 1002, thereby realizing the functions of the video encoding device or the video decoding device shown in FIG. 1 or FIG. 2.
 図8は、映像符号化装置の主要部を示すブロック図である。図8に示すように、映像符号化装置10は、符号化統計情報を解析する解析部11(実施形態における解析部111に相当)と、解析部11の解析結果に基づいて、スライス境界付近で最適な動きベクトルを選択できるか否かを推定する推定部12(実施形態では、判定部112で実現される。)と、推定部12の推定結果に基づいて符号化構造を適応的に決定する符号化構造決定部13(実施形態では、M値決定部113で実現される。)とを備える。 FIG. 8 is a block diagram showing the main part of the video encoding device. As illustrated in FIG. 8, the video encoding device 10 includes an analysis unit 11 (corresponding to the analysis unit 111 in the embodiments) that analyzes encoded statistical information, an estimation unit 12 (realized by the determination unit 112 in the embodiments) that estimates, based on the analysis result of the analysis unit 11, whether or not an optimal motion vector can be selected near the slice boundary, and a coding structure determination unit 13 (realized by the M value determination unit 113 in the embodiments) that adaptively determines the coding structure based on the estimation result of the estimation unit 12.
 図9は、映像復号装置の主要部を示すブロック図である。図9に示すように、映像復号装置20は、動きベクトル制限の下で、スライス境界付近で最適な動きベクトルを選択できるように設定された符号化構造に基づいて符号化されたビットストリームを復号する復号部21(実施形態では、予測器204等で実現される。)を備える。 FIG. 9 is a block diagram showing the main part of the video decoding device. As shown in FIG. 9, the video decoding device 20 includes a decoding unit 21 (realized by the predictor 204 and the like in the embodiments) that decodes a bitstream encoded based on a coding structure set so that an optimal motion vector can be selected near the slice boundary under the motion vector restriction.
 なお、復号部21は、設定された符号化構造として、Temporal IDが0のピクチャだけで構成されるSOP構造、Temporal ID が0のピクチャおよび1のピクチャで構成されるSOP構造、Temporal ID が0のピクチャ、1のピクチャ、および、2のピクチャで構成されるSOP構造、Temporal ID が0のピクチャ、1のピクチャ、2のピクチャ、および3のピクチャで構成されるSOP構造のいずれかのSOP構造に基づいて符号化されたビットストリームを復号することができる。 Note that the decoding unit 21 can decode a bitstream encoded based on any of the following SOP structures as the set coding structure: an SOP structure composed only of pictures whose Temporal ID is 0; an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1; an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2; and an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3.
 さらに、復号部21は、図10に示すような4個のスライスに分割されて、さらに、図11に示すような、あるスライスのPUが別のスライスを動き補償(MC)参照する場合に、スライス境界を跨ぐ同PUのMC参照はスライス境界から128ライン以内の画素のみを参照するように制限されて、符号化されたビットストリームを復号できる。 Furthermore, the decoding unit 21 can decode a bitstream encoded with the picture divided into four slices as shown in FIG. 10 and, as shown in FIG. 11, with the restriction that when a PU of one slice makes a motion compensation (MC) reference to another slice, the MC reference of that PU across the slice boundary refers only to pixels within 128 lines from the slice boundary.
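As an illustration of the kind of check this restriction implies, the following sketch tests whether a vertical motion vector keeps the motion-compensated reference block within 128 lines of the PU's own slice; sub-pel positions and interpolation filter margins are deliberately ignored, and all names are hypothetical.

def mc_reference_allowed(pu_top, pu_height, mv_y, slice_top, slice_bottom, limit=128):
    # True if the vertically displaced reference block stays inside the PU's slice
    # or within `limit` lines beyond either of its boundaries.
    ref_top = pu_top + mv_y
    ref_bottom = pu_top + pu_height + mv_y
    return ref_top >= slice_top - limit and ref_bottom <= slice_bottom + limit

# Example: a 64-line PU at line 2100 of a slice spanning lines 1080-2160, moved 150 lines
# downward, reaches line 2314 and exceeds the 128-line margin below the slice -> not allowed.
print(mc_reference_allowed(2100, 64, 150, 1080, 2160))  # False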
 なお、実施形態では、120Pの動画像を扱う場合、映像符号化および復号側で図12に示すような、以下のSOP 構造を用いることができる。 In the embodiments, when 120P video is handled, the following SOP structures, as shown in FIG. 12, can be used on the video encoding side and the video decoding side. A simple sketch of the correspondence between the M value and these structures is given after the list.
・L=0の構造:Temporal ID が0のピクチャだけで構成されるSOP 構造(つまり、同SOP に含まれるピクチャの段数は1つである。最大Temporal ID を示すLが0であるともいえる。)
・L=1の構造:Temporal ID が0のピクチャおよび1(またはM)のピクチャで構成されるSOP 構造(つまり、同SOP に含まれるピクチャの段数は2つである。最大Temporal ID を示すLが1(またはM)であるともいえる。)
・L=2の構造:Temporal ID が0のピクチャ、1のピクチャ、および、2(またはM)のピクチャで構成されるSOP 構造(つまり、同SOP に含まれるピクチャの段数は3つである。最大Temporal ID を示すLが2(またはM)であるともいえる。)
・L=3の構造:Temporal ID が0のピクチャ、1のピクチャ、2のピクチャ、および3(またはM)のピクチャで構成されるSOP 構造(つまり、同SOP に含まれるピクチャの段数は4つである。最大Temporal ID を示すLが3(またはM)であるともいえる。)
・L=4の構造:Temporal ID が0のピクチャ、1のピクチャ、2のピクチャ、3のピクチャ、および、4(またはM)のピクチャで構成されるSOP 構造(つまり、同SOP に含まれるピクチャの段数は4つである。最大Temporal ID を示すLが4(またはM)であるともいえる。)
・Structure with L = 0: an SOP structure composed only of pictures whose Temporal ID is 0 (that is, the number of picture stages included in the SOP is one; it can also be said that L, which indicates the maximum Temporal ID, is 0).
・Structure with L = 1: an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1 (or M) (that is, the number of picture stages included in the SOP is two; it can also be said that L, which indicates the maximum Temporal ID, is 1 (or M)).
・Structure with L = 2: an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2 (or M) (that is, the number of picture stages included in the SOP is three; it can also be said that L, which indicates the maximum Temporal ID, is 2 (or M)).
・Structure with L = 3: an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3 (or M) (that is, the number of picture stages included in the SOP is four; it can also be said that L, which indicates the maximum Temporal ID, is 3 (or M)).
・Structure with L = 4: an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, 3, or 4 (or M) (that is, the number of picture stages included in the SOP is four; it can also be said that L, which indicates the maximum Temporal ID, is 4 (or M)).
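As a simple sketch of the correspondence noted earlier between the M value and these structures (one reading only; all names are hypothetical), the number of temporal layers, i.e. the maximum Temporal ID L, can be derived from a power-of-two M value as log2(M).

import math

def max_temporal_id_for_m(m):
    # Hypothetical mapping: M=1 -> L=0, M=2 -> L=1, M=4 -> L=2, M=8 -> L=3, M=16 -> L=4.
    if m < 1 or (m & (m - 1)) != 0:
        raise ValueError("M is expected to be a power of two")
    return int(math.log2(m))

print([(m, max_temporal_id_for_m(m)) for m in (1, 2, 4, 8, 16)])
# [(1, 0), (2, 1), (4, 2), (8, 3), (16, 4)]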
 以上、実施形態および実施例を参照して本願発明を説明したが、本願発明は上記実施形態および実施例に限定されるものではない。本願発明の構成や詳細には、本願発明のスコープ内で当業者が理解し得る様々な変更をすることができる。 Although the present invention has been described with reference to the embodiments and examples, the present invention is not limited to the above embodiments and examples. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.
 この出願は、2015年9月25日に出願された日本特許出願2015-188043を基礎とする優先権を主張し、その開示の全てをここに取り込む。 This application claims priority based on Japanese Patent Application No. 2015-188043 filed on September 25, 2015, the entire disclosure of which is incorporated herein.
 10   映像符号化装置
 11   解析部
 12   推定部
 13   符号化構造決定部
 20   映像復号装置
 21   復号部
 100  映像符号化装置
 101  符号化部
 102  画面分割器
 103  周波数変換/量子化器
 104  逆量子化/逆周波数変換器
 105  バッファ
 106  予測器
 107  エントロピー符号化器
 111  解析部
 112  判定部
 113  M値決定部
 200  映像復号装置
 202  エントロピー復号器
 203  逆量子化/逆周波数変換器
 204  予測器
 205  バッファ
 1001 プロセッサ
 1002 プログラムメモリ
 1003,1004 記憶媒体
DESCRIPTION OF SYMBOLS
10 Video encoding device
11 Analysis unit
12 Estimation unit
13 Coding structure determination unit
20 Video decoding device
21 Decoding unit
100 Video encoding device
101 Encoding unit
102 Screen divider
103 Frequency transformer / quantizer
104 Inverse quantizer / inverse frequency transformer
105 Buffer
106 Predictor
107 Entropy encoder
111 Analysis unit
112 Determination unit
113 M value determination unit
200 Video decoding device
202 Entropy decoder
203 Inverse quantizer / inverse frequency transformer
204 Predictor
205 Buffer
1001 Processor
1002 Program memory
1003, 1004 Storage media

Claims (20)

  1.  映像を所定個のスライスに分割し、スライス境界付近の動きベクトル制限の下で符号化処理を行う映像符号化装置であって、
     符号化統計情報を解析する解析手段と、
     前記解析手段の解析結果に基づいて、スライス境界付近で最適な動きベクトルを選択できるか否かを推定する推定手段と、
     前記推定手段の推定結果に基づいて符号化構造を、Temporal ID が0のピクチャだけで構成されるSOP 構造、Temporal ID が0のピクチャおよび1のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、および2のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、2のピクチャおよび3のピクチャで構成されるSOP 構造のいずれかに適応的に決定する符号化構造決定手段と
     を備える映像符号化装置。
    A video encoding device that divides a video into a predetermined number of slices and performs encoding processing under a motion vector restriction near a slice boundary, the video encoding device comprising:
    analysis means for analyzing encoded statistical information;
    estimation means for estimating, based on an analysis result of the analysis means, whether or not an optimal motion vector can be selected near the slice boundary; and
    coding structure determination means for adaptively determining, based on an estimation result of the estimation means, a coding structure to be one of: an SOP structure composed only of pictures whose Temporal ID is 0; an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1; an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2; and an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3.
  2.  符号化構造決定手段は、参照ピクチャ距離を決定する
     請求項1記載の映像符号化装置。
    The video coding apparatus according to claim 1, wherein the coding structure determining unit determines a reference picture distance.
  3.  解析手段は、前記符号化統計情報として動きベクトルを解析する
     請求項1または請求項2に記載の映像符号化装置。
    The video encoding apparatus according to claim 1, wherein the analysis unit analyzes a motion vector as the encoded statistical information.
  4.  解析手段は、前記符号化統計情報としてスライス境界付近のブロックの予測モードを解析する
     請求項1から請求項3のうちのいずれか1項に記載の映像符号化装置。
    The video encoding device according to any one of claims 1 to 3, wherein the analysis unit analyzes a prediction mode of a block near a slice boundary as the encoded statistical information.
  5.  解析手段は、前記符号化統計情報としてスライス境界付近のブロックの発生符号量を解析する
     請求項1から請求項4のうちのいずれか1項に記載の映像符号化装置。
    The video encoding apparatus according to any one of claims 1 to 4, wherein the analysis unit analyzes a generated code amount of a block near a slice boundary as the encoded statistical information.
  6.  Temporal ID が0のピクチャだけで構成されるSOP 構造、Temporal ID が0のピクチャおよび1のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、および2のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、2のピクチャ、および3のピクチャで構成されるSOP 構造のいずれかで符号化された映像を復号する復号手段
     を備える映像復号装置。
    A video decoding device comprising decoding means for decoding video encoded with any one of: an SOP structure composed only of pictures whose Temporal ID is 0; an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1; an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2; and an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3.
  7.  復号する映像は、所定個のスライスに分割されてスライス境界付近の動きベクトル制限の下で符号化され、動きベクトル制限の下でスライス境界付近で最適な動きベクトルを選択できるように設定されたSOP 構造で符号化されている
     請求項6記載の映像復号装置。
    The video decoding device according to claim 6, wherein the video to be decoded is divided into a predetermined number of slices, encoded under a motion vector restriction near a slice boundary, and encoded with an SOP structure set so that an optimal motion vector can be selected near the slice boundary under the motion vector restriction.
  8.  映像を所定個のスライスに分割し、スライス境界付近の動きベクトル制限の下で符号化処理を行う映像符号化方法であって、
     符号化統計情報を解析し、
     解析結果に基づいて、スライス境界付近で最適な動きベクトルを選択できるか否かを推定し、
     推定結果に基づいて符号化構造を、Temporal ID が0のピクチャだけで構成されるSOP 構造、Temporal ID が0のピクチャおよび1のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、および2のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、2のピクチャおよび3のピクチャで構成されるSOP 構造のいずれかに適応的に決定する
     映像符号化方法。
    A video encoding method that divides a video into a predetermined number of slices and performs an encoding process under a motion vector restriction near a slice boundary, the method comprising:
    analyzing encoded statistical information;
    estimating, based on an analysis result, whether or not an optimal motion vector can be selected near the slice boundary; and
    adaptively determining, based on an estimation result, a coding structure to be one of: an SOP structure composed only of pictures whose Temporal ID is 0; an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1; an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2; and an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3.
  9.  前記符号化構造として参照ピクチャ距離を決定する
     請求項8記載の映像符号化方法。
    The video encoding method according to claim 8, wherein a reference picture distance is determined as the encoding structure.
  10.  前記符号化統計情報として動きベクトルを解析する
     請求項8または請求項9に記載の映像符号化方法。
    The video encoding method according to claim 8 or 9, wherein a motion vector is analyzed as the encoded statistical information.
  11.  前記符号化統計情報としてスライス境界付近のブロックの予測モードを解析する
     請求項8から請求項10のうちのいずれか1項に記載の映像符号化方法。
    The video encoding method according to any one of claims 8 to 10, wherein a prediction mode of a block near a slice boundary is analyzed as the encoding statistical information.
  12.  前記符号化統計情報としてスライス境界付近のブロックの発生符号量を解析する
     請求項8から請求項11のうちのいずれか1項に記載の映像符号化方法。
    The video coding method according to any one of claims 8 to 11, wherein a generated code amount of a block near a slice boundary is analyzed as the coding statistical information.
  13.  Temporal ID が0のピクチャだけで構成されるSOP 構造、Temporal ID が0のピクチャおよび1のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、および2のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、2のピクチャ、および3のピクチャで構成されるSOP 構造のいずれかで符号化された映像を復号する
     映像復号方法。
    A video decoding method comprising decoding video encoded with any one of: an SOP structure composed only of pictures whose Temporal ID is 0; an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1; an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2; and an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3.
  14.  所定個のスライスに分割されてスライス境界付近の動きベクトル制限の下で符号化され、動きベクトル制限の下でスライス境界付近で最適な動きベクトルを選択できるように設定されたSOP 構造で符号化されている映像を復号する
     請求項13記載の映像復号方法。
    The video decoding method according to claim 13, wherein the decoded video is divided into a predetermined number of slices, encoded under a motion vector restriction near a slice boundary, and encoded with an SOP structure set so that an optimal motion vector can be selected near the slice boundary under the motion vector restriction.
  15.  映像を所定個のスライスに分割し、スライス境界付近の動きベクトル制限の下で符号化処理を行う映像符号化方法を実行するためのプログラムであって、
     コンピュータに、
     符号化統計情報を解析する処理と、
     解析結果に基づいて、スライス境界付近で最適な動きベクトルを選択できるか否かを推定する処理と、
     推定結果に基づいて符号化構造を、Temporal ID が0のピクチャだけで構成されるSOP 構造、Temporal ID が0のピクチャおよび1のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、および2のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、2のピクチャおよび3のピクチャで構成されるSOP 構造のいずれかに適応的に決定する処理と
     を実行させるための映像符号化プログラム。
    A video encoding program for executing a video encoding method that divides a video into a predetermined number of slices and performs an encoding process under a motion vector restriction near a slice boundary, the program causing a computer to execute:
    a process of analyzing encoded statistical information;
    a process of estimating, based on an analysis result, whether or not an optimal motion vector can be selected near the slice boundary; and
    a process of adaptively determining, based on an estimation result, a coding structure to be one of: an SOP structure composed only of pictures whose Temporal ID is 0; an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1; an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2; and an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3.
  16.  コンピュータに、前記符号化構造として参照ピクチャ距離を決定する処理を実行させる 請求項15記載の映像符号化プログラム。 16. The video encoding program according to claim 15, which causes a computer to execute a process of determining a reference picture distance as the encoding structure.
  17.  コンピュータに、前記符号化統計情報としてスライス境界付近のブロックの予測モードを解析させる
     請求項15または請求項16に記載の映像符号化プログラム。
    The video encoding program according to claim 15 or 16, which causes the computer to analyze a prediction mode of blocks near a slice boundary as the encoded statistical information.
  18.  コンピュータに、前記符号化統計情報としてスライス境界付近のブロックの発生符号量を解析させる
     請求項15から請求項17のうちのいずれか1項に記載の映像符号化プログラム。
    The video encoding program according to any one of claims 15 to 17, which causes a computer to analyze a generated code amount of a block near a slice boundary as the encoded statistical information.
  19.  コンピュータに、
     Temporal ID が0のピクチャだけで構成されるSOP 構造、Temporal ID が0のピクチャおよび1のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、および2のピクチャで構成されるSOP 構造、Temporal ID が0のピクチャ、1のピクチャ、2のピクチャ、および3のピクチャで構成されるSOP 構造のいずれかで符号化された映像を復号する処理
     を実行させるための映像復号プログラム。
    A video decoding program causing a computer to execute:
    a process of decoding video encoded with any one of: an SOP structure composed only of pictures whose Temporal ID is 0; an SOP structure composed of pictures whose Temporal ID is 0 and pictures whose Temporal ID is 1; an SOP structure composed of pictures whose Temporal ID is 0, 1, or 2; and an SOP structure composed of pictures whose Temporal ID is 0, 1, 2, or 3.
  20.  コンピュータに、
     所定個のスライスに分割されてスライス境界付近の動きベクトル制限の下で符号化され、動きベクトル制限の下でスライス境界付近で最適な動きベクトルを選択できるように設定されたSOP 構造で符号化されている映像を復号させる
     請求項19記載の映像復号プログラム。
    The video decoding program according to claim 19, which causes the computer to decode video that is divided into a predetermined number of slices, encoded under a motion vector restriction near a slice boundary, and encoded with an SOP structure set so that an optimal motion vector can be selected near the slice boundary under the motion vector restriction.
PCT/JP2016/003322 2015-09-25 2016-07-14 Video encoding device and video decoding device WO2017051493A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2017541222A JP6489227B2 (en) 2015-09-25 2016-07-14 Video encoding apparatus and video encoding method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2015188043 2015-09-25
JP2015-188043 2015-09-25

Publications (1)

Publication Number Publication Date
WO2017051493A1 true WO2017051493A1 (en) 2017-03-30

Family

ID=58386356

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/003322 WO2017051493A1 (en) 2015-09-25 2016-07-14 Video encoding device and video decoding device

Country Status (2)

Country Link
JP (1) JP6489227B2 (en)
WO (1) WO2017051493A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011030217A (en) * 2009-07-03 2011-02-10 Panasonic Corp Image coding apparatus and image decoding device
JP2014096690A (en) * 2012-11-09 2014-05-22 Fujitsu Semiconductor Ltd Moving image processing device
WO2015025747A1 (en) * 2013-08-22 2015-02-26 ソニー株式会社 Encoding device, encoding method, transmission device, decoding device, decoding method, and reception device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JILL BOYCE ET AL.: "High layer syntax to improve support for temporal scalability", JOINT COLLABORATIVE TEAM ON VIDEO CODING(JCT-VC) OF ITU-T SG 16 WP3 AND ISO/IEC JTC1/SC29/WG11 4TH MEETING, 22 January 2011 (2011-01-22), Daegu, KR, XP030047529 *

Also Published As

Publication number Publication date
JPWO2017051493A1 (en) 2018-03-15
JP6489227B2 (en) 2019-03-27

Similar Documents

Publication Publication Date Title
JP6132006B1 (en) Video encoding device, video system, video encoding method, and video encoding program
EP3416386B1 (en) Hash-based encoder decisions for video coding
US10136132B2 (en) Adaptive skip or zero block detection combined with transform size decision
US10038917B2 (en) Search strategies for intra-picture prediction modes
US10652570B2 (en) Moving image encoding device, moving image encoding method, and recording medium for recording moving image encoding program
KR20210099008A (en) Method and apparatus for deblocking an image
JP2013093650A (en) Encoder, decoder and program
JP2013157662A (en) Moving image encoding method, moving image encoding device, and moving image encoding program
US9264715B2 (en) Moving image encoding method, moving image encoding apparatus, and computer-readable medium
JP6241565B2 (en) Video encoding device, video system, video encoding method, and video encoding program
JP6677230B2 (en) Video encoding device, video decoding device, video system, video encoding method, and video encoding program
JP6489227B2 (en) Video encoding apparatus and video encoding method
JP6241558B2 (en) Video encoding device, video system, video encoding method, and video encoding program
US20160360219A1 (en) Preventing i-frame popping in video encoding and decoding
US20160156905A1 (en) Method and system for determining intra mode decision in h.264 video coding
US10523945B2 (en) Method for encoding and decoding video signal
KR102140271B1 (en) Fast intra coding method and apparatus using coding unit split based on threshold value
JP6341973B2 (en) Encoding apparatus, encoding method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16848287

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017541222

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16848287

Country of ref document: EP

Kind code of ref document: A1