GB2558277A - Image data encoding and decoding - Google Patents

Image data encoding and decoding

Info

Publication number
GB2558277A
GB2558277A GB1622168.1A GB201622168A
Authority
GB
United Kingdom
Prior art keywords
image
stereoscopic
encoded
images
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1622168.1A
Other versions
GB201622168D0 (en)
Inventor
Ian Henry Bickerstaff
Sharwin Winesh Raghoebardajal
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Interactive Entertainment Inc
Original Assignee
Sony Interactive Entertainment Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Interactive Entertainment Inc filed Critical Sony Interactive Entertainment Inc
Priority to GB1622168.1A priority Critical patent/GB2558277A/en
Publication of GB201622168D0 publication Critical patent/GB201622168D0/en
Priority to PCT/GB2017/053802 priority patent/WO2018115841A1/en
Publication of GB2558277A publication Critical patent/GB2558277A/en
Withdrawn legal-status Critical Current

Classifications

    • H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/161 Stereoscopic/multi-view video systems: encoding, multiplexing or demultiplexing different image signal components
    • H04N19/115 Adaptive coding of digital video signals: selection of the code volume for a coding unit prior to coding
    • H04N19/124 Adaptive coding: quantisation
    • H04N19/167 Adaptive coding controlled by the position within a video image, e.g. region of interest [ROI]
    • H04N19/172 Adaptive coding in which the coding unit is a picture, frame or field
    • H04N19/176 Adaptive coding in which the coding unit is a block, e.g. a macroblock
    • H04N19/30 Coding using hierarchical techniques, e.g. scalability
    • H04N19/503 Predictive coding involving temporal prediction
    • H04N19/597 Predictive coding specially adapted for multi-view video sequence encoding
    • H04N19/61 Transform coding in combination with predictive coding
    • H04N19/85 Coding using pre-processing or post-processing specially adapted for video compression

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Encoding stereoscopic image pairs comprising left and right eye images, comprising detecting image differences L1-R1 between left L1 and right R1 image frames and encoding data in dependence upon the detected differences, delta or disparity. Preferably, a first image of a first polarity or view L in a first stereoscopic image pair 1 is encoded in dependence on an image of the second (opposite) polarity or view R in another stereoscopic pair 2, and the second image of the first stereoscopic pair is encoded as data representing differences L1-R1 between the first and second images of the first stereoscopic pair. Alternatively, an encoding processor detects portions of the two stereo images that have an image disparity less than a threshold and generates replacement image data to render said portions identical (i.e. it quantises the difference between the images to zero if the portions are similar). Greater compression may be applied to the peripheral regions of the image than the central portions. In another embodiment, as stereo pairs are sequentially processed, the designated first image (upon which the second image is differentially encoded) alternates between images of different polarities. Corresponding decoders are disclosed.

Description

(71) Applicant(s):
Sony Interactive Entertainment Inc., 1-7-1 Konan, Minato-Ku 108-8270, Tokyo, Japan

Application No: 1622168.1    Date of Filing: 23.12.2016

(51) INT CL:
H04N 19/597 (2014.01); H04N 13/161 (2018.01); H04N 19/115 (2014.01); H04N 19/124 (2014.01); H04N 19/167 (2014.01); H04N 19/172 (2014.01); H04N 19/176 (2014.01); H04N 19/503 (2014.01); H04N 19/85 (2014.01)

(72) Inventor(s):
Ian Henry Bickerstaff, Sharwin Winesh Raghoebardajal

(56) Documents Cited:
WO 2016/124710 A1; WO 2013/030456 A1; WO 2010/108024 A1; KR 20030001758
TECH et al., "Overview of the Multiview and 3D Extensions of High Efficiency Video Coding", IEEE TCSVT 26(1), Jan 2016

(74) Agent and/or Address for Service:
D Young & Co LLP, 120 Holborn, LONDON, EC1N 2DY, United Kingdom

(58) Field of Search:
INT CL H04N; Other: EPODOC; INSPEC; INTERNET; WPI; Patent Fulltext

(54) Title of the Invention: Image data encoding and decoding
Abstract Title: Stereoscopic image encoding using differences between the images of a pair to encode one of the images of the pair
Left    Right    Primary    Difference
L1      R1       L1         L1-R1
L2      R2       R2         R2-L2
L3      R3       L3         L3-R3
L4      R4       R4         R4-L4
FIG. 11
This print incorporates corrections made under Section 117(1) of the Patents Act 1977.
[Drawing sheets 1/6 to 6/6: the figure images (FIGS. 1 to 19) are not reproduced here. Recoverable labels include the block labels INPUT, DETECTOR, ENCODER, OUTPUT (FIG. 6) and INPUT, DECODER, OUTPUT (FIG. 7), the labels LEFT and RIGHT for a stereoscopic image pair, and the flowchart steps DETECT, GENERATE, DECODE and REPRODUCE (FIGS. 18 and 19). The FIG. 10 selection pattern table reads:
Left    Right    Primary    Difference
L1      R1       L1         L1-R1
L2      R2       L2         L2-R2
L3      R3       L3         L3-R3
L4      R4       L4         L4-R4]
IMAGE DATA ENCODING AND DECODING
BACKGROUND
Field
This disclosure relates to data encoding and decoding.
Description of Related Art
There are several video data encoding and decoding systems which involve transforming video data into a frequency domain representation, quantising the frequency domain coefficients and then applying some form of entropy encoding to the quantised coefficients. This can achieve compression of the video data. A corresponding decoding or decompression technique is applied to recover a reconstructed version of the original video data.
SUMMARY
The present disclosure is defined by the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete appreciation of the disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:
Figure 1 schematically illustrates an audio/video (A/V) data transmission and reception system using image data encoding and decoding;
Figure 2 schematically illustrates a video display system using image data decoding;
Figure 3 schematically illustrates an audio/video storage system using image data encoding and decoding;
Figure 4 schematically illustrates a video camera using image data encoding;
Figure 5 schematically illustrates a series of stereoscopic image pairs;
Figure 6 schematically illustrates an encoding apparatus;
Figure 7 schematically illustrates a decoding apparatus;
Figure 8 schematically illustrates an encoding apparatus in more detail;
Figure 9 schematically illustrates a decoding apparatus in more detail;
Figures 10 and 11 schematically illustrate example selection patterns;
Figure 12 schematically illustrates a stereoscopic image pair;
Figure 13 schematically illustrates an image portion;
Figure 14 schematically illustrates a stereoscopy processor;
Figure 15 schematically illustrates a stereoscopic image pair;
Figure 16 schematically represents image regions;
Figure 17 schematically illustrates parameter selection; and
Figures 18 and 19 are schematic flowcharts illustrating respective methods.
DESCRIPTION OF THE EMBODIMENTS
Referring now to the drawings, Figures 1-4 are provided to give schematic illustrations of apparatus or systems making use of the encoding and/or decoding apparatus to be described below in connection with embodiments of the disclosure.
All of the data encoding and/or decoding apparatus to be described below may be implemented in hardware, in software running on a general-purpose data processing apparatus such as a general-purpose computer, as programmable hardware such as an application specific integrated circuit (ASIC) or field programmable gate array (FPGA) or as combinations of these. In cases where the embodiments are implemented by software and/or firmware, it will be appreciated that such software and/or firmware, and non-transitory machine-readable data storage media by which such software and/or firmware are stored or otherwise provided, are considered as embodiments of the present disclosure.
Figure 1 schematically illustrates an audio/video data transmission and reception system using image data encoding and decoding.
An input audio/video signal 10 is supplied to a video data encoding apparatus 20 which encodes at least the video component of the audio/video signal 10 for transmission along a transmission route 30 such as a cable, an optical fibre, a wireless link or the like. The encoded signal is processed by a decoding apparatus 40 to provide an output audio/video signal 50. For the return path, an encoding apparatus 60 encodes an audio/video signal for transmission along the transmission route 30 to a decoding apparatus 70.
The encoding apparatus 20 and decoding apparatus 70 can therefore form one node of a transmission link. The decoding apparatus 40 and encoding apparatus 60 can form another node of the transmission link. Of course, in instances where the transmission link is unidirectional, only one of the nodes would require an encoding apparatus and the other node would only require a decoding apparatus.
Figure 2 schematically illustrates a video display system using image data decoding. In particular, an encoded audio/video signal 100 is processed by a decoding apparatus 110 to provide a decoded signal which can be displayed on a display 120. The decoding apparatus 110 could be implemented as an integral part of the display 120, for example being provided within the same casing as the display device. Alternatively, the decoding apparatus 110 may be provided as (for example) a so-called set top box (STB), noting that the expression set-top does not imply a requirement for the box to be sited in any particular orientation or position with respect to the display 120; it is simply a term used in the art to indicate a device which is connectable to a display as a peripheral device.
Figure 3 schematically illustrates an audio/video storage system using image data encoding and decoding. An input audio/video signal 130 is supplied to an encoding apparatus 140 which generates an encoded signal for storing by a store device 150 such as a magnetic disk device, an optical disk device, a magnetic tape device, a solid state storage device such as a semiconductor memory or other storage device. For replay, encoded data is read from the store device 150 and passed to a decoding apparatus 160 for decoding to provide an output audio/video signal 170.
It will be appreciated that the encoded signal, and a storage medium storing that signal, are considered as embodiments of the present disclosure.
Figure 4 schematically illustrates a video camera using image data encoding. In Figure 4, an image capture device 180, such as a charge coupled device (CCD) image sensor and associated control and read-out electronics, generates a video signal which is passed to an encoding apparatus 190. A microphone (or plural microphones) 200 generates an audio signal to be passed to the encoding apparatus 190. The encoding apparatus 190 generates an encoded audio/video signal 210 to be stored and/or transmitted (shown generically as a schematic stage 220).
Therefore, it will be appreciated that encoding and/or decoding apparatus as discussed here can be embodied in video storage, transmission, capture or display apparatus. Examples of these types of apparatus can include computer games machines.
The encoding can include data compression, or can be lossless. The decoding can (where appropriate for the format of the encoded data) include data decompression.
The techniques to be described below relate in some cases to video data encoding and decoding. However, they can be applied to image data encoding and decoding. Examples include the intra-image techniques to be discussed, which can be applied to single images. In this context, references in the description of the embodiments to “video” should be understood, where the context does not explicitly disallow such an interpretation, to relate also to “image” handling techniques. However, in at least some examples, the encoder and the detector (discussed below) are configured to operate in respect of successive stereoscopic image pairs of a stereoscopic video signal.
It will also be appreciated that many existing techniques may be used for audio data encoding in conjunction with the video data encoding techniques which will be described, to generate an encoded audio/video signal. Accordingly, a separate discussion of audio data encoding will not be provided. It will also be appreciated that the data rate associated with video data, in particular broadcast quality video data, is generally very much higher than the data rate associated with audio data (whether compressed or not compressed). It will therefore be appreciated that unencoded audio data could accompany encoded video data to form an encoded audio/video signal. It will further be appreciated that although the present examples (shown in Figures 1-4) relate to audio/video data, the techniques to be described below can find use in a system which simply deals with (that is to say, encodes, decodes, stores, displays and/or transmits) video or image data. That is to say, the embodiments can apply to video data encoding without necessarily having any associated audio data handling at all.
Therefore, Figures 1-4 provide examples of image data capture, reproduction, storage and/or transmission apparatus comprising apparatus according to any of the techniques to be discussed.
Figure 5 schematically illustrates a set of successive stereoscopic image pairs captured at respective capture times t1, t2, t3, t4 and so on. The two images captured at a particular capture time (such as the images 500) represent a stereoscopic image pair of a left and a right image for display to the left and right eyes respectively.
The present techniques are applicable to a single stereoscopic image pair such as the pair 500. However, the techniques are also applicable to a succession of stereoscopic image pairs, for example forming a stereoscopic video signal. In an example of such a video signal, the times t1...t4 (and so on) are separated by an image period applicable to the video signal, such as 1/50 second.
Between a left and a right image forming a particular stereoscopic image pair, there would generally be expected to be differences, for example relating to the representation of image depth. Depth is represented in such a stereoscopic image pair by disparity, which is to say, a lateral (left-to-right) difference in image position which is dependent upon the perceived depth of that image feature. In some examples, a zero disparity would be applicable to objects at infinity, with the disparity generally increasing the closer the object is to the camera view point.
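Purely by way of illustration (this sketch forms no part of the described apparatus; the parallel-camera pinhole model and the focal length and baseline values are assumptions), the following Python sketch shows how disparity falls with depth, reaching zero for objects at infinity:

import math

def disparity_px(depth_m, focal_length_px=1000.0, baseline_m=0.065):
    # Lateral left/right offset, in pixels, of a feature at the given depth under a
    # simple parallel-camera pinhole model: disparity = focal length x baseline / depth.
    if math.isinf(depth_m):
        return 0.0
    return focal_length_px * baseline_m / depth_m

print(disparity_px(0.5))           # nearby object: large disparity (~130 px)
print(disparity_px(10.0))          # distant object: small disparity (~6.5 px)
print(disparity_px(float("inf")))  # object at infinity: zero disparity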
Examples of the present techniques to be discussed below can make use of the general similarity between left and right images, as well as general similarity between successive images captured at successive points in time separated by the image period in the case of a video signal.
Figure 6 schematically illustrates an image data encoding apparatus to encode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a detector 600 to detect image differences between a left image and a right image; and an encoder 610 to generate encoded data representing a stereoscopic image in dependence upon the detected image differences.
Figure 7 schematically illustrates an image data decoding apparatus to decode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a decoder 700 to decode encoded data representing image differences between a left image and a right image, and to reproduce a stereoscopic image pair in dependence upon the decoded image differences.
The apparatus of Figures 6 and 7 will now be described in more detail with reference to Figures 8 and 9.
Figure 8 provides a schematic overview of an image data encoding apparatus such as a video data compression apparatus.
A controller 800 controls the overall operation of the apparatus and, in particular when referring to a selection pattern (to be described below) controls the generation of such a selection pattern. The controller 800 can also select various modes of operation such as sample block sizes and/or configurations.
Successive images of an input video signal 810 are supplied to an adder 820 and to an image predictor 830. The adder 820 in fact performs a subtraction (negative addition) operation, in that it receives the input video signal 810 on a "+" input and the output of the image predictor 830 on a "-" input, so that the predicted image is subtracted from the input image. The result is to generate a so-called residual image signal 840 representing the difference between the actual and predicted images.
One reason why a residual image signal is generated is as follows. Some data coding techniques, that is to say the techniques which will be applied to the residual image signal, tend to work more efficiently when there is less energy in the image to be encoded. Here, the term efficiently refers to the generation of a small amount of encoded data; for a particular image quality level, it is desirable (and considered efficient) to generate as little data as is practicably possible. The reference to energy in the residual image relates to the amount of information contained in the residual image. If the predicted image were to be identical to the real image, the difference between the two (that is to say, the residual image) would contain zero information (zero energy) and would be very easy to encode into a small amount of encoded data. In general, if the prediction process can be made to work reasonably well, the expectation is that the residual image data will contain less information (less energy) than the input image and so will be easier to encode into a small amount of encoded data.
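Purely as an illustration of this point (the helper below is an assumption made for the sketch, not part of the apparatus of Figure 8), a perfect prediction leaves a zero-energy residual while a poor prediction leaves most of the information still to be encoded:

import numpy as np

def residual_energy(actual, predicted):
    # Sum-of-squares "energy" of the residual (actual minus predicted) image.
    residual = actual.astype(np.int32) - predicted.astype(np.int32)
    return float(np.sum(residual.astype(np.float64) ** 2))

rng = np.random.default_rng(0)
actual = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
print(residual_energy(actual, actual))                      # perfect prediction: 0.0
print(residual_energy(actual, np.zeros((8, 8), np.uint8)))  # poor prediction: large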
The residual image data 840 is supplied to an encoder 850 which generates output encoded data 860.
One example of the operation of the encoder 850 is to generate a discrete cosine transform (DCT) or other transformed representation (such as a discrete sine transform or DST representation) of the residual image data. A set of transformed coefficients for each transformed block of image data can then be quantised (if data compression is being used).
Various quantisation techniques are known in the field of image data compression, ranging from a simple multiplication by a quantisation scaling factor through to the application of complicated lookup tables under the control of a quantisation parameter. The general aim is twofold. Firstly, the quantisation process reduces the number of possible values of the transformed data. Secondly, the quantisation process can increase the likelihood that values of the transformed data are zero. Both of these can make the entropy encoding process, to be described below, work more efficiently in generating small amounts of compressed video data.
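As a minimal sketch of these two steps (an orthonormal DCT-II followed by a single uniform quantisation step; the block size and quantisation step values are illustrative assumptions, not taken from the description):

import numpy as np

N = 8
n = np.arange(N)
# Orthonormal DCT-II basis matrix: element [k, n] for frequency k and sample n.
dct_basis = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
dct_basis[0, :] = np.sqrt(1.0 / N)

def transform_and_quantise(block, q_step):
    # 2D DCT of an 8x8 residual block followed by uniform quantisation.
    coeffs = dct_basis @ block.astype(np.float64) @ dct_basis.T
    return np.round(coeffs / q_step).astype(np.int32)

block = np.random.default_rng(1).integers(-20, 20, size=(N, N))
print(np.count_nonzero(transform_and_quantise(block, q_step=1.0)))   # fine step: many non-zero coefficients
print(np.count_nonzero(transform_and_quantise(block, q_step=16.0)))  # coarse step: mostly zeros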
A data scanning process can then be applied. The purpose of the scanning process is to reorder the quantised transformed data so as to gather as many as possible of the non-zero quantised transformed coefficients together, and of course therefore to gather as many as possible of the zero-valued coefficients together. These features can allow so-called run-length coding or similar techniques to be applied efficiently. So, the scanning process involves selecting coefficients from the quantised transformed data, and in particular from a block of coefficients corresponding to a block of image data which has been transformed and quantised, according to a scanning order so that (a) all of the coefficients are selected once as part of the scan, and (b) the scan tends to provide the desired reordering. One example scanning order which can tend to give useful results is a so-called up-right diagonal scanning order, although in example embodiments to be discussed below, other scanning orders will be considered.
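The following standalone sketch illustrates one plausible up-right diagonal ordering for an n x n block of quantised coefficients (real encoders fix their own precise scan tables, so the exact order shown here is an assumption):

def up_right_diagonal_scan(n):
    # Visit each anti-diagonal of an n x n block, moving up and to the right
    # within the diagonal, so every coefficient position is selected exactly once.
    for d in range(2 * n - 1):
        row = min(d, n - 1)
        col = d - row
        while row >= 0 and col < n:
            yield row, col
            row -= 1
            col += 1

print(list(up_right_diagonal_scan(4)))
# [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2), (3, 0), ...]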
The scanned coefficients may then be passed to an entropy encoder. Again, various types of entropy encoding may be used. Two examples are variants of the so-called CABAC (Context Adaptive Binary Arithmetic Coding) system and variants of the so-called CAVLC (Context Adaptive Variable-Length Coding) system. In general terms, CABAC is considered to provide a better efficiency, and in some studies has been shown to provide a 10-20% reduction in the quantity of encoded output data for a comparable image quality compared to CAVLC. However, CAVLC is considered to represent a much lower level of complexity (in terms of its implementation) than CABAC. Note that the scanning process and the entropy encoding process are shown as separate processes, but in fact can be combined or treated together. That is to say, the reading of data into the entropy encoder can take place in the scan order. Corresponding considerations apply to the respective inverse processes to be described below.
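A toy run-length pass over a scanned coefficient sequence (illustrative only; practical entropy coders such as CABAC and CAVLC are far more elaborate) shows why gathering the zero-valued coefficients together pays off:

def run_length_encode(coefficients):
    # Encode each non-zero coefficient as (number of preceding zeros, value);
    # a long tail of zeros costs almost nothing once they are gathered together.
    pairs, zero_run = [], 0
    for value in coefficients:
        if value == 0:
            zero_run += 1
        else:
            pairs.append((zero_run, value))
            zero_run = 0
    return pairs

scanned = [13, -4, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(run_length_encode(scanned))   # [(0, 13), (0, -4), (0, 2), (1, 1)]; the trailing zeros are never emitted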
The output of the entropy encoding, along with additional data (mentioned above and/or discussed below), for example defining the manner in which the predictor 830 generated the predicted image, provides a compressed output video signal 860.
However, a return path 870 is also provided because the operation of the predictor 830 itself depends upon a decompressed version of the compressed output data.
The reason for this feature is as follows. At the appropriate stage in the decompression process (to be described below) a decompressed version of the residual data is generated. This decompressed residual data has to be added to a predicted image to generate an output image (because the original residual data was the difference between the input image and a predicted image). In order that this process is comparable, as between the compression side and the decompression side, the predicted images generated by the predictor 830 should be the same during the compression process and during the decompression process. Of course, at decompression, the apparatus does not have access to the original input images, but only to the decompressed images. Therefore, at compression, the predictor 830 bases its prediction (at least, for inter-image encoding) on decompressed versions of the compressed images.
The entropy encoding process is considered to be lossless, which is to say that it can be reversed to arrive at exactly the same data which was first supplied to the entropy encoder. So, the return path 870 can be implemented before the entropy encoding stage. Indeed, the scanning process is also considered lossless and so the return path can be implemented before that process.
In general terms, a decoder 880 in the return path provides the functions of entropy decoding, a reverse scan unit, an inverse quantisation and an inverse transform, all complementary to the respective functions of the encoder 850. For now, the discussion will continue through the encoding process; the process to decode an input encoded video signal will be discussed separately below.
In the encoding process, the decoder 880 acts on data from the return path 870 to generate a compressed-decompressed residual image signal 885.
The image signal 885 is added, at an adder 890, to the output of the predictor 830 to generate a reconstructed output image 895. This forms one input to the image predictor 830.
Turning now to the process applied to decode a received encoded image data signal 900, Figure 9 schematically illustrates a decoding apparatus.
In Figure 9, the received encoded image data signal 900 is provided to a decoder 910 equivalent in function to the decoder 880 of Figure 8 and operating under the control of a controller 920. The decoder 910 provides (in this example) the functions of entropy decoding, reverse scan, inverse quantisation and inverse transform, all complementary to the respective functions of the encoder 850. The decoder 910 generates a decoded image data signal 930 which is added, by an adder 940, to the output of a predictor 950. The predictor operates in the same manner as the predictor 830, on the basis of image data 960 generated as the sum of the predicted image data 970 and the decompressed residual image data 930 generated by the decoder 910. The output of the apparatus is also provided by the image data 960.
Figures 10 and 11 schematically illustrate example selection patterns which may be selected by the controller 800 and correspondingly by the controller 920.
Before discussing Figures 10 and 11 in detail, further context will be provided regarding the use of so-called selection patterns.
In examples of the present disclosure, at least some images of one or more stereoscopic image pairs are encoded by reference to image differences between a left image and a right image. In Figures 8 and 9, these image differences are handled by the adders 805, 980. The adder 805 subtracts one image from another (for example, the other image of the same image pair) on a pixel-by-pixel or region-by-region basis. The adder 980 adds back the difference data decoded for one image to the corresponding primary image. So, some images are encoded "directly" (meanings of which will be discussed further below) and some are encoded according to differences from an image of the opposite polarity (where left and right are respective polarities).
In some examples, this process can be carried out for individual image pairs (so that there is no dependency of either image of a stereoscopic image pair on any other image outside of that particular stereoscopic image pair). The choice of which image is encoded without reference to the other image (a "primary" image in the terminology of Figures 10 and 11) and which image is encoded by reference to differences from the primary image can be a fixed selection, so that (purely for example) it is always the left image which is the primary image and the right image which is encoded by reference to differences. This fixed selection is one example of a selection pattern. Figure 10 shows such a selection pattern in which, for successive stereoscopic image pairs 1...4, the left image (L1...L4) is encoded as the primary image and the respective right image (R1...R4) is encoded by reference to differences from the left image. Note that although the difference is expressed as Ln-Rn, an equivalent operation could be obtained by deriving the complementary difference Rn-Ln.
In the example just discussed, so-called intra-image processing is applied to the primary image, so that the predictor 830 and the equivalent predictor 950 in Figures 8 and 9 derive their predictions from other portions of the respective primary image. In such examples, the first encoded data is decodable to allow reproduction of the first image without reference to encoded image data representing other encoded images.
Therefore, in these examples, the encoder is configured to generate:
first encoded image data (the primary data in Figure 10) which is decodable to allow reproduction of a first image of a stereoscopic image pair without reference to encoded image data relating to a second image, being the other image of the stereoscopic image pair; and second encoded image data (the Difference data in Figure 10) for the other of the stereoscopic image pair, the second encoded data representing differences, detected by the detector, between the two images of the stereoscopic image pair.
For the difference image, one way of implementing an encoding process is to provide, as the input 810 in Figure 8, a difference image generated by subtracting Rn from Ln (or the other way round) and encoding that as discussed above. Another possible technique is that the predictor 830 operates only to generate predicted images based upon one of the stereoscopic image pair (the primary image in this example) even when encoding the other of the stereoscopic image pair.
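A minimal sketch of the first approach (the helper names are assumptions; a real encoder would pass the primary image and the difference image through the apparatus of Figure 8 rather than keep them as raw arrays). The complementary difference (secondary minus primary) is used here so that the secondary image is recovered by simple addition:

import numpy as np

def make_primary_and_difference(left, right, primary_is_left=True):
    # Keep one image of the pair as the primary and express the other as a
    # (complementary) difference from it, per the selection made for this pair.
    left16, right16 = left.astype(np.int16), right.astype(np.int16)
    if primary_is_left:
        return left, right16 - left16    # primary = L, difference = R - L
    return right, left16 - right16       # primary = R, difference = L - R

def reconstruct_secondary(primary, difference):
    # Decoder side: add the decoded difference back onto the decoded primary image.
    return np.clip(primary.astype(np.int16) + difference, 0, 255).astype(np.uint8)

rng = np.random.default_rng(2)
left = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)
right = np.clip(left.astype(np.int16) + 3, 0, 255).astype(np.uint8)   # nearly identical views
primary, diff = make_primary_and_difference(left, right)
assert np.array_equal(reconstruct_secondary(primary, diff), right)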
The discussion above relates to dependencies within a stereoscopic image pair. However, the apparatus of Figures 8 and 9 also allows for temporal dependencies when a stereoscopic video signal is being encoded or decoded, so that the first encoded data is dependent upon one or more images of one or more others of the stereoscopic image pairs.
These temporal dependencies allow the predictor 830 and the corresponding predictor 950 to generate a predicted version of a particular image on the basis of one or more other images. In a non-stereoscopic system these can simply be other images of a video signal. In a stereoscopic encoding and decoding arrangement, various options are available.
In Figure 8, the detector 600 can be implemented as the adder 805, for example. The remainder of Figure 8 can implement the encoder 610. In Figure 9, the arrangement shown can implement the decoder 700.
Figure 10 shows an example (indicated by dotted lines 1000) in which the primary images Ln have a temporal dependency upon one or more other primary images Ln.
Optionally, the difference images Ln-Rn can also have a temporal dependency in their encoding upon other difference images, indicated by the dotted lines 1010. Of course, even if the explicit temporal dependency 1010 is not provided, in such an arrangement the decoding of the difference images would still have a temporal aspect because of the temporal dependencies of the primary images.
Figure 11 schematically illustrates another selection pattern. Here, the primary image is selected so as to alternate between the left and the right image of each successive stereoscopic image pair of a video signal, so that in the example of Figure 11, the primary image is: L1, R2, L3, R4 and so on. The other image is encoded as a difference image as before. Similarly to Figure 10, temporal dependencies can also be introduced, as indicated by broken lines between successive image pairs.
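A small sketch of the two selection patterns discussed (the pattern names and the 0-based pair indexing are assumptions made for illustration only):

def primary_polarity(pair_index, pattern="alternate"):
    # Which image of the pair (left "L" or right "R") is encoded as the primary image.
    if pattern == "fixed_left":       # Figure 10: the left image is always primary
        return "L"
    if pattern == "alternate":        # Figure 11: L1, R2, L3, R4, ...
        return "L" if pair_index % 2 == 0 else "R"
    raise ValueError("unknown selection pattern: " + pattern)

print([primary_polarity(i, "fixed_left") for i in range(4)])  # ['L', 'L', 'L', 'L']
print([primary_polarity(i, "alternate") for i in range(4)])   # ['L', 'R', 'L', 'R']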
Therefore, in an example arrangement applied to video encoding, the encoder and decoder can be configured to operate in respect of successive stereoscopic image pairs (such as those shown in Figures 10 and 11) of a stereoscopic video signal. The encoder can be configured to select a first or primary image of each stereoscopic image pair according to a selection pattern. In the example of Figure 11, the selection pattern may include instances of each polarity of the first image. The particular example of Figure 11 represents an alternate selection of each polarity, but it will be appreciated that different selection patterns may be used.
The temporal dependency can be within the same polarity or across polarities, so that in some examples the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the same polarity as the polarity of the first image, and in other examples the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the opposite polarity to that of the polarity of the first image.
Figures 12 to 15 provide an example of a pre-processing technique which can be applied to the images to be encoded by the apparatus of Figure 8. By potentially reducing image differences between the left and right images, the arrangements discussed above can potentially operate more efficiently. In Figures 12 to 15, a stereoscopy processor 1400 is configured to detect portions of the stereoscopic image pair having an image disparity less than a threshold image disparity (and therefore representing image features of at least a threshold depth or distance from the camera view point) and to generate replacement image content for one or both images of the stereoscopic image pair such that those portions are identical as between the left and right images of the stereoscopic image pair.
Referring to Figure 12, example left and right images are shown. Examining the image disparity in the left and right images, a region 1300 in which the disparity is greater than a threshold image disparity is identified. Image features in the region 1300 are considered to be closer than a threshold distance to the camera viewpoint. Image features in the remainder of the image are considered to be further away than the threshold distance or depth from the camera viewpoint.
Referring to Figure 14, the stereoscopy processor 1400 comprises a detector 1410 to detect whether the image disparity is less than or greater than a threshold image disparity and to generate and store, in a map store 1420, a map or mask indicating image portions which are closer than, or further than, the threshold distance or depth from the camera viewpoint.
A generator 1430 acts to generate a single version of the image features having a disparity of less than the threshold amount and to apply that as a background in all regions other than the identified region 1300 of the image. The result is a stereoscopic image pair such as that shown in Figure 15, in which the background (all regions except the region 1300) is identical as between the two images, whereas the foreground (the region 1300, representing image features closer than the threshold depth) is different, being the foreground region from each of the original left and right images of Figure 12.
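A rough sketch of this pre-processing step (the per-pixel disparity map, the threshold and the choice of an averaged view as the single shared version are all assumptions; the description leaves the disparity estimation method open):

import numpy as np

def equalise_distant_regions(left, right, disparity, threshold):
    # Where the estimated disparity magnitude is below the threshold (features at or
    # beyond the threshold depth), write one shared version of the content into both
    # views so that those portions become bit-identical; nearer regions are kept as-is.
    background = np.abs(disparity) < threshold
    shared = ((left.astype(np.uint16) + right.astype(np.uint16)) // 2).astype(left.dtype)
    out_left, out_right = left.copy(), right.copy()
    out_left[background] = shared[background]
    out_right[background] = shared[background]
    return out_left, out_right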
As discussed above, the encoder can be configured to generate compressed encoded data and the decoder can be configured to generate output decoded data from a compressed data input. The degree of compression does not however need to be identical across the whole image.
Figure 16 schematically illustrates an arrangement in which multiple variations in the degree of compression are applied, so that a more central region 1600 of each image 1610 is compressed the least (providing a greater amount of output data than would be the case if a greater degree of compression were used), a next outermost image region 1612 is compressed more than the region 1600, and a peripheral region 1620 is compressed by a greater degree of compression than the region 1600 or 1612. So, the image compression is more aggressive (providing a greater degree of image compression) in the peripheral region 1620 than in a central region 1600.
Figure 17 schematically represents apparatus which can form part of the encoder 850, in which an image region detector 1700 detects whether a current portion or block to be compressed is within a central image region 1600 or a peripheral image region, and a parameter selector 1710 selects compression parameters to be applied to that image region as a result so that a more central region is compressed with a lesser degree of compression than a more peripheral region. The modules 1700, 1710 can be implemented by the controller 800, with corresponding operations being carried out at decoding by the controller 920. A data compressor 1720 carries out the data compression according to the selected parameters.
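An illustrative parameter-selection rule of this kind (the region radii and quantisation step values below are invented for the sketch, not taken from the description):

def quantisation_step_for_block(block_cx, block_cy, width, height):
    # Coarser quantisation (more aggressive compression) the further the block
    # centre lies from the image centre, giving three concentric regions.
    dx = (block_cx - width / 2.0) / width
    dy = (block_cy - height / 2.0) / height
    r = (dx * dx + dy * dy) ** 0.5
    if r < 0.15:      # central region (cf. region 1600)
        return 8.0
    if r < 0.35:      # intermediate region (cf. region 1612)
        return 16.0
    return 32.0       # peripheral region (cf. region 1620)

print(quantisation_step_for_block(960, 540, 1920, 1080))   # image centre: step 8.0
print(quantisation_step_for_block(50, 50, 1920, 1080))     # corner: step 32.0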
Figure 18 is a schematic flowchart illustrating an image data encoding method comprising:
detecting (at a step 1800) image differences between a left image and a right image of one or more stereoscopic image pairs; and generating (at a step 1810) encoded data representing a stereoscopic image in dependence upon the detected image differences.
Figure 19 is a schematic flowchart illustrating an image data decoding method comprising:
decoding (at a step 1900) encoded data representing image differences between a left image and a right image of one or more stereoscopic image pairs; and reproducing (at a step 1910) a stereoscopic image pair in dependence upon the decoded image differences.
It will be appreciated that example embodiments can be implemented by computer software operating on a general purpose computing system such as a games machine. In these examples, computer software, which when executed by a computer, causes the computer to carry out any of the methods discussed above is considered as an embodiment of the present disclosure. Similarly, embodiments of the disclosure are provided by a non-transitory, machine-readable storage medium which stores such computer software.
It will also be apparent that numerous modifications and variations of the present disclosure are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the disclosure may be practised otherwise than as specifically described herein.

Claims (27)

1. Image data encoding apparatus to encode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a detector to detect image differences between a left image and a right image; and an encoder to generate encoded data representing a stereoscopic image in dependence upon the detected image differences.
2. Apparatus according to claim 1, in which the encoder is configured to generate:
first encoded image data which is decodable to allow reproduction of a first image of a stereoscopic image pair without reference to encoded image data relating to a second image, being the other image of the stereoscopic image pair; and second encoded image data for the other of the stereoscopic image pair, the second encoded data representing differences, detected by the detector, between the two images of the stereoscopic image pair.
3. Apparatus according to claim 2, in which the first encoded data is dependent upon one or more images of one or more others of the stereoscopic image pairs.
4. Apparatus according to claim 3, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the same polarity as the polarity of the first image.
5. Apparatus according to claim 3, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the opposite polarity to that of the polarity of the first image.
6. Apparatus according to claim 2, in which the first encoded data is decodable to allow reproduction of the first image without reference to encoded image data representing other encoded images.
7. Apparatus according to any one of claims 2 to 6, comprising a stereoscopy processor configured to detect portions of the stereoscopic image pair having an image disparity less than a threshold image disparity and to generate replacement image content for one or both images of the stereoscopic image pair such that those portions are identical as between the two images of the stereoscopic image pair.
8. Apparatus according to any one of the preceding claims, in which the encoder is configured to generate compressed encoded data.
9. Apparatus according to claim 8, in which the encoder is configured to apply a greater degree of compression at one or more peripheral image regions than at a central image region.
10. Video encoding apparatus comprising apparatus according to any one of the preceding claims as dependent upon claim 2, in which the encoder and the detector are configured to operate in respect of successive stereoscopic image pairs of a stereoscopic video signal.
11. Apparatus according to claim 10, in which the encoder is configured to select a first image of each stereoscopic image pair according to a selection pattern which includes instances of each polarity of the first image.
12. Apparatus according to claim 11, in which the selection pattern comprises an alternate selection of each polarity.
13. Image data decoding apparatus to decode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a decoder to decode encoded data representing image differences between a left image and a right image, and to reproduce a stereoscopic image pair in dependence upon the decoded image differences.
14. Apparatus according to claim 13, in which the decoder is configured to decode:
first encoded image data which is decodable to allow reproduction of a first image of a stereoscopic image pair without reference to encoded image data relating to a second image, being the other image of the stereoscopic image pair; and second encoded image data for the other of the stereoscopic image pair, the second encoded data representing differences between the two images of the stereoscopic image pair.
15. Apparatus according to claim 14, in which the first encoded data is dependent upon one or more images of one or more others of the stereoscopic image pairs.
16. Apparatus according to claim 15, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the same polarity as the polarity of the first image.
17. Apparatus according to claim 15, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the opposite polarity to that of the polarity of the first image.
18. Apparatus according to claim 14, in which the first encoded data is decodable to allow reproduction of the first image without reference to encoded image data representing other encoded images.
19. Apparatus according to any one of claims 14 to 18, in which the encoded data is compressed encoded data.
20. Video decoding apparatus comprising apparatus according to any one of claims 14 to 19, in which the decoder is configured to operate in respect of successive stereoscopic image pairs of a stereoscopic video signal.
21. Apparatus according to claim 20, in which the decoder is configured to select a first image of each stereoscopic image pair according to a selection pattern which includes instances of each polarity of the first image.
22. Apparatus according to claim 21, in which the selection pattern comprises an alternate selection of each polarity.
23. Image data capture, reproduction, storage and/or transmission apparatus comprising apparatus according to any one of the preceding claims.
24. An image data encoding method comprising:
detecting image differences between a left image and a right image of one or more stereoscopic image pairs; and generating encoded data representing a stereoscopic image in dependence upon the detected image differences.
25. An image data decoding method comprising:
decoding encoded data representing image differences between a left image and a right image of one or more stereoscopic image pairs; and reproducing a stereoscopic image pair in dependence upon the decoded image differences.
26. Computer software which, when executed by a computer, causes the computer to perform the method of claim 24 or 25.
27. A non-transitory, machine-readable storage medium which stores computer software according to claim 26.
Intellectual Property Office
Application No: GB1622168.1    Examiner: Dr Andrew Rose
GB1622168.1A 2016-12-23 2016-12-23 Image data encoding and decoding Withdrawn GB2558277A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB1622168.1A GB2558277A (en) 2016-12-23 2016-12-23 Image data encoding and decoding
PCT/GB2017/053802 WO2018115841A1 (en) 2016-12-23 2017-12-19 Image data encoding and decoding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1622168.1A GB2558277A (en) 2016-12-23 2016-12-23 Image data encoding and decoding

Publications (2)

Publication Number Publication Date
GB201622168D0 GB201622168D0 (en) 2017-02-08
GB2558277A true GB2558277A (en) 2018-07-11

Family

ID=58360739

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1622168.1A Withdrawn GB2558277A (en) 2016-12-23 2016-12-23 Image data encoding and decoding

Country Status (2)

Country Link
GB (1) GB2558277A (en)
WO (1) WO2018115841A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113141494A (en) * 2020-01-20 2021-07-20 北京芯海视界三维科技有限公司 3D image processing method and device and 3D display terminal

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1986003924A1 (en) * 1984-12-17 1986-07-03 Nippon Hoso Kyokai System for transmitting stereoscopic television pictures
US8451320B1 (en) * 2009-01-23 2013-05-28 Next3D, Inc. Methods and apparatus for stereoscopic video compression, encoding, transmission, decoding and/or decompression
JP5293463B2 (en) * 2009-07-09 2013-09-18 ソニー株式会社 Image processing apparatus, image processing method, and program
JP2011109398A (en) * 2009-11-17 2011-06-02 Sony Corp Image transmission method, image receiving method, image transmission device, image receiving device, and image transmission system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030001758A (en) * 2001-06-27 2003-01-08 한국전자통신연구원 Apparatus and Method for stereoscopic video coding/decoding with motion and disparity compensated prediction
WO2010108024A1 (en) * 2009-03-20 2010-09-23 Digimarc Coporation Improvements to 3d data representation, conveyance, and use
WO2013030456A1 (en) * 2011-08-30 2013-03-07 Nokia Corporation An apparatus, a method and a computer program for video coding and decoding
WO2016124710A1 (en) * 2015-02-05 2016-08-11 Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Multi-view video codec supporting residual prediction

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TECH et al., "Overview of the Multiview and 3D Extensions of High Efficiency Video Coding", IEEE TCSVT 26(1), Jan 2016. *

Also Published As

Publication number Publication date
WO2018115841A1 (en) 2018-06-28
GB201622168D0 (en) 2017-02-08

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)