GB2558277A - Image data encoding and decoding
- Publication number
- GB2558277A (application GB1622168.1A)
- Authority
- GB
- United Kingdom
- Legal status: Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/115—Selection of the code volume for a coding unit prior to coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Abstract
Encoding stereoscopic image pairs comprising left and right eye images, comprising detecting image differences L1-R1 between left L1 and right R1 image frames and encoding data in dependence upon the detected differences, delta or disparity. Preferably, a first image of a first polarity or view L in a first stereoscopic image pair 1 is encoded in dependence on an image of the second (opposite) polarity or view R in another stereoscopic pair 2, and the second image of the first stereoscopic pair is encoded as data representing differences L1-R1 between the first and second images of the first stereoscopic pair. Alternatively, an encoding processor detects portions of the two stereo images that have an image disparity less than a threshold and generates replacement image data to render said portions identical (i.e. it quantises the difference between the images to zero if the portions are similar). Greater compression may be applied to the peripheral regions of the image than the central portions. In another embodiment, as stereo pairs are sequentially processed, the designated first image (upon which the second image is differentially encoded) alternates between images of different polarities. Corresponding decoders are disclosed.
Description
(21) Application No: 1622168.1
(22) Date of Filing: 23.12.2016
(51) INT CL:
H04N 19/597 (2014.01); H04N 19/115 (2014.01); H04N 19/167 (2014.01); H04N 19/176 (2014.01); H04N 19/85 (2014.01); H04N 13/161 (2018.01); H04N 19/124 (2014.01); H04N 19/172 (2014.01); H04N 19/503 (2014.01)
(71) Applicant(s):
Sony Interactive Entertainment Inc.
1-7-1 Konan, Minato-Ku 108-8270, Tokyo, Japan
(72) Inventor(s):
Ian Henry Bickerstaff
Sharwin Winesh Raghoebardajai
(74) Agent and/or Address for Service:
D Young & Co LLP
120 Holborn, LONDON, EC1N 2DY, United Kingdom
(54) Title of the Invention: Image data encoding and decoding
(56) Documents Cited:
WO 2016/124710 A1; WO 2013/030456 A1; WO 2010/108024 A1; KR 20030001758
TECH, "Overview of the Multiview and 3D Extensions of High Efficiency Video Coding", IEEE TCSVT 26(1), Jan 2016
(58) Field of Search:
INT CL H04N
Other: EPODOC; INSPEC; INTERNET; WPI; Patent Fulltext
Abstract Title: Stereoscopic image encoding using differences between the images of a pair to encode one of the images of the pair
Left | Right | Primary | Difference
---|---|---|---
L1 | R1 | L1 | L1-R1
L2 | R2 | R2 | R2-L2
L3 | R3 | L3 | L3-R3
L4 | R4 | R4 | R4-L4

FIG. 11
This print incorporates corrections made under Section 117(1) of the Patents Act 1977.
[Drawing sheets 1/6 to 6/6: Figures 1 to 19, as listed in the Brief Description of the Drawings below. The table of Figure 10, in which the left image is always selected as the primary image, is:

Left | Right | Primary | Difference
---|---|---|---
L1 | R1 | L1 | L1-R1
L2 | R2 | L2 | L2-R2
L3 | R3 | L3 | L3-R3
L4 | R4 | L4 | L4-R4

The table of Figure 11, in which the primary image alternates between polarities, is reproduced above. The remaining sheet content comprises block and flowchart labels described in the detailed description.]
IMAGE DATA ENCODING AND DECODING
BACKGROUND
Field
This disclosure relates to data encoding and decoding.
Description of Related Art
There are several video data encoding and decoding systems which involve transforming video data into a frequency domain representation, quantising the frequency domain coefficients and then applying some form of entropy encoding to the quantised coefficients. This can achieve compression of the video data. A corresponding decoding or decompression technique is applied to recover a reconstructed version of the original video data.
SUMMARY
The present disclosure is defined by the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete appreciation of the disclosure and many of the attendant advantages thereof will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:
Figure 1 schematically illustrates an audio/video (A/V) data transmission and reception system using image data encoding and decoding;
Figure 2 schematically illustrates a video display system using image data decoding;
Figure 3 schematically illustrates an audio/video storage system using image data encoding and decoding;
Figure 4 schematically illustrates a video camera using image data encoding;
Figure 5 schematically illustrates a series of stereoscopic image pairs;
Figure 6 schematically illustrates an encoding apparatus;
Figure 7 schematically illustrates a decoding apparatus;
Figure 8 schematically illustrates an encoding apparatus in more detail;
Figure 9 schematically illustrates a decoding apparatus in more detail;
Figures 10 and 11 schematically illustrate example selection patterns;
Figure 12 schematically illustrates a stereoscopic image pair;
Figure 13 schematically illustrates an image portion;
Figure 14 schematically illustrates a stereoscopy processor;
Figure 15 schematically illustrates a stereoscopic image pair;
Figure 16 schematically represents image regions;
Figure 17 schematically illustrates parameter selection; and
Figures 18 and 19 are schematic flowcharts illustrating respective methods.
DESCRIPTION OF THE EMBODIMENTS
Referring now to the drawings, Figures 1-4 are provided to give schematic illustrations of apparatus or systems making use of the encoding and/or decoding apparatus to be described below in connection with embodiments of the disclosure.
All of the data encoding and/or decoding apparatus to be described below may be implemented in hardware, in software running on a general-purpose data processing apparatus such as a general-purpose computer, as programmable hardware such as an application specific integrated circuit (ASIC) or field programmable gate array (FPGA) or as combinations of these. In cases where the embodiments are implemented by software and/or firmware, it will be appreciated that such software and/or firmware, and non-transitory machine-readable data storage media by which such software and/or firmware are stored or otherwise provided, are considered as embodiments of the present disclosure.
Figure 1 schematically illustrates an audio/video data transmission and reception system using image data encoding and decoding.
An input audio/video signal 10 is supplied to a video data encoding apparatus 20 which encodes at least the video component of the audio/video signal 10 for transmission along a transmission route 30 such as a cable, an optical fibre, a wireless link or the like. The encoded signal is processed by a decoding apparatus 40 to provide an output audio/video signal 50. For the return path, an encoding apparatus 60 encodes an audio/video signal for transmission along the transmission route 30 to a decoding apparatus 70.
The encoding apparatus 20 and decoding apparatus 70 can therefore form one node of a transmission link. The decoding apparatus 40 and encoding apparatus 60 can form another node of the transmission link. Of course, in instances where the transmission link is unidirectional, only one of the nodes would require an encoding apparatus and the other node would only require a decoding apparatus.
Figure 2 schematically illustrates a video display system using image data decoding. In particular, an encoded audio/video signal 100 is processed by a decoding apparatus 110 to provide a decoded signal which can be displayed on a display 120. The decoding apparatus 110 could be implemented as an integral part of the display 120, for example being provided within the same casing as the display device. Alternatively, the decoding apparatus 110 may be provided as (for example) a so-called set top box (STB), noting that the expression set-top does not imply a requirement for the box to be sited in any particular orientation or position with respect to the display 120; it is simply a term used in the art to indicate a device which is connectable to a display as a peripheral device.
Figure 3 schematically illustrates an audio/video storage system using image data encoding and decoding. An input audio/video signal 130 is supplied to an encoding apparatus 140 which generates an encoded signal for storing by a store device 150 such as a magnetic disk device, an optical disk device, a magnetic tape device, a solid state storage device such as a semiconductor memory or other storage device. For replay, encoded data is read from the store device 150 and passed to a decoding apparatus 160 for decoding to provide an output audio/video signal 170.
It will be appreciated that the encoded signal, and a storage medium storing that signal, are considered as embodiments of the present disclosure.
Figure 4 schematically illustrates a video camera using image data encoding. In Figure 4, an image capture device 180, such as a charge coupled device (CCD) image sensor and associated control and read-out electronics, generates a video signal which is passed to an encoding apparatus 190. A microphone (or plural microphones) 200 generates an audio signal to be passed to the encoding apparatus 190. The encoding apparatus 190 generates an encoded audio/video signal 210 to be stored and/or transmitted (shown generically as a schematic stage 220).
Therefore, it will be appreciated that encoding and/or decoding apparatus as discussed here can be embodied in video storage, transmission, capture or display apparatus. Examples of these types of apparatus can include computer games machines.
The encoding can include data compression, or can be lossless. The decoding can (where appropriate for the format of the encoded data) include data decompression.
The techniques to be described below relate in some cases to video data encoding and decoding. However, they can be applied to image data encoding and decoding. Examples include the intra-image techniques to be discussed, which can be applied to single images. In this context, references in the description of the embodiments to “video” should be understood, where the context does not explicitly disallow such an interpretation, to relate also to “image” handling techniques. However, in at least some examples, the encoder and the detector (discussed below) are configured to operate in respect of successive stereoscopic image pairs of a stereoscopic video signal.
It will also be appreciated that many existing techniques may be used for audio data encoding in conjunction with the video data encoding techniques which will be described, to generate an encoded audio/video signal. Accordingly, a separate discussion of audio data encoding will not be provided. It will also be appreciated that the data rate associated with video data, in particular broadcast quality video data, is generally very much higher than the data rate associated with audio data (whether compressed or not compressed). It will therefore be appreciated that unencoded audio data could accompany encoded video data to form an encoded audio/video signal. It will further be appreciated that although the present examples (shown in Figures 1-4) relate to audio/video data, the techniques to be described below can find use in a system which simply deals with (that is to say, encodes, decodes, stores, displays and/or transmits) video or image data. That is to say, the embodiments can apply to video data encoding without necessarily having any associated audio data handling at all.
Therefore, Figures 1-4 provide examples of image data capture, reproduction, storage and/or transmission apparatus comprising apparatus according to any of the techniques to be discussed.
Figure 5 schematically illustrates a set of successive stereoscopic image pairs captured at respective capture times t1, t2, t3, t4... . The two images captured at a particular capture time (such as the images 500) represent a stereoscopic image pair of a left and a right image for display to the left and right eyes respectively.
The present techniques are applicable to a single stereoscopic image pair such as the pair 500. However, the techniques are also applicable to a succession of stereoscopic image pairs for example forming a stereoscopic video signal. In an example of such a video signal, the times t1 ...t4 (and so on) are separated by an image period applicable to the video signal such as 1/50s.
Between a left and a right image forming a particular stereoscopic image pair, there would generally be expected to be differences, for example relating to the representation of image depth. Depth is represented in such a stereoscopic image pair by disparity, which is to say, a lateral (left-to-right) difference in image position which is dependent upon the perceived depth of that image feature. In some examples, a zero disparity would be applicable to objects at infinity, with the disparity generally increasing the closer the object is to the camera view point.
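Purely by way of illustration (this is standard stereo camera geometry rather than anything defined in the present disclosure, and the focal length and baseline values below are assumptions chosen only for the example), the way disparity falls towards zero as depth increases can be sketched as:

```python
# Illustrative sketch of standard stereo geometry (not part of this disclosure):
# horizontal disparity falls towards zero as the depth of an object tends to infinity.
def disparity_pixels(depth_m, focal_length_px=1000.0, baseline_m=0.065):
    """Approximate disparity, in pixels, for an object at depth_m metres."""
    if depth_m == float("inf"):
        return 0.0
    return focal_length_px * baseline_m / depth_m

for depth in (0.5, 2.0, 10.0, float("inf")):
    print(f"depth {depth} m -> disparity {disparity_pixels(depth):.1f} px")
```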
Examples of the present techniques to be discussed below can make use of the general similarity between left and right images, as well as general similarity between successive images captured at successive points in time separated by the image period in the case of a video signal.
Figure 6 schematically illustrates an image data encoding apparatus to encode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a detector 600 to detect image differences between a left image and a right image; and an encoder 610 to generate encoded data representing a stereoscopic image in dependence upon the detected image differences.
Figure 7 schematically illustrates an image data decoding apparatus to decode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a decoder 700 to decode encoded data representing image differences between a left image and a right image, and to reproduce a stereoscopic image pair in dependence upon the decoded image differences.
The apparatus of Figures 6 and 7 will now be described in more detail with reference to Figures 8 and 9.
Figure 8 provides a schematic overview of an image data encoding apparatus such as a video data compression apparatus.
A controller 800 controls the overall operation of the apparatus and, in particular when referring to a selection pattern (to be described below) controls the generation of such a selection pattern. The controller 800 can also select various modes of operation such as sample block sizes and/or configurations.
Successive images of an input video signal 810 are supplied to an adder 820 and to an image predictor 830. The adder 820 in fact performs a subtraction (negative addition) operation, in that it receives the input video signal 810 on a “+” input and the output of the image predictor 830 on a “-” input, so that the predicted image is subtracted from the input image. The result is to generate a so-called residual image signal 840 representing the difference between the actual and predicted images.
One reason why a residual image signal is generated is as follows. Some data coding techniques, that is to say the techniques which will be applied to the residual image signal, tend to work more efficiently when there is less energy in the image to be encoded. Here, the term efficiently refers to the generation of a small amount of encoded data; for a particular image quality level, it is desirable (and considered efficient) to generate as little data as is practicably possible. The reference to energy in the residual image relates to the amount of information contained in the residual image. If the predicted image were to be identical to the real image, the difference between the two (that is to say, the residual image) would contain zero information (zero energy) and would be very easy to encode into a small amount of encoded data. In general, if the prediction process can be made to work reasonably well, the expectation is that the residual image data will contain less information (less energy) than the input image and so will be easier to encode into a small amount of encoded data.
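As a minimal sketch of this point (using arbitrary example values rather than anything defined in the disclosure), subtracting a reasonably good prediction from an input block leaves a residual with far less energy than the input itself:

```python
import numpy as np

rng = np.random.default_rng(0)
input_block = rng.integers(0, 256, size=(8, 8)).astype(np.int64)
predicted_block = input_block + rng.integers(-2, 3, size=(8, 8))  # a good prediction

residual = input_block - predicted_block        # the adder 820 acting as a subtractor
print("input energy:   ", int(np.sum(input_block ** 2)))
print("residual energy:", int(np.sum(residual ** 2)))
```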
The residual image data 840 is supplied to an encoder 850 which generates output encoded data 860.
One example of the operation of the encoder 850 is to generate a discrete cosine transform (DCT) or other transformed representation (such as a discrete sine transform or DST representation) of the residual image data. A set of transformed coefficients for each transformed block of image data can then be quantised (if data compression is being used).
Various quantisation techniques are known in the field of image data compression, ranging from a simple multiplication by a quantisation scaling factor through to the application of complicated lookup tables under the control of a quantisation parameter. The general aim is twofold. Firstly, the quantisation process reduces the number of possible values of the transformed data. Secondly, the quantisation process can increase the likelihood that values of the transformed data are zero. Both of these can make the entropy encoding process, to be described below, work more efficiently in generating small amounts of compressed video data.
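A minimal sketch of the simplest of these approaches, division by a quantisation scaling factor followed by rounding, is given below; the coefficient values and step size are illustrative assumptions only:

```python
import numpy as np

coeffs = np.array([312.0, -41.0, 18.0, -6.0, 3.0, -2.0, 1.0, 0.5])
q_step = 10.0

quantised = np.round(coeffs / q_step).astype(int)   # fewer distinct values, more zeros
dequantised = quantised * q_step                    # inverse quantisation is lossy

print(quantised)     # [31 -4  2 -1  0  0  0  0]
print(dequantised)   # [310. -40.  20. -10.   0.   0.   0.   0.]
```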
A data scanning process can then be applied. The purpose of the scanning process is to reorder the quantised transformed data so as to gather as many as possible of the non-zero quantised transformed coefficients together, and of course therefore to gather as many as possible of the zero-valued coefficients together. These features can allow so-called run-length coding or similar techniques to be applied efficiently. So, the scanning process involves selecting coefficients from the quantised transformed data, and in particular from a block of coefficients corresponding to a block of image data which has been transformed and quantised, according to a scanning order so that (a) all of the coefficients are selected once as part of the scan, and (b) the scan tends to provide the desired reordering. One example scanning order which can tend to give useful results is a so-called up-right diagonal scanning order, although in example embodiments to be discussed below, other scanning orders will be considered.
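A minimal sketch of such a scan is shown below for a 4x4 block; the exact scan conventions differ between coding standards, so this is illustrative rather than a definition of any particular codec's order:

```python
import numpy as np

block = np.array([[9, 5, 1, 0],
                  [4, 2, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 0]])

def up_right_diagonal_scan(b):
    n = b.shape[0]
    order = []
    for d in range(2 * n - 1):                    # each anti-diagonal satisfies row + col == d
        for row in range(min(d, n - 1), -1, -1):  # walk from lower-left towards upper-right
            col = d - row
            if col < n:
                order.append(int(b[row, col]))
    return order

print(up_right_diagonal_scan(block))
# Non-zero coefficients cluster at the start and zeros at the end,
# which suits run-length style coding of the zero tail.
```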
The scanned coefficients may then be passed to an entropy encoder. Again, various types of entropy encoding may be used. Two examples are variants of the so-called CABAC (Context Adaptive Binary Arithmetic Coding) system and variants of the so-called CAVLC (Context Adaptive Variable-Length Coding) system. In general terms, CABAC is considered to provide a better efficiency, and in some studies has been shown to provide a 10-20% reduction in the quantity of encoded output data for a comparable image quality compared to CAVLC. However, CAVLC is considered to represent a much lower level of complexity (in terms of its implementation) than CABAC. Note that the scanning process and the entropy encoding process are shown as separate processes, but in fact can be combined or treated together. That is to say, the reading of data into the entropy encoder can take place in the scan order. Corresponding considerations apply to the respective inverse processes to be described below.
The output of the entropy encoding, along with additional data (mentioned above and/or discussed below), for example defining the manner in which the predictor 830 generated the predicted image, provides a compressed output video signal 860.
However, a return path 870 is also provided because the operation of the predictor 830 itself depends upon a decompressed version of the compressed output data.
The reason for this feature is as follows. At the appropriate stage in the decompression process (to be described below) a decompressed version of the residual data is generated. This decompressed residual data has to be added to a predicted image to generate an output image (because the original residual data was the difference between the input image and a predicted image). In order that this process is comparable, as between the compression side and the decompression side, the predicted images generated by the predictor 830 should be the same during the compression process and during the decompression process. Of course, at decompression, the apparatus does not have access to the original input images, but only to the decompressed images. Therefore, at compression, the predictor 830 bases its prediction (at least, for inter-image encoding) on decompressed versions of the compressed images.
The entropy encoding process is considered to be lossless, which is to say that it can be reversed to arrive at exactly the same data which was first supplied to the entropy encoder. So, the return path 870 can be implemented before the entropy encoding stage. Indeed, the scanning process is also considered lossless and so the return path can be implemented before that process.
In general terms, a decoder 880 in the return path provides the functions of entropy decoding, a reverse scan unit, an inverse quantisation and an inverse transform, all complementary to the respective functions of the encoder 850. For now, the discussion will continue through the encoding process; the process to decode an input encoded video signal will be discussed separately below.
In the encoding process, the decoder 880 acts on data from the return path 870 to generate a compressed-decompressed residual image signal 885.
The image signal 885 is added, at an adder 890, to the output of the predictor 830 to generate a reconstructed output image 895. This forms one input to the image predictor 830.
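A minimal sketch of why this loop matters is given below, assuming a simple scalar quantiser; the point is only that the reconstruction formed at the encoder (prediction plus dequantised residual) is exactly what the decoder will also form, so both sides predict from the same data:

```python
import numpy as np

Q_STEP = 8.0  # illustrative quantiser step, not a value from the disclosure

def encode_block(block, prediction):
    residual = block - prediction
    return np.round(residual / Q_STEP).astype(int)       # roughly the role of the encoder 850

def reconstruct_block(quantised_residual, prediction):
    return prediction + quantised_residual * Q_STEP       # decoder 880/910 plus adder 890/940

rng = np.random.default_rng(1)
block = rng.integers(0, 256, size=(4, 4)).astype(float)
prediction = np.full((4, 4), 128.0)

q = encode_block(block, prediction)
reconstruction = reconstruct_block(q, prediction)          # identical at encoder and decoder
print("max reconstruction error:", float(np.max(np.abs(reconstruction - block))))
```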
Turning now to the process applied to decode a received encoded image data signal 900, Figure 9 schematically illustrates a decoding apparatus.
In Figure 9, the received encoded image data signal 900 is provided to a decoder 910 equivalent in function to the decoder 880 of Figure 8 and operating under the control of a controller 920. The decoder 910 provides (in this example) the functions of entropy decoding, reverse scan, inverse quantisation and inverse transform, all complementary to the respective functions of the encoder 850. The decoder 910 generates a decoded image data signal 930 which is added, by an adder 940, to the output of a predictor 950. The predictor operates in the same manner as the predictor 830, on the basis of image data 960 generated as the sum of the predicted image data 970 and the decompressed residual image data 930 generated by the decoder 910. The output of the apparatus is also provided by the image data 960.
Figures 10 and 11 schematically illustrate example selection patterns which may be selected by the controller 800 and correspondingly by the controller 920.
Before discussing Figures 10 and 11 in detail, further context will be provided regarding the use of so-called selection patterns.
In examples of the present disclosure, at least some images of one or more stereoscopic image pairs are encoded by reference to image differences between a left image and a right image. In Figures 8 and 9, these image differences are handled by the adders 805, 980. The adder 805 subtracts one image from another (for example, the other image of the same image pair) on a pixel-by-pixel or region-by-region basis. The adder 980 adds back the difference data decoded for one image to the corresponding primary image. So, some images are encoded “directly” (meanings of which will be discussed further below) and some are encoded according to differences from an image of the opposite polarity (where left and right are respective polarities).
In some examples, this process can be carried out for individual image pairs (so that there is no dependency of either image of a stereoscopic image pair on any other image outside of that particular stereoscopic image pair). The choice of which image is encoded without reference to the other image (a “primary” image in the terminology of Figures 10 and 11) and which image is encoded by reference to differences from the primary image, can be a fixed selection so that (purely for example) it is always the left image which is the primary image and the right image which is encoded by reference to differences. This fixed selection is one example of a selection pattern. Figure 10 shows such a selection pattern in which, for successive stereoscopic image pairs 1...4, the left image (L1...L4) is encoded as the primary image and the respective right image (R1...R4) is encoded by reference to differences from the left image. Note that although the difference is expressed as Ln-Rn, an equivalent operation could be obtained by deriving the complementary difference Rn-Ln.
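The structure of this fixed selection pattern can be sketched as follows; small arrays stand in for images, and the sketch omits the transform, quantisation and entropy stages that a real encoder would apply to both streams:

```python
import numpy as np

def encode_pair_left_primary(left, right):
    """Figure 10 style: primary = Ln, difference = Ln - Rn."""
    return left, left - right

def decode_pair_left_primary(primary, difference):
    return primary, primary - difference        # recovers (Ln, Rn)

rng = np.random.default_rng(2)
L1 = rng.integers(0, 256, size=(2, 2)).astype(np.int16)
R1 = (L1 + rng.integers(-3, 4, size=(2, 2))).astype(np.int16)   # nearly identical views

primary, diff = encode_pair_left_primary(L1, R1)
left_out, right_out = decode_pair_left_primary(primary, diff)
assert np.array_equal(left_out, L1) and np.array_equal(right_out, R1)
```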
In the example just discussed, so-called intra-image processing is applied to the primary image, so that the predictor 830 and the equivalent predictor 950 in Figures 8 and 9 derive their predictions from other portions of the respective primary image. In such examples, the first encoded data is decodable to allow reproduction of the first image without reference to encoded image data representing other encoded images.
Therefore, in these examples, the encoder is configured to generate:
first encoded image data (the primary data in Figure 10) which is decodable to allow reproduction of a first image of a stereoscopic image pair without reference to encoded image data relating to a second image, being the other image of the stereoscopic image pair; and second encoded image data (the Difference data in Figure 10) for the other of the stereoscopic image pair, the second encoded data representing differences, detected by the detector, between the two images of the stereoscopic image pair.
For the difference image, one way of implementing an encoding process is to provide, as the input 810 in Figure 8, a difference image generated by subtracting Rn from Ln (or the other way round) and encoding that as discussed above. Another possible technique is that the predictor 830 operates only to generate predicted images based upon one of the stereoscopic image pair (the primary image in this example) even when encoding the other of the stereoscopic image pair.
The discussion above relates to dependencies within a stereoscopic image pair. However, the apparatus of Figures 8 and 9 also allows for temporal dependencies when a stereoscopic video signal is being encoded or decoded, so that the first encoded data is dependent upon one or more images of one or more others of the stereoscopic image pairs.
These temporal dependencies allow the predictor 830 and the corresponding predictor 950 to generate a predicted version of a particular image on the basis of one or more other images. In a non-stereoscopic system these can simply be other images of a video signal. In a stereoscopic encoding and decoding arrangement, various options are available.
In Figure 8, the detector 600 can be implemented as the adder 805, for example. The remainder of Figure 8 can implement the encoder 610. In Figure 9, the arrangement shown can implement the decoder 700.
Figure 10 shows an example (indicated by dotted lines 1000) in which the primary images Ln have a temporal dependency upon one or more other primary images Ln.
Optionally, the difference images Ln-Rn can also have a temporal dependency in their encoding upon other difference images, indicated by the dotted lines 1010. Of course, even if the explicit temporal dependency 1010 is not provided, in such an arrangement the decoding of the difference images would still have a temporal aspect because of the temporal dependencies of the primary images.
Figure 11 schematically illustrates another selection pattern. Here, the primary image is selected so as to alternate between the left and the right image of each successive stereoscopic image pair of a video signal, so that in the example of Figure 11, the primary image is: L1, R2, L3, R4...and so on. The other image is encoded as a difference image as before. Similarly to Figure 10, temporal dependencies can also be introduced, as indicated by broken lines between successive image pairs.
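A minimal sketch of this alternating selection pattern (pair numbering starting at 1, as in Figure 11) is:

```python
def primary_polarity(pair_index):
    """Alternate the primary image between left and right on successive pairs."""
    return "L" if pair_index % 2 == 1 else "R"

for n in range(1, 5):
    p = primary_polarity(n)
    difference = f"L{n}-R{n}" if p == "L" else f"R{n}-L{n}"
    print(f"pair {n}: primary {p}{n}, difference {difference}")
# pair 1: primary L1, difference L1-R1
# pair 2: primary R2, difference R2-L2   ... and so on
```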
Therefore, in an example arrangement applied to video encoding, the encoder and decoder can be configured to operate in respect of successive stereoscopic image pairs (such as those shown in Figures 10 and 11) of a stereoscopic video signal. The encoder can be configured to select a first or primary image of each stereoscopic image pair according to a selection pattern. In the example of Figure 11, the selection pattern may include instances of each polarity of the first image. The particular example of Figure 11 represents an alternate selection of each polarity, but it will be appreciated that different selection patterns may be used.
The temporal dependency can be within the same polarity or across polarities, so that in some examples the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the same polarity as the polarity of the first image, and in other examples the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the opposite polarity to that of the polarity of the first image.
Figures 12 to 15 provide an example of a pre-processing technique which can be applied to the images to be encoded by the apparatus of Figure 8. By reducing image differences between the left and right images, the arrangements discussed above can potentially operate more efficiently. In Figures 12 to 15, a stereoscopy processor 1400 is configured to detect portions of the stereoscopic image pair having an image disparity less than a threshold image disparity (and therefore representing image features of at least a threshold depth or distance from the camera view point) and to generate replacement image content for one or both images of the stereoscopic image pair such that those portions are identical as between the left and right images of the stereoscopic image pair.
Referring to Figure 12, example left and right images are shown. Examining the image disparity in the left and right images, a region 1300 in which the disparity is greater than a threshold image disparity is identified. Image features in the region 1300 are considered to be closer than a threshold distance to the camera viewpoint. Image features in the remainder of the image are considered to be further away than the threshold distance or depth from the camera viewpoint.
Referring to Figure 14, the stereoscopy processor 1400 comprises a detector 1410 to detect image disparity of less than or greater than a threshold image disparity and to generate and store in a map store 1420 a map or mask indicating image portions which are closer than, or further than the threshold distance or depth from the camera viewpoint.
A generator 1430 acts to generate a single version of the image features having a disparity of less than the threshold amount and to apply that as a background in all regions other than the identified region 1300 of the image. The result is a stereoscopic image pair such as that shown in Figure 15, in which the background (all regions except the region 1300) is identical as between the two images, whereas the foreground (the region 1300, representing image features closer than the threshold depth) is different, being the foreground region from each of the original left and right images of Figure 12.
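A minimal sketch of this idea is given below; the per-pixel disparity map is synthesised rather than estimated from the images, and the choice of the averaged image as the single replacement version is an assumption made only for the example:

```python
import numpy as np

rng = np.random.default_rng(3)
left = rng.integers(0, 256, size=(4, 6)).astype(float)
right = left + rng.integers(-2, 3, size=(4, 6))

disparity = np.abs(rng.normal(0.0, 2.0, size=(4, 6)))   # assumed per-pixel disparity map
THRESHOLD = 2.5                                          # illustrative threshold disparity

background = disparity < THRESHOLD                       # "far" portions (cf. map store 1420)
shared = (left + right) / 2.0                            # single replacement version (cf. generator 1430)
left[background] = shared[background]
right[background] = shared[background]

print(int(np.count_nonzero(left == right)), "of", left.size, "pixels are now identical between views")
```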
As discussed above, the encoder can be configured to generate compressed encoded data and the decoder can be configured to generate output decoded data from a compressed data input. The degree of compression does not however need to be identical across the whole image.
Figure 16 schematically illustrates an arrangement in which multiple variations in the degree of compression are applied, so that a more central region 1600 of each image 1610 is compressed the least (providing a greater amount of output data than would be the case if a greater degree of compression were used), a next outermost image region 1612 is compressed more than the region 1600, and a peripheral region 1620 is compressed by a greater degree of compression than either the region 1600 or the region 1612. So, the image compression is more aggressive (providing a greater degree of image compression) in the peripheral region 1620 than in a central region 1600.
Figure 17 schematically represents apparatus which can form part of the encoder 850, in which an image region detector 1700 detects whether a current portion or block to be compressed is within a central image region 1600 or a peripheral image region, and a parameter selector 1710 selects compression parameters to be applied to that image region as a result so that a more central region is compressed with a lesser degree of compression than a more peripheral region. The modules 1700, 1710 can be implemented by the controller 800, with corresponding operations being carried out at decoding by the controller 920. A data compressor 1720 carries out the data compression according to the selected parameters.
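A minimal sketch of such a parameter selection is given below; the region boundaries and quantiser step sizes are illustrative assumptions, not values defined by the disclosure:

```python
def quantiser_step_for_block(cx, cy, width, height):
    """Pick a coarser quantiser (more compression) for blocks further from the image centre."""
    dx = abs(cx - width / 2) / width
    dy = abs(cy - height / 2) / height
    distance = max(dx, dy)                 # normalised distance from the image centre
    if distance < 0.15:
        return 4                           # central region 1600: least compression
    if distance < 0.35:
        return 8                           # intermediate region 1612
    return 16                              # peripheral region 1620: most compression

WIDTH, HEIGHT = 1920, 1080
for cx, cy in [(960, 540), (600, 300), (50, 50)]:
    print((cx, cy), "->", quantiser_step_for_block(cx, cy, WIDTH, HEIGHT))
```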
Figure 18 is a schematic flowchart illustrating an image data encoding method comprising:
detecting (at a step 1800) image differences between a left image and a right image of one or more stereoscopic image pairs; and generating (at a step 1810) encoded data representing a stereoscopic image in dependence upon the detected image differences.
Figure 19 is a schematic flowchart illustrating an image data decoding method comprising:
decoding (at a step 1900) encoded data representing image differences between a left image and a right image of one or more stereoscopic image pairs; and reproducing (at a step 1910) a stereoscopic image pair in dependence upon the decoded image differences.
It will be appreciated that example embodiments can be implemented by computer software operating on a general purpose computing system such as a games machine. In these examples, computer software, which when executed by a computer, causes the computer to carry out any of the methods discussed above is considered as an embodiment of the present disclosure. Similarly, embodiments of the disclosure are provided by a non-transitory, machine-readable storage medium which stores such computer software.
It will also be apparent that numerous modifications and variations of the present disclosure are possible in light of the above teachings. It is therefore to be understood that within the scope of the appended claims, the disclosure may be practised otherwise than as specifically described herein.
Claims (27)
1. Image data encoding apparatus to encode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a detector to detect image differences between a left image and a right image; and an encoder to generate encoded data representing a stereoscopic image in dependence upon the detected image differences.
2. Apparatus according to claim 1, in which the encoder is configured to generate:
first encoded image data which is decodable to allow reproduction of a first image of a stereoscopic image pair without reference to encoded image data relating to a second image, being the other image of the stereoscopic image pair; and second encoded image data for the other of the stereoscopic image pair, the second encoded data representing differences, detected by the detector, between the two images of the stereoscopic image pair.
3. Apparatus according to claim 2, in which the first encoded data is dependent upon one or more images of one or more others of the stereoscopic image pairs.
4. Apparatus according to claim 3, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the same polarity as the polarity of the first image.
5. Apparatus according to claim 3, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the opposite polarity to that of the polarity of the first image.
6. Apparatus according to claim 2, in which the first encoded data is decodable to allow reproduction of the first image without reference to encoded image data representing other encoded images.
7. Apparatus according to any one of claims 2 to 6, comprising a stereoscopy processor configured to detect portions of the stereoscopic image pair having an image disparity less than a threshold image disparity and to generate replacement image content for one or both images of the stereoscopic image pair such that those portions are identical as between the two images of the stereoscopic image pair.
8. Apparatus according to any one of the preceding claims, in which the encoder is configured to generate compressed encoded data.
9. Apparatus according to claim 8, in which the encoder is configured to apply a greater degree of compression at one or more peripheral image regions than at a central image region.
10. Video encoding apparatus comprising apparatus according to any one of the preceding claims as dependent upon claim 2, in which the encoder and the detector are configured to operate in respect of successive stereoscopic image pairs of a stereoscopic video signal.
11. Apparatus according to claim 10, in which the encoder is configured to select a first image of each stereoscopic image pair according to a selection pattern which includes instances of each polarity of the first image.
12. Apparatus according to claim 11, in which the selection pattern comprises an alternate selection of each polarity.
13. Image data decoding apparatus to decode one or more stereoscopic image pairs each comprising a left image and a right image; the apparatus comprising:
a decoder to decode encoded data representing image differences between a left image and a right image, and to reproduce a stereoscopic image pair in dependence upon the decoded image differences.
14. Apparatus according to claim 13, in which the decoder is configured to decode:
first encoded image data which is decodable to allow reproduction of a first image of a stereoscopic image pair without reference to encoded image data relating to a second image, being the other image of the stereoscopic image pair; and second encoded image data for the other of the stereoscopic image pair, the second encoded data representing differences between the two images of the stereoscopic image pair.
15. Apparatus according to claim 14, in which the first encoded data is dependent upon one or more images of one or more others of the stereoscopic image pairs.
16. Apparatus according to claim 15, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the same polarity as the polarity of the first image.
17. Apparatus according to claim 15, in which the first encoded data is dependent upon one or more images of one or more other stereoscopic image pairs of the opposite polarity to that of the polarity of the first image.
18. Apparatus according to claim 14, in which the first encoded data is decodable to allow reproduction of the first image without reference to encoded image data representing other encoded images.
19. Apparatus according to any one of claims 14 to 18, in which the encoded data is compressed encoded data.
20. Video decoding apparatus comprising apparatus according to any one of claims 14 to 19, in which the decoder is configured to operate in respect of successive stereoscopic image pairs of a stereoscopic video signal.
21. Apparatus according to claim 20, in which the decoder is configured to select a first image of each stereoscopic image pair according to a selection pattern which includes instances of each polarity of the first image.
22. Apparatus according to claim 21, in which the selection pattern comprises an alternate selection of each polarity.
23. Image data capture, reproduction, storage and/or transmission apparatus comprising apparatus according to any one of the preceding claims.
24. An image data encoding method comprising:
detecting image differences between a left image and a right image of one or more stereoscopic image pairs; and generating encoded data representing a stereoscopic image in dependence upon the detected image differences.
25. An image data decoding method comprising:
decoding encoded data representing image differences between a left image and a right image of one or more stereoscopic image pairs; and reproducing a stereoscopic image pair in dependence upon the decoded image differences.
26. Computer software which, when executed by a computer, causes the computer to perform the method of claim 24 or 25.
27. A non-transitory, machine-readable storage medium which stores computer software according to claim 26.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1622168.1A GB2558277A (en) | 2016-12-23 | 2016-12-23 | Image data encoding and decoding |
PCT/GB2017/053802 WO2018115841A1 (en) | 2016-12-23 | 2017-12-19 | Image data encoding and decoding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1622168.1A GB2558277A (en) | 2016-12-23 | 2016-12-23 | Image data encoding and decoding |
Publications (2)
Publication Number | Publication Date |
---|---|
GB201622168D0 (en) | 2017-02-08 |
GB2558277A true GB2558277A (en) | 2018-07-11 |
Family
ID=58360739
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB1622168.1A Withdrawn GB2558277A (en) | 2016-12-23 | 2016-12-23 | Image data encoding and decoding |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2558277A (en) |
WO (1) | WO2018115841A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113141494A (en) * | 2020-01-20 | 2021-07-20 | 北京芯海视界三维科技有限公司 | 3D image processing method and device and 3D display terminal |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20030001758A (en) * | 2001-06-27 | 2003-01-08 | 한국전자통신연구원 | Apparatus and Method for stereoscopic video coding/decoding with motion and disparity compensated prediction |
WO2010108024A1 (en) * | 2009-03-20 | 2010-09-23 | Digimarc Corporation | Improvements to 3d data representation, conveyance, and use |
WO2013030456A1 (en) * | 2011-08-30 | 2013-03-07 | Nokia Corporation | An apparatus, a method and a computer program for video coding and decoding |
WO2016124710A1 (en) * | 2015-02-05 | 2016-08-11 | Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Multi-view video codec supporting residual prediction |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1986003924A1 (en) * | 1984-12-17 | 1986-07-03 | Nippon Hoso Kyokai | System for transmitting stereoscopic television pictures |
US8451320B1 (en) * | 2009-01-23 | 2013-05-28 | Next3D, Inc. | Methods and apparatus for stereoscopic video compression, encoding, transmission, decoding and/or decompression |
JP5293463B2 (en) * | 2009-07-09 | 2013-09-18 | ソニー株式会社 | Image processing apparatus, image processing method, and program |
JP2011109398A (en) * | 2009-11-17 | 2011-06-02 | Sony Corp | Image transmission method, image receiving method, image transmission device, image receiving device, and image transmission system |
- 2016-12-23: GB GB1622168.1A patent/GB2558277A/en not_active Withdrawn
- 2017-12-19: WO PCT/GB2017/053802 patent/WO2018115841A1/en active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20030001758A (en) * | 2001-06-27 | 2003-01-08 | 한국전자통신연구원 | Apparatus and Method for stereoscopic video coding/decoding with motion and disparity compensated prediction |
WO2010108024A1 (en) * | 2009-03-20 | 2010-09-23 | Digimarc Corporation | Improvements to 3d data representation, conveyance, and use |
WO2013030456A1 (en) * | 2011-08-30 | 2013-03-07 | Nokia Corporation | An apparatus, a method and a computer program for video coding and decoding |
WO2016124710A1 (en) * | 2015-02-05 | 2016-08-11 | Fraunhofer Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Multi-view video codec supporting residual prediction |
Non-Patent Citations (1)
Title |
---|
(TECH) Overview of the Multiview and 3D Extensions of High Efficiency Video Coding, IEEE TCSVT 26(1) Jan 2016. * |
Also Published As
Publication number | Publication date |
---|---|
WO2018115841A1 (en) | 2018-06-28 |
GB201622168D0 (en) | 2017-02-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102431537B1 (en) | Encoders, decoders and corresponding methods using IBC dedicated buffers and default value refreshing for luma and chroma components | |
EP3158750B1 (en) | Intra block copy block vector signaling for video coding | |
KR20220024817A (en) | Encoders, decoders and their methods | |
US9973777B2 (en) | Data encoding and decoding apparatus, method and storage medium | |
CN113940079A (en) | Gradient-based prediction refinement for video coding | |
US10897617B2 (en) | Rounding of motion vectors for adaptive motion vector difference resolution and increased motion vector storage precision in video coding | |
CN113508592A (en) | Encoder, decoder and corresponding inter-frame prediction method | |
CN103369316A (en) | Image processing apparatus and method | |
KR20220070542A (en) | Encoder, decoder and corresponding method for simplifying signaling picture header | |
US20200374550A1 (en) | Bi-directional optical flow in video coding | |
CN104247433A (en) | Decoding apparatus, decoding method, encoding apparatus and encoding method | |
JP2023085337A (en) | Method and apparatus of cross-component linear modeling for intra prediction, decoder, encoder, and program | |
KR20210113367A (en) | Method and apparatus for intra sub-partition coding mode | |
RU2597256C2 (en) | Encoding device, encoding method, decoding device and method of decoding method | |
EP3649778B1 (en) | Method for encoding and decoding images, encoding and decoding device, and corresponding computer programs | |
EP3750309B1 (en) | Data encoding and decoding | |
CN107534765B (en) | Motion vector selection and prediction in video coding systems and methods | |
CN106464898B (en) | Method and apparatus for deriving inter-view motion merge candidates | |
GB2558277A (en) | Image data encoding and decoding | |
KR102306631B1 (en) | Method for coding and decoding image parameters, apparatus for coding and decoding image parameters and corresponding computer program | |
CN114424554B (en) | Method and apparatus for chroma QP offset table indication and derivation | |
CN111034202B (en) | Image encoding and decoding method, encoding and decoding device, and corresponding computer program | |
CN113228674A (en) | Video encoding and video decoding | |
GB2577338A (en) | Data encoding and decoding | |
RU2800681C2 (en) | Coder, decoder and corresponding methods for intra prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WAP | Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1) |