WO2014025294A1 - Processing of texture and depth images - Google Patents

Processing of texture and depth images

Info

Publication number
WO2014025294A1
Authority
WO
WIPO (PCT)
Prior art keywords
depth
image part
reduced
texture
image
Prior art date
Application number
PCT/SE2012/050864
Other languages
French (fr)
Inventor
Ivana Girdzijauskas
Martin Pettersson
Original Assignee
Telefonaktiebolaget L M Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget L M Ericsson (Publ)
Priority to PCT/SE2012/050864
Publication of WO2014025294A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/59: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N 13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106: Processing image signals
    • H04N 13/161: Encoding, multiplexing or demultiplexing different image signal components
    • H04N 19/593: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N 19/597: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding


Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

The enclosed embodiments are related to processing an image frame pair comprising a texture image part and a depth image part. An image frame pair comprising a texture image part and a depth image part is received. The resolution of the texture image part and the depth image part is reduced. The depth image part is reduced to a smaller resolution than the texture image part. At least one of the reduced texture image part and the reduced depth image is placed in one image frame.

Description

PROCESSING OF TEXTURE AND DEPTH IMAGES
TECHNICAL FIELD
Embodiments presented herein relate to image processing and more particularly to processing an image frame comprising a texture image part and a depth image part.
BACKGROUND
3D Video or 3D TV has gained increasing momentum in recent years. A number of standardization bodies (ITU, EBU, SMPTE, MPEG, and DVB) and other international groups (e.g. DTG, SCTE) are working toward standards for 3D TV or video. Quite a few broadcasters have launched or are planning to launch public stereoscopic 3D TV broadcasting.
From a legacy point-of-view it is practical and cost efficient to reuse an already existing network infrastructure. For instance, the stereoscopic 3D TV that is in use today deploys the same broadcast infrastructure used for normal 2D TV by packing the left and right view of the stereoscopic video into one frame. Apart from broadcasted television, 3D video is being considered in other video services such as video conferencing and mobile video calls. When deploying 3D video in these services it may also be beneficial to utilize the already existing network infrastructure. Several 3D video coding formats have been proposed in the 3D video community so far. In addition to the already mentioned stereoscopic 3D, these include Video plus Depth (V+D), Multiview Video (MVV), Multiview Video plus Depth (MVD), Layered Depth Video (LDV), and Depth Enhanced Stereo (DES). As mentioned, when deploying a system for 3D video (e.g. broadcasted TV, video conference) it is a practical and cost-efficient solution to utilize an already existing infrastructure for sending the 3D video. In practice this often means that only one frame at a specific resolution can be sent for each point in time. To solve this for stereoscopic video, two frames from different views can be squeezed into one frame. The apparent drawback is that the full resolution for each frame is thereby lost, which may result in degradation in quality at the receiving side.
Some of the above mentioned video coding formats utilize a depth map. A depth map is a representation of the depth for each point in a texture expressed as a grey-scale image. The depth map is used to artificially render non-transmitted views at the receiver side, for example with depth image-based rendering (DIBR). Sending one texture image and one depth map image (depth image for short) instead of two texture images may be more bitrate efficient. It also gives the renderer the possibility to adjust the position of the rendered view.
US 2011/0286530 A1 relates to frame packing for video coding and describes several ways of packing multiple videos into one frame. US 2011/0286530 A1 discloses packing depth maps together with texture. However, there is still a need for improved transmission of 3D images in a 2D format.
SUMMARY
An object of embodiments disclosed herein is to provide improved transmission of 3D images in a 2D format. Or, put in other words, the object is to increase the overall quality of the 3D images given a specific transmitted resolution.
When sending texture and depth within one image frame this is typically implemented as placing the texture image and the depth image side-by-side, both with half the original horizontal resolution. However, the inventors of the enclosed embodiments have, through a combination of practical experimentation and theoretical derivation, discovered that for an improved 3D video quality experience it may be beneficial to have lower quality, in terms of fidelity (i.e., bitrate) and/or resolution, for the depth image (in comparison to the fidelity of the texture image). US 2011/0286530 A1 fails to disclose how texture and depth can be weighted differently in terms of resolution. US 2011/0286530 A1 also fails to disclose packing depth in the colour channels.
A particular object is therefore to provide improved transmission of 3D images in a 2D format based on the texture and depth format. According to a first aspect a method of processing an image frame pair comprising a texture image part and a depth image part is provided. The method comprises receiving an image frame pair comprising a texture image part and a depth image part. The method further comprises reducing resolution of the texture image part and the depth image part. The depth image part is reduced to a smaller resolution than the texture image part. The method further comprises placing at least one of the reduced texture image part and the reduced depth image in one image frame.
Advantageously this allows texture plus depth type images to be sent over video systems relying on legacy 2D video network components which allow only one frame at a certain resolution to be transmitted to a receiver.
Advantageously this allows the quality importance of texture and depth to be properly balanced, thereby enabling optimal overall quality of the 3D video to be achieved.
The texture image part may comprise a luminance component, a first chrominance component and a second chrominance component. A first depth section may be associated with the luminance component, a second depth section may be associated with the first chrominance component, and a third depth section may be associated with the second chrominance component.
Advantageously this allows for the otherwise empty chrominance components for depth images to be utilized. In turn, this may allow the packed depth image to have 50% more depth pixels transmitted for YUV420 pixel formats, 100% more pixels transmitted for YUV422 pixel formats and 200% more pixels transmitted for RGB and YUV444 pixel formats.
According to a second aspect a method of processing an image frame pair comprising a texture image part and a depth image part is provided. The method comprises receiving an image frame pair comprising a texture image part and a depth image part. The method further comprises splitting the depth image part into a first depth section, a second depth section and a third depth section. The method may comprise reducing resolution of the first depth section, the second depth section and the third depth section. The texture image part may comprise a luminance component, a first chrominance component and a second chrominance component, and the method may further comprise associating the first depth section with the luminance component, the second depth section with the first chrominance component, and the third depth section with the second chrominance component.
According to a third aspect a computer program of processing an image frame pair comprising a texture image part and a depth image part is provided. The computer program comprises computer program code which, when run on a processing unit, causes the processing unit to perform a method according to the first and/or second aspect.
According to a fourth aspect a computer program product comprising a computer program according to the third aspect and a computer readable means on which the computer program is stored is provided.
According to a fifth aspect a device for processing an image frame pair comprising a texture image part and a depth image part is provided. The device comprises a receiver arranged to receive an image frame pair comprising a texture image part and a depth image part. The device further comprises a processing unit arranged to reduce resolution of the texture image part and the depth image part. The depth image part is reduced to a smaller resolution than the texture image part. The processing unit is further arranged to place at least one of the reduced texture image part and the reduced depth image in one image frame.
According to a sixth aspect a device for processing an image frame pair comprising a texture image part and a depth image part is provided. The device comprises a receiver arranged to receive an image frame pair comprising a texture image part and a depth image part. The device further comprises a processing unit arranged to split the depth image part into a first depth section, a second depth section and a third depth section. The processing unit may be arranged to reduce resolution of the first depth section, the second depth section and the third depth section. The texture image part may comprise a luminance component, a first chrominance component and a second chrominance component, and the processing unit may further be arranged to associate the first depth section with the luminance component, the second depth section with the first chrominance component, and the third depth section with the second chrominance component. It is to be noted that any feature of the first, second, third, fourth, fifth, and sixth aspects may be applied to any other aspect, wherever appropriate.
Likewise, any advantage of the first aspect may equally apply to the second, third, fourth, fifth and/or sixth aspect, respectively, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following detailed disclosure, from the attached dependent claims as well as from the drawings.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. For example, whenever the term "image" is mentioned, it is to be understood that multiple images can be temporally combined to form a video stream, unless explicitly stated otherwise. All references to "a/an/the element, apparatus, component, means, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
The embodiments of the invention are now described, by way of example, with reference to the accompanying drawings, in which:
Fig 1 is a flowchart according to embodiments; Fig 2 is a schematic diagram showing functional modules of a device;
Fig 3 shows one example of a computer program product comprising computer readable means;
Figs 4 and 5 are flowcharts of methods according to embodiments; and
Figs 6-13 are schematic diagrams illustrating different embodiments of reducing and placing texture and/ or depth images.
DETAILED DESCRIPTION
The invention will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout the description. A picture element (pixel for short) is the smallest element of a digital image and holds the luminance and colour information of that element. The luminance and colour can be expressed in different ways. Displays usually have three colour elements, red, green and blue which are lit at different intensities depending on what colour and luminance is to be displayed. It becomes therefore convenient to send the pixel information in RGB pixel format to the display. Since the signal is digital the intensity of each component of the pixel must be represented with a fixed number of bits. For instance, an RGB pixel format with 8 bits per colour component can be written as RGB888.
When video needs to be compressed it is convenient to express the luminance and colour information of the pixel with one luminance component and two colour components. The colour components are also called chrominance components. When a pixel consists of several channels (e.g. a luminance channel and two colour channels) the information in only one of these channels for a pixel is sometimes referred to as a sub-pixel. The transformation in colour space from three colour components to one luminance component and two colour components is performed since the human visual system (HVS) is more sensitive to luminance than to colour, meaning that the luminance component can be represented with higher accuracy (i.e. more bits) than the colour components. One such pixel format is the YUV format where Y represents luminance and U and V represent the two colour components. In matrix notation the YUV representation of an image can be converted from the RGB representation according to the following:
    [ Y ]   [  0.299   0.587   0.114 ] [ R ]
    [ U ] = [ -0.147  -0.289   0.436 ] [ G ]
    [ V ]   [  0.615  -0.515  -0.100 ] [ B ]
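As a concrete illustration, a minimal Python sketch of this colour-space conversion is given below. The BT.601-style coefficients and the helper name rgb_to_yuv are assumptions made for the example and are not taken from this disclosure.

```python
import numpy as np

# Standard (BT.601-style) RGB -> YUV conversion matrix; the exact
# coefficients used in the original disclosure are assumed here.
RGB_TO_YUV = np.array([
    [ 0.299,  0.587,  0.114],   # Y
    [-0.147, -0.289,  0.436],   # U
    [ 0.615, -0.515, -0.100],   # V
])

def rgb_to_yuv(rgb):
    """Convert an (H, W, 3) float RGB image in [0, 1] to YUV."""
    return rgb @ RGB_TO_YUV.T

# Example: a single mid-grey pixel maps to Y = 0.5, U = V = 0.
print(rgb_to_yuv(np.array([[[0.5, 0.5, 0.5]]])))
```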
Fourcc.org holds a list of defined YUV and RGB formats. The most commonly used pixel format for standardized video codecs (e.g. H.264, MPEG-4, and the coming HEVC) is YUV420 (also known as YV12) planar where the U and V colour components are subsampled in both vertical and horizontal direction. Furthermore, according to this format the Y, U and V components are stored in separate chunks for each frame. The number of bits per pixel is 12 where 8 bits represent the luminance and 4 bits represent the two colour components.
Another commonly used format is the UYVY format (a YUV422 pixel format) where the U and V components are subsampled only in the horizontal direction and stored in U, Y1, V, Y2 order for each set of two pixels. The number of bits per pixel for UYVY is 16 where 8 bits represent the luminance and 8 bits represent the two colour components.
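The bits-per-pixel figures quoted above can be verified with a short sketch; the function names and the example resolution below are illustrative only.

```python
def yuv420_plane_sizes(width, height):
    """Planar YUV420: full-resolution Y plus U and V subsampled by two in
    both directions -> 8 + 2 + 2 = 12 bits per pixel on average."""
    y = width * height
    u = v = (width // 2) * (height // 2)
    return y, u, v

def uyvy_size(width, height):
    """Packed UYVY (a YUV422 format): U Y1 V Y2 per pixel pair
    -> 16 bits per pixel on average."""
    return width * height * 2

w, h = 1920, 1080
print(sum(yuv420_plane_sizes(w, h)) * 8 / (w * h))  # 12.0 bits per pixel
print(uyvy_size(w, h) * 8 / (w * h))                # 16.0 bits per pixel
```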
The embodiments presented herein are based on transmitting a 3D video sequence over a legacy 2D video system. The embodiments are further based on 3D video sequences represented by a texture image part and a depth image part and where the texture image part and the depth image part may be sent packed into one image frame over a legacy 2D video system. The one image frame may thus have a format compatible with 2D video transmission. This allows 3D video to be transmitted in a cost efficient way without the need to modify the existing transmission system. Fig 1 illustrates an image communications system 101 comprising frame packing and unpacking of texture and depth images in a sender 102 and a receiver 104, respectively. In the sender 102 steps S102-S110 are performed. In a step S102 a texture image and a depth image are acquired. In a step S104 downscaling factors are found whereby the resolutions of the texture image and the depth image are reduced. In a step S106 the reduced texture image and the reduced depth image are packed (or placed) into one image frame. In a step S108 the one image frame is encoded. In a step S110 multiple image frames are transmitted as a video bitstream. In the receiver 104 steps S112-S118 are performed. In a step S112 the video bitstream is received and the multiple image frames are extracted. In a step S114 each image frame is decoded. In a step S116 each image frame is unpacked into a texture image and a depth image. In a step S118 the unpacked texture image and a depth image are used to render 3D video.
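A minimal, runnable sketch of the sender-side steps S102-S106 and the receiver-side unpacking S116 is given below. The helper names, the naive subsampling and the zero-padding of the depth strip are simplifications introduced for the example (the splitting schemes described later avoid such unused pixels), and the encoding and decoding steps S108-S114 are omitted.

```python
import numpy as np

def downscale(img, fy, fx=1):
    return img[::fy, ::fx]                            # naive subsampling, no filtering

def pack(tex_s, dep_s, width):
    pad = np.zeros((dep_s.shape[0], width - dep_s.shape[1]))
    return np.vstack([tex_s, np.hstack([dep_s, pad])])  # one frame holding both parts

def unpack(frame, tex_rows, dep_cols):
    return frame[:tex_rows], frame[tex_rows:, :dep_cols]

texture = np.random.rand(720, 1280)                   # S102: acquire texture
depth = np.random.rand(720, 1280)                     # S102: acquire depth map

tex_s = downscale(texture, 2)                         # S104: texture -> 360 x 1280
dep_s = downscale(depth, 2, 2)                        # S104: depth   -> 360 x 640 (smaller)
frame = pack(tex_s, dep_s, width=1280)                # S106: one 720 x 1280 packed frame
tex_r, dep_r = unpack(frame, tex_s.shape[0], dep_s.shape[1])   # S116: recover both parts
```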
The enclosed embodiments particularly relate to details of steps S104, S106 and S116 above. One straightforward and commonly used way of fitting texture and depth into one frame is to horizontally resample both texture and depth by a factor of two and put them side-by-side. In brief, according to some embodiments the depth image is resized to a smaller resolution than the resized texture. The depth image is then split and placed on the top, bottom or one of the sides of the texture image. Further, according to some embodiments, parts of the depth image are packed also into the colour components corresponding to the other parts of the depth image, thereby allowing higher resolutions of the packed texture and depth images to be sent.
Fig 2 schematically illustrates, in terms of a number of functional modules, the components of a device 2 for processing an image frame comprising a texture image part and a depth image part. The device 2 comprises a receiver 8 for receiving image frames, such as image frames comprising a texture image part and a depth image part. The receiver 8 may be provided as a wireless or wired interface of the device 2. The device 2 further comprises a processing unit 4 for processing a received image. The processing may comprise at least parts of any of the steps S102, S104, S106, S108, S110, S112, S114, S116 and/or S118. The processing unit 4 is provided using any combination of one or more of a suitable central processing unit (CPU), multiprocessor, microcontroller, digital signal processor (DSP), application specific integrated circuit (ASIC) etc., capable of executing software instructions stored in a computer program product 22 (as in Fig 3). The device 2 may further comprise a transmitter 10 for transmitting image frames. The processing unit 4 is thereby preferably arranged to execute methods as herein disclosed. Other components, as well as the related functionality, of the device 2 are omitted in order not to obscure the concepts presented herein.
Figs 4 and 5 are flowcharts illustrating embodiments of methods of processing an image frame comprising a texture image part and a depth image part. The methods are preferably performed in device 2. The methods are advantageously provided as computer programs 20. Fig 3 shows one example of a computer program product 22 comprising computer readable means 24. On this computer readable means 24, a computer program 20 can be stored. The computer program 20 may be stored in the memory 6 of the device 2. This computer program 20 can cause the processing unit 4 of the device 2 and thereto operatively coupled entities and devices to execute methods according to embodiments described herein. In the example of Fig 3, the computer program product 22 is illustrated as an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc. The computer program product 22 could also be embodied as a memory (RAM, ROM, EPROM, EEPROM) and more particularly as a non-volatile storage medium of a device in an external memory such as a USB (Universal Serial Bus) memory. Thus, while the computer program 20 is here schematically shown as a track on the depicted optical disk, the computer program 20 can be stored in any way which is suitable for the computer program product 22.
In a step S2 an image frame pair comprising a texture image part and a depth image part is received. The image frame pair is received by the receiver 8 of the device 2. The texture image may have a higher starting resolution than the depth image.
Both texture and depth are reduced in resolution (or downscaled). Hence, in a step S4 the resolution of the texture image part and of the depth image part is reduced. According to some embodiments the depth is reduced in resolution (or downscaled) to a smaller resolution than the texture. It may be more important to maintain good quality of the texture image compared to the depth image in order to optimize the overall 3D video experience. This conclusion may hold both for resolution and level of compression. Therefore, the depth image part is reduced to a smaller resolution than the texture image part.
In general, the texture image part may be reduced by a first amount and the depth image part may be reduced by a second amount. The percentage allocation between texture and depth pixels to be packed in a frame can be fixed (for example, a depth image may always occupy 20% of the available frame pixels). Hence, the first amount and the second amount are fixed. Alternatively, the percentage allocation between texture and depth pixels to be packed in a frame can be determined based on properties of the texture and depth signals (for example, if the depth is very smooth, a lower resolution can be allowed for depth without losing much of the original information). Hence the first amount and the second amount may be determined based on properties of texture and/or depth indicators. The depth indicators may relate to smoothness of the depth image part.
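As a sketch of the adaptive alternative, the second amount could be derived from a simple smoothness indicator of the depth map; the gradient-based roughness measure, the threshold and the share values below are illustrative assumptions, not values given in this disclosure.

```python
import numpy as np

def depth_share(depth, smooth_share=0.15, detailed_share=0.30, threshold=4.0):
    """Pick the fraction of frame pixels given to depth from a simple
    smoothness indicator (mean absolute gradient of the depth map).
    The shares and the threshold are illustrative values only."""
    gy, gx = np.gradient(depth.astype(np.float64))
    roughness = np.mean(np.abs(gy) + np.abs(gx))
    return smooth_share if roughness < threshold else detailed_share

depth = np.tile(np.linspace(0, 255, 640), (360, 1))   # a very smooth depth ramp
print(depth_share(depth))                              # -> 0.15 (smooth case)
```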
There may be different ways of reducing the texture image part and the depth image part. For example, the texture image 26 may be reduced exclusively in the vertical direction. For example, the depth image 28 may be reduced exclusively in the vertical direction or the horizontal direction. Figs 6 to 8 illustrate three examples of how to fit the differently resized texture images 26', 26" and depth images 28', 28", 28"' into one frame 30, 30', 30". For example, in Fig 6, the texture image is reduced in the vertical direction only. The depth image is reduced by a factor of two in the horizontal direction and, in the vertical direction, to a height that is two times the difference between the full frame height and the height of the reduced texture image.
After the texture image part and the depth image part have been reduced, at least one of the reduced texture image part and the reduced depth image part may in a step S6 be placed in one image frame 30, 30', 30". The one image frame may have the same dimensions as the texture image part of the image frame pair received in step S2. Hence the dimensions of the received image pair may be kept, but instead of having a representation in the form of an image pair comprising two frames (one for texture and one for depth), steps S2, S4 and S6 allow for a representation comprising only one single frame comprising both texture and depth.
Once the texture image part 26 and the depth image part 28 have been reduced they may both be placed in the one image frame 30, 30', 30". There are different ways of placing the texture image part and the depth image part in the one image frame. Prior to placing the reduced depth image part in the one image frame the reduced depth image part may be split, in a step S8, into at least a first reduced part D1 and a second reduced part D2. Hence the depth image part may be split into at least two parts. For example, the depth image part may be split horizontally and/or vertically into at least two parts. The at least first and second reduced parts may then be placed adjacent each other as well as adjacent the reduced texture image part in the one image frame.
In general, the at least first and second reduced parts may be placed on top of, below, side by side, vertical, or horizontal in relation to the reduced texture image part in the one image frame. In case the depth image part is split horizontally into at least two parts, these at least two parts could be placed either below or above the rescaled texture image part (as in Fig 6). One advantage with this constellation, apart from the fact that the texture image part has a higher resolution than the depth image part, is that if coding with horizontal slices is used (e.g. one slice per macroblock row) the prediction within the slice may be better if texture and depth are kept apart than if texture and depth are mixed within one slice.
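The Fig 6 layout described above can be expressed compactly as in the following sketch; the nearest-neighbour resizing and the concrete frame size are assumptions made for the example.

```python
import numpy as np

def pack_fig6(texture, depth, tex_height):
    """Fig 6-style packing (sketch): texture is reduced only vertically to
    tex_height rows; depth is reduced to half width and to a height of
    2*(H - tex_height), split horizontally, and placed below the texture."""
    H, W = texture.shape
    strip = H - tex_height                                  # rows left for depth

    def resize(img, h, w):                                  # crude nearest-neighbour resize
        ys = np.arange(h) * img.shape[0] // h
        xs = np.arange(w) * img.shape[1] // w
        return img[ys][:, xs]

    tex_s = resize(texture, tex_height, W)
    dep_s = resize(depth, 2 * strip, W // 2)
    d1, d2 = dep_s[:strip], dep_s[strip:]                   # split horizontally
    return np.vstack([tex_s, np.hstack([d1, d2])])          # same W x H as the input

texture = np.random.rand(720, 1280)
depth = np.random.rand(720, 1280)
print(pack_fig6(texture, depth, tex_height=540).shape)      # (720, 1280)
```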
According to another embodiment as illustrated in Fig 7, the texture image part 26 is reduced exclusively in horizontal direction. The depth image part 28 may then be reduced by a factor of two in vertical direction and to a width that is two times the difference between the full frame width and the width of the reduced texture image part. The depth image part may then, for example, be split vertically and put on the right side or left side of the reduced texture image part. One advantage with this constellation is that no vertical resolution is lost for the texture image which may be advantageous for polarized stereo-interlaced 3D screens.
Fig 8 illustrates packing according to yet another embodiment where the depth image part 28 is reduced such that its new width corresponds to half of the texture height. The reduced depth image part 28"' is then split horizontally into (at least) two parts D1, D2 and the (at least) two parts D1, D2 are rotated, in a step S10, and put on the right side or left side of the reduced texture image part in the one image frame 30". Thus, according to some embodiments the at least first and second reduced depth image parts are rotated prior to being placed in the one image frame.
According to embodiments a further image frame pair comprising a further texture image part and a further depth image part may be received, step S12. Fig 9 illustrates an example of a packing procedure in case of Multiview Video plus Depth (MVD) input. According to the embodiment illustrated in Fig 9 the MVD input comprises two texture image parts 26a, 26b and corresponding depth image parts 28a, 28b. The resolution for the further texture image part and the further depth image part may also be reduced, step S14. The further depth image part may be reduced in resolution more than the further texture image part. According to embodiments the texture image part and the further texture image part are reduced equally into reduced texture images 26a' and 26b'. Likewise, according to some embodiments, the depth image part and the further depth image part are reduced equally into reduced depth images 28a' and 28b'. Also at least one of the reduced further texture image part and the reduced further depth image part may be placed, step S16, in the one image frame 30"'. One MVD pair (defined by the images 26a, 26b, 28a, 28b) may thereby be represented by one image frame 30"'. As for the case of one image pair being received, also the reduced further depth image part may be split, as in step S8, into at least a first reduced further part and a second reduced further part and/or be rotated, as in step S10, prior to being placed in the one image frame.
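The rotation-based packing of Fig 8 can be sketched in the same style; the resize factors and the right-side placement below are one possible reading of the figure rather than a prescribed implementation.

```python
import numpy as np

def pack_fig8(texture, depth, tex_width):
    """Fig 8-style packing (sketch): depth is resized so its width equals
    half the texture height, split horizontally into two parts, and each
    part is rotated 90 degrees and placed at the right side of the texture."""
    H, W = texture.shape
    side = W - tex_width                                   # columns left for depth

    def resize(img, h, w):                                 # crude nearest-neighbour resize
        ys = np.arange(h) * img.shape[0] // h
        xs = np.arange(w) * img.shape[1] // w
        return img[ys][:, xs]

    tex_s = resize(texture, H, tex_width)
    dep_s = resize(depth, 2 * side, H // 2)                # new width = half texture height
    d1, d2 = dep_s[:side], dep_s[side:]                    # split horizontally
    column = np.vstack([np.rot90(d1), np.rot90(d2)])       # rotate and stack to height H
    return np.hstack([tex_s, column])                      # W x H packed frame

texture = np.random.rand(720, 1280)
depth = np.random.rand(720, 1280)
print(pack_fig8(texture, depth, tex_width=1120).shape)     # (720, 1280)
```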
Depth images only use one sub-pixel channel while texture images normally have colour and use three colour channels for each pixel. In particular the texture image part may comprise a luminance component, a first chrominance component and a second chrominance component. The two colour channels (corresponding to the first chrominance component and the second chrominance component) will thus be empty for the pixels corresponding to the depth images in the combined frame. It would thus be advantageous to fill also the two colour channels with data in order to avoid empty colour channels being transmitted.
For example, consider the YUV420 format. According to an embodiment the depth image 28 is split into three sections 28"", as illustrated in Fig 10. The depth image part may thus be split into a first depth section, a second depth section and a third depth section, step S16. The resolution of the first depth section, the second depth section and the third depth section may then be reduced, step S18. In Fig 10 the lower part of the depth image (consisting of sections D2 and D3) is put into the two subsampled colour channels U and V. According to one embodiment, the first depth section is associated with the luminance component, the second depth section is associated with the first chrominance component, and the third depth section is associated with the second chrominance component, step S20. For example, the first depth section D1 may be associated with the luminance component Y of the texture image 26, the second depth section D2 may be associated with the first chrominance component U of the texture image 26, and the third depth section D3 may be associated with the second chrominance component V of the texture image 26. Also other associations are possible. For example, the first depth section, the second depth section and the third depth section may have different sizes. The association may thus be dependent on the size of the sections or vice versa. According to an embodiment the second depth section and the third depth section are reduced more than the first depth section. This may be advantageous if the second depth section and the third depth section are associated with the two chrominance components. Combining the use of the otherwise empty colour channels U and V with the packing of texture and depth images as disclosed above implies that the depth images can be kept at a higher resolution. Alternatively, a higher resolution could be kept for the texture image.
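One way to realize the Fig 10 idea for a YUV420 frame is sketched below. Splitting the lower depth rows by column into the U and V planes (instead of reducing the second and third depth sections further) is an illustrative choice that exactly fills the subsampled chroma planes and gives the 50% extra depth samples mentioned earlier.

```python
import numpy as np

def split_depth_for_yuv420(depth, strip_rows):
    """Sketch of Fig 10: a depth map of 1.5*strip_rows rows is carried in a
    strip_rows-high region of a YUV420 frame. The top strip_rows rows go to
    the Y channel (D1); of the remaining rows, even columns go to the U plane
    (D2) and odd columns to the V plane (D3), exactly filling both chroma
    planes. The column interleaving is an illustrative choice."""
    d1 = depth[:strip_rows]                    # D1 -> Y channel, full resolution
    rest = depth[strip_rows:]                  # strip_rows // 2 remaining rows
    d2 = rest[:, 0::2]                         # D2 -> U plane
    d3 = rest[:, 1::2]                         # D3 -> V plane
    return d1, d2, d3

depth = np.random.rand(360, 640)               # 360 = 1.5 * 240 rows of depth
d1, d2, d3 = split_depth_for_yuv420(depth, strip_rows=240)
print(d1.shape, d2.shape, d3.shape)            # (240, 640) (120, 320) (120, 320)
```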
As noted above, there exist a number of different colour pixel formats. Hence the one image frame may be associated with a colour pixel format. The splitting into the first depth section, the second depth section and the third depth section may be determined according to a (pre-determined) placing pattern which in turn may be related to the colour pixel format. Typical colour pixel formats include, but are not limited to, YUV420, YUV422, YUV444, UYVY and RGB888. The placing pattern may thus be different for different pixel formats. In general, the pattern may split rows and/or columns of the depth image part into the first depth section, the second depth section and the third depth section.
For example, one embodiment of a placing pattern for placing the depth image 28 into the three channels for YUV420 pixel formats is illustrated in Fig 11. The first and third row of every set of three rows are placed into the Y channel. Every second pixel of the remaining second row is placed alternating in the U and the V colour channels. The depth image parts before splitting are illustrated at reference numeral 28""'. According to this placing pattern the sub-pixels are thus placed in the U and V channels as close as possible to the corresponding position in the Y channel. This could minimize, or at least reduce, the risks of visible artifacts. Thus, the placing may be such that the distance between corresponding pixel elements in the first depth section, in the second depth section and in the third depth section is minimized according to the colour pixel format used. Fig 12 illustrates a placing pattern for the UYVY pixel format. The depth image parts before splitting are illustrated at reference numeral 28""". The placing pattern corresponds to the order in which the pixels are stored. Similarly, for pixel formats with an equal number of bits for all three sub-pixel channels, such as YUV444 and RGB888, the placing pattern illustrated in Fig 13 could be used. The depth image parts before splitting are illustrated at reference numeral 28""". Common for all three placing patterns illustrated in Figs 11, 12 and 13 is that the sub-pixels in the two second sub-pixel channels are located as closely as possible to the corresponding point in the first sub-pixel channel. A similar approach could be used also for other pixel formats.
Despite the fact that the sub-pixels of the two second sub-pixel channels are positioned as close as possible to the first sub-pixel channel, artifacts may be visible after decoding (especially at lower bitrates) in the form of stripes. To avoid these artifacts the pixels unpacked from the U and V channels could be filtered during the unpacking phase. This may be accomplished by taking the nearest neighbouring pixel unpacked from the Y channel or blending the closest two neighbouring pixels unpacked from the Y channel. Filtering a pixel unpacked from the U or V channel may only be needed if the pixel has a higher or lower value compared to the two neighbouring pixels unpacked from the Y channel.
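The Fig 11 placing pattern can be written in a few lines; the reshape-based implementation and the toy input size below are illustrative.

```python
import numpy as np

def place_depth_yuv420(depth):
    """Sketch of the Fig 11 placing pattern: of every set of three depth
    rows, the first and third go into the Y channel; every second pixel of
    the remaining (middle) row goes alternately into the U and V channels,
    so each sub-pixel stays close to its neighbours in the Y channel."""
    rows, cols = depth.shape
    assert rows % 3 == 0 and cols % 2 == 0
    triples = depth.reshape(rows // 3, 3, cols)
    y = triples[:, [0, 2], :].reshape(-1, cols)       # first and third row of each triple
    mid = triples[:, 1, :]                            # middle row of each triple
    u = mid[:, 0::2]                                  # even pixels -> U channel
    v = mid[:, 1::2]                                  # odd pixels  -> V channel
    return y, u, v

depth = np.arange(6 * 4).reshape(6, 4)
y, u, v = place_depth_yuv420(depth)
print(y.shape, u.shape, v.shape)                      # (4, 4) (2, 2) (2, 2)
```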
The invention has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the invention, as defined by the appended patent claims. For example, a person skilled in the art will understand that instead of dividing the depth image using a certain pattern on the pixel level (as described in the examples above), the same pattern could be applied on a block level, where each block consists of a group of pixels. Moreover, a person skilled in the art will understand that splitting the depth image part into depth sections associated with colour channels may also be performed without packing a texture image part and a depth image part into one frame.
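For completeness, the following is a minimal sketch of the packing step summarized by the claims below, assuming a purely horizontal reduction, a simple side-by-side layout and fixed reduction factors of 2 for the texture and 4 for the depth; none of these choices is mandated by the embodiments and they are given only to make the processing concrete.

    import numpy as np

    def pack_frame_pair(texture, depth, texture_factor=2, depth_factor=4):
        # Reduce the texture image part by a first amount and the depth image
        # part by a larger second amount, then place both reduced parts in
        # one image frame with the same dimensions as the original texture
        # part. texture and depth are assumed to be 2D arrays of equal height.
        reduced_texture = texture[:, ::texture_factor]    # e.g. half the width
        reduced_depth = depth[:, ::depth_factor]           # e.g. a quarter of the width

        frame = np.zeros_like(texture)                     # same dimensions as the texture part
        wt = reduced_texture.shape[1]
        wd = reduced_depth.shape[1]
        frame[:, :wt] = reduced_texture
        frame[:, wt:wt + wd] = reduced_depth
        return frame

The area left unused by this particular layout could, for example, hold the reduced further texture or depth image part of a second view, as in the embodiments covering a further image frame pair.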

Claims

1. A method of processing an image frame pair comprising a texture image part and a depth image part, comprising the steps of:
receiving (S2) an image frame pair comprising a texture image part and a depth image part;
reducing (S4) resolution of the texture image part and the depth image part, wherein the depth image part is reduced to a smaller resolution than the texture image part; and
placing (S6) at least one of the reduced texture image part and the reduced depth image part in one image frame.
2. The method according to claim 1, wherein said one image frame has a format compatible with 2D video transmission.
3. The method according to any one of the preceding claims, wherein said one image frame and the texture image part of the received image frame pair have the same dimensions.
4. The method according to any one of the preceding claims, further comprising, prior to the step of placing,
splitting (S8) the reduced depth image part into at least a first reduced part and a second reduced part.
5. The method according to claim 4, wherein the at least first and second reduced parts are placed adjacent each other as well as adjacent the reduced texture image part in said one image frame.
6. The method according to claim 4 or 5, wherein the at least first and second reduced parts are placed on top of, below, or side by side, vertically or horizontally, in relation to the reduced texture image part in said one image frame.
7. The method according to claim 5 or 6, further comprising, prior to the step of placing,
rotating (S10) the at least first and second reduced parts.
8. The method according to any one of the preceding claims, wherein the texture image part is reduced by a first amount and the depth image part is reduced by a second amount, and wherein the first amount and the second amount are fixed.
9. The method according to any one of claims 1 to 7, wherein the texture image part is reduced by a first amount and the depth image part is reduced by a second amount, wherein the first amount and the second amount are determined based on properties of texture and/or depth indicators.
10. The method according to claim 9, wherein the depth indicators relate to smoothness of the depth image part.
11. The method according to any one of the preceding claims, wherein the texture image is reduced exclusively in the vertical direction or the horizontal direction.
12. The method according to any one of the preceding claims, wherein the depth image is reduced exclusively in the vertical direction or the horizontal direction.
13. The method according to any one of the preceding claims, further comprising
receiving (S12) a further image frame pair comprising a further texture image part and a further depth image part.
14. The method according to claim 13, further comprising
reducing resolution also of the further texture image part and the further depth image part, wherein the further depth image part is reduced in resolution more than the further texture image part.
15. The method according to claim 14, wherein the texture image part and the further texture image part are reduced equally.
16. The method according to claim 14 or 15, wherein the depth image part and the further depth image part are reduced equally.
17. The method according to any one of claims 14 to 16, further comprising placing (S14) also at least one of the reduced further texture image part and the reduced further depth image part in said one image frame.
18. The method according to claim 17, further comprising, prior to the step of placing,
splitting the reduced further depth image part into at least a first reduced further part and a second reduced further part.
19. The method according to any one of the preceding claims, further comprising
splitting (S16) the depth image part into a first depth section, a second depth section and a third depth section.
20. The method according to claim 19, further comprising
reducing (S18) resolution of the first depth section, the second depth section and the third depth section.
21. The method according to claim 19 or 20, wherein the texture image part comprises a luminance component, a first chrominance component and a second chrominance component, the method further comprising
associating (S20) the first depth section with the luminance component, the second depth section with the first chrominance component, and the third depth section with the second chrominance component.
22. The method according to claim 19, 20 or 21, wherein the first depth section, the second depth section and the third depth section have different sizes.
23. The method according to any one of claims 20 to 22, wherein the second depth section and the third depth section have a smaller resolution than the first depth section.
24. The method according to any one of claims 19 to 23, wherein the one image frame is associated with a colour pixel format and wherein the splitting into the first depth section, the second depth section and the third depth section is related to the colour pixel format.
25. The method according to claim 24, wherein the colour pixel format is one of YUV420, YUV422, YUV444, UYVY and RGB888.
26. The method according to any one of claims 19 to 25, wherein the depth image part is split according to a pre-determined pattern.
27. The method according to claim 26, wherein the pattern splits rows and/or columns of the depth image part to the first depth section, the second depth section and the third depth section.
28. The method according to any one of claims 19 to 27, wherein a distance between corresponding pixel elements in the first depth section, in the second depth section and in the third depth section is minimized according to the colour pixel format used.
29. A computer program (20) for processing an image frame pair comprising a texture image part and a depth image part, the computer program
comprising computer program code which, when run on a processing unit, causes the processing unit to
receive (S2) an image frame pair comprising a texture image part and a depth image part;
reduce (S4) resolution of the texture image part and the depth image part, wherein the depth image part is reduced to a smaller resolution than the texture image part; and
place (S6) at least one of the reduced texture image part and the reduced depth image part in one image frame.
30. A computer program product (22) comprising a computer program (20) according to claim 29 and a computer readable means (24) on which the computer program is stored.
31. A device (2) for processing an image frame pair comprising a texture image part and a depth image part, comprising a receiver (8) arranged to receive an image frame pair comprising a texture image part and a depth image part; and
a processing unit (4) arranged to reduce resolution of the texture image part and the depth image part, wherein the depth image part is reduced to a smaller resolution than the texture image part; and wherein
the processing unit is further arranged to place at least one of the reduced texture image part and the reduced depth image part in one image frame.
PCT/SE2012/050864 2012-08-08 2012-08-08 Processing of texture and depth images WO2014025294A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SE2012/050864 WO2014025294A1 (en) 2012-08-08 2012-08-08 Processing of texture and depth images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/SE2012/050864 WO2014025294A1 (en) 2012-08-08 2012-08-08 Processing of texture and depth images

Publications (1)

Publication Number Publication Date
WO2014025294A1 true WO2014025294A1 (en) 2014-02-13

Family

ID=46705010

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2012/050864 WO2014025294A1 (en) 2012-08-08 2012-08-08 Processing of texture and depth images

Country Status (1)

Country Link
WO (1) WO2014025294A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014161989A1 (en) * 2013-04-04 2014-10-09 Dolby International Ab Depth map delivery formats for multi-view auto-stereoscopic displays
JP2015005978A (en) * 2013-06-18 2015-01-08 シズベル テクノロジー エス.アール.エル. Method and device for generating, storing, transmitting, receiving and reproducing depth map by using color components of image belonging to three-dimensional video stream
GB2554663A (en) * 2016-09-30 2018-04-11 Apical Ltd Method of video generation
WO2019004073A1 (en) * 2017-06-28 2019-01-03 株式会社ソニー・インタラクティブエンタテインメント Image placement determination device, display control device, image placement determination method, display control method, and program
US11189104B2 (en) 2019-08-28 2021-11-30 Snap Inc. Generating 3D data in a messaging system
US11410401B2 (en) * 2019-08-28 2022-08-09 Snap Inc. Beautification techniques for 3D data in a messaging system
US11457196B2 (en) 2019-08-28 2022-09-27 Snap Inc. Effects for 3D data in a messaging system
US11488359B2 (en) 2019-08-28 2022-11-01 Snap Inc. Providing 3D data for messages in a messaging system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1705929A1 (en) * 2003-12-25 2006-09-27 Brother Kogyo Kabushiki Kaisha Image display device and signal processing device
WO2009081335A1 (en) * 2007-12-20 2009-07-02 Koninklijke Philips Electronics N.V. Image encoding method for stereoscopic rendering
WO2009091383A2 (en) * 2008-01-11 2009-07-23 Thomson Licensing Video and depth coding
US20110286530A1 (en) 2009-01-26 2011-11-24 Dong Tian Frame packing for video coding
WO2012014171A1 (en) * 2010-07-28 2012-02-02 Sisvel Technology S.R.L. Method for combining images relating to a three-dimensional content

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1705929A1 (en) * 2003-12-25 2006-09-27 Brother Kogyo Kabushiki Kaisha Image display device and signal processing device
WO2009081335A1 (en) * 2007-12-20 2009-07-02 Koninklijke Philips Electronics N.V. Image encoding method for stereoscopic rendering
WO2009091383A2 (en) * 2008-01-11 2009-07-23 Thomson Licensing Video and depth coding
US20110286530A1 (en) 2009-01-26 2011-11-24 Dong Tian Frame packing for video coding
WO2012014171A1 (en) * 2010-07-28 2012-02-02 Sisvel Technology S.R.L. Method for combining images relating to a three-dimensional content

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KWAN-JUNG OH ET AL: "Depth Reconstruction Filter and Down/Up Sampling for Depth Coding in 3-D Video", IEEE SIGNAL PROCESSING LETTERS, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 16, no. 9, 1 September 2009 (2009-09-01), pages 747 - 750, XP011262051, ISSN: 1070-9908 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10015465B2 (en) 2013-04-04 2018-07-03 Dolby International Ab Depth map delivery formats for multi-view auto-stereoscopic displays
WO2014161989A1 (en) * 2013-04-04 2014-10-09 Dolby International Ab Depth map delivery formats for multi-view auto-stereoscopic displays
JP2015005978A (en) * 2013-06-18 2015-01-08 シズベル テクノロジー エス.アール.エル. Method and device for generating, storing, transmitting, receiving and reproducing depth map by using color components of image belonging to three-dimensional video stream
GB2554663B (en) * 2016-09-30 2022-02-23 Apical Ltd Method of video generation
GB2554663A (en) * 2016-09-30 2018-04-11 Apical Ltd Method of video generation
US10462478B2 (en) 2016-09-30 2019-10-29 Apical Limited Method of video generation
WO2019004073A1 (en) * 2017-06-28 2019-01-03 株式会社ソニー・インタラクティブエンタテインメント Image placement determination device, display control device, image placement determination method, display control method, and program
JPWO2019004073A1 (en) * 2017-06-28 2019-12-12 株式会社ソニー・インタラクティブエンタテインメント Image arrangement determining apparatus, display control apparatus, image arrangement determining method, display control method, and program
US11297378B2 (en) 2017-06-28 2022-04-05 Sony Interactive Entertainment Inc. Image arrangement determination apparatus, display controlling apparatus, image arrangement determination method, display controlling method, and program
US11189104B2 (en) 2019-08-28 2021-11-30 Snap Inc. Generating 3D data in a messaging system
US11410401B2 (en) * 2019-08-28 2022-08-09 Snap Inc. Beautification techniques for 3D data in a messaging system
US11457196B2 (en) 2019-08-28 2022-09-27 Snap Inc. Effects for 3D data in a messaging system
US11488359B2 (en) 2019-08-28 2022-11-01 Snap Inc. Providing 3D data for messages in a messaging system
US11676342B2 (en) 2019-08-28 2023-06-13 Snap Inc. Providing 3D data for messages in a messaging system
US11748957B2 (en) 2019-08-28 2023-09-05 Snap Inc. Generating 3D data in a messaging system
US11776233B2 (en) 2019-08-28 2023-10-03 Snap Inc. Beautification techniques for 3D data in a messaging system
US11825065B2 (en) 2019-08-28 2023-11-21 Snap Inc. Effects for 3D data in a messaging system
US11961189B2 (en) 2019-08-28 2024-04-16 Snap Inc. Providing 3D data for messages in a messaging system

Similar Documents

Publication Publication Date Title
US11330242B2 (en) Multi-view signal codec
EP2667610B1 (en) Multi-layer backwards-compatible video delivery for enhanced dynamic range and enhanced resolution formats
WO2014025294A1 (en) Processing of texture and depth images
US20230276061A1 (en) Scalable video coding system with parameter signaling
ES2887248T3 (en) Mosaic organization in video encoding and decoding
JP6626573B2 (en) Layer representation and distribution of high dynamic range video including CRC code
US11622121B2 (en) Scalable coding of video sequences using tone mapping and different color gamuts
US20130329008A1 (en) Encoding apparatus, encoding method, decoding apparatus, and decoding method
US11601664B2 (en) Decoding device and decoding method, and encoding device and encoding method
TWI558166B (en) Depth map delivery formats for multi-view auto-stereoscopic displays
US20160029030A1 (en) Transmitting and receiving a composite image
US20160073124A1 (en) Transmitting and receiving a composite image
GB2512657A (en) Transmitting and receiving a composite image
WO2013021023A1 (en) View synthesis compliant signal codec
RU2787713C2 (en) Method and device for chromaticity block prediction
EP3272124B1 (en) Scalable video coding system with parameter signaling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12748584

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12748584

Country of ref document: EP

Kind code of ref document: A1