WO2018220260A1 - Method and apparatus for image compression - Google Patents
Method and apparatus for image compression
- Publication number
- WO2018220260A1 (PCT/FI2018/050292)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- area
- image
- chroma
- different
- areas
- Prior art date
Classifications
- All under H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals; H04N19/10—using adaptive coding:
- H04N19/169—characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—the unit being a colour or a chrominance component
- H04N19/102—characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
- H04N19/134—characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/154—Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
- H04N19/162—User input
Definitions
- the present invention relates to a method for compressing images, an apparatus for compressing images, and a computer program for compressing images.
- the YUV format is basically a raw, uncompressed video data format: a collection of raw pixel values in the YUV color space.
- the YUV video is composed of three components, namely one luma component, i.e. Y, and two chroma components, i.e. U, V.
- such a format was developed for use with black-and-white televisions, where there was a need for a video signal transmission compatible with both colour and black-and-white television receivers.
- the luma component was already available in the broadcasting technology, and the addition of the U and V chroma components kept the technology compatible with both colour and black-and-white receiver types.
- Figures 2a— 2d illustrate the YUV format.
- Figure 2a shows an example image as a gray scale image, the original being a colour image.
- Figure 2b shows the Y component of the image of Figure 2a
- Figure 2c shows the U component of the image of Figure 2a
- Figure 2d shows the V component of the image of Figure 2a.
- Various embodiments provide a method and apparatus for compressing images to decrease bitrate required to encode images.
- a method and an apparatus for compressing images unevenly at different spatial locations of the image are provided. For example, based on the structure of the human visual system (the uneven distribution of cone cells in the retina), different chroma qualities may be used in encoding and/or transmitting different parts of an image.
- an apparatus comprising at least one processor; and at least one memory including computer program code the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
- a computer readable storage medium stored with code thereon for use by an apparatus, which when executed by a processor, causes the apparatus to perform:
- Figure 1a shows an example of a multi-camera unit as a simplified block diagram, in accordance with an embodiment
- Figure 1b shows a perspective view of a multi-camera unit, in accordance with an embodiment
- Figures 2a—2d illustrate the YUV colour format
- Figure 3a illustrates an example of cone cell distribution in the retina
- Figure 3b illustrates the structure of an eye
- Figure 3c illustrates an example of cone cell distribution in the retina and the respective chroma compression in an image, in accordance with an embodiment
- Figures 4a—4c show some alternative central image classifications, in accordance with an embodiment
- Figures 5a—5c illustrate some examples of layered zones of an image for gradually changing the quality of a chroma component, in accordance with an embodiment
- Figure 6 shows a schematic block diagram of an apparatus, in accordance with an embodiment
- Figure 7 shows a flowchart of a method, in accordance with an embodiment
- Figure 8 shows a schematic block diagram of an exemplary apparatus or electronic device
- Figure 9 shows an apparatus according to an example embodiment
- Figure 10 shows an example of an arrangement for wireless communication comprising a plurality of apparatuses, networks and network elements.
- Figure 1a illustrates an example of a multi-camera unit 100, which comprises two or more cameras 102.
- the number of cameras 102 is eight, but may also be less than eight or more than eight.
- Each camera 102 is located at a different location in the multi-camera unit 100 and may have a different orientation with respect to other cameras 102.
- the cameras 102 may have an omnidirectional constellation so that the multi-camera unit 100 has a 360° viewing angle in 3D space.
- such multi-camera unit 100 may be able to see each direction of a scene so that each spot of the scene around the multi-camera unit 100 can be viewed by at least one camera 102.
- any two cameras 102 of the multi-camera unit 100 may be regarded as a pair of cameras 102.
- a multi-camera unit of two cameras has only one pair of cameras
- a multi-camera unit of three cameras has three pairs of cameras
- a multi-camera unit of four cameras has six pairs of cameras, etc.
- a multi-camera unit 100 comprising N cameras 102, where N is an integer greater than one, has N(N-1)/2 pairs of cameras 102. Accordingly, images captured by the cameras 102 at a certain time may be considered as N(N-1)/2 pairs of captured images.
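- as an illustrative aside (not part of the original disclosure), the pair count is easy to verify by enumerating the unordered camera pairs; a minimal Python sketch:

```python
from itertools import combinations

cameras = list(range(8))                # N = 8 cameras, as in the example unit
pairs = list(combinations(cameras, 2))  # every unordered pair of cameras
n = len(cameras)
assert len(pairs) == n * (n - 1) // 2   # N(N-1)/2 = 28 pairs for N = 8
```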
- the multi-camera unit 100 of Figure la may also comprise a processor 104 for controlling the operations of the multi-camera unit 100.
- a memory 106 for storing data and computer code to be executed by the processor 104, and a transceiver 108 for communicating with, for example, a communication network and/or other devices in a wireless and/or wired manner.
- the multi-camera unit 100 may further comprise a user interface (UI) 110 for displaying information to the user, for generating audible signals and/or for receiving user input.
- the multi-camera unit 100 need not comprise each feature mentioned above, or may comprise other features as well.
- Figure 1a also illustrates some operational elements which may be implemented, for example, as computer code in the software of the processor, in hardware, or both.
- a 2D to 3D converting element 116 may convert 2D images to 3D images and vice versa; a location determination unit 124 and an orientation determination unit 126, wherein these units may provide the location and orientation information to the system.
- the location determination unit 124 and the orientation determination unit 126 may also be implemented as one unit. It should be noted that there may also be other operational elements in the multi-camera unit 100 than those depicted in Figure la and/or some of the above mentioned elements may be implemented in some other part of a system than the multi-camera unit 100.
- Figure 1b shows, as a perspective view, an example of an apparatus comprising the multi-camera unit 100.
- seven cameras 102a—102g can be seen, but the multi-camera unit 100 may comprise even more cameras which are not visible from this perspective.
- Figure lb also shows two microphones 112a, 112b, but the apparatus may also comprise one or more than two microphones.
- the multi-camera unit 100 may be controlled by another device, wherein the multi-camera unit 100 and the other device may communicate with each other and a user may use a user interface of the other device for entering commands, parameters, etc. and the user may be provided information from the multi-camera unit 100 via the user interface of the other device.
- analogously to the lens of a camera that focuses light onto a film, the lens 302 in the eye focuses light onto the retina.
- Rods 306 are responsible for vision at low light levels (scotopic vision). They do not mediate color vision and have low spatial acuity; hence, they are generally ignored in human visual system modeling.
- An optic nerve 310 couples the cones and rods to the brain.
- Cones 308 are active at higher light levels (photopic vision). They are capable of color vision and are responsible for high spatial acuity. There are three types of cones 308, generally categorized as the short-, middle-, and long-wavelength sensitive cones, i.e. S-cones, M-cones, and L-cones, respectively. These can be thought of, as an approximation, as being sensitive to the blue, green, and red color components of the perceived light. Each photoreceptor reacts to a wide range of spectral frequencies, with the peak sensitivity at approximately 440nm (blue) for S-cones, 550nm (green) for M-cones, and 580nm (red) for L-cones. The brain has the ability to reconstruct the whole color spectrum from these three color components. This theory is known as the trichromatic (Young-Helmholtz) theory of color vision.
- Figures 3a and 3c illustrate an example of the cone cell distribution 312 and the rod cell distribution 314 in the retina along a center line of an eye.
- Figure 3c also depicts respective chroma compression in an image, in accordance with an embodiment (rectangles 316, 318 and 320).
- a downsampling with ratio 1/2 along the vertical direction may be applied to the content. This is because the vertical spatial resolution of the display is divided between the left and right view and hence, each one has half the vertical resolution.
- a severe aliasing artifact might otherwise be introduced when perceiving the stereoscopic content.
- applying low-pass filtering may reduce such an artifact considerably, since the high frequency components (HFCs) responsible for the creation of aliasing are removed in a pre-processing stage.
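- a minimal sketch of such pre-processing, assuming a simple [1, 2, 1]/4 anti-aliasing kernel (the text only requires that the high frequency components be attenuated before the ratio-1/2 vertical downsampling; the kernel choice is illustrative):

```python
import numpy as np

def vertical_downsample(channel: np.ndarray) -> np.ndarray:
    """Low-pass filter along the vertical direction, then keep every second row."""
    padded = np.pad(channel.astype(np.float32), ((1, 1), (0, 0)), mode="edge")
    lowpass = (padded[:-2] + 2.0 * padded[1:-1] + padded[2:]) / 4.0  # [1, 2, 1]/4
    return lowpass[::2]  # ratio-1/2 downsampling for one view of the stereo pair
```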
- Eye gaze tracking is a process of measuring or detecting either the point of gaze (where one is looking) or the motion of an eye relative to the head.
- An eye gaze tracker is a device for measuring eye positions and eye movements, following the movement of the eye's pupil to determine the point at which the user is looking. Eye gaze trackers may be used in research on visual systems and in subjective tests, enabling researchers to follow users' eye movements over different presented content.
- Eye gaze can, for example, be tracked using a camera that follows the movement of the pupil in the user's eye.
- the process can be done in real time and with relatively low processing resources.
- An algorithm may be used that predicts the eye gaze based on characteristics of the content. The process may require a considerable amount of operations per pixel and hence may not be usable in most handheld devices due to excessive power consumption. Such an algorithm may use the eye gaze movement to estimate the next places to which the gaze may be directed, based both on the content characteristics and on the tracked movement of the eye gaze prior to the current time.
- Scalable video coding may refer to a coding structure where one bitstream can contain multiple representations of the content, for example, at different bitrates, resolutions or frame rates. In these cases the receiver can extract the desired representation depending on its characteristics (e.g. a resolution that best matches the display device).
- a meaningful decoded representation can be produced by decoding only certain parts of a scalable bit stream.
- a scalable bitstream typically consists of a "base layer" providing the lowest quality video available and one or more enhancement layers that enhance the video quality when received and decoded together with the lower layers.
- the coded representation of that layer typically depends on the lower layers.
- the motion and mode information of the enhancement layer can be predicted from lower layers.
- the pixel data of the lower layers can be used to create prediction for the enhancement layer.
- a video signal can be encoded into a base layer and one or more enhancement layers.
- An enhancement layer may enhance, for example, the temporal resolution (i.e., the frame rate), the spatial resolution, or simply the quality of the video content represented by another layer or part thereof.
- Each layer together with all its dependent layers is one representation of the video signal, for example, at a certain spatial resolution, temporal resolution and quality level.
- a scalable layer together with all of its dependent layers may be referred to as a "scalable layer representation".
- the portion of a scalable bitstream corresponding to a scalable layer representation can be extracted and decoded to produce a representation of the original signal at certain fidelity.
- a picture can be partitioned into tiles, which are rectangular and contain an integer number of largest coding units (LCUs).
- the partitioning to tiles forms a grid comprising one or more tile columns and one or more tile rows.
- a coded tile is byte-aligned, which may be achieved by adding byte-alignment bits at the end of the coded tile.
- a slice is defined to be an integer number of coding tree units contained in one independent slice segment and all subsequent dependent slice segments (if any) that precede the next independent slice segment (if any) within the same access unit.
- a slice segment is defined to be an integer number of coding tree units ordered consecutively in the tile scan and contained in a single NAL unit. The division of each picture into slice segments is a partitioning.
- an independent slice segment is defined to be a slice segment for which the values of the syntax elements of the slice segment header are not inferred from the values for a preceding slice segment
- a dependent slice segment is defined to be a slice segment for which the values of some syntax elements of the slice segment header are inferred from the values for the preceding independent slice segment in decoding order.
- a slice header is defined to be the slice segment header of the independent slice segment that is a current slice segment or is the independent slice segment that precedes a current dependent slice segment
- a slice segment header is defined to be a part of a coded slice segment containing the data elements pertaining to the first or all coding tree units represented in the slice segment.
- the CUs have a specific scan order.
- a tile contains an integer number of coding tree units, and may consist of coding tree units contained in more than one slice.
- a slice may consist of coding tree units contained in more than one tile.
- all coding tree units in a slice belong to the same tile and/or all coding tree units in a tile belong to the same slice.
- all coding tree units in a slice segment belong to the same tile and/or all coding tree units in a tile belong to the same slice segment.
- a motion-constrained tile set is such that the inter prediction process is constrained in encoding so that no sample value outside the motion-constrained tile set, and no sample value at a fractional sample position that is derived using one or more sample values outside the motion-constrained tile set, is used for inter prediction of any sample within the motion-constrained tile set.
- Images to be encoded may be received 702 or obtained otherwise by the apparatus 600 for compression.
- the images may have been captured by one camera or more than one camera, for example by the multi-camera unit 100.
- the camera 100 may have been connected with the apparatus 600 via a camera interface 614 of the apparatus, for example, or the images may have been received by other means, such as via the communication interface 612.
- for images captured substantially at the same time, a similar distribution of unevenly compressed areas may be used, or the determination of the compression ratios for different areas within each image may be independent of the other images.
- the image information may be received in the YUV-format or the apparatus 600 may convert the received image information into the YUV-format.
- the apparatus 600 may perform RGB-to-YUV conversion.
- each pixel of the image will be represented by a luma component and two chroma components.
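- a sketch of such a conversion, assuming the common BT.601 analogue YUV coefficients (one convention among several; the patent does not mandate a particular matrix):

```python
import numpy as np

# BT.601 RGB-to-YUV matrix: rows produce Y (luma), then U and V (chroma).
RGB_TO_YUV = np.array([
    [ 0.299,    0.587,    0.114  ],
    [-0.14713, -0.28886,  0.436  ],
    [ 0.615,   -0.51499, -0.10001],
])

def rgb_to_yuv(rgb: np.ndarray) -> np.ndarray:
    """rgb: (H, W, 3) array with components in [0, 1]; returns (H, W, 3) YUV,
    so each pixel is represented by a luma component and two chroma components."""
    return rgb @ RGB_TO_YUV.T
```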
- Image information may be stored into a memory 604 which may include a frame memory 608, for example.
- the apparatus 600 may comprise processing circuitry such as a processor 601 for controlling the operations of the apparatus 600, for performing compression of images etc.
- the apparatus 600 may further comprise at least a compression determination block 602 and a compression block 606.
- the apparatus 600 may also comprise a user interface 616. It should be noted that at least some of the operational entities of the apparatus 600 may be implemented as a computer code to be executed by the processor 601, or as a circuitry, or as a combination of computer code and circuitry.
- the compression determination block 602 of the apparatus may receive 704 or otherwise obtain information related to the determination of how to compress each area of the images.
- the compression determination block 602 may determine, for each area or, in pixel-wise compression, for each pixel, what kind of compression to use for that area or pixel.
- the compression determination block 602 may, for example, produce a compression map in which an indication of a compression factor, a compression ratio or a compression method for each area or each block is included. For example, if only two alternative compression factors will be used for an image, the indication may be, for example, one bit having either a logical state 0 or 1. If more than two compression alternatives are in use, more bits may be needed for that information for each area or pixel. Some examples of such information and how to obtain that information will be presented later in this application.
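- a sketch of producing such a compression map for the two-alternative case (one bit of information per area); the 16x16 block size and the centrally placed full-quality region are illustrative assumptions:

```python
import numpy as np

def make_compression_map(height: int, width: int,
                         block: int = 16, central_fraction: float = 0.5) -> np.ndarray:
    """One entry per block: 0 = original chroma quality, 1 = degraded quality.
    uint8 leaves room for more than two compression alternatives."""
    by = (height + block - 1) // block   # number of block rows
    bx = (width + block - 1) // block    # number of block columns
    cmap = np.ones((by, bx), dtype=np.uint8)       # default: degraded chroma
    y0 = int(by * (1.0 - central_fraction) / 2.0)
    x0 = int(bx * (1.0 - central_fraction) / 2.0)
    cmap[y0:by - y0, x0:bx - x0] = 0               # central part: full quality
    return cmap
```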
- Images may be processed in a pixel-by-pixel manner, or the image may be divided into blocks of pixels.
- images may be divided into macroblocks each having 16x16 pixels.
- Macroblocks may further be arranged into slices, and the slices into groups of slices, for example.
- Such an entity or pixel will also be called a coding entity in this specification.
- a picture may either be a frame or a field.
- a frame comprises a matrix of luma samples and possibly the corresponding chroma samples.
- a field is a set of alternate sample rows of a frame and may be used as encoder input, when the source signal is interlaced.
- Chroma sample arrays may be absent (and hence monochrome sampling may be in use) or chroma sample arrays may be subsampled when compared to luma sample arrays.
- Chroma formats may be summarized as follows:
- in 4:2:0 sampling, each of the two chroma arrays has half the height and half the width of the luma array.
- in 4:2:2 sampling, each of the two chroma arrays has the same height and half the width of the luma array.
- in 4:4:4 sampling, each of the two chroma arrays has the same height and width as the luma array.
- the location of chroma samples with respect to luma samples may be determined in the encoder side (e.g. as a pre-processing step or as a part of encoding).
- the chroma sample positions with respect to luma sample positions may be pre-defined for example in a coding standard, such as H.264/AVC or HEVC, or may be indicated in the bitstream for example as part of VUI of H.264/AVC or HEVC.
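- a sketch of one such subsampling step, 4:4:4 to 4:2:0 by 2x2 averaging; even plane dimensions and an averaged (rather than co-sited) chroma sample position are simplifying assumptions, since the actual positioning may be pre-defined or signalled as described above:

```python
import numpy as np

def chroma_444_to_420(chroma: np.ndarray) -> np.ndarray:
    """Halve the height and width of one chroma plane by 2x2 block averaging,
    matching the 4:2:0 relation between chroma and luma array sizes."""
    h, w = chroma.shape
    assert h % 2 == 0 and w % 2 == 0, "simplifying assumption: even dimensions"
    blocks = chroma.astype(np.float32).reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))
```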
- a partitioning may be defined as a division of a set into subsets such that each element of the set is in exactly one of the subsets.
- a macroblock is a 16x16 block of luma samples and the corresponding blocks of chroma samples.
- for example, in the 4:2:0 sampling pattern, a macroblock contains one 8x8 block of chroma samples per each chroma component.
- a picture is partitioned to one or more slice groups, and a slice group contains one or more slices.
- a slice consists of an integer number of macroblocks ordered consecutively in the raster scan within a particular slice group.
- pixels within an area will be processed in the same order in each area of the image, but it may also be possible to have different processing orders in different areas. If, for example, the macroblock-based processing is used, pixels within the macroblock may be processed in a so-called zig-zag processing order.
- the compression may be performed by a compression block 606 of the apparatus 600.
- the compression block 606 selects 706 an area of pixels for processing and obtains 708, e.g. from the compression map, the compression information corresponding to this area of pixels.
- the compression block 606 uses this information to select 710 an appropriate compression method for that area of pixels and performs the compression 712.
- After the compression it may be examined 714 whether the whole image has been examined and compressed. If so, the images may be encoded and transmitted 716 e.g. to a communication network to be received and decoded by another apparatus. If the whole image has not yet been compressed, another coding entity may be selected and the above process repeated.
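- a sketch of steps 706 to 714 for block-based coding entities, assuming a uint8 (H, W, 3) YUV image and using two-bit chroma truncation as a hypothetical stand-in for the selected compression method (the actual codec internals are not specified at this level):

```python
import numpy as np

def compress_block(block_yuv: np.ndarray, method: int) -> np.ndarray:
    """Hypothetical per-area step: method 1 coarsens the chroma planes by
    dropping two bits; method 0 leaves the area intact."""
    if method == 0:
        return block_yuv
    out = block_yuv.copy()
    out[..., 1:] = (out[..., 1:] >> 2) << 2  # quantize U and V only, keep Y
    return out

def compress_image(image: np.ndarray, cmap: np.ndarray, block: int = 16) -> np.ndarray:
    """Select each area (706), obtain its compression info from the map (708),
    compress it (710-712), and repeat until the whole image is done (714)."""
    out = image.copy()
    for by in range(cmap.shape[0]):
        for bx in range(cmap.shape[1]):
            ys = slice(by * block, (by + 1) * block)
            xs = slice(bx * block, (bx + 1) * block)
            out[ys, xs] = compress_block(out[ys, xs], int(cmap[by, bx]))
    return out  # encoding and transmission (716) would follow
```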
- the image area may be considered to include a central part, wherein the central part may be compressed with the original quality while the rest of the image (e.g. sides, parts surrounding the central part) could be compressed with degraded quality.
- the distribution of high/low quality may depend on the cone cell distribution on retina.
- Figures 4a—4c show some central image classifications, in accordance with an embodiment.
- in Figure 4a the central part of the image has a rectangular shape
- in Figure 4b the central part of the image has an oval shape
- in Figure 4c the central part of the image has a rectangular shape.
- the central area extends in the vertical direction from the top of the image to the bottom of the image but it may also be possible that the height of the central area is less than the height of the image.
- the principle of classifying the image area into the central part and one or more other parts may be utilized so that coarser compression is applied to only one or both of the chroma components (U, V) of pixels located outside the central part. It may also be possible that all three components of the YUV format, i.e. the luma and the two chroma components, are compressed with degraded quality outside the central part. Furthermore, the compression need not be the same for each of the chroma components; the two chroma components may be compressed separately and individually.
- viewer's eye gaze may be followed and further compression may be performed for the chroma component(s) in the areas which are not in the direct viewing direction of the user. In other words, stronger compression may be used in those areas of the image which are farther from the center of the viewer's gaze than areas near or at the center of the viewer's gaze.
- the direction of the gaze of the viewer may be determined and followed by using some appropriate eye gaze tracking method.
- statistics of different viewers may be collected, specifically comprising the viewers' viewing directions in the same image or in different parts of the same video. These statistics may be stored and retrieved when playing back the same video for a current viewer.
- Viewing direction may include, for example, user's viewport orientation in 360-degree video/image content and/or gaze direction.
- Viewport orientation may correspond, for example, to the head orientation when a head-mounted display is used.
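- a sketch of deriving a per-block compression level from a tracked gaze point; the block grid and the two distance thresholds (in block units) are illustrative assumptions:

```python
import numpy as np

def gaze_compression_map(blocks_y: int, blocks_x: int,
                         gaze_by: float, gaze_bx: float,
                         radii=(4.0, 8.0)) -> np.ndarray:
    """Level 0 near the centre of the viewer's gaze, stronger chroma
    compression (higher levels) for blocks farther from it."""
    yy, xx = np.mgrid[0:blocks_y, 0:blocks_x]
    dist = np.hypot(yy - gaze_by, xx - gaze_bx)
    cmap = np.zeros((blocks_y, blocks_x), dtype=np.uint8)
    for level, radius in enumerate(radii, start=1):
        cmap[dist > radius] = level
    return cmap
```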
- an object tracking method may be used to track the moving objects in the scene, keeping the quality of the chroma components for those moving objects intact while the quality of the chroma components for the other, static objects in the scene is reduced.
- characters may be detected from the image and the other areas (i.e. areas with no alphanumeric characters) may be encoded with lower quality of chroma components.
- human faces and/or human bodies may be detected in the image, wherein other areas may be encoded with lower quality of chroma components.
- the chroma components may be down-sampled to reduce the number of samples of the chroma components. For example, downsampling may be chosen to result in a 4:2:0 chroma format in areas where further chroma compression is desired, while using the 4:4:4 chroma format otherwise. It may also be possible to decrease the bit-depth allocated to the chroma components, or otherwise reduce the value range used to represent them. As another option, the quantization step for transform coefficients may be increased to encode the chroma components. Furthermore, the chroma components may be low-pass filtered to remove any potential high frequency components.
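- a sketch of one of the options above, coarser quantization of chroma transform coefficients; the uniform quantizer and the step value are illustrative choices, not the codec's actual quantizer design:

```python
import numpy as np

def quantize_chroma_coefficients(coeffs: np.ndarray, step: float = 8.0) -> np.ndarray:
    """A larger step in the degraded areas leaves fewer distinct coefficient
    levels and hence fewer bits are needed to encode the chroma samples."""
    return np.round(coeffs / step) * step
```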
- a function can be considered that defines the compression of the chroma components as the distance from the center of the image increases.
- following the central concentration of cones, the central 20 degrees could be encoded with the highest quality of chroma samples. This is illustrated with the dotted lines 502 in Figure 5a.
- the next 20 degrees may be encoded with a lower chroma quality.
- Figure 5a shows a rectangular center part, in accordance with an embodiment.
- Figure 5b shows an oval center part and the dotted lines 502, 504 illustrate how the chroma component quality could gradually change going from center part to sides of the image, in accordance with an embodiment.
- Figure 5c shows a diamond shaped center part and the dotted lines 502, 504 illustrate how the chroma component quality could gradually change going from center part to sides of the image, in accordance with an embodiment.
- the chroma component quality decreases as the layered zones get closer to the edges of the image. A similar approach to the transition from high to low quality may be taken to further reduce the chroma quality in the areas where there are fewer cone cells to perceive the color.
- the rest of the viewing degrees (illustrated with the area 506 in Figures 5a—5c) may be encoded with the lowest chroma component quality.
- the layers in the image may be created so that the main direction of change is aligned with the horizontal direction, i.e. the width of the image. This is because the uneven distribution of the cones in the retina reflects more to the horizontal direction of the image.
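- a sketch of such a zone mapping, taking the angular distance of a pixel from the image centre and returning a chroma-quality zone; interpreting the 20-degree figures as uniform bands is a simplification of the gradual change the text allows:

```python
def chroma_zone(eccentricity_deg: float) -> int:
    """Zone 0 = highest chroma quality (central part, lines 502),
    zone 1 = lower quality (lines 504), zone 2 = lowest (area 506)."""
    if eccentricity_deg <= 10.0:   # central 20 degrees of visual angle
        return 0
    if eccentricity_deg <= 20.0:   # next 20-degree band
        return 1
    return 2                       # rest of the viewing degrees
```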
- the central part and the layered zones may be selected to follow coding unit boundaries.
- the sizes of the central part and the layered zones depend on the physical screen size and the viewing distance, i.e. the angular size of the image.
- the viewing distance may be detected for example with a camera mounted on the screen pointing towards the user, and detecting eyes from the camera image and measuring inter-pupillary distance.
- the viewing distance may be detected with a depth sensing camera.
- the physical screen size and the viewing distance are approximately or exactly known. For example, when the content is viewed with a head-mounted display, the physical screen size and the viewing distance are determined by the optical setup of the head-mounted display.
- a receiver may indicate the information indicative of the physical screen size and the viewing distance to the encoder.
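- the angular size of the image follows from standard geometry; a small sketch (the 0.6 m screen width and 1 m viewing distance are purely illustrative numbers):

```python
import math

def angular_width_deg(screen_width_m: float, viewing_distance_m: float) -> float:
    """Horizontal angular size of the image as seen by the viewer."""
    return 2.0 * math.degrees(math.atan(screen_width_m / (2.0 * viewing_distance_m)))

# e.g. a 0.6 m wide screen viewed from 1 m spans about 33.4 degrees
assert round(angular_width_deg(0.6, 1.0), 1) == 33.4
```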
- the different chroma qualities can be achieved based, for example, on at least one of the following approaches. Different bit-depths (numbers of bits) may be assigned to represent the chroma content. In this scenario, the higher the number of bits representing the chroma components of any specific region of the image, the higher the quality of the color components in that region may be.
- the value range to represent chroma components may be reduced. For example, rather than using a nominal value range of 16 to 240 (for 8-bit chroma components), chroma components could be rescaled to occupy a value range of 32 to 224.
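- a sketch of this rescaling for 8-bit chroma samples, using exactly the 16..240 to 32..224 mapping mentioned above:

```python
import numpy as np

def rescale_chroma_range(chroma: np.ndarray,
                         src=(16.0, 240.0), dst=(32.0, 224.0)) -> np.ndarray:
    """Linearly map the nominal chroma value range onto a narrower one."""
    c = chroma.astype(np.float32)
    scaled = (c - src[0]) * (dst[1] - dst[0]) / (src[1] - src[0]) + dst[0]
    return np.clip(np.round(scaled), 0, 255).astype(np.uint8)
```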
- the chroma transform coefficients may be quantized with a coarser quantization parameter.
- Chroma components may also be low-pass filtered to remove any high frequency components and hence reduce the bitrate required to encode the samples. It may also be possible to use different color gamuts, e.g. ITU-R BT.2020 for the central part and ITU-R BT.709 for the other parts.
- a different chroma format may be used for the central part than for the other parts.
- for example, 4:4:4 sampling may be used for the central part and 4:2:0 sampling for the other parts.
- the further compression of chroma components may or may not follow the compression decision on the luma components. This means that if the compression decision on the luma components is to encode them more severely, this may translate directly to further encoding the chroma components too.
- the encoding of the chroma components may be completely independent from the encoding of the luma components. This means that the chroma quality (level of compression) may be defined solely based on the function reflecting the distribution of cone cells in the retina, rather than on any other factor that may change the compression strength of the luma components.
- the compression of chroma components may also be selected separately and differently for each of the two chroma components (U, V).
- multiple versions of the content are encoded with motion- constrained tile sets (or alike).
- the versions may have the same luma fidelity and different chroma fidelity compared to each other.
- Different chroma fidelity may be achieved by any of the above-described methods for achieving further compression of chroma components, such as downsampled chroma sample array size (compared to the luma sample array size), selecting bit-depth and/or value range for chroma samples, quantization step size for chroma, low-pass filtering of chroma.
- a tile stream may be extracted from each tile set position of the encoded bitstream. Tile streams from different versions may be selectively transmitted so that for the central part the chroma is represented with a better quality than in the adjacent areas.
- content is compressed with motion-constrained tile sets (or alike) and the chroma components are coded in a scalable manner. This may enable compression of the content in an even manner and trimming the stream at the time of transmission.
- the chroma components are selectively transmitted so that for the central part a greater number of scalable layers (or alike) are transmitted compared to the adjacent areas.
- the chroma component coding in a scalable manner may be achieved in any of the following ways.
- Bit-depth scalability may be used for chroma components.
- Motion-constrained tile sets (or alike) may need to be used for the enhancement layer but not necessarily for the base layer.
- Color gamut scalability may be used, wherein motion-constrained tile sets (or alike) may need to be used for the enhancement layer but not necessarily for the base layer.
- Chroma format scalability may be used, wherein motion-constrained tile sets (or alike) may need to be used for the enhancement layer but not necessarily for the base layer.
- the input sequence for the base layer encoding may be processed e.g. by low- pass filtering chroma components, compared to the input sequence for the enhancement layer encoding.
- Motion-constrained tile sets (or alike) may need to be used for the enhancement layer but not necessarily for the base layer.
- Data partitioning may be used in creating more than one partition of chroma transform coefficients.
- region-of-interest enhancement layers are encoded.
- tile streams and/or scalable layers in the above-described embodiments may be described in terms of their chroma properties in a streaming manifest, a media presentation description, or alike (hereafter jointly referred to as the streaming manifest).
- a client parses the streaming manifest, and specifically the chroma properties of available tile streams and/or scalable layers. The client selects the tile streams and/or scalable layers in a manner that the chroma fidelity according to the parsed chroma properties for the central part is higher than that for the adjacent areas.
- the above-described embodiments for selectively encoding areas of the image with different chroma properties can be applied with the present embodiment by selectively requesting areas of the image with different chroma properties.
- the selection may be done for example by requesting the tile streams and/or scalable layers through their respective Uniform Resource Locators (URLs) parsed from the streaming manifest, e.g. using the HTTP protocol.
- the client may receive, decode, and play the selected tile streams and/or scalable layers.
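- a sketch of that client-side selection; the manifest dictionary layout and the "chroma_fidelity" key are assumptions standing in for the equivalent attributes a real streaming manifest (e.g. a DASH media presentation description) would carry:

```python
def select_tile_streams(manifest: dict, central_positions: set) -> list:
    """For each tile position, pick the version with the highest chroma
    fidelity for the central part and the lowest for the adjacent areas,
    returning the URLs to request (e.g. over HTTP)."""
    selected = []
    for position, versions in manifest["tiles"].items():
        if position in central_positions:
            best = max(versions, key=lambda v: v["chroma_fidelity"])
        else:
            best = min(versions, key=lambda v: v["chroma_fidelity"])
        selected.append(best["url"])
    return selected
```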
- a control element, such as a server, may receive image information from the cameras and perform the image correction tasks using the principles presented above.
- the cameras may provide, or the control unit may obtain in another way, information on the location and pose of the cameras to determine overlapping areas.
- Figure 8 shows a schematic block diagram of an exemplary apparatus or electronic device 50 depicted in Figure 9, which may incorporate a transmitter according to an embodiment of the invention.
- the electronic device 50 may for example be a mobile terminal or user equipment of a wireless communication system.
- the apparatus 50 may comprise a housing 30 for incorporating and protecting the device.
- the apparatus 50 further may comprise a display 32 in the form of a liquid crystal display.
- the display may be any suitable display technology suitable to display an image or video.
- the apparatus 50 may further comprise a keypad 34.
- any suitable data or user interface mechanism may be employed.
- the user interface may be implemented as a virtual keyboard or data entry system as part of a touch-sensitive display.
- the apparatus may comprise a microphone 36 or any suitable audio input which may be a digital or analogue signal input.
- the apparatus 50 may further comprise an audio output device which in embodiments of the invention may be any one of: an earpiece 38, speaker, or an analogue audio or digital audio output
- the apparatus 50 may also comprise a battery 40 (or in other embodiments of the invention the device may be powered by any suitable mobile energy device such as solar cell, fuel cell or clockwork generator).
- the device used in connection with the embodiments may also be powered by one of these mobile energy devices.
- the apparatus 50 may comprise a combination of different kinds of energy devices, for example a rechargeable battery and a solar cell.
- the apparatus may further comprise an infrared port 41 for short range line of sight communication to other devices.
- the apparatus 50 may further comprise any suitable short range communication solution such as for example a Bluetooth wireless connection or a USB/FireWire wired connection.
- the apparatus 50 may comprise a controller 56 or processor for controlling the apparatus 50.
- the controller 56 may be connected to memory 58, which in embodiments of the invention may store both data and instructions for implementation on the controller 56.
- the controller 56 may further be connected to codec circuitry 54 suitable for carrying out coding and decoding of audio and/or video data or assisting in coding and decoding carried out by the controller 56.
- the apparatus 50 may further comprise a card reader 48 and a smart card 46, for example a universal integrated circuit card (UICC) reader and a universal integrated circuit card, for providing user information and being suitable for providing authentication information for authentication and authorization of the user at a network.
- the apparatus 50 may comprise radio interface circuitry 52 connected to the controller and suitable for generating wireless communication signals for example for communication with a cellular communications network, a wireless communications system or a wireless local area network.
- the apparatus 50 may further comprise an antenna 60 connected to the radio interface circuitry 52 for transmitting radio frequency signals generated at the radio interface circuitry 52 to other apparatus(es) and for receiving radio frequency signals from other apparatus(es).
- the apparatus 50 comprises a camera 42 capable of recording or detecting images.
- the system 10 comprises multiple communication devices which can communicate through one or more networks.
- the system 10 may comprise any combination of wired and/or wireless networks including, but not limited to, a wireless cellular telephone network (such as a global systems for mobile communications (GSM), universal mobile telecommunications system (UMTS), long term evolution (LTE) or code division multiple access (CDMA) network), a wireless local area network (WLAN) such as defined by any of the IEEE 802.x standards, a Bluetooth personal area network, an Ethernet local area network, a token ring local area network, a wide area network, and the internet.
- the system shown in Figure 10 shows a mobile telephone network 11 and a representation of the internet 28.
- Connectivity to the internet 28 may include, but is not limited to, long range wireless connections, short range wireless connections, and various wired connections including, but not limited to, telephone lines, cable lines, power lines, and similar communication pathways.
- the example communication devices shown in the system 10 may include, but are not limited to, an electronic device or apparatus 50, a combination of a personal digital assistant (PDA) and a mobile telephone 14, a PDA 16, an integrated messaging device (IMD) 18, a desktop computer 20, a notebook computer 22, a tablet computer.
- the apparatus 50 may be stationary or mobile when carried by an individual who is moving.
- the apparatus 50 may also be located in a mode of transport including, but not limited to, a car, a truck, a taxi, a bus, a train, a boat, an airplane, a bicycle, a motorcycle or any similar suitable mode of transport.
- Some or further apparatus may send and receive calls and messages and communicate with service providers through a wireless connection 25 to a base station 24.
- the base station 24 may be connected to a network server 26 that allows communication between the mobile telephone network 11 and the internet 28.
- the system may include additional communication devices and communication devices of various types.
- the communication devices may communicate using various transmission technologies including, but not limited to, code division multiple access (CDMA), global systems for mobile communications (GSM), universal mobile telecommunications system (UMTS), time divisional multiple access (TDMA), frequency division multiple access (FDMA), transmission control protocol-internet protocol (TCP-IP), short messaging service (SMS), multimedia messaging service (MMS), email, instant messaging service (IMS), Bluetooth, IEEE 802.11, Long Term Evolution wireless communication technique (LTE) and any similar wireless communication technology.
- a communications device involved in implementing various embodiments of the present invention may communicate using various media including, but not limited to, radio, infrared, laser, cable connections, and any suitable connection.
- although the above examples describe embodiments of the invention operating within a wireless communication device, the invention as described above may be implemented as a part of any apparatus comprising circuitry in which radio frequency signals are transmitted and received.
- embodiments of the invention may be implemented in a mobile phone, in a base station, in a computer such as a desktop computer or a tablet computer comprising radio frequency communication means (e.g. wireless local area network, cellular radio, etc.).
- radio frequency communication means e.g. wireless local area network, cellular radio, etc.
- the various embodiments of the invention may be implemented in hardware or special purpose circuits or any combination thereof. While various aspects of the invention may be illustrated and described as block diagrams or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non- limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
- Embodiments of the invention may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
- the method comprises:
- the cone cell distribution in the retina of the human eye determines at least the first area and the second area in the image.
- the method comprises:
- the method further comprises:
- an apparatus comprising at least one processor; and at least one memory including computer program code the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
- said at least one memory including computer program code configured to, with the at least one processor, cause the apparatus to:
- the cone cell distribution in the retina of the human eye determines at least the first area and the second area in the image.
- an apparatus comprising at least one processor; and at least one memory including computer program code the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
- the representations comprising luma and chroma components and being characterized in the media presentation description by chroma properties;
- an apparatus comprising: means for receiving an image presented by luma and chroma components;
- the apparatus comprises:
- a computer readable storage medium stored with code thereon for use by an apparatus, which when executed by a processor, causes the apparatus to perform:
Abstract
Various methods, apparatuses and computer program products for image compression are disclosed. According to an embodiment, a method comprises receiving an image presented by luma and chroma components; determining at least two different areas in the image; and encoding the chroma component of each of the at least two areas differently.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1708613.3A GB2563037A (en) | 2017-05-31 | 2017-05-31 | Method and apparatus for image compression |
GB1708613.3 | 2017-05-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018220260A1 (fr) | 2018-12-06 |
Family
ID=59270885
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FI2018/050292 WO2018220260A1 (fr) | 2018-04-24 | Method and apparatus for image compression |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2563037A (fr) |
WO (1) | WO2018220260A1 (fr) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6275614B1 (en) * | 1998-06-26 | 2001-08-14 | Sarnoff Corporation | Method and apparatus for block classification and adaptive bit allocation |
JP2000197050A (ja) * | 1998-12-25 | 2000-07-14 | Canon Inc | Image processing apparatus and method |
US7035459B2 (en) * | 2001-05-14 | 2006-04-25 | Nikon Corporation | Image compression apparatus and image compression program |
CN107211121B (zh) * | 2015-01-22 | 2020-10-23 | 联发科技(新加坡)私人有限公司 | Video encoding method and video decoding method |
- 2017
- 2017-05-31: GB application GB1708613.3A filed (GB2563037A, not active, withdrawn)
- 2018
- 2018-04-24: PCT application PCT/FI2018/050292 filed (WO2018220260A1, active application filing)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070024706A1 (en) * | 2005-08-01 | 2007-02-01 | Brannon Robert H Jr | Systems and methods for providing high-resolution regions-of-interest |
US20080240250A1 (en) * | 2007-03-30 | 2008-10-02 | Microsoft Corporation | Regions of interest for quality adjustments |
US20110258344A1 (en) * | 2010-04-15 | 2011-10-20 | Canon Kabushiki Kaisha | Region of interest-based image transfer |
US20130294505A1 * | 2011-01-05 | 2013-11-07 | Koninklijke Philips N.V. | Video coding and decoding devices and methods preserving PPG relevant information |
US20150264404A1 (en) * | 2014-03-17 | 2015-09-17 | Nokia Technologies Oy | Method and apparatus for video coding and decoding |
US20150334398A1 (en) * | 2014-05-15 | 2015-11-19 | Daniel Socek | Content adaptive background foreground segmentation for video coding |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP4228266A4 (fr) * | 2020-10-08 | 2024-09-04 | Riken | Image processing device, image processing method, and non-transitory computer-readable medium storing an image processing program |
CN117292003A (zh) * | 2023-11-27 | 2023-12-26 | 深圳对对科技有限公司 | Picture cloud data storage method for computer networks |
CN117292003B (zh) * | 2023-11-27 | 2024-03-19 | 深圳对对科技有限公司 | Picture cloud data storage method for computer networks |
Also Published As
Publication number | Publication date |
---|---|
GB2563037A (en) | 2018-12-05 |
GB201708613D0 (en) | 2017-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111095933B (zh) | Image processing apparatus and method | |
US10609400B2 (en) | Image processing device and image processing method | |
AU2019203367B2 (en) | Decoding device and decoding method, and encoding device and encoding method | |
AU2018320382B2 (en) | Image encoder, image decoder, image encoding method, and image decoding method | |
JP6780761B2 (ja) | Image encoding apparatus and method | |
EP3528498A1 (fr) | Encoding device, decoding device, encoding method, and decoding method | |
KR102126886B1 (ko) | Decompression of residual data during signal encoding, decoding and reconstruction in a stepped hierarchy | |
US9743100B2 (en) | Image processing apparatus and image processing method | |
EP3544300A1 (fr) | Encoding device, decoding device, encoding method, and decoding method | |
CN108924563B (zh) | Image processing device and method | |
EP3544299A1 (fr) | Encoding device, decoding device, encoding method, and decoding method | |
CN113767633A (zh) | Method and apparatus for interaction between decoder-side intra mode derivation and adaptive intra prediction modes | |
EP2843951B1 (fr) | Image processing device and image processing method | |
US20110026591A1 (en) | System and method of compressing video content | |
US8958474B2 (en) | System and method for effectively encoding and decoding a wide-area network based remote presentation session | |
EP2809073A1 (fr) | Bit rate control for video coding using object-of-interest data | |
JP2020174400A (ja) | Image decoding apparatus and method | |
EP2978220B1 (fr) | Device and method for decoding an image | |
JP7047776B2 (ja) | Encoding device and encoding method, and decoding device and decoding method | |
US20200014925A1 (en) | Encoding apparatus, decoding apparatus, encoding method, and decoding method | |
US20130195186A1 (en) | Scalable Video Coding Extensions for High Efficiency Video Coding | |
EP2852158A1 (fr) | Image processing device and image processing method | |
WO2018220260A1 (fr) | Method and apparatus for image compression | |
WO2018229327A1 (fr) | Method, apparatus and computer program product for video encoding and decoding | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18810017 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 18810017 Country of ref document: EP Kind code of ref document: A1 |