WO2019066704A1 - Method in an image compression device and in an image decompression device


Info

Publication number
WO2019066704A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
resolution
edge map
edge
decompression device
Application number
PCT/SE2018/050979
Other languages
French (fr)
Inventor
Yong Yao
Original Assignee
Dozero Tech Ab
Application filed by Dozero Tech Ab filed Critical Dozero Tech Ab
Publication of WO2019066704A1

Classifications

    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/14Coding unit complexity, e.g. amount of activity or edge presence estimation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223Cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/436Interfacing a local distribution network, e.g. communicating with another STB or one or more peripheral devices inside the home
    • H04N21/4363Adapting the video stream to a specific local network, e.g. a Bluetooth® network
    • H04N21/43637Adapting the video stream to a specific local network, e.g. a Bluetooth® network involving a wireless protocol, e.g. Bluetooth, RF or wireless LAN [IEEE 802.11]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440263Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the spatial resolution, e.g. for displaying on a connected PDA
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44218Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4728End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server
    • H04N21/6587Control parameters, e.g. trick play commands, viewpoint selection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g 3D video
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/014Head-up displays characterised by optical features comprising information/image processing systems
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/0147Head-up displays characterised by optical features comprising a device modifying the resolution of the displayed image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering

Definitions

  • The invention relates to image compression and decompression.
  • In the field of image processing, images are often compressed and decompressed in order to be transmitted efficiently. Compression of an image involves finding redundant information in the image and removing it, which is performed by an image compression device (which may be implemented in an encoder), and thereafter somehow restoring the redundant information, which is done by an image decompression device (which may be implemented in a decoder).
  • Image compression/decompression devices are for example available as software to be run on general-purpose computers, as embedded software on dedicated chipsets, and as dedicated hardware. Images are compressed in order to transmit them, store them or for other purposes. In general, it is desirable for an image compression/decompression device to be able to remove as much information as possible while retaining as much of the perceived image quality as possible.
  • Virtual and augmented reality (VR/AR) in particular demand high levels of computational power to generate images in real time, while many of their applications, such as construction, operation of vehicles or machinery, surgery or portable gaming, benefit from portability.
  • A problem with existing solutions is that in order for the system to provide a high-quality image in terms of resolution and color data, a powerful computer is needed. Such a computer is unlikely to be portable, and with processor-heavy calculations such as graphics, battery life is an issue on portable computers. A powerful portable computer is also expensive.
  • One solution to this problem is to generate images on a powerful stationary computer and then transmit them to a portable VR/AR device, which can use less advanced and more easily powered circuitry to display the transmitted images.
  • A problem with this approach is that high-quality images are demanding to send over a network. If the images are to appear as real-time images, the quality must currently be kept down in order to transmit them quickly from the stationary computer to the portable VR/AR device. Thus, there is a need for improvements.

Summary of invention
  • GPU: graphics processing unit
  • The invention solves the problem of preserving high frequency information by providing an edge map along with the downscaled image data to the image decompression device, whereby the image decompression device can employ logic to re-create edges in the first image which would otherwise be smoothed by the image decompression device during upscaling.
  • Edge maps can be represented with very efficient data types, and thus high frequency information may be preserved in an efficient manner.
  • The first image may be represented on a square lattice, and the method may further comprise: mapping the first image from the square lattice to a hexagonal lattice, detecting edges in the first image when mapped to the hexagonal lattice so as to produce an edge map on the hexagonal lattice, and mapping the edge map from the hexagonal lattice to the square lattice to produce an edge map on the square lattice before transmitting the edge map on the square lattice to the image decompression device.
  • Detecting edges on a hexagonal lattice provides the possibility of finer spatial resolution when representing edges passing through points, as shall be explained in the detailed description of the invention.
  • The step of transmitting the second image and the edge map to the image decompression device may be performed over an IEEE 802.11 network, a 4G network or a 5G network.
  • This allows for the use of existing long-range network infrastructure to send images.
  • Mobile phones are ubiquitous and capable of simple processing tasks. Using the 4G or 5G networks already available to mobile phone users makes it possible to perform processing-heavy image calculations on stationary computers and then send the results to a mobile phone, which renders the image, e.g., for a connected VR or AR headset, the screen of the mobile phone or any other connected image displaying device.
  • The bandwidth of such networks may not support transmission of image data in real time unless the data is sufficiently compressed, while heavy compression that does not preserve high frequency information in the compressed image degrades the user experience.
  • Use of such a network is enabled by the invention.
  • The step of detecting edges may be performed using a Canny-like or gradient-based edge detector.
  • Such edge detection schemes are advantageous because they provide reliable and computationally efficient edge detection.
  • The edge map may comprise a binary sequence, wherein a position in the binary sequence may be set to either an 'on' value indicating the presence of an edge at a coordinate of the first image or an 'off' value indicating the absence of an edge at a coordinate of the first image. This allows for a compact representation of the edge map which is readily stored and/or transmitted.
  • The invention further relates to an image compression device, comprising: a receiver configured to receive a first image having a first resolution; a scaler configured to downscale the first image to produce a second image having a second resolution being lower than the first resolution; an edge detector configured to detect edges in the first image so as to produce an edge map which indicates where in the first image edges are present; and a transmitter configured to transmit the second image and the edge map to an image decompression device.
  • Such an image compression device is advantageous for use in applications involving transmission of high quality images in real time, for the reasons expounded on above.
  • The invention also relates to a computer program product comprising a computer-readable medium having computer code instructions stored thereon adapted to carry out the image compression method of the first aspect when executed by a processor. It is advantageous to use a computer for tasks requiring fast calculation of results.
  • A second aspect of the invention pertains to a method in an image decompression device, comprising: receiving a second image having a second resolution and an edge map from an image compression device, upscaling the second image to produce a third image having a third resolution being greater than the second resolution, updating the third image based on the received edge map, and transmitting the updated third image to an image renderer.
  • Using the edge map, which was cheaply transmitted to the image decompression device, to update the image before rendering enables restoring high-frequency information in the received image, thereby providing accurate image information to the renderer.
  • The method may further comprise receiving, from an eye tracking system, a gaze direction of a user in relation to the third image, wherein the step of updating the third image further comprises selecting an area of the third image currently being looked at by the user based on the received gaze direction, and updating only the selected area of the third image based on the received edge map. Tracking the gaze of the user saves computer resources, as operations are not performed on parts of the image which will not be viewed directly by the user and thus are less important.
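  • As a minimal sketch of such gaze-based selection (the function name, coordinate convention and fixed square window are illustrative assumptions, not taken from the patent):

```python
def gaze_window(width, height, gaze_x, gaze_y, radius):
    """Return the bounds (x0, y0, x1, y1) of a square region of
    interest around the gaze point, clamped to the image size.
    Only pixels inside this window would then be updated using
    the edge map."""
    x0 = max(0, gaze_x - radius)
    y0 = max(0, gaze_y - radius)
    x1 = min(width, gaze_x + radius)
    y1 = min(height, gaze_y + radius)
    return x0, y0, x1, y1
```

The decompression device would apply the edge-based update only to pixels within these bounds, leaving the periphery as produced by plain upscaling.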
  • The invention further relates to an image decompression device, comprising: a receiver configured to receive a second image having a second resolution and an edge map from an image compression device; a scaler configured to upscale the second image to produce a third image having a third resolution being greater than the second resolution; an updating unit configured to update the third image based on the received edge map; and a transmitter configured to transmit the updated third image to an image renderer.
  • Such an image decompression device is advantageous for use in applications involving transmission of high quality images in real time, for the reasons expounded on above.
  • The invention further relates to a computer program product comprising a computer-readable medium having computer code instructions stored thereon adapted to carry out the image decompression method of the second aspect when executed by a processor.
  • A third aspect of the invention relates to a system comprising an image compression device and an image decompression device as explained above, the image decompression device being arranged to communicate with the image compression device.
  • The system may further comprise an eye tracker configured to monitor the eyes of a user, determine a gaze direction of the user in relation to an image, and transmit the gaze direction to the image decompression device.
  • Fig. 1 shows a schematic diagram of an image decompression device and an image compression device according to embodiments.
  • Fig. 2 shows a schematic diagram of processing steps of a method in an image compression device according to embodiments.
  • Fig. 3 shows a schematic diagram of processing steps of a method in an image decompression device according to embodiments.
  • Fig. 4 shows a square lattice and a hexagonal lattice.
  • Fig. 5 illustrates how an edge map helps preserve high frequency information.
  • Fig. 6 shows a schematic diagram of a system comprising a computer and a VR/AR headset according to embodiments.
  • Fig. 7 is a flow chart of a method in an image compression device according to embodiments.
  • Fig. 8 is a flow chart of a method in an image decompression device according to embodiments.
  • Fig. 1 shows a system comprising an image compression device 100, an image decompression device 200, and a renderer 220.
  • The image compression device 100 comprises a receiver 160, a scaler 170, an edge detector 180, and a transmitter 190.
  • The image decompression device 200 comprises a receiver 260, a scaler 270, an updating unit 280, and a transmitter 290.
  • The image compression device 100 and the image decompression device 200 may each comprise circuitry which is configured to implement the components 160, 170, 180, 190, 260, 270, 280, 290 and, more specifically, their functionality.
  • The scaler 170 may thus comprise circuitry which, when in use, downscales an image having a first resolution to produce a second image having a second, lower resolution.
  • The circuitry may instead be in the form of one or more processors, such as one or more microprocessors, digital signal processors, or field programmable gate arrays, which in association with computer code instructions stored on a (non-transitory) computer-readable medium, such as a non-volatile memory, cause the image compression device 100 and the image decompression device 200 to carry out embodiments of any method disclosed herein.
  • The components 160, 170, 180, 190, 260, 270, 280, 290 may thus each correspond to a portion of computer code instructions stored on the computer-readable medium that, when executed by the processor, causes the image compression device 100 or the image decompression device 200 to carry out the functionality of the component.
  • The image compression device 100, the image decompression device 200, the renderer 220 and all the elements thereof - the receiver 160, the edge detector 180, the scaler 170, and the transmitter 190 of the image compression device 100, the receiver 260, the scaler 270, the updating unit 280, and the transmitter 290 of the image decompression device 200, and the renderer 220 - may be implemented as compression software on a home computer or server, as compression software embedded on a dedicated chipset, as a dedicated hardware component, or on any other equipment suitable for compressing and decompressing images, together or by themselves.
  • The edge detector 180 of the image compression device 100 may in particular be implemented as software supported by OpenGL or another suitable graphics library, using a dedicated GPU, for example in a home computer. Such an edge detector 180 may be implemented using shader technology readily available through such libraries and hardware.
  • In step S02, the receiver 160 receives a first image 110 having a first resolution.
  • The image with the first resolution will henceforth be referred to as the "hi-res image" 110, and is symbolized in Fig. 2 by an image of a Hungarian lamp post.
  • The hi-res image 110 may upon receipt by the receiver 160 be represented on a square lattice, as denoted by the square symbol in the bottom right corner of the box symbolizing the hi-res image 110 in Fig. 2.
  • By an image being represented on a square lattice is meant that the center points of the pixels of the image are arranged in a two-dimensional square lattice. This is further illustrated in the left part of Fig. 4, where a portion of an image represented on a square lattice 400 is shown.
  • The squares correspond to pixels of the image, and the dots correspond to the center points of the pixels.
  • Alternatively, the hi-res image 110 may be represented on another lattice, such as a hexagonal lattice.
  • By an image being represented on a hexagonal lattice is meant that the center points of the pixels of the image are arranged in a two-dimensional hexagonal lattice. This is further illustrated in the right part of Fig. 4, where a portion of an image represented on a hexagonal lattice 600 is shown. On the hexagonal lattice 600, the center points of the pixels on every other row of pixels are shifted by a distance of half a pixel in relation to the center points of the pixels on the other rows.
  • The receiver 160 forwards the hi-res image 110 to the edge detector 180 and to the scaler 170.
  • In step S04, the scaler 170 downscales the first image 110 to produce a second image 120 having a second resolution which is lower than the first resolution.
  • The second image 120 is henceforth referred to as the "lo-res image" 120.
  • "Downscaling" in the context of this application is to be understood as reducing the number of pixels used to represent an image. This can be done in several different ways - one is to simply remove a number of pixels from the image, e.g. every other pixel. It is also possible to downscale the first image 1 10 by using interpolation such as bi-linear or bi-cubic resampling based interpolation. A further option is to use averaging sampling, which is simple, fast and near-optimal.
  • Averaging sampling considers the values of a group of points, for instance a 2x2 square, and averages them to create one value. This produces an image of lower resolution, i.e., having fewer pixels. Further operations, such as adding a noise component to the downscaled image 120, may be performed within the scope of downscaling the image.
  • The noise component is used to model various unknown elements in the further image processing chain, such as thermal or electrical noise, or transmission errors.
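  • The averaging sampling described above can be sketched as follows (a minimal pure-Python illustration; the function name and the grayscale list-of-rows image representation are assumptions):

```python
def downscale_avg(img, factor=2):
    """Downscale a grayscale image (a list of rows of pixel values)
    by averaging each factor x factor block into a single pixel."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - h % factor, factor):
        row = []
        for x in range(0, w - w % factor, factor):
            # collect the block of factor*factor source pixels
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

With factor=2 this implements exactly the 2x2-square averaging mentioned above, halving the resolution in each dimension.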
  • The second resolution may be any suitable resolution lower than the first resolution.
  • In one embodiment, the first image 110 is downscaled by a factor of 0.5. In another embodiment, the first image 110 is downscaled by a factor of 0.25.
  • The downscaling factor applied will affect the image quality when upscaling is subsequently performed by the scaler 270 of the image decompression device 200.
  • The edge detector 180 detects edges in the first image 110 so as to produce an edge map 150 which indicates where in the first image edges are present.
  • The edge detector 180 may apply an edge detection algorithm as known in the art, such as a Canny edge detector, a gradient-based algorithm, or any other edge detection algorithm suitable for detecting edges in an image. Processing steps which may be carried out by the edge detector 180 are shown in the dashed box of Fig. 2.
  • The resulting edge map 150 is typically defined with respect to the same lattice as the first image 110.
  • If the first image 110 is represented on a square lattice, the edge map 150 is defined with respect to the same square lattice, and if the first image 110 is represented on a hexagonal lattice, the edge map 150 is defined with respect to the same hexagonal lattice.
  • The edge detector 180 may thus operate to detect edges with respect to a square lattice if the first image 110 is represented on a square lattice, and with respect to a hexagonal lattice if the first image 110 is represented on a hexagonal lattice.
  • Alternatively, the edge detector 180 may first map the hi-res image 110 onto a hexagonal lattice 600 to produce a hi-res image 130 on a hexagonal lattice. This is the case in the example of Fig. 2, and the small hexagon in the bottom right corner of image 130 in Fig. 2 illustrates the fact that image 130 is represented on a hexagonal lattice.
  • Fig. 4 further illustrates how to perform the mapping of the hi-res image 110 from the square lattice 400 to the hexagonal lattice 600.
  • The left part of Fig. 4 shows a portion of the hi-res image 110 when represented on the square lattice 400. The center points of the pixels of image 110 are thus arranged in a two-dimensional square lattice.
  • The right part of Fig. 4 shows a portion of the hi-res image 130 when represented on the hexagonal lattice 600. The center points of the pixels of image 130 are thus arranged in a two-dimensional hexagonal lattice 600.
  • The center points of the pixels on every other row of the hexagonal lattice 600 are thus shifted by a distance of half a pixel compared to the square lattice 400.
  • For this mapping, interpolation of image data, such as linear interpolation or spline interpolation, may be used.
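  • A simple approximation of this square-to-hexagonal resampling, using the linear interpolation mentioned above (function name and the boundary handling are illustrative assumptions):

```python
def to_hex_lattice(img):
    """Approximate a square-to-hexagonal lattice mapping by shifting
    every other row of a grayscale image (list of rows) half a pixel,
    using linear interpolation between neighbouring pixel centres."""
    out = []
    for y, row in enumerate(img):
        if y % 2 == 0:
            # even rows keep their original sample positions
            out.append(list(row))
        else:
            # odd rows: sample halfway between neighbouring centres;
            # the last pixel of the row is simply repeated
            shifted = [(row[x] + row[min(x + 1, len(row) - 1)]) / 2
                       for x in range(len(row))]
            out.append(shifted)
    return out
```

The inverse mapping of the edge map back to the square lattice can be done analogously, shifting the odd rows back by half a pixel.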
  • The hexagonal lattice 600 is advantageous because it offers higher resolution of detected edges compared to the square lattice 400.
  • The square lattice offers high resolution when detecting vertical and horizontal edges, i.e., in two directions, whereas the hexagonal lattice offers high resolution of detected edges in three directions, viz., the horizontal direction and two directions angled 45 degrees with respect to the horizontal direction.
  • the edge detector 180 Having mapped the hi-res image 1 10 to the hexagonal lattice, the edge detector 180 proceeds to detect edges in the hi-res image 130 on the hexagonal grid to produce an edge map 140 on the hexagonal grid.
  • the edge detector 180 may apply an edge detection algorithm as known in the art as further mentioned above.
  • the edge detector 180 may operate on the luminance channel of the YCbCr color space. Thus, the edge detector 180 may convert the image 130 to a YCbCr color space and perform an edge detection algorithm on the luminance channel of the YCbCr color space. This approach cuts out color information unnecessary for the edge detection step, making the edge detection algorithm work faster as it has less data to consider.
  • the edge detector 180 may then proceed to map the edge map 140 on the hexagonal lattice 600 back to the square lattice 400, resulting in an edge map on the square lattice 150. Again, the mapping of the edge map from the hexagonal lattice to the square lattice may be carried out using interpolation.
  • the edge map 150 generally indicates where in the first image 110 edges are present.
  • the edge map 150 may be represented as a binary sequence where each point coordinate is assigned either a '0' (i.e., an "off" value) if no edge is present or a '1' (i.e., an "on" value) if an edge is present. Due to how modern computers handle data, this representation is efficient - an 8x8 grid of points can for instance be represented in a single 64-bit word.
  • the transmitter 190 transmits the second image 120 and the edge map 150 to the image decompression device 200.
  • the data may be sent wirelessly, e.g., over a Wi-Fi network, such as an IEEE 802.11 network, a 4G network, a 5G network or any other network suitable for wireless transmission of data.
  • step S10 the second image 120 and the edge map 150 are received in the image decompression device 200 by the receiver 260.
  • the receiver 260 forwards the edge map 150 to the updating unit 280 and the second image 120 to the scaler 270.
  • step S12 the scaler 270 proceeds to upscale the second image 120 to produce a third image 210.
  • the third image 210 has a third resolution which is higher than the second resolution, preferably the same as the first resolution of the first image 110.
  • Upscaling in the context of this application is to be understood as the process of increasing the number of pixels used to represent an image, i.e., points of information are added to a lower resolution image to produce a higher resolution image.
  • there are many different ways of doing image upscaling. For example, this may be done by using interpolation.
  • the scaler 270 may consider neighboring points of information, i.e., values of neighboring pixels, in the second image 120 and calculate values to insert between them by interpolation.
  • the scaler 270 then forwards the third image 210 to the updating unit 280.
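By way of illustration, the interpolation-based upscaling performed by the scaler 270 may be sketched as follows for a single row of pixel values (a simplified Python sketch; the function name and the factor-of-two choice are illustrative, not taken from the disclosure):

```python
def upscale_row_2x(row):
    """Upscale one row of pixel values by a factor of two: between each
    pair of neighboring samples, insert their average (linear
    interpolation). The last sample is simply repeated to pad the row."""
    out = []
    for x in range(len(row) - 1):
        out.append(row[x])
        out.append((row[x] + row[x + 1]) / 2)
    out.append(row[-1])
    out.append(row[-1])
    return out
```

Applied to every row and then to every column, this produces a third image 210 with four times as many pixels as the second image 120.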
  • step S14 the third image 210 is updated by the updating unit 280 using the edge information in the edge map 150.
  • a fourth image 215 is produced.
  • the fourth image 215 may be seen as a restored, edge-enhanced version of the third image 210.
  • the updating may, for example, use the new edge-directed interpolation (NEDI) algorithm described at http://ieeexplore.ieee.org/document/951537 (2001) or http://chiranjivi.tripod.com/EDITut.html, or the edge-guided image interpolation algorithm.
  • DCCI (Directional Cubic Convolution Interpolation)
  • ICBI (Iterative Curvature-Based Interpolation)
  • the updating unit 280 may apply any of the algorithms referred to above when updating the third image 210 with edge information. However, instead of using edge information detected from the third image 210, i.e., the upscaled image, the updating unit 280 uses the edge map 150 obtained from the original hi-resolution image, i.e., the first image 110. This will in general lead to an improved image quality during the upscaling process.
  • Fig. 5 shows a simple example which illustrates the advantages of using an edge map when updating an image.
  • three pixels 510 of the first image 110 are illustrated: left 510a, middle 510b, and right 510c. These three pixels 510 contain the color information α, β and γ. The three pixels 510 contain an edge, as the middle pixel 510b has high contrast compared to the right pixel 510c.
  • the left path of Fig. 5 illustrates the processing carried out by a first example image compression device and image decompression device which are not embodiments of the invention.
  • the first example image compression device downscales the three pixels 510 to two pixels 520, containing the color information α and γ, in this case by simply deleting the middle pixel 510b.
  • the two pixels 520 are then sent to the first example image decompression device.
  • the right path of Fig. 5 instead illustrates the processing carried out by the image compression device 100 and image decompression device 200 which are an embodiment of the invention.
  • the image compression device 100 downscales the three pixels 510 as explained above to a lower number of pixels, in this case to two pixels 520, and produces an edge map 540.
  • the edge information is in the example represented as a binary array where '1' signifies that the pixel contains an edge and '0' that the pixel does not contain an edge.
  • the edge map 540 is typically far less costly to transmit than the entire color information for the pixels.
  • the image decompression device 200 then upscales the two pixels again to three pixels 530 as explained above, e.g. by using interpolation.
  • the image decompression device 200 may then, with the information from the edge map that the middle pixel 530b should contain an edge, proceed to update the middle pixel 530b.
  • the image decompression device 200 may substitute the value of the middle pixel 530b by α or γ, thereby preserving the contrast and the edge, as shown by the three restored pixels 550. It should be noted that even if the color information for the middle pixel 530b is chosen at random between α and γ, the pixel segment will contain an edge with at most a one-pixel offset as compared to the original high-resolution image.
  • the pixel information in the middle pixel 530b may be chosen in some other way, for example with weighted interpolation or any scheme which can make use of the edge information in some way when implemented in the image decompression device 200. As human eyes are biased towards picking up edges in images, this preserves important color information at low transmission cost.
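The edge-guided restoration of the middle pixel described above may be sketched as follows (illustrative Python; the disclosure leaves the exact choice between the neighboring values open, so picking the left neighbor here is an assumption made for the example):

```python
def restore_middle_pixel(left, right, edge_flags):
    """Given the two transmitted pixel values and the edge map for the
    original three-pixel segment, reconstruct the middle pixel.
    When the edge map flags the middle position, copy a neighbor so the
    contrast step survives; plain interpolation would smooth it away."""
    if edge_flags[1]:            # middle pixel contained an edge
        return left              # pick one side; at most a one-pixel offset
    return (left + right) / 2    # no edge: ordinary interpolation
```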
  • the updating unit 280 updates the whole third image 210 using the edge map 150. In other embodiments, only a selected area of the third image 210 is updated using the edge map 150.
  • the updating unit 280 may receive information from an eye tracking system which tracks the gaze direction of a user who is currently watching the output images of the image decompression device 200. From the gaze direction, the updating unit 280 may select an area of the third image 210, corresponding to an area currently being looked at by the user. The updating unit 280 may then proceed to only update the third image 210 using the edge map 150 in the selected area.
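Such a gaze-restricted update may be sketched as below (illustrative Python; the square gaze window, the `edge_fix` callback and the `radius` parameter are assumptions made for the example, not details of the disclosure):

```python
def update_with_gaze(image, edge_fix, gaze_xy, radius):
    """Apply a per-pixel edge correction `edge_fix(x, y, value)` only
    inside a square window around the gaze point; pixels in the user's
    peripheral vision are left as plainly upscaled."""
    gx, gy = gaze_xy
    out = [row[:] for row in image]   # do not modify the input image
    for y in range(len(image)):
        for x in range(len(image[0])):
            if abs(x - gx) <= radius and abs(y - gy) <= radius:
                out[y][x] = edge_fix(x, y, image[y][x])
    return out
```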
  • step S16 the fourth image 215 is sent to a renderer 220, which is adapted to draw the fourth image 215 for the user on one or more monitors.
  • Embodiments of the invention may be used to improve the efficiency of doing video streaming in a real-time VR gaming scenario.
  • the two underlying requirements are to preserve the game image quality as far as possible and to lower the gaming latency as much as possible.
  • embodiments described herein have two main advantages: 1) The edge map is derived from the original high-resolution image. This leads to a higher image quality during the upscaling process. 2) The edge-detection process and image upscaling process are performed on the image transmitting side and the image receiving side, respectively.
  • Fig. 6 shows a system comprising an image decompression device 200 and image compression device 100 as described above, where the image compression device 100 is implemented in a home computer 600.
  • the hi-res image 110 is generated and compressed on the home computer 600. It is then sent by way of a mobile telephone 610 over a 5G network to a VR/AR headset 620, which comprises hardware implementing the image decompression device 200.
  • the VR/AR headset 620 further comprises a camera (not pictured) configured to track the gaze of the user on the screens of the VR/AR headset 620.
  • the VR/AR headset 620 may of course comprise any suitable eye-tracking means to determine the gaze of the user. This can be done by the camera recording images of the eyes of the user.
  • An algorithm finds the pupils of the user, and calculates gaze information using their position in relation to each other and some calibration information of the camera. This calculation can be performed by a processor in the camera or a processor in the VR/AR headset 620.
  • the image decompression device 200 can use the gaze information gathered by the camera to lessen strain on the image decompression device 200 by only fully decompressing parts of the lo-res image 120 which are within the gaze of the user. For example, as explained above, only an area of the upscaled image 210 which is in the gaze direction of the user may be updated using the edge map 150. This can be determined by using pre-recorded gaze data pertaining to angles of typical viewing cones of humans at certain distances and such. Parts in the user's peripheral vision may for instance be decompressed not using the edge map, not fully upscaled or not upscaled at all depending on the application, while parts of the image being looked upon by the user can be fully decompressed using the information of the edge map 150.


Abstract

There are provided methods and apparatuses for image compression and image decompression. The image compression method comprises: receiving a first image having a first resolution, downscaling the first image to produce a second image having a second resolution being lower than the first resolution, detecting edges in the first image so as to produce an edge map which indicates where in the first image edges are present, and transmitting the second image and the edge map to an image decompression device. The decompression method comprises: receiving a second image having a second resolution and an edge map from an image compression device, upscaling the second image to produce a third image having a third resolution being greater than the second resolution, updating the third image based on the received edge map, and transmitting the updated third image to an image renderer.

Description

METHOD IN AN IMAGE COMPRESSION DEVICE AND IN AN IMAGE
DECOMPRESSION DEVICE
Field of invention
The invention relates to image compression and decompression.
Technical Background
In the field of image processing, images are often compressed and decompressed in order to be efficiently transmitted. Compression of an image involves finding redundant information in the image and removing it, which is performed by an image compression device (which may be implemented in an encoder), and thereafter somehow restoring the redundant information which is done by an image decompression device (which may be
implemented in a decoder). Image compression/decompression devices are for example available as software to be run on general-purpose computers, as embedded software on dedicated chipsets and as dedicated hardware. Compressing images is done in order to transmit them, store them or for other purposes. In general, it is desirable for an image compression/decompression device to be able to remove as much information as possible while
conserving the perceived quality of the image. As computers become better at generating and displaying high quality images, there always exists a need for more efficient devices for image compression or image decompression. In the modern age, fast transmission of high-quality images is already of critical importance for many applications, and is becoming even more so with the advent of virtual reality (VR) and augmented reality (AR) applications. Such applications provide users with a wide range of functionalities, and are becoming more and more common.
Virtual/Augmented reality in particular demands high levels of computational power to generate images in real time while having many applications where portability has beneficial effects such as construction, operation of vehicles or machinery, surgery or portable gaming. A problem with existing solutions to such implementations is that in order for the system to provide a high quality image in terms of resolution and color data, a powerful computer is needed. Such a computer is unlikely to be portable, and with processor-heavy calculations such as graphics, battery life is an issue on portable computers. A powerful portable computer is also expensive. One solution to this problem is generating images on a powerful stationary computer and then transmitting them to a portable VR/AR device which can use less advanced and more easily powered circuitry to show the transmitted images. A problem with this approach is that high quality images are demanding to send over a network. If the images are to appear as real-time images, the quality must currently be kept down in order to transmit them quickly from the stationary computer to the portable VR/AR device. Thus, there is a need for improvements.
Summary of invention
It is an object of the invention to improve on at least some of the above mentioned problems. This object has been achieved in a first aspect of the invention by a method in an image compression device, comprising:
receiving a first image having a first resolution, downscaling the first image to produce a second image having a second resolution being lower than the first resolution,
detecting edges in the first image so as to produce an edge map which indicates where in the first image edges are present, and transmitting the second image and the edge map to an image decompression device.
This has the advantage of providing very compact high frequency information, namely the edge map, which is cheap to send and can be readily calculated by, e.g., a graphics processing unit (GPU).
In a first aspect of the invention, the invention solves the problem of preserving high frequency information by providing an edge map along with the downscaled image data to the image decompression device, whereby the image decompression device can employ logic to re-create edges in the first image which would otherwise be smoothed by the image decompression device during upscaling. The inventor has realized that edge maps can be represented with very efficient data types and thus that high frequency information may be preserved in an efficient manner.
The first image may be represented on a square lattice, and the method may further comprise:
mapping the first image from the square lattice to a hexagonal lattice, detecting edges in the first image when mapped to the hexagonal lattice so as to produce an edge map on the hexagonal lattice, and mapping the edge map from the hexagonal lattice to the square lattice to produce an edge map on the square lattice before transmitting the edge map on the square lattice to the image decompression device.
Detecting edges on a hexagonal lattice provides the possibility of finer spatial resolution when representing edges passing through points, as shall be explained in the detailed description of the invention.
The step of transmitting the second image and the edge map to the image decompression device may be performed over an IEEE 802.11 network, a 4G or a 5G network. This allows for the use of existing long-range network infrastructure to send images. Mobile phones are ubiquitous and capable of simple processing tasks. Using the 4G or 5G networks already available to mobile phone users enables using stationary computers to produce processing-heavy image calculation tasks which can then be sent to a mobile phone which renders the image, e.g., for a connected VR or AR headset, the screen of the mobile phone or any other connected image displaying device. It also enables construction of dedicated image displaying devices which can access the 4G or 5G networks to communicate with a stationary computer at long distances. However, the bandwidth of such networks may not support transmission of image data in real time unless sufficiently compressed, while heavy compression without preserving high frequency information in the compressed image leads to suffering user experience. Thus, use of such a network is enabled by the invention.
The step of detecting edges may be performed by using a Canny-like or Gradient-based edge detector. Such edge detection schemes are advantageous because they provide reliable and computationally efficient edge detection.
The edge map may comprise a binary sequence, wherein a position in the binary sequence may be set to either an 'on' value indicating a presence of an edge in a coordinate of the first image or an 'off' value indicating absence of an edge in a coordinate of the first image. This allows for a compact representation of the edge map which is readily stored and/or transmitted. The invention further relates to an image compression device, comprising a receiver configured to receive a first image having a first resolution,
a scaler configured to downscale the first image to produce a second image having a second resolution being lower than the first resolution,
an edge detector configured to detect edges in the first image so as to produce an edge map which indicates where in the first image edges are present, and
a transmitter configured to transmit the second image and the edge map to an image decompression device.
Such an image compression device is advantageous for use in applications involving transmission of high quality images in real time, for the reasons expounded on above.
The invention also relates to a computer program product comprising a computer-readable medium having computer code instructions stored thereon adapted to carry out the image compression method of the first aspect when executed by a processor. It is advantageous to use a computer for tasks requiring fast calculation of results.
A second aspect of the invention pertains to a method in an image decompression device, comprising:
receiving a second image having a second resolution and an edge map from an image compression device,
upscaling the second image to produce a third image having a third resolution being greater than the second resolution,
updating the third image based on the received edge map, and transmitting the updated third image to an image renderer.
Using the edge map, which was cheaply transmitted to the image decompression device, to update the image before rendering enables restoring of high-frequency information in the received image, thereby providing accurate image information to the renderer.
The method may further comprise receiving a gaze direction of a user in relation to the third image from an eye tracking system, wherein the step of updating the third image further comprises selecting an area of the third image currently being looked at by the user based on the received gaze direction, and updating only the selected area of the third image based on the received edge map. Tracking the gaze of the user saves computer resources, as operations are not performed on parts of the image which will not be viewed directly by the user and thus are not as important.
The invention further relates to an image decompression device, comprising a receiver configured to receive a second image having a second resolution and an edge map from an image compression device, a scaler configured to upscale the second image to produce a third image having a third resolution being greater than the second resolution, an updating unit configured to update the third image based on the received edge map, and a transmitter configured to transmit the updated third image to an image renderer.
Such an image decompression device is advantageous for use in applications involving transmission of high quality images in real time, for the reasons expounded on above.
The invention further relates to a computer program product
comprising a computer-readable medium having computer code instructions stored thereon adapted to carry out the method according to the method described above when executed by a processor.
A third aspect of the invention relates to a system comprising an image compression device and an image decompression device as explained above, the image decompression device being arranged to communicate with the image compression device.
The system may further comprise an eye tracker configured to monitor eyes of a user and determine a gaze direction of the user in relation to an image, and to transmit the gaze direction to the image decompression device.
Preferred embodiments appear in the dependent claims and in the detailed description.
Brief description of the drawings
The invention will by way of example be described in more detail with reference to the appended schematic drawings, which show embodiments of the invention.
Fig. 1 shows a schematic diagram of an image decompression device and an image compression device according to embodiments.
Fig. 2 shows a schematic diagram of processing steps of a method in an image compression device according to embodiments.
Fig. 3 shows a schematic diagram of processing steps of a method in an image decompression device according to embodiments.
Fig. 4 shows a square lattice and a hexagonal lattice.
Fig. 5 illustrates how an edge map helps preserve high frequency information.
Fig. 6 shows a schematic diagram of a system comprising a computer and a VR/AR headset according to embodiments.
Fig. 7 is a flow chart of a method in an image compression device according to embodiments.
Fig. 8 is a flow chart of a method in an image decompression device according to embodiments.
Detailed description of preferred embodiments
The invention will now be described in more detail with reference to the figures. Fig. 1 shows a system comprising an image compression device 100, an image decompression device 200, and a renderer 220.
The image compression device 100 comprises a receiver 160, a scaler
170, an edge detector 180, and a transmitter 190. The image decompression device 200 comprises a receiver 260, a scaler 270, an updating unit 280, and a transmitter 290.
Generally, the image compression device 100 and image
decompression device 200 may each comprise circuitry which is configured to implement the components 160, 170, 180, 190, 260, 270, 280, 290 and, more specifically, their functionality.
In a hardware implementation, each of the components 160, 170, 180,
190, 260, 270, 280, 290 may correspond to circuitry which is dedicated and specifically designed to provide the functionality of the component. The circuitry may be in the form of one or more integrated circuits, such as one or more application specific integrated circuits. By way of example, the scaler 170 may thus comprise circuitry which, when in use, downscales an image having a first resolution to produce a second image having a second, lower resolution.
In a software implementation, the circuitry may instead be in the form of one or more processors, such as one or more microprocessors, digital signal processors, or field programmable gate arrays, which in association with computer code instructions stored on a (non-transitory) computer- readable medium, such as a non-volatile memory, causes the image compression device 100 and image decompression device 200 to carry out embodiments of any method disclosed herein. In that case, the components 160, 170, 180, 190, 260, 270, 280, 290 may thus each correspond to a portion of computer code instructions stored on the computer-readable medium, that, when executed by the processor, causes the image
compression device 100 or image decompression device 200 to carry out the functionality of the component.
It is to be understood that it is also possible to have a combination of a hardware and a software implementation, meaning that the functionality of some of the components 160, 170, 180, 190, 260, 270, 280, 290 are implemented in hardware and others in software.
In particular, the image compression device 100, the image
decompression device 200, the renderer 220 and all the elements thereof - the receiver 160, the edge detector 180, the scaler 170, and a transmitter 190 of the image compression device, the receiver 260, the scaler 270, the updating unit 280, and the transmitter 290 of the image decompression device 200, and the renderer 220 - may be implemented as compression software in a home computer or server, compression software embedded onto a dedicated chipset, a dedicated hardware component or any other suitable equipment for compressing and decompressing images, together or by themselves. The edge detector 180 of the image compression device 100 may in particular be implemented as a software supported by OpenGL or another suitable graphics library using a dedicated GPU, for example in a home computer. Such an edge detector 180 may be implemented using shader technology readily available through such libraries and hardware.
The operation of the image compression device 100 will now be described with reference to Fig. 1 , Fig. 2, Fig. 4 and the flowchart of Fig. 7.
In step S02, the receiver 160 receives a first image 110 having a first resolution. The image with the first resolution will henceforth be referred to as a "hi-res image" 110, and is symbolized in Fig. 2 by an image of a Hungarian lamp post.
The hi-res image 110 may upon receipt by the receiver 160 be represented on a square lattice, as denoted by the square symbol in the bottom right corner of the box symbolizing the hi-res image 110 in Fig. 2. By an image being represented on a square lattice is meant that the center points of the pixels of the image are arranged in a two-dimensional square lattice. This is further illustrated in the left part of Fig. 4, where a portion of an image being represented on a square lattice 400 is shown. The squares correspond to pixels of the image, and the dots correspond to center points of the pixels. In alternative embodiments, the hi-res image 110 may be represented on another lattice such as a hexagonal lattice. By an image being represented on a hexagonal lattice is meant that the center points of the pixels of the image are arranged in a two-dimensional hexagonal lattice. This is further illustrated in the right part of Fig. 4, where a portion of an image being represented on a hexagonal lattice 600 is shown. On the hexagonal lattice 600, the center points of the pixels on every other row of pixels are shifted by a distance of half a pixel in relation to the center points of the pixels on the other rows.
The receiver 160 forwards the hi-res image 110 to the edge detector
180 and the scaler 170.
In step S04, the scaler 170 downscales the first image 110 to produce a second image 120 having a second resolution which is lower than the first resolution. The second image 120 is henceforth referred to as a "lo-res image 120". "Downscaling" in the context of this application is to be understood as reducing the number of pixels used to represent an image. This can be done in several different ways - one is to simply remove a number of pixels from the image, e.g. every other pixel. It is also possible to downscale the first image 110 by using interpolation such as bi-linear or bi-cubic resampling based interpolation. A further option is to use averaging sampling, which is simple, fast and near-optimal. The averaging sampling considers the values of a group of points, for instance a 2x2 square, and averages the values to create one value. This produces an image of lower resolution, i.e., having fewer pixels. Further operations, such as adding a noise component to the downscaled image 120, may be performed within the scope of downscaling the image. The noise component is used to model various unknown elements in the further image processing chain, such as, for instance, thermal or electrical noise, or transmission errors. The second resolution may be any suitable resolution lower than the first resolution. For example, in one embodiment the first image 110 is downscaled by a factor of 0.5. In another embodiment, the first image 110 is downscaled by a factor of 0.25. The downscaling factor applied will affect the image quality when subsequently performing upscaling by the scaler 270 of the image decompression device 200.
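As an illustration of the averaging sampling mentioned above, the following Python sketch replaces each 2x2 block of a grayscale image by the mean of its four values (the function name and the pure-Python formulation are illustrative; an actual implementation would typically run on a GPU):

```python
def downscale_avg_2x2(image):
    """Averaging-sampling downscale: each 2x2 block of the input image
    (a 2-D list with even dimensions) is replaced by the mean of its
    four values, halving each dimension."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]
```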
In step S06, the edge detector 180 detects edges in the first image 110 so as to produce an edge map 150 which indicates where in the first image edges are present. For this purpose, the edge detector 180 may apply an edge detection algorithm as known in the art, such as a Canny edge detector algorithm, a gradient-based algorithm, or any other edge detection algorithm suitable for detecting edges in an image. Processing steps which may be carried out by the edge detector 180 are shown in the dashed box of Fig. 2.
The resulting edge map 150 is typically defined with respect to the same lattice as the first image 110. Thus, if the first image 110 is represented on a square lattice, the edge map 150 is defined with respect to the same square lattice, and if the first image 110 is represented on a hexagonal lattice, the edge map 150 is defined with respect to the same hexagonal lattice. This means that the edge detector 180 may operate to detect edges with respect to a square lattice if the first image 110 is represented on a square lattice, and to detect edges with respect to a hexagonal lattice if the first image 110 is represented on a hexagonal lattice. However, in some embodiments, where the hi-res image 110, upon receipt by the receiver 160, is represented on a square lattice, the edge detector 180 may first map the hi-res image 110 onto a hexagonal lattice 600 to produce a hi-res image 130 on a hexagonal lattice. This is the case in the example of Fig. 2, and the small hexagon on the bottom right corner of image 130 in Fig. 2 illustrates the fact that image 130 is represented on a hexagonal lattice. The use of the square and hexagon to denote use of the square lattice and the hexagonal lattice, respectively, is used throughout Fig. 2.
Fig. 4 further illustrates how to perform the mapping of the hi-res image 110 from the square lattice to the hexagonal lattice 600. The left part of Fig. 4 shows a portion of the hi-res image 110 when represented on a square lattice 400. The center points of the pixels of image 110 are thus arranged in a two-dimensional square lattice. The right part of Fig. 4 shows a portion of the hi-res image 130 when represented on a hexagonal lattice 600. The center points of the pixels of the image 130 are thus arranged in a two-dimensional hexagonal lattice 600. The center points of the pixels on every other row of the hexagonal lattice 600 are thus shifted by the distance of half a pixel compared to the square lattice 400. In order to map the hi-res image 110 from the square lattice to the hexagonal lattice, interpolation of image data, such as linear interpolation or spline interpolation, may be used.
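The half-pixel row shift with linear interpolation may be sketched as follows for a grayscale image (illustrative Python; the choice to shift the odd rows and to pad the row end by repeating the last sample is an assumption made for the example, not a detail of the disclosure):

```python
def square_to_hex(rows):
    """Map a square-lattice grayscale image (a list of rows) onto a
    hexagonal lattice: every other row is shifted by half a pixel, its
    new samples taken by linear interpolation between horizontal
    neighbors on the square lattice."""
    hex_rows = []
    for y, row in enumerate(rows):
        if y % 2 == 0:
            hex_rows.append(list(row))  # even rows keep their samples
        else:
            # odd rows: sample halfway between adjacent square-lattice pixels
            shifted = [(row[x] + row[x + 1]) / 2 for x in range(len(row) - 1)]
            shifted.append(row[-1])     # pad the row end
            hex_rows.append(shifted)
    return hex_rows
```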
The hexagonal lattice 600 is advantageous because it offers higher resolution of detected edges compared to the square lattice 400. In essence, the square lattice offers high resolution of detecting vertical and horizontal edges, i.e., in two directions, whereas the hexagonal lattice offers high resolution of detected edges in three directions, viz., the horizontal direction, and two directions being angled 45 degrees with respect to the horizontal direction.
Having mapped the hi-res image 110 to the hexagonal lattice, the edge detector 180 proceeds to detect edges in the hi-res image 130 on the hexagonal grid to produce an edge map 140 on the hexagonal grid. For this purpose, the edge detector 180 may apply an edge detection algorithm as known in the art, as mentioned above.
The edge detector 180 may operate on the luminance channel of the YCbCr color space. Thus, the edge detector 180 may convert the image 130 to the YCbCr color space and perform an edge detection algorithm on the luminance channel only. This approach discards color information that is unnecessary for the edge detection step, making the edge detection algorithm faster since it has less data to consider.
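A hedged sketch of this step: converting RGB to luma and running a simple gradient-based detector on the luminance channel only. The BT.601 luma weights are standard, but the central-difference gradient and the threshold of 64.0 are illustrative choices, not taken from the source; a Canny-like detector could be substituted.

```python
import numpy as np

def luminance(rgb):
    """BT.601 luma from an RGB image of shape (H, W, 3), values in [0, 255]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def gradient_edge_map(y, threshold=64.0):
    """Toy gradient-based edge detector on a luminance channel.

    Computes central-difference gradients in x and y and thresholds
    the gradient magnitude to obtain a boolean edge map.
    """
    gx = np.zeros_like(y)
    gy = np.zeros_like(y)
    gx[:, 1:-1] = y[:, 2:] - y[:, :-2]   # horizontal central differences
    gy[1:-1, :] = y[2:, :] - y[:-2, :]   # vertical central differences
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold          # boolean edge map
```

Operating on a single channel rather than three is exactly the data-reduction benefit the text describes.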
The edge detector 180 may then proceed to map the edge map 140 on the hexagonal lattice 600 back to the square lattice 400, resulting in an edge map on the square lattice 150. Again, the mapping of the edge map from the hexagonal lattice to the square lattice may be carried out using interpolation.
The edge map 150 generally indicates where in the first image 110 edges are present. The edge map 150 may be represented as a binary sequence where each point coordinate is assigned either a '0' (i.e., an "off" value) if no edge is present or a '1' (i.e., an "on" value) if an edge is present. Due to how modern computers handle data, this representation is efficient - an 8x8 grid of points can for instance be represented in a single 64-bit word.
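The 64-bit-word observation can be illustrated directly. This pure-Python sketch packs an 8x8 block of edge flags into one integer and back; the function names are illustrative, not from the source.

```python
def pack_edge_block(block):
    """Pack an 8x8 edge-map block (rows of 0/1 flags) into one 64-bit integer."""
    word = 0
    for i, bit in enumerate(b for row in block for b in row):
        word |= (bit & 1) << i   # bit i holds the flag for pixel i in row-major order
    return word

def unpack_edge_block(word):
    """Inverse of pack_edge_block: 64-bit integer back to 8x8 rows of 0/1."""
    bits = [(word >> i) & 1 for i in range(64)]
    return [bits[r * 8:(r + 1) * 8] for r in range(8)]
```

One word per 64 pixels is what makes the edge map far cheaper to transmit than per-pixel color data.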
In step S08, the transmitter 190 transmits the second image 120 and the edge map 150 to the image decompression device 200. The data may be sent wirelessly, e.g., over a Wi-Fi network such as an IEEE 802.11 network, over a 4G or 5G network, or over any other network suitable for wireless transmission of data.
The operation of the image decompression device 200 will now be described with reference to Fig. 1 , Fig. 3 and the flowchart of Fig. 8.
In step S10, the second image 120 and the edge map 150 are received in the image decompression device 200 by the receiver 260. The receiver 260 forwards the edge map 150 to the updating unit 280 and the second image 120 to the scaler 270.
In step S12, the scaler 270 proceeds to upscale the second image 120 to produce a third image 210. The third image 210 has a third resolution which is higher than the second resolution, preferably the same as the first resolution of the first image 110. "Upscaling" in the context of this application is to be understood as the process of increasing the number of pixels used to represent an image, i.e., points of information are added to a lower-resolution image to produce a higher-resolution image. As with downscaling, there are many different ways of performing image upscaling, for example by using interpolation: the scaler 270 may consider neighboring points of information, i.e., values of neighboring pixels, in the second image 120 and calculate values to insert between them by interpolating the neighboring points of information. Many such algorithms are known to a person skilled in the art and may be used within the scope of the invention. Also, the particular upscaling algorithm to use may be selected based on the complexity of the image contents.
The scaler 270 then forwards the third image 210 to the updating unit 280.
In step S14, the third image 210 is updated by the updating unit 280 using the edge information in the edge map 150. In this way a fourth image 215 is produced. Notably, the fourth image 215 may be seen as a reconstruction of the first image 110.
There exist algorithms for using edge information in connection with image upscaling, such as the new edge-directed interpolation (NEDI) algorithm described at http://ieeexplore.ieee.org/document/951537 (2001) or http://chiranjivi.tripod.com/EDITut.html, or the edge-guided image interpolation algorithm ("An Edge-Guided Image Interpolation (EGGI) Algorithm via Directional Filtering and Data Fusion", http://ieeexplore.ieee.org/document/1658087). The underlying idea of such algorithms is to conduct statistical sampling by re-weighting the neighboring pixels when interpolating a particular pixel. The re-weighting preserves edges in the image after scaling, and thus alleviates staircase artifacts. Such algorithms are also referred to as edge-oriented or edge-based interpolation. Other similar algorithms are known as Directional Cubic Convolution Interpolation (DCCI) and Iterative Curvature-Based Interpolation (ICBI) (http://www.ijarcce.com/upload/2013/december/IJARCCE4D-s-sreedhar_reddy_enlargement_of.pdf). In these algorithms, the edge information is detected from the upscaled image.
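The re-weighting idea common to these algorithms can be reduced to a toy sketch: neighbours lying across a detected edge get zero weight, so the interpolation does not blur over the edge. This is a heavy simplification of NEDI-style schemes; the function name and the binary weighting rule are illustrative only.

```python
def edge_directed_pixel(neighbors, edge_flags):
    """Re-weight neighbouring pixel values when interpolating one output pixel.

    neighbors:  values of the candidate neighbouring pixels.
    edge_flags: True where a detected edge separates that neighbour
                from the pixel being interpolated.
    Neighbours across an edge get zero weight; the rest are averaged.
    """
    weights = [0.0 if across_edge else 1.0 for across_edge in edge_flags]
    total = sum(weights)
    if total == 0.0:                     # every neighbour lies across an edge
        return neighbors[0]              # fall back to the nearest neighbour
    return sum(w * v for w, v in zip(weights, neighbors)) / total
```

Real implementations estimate continuous weights from local covariance or directional gradients rather than a hard 0/1 mask.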
The updating unit 280 may apply any of the algorithms referred to above when updating the third image 210 with edge information. However, instead of using edge information detected from the third image 210, i.e., the upscaled image, the updating unit 280 uses the edge map 150 obtained from the original high-resolution image, i.e., the first image 110. This will in general lead to improved image quality during the upscaling process.
Fig. 5 shows a simple example which illustrates the advantages of using an edge map when updating an image. At the top of Fig. 5, three pixels 510 (left 510a, middle 510b, and right 510c) of the first image 110 are illustrated. These three pixels 510 contain the color information α, β and ω. The three pixels 510 contain an edge, as the middle pixel 510b has high contrast compared to the right pixel 510c.
The left path of Fig. 5 illustrates the processing carried out by a first example image compression device and image decompression device which are not embodiments of the invention. The first example image compression device downscales the three pixels 510 to two pixels 520, containing the color information α and β, in this case by simply deleting the middle pixel 510b. The two pixels 520 are then sent to the first example image decompression device. The first example image decompression device, on the left of Fig. 5, attempts to represent the color information of the middle pixel 510b through interpolation, for example as the mean value ώ = (α+β)/2. This results in three restored pixels 530. Fig. 5 shows that the edge is lost in the interpolation - to retain that edge, it would be necessary to transmit the color information associated with the middle pixel 510b, which may be costly for an image with many edges.
The right path of Fig. 5 instead illustrates the processing carried out by the image compression device 100 and image decompression device 200 which are an embodiment of the invention. The image compression device 100 downscales the three pixels 510 as explained above to a lower number of pixels, in this case to two pixels 520, and produces an edge map 540. The edge information is in the example represented as a binary array where '1' signifies that the pixel contains an edge and '0' that the pixel does not contain an edge. The edge map 540 is typically far less costly to transmit than the entire color information for the pixels.
The image decompression device 200 then upscales the two pixels again to three pixels 530 as explained above, e.g. by using interpolation. The image decompression device 200 may then, with the information from the edge map that the middle pixel 530b should contain an edge, proceed to update the middle pixel 530b. For example, the image decompression device 200 may substitute the value of the middle pixel 530b by α or β, thereby preserving the contrast and the edge, as shown by the three restored pixels 550. It should be noted that even if the color information for the middle pixel 530b is chosen at random between α and β, the pixel segment will contain an edge with at most a one-pixel offset as compared to the original high-resolution image. Of course, the pixel information in the middle pixel 530b may be chosen in some other way, for example with weighted interpolation or any scheme which can make use of the edge information when implemented in the image decompression device 200. As human eyes are biased towards picking up edges in images, this preserves important color information at low transmission cost.
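The Fig. 5 example can be written out as a toy 1-D reconstruction. This is a sketch: the choice of α over β for the edge pixel is arbitrary, as the text notes, and the function name is illustrative.

```python
def restore_with_edge_map(lo_pixels, edge_map):
    """Toy 1-D reconstruction following the Fig. 5 example.

    lo_pixels: the two transmitted values (alpha, beta).
    edge_map:  per-output-pixel edge flags, e.g. [0, 1, 0].
    Plain interpolation would give (alpha + beta) / 2 for the middle
    pixel; an edge flag tells the decoder to snap it to a neighbour
    instead, preserving the contrast across the edge.
    """
    alpha, beta = lo_pixels
    middle = (alpha + beta) / 2.0        # default: mean-value interpolation
    if edge_map[1]:                      # edge present at the middle pixel
        middle = alpha                   # choice of alpha vs beta is free
    return [alpha, middle, beta]
```

With the edge flag set, the restored segment keeps a hard transition instead of the washed-out mean value ώ.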
In some embodiments, the updating unit 280 updates the whole third image 210 using the edge map 150. In other embodiments, only a selected area of the third image 210 is updated using the edge map 150. In more detail, the updating unit 280 may receive information from an eye tracking system which tracks the gaze direction of a user who is currently watching the output images of the image decompression device 200. From the gaze direction, the updating unit 280 may select an area of the third image 210 corresponding to the area currently being looked at by the user. The updating unit 280 may then proceed to update the third image 210 using the edge map 150 only within the selected area.
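A sketch of the gaze-limited update, assuming the edge-guided update step can be applied to an image patch in isolation; `edge_update_fn`, the rectangular region shape, and the `radius` parameter are illustrative assumptions, not from the source.

```python
import numpy as np

def update_gaze_region(upscaled, edge_update_fn, gaze_xy, radius=64):
    """Apply the edge-guided update only around the user's gaze point.

    upscaled:       the third image (2-D array).
    edge_update_fn: callable taking and returning an image patch
                    (stands in for the edge-map update of step S14).
    gaze_xy:        (x, y) pixel coordinates of the gaze point.
    Pixels outside the gaze region are left as plainly upscaled.
    """
    h, w = upscaled.shape[:2]
    x, y = gaze_xy
    r0, r1 = max(0, y - radius), min(h, y + radius)
    c0, c1 = max(0, x - radius), min(w, x + radius)
    out = upscaled.copy()
    out[r0:r1, c0:c1] = edge_update_fn(upscaled[r0:r1, c0:c1])
    return out
```

Restricting the expensive update to the foveated region is what reduces the decompression load described for the VR headset in Fig. 6.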
In step S16, the fourth image 215 is sent to a renderer 220, which is adapted to draw the fourth image 215 for the user on one or more monitors.
Embodiments of the invention may be used to improve the efficiency of video streaming in a real-time VR gaming scenario. In such a scenario, the two underlying requirements are to preserve the game image quality as far as possible and to lower the gaming latency as far as possible. Compared to traditional edge-directed interpolation algorithms, where the edge map is derived from the upscaled image at the receiving side, embodiments described herein have two main advantages: 1) The edge map is derived from the original high-resolution image. This leads to a higher image quality during the upscaling process. 2) The edge-detection process and the image upscaling process are performed on the image transmitting side and the image receiving side, respectively. This serves to reduce latency since, instead of performing edge-directed image upscaling in a single node, i.e., on the receiving side, a decentralized network topology is used where the sub-operations of deriving the edge map and applying the edge map during scaling are performed at different nodes in the network. In this way, different dedicated nodes perform different sub-operations of the image processing, which opens up the possibility of parallel processing.
Fig. 6 shows a system comprising an image decompression device 200 and an image compression device 100 as described above, where the image compression device 100 is implemented in a home computer 600. The hi-res image 110 is generated and compressed on the home computer 600. It is then sent by way of a mobile telephone 610 over a 5G network to a VR/AR headset 620, which comprises hardware implementing the image decompression device 200. The VR/AR headset 620 further comprises a camera (not pictured) configured to track the gaze of the user on the screens of the VR/AR headset 620. The VR/AR headset 620 may of course comprise any suitable eye-tracking means to determine the gaze of the user. This can be done by the camera recording images of the eyes of the user. An algorithm finds the pupils of the user and calculates gaze information using their position in relation to each other and some calibration information of the camera. This calculation can be performed by a processor in the camera or a processor in the VR/AR headset 620. The image decompression device 200 can use the gaze information gathered by the camera to lessen the strain on the image decompression device 200 by only fully decompressing the parts of the lo-res image 120 which are within the gaze of the user. For example, as explained above, only an area of the upscaled image 210 which is in the gaze direction of the user may be updated using the edge map 150. This area can be determined using pre-recorded gaze data pertaining to the angles of typical human viewing cones at certain distances. Parts in the user's peripheral vision may for instance be decompressed without using the edge map, not fully upscaled, or not upscaled at all depending on the application, while parts of the image being looked at by the user can be fully decompressed using the information of the edge map 150.

Claims

1. A method in an image decompression device, comprising:
receiving, from an image compression device which has received a first image having a first resolution, a second image having a second resolution and an edge map,
upscaling the second image to produce a third image having a third resolution being greater than the second resolution,
updating the third image based on the received edge map, and transmitting the updated third image to an image renderer,
wherein the step of updating the third image further comprises
receiving a gaze direction of a user in relation to the third image from an eye tracking system,
selecting an area of the third image currently being looked at by the user based on the received gaze direction, and
updating only the selected area of the third image based on the received edge map.
2. A method according to claim 1, comprising:
in a compression device,
receiving a first image having a first resolution,
downscaling the first image to produce a second image having a second resolution being lower than the first resolution,
detecting edges in the first image so as to produce an edge map which indicates where in the first image edges are present, and
transmitting the second image and the edge map to an image decompression device.
3. The method according to claim 2, wherein the first image is represented on a square lattice, and the method further comprises:
mapping the first image from the square lattice to a hexagonal lattice, wherein the step of detecting edges in the first image is performed on the first image when mapped to the hexagonal lattice so as to produce an edge map on the hexagonal lattice, and
mapping the edge map from the hexagonal lattice to the square lattice to produce an edge map on the square lattice before transmitting the edge map on the square lattice to the image decompression device.
4. The method according to any one of claims 2-3, wherein the step of transmitting the second image and the edge map to the image
decompression device is performed over an IEEE 802.11 network, a 4G network or a 5G network.
5. The method according to any one of claims 2-4, wherein the step of detecting edges in the first image further comprises converting the color space of the image to a YCbCr color space before detecting edges in the first image.
6. The method according to claim 5, wherein the step of detecting edges is only performed on the luminance channel of the YCbCr color space.
7. The method according to claim 6, wherein the step of detecting edges is performed by using a Canny-like or Gradient-based edge detector.
8. The method according to any one of the previous claims, wherein the edge map comprises a binary sequence, wherein a position in the binary sequence may be set to either an 'on' value indicating the presence of an edge at a coordinate of the first image or an 'off' value indicating the absence of an edge at a coordinate of the first image.
9. A computer program product comprising a computer-readable medium having computer code instructions stored thereon adapted to carry out the method according to any one of claims 1-8 when executed by a processor.
10. An image compression device, comprising:
a receiver configured to receive a first image having a first resolution, a scaler configured to downscale the first image to produce a second image having a second resolution being lower than the first resolution,
an edge detector configured to detect edges in the first image so as to produce an edge map which indicates where in the first image edges are present, and
a transmitter configured to transmit the second image and the edge map to an image decompression device.
11. An image decompression device, comprising:
a receiver configured to receive a second image having a second resolution and an edge map from an image compression device,
a scaler configured to upscale the second image to produce a third image having a third resolution being greater than the second resolution, an updating unit configured to update the third image based on the received edge map, and
a transmitter configured to transmit the updated third image to an image renderer.
12. A system comprising:
an image compression device according to claim 10,
an image decompression device according to claim 11, the image decompression device being arranged to communicate with the image compression device, and
an eye tracker configured to monitor eyes of a user and determine a gaze direction of the user in relation to an image, and to transmit the gaze direction to the image decompression device.
PCT/SE2018/050979 2017-09-26 2018-09-26 Method in an image compression device and in an image decompression device WO2019066704A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE1751194-0 2017-09-26
SE1751194 2017-09-26

Publications (1)

Publication Number Publication Date
WO2019066704A1 true WO2019066704A1 (en) 2019-04-04

Family

ID=65903265

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2018/050979 WO2019066704A1 (en) 2017-09-26 2018-09-26 Method in an image compression device and in an image decompression device

Country Status (1)

Country Link
WO (1) WO2019066704A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10832447B2 (en) * 2018-10-19 2020-11-10 Samsung Electronics Co., Ltd. Artificial intelligence encoding and artificial intelligence decoding methods and apparatuses using deep neural network

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5103306A (en) * 1990-03-28 1992-04-07 Transitions Research Corporation Digital image compression employing a resolution gradient
US5703965A (en) * 1992-06-05 1997-12-30 The Regents Of The University Of California Image compression/decompression based on mathematical transform, reduction/expansion, and image sharpening
US20040207632A1 (en) * 2001-10-04 2004-10-21 Miller Michael E Method and system for displaying an image
US6898319B1 (en) * 1998-09-11 2005-05-24 Intel Corporation Method and system for video frame enhancement using edge detection
US20050249417A1 (en) * 2004-05-06 2005-11-10 Dong-Seob Song Edge detecting method
US20100056274A1 (en) * 2008-08-28 2010-03-04 Nokia Corporation Visual cognition aware display and visual data transmission architecture
US20120319928A1 (en) * 2011-06-20 2012-12-20 Google Inc. Systems and Methods for Adaptive Transmission of Data
EP2741234A2 (en) * 2012-12-07 2014-06-11 Analog Devices, Inc. Object localization using vertical symmetry
KR101713492B1 (en) * 2016-06-27 2017-03-07 가천대학교 산학협력단 Method for image decoding, method for image encoding, apparatus for image decoding, apparatus for image encoding
US20170255257A1 (en) * 2016-03-04 2017-09-07 Rockwell Collins, Inc. Systems and methods for delivering imagery to head-worn display systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIANGJIAN HE ET AL.: "Canny edge detection on a virtual hexagonal image structure", 2009 JOINT CONFERENCES ON PERVASIVE COMPUTING (JCPC), 3 December 2009 (2009-12-03), Piscataway, NJ, USA, pages 167 - 172, XP031641634 *


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18861330

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18861330

Country of ref document: EP

Kind code of ref document: A1