WO2000072602A1 - Multi-dimensional data compression - Google Patents

Info

Publication number
WO2000072602A1
Authority
WO
WIPO (PCT)
Prior art keywords
output
produce
compressed
data
transform
Prior art date
Application number
PCT/CA2000/000132
Other languages
French (fr)
Inventor
Joe Toth
James Schellenberg
David Graves
Original Assignee
Edge Networks Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CA 2272590 external-priority patent/CA2272590A1/en
Priority claimed from CA 2277373 external-priority patent/CA2277373A1/en
Application filed by Edge Networks Corporation filed Critical Edge Networks Corporation
Priority to AU26529/00A priority Critical patent/AU2652900A/en
Publication of WO2000072602A1 publication Critical patent/WO2000072602A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/222Secondary servers, e.g. proxy server, cable television Head-end
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/61Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
    • H04L65/612Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/75Media network packet handling
    • H04L65/756Media network packet handling adapting media to device capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/75Media network packet handling
    • H04L65/765Media network packet handling intermediate
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80Responding to QoS
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04Protocols for data compression, e.g. ROHC
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/36Scalability techniques involving formatting the layers as a function of picture distortion after decoding, e.g. signal-to-noise [SNR] scalability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/62Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding by frequency transforming in three dimensions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/63Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234327Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25808Management of client data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/4508Management of client data or end-user data
    • H04N21/4516Management of client data or end-user data involving client characteristics, e.g. Set-Top-Box type, software version or amount of memory available
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/454Content or additional data filtering, e.g. blocking advertisements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/462Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end, controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
    • H04N21/4621Controlling the complexity of the content stream or additional data, e.g. lowering the resolution or bit-rate of the video stream for a mobile client with a small screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/61Network physical structure; Signal processing
    • H04N21/6106Network physical structure; Signal processing specially adapted to the downstream path of the transmission network
    • H04N21/6125Network physical structure; Signal processing specially adapted to the downstream path of the transmission network involving transmission via Internet
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/637Control signals issued by the client directed to the server or network components
    • H04N21/6377Control signals issued by the client directed to the server or network components directed to server
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/647Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
    • H04N21/64784Data processing by the network
    • H04N21/64792Controlling the complexity of the content stream, e.g. by dropping packets
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server

Abstract

A method of compressing a data signal, the method comprising the steps of selecting a sequence of image frames, the sequence being part of a video stream, applying a three dimensional transform to the selected sequence to produce a first transformed output, and encoding the transformed output to produce a compressed stream output.

Description

MULTI-DIMENSIONAL DATA COMPRESSION
This invention relates to the field of data compression and more particularly, to a method and system for efficient compression of digital video data.
BACKGROUND OF THE INVENTION
One of the most significant trends affecting the efficiency of the Internet today is the movement of full-motion video and audio data across the Internet. As web sites continue to increase their multimedia content through the integration of audio, video and data, the architecture of the web makes effective delivery of this media to Internet end users a congestion problem. The significant increase in multimedia incorporated in web pages is due in part to developments in hardware and software that have allowed web page designers to efficiently create, design, access and utilize multimedia applications. These developments in multimedia content place significant demands on the network access functions of the Internet. There is ongoing development in improving network access functions, such as providing high speed links not only throughout the Internet backbone but down to the local access to the user.
One way to reduce network traffic is to decrease the size of the data transferred across the network. This is achieved in many ways, one of which is the use of data compression and manipulation. Traditionally, image compression methods may be classified as those which reproduce the original data exactly, that is, "lossless compression", and those which trade a tolerable divergence from the original data for greater compression, that is, "lossy compression". Typically, lossless methods are unable to achieve a compression of much more than 70%. Therefore, where higher compression ratios are needed, lossy techniques have been developed. In general, the amount by which the original media source is reduced is referred to as the compression ratio.
Compression technologies have evolved over time to adapt to the various user requirements. Historically, compression technology focused on telephony, where sound wave compression algorithms were developed and optimized. These algorithms all implemented a one-dimensional (1D) transformation, which increased the 1D entropy of the data in the transformed domain to allow for efficient quantization and 1D data coding.
Compression technologies then focused on two-dimensional (2D) data such as images or pictures. At first, the 1D audio algorithms were applied to the line data of each image to build up a compressed image. Research then progressed to the point today where the 1D algorithms have been extended to implement a two-dimensional (2D) transformation, which increases the 2D entropy to allow for efficient quantization and 2D data coding.
Currently, state of the art technology requires compression of moving pictures or video. In this area, research is focused on applying 2D image coding algorithms to the multitude of images (frames) which comprise video, and on applying motion compensation techniques to take advantage of correlation between frame data. For example, United States Patent No. RE 36015, re-issued December 29, 1998, describes a video compression system which is based on the image data compression system developed by the Motion Picture Experts Group (MPEG), which uses various groups of field configurations to reduce the number of binary bits used to represent a frame composed of odd and even fields of video information.
In general, MPEG systems integrate a number of well known data compression techniques into a single system. These include motion compensated predictive coding, discrete cosine transformation (DCT), adaptive quantization and variable length coding (VLC). The motion compensated predictive coding scheme processes the video data in groups of frames in order to achieve relatively high levels of compression without allowing the performance of the system to be degraded by excessive error propagation. In these group-of-frames processing schemes, image frames are classified into one of three types: the intraframe (I-Frame), the predicted frame (P-Frame) and the bidirectional frame (B-Frame). A 2D DCT is applied to small regions such as blocks of 8 x 8 pixels to encode each of the I-Frames. The resulting data stream is quantized and encoded using a variable length code, such as an amplitude run length Huffman code, to produce the compressed output signal. As may be seen, this quantization technique still focuses on compressing single frames or images, which may not be the most effective means of compression for current multimedia requirements. Also, for low bit rate applications, MPEG suffers from 8 x 8 blocking artifacts known as tiling. Furthermore, second-generation compression approaches have reduced the data requirements for video by as much as 100:1. Typically, these technologies are focused on the following approaches: wavelet algorithms and vector quantization.
The wavelet algorithms are implemented with efficient significance map coding such as EZW and line detection with gradient vectors, depending on the application's final reconstructed resolution. The wavelet algorithms operate on the entire image and have efficient implementations due to finite impulse response (FIR) filter realizations. All wavelet algorithms decompose an image into coarser, smooth approximations with low pass digital filtering (convolution) on the image. In addition, the wavelet algorithms generate detailed approximations (error signals) with high pass digital filtering or convolution on the image. This decomposition process can be continued as far down the pyramid as a designer requires, where each step in the pyramid has a sample rate reduction of two. This technique is also known as spatial sample rate decimation or down sampling of the image, where the resolution is one half in the next sub-band of the pyramid, as shown schematically in figures 1 and 2.
In vector quantization (VQ), algorithms are used with efficient codebooks. The VQ algorithm codebooks are based on macroblocks (8 x 8 or 16 x 16) to compress image data. These algorithms also have efficient implementations. However, they suffer from blocking artifacts (tiling) at low bit rates (high compression ratios). The codebooks have a few codes to represent a multitude of bit patterns, where fewer bits are allocated to the bit patterns in a macroblock with the highest probability. The VQ technique is shown schematically in figure 3.
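The codebook lookup at the heart of vector quantization can be illustrated with a minimal sketch. The 2 x 2 blocks, the three-entry codebook and the squared-error distance below are assumptions chosen for brevity; they are not taken from the specification, which refers to 8 x 8 or 16 x 16 macroblocks.

```python
# Minimal vector-quantization sketch: each block is replaced by the index of
# its nearest codeword. Block size and codebook are hypothetical.

def squared_distance(a, b):
    """Sum of squared pixel differences between two flattened blocks."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def vq_encode(blocks, codebook):
    """Return, for each block, the index of the closest codeword."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(block, codebook[k]))
        for block in blocks
    ]

def vq_decode(indices, codebook):
    """Rebuild an approximation of the blocks from codeword indices."""
    return [list(codebook[k]) for k in indices]

# Hypothetical 2 x 2 blocks flattened to four pixel values.
codebook = [
    (0, 0, 0, 0),          # flat dark block
    (255, 255, 255, 255),  # flat bright block
    (0, 255, 0, 255),      # vertical edge
]
blocks = [(3, 1, 0, 2), (250, 251, 255, 249), (10, 240, 5, 250)]

indices = vq_encode(blocks, codebook)
reconstructed = vq_decode(indices, codebook)
print(indices)
```

Only the indices need to be transmitted, which is where the compression comes from; the reconstruction error is the distance between each block and its chosen codeword.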
As discussed earlier, these current techniques are limited when applied to third generation compression requirements, that is, compression ratios approaching 1000:1. That is, wavelet and vector quantization techniques as discussed above still focus on compressing single frames or images which may not be the most effective for third generation compression requirements.
SUMMARY OF THE INVENTION
In accordance with this invention there is provided a method of compressing a data signal, the method comprising the steps of:
(a) selecting a sequence of image frames, the sequence comprising part of a video stream;
(b) applying a three dimensional transform to the selected sequence to produce a first transformed output; and
(c) encoding the transformed output to produce a compressed stream output.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other features of the preferred embodiments of the invention will become more apparent in the following detailed description in which reference is made to the appended drawings wherein:
Figure 1 is a schematic diagram of a multi resolution wavelet compressor;
Figure 2 is a schematic diagram of a one-stage wavelet decoder;
Figure 3 is a schematic diagram showing a single frame vector quantization technique;
Figure 4(a) and (b) is a schematic diagram of a video frame sequence for use in the present invention;
Figure 4(c) is a schematic representation of a transformed sequence;
Figure 5 is a schematic diagram of a 3D wavelet dyadic sub-cube structure in accordance with the present invention;
Figure 6 is a graph showing compression ratio versus frame depth for different media types;
Figure 7 is a flow chart showing the operation of a 3D compression system; and
Figure 8 is a flow chart showing the operation of a 3D decompression system.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
In the following description, like numerals refer to like structures in the drawings.
Referring to figure 4(a), a schematic diagram of a sequence of digitized video frames is shown generally by numeral 40. The sequence comprises N frames 42, each temporally sampled at an interval Δt. Each frame is made up of a two dimensional matrix of pixels. In order to compress this video frame sequence, a three dimensional transform is applied to the three dimensional matrix of pixels defined by the sequence of frames in 3D space (x,y,t) to yield a 3D cubic structure in the transformed domain. For the case of a 3D Fourier or cosine transform, the center of the 3D structure shall be DC (for the case of spatial to spectral transformations). As one leaves the center of the cubic structure the density will decrease, since image data information in the 3D structure dictates the spectral distribution. This is shown graphically in figures 4(b) and 4(c). Because there is a high correlation over the space defined by the (x,y,t) dimensions, there will be very high entropy in the transformed domain, which will provide for compression ratios that can approach 1000:1. The 3D algorithm may use (Δx, Δy, Δt) spatial/frame data pixel values, where (Δx, Δy, Δt) are constant and the total number of frames (NΔt) used in the transformation, i.e., the frame depth, is variable, depending on the scene data and the media type. In scene data the probability is high that adjacent pixels in a frame are the same. This also applies to neighbouring pixels in adjacent frames.
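One possible way to choose a variable frame depth is sketched below. The specification states only that the depth depends on the scene data and the media type; the mean-absolute-difference test, the `threshold` value and the `max_depth` cap are assumptions introduced for illustration.

```python
# Sketch: group consecutive frames into (x, y, t) cubes, starting a new cube
# whenever a scene change is detected. The detection rule is hypothetical.

def mean_abs_diff(frame_a, frame_b):
    """Average absolute pixel difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count

def split_into_cubes(frames, threshold=30.0, max_depth=16):
    """Return a list of frame groups; each group is one cube of variable depth."""
    cubes = []
    current = []
    for frame in frames:
        scene_change = bool(current) and mean_abs_diff(current[-1], frame) > threshold
        if current and (scene_change or len(current) >= max_depth):
            cubes.append(current)
            current = []
        current.append(frame)
    if current:
        cubes.append(current)
    return cubes

# Tiny 2 x 2 frames: three dark frames followed by two bright frames.
dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
frames = [dark, dark, dark, bright, bright]

depths = [len(cube) for cube in split_into_cubes(frames)]
print(depths)
```

Each resulting group would then be handed to the three dimensional transform as one cube, so highly correlated frames are transformed together.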
Referring back to figure 4(b), a sequence of frames to be transformed is indicated by label A. The three dimensional continuous Fourier transform, when applied to the object A that is defined by a function f of three independent variables x, y, and z, is:
$$\mathfrak{F}\{f(x,y,z)\} = F(u,v,w) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x,y,z)\, e^{-j2\pi(ux+vy+wz)}\,dx\,dy\,dz$$
Using Euler's formula, this may be expressed as:
$$F(u,v,w) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x,y,z)\,\left[\cos\{2\pi(ux+vy+wz)\} - j\sin\{2\pi(ux+vy+wz)\}\right]dx\,dy\,dz$$
or:
$$F(u,v,w) = R(u,v,w) + jI(u,v,w) = |F(u,v,w)|\,e^{j\phi(u,v,w)}$$
with:
$$R(u,v,w) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x,y,z)\,\cos\{2\pi(ux+vy+wz)\}\,dx\,dy\,dz$$
and,
$$I(u,v,w) = -\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x,y,z)\,\sin\{2\pi(ux+vy+wz)\}\,dx\,dy\,dz$$
The Fourier spectrum and spectral density are then defined by the following equations:
$$|F(u,v,w)| = \left[\{R(u,v,w)\}^2 + \{I(u,v,w)\}^2\right]^{1/2} \quad \text{(3D Fourier spectrum)}$$
$$\phi(u,v,w) = \tan^{-1}\left[\frac{I(u,v,w)}{R(u,v,w)}\right] \quad \text{(3D Fourier phase)}$$
$$P(u,v,w) = \{R(u,v,w)\}^2 + \{I(u,v,w)\}^2 \quad \text{(3D spectral density)}$$
The transformation of the object A, which is represented by P(u,v,w), will define the three dimensional spectral information with DC located at the center P(0,0,0). As u, v, or w are changed, the spectral density also changes. In fact, the largest percentage of the energy within P(u,v,w) will be contained near the center P(0,0,0), with the density falling off dramatically (non-linearly) as u, v, and w move away from zero.
For the case of the object A being a cubic structure, boundary conditions exist and the triple integration will result in a cubic structure with the spectral density being the greatest at the center of the cubic structure. Proceeding away from the center of the cubic structure, the spectral density will rapidly become smaller and will approach zero at the edges of the cubic structure. This is shown graphically in figure 4(c). The uniformity throughout the object A determines the rate at which the spectral density decreases from maximum at the center of the cubic structure to zero at the edges. With a high level of uniformity throughout A, there will be a large correlation or low entropy in A. As a result, the 3D spectral density will result in high entropy. This means the rate of change of density from the center of the transformed object A will be very high.
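Rather than the continuous Fourier transform above, a three dimensional discrete Fourier transform may be applied to the object A. In this case, the transform of a function f(x,y,z) that is sampled in the x, y, and z dimensions by Δx, Δy, and Δz is given by the following equation:
$$F(u,v,w) = \frac{1}{MNO}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\sum_{z=0}^{O-1} f(x,y,z)\, e^{-j2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)}$$
where: u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1; w = 0, 1, 2, ..., O-1.
Using Euler's formula, this may be expressed as:
$$F(u,v,w) = \frac{1}{MNO}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\sum_{z=0}^{O-1} f(x,y,z)\left[\cos\left\{2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)\right\} - j\sin\left\{2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)\right\}\right]$$
or:
$$F(u,v,w) = R(u,v,w) + jI(u,v,w) = |F(u,v,w)|\,e^{j\phi(u,v,w)}$$
with:
$$R(u,v,w) = \frac{1}{MNO}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\sum_{z=0}^{O-1} f(x,y,z)\cos\left\{2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)\right\}$$
and,
$$I(u,v,w) = -\frac{1}{MNO}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\sum_{z=0}^{O-1} f(x,y,z)\sin\left\{2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)\right\}$$
The spectral density is given by:
$$P(u,v,w) = \{R(u,v,w)\}^2 + \{I(u,v,w)\}^2$$
For the case where N = M = O, the numerical complexity of implementation is proportional to $N^3$. There will be $2N^3$ trigonometric calculations, $2N^3$ real multiplications, and $2N^3$ real additions.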
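The discrete transform and its spectral density can be checked numerically with a direct, unoptimized sketch. The function names and the 1/(MNO) scaling on the forward transform are assumptions of this example. Note also that with indices running from 0, the DC term sits at index (0,0,0); the centered layout described in the text corresponds to shifting each index by half the cube size.

```python
import cmath

def dft3(f):
    """Direct 3D DFT of f[x][y][z], scaled by 1/(M*N*O) (assumed convention)."""
    M, N, O = len(f), len(f[0]), len(f[0][0])
    F = [[[0j] * O for _ in range(N)] for _ in range(M)]
    for u in range(M):
        for v in range(N):
            for w in range(O):
                acc = 0j
                # Triple sum over every sample for this one coefficient.
                for x in range(M):
                    for y in range(N):
                        for z in range(O):
                            angle = -2 * cmath.pi * (u * x / M + v * y / N + w * z / O)
                            acc += f[x][y][z] * cmath.exp(1j * angle)
                F[u][v][w] = acc / (M * N * O)
    return F

def spectral_density(F):
    """P = R^2 + I^2 for every coefficient."""
    return [[[c.real ** 2 + c.imag ** 2 for c in row] for row in plane] for plane in F]

# A perfectly uniform 4 x 4 x 4 cube: maximal correlation between neighbours.
cube = [[[7.0] * 4 for _ in range(4)] for _ in range(4)]
P = spectral_density(dft3(cube))

dc_energy = P[0][0][0]
other_energy = sum(p for plane in P for row in plane for p in row) - dc_energy
print(dc_energy, other_energy)
```

For this uniform cube all of the spectral density lands in the DC coefficient, which is the limiting case of the energy concentration described above. The inner triple loop also makes the per-coefficient operation count visible.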
For video processing applications where Δx, Δy, and Δz correspond to the horizontal spatial sample rate, the vertical spatial sample rate, and the temporal sample rate respectively, the object A will be a cubic structure defined by the video input format, such as Common Intermediate Format (CIF). This results in a three dimensional (352,240,z) pixel array, where z varies with the frame rate and the scene change data. The discrete Fourier transform will result in a cubic structure (352,240,z) with the spectral density being the greatest at the center (352/2,240/2,z/2). Proceeding away from the center, the spectral density will decrease and approach zero at the edges. The pixel correlation throughout the volume (352,240,z) determines the rate at which the spectral density decreases from maximum at the center of the cubic structure to zero at the edges. Generally there is a high correlation between a pixel and its temporal and spatial neighbors within A. As a result, the 3D transformation will result in low correlation or high entropy. This means the rate of change of density from the center of the transformed object A will be very high. The rate of change is dependent on the type of video being processed and on the temporal dimension z defined by scene changes. Here, type of video means, for example, talk shows, high action movies, cartoons and the like. Thus for different types of video the spectral content will vary.
The transformed object A may be recovered by applying the appropriate inverse transform. The three dimensional inverse discrete fourier transform is defined as follows:
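$$\mathcal{F}^{-1}\{F(u,v,w)\} = f(x,y,z) = \sum_{u=0}^{M-1}\sum_{v=0}^{N-1}\sum_{w=0}^{O-1} F(u,v,w) \times e^{\,j2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)}$$
where: x = 0, 1, 2, ..., M-1; y = 0, 1, 2, ..., N-1; z = 0, 1, 2, ..., O-1. Using Euler's formula, this may be expressed as:
$$f(x,y,z) = \sum_{u=0}^{M-1}\sum_{v=0}^{N-1}\sum_{w=0}^{O-1} F(u,v,w) \times \left[\cos 2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right) + j \sin 2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)\right]$$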
or:
$$f(x,y,z) = r(x,y,z) + j\,i(x,y,z) = |f(x,y,z)|\,e^{j\phi(x,y,z)}$$
with:
$$r(x,y,z) = \sum_{u=0}^{M-1}\sum_{v=0}^{N-1}\sum_{w=0}^{O-1} R(u,v,w) \times \cos 2\pi\left(\frac{ux}{M}+\frac{vy}{N}+\frac{wz}{O}\right)$$
and,
Figure imgf000010_0003
The amplitude of f(x,y,z) is given by:
$$|f(x,y,z)| = \left[\{r(x,y,z)\}^2 + \{i(x,y,z)\}^2\right]^{1/2}$$
For the case where $N = M = O$, the numerical complexity of implementation is proportional to $N^3$. There will be $2N^3$ trigonometric calculations, $2N^3$ real multiplications, and $2N^3$ real additions.
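The recovery of the original samples by the inverse transform can be checked numerically. The sketch below is an assumption-laden illustration: the names `_kernel_sum`, `dft3` and `idft3` are hypothetical, the forward transform is taken to carry the 1/(MNO) factor and a negative exponent, and the inverse is taken to carry a positive exponent with no scale factor. The amplitude is computed with `abs`, which for a complex value equals the square root of the sum of the squared real and imaginary parts.

```python
import cmath

def _kernel_sum(a, sign, scale):
    """Sum a[x][y][z] * exp(sign*j*2*pi*(px/M + qy/N + rz/O)) for every (p, q, r)."""
    M, N, O = len(a), len(a[0]), len(a[0][0])
    out = [[[0j] * O for _ in range(N)] for _ in range(M)]
    for p in range(M):
        for q in range(N):
            for r in range(O):
                acc = 0j
                for x in range(M):
                    for y in range(N):
                        for z in range(O):
                            ang = 2 * cmath.pi * (p * x / M + q * y / N + r * z / O)
                            acc += a[x][y][z] * cmath.exp(sign * 1j * ang)
                out[p][q][r] = acc * scale
    return out

def dft3(f):
    """Forward transform (assumed: negative exponent, 1/(MNO) scaling)."""
    M, N, O = len(f), len(f[0]), len(f[0][0])
    return _kernel_sum(f, -1, 1.0 / (M * N * O))

def idft3(F):
    """Inverse transform (assumed: positive exponent, no scaling)."""
    return _kernel_sum(F, +1, 1.0)

# A small non-cubic test object: M=2, N=3, O=2.
f = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
     [[6.0, 5.0], [4.0, 3.0], [2.0, 1.0]]]
g = idft3(dft3(f))

# Largest deviation between the recovered and original samples.
max_error = max(abs(g[x][y][z] - f[x][y][z])
                for x in range(2) for y in range(3) for z in range(2))

# Amplitude |f| = sqrt(r^2 + i^2) of every recovered sample.
amplitude = [[[abs(v) for v in row] for row in plane] for plane in g]
```

Because the input is real, the imaginary parts of the recovered samples are only rounding noise, and the amplitude matches the original values.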
Those skilled in the art will appreciate that applying a three dimensional Discrete Cosine Transform to highly correlated video frames will yield optimal compaction in the density of the transformed video. The Discrete Sine Transform, as well as other transforms, can also be applied to the 3D structure defined by A.
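To illustrate the compaction property, the following sketch applies a separable, orthonormal DCT-II along each axis of a small, smoothly varying block and measures how much of the total energy falls into the single lowest-frequency coefficient. The function names `dct1` and `dct3`, the orthonormal scaling and the 4x4x4 test data are assumptions chosen for the example; the specification does not prescribe a particular DCT variant.

```python
import math

def dct1(v):
    """Orthonormal 1D DCT-II of a list of floats."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(s * math.sqrt((1 if k == 0 else 2) / n))
    return out

def dct3(cube):
    """Separable 3D DCT-II: apply dct1 along z, then along y, then along x."""
    M, N, O = len(cube), len(cube[0]), len(cube[0][0])
    a = [[dct1(cube[x][y]) for y in range(N)] for x in range(M)]  # along z
    for x in range(M):                                            # along y
        for z in range(O):
            col = dct1([a[x][y][z] for y in range(N)])
            for y in range(N):
                a[x][y][z] = col[y]
    for y in range(N):                                            # along x
        for z in range(O):
            col = dct1([a[x][y][z] for x in range(M)])
            for x in range(M):
                a[x][y][z] = col[x]
    return a

def energy(cube):
    """Sum of squared values over the whole 3D array."""
    return sum(v * v for plane in cube for row in plane for v in row)

# Smooth (highly correlated) 4 x 4 x 4 block of pixel values.
cube = [[[100.0 + 2.0 * x + 1.0 * y + 0.5 * z for z in range(4)]
         for y in range(4)] for x in range(4)]
coeffs = dct3(cube)

total_in, total_out = energy(cube), energy(coeffs)
dc_fraction = coeffs[0][0][0] ** 2 / total_out
```

With orthonormal scaling the transform preserves total energy, so `dc_fraction` directly shows how strongly the energy of a correlated block concentrates in one coefficient.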
Referring to figure 5, a schematic diagram of a 3D wavelet transform applied to the 3D matrix of pixels is shown generally by numeral 50. The illustration shows a 3D wavelet dyadic sub-cube tree structure. In general, a 3D wavelet and/or a fractal algorithm may also be applied in the 3D transformation process to yield a multiresolution sub-cube with a dyadic sub-cube tree structure, to which a 3D Embedded Zerotree Wavelet (EZW) coding technique can be applied. In addition, an efficient DCT (FFT) expanded for 3D can be followed with entropy coding or code books for 3D spaces (e.g., 8x8x8, 16x16x16, etc.).
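The dyadic sub-cube structure can be sketched with the simplest wavelet, the Haar wavelet. This is an assumption for illustration, since the specification does not name a particular wavelet, and the function names below are hypothetical. One level splits the cube into eight sub-cubes (one low-pass, seven detail); each further level repeats the split on the low-pass sub-cube only, which produces the tree.

```python
def haar1(v):
    """One-level 1D Haar split: pairwise averages first, then pairwise differences."""
    half = len(v) // 2
    lo = [(v[2 * i] + v[2 * i + 1]) / 2.0 for i in range(half)]
    hi = [(v[2 * i] - v[2 * i + 1]) / 2.0 for i in range(half)]
    return lo + hi

def haar3_level(cube):
    """Apply haar1 along z, y and x of a cube whose side lengths are even."""
    M, N, O = len(cube), len(cube[0]), len(cube[0][0])
    a = [[haar1(cube[x][y]) for y in range(N)] for x in range(M)]  # along z
    for x in range(M):                                             # along y
        for z in range(O):
            col = haar1([a[x][y][z] for y in range(N)])
            for y in range(N):
                a[x][y][z] = col[y]
    for y in range(N):                                             # along x
        for z in range(O):
            col = haar1([a[x][y][z] for x in range(M)])
            for x in range(M):
                a[x][y][z] = col[x]
    return a

def haar3(cube, levels):
    """Dyadic decomposition: re-split only the low-pass sub-cube at each level."""
    a = [[row[:] for row in plane] for plane in cube]
    m, n, o = len(a), len(a[0]), len(a[0][0])
    for _ in range(levels):
        sub = haar3_level([[a[x][y][:o] for y in range(n)] for x in range(m)])
        for x in range(m):
            for y in range(n):
                a[x][y][:o] = sub[x][y]
        m, n, o = m // 2, n // 2, o // 2
    return a

# 4 x 4 x 4 cube whose value changes only along the temporal axis z.
cube = [[[float(z) for z in range(4)] for _ in range(4)] for _ in range(4)]
tree = haar3(cube, levels=2)
nonzero = sum(1 for plane in tree for row in plane for v in row if v != 0.0)
```

For this input only 10 of the 64 coefficients are nonzero after two levels, which is the kind of sparsity a zerotree coder exploits.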
Referring to figure 6, a graph showing typical compression ratios estimated by the present invention for various types of media as a function of the frame depth (z) is shown generally by numeral 60. The basic concept which is the subject of the present application may be used in extending conventional transforms to 3D fixed frame depth using such approaches as fractals, NQ, DCT and wavelets. Furthermore, optimizations can be realized as a result of the human visual system response to contrast sensitivity and the adaptation range of the eye due to brightness levels. Optimal 3D coding techniques may also be derived by extending present 2D coding methods such as Huffman coding, arithmetic coding, and vector or surface quantization coding. Although the compression requirements for such approaches are expected to be high, an efficient 3D variable frame depth decoder may be implemented in hardware on a desktop PC. Such a variable frame depth decoder may also be implemented using a neural network or the like.
In addition, an algorithm may be used to determine, on the fly, the optimal frame depth for the 3D transformation, which depends on the video content as measured by the frame to frame pixel correlation. For low frame to frame pixel correlation (or SNR methods), a scene change is detectable and the length of the 3D matrix of pixels is determined on the fly. In this regard, the curves of compression ratio effectiveness vs. frame depth for each media type, as shown in figure 6, may be developed to indicate the expected performance for applications ranging from high action movies and television broadcasts to white boarding. For lossless compression, 1D, 2D or 3D entropy coding can be used to achieve >70% compression. For lossy compression, a 3D Pixel Quantization mask is applied before entropy coding to achieve larger compression ratios.
Referring to figure 7, a flow chart of the general steps implemented in a 3D compression system is shown generally by numeral 70. Similarly, in figure 8, a flow chart of the general steps implemented in a 3D decompression system is shown generally by numeral 80. The descriptions in each block therein are incorporated herein. Although the invention has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art without departing from the spirit and scope of the invention as outlined in the claims appended hereto.
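One way the on-the-fly frame depth selection could work is sketched below: frames are appended to the current 3D block until the correlation between consecutive frames drops below a threshold, which is treated as a scene change. The Pearson correlation measure, the function names, the `threshold` value and the `max_depth` cap are all assumptions made for this example; the specification does not fix any of them.

```python
import math

def frame_correlation(a, b):
    """Pearson correlation between two frames given as flat lists of pixels."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    if va == 0.0 or vb == 0.0:
        # Flat frames have no variance; call them correlated only if identical.
        return 1.0 if a == b else 0.0
    return cov / math.sqrt(va * vb)

def select_frame_depth(frames, threshold=0.5, max_depth=16):
    """Return how many leading frames to group into one 3D block.

    threshold and max_depth are hypothetical tuning parameters.
    """
    depth = 1
    while depth < len(frames) and depth < max_depth:
        if frame_correlation(frames[depth - 1], frames[depth]) < threshold:
            break  # low correlation: treat as a scene change
        depth += 1
    return depth

# Three similar frames followed by a scene change (tiny 4-pixel frames).
frames = [
    [10, 20, 30, 40],
    [11, 21, 29, 41],
    [10, 19, 31, 40],
    [40, 30, 20, 10],  # scene change
    [41, 29, 21, 10],
]
depth = select_frame_depth(frames)
```

Here the first three frames would form one 3D matrix of pixels, and the next block would start at the fourth frame.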
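The block descriptions of figures 7 and 8 are not reproduced in this text. As a rough sketch of the lossy path described above (a 3D quantization mask applied before entropy coding), the example below divides each transform coefficient by a step size that grows with its frequency index and then estimates the entropy of the resulting symbols. The linear mask, the parameters `base_step` and `slope`, the sample coefficients and the use of Shannon entropy as a stand-in for an actual entropy coder are all assumptions for illustration.

```python
import math
from collections import Counter

def quantize(coeffs, base_step=1.0, slope=2.0):
    """Quantize a 3D coefficient array with a step that grows with u + v + w."""
    return [[[int(round(c / (base_step + slope * (u + v + w))))
              for w, c in enumerate(row)]
             for v, row in enumerate(plane)]
            for u, plane in enumerate(coeffs)]

def entropy_bits(symbols):
    """Shannon entropy, in bits per symbol, of a flat list of symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def flatten(cube):
    """Flatten a nested 3D list into a single list."""
    return [v for plane in cube for row in plane for v in row]

# Hypothetical 2 x 2 x 2 transform output: one large low-frequency term,
# small high-frequency terms.
coeffs = [[[840.0, 3.2], [1.9, 0.4]],
          [[-2.6, 0.3], [0.2, -0.1]]]

q = quantize(coeffs)
bits_before = entropy_bits(flatten(coeffs))
bits_after = entropy_bits(flatten(q))
```

Quantization collapses the small high-frequency coefficients onto a few repeated symbols, so the estimated bits per symbol drop; the price is that the original coefficients can no longer be recovered exactly.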

Claims

THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. A method of compressing a data signal, the method comprising the steps of:
(a) selecting a first number of a sequence of image frames according to a first criterion, the sequence being part of a video stream;
(b) applying a three dimensional transform to the selected sequence to produce a first transformed output; and
(c) encoding the transformed output to produce a compressed stream output.
2. A method as defined in claim 1, said frames being represented as analog data.
3. A method as defined in claim 1, said frames being represented as digital data representing an array of pixels arranged in two dimensions.
4. A method as defined in claim 1, said transform being a 3D Fourier Transform to produce a 3D coefficient array output.
5. A method as defined in claim 1, said transform being a 3D cosine transform to produce a 3D coefficient array output.
6. A method as defined in claim 1, said transform being a 3D wavelet transform to produce a 3D coefficient array output.
7. A method as defined in claim 1, said transform being a 3D fractal transform to produce a 3D coefficient array output.
8. A method as defined in claim 1, including the step of determining whether lossless compression is to be applied prior to encoding the transformed output.
9. A method as defined in claim 1, including quantizing the transformed output to produce a 3D quantized coefficient array.
10. A method as defined in any of claims 4, 5, 6, or 7, including the step of quantizing the 3D coefficient array values to produce a 3D quantized coefficient array.
11. A method as defined in claim 1, said encoding including selecting one of a run length limited, 1D, 2D, or 3D entropy coding.
12. A method as defined in claim 1, including the step of caching the compressed stream output in a database.
13. A method as defined in claim 12, including the step of applying multipass compression.
14. A method as defined in claim 12, including transmitting said compressed stream over a communication medium to a receiver.
15. A method as defined in claim 14, including the step of transmitting said compressed stream as packet data.
16. A method for communicating compressed data between a media server and a client in a data communication network, the method comprising the steps of:
(a) selecting a sequence of image frames as part of a video stream to be transmitted to the recipient;
(b) applying a three dimensional transform to the selected sequence to produce a first transformed output;
(c) encoding the transformed output to produce a compressed stream output;
(d) transmitting said compressed stream output to said recipient;
(e) said recipient decoding said compressed stream data; and
(f) applying an inverse of said three dimensional transform to the decoded data to produce an uncompressed frame sequence.
17. A method as defined in claim 16, including the step of formatting said compressed data into coded data blocks prior to decoding said compressed data.
18. A method for storing data on a media server, said method comprising the steps of: receiving a media stream to be stored on said server; selecting a sequence of frames of said media; applying a three dimensional transform to the selected sequence to produce a first transformed output and encoding the transformed output to produce a compressed stream output; and storing said compressed stream on said server.
19. A method as defined in claim 18, including storing said compressed output on a digital tape.
20. A method as defined in claim 18, including the step of storing said compressed output on a digital video disk.
21. A method for communicating compressed data between a media source and a customer, the method comprising the steps of: selecting a first number of a sequence of image frames of said media by said source; applying a three dimensional transform to the selected sequence to produce a first transformed output; encoding the transformed output to produce a compressed stream output; storing said compressed output on a portable storage medium; and providing said portable medium to said customer.
22. A method as defined in claim 21, said portable medium for use in a home entertainment system.
23. A method as defined in claim 1, including using said compressed output in a video electronic mail or video advertising system.
24. A method as defined in claim 1, including transmitting said compressed stream output by wireless, cable, ADSL or similar medium, to a recipient.
25. A method as defined in claim 1, including the step of using said compressed output in a video on demand system.
26. A method as defined in claim 21, said first number being determined by evaluating said video content.
27. A method as defined in claim 26, said video content being evaluated on a frame by frame basis.
PCT/CA2000/000132 1999-05-21 2000-02-15 Multi-dimensional data compression WO2000072602A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU26529/00A AU2652900A (en) 1999-05-21 2000-02-15 Multi-dimensional data compression

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
CA2,272,590 1999-05-21
CA 2272590 CA2272590A1 (en) 1999-05-21 1999-05-21 System and method for streaming media over an internet protocol system
CA2,277,373 1999-07-09
CA 2277373 CA2277373A1 (en) 1999-05-21 1999-07-09 Multi-dimensional data compression
CA 2280662 CA2280662A1 (en) 1999-05-21 1999-09-02 Media server with multi-dimensional scalable data compression
CA2,280,662 1999-09-02

Publications (1)

Publication Number Publication Date
WO2000072602A1 (en) 2000-11-30

Family

ID=27170970

Family Applications (3)

Application Number Title Priority Date Filing Date
PCT/CA2000/000133 WO2000072517A1 (en) 1999-05-21 2000-02-15 System and method for streaming media over an internet protocol system
PCT/CA2000/000131 WO2000072599A1 (en) 1999-05-21 2000-02-15 Media server with multi-dimensional scalable data compression
PCT/CA2000/000132 WO2000072602A1 (en) 1999-05-21 2000-02-15 Multi-dimensional data compression

Country Status (3)

Country Link
AU (3) AU2652900A (en)
CA (1) CA2280662A1 (en)
WO (3) WO2000072517A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5289289A (en) * 1990-01-23 1994-02-22 Olympus Optical Co., Ltd. Image data coding apparatus and coding method for dynamic-image data
EP0629090A2 (en) * 1993-06-11 1994-12-14 Quantel Limited A video image processing system
US5570126A (en) * 1993-05-03 1996-10-29 Lucent Technologies Inc. System for composing multimedia signals for interactive television services
US5706216A (en) * 1995-07-28 1998-01-06 Reisch; Michael L. System for data compression of an image using a JPEG compression circuit modified for filtering in the frequency domain
US5708511A (en) * 1995-03-24 1998-01-13 Eastman Kodak Company Method for adaptively compressing residual digital image data in a DPCM compression system
WO1999025121A1 (en) * 1997-11-07 1999-05-20 Pipe Dream, Inc. Method for compressing and decompressing motion video

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5159447A (en) * 1991-05-23 1992-10-27 At&T Bell Laboratories Buffer control for variable bit-rate channel
US5881176A (en) * 1994-09-21 1999-03-09 Ricoh Corporation Compression and decompression with wavelet style and binary style including quantization by device-dependent parser
US5621660A (en) * 1995-04-18 1997-04-15 Sun Microsystems, Inc. Software-based encoder for a software-implemented end-to-end scalable video delivery system
US5822524A (en) * 1995-07-21 1998-10-13 Infovalue Computing, Inc. System for just-in-time retrieval of multimedia files over computer networks by transmitting data packets at transmission rate determined by frame size
JP2000504906A (en) * 1996-02-14 2000-04-18 オリブル コーポレイション リミティド Method and system for progressive asynchronous transmission of multimedia data
US5918013A (en) * 1996-06-03 1999-06-29 Webtv Networks, Inc. Method of transcoding documents in a network environment using a proxy server
US5996022A (en) * 1996-06-03 1999-11-30 Webtv Networks, Inc. Transcoding data in a proxy computer prior to transmitting the audio data to a client
US5953506A (en) * 1996-12-17 1999-09-14 Adaptive Media Technologies Method and apparatus that provides a scalable media delivery system
US20010039615A1 (en) * 1997-04-15 2001-11-08 At &T Corp. Methods and apparatus for providing a broker application server
US6014694A (en) * 1997-06-26 2000-01-11 Citrix Systems, Inc. System for adaptive video/audio transport over a network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHAN Y -L ET AL: "VARIABLE TEMPORAL-LENGTH 3-D DISCRETE COSINE TRANSFORM CODING", IEEE TRANSACTIONS ON IMAGE PROCESSING,US,IEEE INC. NEW YORK, vol. 6, no. 5, 1 May 1997 (1997-05-01), pages 758 - 763, XP000656000, ISSN: 1057-7149 *
LAZAR M S ET AL: "FRACTAL BLOCK CODING OF DIGITAL VIDEO", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,US,IEEE INC. NEW YORK, vol. 4, no. 3, 1 June 1994 (1994-06-01), pages 297 - 308, XP000460761, ISSN: 1051-8215 *

Also Published As

Publication number Publication date
WO2000072517A1 (en) 2000-11-30
AU2652900A (en) 2000-12-12
CA2280662A1 (en) 2000-11-21
AU2652800A (en) 2000-12-12
WO2000072599A1 (en) 2000-11-30
AU2653000A (en) 2000-12-12

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)