CA2280662A1 - Media server with multi-dimensional scalable data compression - Google Patents
- Publication number
- CA2280662A1
- Authority
- CA
- Canada
- Prior art keywords
- client
- compression
- data
- video
- accomplished
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Abstract
A method and system for managing scalable compression of multicast media over an Internet protocol, between a media server and a client. The method includes the steps of determining the client access bandwidth statistics; determining a data buffer status of the client; generating a quantization mask in response to the buffer status; applying the quantization mask to an array of transformed frame data for each frame in a sequence, or for each group of frames in the three-dimensional case, to produce quantized data; performing arithmetic coding on the quantized data to produce a bitstream; and transmitting the bitstream to the client.
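The mask-generation and quantization steps of the abstract can be illustrated with a minimal Python sketch. It is not the method of the patent: the function names, the linear growth of the step size with frequency index, and the rule that an emptier client buffer leads to coarser quantization are all assumptions made for illustration. The transform and the arithmetic coder are omitted.

```python
# Illustrative sketch only. All names and the mask formula are hypothetical;
# the abstract does not specify how the mask is derived from the buffer status.

def make_quantization_mask(rows, cols, buffer_fill, base_step=1.0, max_extra=7.0):
    """Build a rows x cols array of quantization step sizes.

    buffer_fill is the client buffer occupancy in [0.0, 1.0].
    Assumption: an emptier buffer calls for fewer bits, so step sizes grow
    as the buffer drains, and grow faster for higher-frequency coefficients.
    """
    if not 0.0 <= buffer_fill <= 1.0:
        raise ValueError("buffer_fill must be between 0.0 and 1.0")
    pressure = 1.0 - buffer_fill          # 0.0 = full buffer, 1.0 = empty buffer
    span = max(rows + cols - 2, 1)        # largest possible r + c, at least 1
    return [
        [base_step + pressure * max_extra * (r + c) / span for c in range(cols)]
        for r in range(rows)
    ]


def apply_mask(coeffs, mask):
    """Quantize transformed frame data by dividing by the mask and rounding."""
    return [
        [int(round(value / step)) for value, step in zip(coeff_row, mask_row)]
        for coeff_row, mask_row in zip(coeffs, mask)
    ]


if __name__ == "__main__":
    # Stand-in for one frame's transform coefficients (low frequency at top-left).
    coeffs = [[16.0, 8.0], [8.0, 3.0]]
    print(apply_mask(coeffs, make_quantization_mask(2, 2, buffer_fill=1.0)))
    print(apply_mask(coeffs, make_quantization_mask(2, 2, buffer_fill=0.0)))
```

With a full buffer every step size equals `base_step`, so the coefficients pass through nearly unchanged; with an empty buffer the high-frequency coefficients collapse toward zero, which is what would let a subsequent entropy coder emit a shorter bitstream.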
Description
MEDIA SERVER WITH MULTI-DIMENSIONAL SCALABLE DATA COMPRESSION
The present invention relates to a system and method for providing scalable data compression in a caching server and more particularly to a system and method for optimizing and managing such compression for streaming media edge servers.
BACKGROUND OF THE INVENTION
The Internet today is defined as the interconnection of Internet protocol (IP)-based networks. The Internet protocol stack diagram is represented in terms of the ISO 7-layer model. Various equipment types and products may be associated with the layer functionality that they service.
The Internet may be viewed as a single integrated network in which various access types are interconnected to various backbone types through edge servers and edge equipment (also called remote access servers or network access servers).
There are approximately twenty or more different variations of access paths that can be used to connect the backbone services to a customer (interchangeably referred to as a client).
There are six basic access types of connections, namely wireless terrestrial, wireless, satellite, copper, coaxial cable and fiber. In the future, additional access types may be created.
Thus far, the Internet has been a resounding success. Ironically, it is this very success and more specifically, the success of the graphical World Wide Web (the web) that may be its undoing. The number of web subscribers, content providers, and requests by those subscribers for content grows exponentially faster than the capability of the network to meet the demand. The majority of current data transfers involve text and graphics. However, the future of the Internet appears to be evolving towards the transfer of full motion video and audio.
As web sites continue to increase their multimedia content through the integration of audio, video and data, delivering this media effectively to Internet end users will create a congestion problem because of the nature of the web. One of the features that has made the web such a success is the ability of one user to access another user's information regardless of where that information is stored, what type of computer it is stored on, or what kind of application was used to create it.
Unfortunately, the same flexibility and ease of use features result in a serious contention issue, since everyone competes for the same available network resources. Streaming technologies for live audio and video over the web have exacerbated this problem even further.
Streaming media is different from the typical transfer of multimedia data. Normally, hyperlinks point to multimedia files that are downloaded in their entirety to a user's local disk before being viewed or played. However, streaming media allows users to watch live video and audio as the file is downloading. Streaming media often requires a continual transfer of large volumes of data. If many people request the same data at the same time, it will lead to bandwidth restrictions or bottlenecks.
Some of the reasons for these bandwidth restrictions or bottlenecks are highlighted in the following description. Since the Internet is IP based, all packets must be evaluated by routers to determine the destination delivery paths, creating traffic congestion, particularly with the increased demand in real-time media, such as video and audio. It has been found that backbone, subnet and router upgrades are not sufficient to increase the Internet throughput to offset the increasing bandwidth requirements of the WWW itself. This problem is further exacerbated by end users having faster access to the ISP POP (Internet Service Provider Point of Presence). Providing "bigger pipes" to the POP simply sends bigger chunks of data onto the web. In another attempted solution, real-time protocols and specialized backbones have been developed. However, these solutions are suitable only for improving transports for scheduled or premium events, but are unsuitable for the proliferation of multimedia content that is expected in the near future. Although improved compression techniques promise to squeeze multimedia files into smaller and smaller sizes, video and audio will continue to require a "big pipe" as a result of the real-time transport requirements.
Oracle Corporation has proposed a solution to the above problem. In this solution, it is proposed that the multimedia data repository is placed closer to the consumer of the multimedia. Thus, servers are deployed at the edge of the web and multimedia data is replicated on these edge servers where the user connection terminates at the POP. Hyperlinks on the web pages become pointers to streaming media servers that are physically closest to the consumer. The philosophy behind this implementation is that the POP is the logical termination of the user's access point, and thus packets flowing into or out of the POP are only limited by the access speed of the user's connection. Any data packets that flow behind or through the ISP back channel, for example a router, are affected by bottlenecks. Thus, by placing the media repository at the POP and behind the router, the user is insulated from traffic conditions that exist on the Internet at any given time. It is envisioned that content providers and web publishers use a combination of mirroring or caching techniques to replicate data to the edge servers.
A disadvantage of the above scheme is that, in the mirroring model, it requires the content providers and web publishers themselves to stage, propagate, and update the multimedia data to be replicated. In the caching model, if the data requested by the user was not already cached, a dialogue box would inform the user of the approximate time the media would be available and might suggest that they visit other sites in the interim.
In general, both situations are unacceptable to most users since most users require instant access to the requested data.
A further improvement on this method, particularly applicable to streaming media, has been proposed by Real Networks, which introduced a distributed multi-tier broadcast architecture for the Internet termed the Real Broadcast Network (RBN).
In this solution, access to the RBN server is distributed throughout the Internet backbone. Live feed is transmitted directly to splitters, which are located in the major backbone provider's network. This feed is then retransmitted or "split" from the backbone provider to splitters installed at the ISP site, where it is finally streamed to the user's computer.
Another solution that Real Networks proposes in order to counter the problem of providing high quality media (video and audio) to streaming users while accommodating the various physical connection speeds between the user and the ISP, is to create a scalable stream where the server can reduce the amount of data being sent to keep the client from rebuffering. This approach is generally referred to as video "stream-thinning". The limitation of this approach is that a video or audio file designed to play at one data rate and subsequently scaled down to a lower rate results in an inferior quality level when compared to a video optimized specifically for the lower data rate.
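The general idea of stream-thinning can be illustrated with a minimal Python sketch. The mechanism shown here, dropping whole frames at a fixed stride under the assumption that all frames are the same size, is a simplification chosen for illustration; the text above does not say how Real Networks reduces the data, and the function names are hypothetical.

```python
# Illustrative sketch only: frame dropping with equal-size frames is an
# assumption, not a description of any vendor's actual thinning mechanism.
import math


def thinning_stride(stream_kbps, available_kbps):
    """Return the smallest stride k such that sending every k-th frame
    fits within available_kbps, assuming all frames are the same size."""
    if stream_kbps <= 0 or available_kbps <= 0:
        raise ValueError("rates must be positive")
    return max(1, math.ceil(stream_kbps / available_kbps))


def thin(frames, stride):
    """Keep every stride-th frame, starting with the first."""
    return frames[::stride]


if __name__ == "__main__":
    frames = list(range(10))            # stand-in for ten encoded frames
    stride = thinning_stride(300, 100)  # 300 kbps stream, 100 kbps link
    print(stride, thin(frames, stride))
```

The sketch also makes the stated limitation visible: the thinned stream is simply the original with material removed, so it cannot match a stream that was encoded for the lower rate from the start.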
COMPRESSION
The present invention relates to a system and method for providing scalable data compression in a caching server and more particularly to a system and method for optimizing and managing such compression for streaming media edge servers.
BACKGROUND OF THE INVENTION
The Internet today is defined as the interconnection of Internet protocol (IP) -based networks. The Internet protocol stack diagram is represented in terms of the ISO -7-layer model. Various equipment types and products may be associated with the layer functionality that they service.
The Internet may be viewed as a single integrated network in which various access types are interconnected to various backbone types through edge servers and edge 1 S equipment (also called remote access servers or network access servers).
There are approximately twenty or more different variations of access paths that can be used to connect the backbone services to a customer (interchangeably referred to as a client).
There are six basic access types of connections, namely wireless terrestrial, wireless, satellite, copper, coaxial cable and fiber. In the future, additional access types may be created.
Thus far, the Internet has been a resounding success. Ironically, it is this very success and more specifically, the success of the graphical World Wide Web (the web) that may be its undoing. The number of web subscribers, content providers, and requests by those subscribers for content grows exponentially faster than the capability of the network to meet the demand. The majority of current data transfers involve text and graphics. However, the future of the Internet appears to be evolving towards the transfer of full motion video and audio.
As web sites continue to increase their multimedia content through the integration of audio, video and data, the ability of the web to effectively deliver this media to Internet end users will yield a congestion problem due to the nature of the web. One of the features that has made the web such a success is the ability of one user to access another user's information regardless of where that information is stored, what type of computer it is stored on, or what kind of application was used to create it.
Unfortunately, the same flexibility and ease of use features result in a serious contention issue, since everyone competes for the same available network resources. Streaming technologies for live audio and video over the web have exacerbated this problem even further.
Streaming media is different from the typical transfer of multimedia data.
Normally, hyperlinks point to multimedia files that are downloaded in their entirety to a user's local disk before being viewed or played. However, streaming media allows users to watch live video and audio as the file is downloading. Streaming media often requires a continual transfer of large volumes of data. If many people request the same data at the same time, it will lead to bandwidth restrictions or bottlenecks.
Some of the reasons for these bandwidth restrictions or bottlenecks are highlighted in the following description. Since the Internet is IP based, all packets must be evaluated by routers to determine the destination delivery paths, creating traffic congestion, particularly with the increased demand in real-time media, such as video and audio. It has been found that backbone, subnet and router upgrades are not sufficient to increase the Internet throughput to offset the increasing bandwidth requirements of the WWW itself. This problem is further exacerbated by end users having faster access to the ISP POP (Internet Service Provider Point of Presence). Providing "bigger pipes" to the POP simply sends bigger chunks of data onto the web. In another attempted solution, real-time protocols and specialized backbones have been developed. However, these solutions are suitable only for improving transports for scheduled or premium events, but are unsuitable for the proliferation of multimedia content that is expected in the near future. Although improved compression techniques promise to squeeze multimedia files into smaller and smaller sizes, video and audio will continue to require a "big pipe" as a result of the real-time transport requirements.
Oracle Corporation has proposed a solution to the above problem. In this solution, it is proposed that the multimedia data repository be placed closer to the consumer of the multimedia. Thus, servers are deployed at the edge of the web and multimedia data is replicated on these edge servers where the user connection terminates at the POP. Hyperlinks on the web pages become pointers to streaming media servers that are physically closest to the consumer. The philosophy behind this implementation is that the POP is the logical termination of the user's access point, and thus packets flowing into or out of the POP are only limited by the access speed of the user's connection. Any data packets that flow behind or through the ISP back channel, for example a router, are affected by bottlenecks. Thus, by placing the media repository at the POP and behind the router, the user is insulated from traffic conditions that exist on the Internet at any given time. It is envisioned that content providers and web publishers use a combination of mirroring or caching techniques to replicate data to the edge servers.
A disadvantage of the above scheme is that in the mirroring scheme it requires the content providers and web publishers themselves to stage, propagate, and update the multimedia data to be replicated. In the caching model, if the requested data by the user was not already cached, a dialogue box would inform the user with the approximate time the media would be available and might suggest that they visit other sites in the interim.
In general, both situations are unacceptable to most users since most users require instant access to the requested data.
A further improvement on this method, particularly applicable to streaming media, has been proposed by Real Networks, which introduced a distributed multi-tier broadcast architecture for the Internet termed the Real Broadcast Network (RBN).
In this solution, access to the RBN server is distributed throughout the Internet backbone. Live feed is transmitted directly to splitters, which are located in the major backbone provider's network. This feed is then retransmitted or "split" from the backbone provider to splitters installed at the ISP site, where it is finally streamed to the user's computer.
Another solution that Real Networks proposes in order to counter the problem of providing high quality media (video and audio) to streaming users while accommodating the various physical connection speeds between the user and the ISP, is to create a scalable stream where the server can reduce the amount of data being sent to keep the client from rebuffering. This approach is generally referred to as video "stream-thinning". The limitation of this approach is that a video or audio file designed to play at one data rate and subsequently scaled down to a lower rate results in an inferior quality level when compared to a video optimized specifically for the lower data rate.
Furthermore, audio codecs usually cannot scale dynamically to lower data rates.
An approach to address this heterogeneous connection rate environment is to create several files so that when a client connects, the server streams the appropriate file.
This has been referred to as "bandwidth negotiation". This process is not dynamic, so if a user's actual throughput changes due to congestion or packet loss, the server cannot adjust. Another difficulty is the increased labor required for coding and then managing the media clip for different bandwidths. The Real Networks solution to these problems in its most recent incarnation is to provide an encoding framework for combining multiple data streams, each at a different bit rate, into a single file. A sophisticated client-server mechanism is provided for detecting changes in bandwidth and translating those changes into combinations of different streams.
While the above attempts to address the solution of bandwidth negotiation and stream thinning, it still suffers from the limitation in that multiple streams corresponding to different bit rates must still be composed at the server end. For example, if ten different streams are to be composed each ranging from 1 megabit per second to kilobits per second, then all ten streams are composed at the server end on the backbone to the POP. Thus, a 1.8-megabit per second stream is sent down the backbone.
At the POP, ten different caches are now required. The POP then forwards the appropriate bit stream to the user depending on the user's access capability. It may be seen that in this solution, the user is provided with a relatively consistent stream. However, it still does not alleviate the problem of backbone congestion since multiple streams must all be transmitted along the backbone.
An improvement in the current architecture is described in the subject applicants' pending Canadian application Serial no. 2,272,590 filed May 21, 1999 and titled "System and Method for Streaming Media over an Internet Protocol". In this architecture communication between a client and a continuous media server is implemented by the media server composing data to be transmitted into a backbone common format;
the server transmitting the backbone common format data to the client POP; and
converting at the POP the backbone common format data into a plurality of access common format data for transmission to ones of a plurality of clients. In this system a single high quality data stream may be transmitted to an edge server. The edge server or POP filters the data to the respective client bandwidth capabilities. In a further embodiment of this architecture the edge server may also utilize trans-compression techniques to adapt the received data for filtering to the respective client bandwidth capabilities.
Traditionally, image compression methods may be classified as those methods, which reproduce the original data exactly, that is, "lossless compression" and those, which trade a tolerable divergence from the original data for greater compression, that is, "lossy compression". Typically, lossless methods have a problem that they are unable to achieve a compression of much more than 70%. Therefore, where higher compression ratios are needed, lossy techniques have been developed. In general, the amount by which the original media source is reduced is referred to as the compression ratio.
Compression technologies have evolved over time to adapt to the various user requirements. Historically, compression technology focused on telephony, where sound wave compression algorithms were developed and optimized. These algorithms all implemented a one-dimensional (1D) transformation, which increased the 1D entropy of the data in the transformed domain to allow for efficient quantization and 1D data coding.
Compression technologies then focused on two-dimensional (2D) data such as images or pictures. At first, the 1D audio algorithms were applied to the line data of each image to build up a compressed image. Research then progressed to the point today where the 1D algorithms have been extended to implement a two-dimensional (2D) transformation, which increases the 2D entropy to allow for efficient quantization and 2D data coding.
Currently, state of the art technology requires compression of moving pictures or video. In this area, research is focused on applying 2D image coding algorithms to the multitude of images (frames) that comprise video, and on applying motion compensation techniques to take advantage of correlation between frame data. For example, United States Patent No. RE 36015, re-issued December 29, 1998, describes a video compression system which is based on the image data compression system developed by the Moving Picture Experts Group (MPEG) and which uses various groups of field configurations to reduce the number of binary bits used to represent a frame composed of odd and even fields of video information.
In general, MPEG systems integrate a number of well known data compression techniques into a single system. These include motion compensated predictive coding, discrete cosine transformation (DCT), adaptive quantization and variable length coding (VLC). The motion compensated predictive coding scheme processes the video data in groups of frames in order to achieve relatively high levels of compression without allowing the performance of the system to be degraded by excessive error propagation.
In these group-of-frames processing schemes, image frames are classified into one of three types: the intraframe (I-Frame), the predicted frame (P-Frame) and the bi-directional frame (B-Frame). A 2D DCT is applied to small regions such as blocks of 8 x 8 pixels to encode each of the I-Frames. The resulting data stream is quantized and encoded using a variable length code such as an amplitude run length Huffman code to produce the compressed output signal. As may be seen, this quantization technique still focuses on compressing single frames or images, which may not be the most effective means of compression for current multimedia requirements. Also, for low bit rate applications, MPEG suffers from 8 x 8 blocking artifacts known as tiling. Furthermore, the second-generation compression approaches, as described above, have reduced the media data requirements for video by as much as 100:1. Typically, these technologies are focused on the following approaches: wavelet algorithms and vector quantization.
The wavelet algorithms are implemented with efficient significance map coding such as EZW and line detection with gradient vectors depending on the application's final reconstructed resolution. The wavelet algorithms operate on the entire image and have efficient implementation due to finite impulse response (FIR) filter realizations. All wavelet algorithms decompose an image into coarser, smooth approximations with low pass digital filtering (convolution) on the image. In addition, the wavelet algorithms generate detailed approximations (error signals) with high pass digital filtering or convolution on the image. This decomposition process can be continued as far down the pyramid as a designer requires where each step in the pyramid has a sample rate reduction of two. This technique is also known as spatial sample rate decimation or down sampling of the image where the resolution is one half in the next sub-band of the pyramid.
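The pyramid decomposition described above can be illustrated with a minimal sketch. It assumes the simplest possible filter pair, the Haar wavelet (a two-tap average for the low pass and a two-tap difference for the high pass); the source does not name a specific filter, and the function names are illustrative only. The sketch works on a single row of samples; for an image the same step would be applied along rows and then columns.

```python
def haar_step(signal):
    """One pyramid level: low-pass and high-pass filter, keeping every second sample."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / 2 for i in pairs]  # smooth approximation
    detail = [(signal[i] - signal[i + 1]) / 2 for i in pairs]  # error signal
    return approx, detail


def haar_pyramid(signal, levels):
    """Repeat the step on the approximation; each level halves the sample count."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


row = [8, 6, 4, 2, 5, 5, 1, 3]
approx, details = haar_pyramid(row, 2)
print(approx)   # [5.0, 3.5]
print(details)  # [[1.0, 1.0, 0.0, -1.0], [2.0, 1.5]]
```

Each level keeps half as many approximation samples as the one before it, which is the sample rate reduction of two mentioned above.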
In vector quantization (VQ), algorithms are used with efficient codebooks. A single frame from a video stream is divided into macroblocks of 8x8 or 16x16 pixels, each macroblock thus has either 64 or 256 states which are input to a codebook (look-up table) to produce N unique output codes where N is much less than 64. The VQ algorithm codebooks are based on macroblocks (8 x 8 or 16 x 16) to compress image data. These algorithms also have efficient implementations. However, they suffer from blocking artifacts (tiling) at low bit rates (high compression ratio). The codebooks have a few codes to represent a multitude of bit patterns where fewer bits are allocated to the bit patterns in a macroblock with the highest probability.
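A codebook look-up of this kind can be sketched as follows. This is an illustration under stated assumptions, not the codec of any particular system: the blocks are 2 x 2 rather than 8 x 8 or 16 x 16 to keep the example short, the codebook entries are invented, and the nearest entry is chosen by squared error.

```python
def nearest_code(block, codebook):
    """Return the index of the codebook entry closest to the block (squared error)."""
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(block, entry))
    return min(range(len(codebook)), key=lambda i: distance(codebook[i]))


def encode(blocks, codebook):
    """Replace every block with the index of its nearest codebook entry."""
    return [nearest_code(block, codebook) for block in blocks]


def decode(indices, codebook):
    """Reconstruct an approximation of the blocks by table look-up."""
    return [codebook[i] for i in indices]


# Hypothetical codebook of flattened 2 x 2 blocks.
codebook = [
    [0, 0, 0, 0],          # dark block
    [255, 255, 255, 255],  # bright block
    [0, 255, 0, 255],      # vertical edge
]
blocks = [[10, 5, 0, 12], [250, 240, 255, 251], [3, 250, 8, 245]]
indices = encode(blocks, codebook)
print(indices)  # [0, 1, 2]
```

Only the indices are transmitted, so every pixel in a block is reconstructed from the same table entry; this is one way to see why the approach shows tiling when the codebook is small.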
As discussed earlier, these current techniques are limited when applied to third generation compression requirements, that is, compression ratios approaching 1000:1.
That is, wavelet and vector quantization techniques as discussed above still focus on compressing single frames or images which may not be the most effective for third generation compression requirements.
A vastly improved compression technique over the current techniques is described in a pending Canadian Patent Application filed July 9, 1999 and titled "Multi-Dimensional Data Compression", also assigned to the subject applicants. The technique as described in this application may be applied to video data signals. The method comprises the steps of selecting a sequence of image frames in a video stream, applying a three dimensional transform to the selected sequence to produce a transformed output, and then encoding the transformed output to produce a compressed stream output.
Given the current Internet architecture and compression, there is a need for a multicast media over Internet protocol (MMIP) caching bandwidth manager implemented in an edge server which uses the above described compression techniques for efficiently streaming media over the client access link and which adapts to the client access bandwidth.
SUMMARY OF THE INVENTION
In accordance with this invention there is provided a method for managing scalable compression of multicast or unicast media over an Internet protocol, between a media server and a client, the method comprising the steps of:
(a) determining the client access bandwidth statistics and client hardware capabilities;
(b) determining a data buffer status of the client;
(c) generating a quantization mask in response to the buffer status;
(d) applying the quantization mask to an array of transformed frame data for each frame in a sequence, or group of frames in 3D case to produce quantized data;
(e) performing entropy coding on the quantized data to produce a bit stream; and
(f) transmitting the bit stream to the client.
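Steps (c) to (e) above can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the source: the buffer status is a single fill fraction between 0.0 and 1.0, the mask is a list of step sizes that grow with coefficient index, a low buffer leads to coarser steps, and zlib (DEFLATE) stands in for the entropy coder. All function names are hypothetical.

```python
import struct
import zlib


def quantization_mask(buffer_fill, size):
    """Step (c): derive per-coefficient step sizes from the client buffer fill.

    Assumption: an empty buffer (0.0) calls for coarse steps and fewer bits,
    a full buffer (1.0) allows fine steps.
    """
    base = 1 + round((1.0 - buffer_fill) * 15)  # 1 when full, 16 when empty
    return [base * (i + 1) for i in range(size)]


def quantize(coefficients, mask):
    """Step (d): divide each transformed coefficient by its step, truncating toward zero."""
    return [int(c / q) for c, q in zip(coefficients, mask)]


def entropy_code(quantized):
    """Step (e): stand-in entropy coder, DEFLATE over 16-bit little-endian values."""
    raw = struct.pack(f"<{len(quantized)}h", *quantized)
    return zlib.compress(raw)


# Hypothetical transformed frame data, ordered from low to high frequency.
coefficients = [120.0, -48.0, 30.0, 9.0, -4.0, 2.0, 1.0, 0.5]
full = quantize(coefficients, quantization_mask(1.0, len(coefficients)))
starved = quantize(coefficients, quantization_mask(0.0, len(coefficients)))
bitstream = entropy_code(starved)
print(full)     # [120, -24, 10, 2, 0, 0, 0, 0]
print(starved)  # [7, -1, 0, 0, 0, 0, 0, 0]
```

With the coarser mask more coefficients collapse to zero, which is what lets the entropy coder emit a shorter bit stream for a client whose buffer is running low.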
BRIEF DESCRIPTION OF THE DRAWINGS
These and other features of the preferred embodiments of the invention will become more apparent in the following detailed description in which reference is made to the appended drawings wherein:
Figure 1 is a schematic diagram of an Internet architecture;
Figure 2 is a schematic system diagram of an edge or gateway server located at an ISP according to an embodiment of the invention;
Figure 3 is a schematic system diagram of a client according to an embodiment of the invention;
Figure 4 is a schematic functional block diagram of a caching bandwidth manager of figure 2;
Figure 5 is a schematic functional block diagram of a client player of figure 3;
Figure 6 is a schematic flow diagram of the bandwidth manager operation;
Figure 7 is a schematic flow diagram of a bandwidth optimizer according to an embodiment of the present invention;
Figure 8 is a schematic flow diagram of a quantizer according to an embodiment of the present invention;
Figure 9 is a schematic diagram of a real-time frame by frame quantizer for a wavelet case; and
Figure 10 is a schematic diagram of a real-time frame by frame quantizer for a 3D spectral case.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
In the following description, like numerals refer to like structures in the drawings.
Referring to figure 1, a general Internet architecture as it currently exists is shown generally by numeral 20. The architecture comprises a backbone network 22, which is defined as the interconnection equipment concerned with connecting local web sites to local POPs, and an access network 24, which is defined as the interconnection between the local POPs and the consumers. The web sites 26 host both digital and analog content from various content providers 28, which are in turn connected via a global and national Internet infrastructure 30 to the local access Internet infrastructure 32. The consumer or viewer 34 connects to the national Internet infrastructure 30, i.e. at the POP, by one or more access links 33. Cache sites 36 are provided between the global and national Internet infrastructure 30 and the local access Internet infrastructure 32.
The cache sites 36 are normally the demarcation between the backbone network 22 and the access network 24. Backbone web sites 26 typically do not consider the needs of the various types of access 33 employed by clients 34 and the various qualities of access links in their consideration of web content. It is normally the responsibility of the web content provider 28 to customize the web site content for different access links.
Therefore the present invention leverages off the existing architecture, but implements a content compression architecture that uses technology in the data link level to application level of the ISO model to optimize the access from the local POP to the customer 34.
Referring to figure 2, a system block diagram of the local access server 37 is shown, while figure 3 shows a system block diagram of a client or consumer 34 according to an embodiment of the present invention. In this case the local access link 32 is a 56k modem.
Referring to figure 4, a schematic diagram of the functional blocks of a multicast media over an Internet protocol (MMIP) caching bandwidth manager implemented in the local access server or edge server 37 is indicated generally at numeral 100. The manager 100 functions as a network edge server/gateway and is compliant with the IETF working groups' protocol recommendations for streaming media. The manager 100 includes an m-dimension wavelet or spectral video translator module 102, an audio translator module 104, an access protocol optimization module 106, a cache database manager module 108, and a server management module 110.
In general the function performed by the manager 100 is to stream media over IP to a client player, shown in the schematic system diagram of figure 3 and the corresponding functional diagram of figure 5, by implementing a controlled compression or extended client filtering. In controlled compression the manager 100 receives compressed video from standard COTS video servers and then translates the video stream using an m-dimensional wavelet or spectral codec. The manager 100 is also capable of receiving m-dimensional compressed video streams over IP from an originating media server in accordance with an embodiment of the present invention.
The caching bandwidth manager 100 receives streamed MPEG data compliant with RTP and sends translated streamed media data in wavelet format with header compression that is optimized for the access link 33. In addition the manager receives end user access configuration data in the form of client video/audio capabilities, access link capabilities, and media requests. The caching bandwidth manager utilizes the configuration data to send MPEG server requests to indicate when a user requests media from the WWW.
The caching bandwidth manager 100 performs the functions of an edge server or traditional gateway for converting protocols between the backbone network and the access network.
Each of the elements of the caching manager as shown in figure 4 will be described in detail below. Thus, referring back to figure 4, the m-dimensional wavelet video translator module 102 receives streamed MPEG video data 120 from an MPEG media server (not shown) that is preferably compliant with RFC2250. The MPEG data is decompressed back into the luminance and chrominance frame data values. This uncompressed frame data is then re-compressed using controlled compression according to an embodiment of the present invention, as described below. All the wavelet significance maps or sub-bands defined by the wavelet pyramidal decomposition, as described in the copending Canadian Patent application titled "Multi-Dimensional Data Compression", incorporated herein by reference, are stored as compressed video data 122 in a media database 124 along with timestamp headers that are compliant with the standards definition in RTP. The wavelet video translator module 102 is enabled by a video translation control signal 126 generated by the edge server when the access configuration data indicates that there is an MPEG media request being made by the end user.
The wavelet stream is stored in the media database 124 as N level deep pyramidal multiresolution sub-bands coded with the embedded zerotrees of wavelet coefficients (EZW) compression algorithm. The highly correlated lowest resolution significance map or sub-band in the tree is processed with an algorithm such as the Discrete Cosine Transform (DCT), and then entropy coded with Huffman coding or arithmetic coding. The EZW algorithm is used to code all the other sub-bands, or children, of the lowest resolution sub-band that is coded by the DCT. This technique will result in efficient compression of the principal components of the video stream by a method that closely approximates the optimal Karhunen-Loeve transformation.
The Audio Translator module 104 receives streamed MPEG audio data from the Media Server that is compliant with RFC2250. The compressed audio stream may be uncompressed back to sample data values and this uncompressed stream efficiently re-compressed with timestamps using the audio wavelet codec. The audio translator module 104 sends compressed audio data to the media database for efficient streaming to the client player over the access link.
An efficient application of the EZW transformation algorithm is in providing progressive video over various access link bandwidths. The Access Protocol Optimization module 106 uses access optimization protocols, such as controlled compression and extended client filtering, to read Compressed Media Data from the Media Database and to stream Media Data in the form of time synchronized media wavelet coefficients to the Client Player based on the available bandwidth and the MMIP Client Player configuration. For low speed access links, the PPP (RFC 1661) and the Serial Line Internet Protocol (SLIP) shall be supported with 10:1 IPv4 header compression compliant to RFC1144 and RFC2508/RFC2509 respectively. These IETF recommendations discuss lossless header compression algorithms to reduce the redundancies in the header addresses and timestamps by using difference products. RFC2507, which covers header compression for non-serial links such as mobile IP, will result in approximately 15:1 compression of the IPv6 header information. As an example, at 50 packets per second, the UDP/IPv4 headers consume 11.2 kbits/s and UDP/IPv6 headers consume 19.2 kbits/s when uncompressed. Using RFC2507, the overhead can be reduced to approximately 1.7 kbits/s. In addition, up to 2:1 lossless compression can be achieved for the packet payload if the web data is not already in an encrypted or compressed format.
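The header overhead figures quoted above can be checked with a short calculation. The sketch assumes the standard fixed header sizes (20 bytes for IPv4 without options, 40 bytes for IPv6, 8 bytes for UDP); the function names are illustrative only.

```python
IPV4_HEADER = 20  # bytes, no options
IPV6_HEADER = 40  # bytes, fixed header
UDP_HEADER = 8    # bytes


def header_overhead_kbps(header_bytes, packets_per_second):
    """Bandwidth consumed by headers alone, in kbits/s."""
    return header_bytes * 8 * packets_per_second / 1000


def implied_header_bytes(overhead_kbps, packets_per_second):
    """Average header size per packet implied by a given overhead."""
    return overhead_kbps * 1000 / (8 * packets_per_second)


ipv4 = header_overhead_kbps(IPV4_HEADER + UDP_HEADER, 50)
ipv6 = header_overhead_kbps(IPV6_HEADER + UDP_HEADER, 50)
print(ipv4)  # 11.2
print(ipv6)  # 19.2
print(implied_header_bytes(1.7, 50))  # about 4.25 bytes per packet
```

On these assumptions the quoted 1.7 kbits/s corresponds to an average compressed header of roughly four bytes per packet, against 28 or 48 bytes uncompressed.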
An algorithm is used here to detect if the compression of the web data results in data expansion. Data expansion is typical when a compression algorithm is applied to encrypted data. The Access Protocol Optimization module supports RTP according to RFC1889 and RTSP according to RFC2326 to stream media to the Client Player.
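The source does not specify how data expansion is detected. One simple possibility, sketched here as an assumption, is to compress the payload and fall back to the original bytes whenever the result is not smaller. The sketch uses zlib as a stand-in lossless coder and random bytes as a stand-in for encrypted data; the function names are hypothetical.

```python
import os
import zlib


def compress_if_smaller(payload):
    """Return (compressed_flag, data); compress only when it actually saves bytes."""
    packed = zlib.compress(payload)
    if len(packed) < len(payload):
        return True, packed
    return False, payload  # expansion detected: send the payload unchanged


def restore(compressed_flag, data):
    """Undo compress_if_smaller on the receiving side."""
    return zlib.decompress(data) if compressed_flag else data


text = b"streaming media over IP " * 64
flag, data = compress_if_smaller(text)
print(flag, len(text), len(data))

noise = os.urandom(1024)  # behaves like encrypted data: no redundancy to remove
flag_noise, data_noise = compress_if_smaller(noise)
print(flag_noise, len(noise), len(data_noise))
```

The one-bit flag would have to travel with the payload so that the receiver knows whether to decompress.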
The Access Protocol Optimization module 106 implements algorithms to efficiently perform Controlled Compression and Extended Client Filtering in media streaming to the MMIP Client Player over the access link bandwidth. The algorithms implemented include the following functions:
1) Stream the multiresolution subbands to the Client Player utilizing RTP/UDP/IP.
2) Stream the multiresolution subbands to the Client Player utilizing RTP/TCP/IP for streaming of media through firewalls that do not support UDP port assignments.
3) Stream only M of a total of N multiresolution subbands starting from the lowest resolution, where M ≤ N. The algorithm for choosing which M subbands to stream to the Client Player is based on the access infrastructure and bandwidth available (e.g. twisted pair, wireless, satellite) to maintain the best subjective image quality according to the human visual system's logarithmic sensitivity to light intensity and sensitivity to abrupt spatial changes.
4) Selectively stream only an area of each of the M subbands in 3). The algorithm will determine the shape of the area to stream of the M subbands and vary it from all the lowest subband coefficients to a small geometrical area in the center of the highest resolution subband. The algorithm to control the rate of change of the geometrical area from low to high resolution subbands in the center of the image area is optimized based on the access infrastructure, bandwidth available, and client capability.
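Function 3) above can be sketched as a simple budget calculation. This is only an illustration: the source bases the choice on the access infrastructure and on perceptual criteria, whereas the sketch assumes a plain greedy rule that adds subbands, lowest resolution first, until the per-frame bit budget of the access link is used up. The subband sizes and the function name are hypothetical.

```python
def choose_subband_count(subband_bits, frames_per_second, bandwidth_bps):
    """Return M, the number of subbands (lowest resolution first) that fit the link."""
    budget = bandwidth_bps / frames_per_second  # bits available per frame
    used = 0
    count = 0
    for bits in subband_bits:
        if used + bits > budget:
            break
        used += bits
        count += 1
    return count


# Hypothetical per-frame sizes in bits for N = 4 subbands, lowest resolution first.
subband_bits = [1_000, 2_500, 8_000, 30_000]

print(choose_subband_count(subband_bits, 10, 56_000))   # 2
print(choose_subband_count(subband_bits, 10, 512_000))  # 4
```

At 10 frames per second a 56 kbit/s link leaves 5,600 bits per frame, enough for the two lowest subbands in this example, while a 512 kbit/s link can carry all four.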
An approach to address this heterogeneous connection rate environment is to create several files so that when a client connects, the server streams the appropriate file.
This has been referred to as "bandwidth negotiation". This process is not dynamic, so if a user's actual S throughput changes due to congestion or packet loss, the server cannot adjust. Another difficulty is the increased labor required for coding and then managing the media clip for different bandwidths. The Real Networks solution to these problems in its most recent incarnation is to provide an encoding framework for combining multiple data streams, each at different bit rates into a single file. A sophisticated client server mechanism is provided for detecting changes in bandwidth and translating those changes into combinations of different streams.
While the above attempts to address the solution of bandwidth negotiation and stream thinning, it still suffers from the limitation in that multiple streams corresponding to different bit rates must still be composed at the server end. For example, if ten different streams are to be composed each ranging from 1 megabit per second to kilobits per second, then all ten streams are composed at the server end on the backbone to the POP. Thus, a 1.8-megabit per second stream is sent down the backbone.
At the POP, ten different caches are now required. The POP then forwards the appropriate bit stream to the user depending on the user's access capability. It may be seen that in this solution, the user is provided with a relatively consistent stream. However, it still does not alleviate the problem of backbone congestion since multiple streams must all be transmitted along the backbone.
An improvement in the current architecture is described in the subject applicants pending Canadian application Serial no. 2,272,590 filed May 21, 1999 and titled "System and Method for Streaming Media over an Internet Protocol". In this architecture communication between a client and a continuos media server is implemented by the media server composing data to be transmitted into a backbone common format;
the server transmitting the backbone common format data to the client POP;
converting at the POP the backbone common format data into a plurality of access common format data for transmission to ones of a plurality of clients. In this system a single high quality data stream may be transmitted to an edge server. The edge server or POP filters the data to the respective client bandwidth capabilities. In a further embodiment of this architecture the edge server may also utilize trans-compression techniques to adapt the received data for filtering to the respective client bandwidth capabilities.
Traditionally, image compression methods may be classified as those methods, which reproduce the original data exactly, that is, "lossless compression" and those, which trade a tolerable divergence from the original data for greater compression, that is, "lossy compression". Typically, lossless methods have a problem that they are unable to achieve a compression of much more than 70%. Therefore, where higher compression ratios are needed, lossy techniques have been developed. In general, the amount by which the original media source is reduced is referred to as the compression ratio.
Compression technologies have evolved over time to adapt to the various user requirements. Historically, compression technology focused on telephony, where sound wave compression algorithms were developed and optimized. These algorithms all implemented a one-dimensional (1D) transformation, which increased the 1D
entropy of the data in the transformed domain to allow for efficient quantization and 1D
data coding.
Compression technologies then focused on two-dimensional (2D) data such as images or pictures. At first, the 1D audio algorithms were applied to the line data of each image to build up a compressed image. Research then progressed to the point today where the 1D algorithms have been extended to implement a two dimensional (2D) transformation, which increases the 2D entropy to allow for efficient quantization and 2D
data coding.
Currently, state of the art technology requires compression of moving pictures or video. In this area, research is focused on applications of 2D image coding algorithms to a multitude of images which comprise video (frames) and apply motion compensation techniques to take advantage of correlation between frame data. For example, United States Patent No. RE 36015, re-issued December 29, 1998, describes a video compression system which is based on the image data compression system developed by the motion picture experts group (MPEG) which uses various groups of field configurations to reduce the number of binary bits used to represent a frame composed of odd and even fields of video information.
S
In general, MPEG systems integrate a number of well known data compression techniques into a single system. These include motion compensated predictive coding, discrete cosine transformation (DCT), adaptive quantization and variable length coding (VLC). The motion compensated predictive coding scheme processes the video data in groups of frames in order to achieve relatively high levels of compression without allowing the performance of the system to be degraded by excessive error propagation.
In these group of frame processing schemes, image frames are classified into one of three types: the intraframe (I-Frame), the predicted frame (P-Frame) and the bi-directional frame (B-Frame). A 2D DCT is applied to small regions such as blocks of 8 x 8 pixels to encode each of the I-Frames. The resulting data stream is quantized and encoded using a variable length code such as amplitude run length Huffman code to produce the compressed output signal. As may be seen, this quantization technique still focuses on compressing single frames or images which may not be the most effective means of compression for current multimedia requirements. Also, for low bit rate applications, MPEG suffers from 8 x 8 blocking artifacts known as tiling. Furthermore, these second-generation compression approaches as described above, have reduced the media of data requirements for video by as much as 100:1. Typically, these technologies are focused on the following approaches: wavelet algorithms and vector quantization.
The wavelet algorithms are implemented with efficient significance map coding such as EZW and line detection with gradient vectors depending on the application's final reconstructed resolution. The wavelet algorithms operate on the entire image and have efficient implementation due to finite impulse response (FIR) filter realizations. All wavelet algorithms decompose an image into coarser, smooth approximations with low pass digital filtering (convolution) on the image. In addition, the wavelet algorithms generate detailed approximations (error signals) with high pass digital filtering or convolution on the image. This decomposition process can be continued as far down the pyramid as a designer requires where each step in the pyramid has a sample rate reduction of two. This technique is also known as spatial sample rate decimation or down sampling of the image where the resolution is one half in the next sub-band of the pyramid.
In vector quantization (VQ), algorithms are used with efficient codebooks. A
single frame from a video stream is divided into macroblocks of 8x8 or 16x16 pixels, each macroblock thus has either 64 or 256 states which are input to a codebook (look-up table) to produce N unique output codes where N is much less than 64. The VQ
algorithm codebooks are based on macroblocks (8 x 8 or 16 x 16) to compress image data. These algorithms also have efficient implementations. However, they suffer from blocking artifacts (tiling) at low bit rates (high compression ratio). The codebooks have a few codes to represent a multitude of bit patterns where fewer bits are allocated to the bit patterns in a macro block with the highest probability.
As discussed earlier, these current techniques are limited when applied to third generation compression requirements, that is, compression ratios approaching 1000:1.
That is, wavelet and vector quantization techniques as discussed above still focus on compressing single frames or images which may not be the most effective for third generation compression requirements.
A vastly improved compression technique over the current techniques is described in a pending Canadian Patent Application filed July 9, 1999 and titled "Multi-Dimensional Data Compression", also assigned to the subject applicants. The technique as described in this application may be applied to video data signals. The method comprises the steps of selecting a sequence of image frames in a video stream, applying a three dimensional transform to the selected sequence to produce a transformed output, and then encoding the transformed output to produce a compressed stream output.
Given the current Internet architecture and compression, there is a need for a multicast media over Internet protocol (MMIP) caching bandwidth manager implemented in an edge server which uses the above described compression techniques for efficiently streaming media over the client access link and which adapts to the client access bandwidth.
SUMMARY OF THE INVENTION
In accordance with this invention there is provided a method for managing scalable compression of multicast or unicast media over an Internet protocol, between a media server and a client, the method comprising the steps of:
(a) determining the client access bandwidth statistics and client hardware capabilities;
(b) determining a data buffer status of the client;
(c) generating a quantization mask in response to the buffer status;
(d) applying the quantization mask to an array of transformed frame data for each frame in a sequence, or group of frames in the 3D case, to produce quantized data;
(e) performing entropy coding on the quantized data to produce a bit stream; and
(f) transmitting the bitstream to the client.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other features of the preferred embodiments of the invention will become more apparent in the following detailed description in which reference is made to the appended drawings wherein:
Figure 1 is a schematic diagram of an Internet architecture;
Figure 2 is a schematic system diagram of an edge or gateway server located at an ISP according to an embodiment of the invention;
Figure 3 is a schematic system diagram of a client according to an embodiment of the invention;
Figure 4 is a schematic functional block diagram of a caching bandwidth manager of figure 2;
Figure 5 is a schematic functional block diagram of a client player of figure 3;
Figure 6 is a schematic flow diagram of the bandwidth manager operation;
Figure 7 is a schematic flow diagram of a bandwidth optimizer according to an embodiment of the present invention;
Figure 8 is a schematic flow diagram of a quantizer according to an embodiment of the present invention;
Figure 9 is a schematic diagram of a real-time frame by frame quantizer for a wavelet case; and
Figure 10 is a schematic diagram of a real-time frame by frame quantizer for a 3D spectral case.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
In the following description, like numerals refer to like structures in the drawings.
Referring to figure 1, a general Internet architecture as it currently exists is shown generally by numeral 20. The architecture comprises a backbone network 22, which is defined as the interconnection equipment concerned with connecting local web sites to local POPs, and an access network 24 that is defined as the interconnection between the local POPs and the consumers. The Web sites 26 host both digital and analog content from various Content providers 28, which are in turn connected via a global and national Internet infrastructure 30 to the local access Internet infrastructure 32. The consumer or viewer 34 connects to the national Internet infrastructure 30, i.e. at the POP, by one or more access links 33. Cache sites 36 are provided between the global and national Internet infrastructure 30 and the local access Internet infrastructure 32.
The cache sites 36 are normally the demarcation between the backbone network 22 and the access network 24. Backbone web sites 26 typically do not consider the needs of the various types of access 33 employed by clients 34, or the various qualities of access links, in their consideration of web content. It is normally the responsibility of the web content provider 28 to customize the web site content for different access links.
Therefore the present invention leverages off the existing architecture, but implements a content compression architecture that uses technology in the data link level to application level of the ISO model to optimize the access from the local POP to the customer 34.
Referring to figure 2, a system block diagram of local access server 37 is shown, while in figure 3, a system block diagram of a client or consumer 34 according to an embodiment of the present invention is shown. In this case the local access link 32 is a 56k modem.
Referring to figure 4, a schematic diagram of the functional blocks of a multicast media over an Internet protocol (MMIP) caching bandwidth manager implemented in the local access server or edge server 37 is indicated generally at numeral 100. The manager 100 functions as a network edge server/gateway and is compliant with the IETF working groups' protocol recommendations for streaming media. The manager 100 includes an m-dimension wavelet or spectral video translator module 102, an audio translator module 104, an access protocol optimization module 106, a cache database manager module 108, and a server management module 110.
In general the function performed by the manager 100 is to stream media over IP to a client player, shown in the schematic system diagram of figure 3 and the corresponding functional diagram of figure 5, by implementing a controlled compression or extended client filtering. In controlled compression the manager 100 receives compressed video from standard COTS video servers and then translates the video stream using an m-dimensional wavelet or spectral codec. The manager 100 is also capable of receiving m-dimensional compressed video streams over IP from an originating media server in accordance with an embodiment of the present invention.
The caching bandwidth manager 100 receives streamed MPEG data compliant with RTP and sends translated streamed media data in wavelet format with header compression that is optimized for the access link 33. In addition the manager receives end user access configuration data in the form of client video/audio capabilities, access link capabilities, and media requests. The caching bandwidth manager utilizes the configuration data to send MPEG server requests to indicate when a user requests media from the WWW.
The caching bandwidth manager 100 performs the functions of an edge server or traditional gateway for converting protocols between the backbone network and the access network.
Each of the elements of the caching manager as shown in figure 4 will be described in detail below. Thus, referring back to figure 4, the m-dimensional wavelet video translator module 102 receives streamed MPEG video data 120 from an MPEG media server (not shown) that is preferably compliant with RFC2250. The MPEG data is decompressed back into the luminance and chrominance frame data values. This uncompressed frame data is then re-compressed using controlled compression according to an embodiment of the present invention, as described below. All the wavelet significance maps or sub-bands defined by the wavelet pyramidal decomposition, as described in copending Canadian Patent application titled "Multi-Dimensional Data Compression", incorporated herein by reference, are stored as compressed video data 122 in a media database 124 along with timestamp headers that are compliant with the standards definition in RTP. The wavelet video translator module 102 is enabled by a video translation control signal 126 generated by the edge server when the access configuration data indicates that there is an MPEG media request being made by the end user.
The wavelet stream is stored in the media database 124 as N level deep pyramidal multiresolution sub-bands coded with the embedded zerotrees of wavelet coefficients (EZW) compression algorithm. The highly correlated lowest resolution significance map or sub-band in the tree is processed with an algorithm such as the Discrete Cosine Transform (DCT), and then entropy coded with Huffman coding or arithmetic coding. The EZW algorithm is used to code all the other subbands, or children, of the lowest resolution subband that is coded by the DCT. This technique will result in efficient compression of the principal components of the video stream by a method that closely approximates the optimal Karhunen-Loeve transformation.
The Audio Translator module 104 receives streamed MPEG audio data from the Media Server that is compliant with RFC2250. The compressed audio stream may be uncompressed back to sample data values, and this uncompressed stream is then efficiently re-compressed with timestamps using the audio wavelet codec. The audio translator module 104 sends compressed audio data to the media database for efficient streaming to the client player over the access link.
An efficient application of the EZW transformation algorithm is in providing progressive video over various access link bandwidths. The Access Protocol Optimization module 106 uses access optimization protocols, such as controlled compression and extended client filtering, to read Compressed Media Data from the Media Database and to stream Media Data, in the form of time synchronized media wavelet coefficients, to the Client Player based on the available bandwidth and the MMIP Client Player configuration. For low speed access links, the PPP (RFC1661) and the Serial Line Internet Protocol (SLIP) shall be supported with 10:1 IPv4 header compression compliant to RFC1144 and RFC2508/RFC2509 respectively. These IETF recommendations discuss lossless header compression algorithms that reduce the redundancies in the header addresses and timestamps by using difference products. RFC2507, for header compression on non-serial links such as mobile IP, will result in approximately 15:1 compression of the IPv6 header information. As an example, at 50 packets per second, the UDP/IPv4 headers consume 11.2 kbits/s and the UDP/IPv6 headers consume 19.2 kbits/s when uncompressed. Using RFC2507, the overhead can be reduced to approximately 1.7 kbits/s. In addition, up to 2:1 lossless compression can be achieved for the packet payload if the web data is not already in an encrypted or compressed format.
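The uncompressed header figures quoted above follow from the standard header sizes (20 bytes for IPv4 without options, 40 bytes for IPv6, 8 bytes for UDP). A short check of the arithmetic, with a hypothetical helper name:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
IPV6_HEADER_BYTES = 40  # fixed IPv6 header
UDP_HEADER_BYTES = 8


def header_overhead_kbits(ip_header_bytes, packets_per_second):
    """Bandwidth consumed by the IP and UDP headers alone, in kbits/s."""
    bits_per_packet = (ip_header_bytes + UDP_HEADER_BYTES) * 8
    return bits_per_packet * packets_per_second / 1000


ipv4_overhead = header_overhead_kbits(IPV4_HEADER_BYTES, 50)  # 28 bytes per packet
ipv6_overhead = header_overhead_kbits(IPV6_HEADER_BYTES, 50)  # 48 bytes per packet
```

On a 56k access link, 11.2 kbits/s of header overhead would be roughly a fifth of the nominal capacity, which is why header compression matters for this class of link.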
An algorithm is used here to detect if the compression of the web data results in data expansion. Data expansion is typical when a compression algorithm is applied to encrypted data. The Access Protocol Optimization module supports RTP according to RFC1889 and RTSP according to RFC2326 to stream media to the Client Player.
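The text does not specify the expansion detection algorithm. One simple possibility, shown here as a sketch, is to compress the payload and keep the result only when it is actually smaller. The use of zlib and the one-flag framing are assumptions made for illustration.

```python
import os
import zlib


def compress_payload(payload):
    """Return (flag, data); the flag is True only when compression shrank the payload."""
    compressed = zlib.compress(payload)
    if len(compressed) < len(payload):
        return True, compressed
    return False, payload  # expansion detected: send the original bytes


def restore_payload(flag, data):
    """Undo compress_payload on the receiving side."""
    return zlib.decompress(data) if flag else data


text_flag, text_data = compress_payload(b"streaming media " * 64)
# Random bytes stand in for encrypted or already compressed data.
random_payload = os.urandom(4096)
random_flag, random_data = compress_payload(random_payload)
```

Repetitive data shrinks and is sent compressed, while the random payload would grow under compression and is therefore passed through unchanged.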
The Access Protocol Optimization module 106 implements algorithms to efficiently perform Controlled Compression and Extended Client Filtering in media streaming to the MMIP Client Player over the access link bandwidth. The algorithms implemented include the following functions:
1) Stream the multiresolution subbands to the Client Player utilizing RTP/UDP/IP.
2) Stream the multiresolution subbands to the Client Player utilizing RTP/TCP/IP for streaming of media through firewalls that do not support UDP port assignments.
3) Stream only M of a total of N multiresolution subbands starting from the lowest resolution, where M ≤ N. The algorithm for which M subbands to stream to the Client Player is based on access infrastructure and bandwidth available (i.e. twisted pair, wireless, satellite) to maintain the best subjective image quality according to the human visual system's logarithmic sensitivity to light intensity and sensitivity to abrupt spatial changes.
4) Selectively stream only an area of each of the M subbands in 3). The algorithm will determine the shape of the area of the M subbands to stream and vary it from all the lowest subband coefficients to a small geometrical area in the center of the highest resolution subband. The algorithm to control the rate of change of the geometrical area from low to high resolution subbands in the center of the image area is optimized based on the access infrastructure, bandwidth available, and client capability.
5) Selectively eliminate the highest resolution subbands in 4) as the geometrical area approaches a single coefficient in the significance map that will enhance the coding efficiency of the EZW algorithm.
6) Gradual spatial subsampling by decimation of the subbands in 5) to stream data from CIF to QCIF, etc. to maintain full motion video over the access link's available bandwidth.
7) YUV color space subsampling to reduce the compressed data rate and to approach a gray scale video while retaining full frame rate video (Y, Cr, Cb subband map tagging).
8) Variable number of frames between key frames based on scene information and use of motion compensation between frames to deliver progressive video between key frames when scene does not change.
9) Frame rate decimation below 30 Hz to deliver gray scale video over low speed access infrastructures (i.e. Personal Digital Assistants).
10) Congestion algorithms optimized for the access infrastructure such as graceful degradation of video while maintaining the audio quality and audio degradation only after the frame rate is zero (RFC2001/RFC2581 identifies TCP congestion control algorithms).
11) IP Packet length Segmentation and Reassembly (SAR) algorithms optimized for the access link bandwidth (i.e. approximately 700 bytes).
12) Optimization algorithms such as header compression and payload compression operating over layers 2 to 7 of the OSI 7 layer network model to optimize the access link bandwidth.
13) Client player buffer size and buffer fullness continuous feedback loop with an algorithm to optimize streaming to the client to avoid overflow or underflow (i.e. 15 second client player buffer depth for a client configured with 64 Meg RAM 80% free).
14) Scene change detection with variable key frames between video frames (1-90). Use of a Distributed Keyframe to balance loading over access links with lower overall burstiness of traffic and lower delay to display scene change. This provides for progressive video for talk shows, distance education, etc.
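Function 3) in the list above, streaming only M of N subbands, can be sketched as a simple budget check. The per-subband bit rates and the greedy rule are hypothetical; the text leaves the actual selection algorithm, which also weighs subjective quality, unspecified.

```python
def select_subbands(subband_kbits, available_kbits):
    """Return M, the number of subbands (lowest resolution first) that fit the budget.

    The lowest-resolution subband is always kept so that some picture is delivered.
    """
    total = 0.0
    m = 0
    for rate in subband_kbits:
        if m > 0 and total + rate > available_kbits:
            break
        total += rate
        m += 1
    return m


# Hypothetical bit rates in kbits/s for N = 5 subbands, lowest resolution first.
SUBBAND_KBITS = [4, 8, 16, 32, 64]
m_modem = select_subbands(SUBBAND_KBITS, 44)       # budget left on a 56k link
m_broadband = select_subbands(SUBBAND_KBITS, 1000)
```

With these example rates the 44 kbits/s budget admits the three lowest-resolution subbands, and the larger budget admits all five.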
Referring to figure 6, a flow chart showing the operation of the caching bandwidth manager 100 is shown generally by numeral 200. Normally the process begins with the edge server receiving a request from the client for streaming media content 202. The caching bandwidth manager runs two loops, a non-real-time loop 204 and a real-time loop 206. In the non-real-time loop, the bandwidth manager determines the bandwidth characteristics of the client 208. The characteristics are used to generate a quantization mask 210 to be applied by the real-time loop 206 on a frame by frame basis, or group of frames in 3D cases, for creating a bitstream to the client. Thus in the real-time loop 206 (by real-time is generally meant time in the order of the frame rate) the bandwidth manager receives the transformed pixel data and performs coding thereon using the quantization mask 210. The bitstream generated by this process is then sent 212 to the client. The client performs the inverse operation 214 on the bitstream to generate an appropriate display.
Referring to Figure 7, a flow chart showing the operation of the bandwidth optimizer is indicated generally at numeral 400. Initially, an edge server receives a client request. The client request is buffered and the server determines the access bandwidth statistics of the client. The client buffer status is then determined, that is, whether the client buffer is approaching an underflow condition or an overflow condition. If the client buffer status indicates that the buffer is approaching an overflow condition, the bandwidth optimizer reduces video quality by changing the co-efficient array quantization. Spatial decimation and temporal decimation follow this. In addition the bandwidth optimizer may perform colorspace reduction and decrease subjective quality based on the human visual system sensitivity. Next the values in the array for the N x N quantization mask are updated and the coded bit rate to the client is decreased. The next step is then to update the new N x N quantization mask in the real time quantization process.
If the client buffer state is determined to be approaching an underflow condition, the bandwidth optimizer improves video quality by controlling the co-efficient array quantization. Next spatial interpolation and temporal interpolation are performed as above. Also color space expansion is performed, followed by an increase in subjective quality using a human visual system sensitivity profile. This is similar to the sensitivity profile described above for the overflow condition; however, in this case the optimizer moves in the direction to increase the quality of the image. Next the N x N quantization mask is updated to increase the coded bit rate to the client. It may be noted that the quantization mask is not restricted to N x N but may be N x M or N x M x Z.
In both instances, the outputs are used to update the new N x N quantization mask in the real-time quantization process.
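The two branches of the bandwidth optimizer can be sketched as a single mask update. Everything specific here is an assumption made for illustration: the buffer status is modelled as a fullness fraction, the thresholds of 0.2 and 0.8 are arbitrary, and quality is changed by clearing or restoring one low-order bit per 8-bit mask element. Spatial, temporal and color space adjustments are omitted.

```python
def update_mask(mask, fullness, low=0.2, high=0.8):
    """Return a new mask of 8-bit values adjusted for the client buffer fullness.

    Following the text, a buffer approaching overflow leads to coarser
    quantization and a buffer approaching underflow leads to finer quantization.
    """
    if fullness >= high:
        # Approaching overflow: clear one more low-order bit in every element.
        return [[(value << 1) & 0xFF for value in row] for row in mask]
    if fullness <= low:
        # Approaching underflow: restore one low-order bit in every element.
        return [[(value >> 1) | 0x80 for value in row] for row in mask]
    return [row[:] for row in mask]  # buffer is comfortable: keep the mask


current = [[0xFF, 0xFC], [0xF0, 0x00]]
coarser = update_mask(current, 0.9)
finer = update_mask(current, 0.1)
```

The function returns a new array rather than modifying the mask in place, so the real-time quantizer can keep using the old mask until the new one is handed over.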
Referring to Figure 8, a flow chart showing the real time quantization process is shown generally by numeral 500. It may be noted that the quantization process is repeated for each transformed frame, or group of frames in 3D cases. The quantization process begins by inputting transformed pixel data of size N x N. It is next determined whether a new real-time quantization mask is available. If so, the system updates the quantizer co-efficient mask from the bandwidth optimizer as described with reference to Figure 7. This quantization mask is then applied to the N x N array of transformed frame data as shown schematically in Figure 9. Next the quantizer performs entropy (for example arithmetic) coding on the quantized data; the resulting bit stream is then sent to the client.
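The per-frame flow just described (check for a new mask, apply it, entropy code the result) might be organised as follows. This is a sketch under stated assumptions: the class and method names are hypothetical, coefficients are taken to be 8-bit values, and zlib stands in for the arithmetic coder, which the Python standard library does not provide.

```python
import zlib


class FrameQuantizer:
    """Real-time side of the quantizer; the mask is supplied by a slower loop."""

    def __init__(self, mask):
        self.mask = mask
        self.pending_mask = None

    def offer_mask(self, mask):
        """Called by the non-real-time loop when a new mask is ready."""
        self.pending_mask = mask

    def encode(self, coefficients):
        """Quantize one N x N array of 8-bit coefficients and return coded bytes."""
        if self.pending_mask is not None:
            self.mask, self.pending_mask = self.pending_mask, None
        quantized = bytes(
            c & m
            for coeff_row, mask_row in zip(coefficients, self.mask)
            for c, m in zip(coeff_row, mask_row)
        )
        return zlib.compress(quantized)  # stand-in for arithmetic coding


quantizer = FrameQuantizer([[0xFF, 0xFF], [0xFF, 0xFF]])
frame = [[0x92, 0x13], [0x7F, 0x01]]
first = quantizer.encode(frame)
quantizer.offer_mask([[0xF0, 0xF0], [0xF0, 0xF0]])
second = quantizer.encode(frame)
```

The first frame is coded at full resolution; the second is coded with the low four bits of every coefficient cleared, because the new mask is picked up at the start of the next `encode` call.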
Referring to Figure 9, a schematic representation of a real-time frame by frame quantization with independent co-efficient quantization is shown for a 2D wavelet case.
As shown, an N x N quantization mask 604 is created and then applied to the data 602 on a frame by frame basis. The purpose of the quantization mask is real-time selective wavelet sub-band quantization, rejecting certain sub-bands or portions of sub-bands, for N x N, N x M or N x M x P cases. For example, an AND is performed on each element in the transformed coefficient array 602 with the corresponding element of the quantization mask 604. In this scenario the independent value of each quantization mask array element truncates the corresponding transformed coefficient value in the bit positions where the mask element is zero. The first array element in the quantization mask, represented as F4(Hex), when ANDed with the coefficient value of 92(Hex) results in a coefficient value of 90(Hex), or in other words less video resolution. On the other hand, for an increase in resolution, a value of FF(Hex) in the quantization mask will result in a coefficient value of 92(Hex).
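The two hexadecimal cases above can be checked bit by bit. The helper name is hypothetical, and 8-bit coefficients are assumed, as in the example values.

```python
def and_bits(coefficient, mask):
    """Return the masked coefficient and a binary rendering of the operation."""
    result = coefficient & mask
    return result, f"{coefficient:08b} & {mask:08b} = {result:08b}"


reduced, reduced_bits = and_bits(0x92, 0xF4)  # mask clears bits 0, 1 and 3
full, full_bits = and_bits(0x92, 0xFF)        # mask keeps every bit
```

Here `reduced_bits` is `10010010 & 11110100 = 10010000`: the only set bit of the coefficient that falls on a zero mask position is bit 1, so 92(Hex) becomes 90(Hex).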
Referring to Figure 10, a real time group of frames quantization with independent co-efficient quantization for a 3D spectral case is shown generally by numeral 700. In this case, rather than a 2D quantization mask being created, a 3D quantization mask block is created and applied to an entire sequence of frames.
The cache database manager module 108 receives Access Configuration Data from the user. When the Access Configuration Data indicates an MPEG media request has been made by the end user, the Cache Database Manager sends an MPEG Server request to the MPEG Media Server to enable Streamed MPEG Data to be sent over IP on the WWW. The Cache Database Manager is also capable of sending Media Requests to enable other Compressed Media Data comprised of audio, video, or data for transmission over IP on the WWW based on the Access Configuration Data received from the user.
Thus an embodiment of the present invention provides a method and system for optimizing the quality of media displayed at a client that is connected to an IP network.
Furthermore, although the invention has only been described in detail relating to media transmissions over an IP network, it may be extended to other forms of data transmission. For example, cable television companies typically transmit over 6 MHz channels. Using today's MPEG compression technology, they are generally only able to accommodate from four to six television stations per channel. Using the controlled compression described herein, different types of television shows can be compressed with different compression rates. Therefore, in the cable television industry each 6 MHz channel statistically will be able to accommodate many more television shows.
In a similar fashion, the subject of the present invention may be applied to other industries, such as broadcast television, jukeboxes, personal electronic devices and the like.
Although the invention has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art without departing from the spirit and scope of the invention as outlined in the claims appended hereto.
Claims (22)
THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. A method for managing scalable compression of multicast or unicast media over an Internet Protocol, between a media server and a client, said method comprising the steps of:
(a) determining the client access bandwidth statistics and client hardware capabilities;
(b) determining a data buffer status of the client;
(c) generating an m-dimensional quantization mask in response to said buffer status;
(d) applying said quantization mask to an array of transformed frame data for each frame in a sequence to produce quantized data;
(e) performing entropy coding on said quantized data to produce a bit stream; and
(f) transmitting said bitstream to said client.
2. A method as defined in claim 1, further comprising the steps of enhancing dynamic video quality by application of extended client filtering or controlled compression of said bit stream.
3. A method as defined in claim 1, wherein said quantization mask is applied to coefficient data generated using wavelets and said quantization mask has real time independent control of the resolution of each coefficient in the m-dimensional transformed pixel array.
4. A method as defined in claim 1, wherein said quantization mask is applied to coefficient data generated using spectral transforms and said quantization mask has real time independent control of the resolution of each coefficient in the m-dimensional transformed pixel array.
5. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by streaming multiresolution subbands to a client player utilizing RTP/UDP/IP.
6. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by streaming multiresolution subbands to a client player utilizing RTP/TCP/IP for streaming of media through firewalls that do not support UDP port assignments.
7. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by streaming only M of a total of N multiresolution subbands starting from the lowest resolution, where M ≤ N, and wherein an algorithm for determining which M subbands to stream to a client player is based on access infrastructure and bandwidth available for maintaining the best subjective image quality according to the human visual system's logarithmic sensitivity to light intensity and sensitivity to abrupt spatial changes.
8. A method as defined in claim 7, wherein only an area of each of said M subbands is selectively streamed, and an algorithm determines the shape of the area and varies it from the lowest subband coefficients to a small geometrical area in the center of the highest resolution subband, said algorithm being optimized based on the access infrastructure, bandwidth available, and client capability.
9. A method as defined in claim 8, wherein the highest resolution subbands are selectively eliminated as the geometrical area approaches a single coefficient in the significance map that will enhance the coding efficiency of an EZW algorithm.
10. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by decimating subbands for gradual spatial subsampling for streaming data from CIF to QCIF while maintaining full motion video over an access link's available bandwidth.
11. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by YUV color space subsampling for reducing the compressed data rate and for approaching gray scale video while retaining full frame rate video.
12. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by utilizing a variable number of frames between key frames based on scene information and use of motion compensation between frames for delivering progressive video between key frames when a scene does not change.
13. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by frame rate decimation below 30 Hz for delivering gray scale video over low speed access infrastructures.
14. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by utilizing congestion algorithms optimized for the access infrastructure while maintaining the audio quality, with audio degradation occurring only after the frame rate is zero.
15. A method as defined in claim 14, wherein said congestion algorithm is a graceful degradation of video.
16. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by utilizing IP packet length Segmentation And Reassembly (SAR) algorithms optimized for the access link bandwidth.
17. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by optimizing algorithms such as header compression and payload compression operating over layers 2 to 7 of the ISO 7 layer network model for optimizing the access link bandwidth.
18. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by implementing a client player buffer size and buffer fullness continuous feedback loop with an algorithm for optimizing streaming to the client for avoiding overflow or underflow.
19. A method as defined in claim 2, wherein said extended client filtering or controlled compression is accomplished by utilizing scene change detection with variable key frames between video frames, and the use of a distributed keyframe for balancing loading over access links with lower overall burstiness of traffic and lower delay to display scene change.
20. A method for broadcasting media over a fixed channel, wherein controlled compression or extended client filtering is used for generating a bitstream with dynamic control of the compression of said bitstream.
21. A method as defined in claim 20, wherein the broadcast media is cable television.
22. A method for dynamically controlling video quality of a transmission through controlled compression or extended client filtering within a primary media server for multicast or unicast transmission over an Internet Protocol.
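By way of illustration only, the subband-limited streaming of claim 7 (sending M of N multiresolution subbands, lowest resolution first, according to the available bandwidth) might be sketched as follows. The claims do not disclose a specific selection algorithm, so the greedy per-frame bit budget used here is an assumption, and the name `select_subbands` and its parameters `subband_bits`, `frame_rate_hz`, and `link_bps` are hypothetical. Perceptual weighting per the human visual system is not modeled.

```python
from typing import List


def select_subbands(subband_bits: List[int],
                    frame_rate_hz: float,
                    link_bps: float) -> int:
    """Return M, the number of subbands to stream for each frame.

    subband_bits lists the coded size in bits of each of the N subbands
    of one frame, ordered from lowest to highest resolution (assumed).
    Subbands are admitted in that order until the per-frame bit budget
    implied by the link rate and frame rate would be exceeded.
    """
    if frame_rate_hz <= 0:
        raise ValueError("frame_rate_hz must be positive")

    budget_bits = link_bps / frame_rate_hz  # bits available per frame
    used_bits = 0
    m = 0
    for bits in subband_bits:
        if used_bits + bits > budget_bits:
            break  # this subband and all higher ones are dropped
        used_bits += bits
        m += 1
    return m


if __name__ == "__main__":
    # Hypothetical coded sizes for a 4-subband decomposition of one frame.
    sizes = [2_000, 6_000, 24_000, 96_000]
    print(select_subbands(sizes, frame_rate_hz=10, link_bps=128_000))
    print(select_subbands(sizes, frame_rate_hz=10, link_bps=1_500_000))
```

In this sketch a narrower access link simply yields a smaller M, so the client receives a coarser but complete image at the full frame rate; a real selector would presumably also weigh client capability and subjective quality as the claims describe.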
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA 2280662 CA2280662A1 (en) | 1999-05-21 | 1999-09-02 | Media server with multi-dimensional scalable data compression |
PCT/CA2000/000132 WO2000072602A1 (en) | 1999-05-21 | 2000-02-15 | Multi-dimensional data compression |
PCT/CA2000/000133 WO2000072517A1 (en) | 1999-05-21 | 2000-02-15 | System and method for streaming media over an internet protocol system |
AU26528/00A AU2652800A (en) | 1999-05-21 | 2000-02-15 | Media server with multi-dimensional scalable data compression |
AU26530/00A AU2653000A (en) | 1999-05-21 | 2000-02-15 | System and method for streaming media over an internet protocol system |
AU26529/00A AU2652900A (en) | 1999-05-21 | 2000-02-15 | Multi-dimensional data compression |
PCT/CA2000/000131 WO2000072599A1 (en) | 1999-05-21 | 2000-02-15 | Media server with multi-dimensional scalable data compression |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA 2272590 CA2272590A1 (en) | 1999-05-21 | 1999-05-21 | System and method for streaming media over an internet protocol system |
CA2,272,590 | 1999-05-21 | ||
CA2,277,373 | 1999-07-09 | ||
CA 2277373 CA2277373A1 (en) | 1999-05-21 | 1999-07-09 | Multi-dimensional data compression |
CA 2280662 CA2280662A1 (en) | 1999-05-21 | 1999-09-02 | Media server with multi-dimensional scalable data compression |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2280662A1 (en) | 2000-11-21 |
Family
ID=27170970
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA 2280662 Abandoned CA2280662A1 (en) | 1999-05-21 | 1999-09-02 | Media server with multi-dimensional scalable data compression |
Country Status (3)
Country | Link |
---|---|
AU (3) | AU2652800A (en) |
CA (1) | CA2280662A1 (en) |
WO (3) | WO2000072602A1 (en) |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020078253A1 (en) * | 2000-12-20 | 2002-06-20 | Gyorgy Szondy | Translation of digital contents based on receiving device capabilities |
US7242324B2 (en) | 2000-12-22 | 2007-07-10 | Sony Corporation | Distributed on-demand media transcoding system and method |
US6407680B1 (en) | 2000-12-22 | 2002-06-18 | Generic Media, Inc. | Distributed on-demand media transcoding system and method |
US20030028643A1 (en) * | 2001-03-13 | 2003-02-06 | Dilithium Networks, Inc. | Method and apparatus for transcoding video and speech signals |
US7054335B2 (en) * | 2001-05-04 | 2006-05-30 | Hewlett-Packard Development Company, L.P. | Method and system for midstream transcoding of secure scalable packets in response to downstream requirements |
WO2003001748A1 (en) * | 2001-06-21 | 2003-01-03 | Ziplabs Pte Ltd. | Method and apparatus for compression and decompression of data |
ITTO20010813A1 (en) * | 2001-08-13 | 2003-02-13 | Telecom Italia Lab Spa | PROCEDURE FOR THE TRANSFER OF MESSAGES THROUGH UDP, ITS SYSTEM AND IT PRODUCT. |
US7480703B2 (en) | 2001-11-09 | 2009-01-20 | Sony Corporation | System, method, and computer program product for remotely determining the configuration of a multi-media content user based on response of the user |
US7356575B1 (en) | 2001-11-09 | 2008-04-08 | Sony Corporation | System, method, and computer program product for remotely determining the configuration of a multi-media content user |
US7730165B2 (en) | 2001-11-09 | 2010-06-01 | Sony Corporation | System, method, and computer program product for remotely determining the configuration of a multi-media content user |
JP2003152544A (en) * | 2001-11-12 | 2003-05-23 | Sony Corp | Data communication system, data transmitter, data receiver, data-receiving method and computer program |
US7284069B2 (en) | 2002-01-11 | 2007-10-16 | Xerox Corporation | Method for document viewing |
US7200615B2 (en) | 2003-10-16 | 2007-04-03 | Xerox Corporation | Viewing tabular data on small handheld displays and mobile phones |
CN100458747C (en) * | 2003-10-31 | 2009-02-04 | 索尼株式会社 | System, method, and computer program product for remotely determining the configuration of a multi-media content user |
EP1738571A1 (en) * | 2004-04-20 | 2007-01-03 | France Télécom | Multimedia messaging system and telephone station comprising same |
US7539341B2 (en) | 2004-07-29 | 2009-05-26 | Xerox Corporation | Systems and methods for processing image data prior to compression |
US7620892B2 (en) | 2004-07-29 | 2009-11-17 | Xerox Corporation | Server based image processing for client display of documents |
US7721204B2 (en) | 2004-07-29 | 2010-05-18 | Xerox Corporation | Client dependent image processing for browser-based image document viewer for handheld client devices |
US8812978B2 (en) | 2005-12-22 | 2014-08-19 | Xerox Corporation | System and method for dynamic zoom to view documents on small displays |
US8711925B2 (en) | 2006-05-05 | 2014-04-29 | Microsoft Corporation | Flexible quantization |
US8139487B2 (en) | 2007-02-28 | 2012-03-20 | Microsoft Corporation | Strategies for selecting a format for data transmission based on measured bandwidth |
US8897359B2 (en) | 2008-06-03 | 2014-11-25 | Microsoft Corporation | Adaptive quantization for enhancement layer video coding |
CN101662454A (en) * | 2008-08-29 | 2010-03-03 | 阿里巴巴集团控股有限公司 | Method, device and system for image processing in internet |
US9225762B2 (en) | 2011-11-17 | 2015-12-29 | Google Technology Holdings LLC | Method and apparatus for network based adaptive streaming |
DE102013220901A1 (en) | 2013-10-15 | 2015-04-16 | Continental Automotive Gmbh | Method for transmitting digital audio and / or video data |
US9747010B2 (en) | 2014-01-16 | 2017-08-29 | Xerox Corporation | Electronic content visual comparison apparatus and method |
US9521176B2 (en) | 2014-05-21 | 2016-12-13 | Sony Corporation | System, method, and computer program product for media publishing request processing |
CN112751886B (en) * | 2019-10-29 | 2023-05-26 | 贵州白山云科技股份有限公司 | Transcoding method, transcoding system, transmission equipment and storage medium |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3071205B2 (en) * | 1990-01-23 | 2000-07-31 | オリンパス光学工業株式会社 | Image data encoding apparatus and encoding method |
US5159447A (en) * | 1991-05-23 | 1992-10-27 | At&T Bell Laboratories | Buffer control for variable bit-rate channel |
TW318315B (en) * | 1993-05-03 | 1997-10-21 | At & T Corp | |
GB2278973B (en) * | 1993-06-11 | 1997-10-29 | Quantel Ltd | Video image processing systems |
US5881176A (en) * | 1994-09-21 | 1999-03-09 | Ricoh Corporation | Compression and decompression with wavelet style and binary style including quantization by device-dependent parser |
JP3749752B2 (en) * | 1995-03-24 | 2006-03-01 | アイティーティー・マニュファクチャリング・エンタープライジズ・インコーポレーテッド | Block adaptive differential pulse code modulation system |
US5621660A (en) * | 1995-04-18 | 1997-04-15 | Sun Microsystems, Inc. | Software-based encoder for a software-implemented end-to-end scalable video delivery system |
US5822524A (en) * | 1995-07-21 | 1998-10-13 | Infovalue Computing, Inc. | System for just-in-time retrieval of multimedia files over computer networks by transmitting data packets at transmission rate determined by frame size |
US5706216A (en) * | 1995-07-28 | 1998-01-06 | Reisch; Michael L. | System for data compression of an image using a JPEG compression circuit modified for filtering in the frequency domain |
JP2000504906A (en) * | 1996-02-14 | 2000-04-18 | オリブル コーポレイション リミティド | Method and system for progressive asynchronous transmission of multimedia data |
US5996022A (en) * | 1996-06-03 | 1999-11-30 | Webtv Networks, Inc. | Transcoding data in a proxy computer prior to transmitting the audio data to a client |
US5918013A (en) * | 1996-06-03 | 1999-06-29 | Webtv Networks, Inc. | Method of transcoding documents in a network environment using a proxy server |
US5953506A (en) * | 1996-12-17 | 1999-09-14 | Adaptive Media Technologies | Method and apparatus that provides a scalable media delivery system |
US20010039615A1 (en) * | 1997-04-15 | 2001-11-08 | At &T Corp. | Methods and apparatus for providing a broker application server |
US6014694A (en) * | 1997-06-26 | 2000-01-11 | Citrix Systems, Inc. | System for adaptive video/audio transport over a network |
AU1285099A (en) * | 1997-11-07 | 1999-05-31 | Pipe Dream, Inc | Method for compressing and decompressing motion video |
1999
- 1999-09-02 CA: CA 2280662 (CA2280662A1), not active, Abandoned
2000
- 2000-02-15 AU: AU26528/00A (AU2652800A), not active, Abandoned
- 2000-02-15 AU: AU26530/00A (AU2653000A), not active, Abandoned
- 2000-02-15 AU: AU26529/00A (AU2652900A), not active, Abandoned
- 2000-02-15 WO: PCT/CA2000/000132 (WO2000072602A1), active, Search and Examination
- 2000-02-15 WO: PCT/CA2000/000133 (WO2000072517A1), active, Application Filing
- 2000-02-15 WO: PCT/CA2000/000131 (WO2000072599A1), active, Search and Examination
Also Published As
Publication number | Publication date |
---|---|
AU2652900A (en) | 2000-12-12 |
AU2652800A (en) | 2000-12-12 |
WO2000072599A1 (en) | 2000-11-30 |
WO2000072602A1 (en) | 2000-11-30 |
WO2000072517A1 (en) | 2000-11-30 |
AU2653000A (en) | 2000-12-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2280662A1 (en) | Media server with multi-dimensional scalable data compression | |
US6091777A (en) | Continuously adaptive digital video compression system and method for a web streamer | |
US6337881B1 (en) | Multimedia compression system with adaptive block sizes | |
US8929436B2 (en) | Method and apparatus for video coding, predecoding, and video decoding for video streaming service, and image filtering method | |
US6392705B1 (en) | Multimedia compression system with additive temporal layers | |
US7477688B1 (en) | Methods for efficient bandwidth scaling of compressed video data | |
US6788740B1 (en) | System and method for encoding and decoding enhancement layer data using base layer quantization data | |
Sun et al. | An overview of scalable video streaming | |
JP2006087125A (en) | Method of encoding sequence of video frames, encoded bit stream, method of decoding image or sequence of images, use including transmission or reception of data, method of transmitting data, coding and/or decoding apparatus, computer program, system, and computer readable storage medium | |
US8571027B2 (en) | System and method for multi-rate video delivery using multicast stream | |
KR100952185B1 (en) | System and method for drift-free fractional multiple description channel coding of video using forward error correction codes | |
JP2004512785A (en) | Scalable video compression based on DCT | |
EP0892557A1 (en) | Image compression | |
US20040139219A1 (en) | Transcaling: a video coding and multicasting framework for wireless IP multimedia services | |
Girod et al. | Scalable codec architectures for internet video-on-demand | |
Johanson | Scalable video conferencing using subband transform coding and layered multicast transmission | |
Abd Al-azeez et al. | Optimal quality ultra high video streaming based H. 265 | |
Mrak et al. | Scalable video coding in network applications | |
Ortiz et al. | Interactive transmission of JPEG2000 images using web proxy caching | |
Pereira et al. | Multiple description coding for internet video streaming | |
Tham et al. | Layered coding for a scalable video delivery system | |
Johanson et al. | Layered encoding and transmission of video in heterogeneous environments | |
Song et al. | PVH-3DDCT: an algorithm for layered video coding and transmission | |
Hong et al. | QoS control for internet delivery of video data | |
Mrak et al. | Video Coding Schemes for Transporting Video Over The Internet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FZDE | Discontinued | ||
FZDE | Discontinued |
Effective date: 20040902 |