WO2010038011A2 - Improved data compression - Google Patents

Improved data compression

Info

Publication number
WO2010038011A2
Authority
WO
WIPO (PCT)
Prior art keywords
template
templates
data
bytes
strength value
Prior art date
Application number
PCT/GB2009/002317
Other languages
French (fr)
Other versions
WO2010038011A3 (en)
Inventor
Richard Barden
Original Assignee
Cambridge Broadband Networks Ltd
Priority date
Filing date
Publication date
Application filed by Cambridge Broadband Networks Ltd
Priority to EP09736448A (EP2353270A2)
Publication of WO2010038011A2
Publication of WO2010038011A3
Priority to ZA2011/02901A (ZA201102901B)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04: Protocols for data compression, e.g. ROHC
    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00: Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3088: Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing the use of a dictionary, e.g. LZ78
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161: Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22: Parsing or analysis of headers
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W28/00: Network traffic management; Network resource management
    • H04W28/02: Traffic management, e.g. flow control or congestion control
    • H04W28/06: Optimizing the usage of the radio link, e.g. header compression, information sizing, discarding information

Definitions

  • the present invention relates to an apparatus and method for processing data, specifically, but not exclusively, processing data contained in frames, packets or cells which is then compressed and transmitted during wireless communications.
  • IP Internet Protocol
  • IP networks there is no single dedicated connection between communicating devices or entities. Each packet making up the source data may include a complete destination address and each one may be sent and routed independently.
  • the IP protocol provides features for addressing, type-of-service specification, fragmentation, re-assembly, and security of data packets, and is defined in Request For Comments (RFC) 791.
  • RFC Request For Comments
  • the Internet "Request For Comments" documents are written definitions of the protocols and policies of the Internet and are readily found on many websites.
  • ATM networks employ a virtual circuit, which is established at the start of a data transfer and fixed size ATM cells are transferred via the same path until at the end of the data transfer the virtual circuit may be released.
  • FIG. 1 illustrates a typical packet 100 used to transport information in a data network as known in the art and which comprises an IP header 102 of 20 bytes, a UDP header 104 of 8 bytes, a RTP header 106 of a minimum of 12 bytes and a payload 108 of up to N bytes.
  • FIG. 2 illustrates the information contained in a typical IP header 102, also as known in the art.
  • the IP header 102 in Figure 2 comprises 20 bytes of data arranged in eleven fields relating to various aspects for the delivery of data from a source to a destination. Of particular note are the IP source address 114 and IP destination address 116 fields.
  • FIGs 3a and 3b are diagrams showing two typical ATM cell formats.
  • an ATM cell consists of a 5 byte header and a 48 byte payload.
  • ATM defines two different cell formats: NNI (Network-Network Interface) and UNI (User-Network Interface). Most ATM links use UNI cell format.
  • Figure 3a shows a UNI cell header format 200, which comprises 4 bits of GFC
  • VCI Virtual Channel Identifier
  • PT Payload Type
  • CLP Cell Loss Priority
  • HEC Header Error Correction
  • Figure 3b shows a NNI cell header format 300, which comprises 12 bits of VPI (Virtual Path Identifier) 302, 16 bits of VCI (Virtual Channel Identifier) 303, a
  • wireless telecommunications networks include two or more wireless devices (stations) capable of communicating with one another. This communication may or may not occur via a wireless Access Point (AP).
  • a wireless AP will normally allow access from a wireless device to another network, such as the Internet.
  • a wireless communications device may be a portable computer with a wireless local area network (W-LAN) card that has been installed, or that is already integrated.
  • a WCD may also be a mobile phone, or a personal data assistant (PDA), a computer equipped with a wireless modem, or any other device capable of transmitting and receiving information to another communication device using a radio frequency (RF).
  • PDA personal data assistant
  • RF radio frequency
  • a wireless base station may be a fixed base station, or might alternatively comprise a mobile communication device, a satellite, or any other device capable of transmitting and receiving communications from a WCD using radio frequency (RF).
  • Data packets suitable for transmission over a data network are generated by, for example, an application running on a WCD, such as an Internet web-browser, an email application, a digital video camera, and others.
  • data packets suitable for transmission over a data network are generated, each data packet comprising necessary header information needed to communicate with a destination device.
  • header information is referred to interchangeably herein as data network header information.
  • a web-browser operating on a WCD may generate IP packets when a user wishes to access a web page.
  • IP packets comprise data network header information, for example, header information relating to the Internet Protocol
  • UDP User Datagram Protocol
  • TCP Transmission Control Protocol
  • RTP Real-time Transport Protocol
  • a packet flow can be defined as a series or sequence of one or more related packet exchanges between two or more communicating entities e.g. a WCD and a base station.
  • Packet flow examples include the exchange of IP signalling packets that control a mobile Internet Protocol version 6 (IPv6) handover, or the packet exchanges involved in a Session Initiation Protocol (SIP) transfer, or packets involved in a file transfer using the File Transfer Protocol (FTP).
  • IPv6 mobile Internet Protocol version 6
  • SIP Session Initiation Protocol
  • FTP File Transfer Protocol
  • Known wireless noise and bandwidth problems are as follows: high error rates, caused by radio interference, rain fade, or multi-path propagation etc; long delays, due to lower transmission speeds and large transmission distances for example; processing power constraints, due to short battery life, device size and weight for example; and scarce channel resources, due to frequency allocation and spectrum constraints for example. Further, in order to improve transmission efficiency and to keep pace with the exploding demand for digital information, there is a constant design objective to pump increasingly more data through the same bandwidth pipeline over any given network.
  • Packets may be compressed at a server, transmitted in their compressed state over a network, and decompressed at a client.
  • Another solution is partial packet compression in which only certain portions of the packet, such as a header or a data payload, are compressed.
  • the compressed header has an encoded change to the packet ID, a TCP checksum, a connection number, and a change mask.
  • the hardware and/or software used to implement the Jacobson technique must perform sophisticated computations that compress the 40-byte header to the three-byte compressed header, and then subsequently decompress the compressed header to reproduce the uncompressed header. Further, the Jacobson technique requires "knowledge" of the format of the TCP header in order for it to work.
  • IP/UDP/RTP header to an average of between 2 and 4 bytes generally by transmitting second order differences when one or more fields within the header change.
  • US Patent 7,061,936 (Yoshimura et al.). This technique is based on the compression of an RTP header only and further discloses improvements on the IETF 2508 RTP header compression technique, as well as improvements on the ROHC (Robust Header Compression) technique.
  • US Patent 7,245,639 (Westphal, Nokia) discloses a header compression scheme that makes use of the similarity of consecutive flows from or to a given mobile terminal to compress those headers. US Patent 7,245,639 is herein incorporated by reference.
  • the present invention therefore seeks to overcome, or at least reduce some of the above-mentioned problems of the prior art.
  • the invention provides an apparatus for processing data in a telecommunications network, the apparatus comprising: an input, wherein the input receives data; storage, wherein the storage stores a set of templates, each template having a strength value; a processor, wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the processor increases the strength value of the matched template.
  • the processor chooses a winning template and increases the strength value of the winning template.
  • the invention provides an apparatus for processing data in a telecommunications network, the apparatus comprising: an input, wherein the input receives data; storage, wherein the storage stores a set of templates, each template having a strength value; a processor, wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the processor increases the strength value of the matched template; or if the at least a portion of the data matches more than one of the templates in the set of templates, the processor chooses a winning template and increases the strength value of the winning template.
  • a template may contain a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
  • the processor uses the matched, or winning template to compress received data.
  • the processor uses a template with the most matched static bytes to compress received data.
  • a successful template match requires all static bytes to be matched.
  • the invention provides an apparatus for processing data in a telecommunications network, the apparatus comprising: an input, wherein the input receives data; storage, wherein the storage stores a set of templates; a processor, wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein based on the result of the comparison, the processor creates a new template in the set of templates, the new template being stored in the storage.
  • a template contains a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
  • a new template is created by the processor which then records the dynamic byte as a static byte, the matched template is then known as a parent template.
  • a new template is formed which records only the matched static bytes and the remaining bytes as dynamic, the matched template is then known as a parent template. Further preferably, wherein if the compared data matches no templates, then a new template is formed which records all bytes as dynamic.
  • the new template may be given an initial strength value, or may be assigned a strength value greater than the parent template strength value, or may be assigned a strength value less than the parent template strength value.
  • the processor compares the whole of the received packet to each template.
  • the processor decreases the strength of all templates stored in storage that do not match the received data, and/or wherein the strength is decreased using a count depletion, and/or wherein the strength is decreased using a timeout.
  • templates are stored in a table, and/or the processor uses a selection of templates with strength values above a threshold to compress received data, and/or the data is transmitted to a decompressor using the Van Jacobson technique.
  • the data is selected from the list of: IP packets, TCP packets, UDP packets, RTP packets, Ethernet frames, or ATM cells, or any mixture or combination thereof.
  • the invention provides a method for processing data in a telecommunications network, the method comprising: receiving data; storing a set of templates, each template having a strength value; comparing at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, increasing the strength value of the matched template.
  • the invention provides a method for processing data in a telecommunications network, the method comprising: receiving data; storing a set of templates, each template having a strength value; comparing at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the strength value of the matched template is increased; or if the at least a portion of the data matches more than one of the templates in the set of templates, a winning template is chosen and the strength value of the winning template is increased.
  • the invention provides a method for processing data in a telecommunications network, the method comprising: receiving data; storing a set of templates; comparing at least a portion of the received data to each template in the set of templates; and wherein based on the result of the comparison, creating a new template in the set of templates, the new template being stored in the storage.
  • One advantage of the method and apparatus for reducing transmission overhead in a communication system is when the "winning" templates are used to compress incoming data, a reduction in bandwidth necessary to transmit information is achieved. This results in either higher data rates or an increased number of users over a given bandwidth.
  • Yet another advantage of the method and apparatus for reducing transmission overhead in a communication system is that latency is reduced for real-time applications, such as voice or video.
  • Another advantage is that the method and apparatus is protocol independent, as templates are created from the data received at the input to the compressor. Yet another advantage is that the way in which the compressor maintains the templates is adaptive to the data being received.
  • Figure 1 illustrates a data packet used to transport information in a data network as known in the art
  • FIG. 2 illustrates the information contained in a typical IP header, as known in the art
  • FIGS 3a and 3b are diagrams showing two typical ATM packet formats as known in the art.
  • Figure 4 is a diagram showing a data compressor and decompressor, according to one embodiment of the present invention.
  • Figure 5 is a flow diagram showing a template matching and creation process, according to one embodiment of the present invention.
  • Figure 6 is a state diagram showing a template management process, according to one embodiment of the present invention.
  • the present invention is directed to a system and method for reducing transmission overhead in a communication system, but which is protocol independent.
  • a wireless terrestrial communication system it should be understood that other embodiments are possible, including use in a satellite communication system, or in a wire-based communication system as well.
  • header or payload compression techniques replace the data packet by a label or compressed data packet at one end of the link, transmit the data with the label attached, then replace the label at the other end of the link by the original (reconstructed) data packet.
  • VJ Van Jacobson
  • ROHC Robust Header Compression
  • the compression scheme disclosed uses a "key" and packets are matched to this key. It appears the "key" is created using knowledge of the protocol; for example, the key may be a destination address. Filters are also used, which are a map of an IP packet's source IP address, source port, destination IP address, destination port and protocol.
  • Westphal discloses a protocol-dependent technique, whilst the present invention is protocol independent.
  • the present invention uses a "variable key", where all the static and dynamic areas of each packet are recognized and mapped, and each and every incoming packet is either matched to an existing template, or a new template is created.
  • the present invention uses evolutionary factors, such as conception, mutation and growth, in order to adapt and flex to the incoming microflows.
  • the compression scheme of the present invention is therefore protocol independent and is highly adaptive to the traffic in each microflow.
  • a compressor 401, a decompressor 410, and a link 413 between the two (which may be a cable or RF link, for example).
  • the compressor 401 comprises a processor 402 and storage 403 capabilities and the decompressor 410 also comprises a processor 408 and storage 409 capabilities.
  • Received data 406 contained in Ethernet frames enters the compressor 401 via its input 404.
  • the compressor 401 compresses the received data 406 and then sends the data 406 via its output 405 over a link 413 to the input 407 of the decompressor 410.
  • the received data 406 is then decompressed by the decompressor 410 (if necessary, see below) and forwarded (transmitted) to the output 411 of the decompressor 410. Uncompressed data 412 then exits the decompressor 410.
  • compressor 401 and decompressor 410 are contained in a wireless communications device or network server for example and that other functions are required within each device in order for them to operate within a wired or wireless telecommunications network.
  • the specific functions of the present invention could be implemented using FPGAs (Field Programmable Gate Arrays) for example.
  • each directional link is assumed to be independent of the other.
  • the compressor 401 comprises a table (not shown) stored in the storage 403 function, the table comprising a set of templates (not shown).
  • Each template has a depth defined as "d", which is the number of bytes that is to be matched in each Ethernet frame. Depth "d" can be as little as one byte, for example, or could be as large as the size of a full Ethernet frame. This is an implementation-specific choice; the value of "d" can be chosen to adapt the compression technique to a particular protocol if required, or may remain totally independent.
  • T templates in each table, and a template consists of (a) an N bit mask, indicating which bytes are static in the first N bytes of matching frames; (b) the values of those static bytes; (c) an assigned or instantaneous strength value; and (d) a record of the position and value of any dynamic bytes.
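One way to picture a table entry holding fields (a) to (d) is the minimal Python sketch below. The names `Template`, `template_from_frame`, `matched_static` and `full_match` are hypothetical and do not come from the patent, and storing one value per byte position (static and dynamic alike) is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """One table entry (illustrative layout, not the patent's own)."""
    static_mask: list     # (a) static_mask[i] is True where byte i is static
    values: bytearray     # (b)/(d) last recorded value of every byte position
    strength: int = 0     # (c) assigned or instantaneous strength

    def static_count(self):
        return sum(self.static_mask)

    def matched_static(self, frame):
        """Count static positions whose recorded value equals the frame byte."""
        return sum(
            1
            for i, is_static in enumerate(self.static_mask)
            if is_static and i < len(frame) and frame[i] == self.values[i]
        )

    def full_match(self, frame):
        """A successful match requires every static byte to match."""
        return self.matched_static(frame) == self.static_count()


def template_from_frame(frame, depth, static_positions, strength=0):
    """Build a template from the first `depth` bytes of a frame.

    Frames shorter than `depth` are zero-padded (an assumption).
    """
    head = frame[:depth].ljust(depth, b"\x00")
    mask = [i in static_positions for i in range(depth)]
    return Template(mask, bytearray(head), strength)
```

With `depth=4` and positions 0 and 1 marked static, any frame starting with the same two bytes is a full match, whatever follows.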
  • the compressor 401 examines each of the Ethernet frames that is received at the compressor's input 404, looking to categorize the frames according to the positions and values of static bytes within the first N bytes of each frame.
  • the compressor will operate on an actual data flow; in a laboratory environment, the compressor may operate on data that has previously been recorded.
  • a new template is then formed from the profile of the incoming data to represent each new category (or microflow type) found.
  • the template formation may either itself operate on the same timescale as the data examination, or on a much slower timescale, for example by "sniffing" samples of the incoming data stream.
  • Templates which are not gaining strength by matching frames are discarded by the ageing process, maintaining a supply of unused template positions for the creation and mutation mechanisms. Newly created templates have an initial assigned strength to give them a chance to become established.
  • Templates may be duplicated with more (or less) static bytes by recently matching (or near matching) frames.
  • templates effectively compete against each other to match incoming frames. Only the "winners" get stronger; "losers" age and die. When a new flow starts up it is usually soon captured by a conception. This template does not have any static bytes initially, but is likely to soon spawn a mutant which captures the essence of the flow. This may further mutate to track nuances in the flow. Both templates may coexist, or one may die, depending on the flow dynamics. When the flow stops, the template inevitably and eventually "dies".
  • Figure 5 is a flow diagram showing a template matching and creation process, according to one embodiment of the present invention.
  • the process as described in Figure 5 starts at the point marked "S" and finishes at the point marked "F", and is executed for every received frame.
  • the process comprises the following steps:
  • Step A0: Ageing. Age all stored templates: decrease template strengths by 1 and kill any stored templates with no strength. Therefore, all templates lose (for example) 1 unit of strength for each frame received, and die when their strength reaches zero (see Figure 6 for a further discussion of the ageing process).
  • Step A1: Match the received frame against all "active" templates to determine whether it matches all the static bytes in any of them. Active templates are described in more detail with reference to Figure 6 below.
  • Step A2: Is there a full static match? If the first N bytes of the incoming frame match every static byte of at least one active template, a match is made and the process moves on to step A3. Otherwise the process moves on to step A7.
  • Step A3: If more than one template matches, the one with the most matched static bytes is chosen as the best, and the strength of this best matched template is increased (for example by the number of static bytes). If two or more templates have exactly the same number of matched static bytes, the compressor must choose a winner arbitrarily. It does not matter how the decision is made, but it is best made consistently from decision to decision; for example, the template with a higher position in the table is always chosen. In every case, exactly one winning, best matched template gains strength.
  • Step A4: This frame is therefore a candidate for compression using this "winning best matched" template.
  • compression involves removing the static bytes from the frame before sending it to the decompressor. The decompressor will then reinsert them to reconstruct the original frame. This requires the compressor to have previously notified the decompressor of the necessary template context using prior art methods such as those due to Van Jacobson.
  • Step A5: If the current template has been matching, or even persistently matching, some of its dynamic bytes as well as all its static ones, then there is now the opportunity to create a mutant template with MORE static bytes. Persistent here means in two consecutive full matches, for example. If so, the process moves on to step A6; if not, the process ends at point F.
  • Step A6: If the template has been matching, or persistently matching, some of its dynamic bytes as well as all its static ones, then a mutant template with MORE static bytes is created in this step. Either the parent strength can be assigned, or an initial strength can be assigned which may, for example, be more or less than the parent strength.
  • Step A7: On the other hand, if there is no full static match at all for the frame (the "no" result from step A2), then we look for the best near miss, i.e. the template which matched the most static bytes. If there is one, the process moves to step A8; if not, it moves to step A9.
  • Step A8: This template is a candidate for a mutant with FEWER static bytes. Either the parent strength can be assigned, or an initial strength can be assigned which may be more or less than the parent strength.
  • Step A9: Finally, if there is not even a near miss, then there is the ability to conceive a new "conception" template, with no static bytes, which will spawn further mutations if the flow persists. This is how new flows are detected, and indeed how the mechanism is primed at startup.
  • a new "conception" template is created in the next vacant position (left by a "dead" template, for example) and the number of static bytes is set to zero. In this case all dynamic bytes are recorded. This scheme ensures that a single isolated frame cannot cause a new "non-conception" template to be created.
  • An initial strength value is assigned to the new "conception" template in order that it can survive for a short time, but if packets do not start matching this new "conception" template, then it is sure to die (see Figure 6 for more details).
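The per-frame process of steps A0 to A9 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `Template`, `store`, `process_frame` and `DEPTH` are hypothetical names, the depth is shortened to 8 bytes, ties are broken in favour of the earliest table slot, "persistent" is taken to mean two consecutive matches, and new templates receive depth/2 (conception) or N + depth/4 (mutation) as their initial strength.

```python
from dataclasses import dataclass, field

DEPTH = 8  # bytes compared per frame (assumed small for illustration)


@dataclass
class Template:
    mask: list            # mask[i] is True where byte i is static
    values: bytearray     # last recorded value of every byte
    strength: int
    repeats: list = field(default_factory=lambda: [0] * DEPTH)

    def static_hits(self, head):
        return sum(m and head[i] == self.values[i] for i, m in enumerate(self.mask))


def store(table, template):
    """Place a template in the first vacant ("dead") slot, if there is one."""
    for i, slot in enumerate(table):
        if slot is None:
            table[i] = template
            return


def process_frame(table, frame):
    """Run steps A0-A9 once; return the winning template, or None."""
    head = frame[:DEPTH].ljust(DEPTH, b"\x00")
    # A0: age every template and free the slot of any with no strength left.
    for i, t in enumerate(table):
        if t is not None:
            t.strength -= 1
            if t.strength <= 0:
                table[i] = None
    live = [t for t in table if t is not None]
    # A1/A2: keep the templates whose static bytes all match.
    full = [t for t in live if t.static_hits(head) == sum(t.mask)]
    if full:
        # A3: most static bytes wins; max() keeps the earliest slot on a tie.
        winner = max(full, key=lambda t: sum(t.mask))
        winner.strength += sum(winner.mask)
        # A5: count consecutive repeats of each dynamic byte.
        for i in range(DEPTH):
            same = (not winner.mask[i]) and head[i] == winner.values[i]
            winner.repeats[i] = winner.repeats[i] + 1 if same else 0
        grown = [m or r >= 2 for m, r in zip(winner.mask, winner.repeats)]
        if grown != winner.mask:
            # A6: mutant with MORE static bytes.
            store(table, Template(grown, bytearray(head), sum(grown) + DEPTH // 4))
        winner.values[:] = head  # refresh the recorded dynamic bytes
        return winner            # A4: candidate for compression
    # A7: look for the best near miss.
    near = max(live, key=lambda t: t.static_hits(head), default=None)
    if near is not None and near.static_hits(head) > 0:
        # A8: mutant with FEWER static bytes (only those that matched).
        kept = [m and head[i] == near.values[i] for i, m in enumerate(near.mask)]
        store(table, Template(kept, bytearray(head), sum(kept) + DEPTH // 4))
    else:
        # A9: conception template with no static bytes.
        store(table, Template([False] * DEPTH, bytearray(head), DEPTH // 2))
    return None
```

Feeding this sketch a flow whose first four bytes are constant produces a conception template on the first frame, a four-static-byte mutant on the third, and lets the conception template age out shortly afterwards.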
  • the strength of a template is indicative of the total amount of bandwidth saved by using that template to compress the flow of packets through the compressor.
  • each candidate template has an instantaneous strength which is a measure of the overall bandwidth that would be saved by using this template. Therefore, a new template takes all its byte values (static & dynamic) from its creating, or parent frame. Dynamic bytes in a template are then updated with those from every best matched frame, after the consideration of mutation.
  • N is the number of static bytes in that template. Mutation templates may be created with a strength equal to their N, plus a constant. This constant may, for example, be equal to the depth divided by a factor of 4 (depth/4). Conception templates are created with strength equal to depth/2. The number of bytes into a frame for which matches are considered may typically be 96. The maximum number of concurrently active templates is typically 100. It is also possible therefore that there are more templates in the table than valid flow ids (as some are ageing). Further, the term "byte" is used here as a single example, it should be clear to someone skilled in the art that word lengths other than 8 bits are perfectly usable.
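As a worked example of the figures above, the short sketch below computes initial strengths for the typical depth of 96 bytes. The function names are hypothetical, integer division is an assumption, and the net gain per match assumes the example ageing rate of 1 unit per frame together with a strength increase equal to the number of static bytes.

```python
DEPTH = 96  # typical number of bytes into a frame considered for matching


def conception_strength(depth=DEPTH):
    """Initial strength of a conception template: depth / 2."""
    return depth // 2


def mutation_strength(n_static, depth=DEPTH):
    """Initial strength of a mutation template: N plus a constant (depth / 4)."""
    return n_static + depth // 4


def net_gain_per_match(n_static, ageing=1):
    """Strength change for a frame the template wins: +N for the match, minus ageing."""
    return n_static - ageing
```

Under these assumptions a conception template starts at 48 and, gaining nothing from its zero static bytes, dies after 48 frames unless a mutant takes over; a mutant with 40 static bytes starts at 64 and gains a net 39 for every frame it wins.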
  • the compressor 401 therefore only picks up on the common bytes in consecutive frames; all random information is therefore ignored. No knowledge of the incoming packet structure is required.
  • a set of templates known as "active" templates (see Figure 6) which offer the most compression gain are applied by the compressor to the incoming frames, removing the static bytes and replacing them with a single control byte (or label), as described in further detail below. These active templates are then deemed to be "in-use" (see Figure 6).
  • the decompressor 410 must have enough context information to replace these bytes and therefore reconstruct the original frames. In this case the Van Jacobson technique can be used to transmit the data packets with the labels created from the data stored in the frame and which is deemed most applicable to the packet in that microflow (i.e. taken from the set of in-use templates). It should be clear to someone skilled in the art that any transmission technique could be implemented in both the compressor 401 and decompressor 410 in order to utilize the information contained in the templates and reduce the size of the packets being transmitted over the link 413.
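The removal and reinsertion of static bytes can be illustrated with the sketch below. The wire format (one label byte followed by the remaining dynamic bytes and the untouched tail of the frame) and the names `compress`, `decompress` and `contexts` are assumptions made for illustration; how the decompressor learns the template context is left to prior-art methods and is not modelled here.

```python
def compress(frame, mask, label):
    """Drop the static bytes in the first len(mask) bytes and prefix a one-byte label.

    Assumes the frame is at least len(mask) bytes long.
    """
    depth = len(mask)
    dynamic = bytes(b for i, b in enumerate(frame[:depth]) if not mask[i])
    return bytes([label]) + dynamic + frame[depth:]


def decompress(packet, contexts):
    """Rebuild the original frame from a label-prefixed packet.

    `contexts` maps a label to (mask, static_values), where static_values
    holds one byte per position and only the static positions are used.
    """
    mask, static_values = contexts[packet[0]]
    body = iter(packet[1:])
    head = bytes(static_values[i] if mask[i] else next(body) for i in range(len(mask)))
    return head + bytes(body)
```

For a six-byte frame with two static bytes, the compressed packet is five bytes long: two bytes removed, one label byte added.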
  • Figure 6 is a state diagram showing template management processes, according to one embodiment of the present invention.
  • Templates may move from one state to another via conception or mutation processes, by an ageing process or via a decision based on strength. All template positions in the table start life as "dead" 501 and will move to an "active" 502 status once a template position has been filled in with data via a conception or mutation decision 504 (described in steps A5, A6 & A7 with reference to Figure 5 previously).
  • the set of "active" templates which offer the most compression gain are applied by the compressor to the incoming frames and so, via a decision based on strength 506, this set of templates will move from the state of active 502 to in-use 503.
  • the set of "in-use" templates could have a fixed or variable size, or have to exceed a minimum strength threshold, depending upon the exact implementation required.
  • a template moves from the state in-use 503 to the state dead 501 by a first ageing process 508.
  • the ageing process 508 may be implemented using a combination of factors, for example a strength depletion or a fixed timeout.
  • a strength depletion could occur every time a packet does not match an active template.
  • a fixed timeout could be a specific count which is depleted over time, as known in the art. Any other combination of threshold, time or count could be used.
  • a template moves from the state of active 502 to dead 501 also by a second ageing process 505.
  • This second ageing process 505 could be the same as the first one previously described (508), or it could be designed with different threshold, count or time parameters, if required.
  • Parameters such as maximum number of templates T, maximum header depth d, and strength rates & thresholds may be determined empirically, although an adaptive scheme could be considered.
  • The present invention provides a good compression ratio both when packets are sent infrequently but have many static bytes, and when packets are sent frequently but have few static bytes. When used with smaller packets, compression ratios of up to 4:1 may be achieved.
  • The only assumption made by the present invention is that the incoming data consists of a stream of headers and payloads and that each payload is preceded by a header.
  • The present invention can be used for any type of Ethernet frame, IP packet or ATM cell structure, or any mixture or combination thereof.
  • The present invention gives the advantages of being flexible, adaptable and protocol independent whilst still providing a good level of compression.
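By way of illustration only, the state transitions of Figure 6 might be sketched as follows. The names `Slot`, `fill`, `age` and `promote_strongest`, and the parameters `max_in_use` and `min_strength`, are assumptions made for this sketch and do not appear in the specification; the sketch uses a simple strength depletion for both ageing processes.

```python
from enum import Enum


class State(Enum):
    DEAD = 501
    ACTIVE = 502
    IN_USE = 503


class Slot:
    """One template position in the table (hypothetical name)."""

    def __init__(self):
        self.state = State.DEAD  # all positions start life as "dead"
        self.strength = 0

    def fill(self, initial_strength):
        # Conception or mutation decision 504: dead -> active.
        self.state = State.ACTIVE
        self.strength = initial_strength

    def age(self, amount=1):
        # Ageing processes 505 (active) and 508 (in-use), modelled here
        # as a plain strength depletion; a timeout would work similarly.
        if self.state is State.DEAD:
            return
        self.strength = max(0, self.strength - amount)
        if self.strength == 0:
            self.state = State.DEAD


def promote_strongest(slots, max_in_use, min_strength):
    """Decision based on strength 506: active -> in-use."""
    in_use = sum(s.state is State.IN_USE for s in slots)
    candidates = sorted(
        (s for s in slots
         if s.state is State.ACTIVE and s.strength >= min_strength),
        key=lambda s: s.strength,
        reverse=True,
    )
    for s in candidates[:max(0, max_in_use - in_use)]:
        s.state = State.IN_USE
```

The sketch only promotes templates and never demotes them, since Figure 6 as described shows no transition from in-use back to active.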

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention provides an apparatus and method for processing data in a telecommunications network, the apparatus comprising an input, wherein the input receives data; storage, wherein the storage stores a set of templates, each template having a strength value; a processor, wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the processor increases the strength value of the matched template. Further, wherein based on the result of the comparison, the processor creates a new template in the set of templates, the new template being stored in the storage. Preferably, templates which do not match have their strength value depleted. Templates with no strength are replaced by new templates. Therefore, templates effectively compete against each other to match incoming packets. Only the "winners" get stronger, "losers" age & die. Mutations capture near misses and split microflows.

Description

Improved Data Compression
Field of Invention
The present invention relates to an apparatus and method for processing data, specifically, but not exclusively, processing data contained in frames, packets or cells which is then compressed and transmitted during wireless communications.
Background of Invention
In known telecommunications networks, data is typically split and sent over the network in individual amounts called "packets", "frames" or "cells". Communication protocols used to transmit data over various networks (fixed, wireless etc) are well established. Some protocols, such as Asynchronous Transfer Mode (ATM) call for fixed size cells, while other protocols such as the Internet Protocol (IP) utilize variable size packets. Ethernet data is transmitted in frames.
In IP networks, there is no single dedicated connection between communicating devices or entities. Each packet making up the source data may include a complete destination address and each one may be sent and routed independently. The IP protocol provides features for addressing, type-of-service specification, fragmentation, re-assembly, and security of data packets, and is defined in Request For Comments (RFC) 791. The Internet "Request For Comments" documents are written definitions of the protocols and policies of the Internet and are readily found on many websites.
Alternatively, ATM networks employ a virtual circuit, which is established at the start of a data transfer and fixed size ATM cells are transferred via the same path until at the end of the data transfer the virtual circuit may be released.
ATM is defined in many standards maintained by the ITU (International Telecommunications Union). Figure 1 illustrates a typical packet 100 used to transport information in a data network as known in the art and which comprises an IP header 102 of 20 bytes, a UDP header 104 of 8 bytes, a RTP header 106 of a minimum of 12 bytes and a payload 108 of up to N bytes.
Figure 2 illustrates the information contained in a typical IP header 102, also as known in the art. The IP header 102 in Figure 2 comprises 20 bytes of data arranged in eleven fields relating to various aspects for the delivery of data from a source to a destination. Of particular note are the IP source address 114 and IP destination address 116 fields.
Figures 3a and 3b are diagrams showing two typical ATM cell formats. In general an ATM cell consists of a 5 byte header and a 48 byte payload. ATM defines two different cell formats: NNI (Network-Network Interface) and UNI (User-Network Interface). Most ATM links use UNI cell format.
Figure 3a shows a UNI cell header format 200, which comprises 4 bits of GFC (Generic Flow Control) 201, 8 bits of VPI (Virtual Path Identifier) 202, 16 bits of VCI (Virtual Channel Identifier) 203, a Payload Type (PT) 204 of 3 bits, a Cell Loss Priority (CLP) field 205 of 1 bit and a Header Error Control (HEC) field 206 of an 8-bit CRC (Cyclic Redundancy Check) polynomial. A 48 byte payload 207 is also shown.
Figure 3b shows an NNI cell header format 300, which comprises 12 bits of VPI (Virtual Path Identifier) 302, 16 bits of VCI (Virtual Channel Identifier) 303, a Payload Type (PT) 304 of 3 bits, a Cell Loss Priority (CLP) field 305 of 1 bit and a Header Error Control (HEC) field 306 of an 8-bit CRC (Cyclic Redundancy Check) polynomial. A 48 byte data payload 307 is also shown.
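The field widths given for Figures 3a and 3b can be illustrated with a short sketch that unpacks a 5 byte header. The function name `parse_atm_header` and the dictionary keys are hypothetical and chosen only for this example; the bit widths are those stated above.

```python
def parse_atm_header(header: bytes, nni: bool = False) -> dict:
    """Split a 5 byte ATM cell header into its fields (UNI by default)."""
    if len(header) != 5:
        raise ValueError("an ATM cell header is exactly 5 bytes")
    word = int.from_bytes(header[:4], "big")  # first 32 bits of the header
    fields = {}
    if nni:
        fields["vpi"] = word >> 20            # 12 bits, no GFC in NNI
    else:
        fields["gfc"] = word >> 28            # 4 bits
        fields["vpi"] = (word >> 20) & 0xFF   # 8 bits
    fields["vci"] = (word >> 4) & 0xFFFF      # 16 bits
    fields["pt"] = (word >> 1) & 0x7          # 3 bits
    fields["clp"] = word & 0x1                # 1 bit
    fields["hec"] = header[4]                 # 8-bit CRC over the header
    return fields
```

A protocol-aware compressor would need field knowledge of this kind for every supported format, which is the dependency the template approach described later is intended to avoid.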
In general, wireless telecommunications networks include two or more wireless devices (stations) capable of communicating with one another. This communication may or may not occur via a wireless Access Point (AP). A wireless AP will normally allow access from a wireless device to another network, such as the Internet. A wireless communications device (WCD) may be a portable computer with a wireless local area network (W-LAN) card that has been installed, or that is already integrated.
A WCD may also be a mobile phone, or a personal data assistant (PDA), a computer equipped with a wireless modem, or any other device capable of transmitting and receiving information to another communication device using a radio frequency (RF).
A wireless base station may be a fixed base station, or might alternatively comprise a mobile communication device, a satellite, or any other device capable of transmitting and receiving communications from a WCD using radio frequency (RF).
Data packets suitable for transmission over a data network are generated by, for example, an application running on a WCD, such as an Internet web-browser, an email application, a digital video camera, and others. In such applications, data packets suitable for transmission over a data network are generated, each data packet comprising necessary header information needed to communicate with a destination device. Such header information is referred to interchangeably herein as data network header information. For example, a web-browser operating on a WCD may generate IP packets when a user wishes to access a web page. IP packets comprise data network header information, for example, header information relating to the Internet Protocol (IP), User Datagram Protocol (UDP), Transmission Control Protocol (TCP) and Real-time Transport Protocol (RTP) etc.
A packet flow can be defined as a series or sequence of one or more related packet exchanges between two or more communicating entities e.g. a WCD and a base station. Packet flow examples include the exchange of IP signalling packets that control a mobile Internet Protocol version 6 (IPv6) handover, or the packet exchanges involved in a Session Initiation Protocol (SIP) transfer, or packets involved in a file transfer using the File Transfer Protocol (FTP).
In a wireless telecommunications network operating a packet switching technology, it would be advantageous to be able to offer voice and data services using the Internet, but the need to transmit the combined IP, UDP and RTP headers in each packet is a disadvantage. The three headers are respectively 20, 8 and 12 bytes per packet, and this 40 byte load is nearly double the voice payload of 23 bytes for 20 milliseconds in the voice coding system known as GSM-FR (Global System for Mobile communication - Full Rate). It is well known that wireless resource/bandwidth is more expensive than landline arrangements, and the overhead of the large header load is a serious drawback.
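The overhead quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch; the header sizes are those given with reference to Figure 1, the 23 byte payload is the figure quoted in the text, and the function name `header_overhead` is a hypothetical helper introduced only for illustration.

```python
IP_HEADER = 20      # bytes, per Figure 1
UDP_HEADER = 8      # bytes, per Figure 1
RTP_HEADER = 12     # bytes (minimum), per Figure 1
VOICE_PAYLOAD = 23  # bytes per 20 ms, as quoted in the text


def header_overhead(payload, headers=(IP_HEADER, UDP_HEADER, RTP_HEADER)):
    """Return total header bytes and their share of the whole packet."""
    total = sum(headers)
    return total, total / (total + payload)


total, fraction = header_overhead(VOICE_PAYLOAD)
print(f"{total} header bytes, {fraction:.0%} of each packet")
```

Under these figures the headers account for roughly 63% of each transmitted packet, which is why header compression is attractive on a bandwidth-limited wireless link.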
Known wireless noise and bandwidth problems are as follows: high error rates, caused by radio interference, rain fade, or multi-path propagation etc; long delays, due to lower transmission speeds and large transmission distances for example; processing power constraints, due to short battery life, device size and weight for example; and scarce channel resources, due to frequency allocation and spectrum constraints for example. Further, in order to improve transmission efficiency and to keep pace with the exploding demand for digital information, there is a constant design objective to pump increasingly more data through the same bandwidth pipeline over any given network.
One way to achieve this objective is through packet compression. Packets may be compressed at a server, transmitted in their compressed state over a network, and decompressed at a client. Apart from compressing whole packets, another solution is partial packet compression in which only certain portions of the packet, such as a header or a data payload, are compressed.
Prior art compression techniques are described as follows: One technique for compressing packet headers only is discussed in an article by V. Jacobson, entitled "Compressing TCP/IP Headers for Low-Speed Serial Links". The Jacobson technique provides an elaborate and complex compression scheme that reduces a 40-byte TCP/IP (Transmission Control Protocol/Internet Protocol) packet header to a three-byte compressed header.
The compressed header has an encoded change to the packet ID, a TCP checksum, a connection number, and a change mask. The hardware and/or software used to implement the Jacobson technique must perform sophisticated computations that compress the 40-byte header to the three-byte compressed header, and then subsequently decompress the compressed header to reproduce the uncompressed header. Further, the Jacobson technique requires "knowledge" of the format of the TCP header in order for it to work.
Another technique for compressing packet headers only is discussed in an article by S. Casner and V. Jacobson, entitled "Compressing IP/UDP/RTP Headers for Low-Speed Serial Links." This technique reduces the 40-byte IP/UDP/RTP header to an average of between 2 and 4 bytes generally by transmitting second order differences when one or more fields within the header change.
Another technique is disclosed in US 7,035,656 (Goldberg, Interdigital Technology Corp). This technique is based on the determination of common phrases found within the data sent from user equipment to and the compression of those "common" phrases. The patent relates to the compression of the data only, not the headers, and further mentions the use of templates and dictionaries in order to compress common phrases.
A further technique is disclosed in US Patent 7,061,936 (Yoshimura et al). This technique is based on the compression of an RTP header only and further discloses improvements on the IETF RFC 2508 RTP header compression technique, as well as improvements on the ROHC (Robust Header Compression) technique. US Patent 7,245,639 (Westphal, Nokia) discloses a header compression scheme that makes use of the similarity of consecutive flows from or to a given mobile terminal to compress those headers. US Patent 7,245,639 is herein incorporated by reference.
Brief Summary of the Invention
The present invention therefore seeks to overcome, or at least reduce, some of the above-mentioned problems of the prior art.
Accordingly, in a first aspect, the invention provides an apparatus for processing data in a telecommunications network, the apparatus comprising: an input, wherein the input receives data; storage, wherein the storage stores a set of templates, each template having a strength value; a processor, wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the processor increases the strength value of the matched template.
Preferably, wherein if the at least a portion of the data matches more than one of the templates in the set of templates, the processor chooses a winning template and increases the strength value of the winning template.
Accordingly, in a second aspect, the invention provides an apparatus for processing data in a telecommunications network, the apparatus comprising: an input, wherein the input receives data; storage, wherein the storage stores a set of templates, each template having a strength value; a processor, wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the processor increases the strength value of the matched template; or if the at least a portion of the data matches more than one of the templates in the set of templates, the processor chooses a winning template and increases the strength value of the winning template.
Preferably, wherein a template may contain a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
Preferably, wherein the processor uses the matched, or winning template to compress received data.
Further preferably, wherein the processor uses a template with the most matched static bytes to compress received data.
Preferably, wherein a successful template match requires all static bytes to be matched.
Accordingly, in a third aspect, the invention provides an apparatus for processing data in a telecommunications network, the apparatus comprising: an input, wherein the input receives data; storage, wherein the storage stores a set of templates; a processor, wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein based on the result of the comparison, the processor creates a new template in the set of templates, the new template being stored in the storage.
Preferably, wherein a template contains a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
Further preferably, wherein if the compared data matches with all static bytes and at least one dynamic byte in a template, then a new template is created by the processor which then records the dynamic byte as a static byte, the matched template is then known as a parent template.
Further preferably, wherein if the compared data matches at least one static byte in a template, but not all static bytes in a template, then a new template is formed which records only the matched static bytes and the remaining bytes as dynamic, the matched template is then known as a parent template. Further preferably, wherein if the compared data matches no templates, then a new template is formed which records all bytes as dynamic.
Further preferably, wherein the new template may be given an initial strength value, or may be assigned a strength value greater than the parent template strength value, or may be assigned a strength value less than the parent template strength value.
Also preferably, wherein the processor compares the whole of the received packet to each template.
Preferably, wherein the processor decreases the strength of all templates stored in storage that do not match the received data, and/or wherein the strength is decreased using a count depletion, and/or wherein the strength is decreased using a timeout.
Preferably, wherein templates that have no strength are replaced by new templates, templates are stored in a table, and/or the processor uses a selection of templates with strength values above a threshold to compress received data, and/or the data is transmitted to a decompressor using the Van Jacobson technique.
Preferably, wherein the data is selected from the list of: IP packets, TCP packets, UDP packets, RTP packets, Ethernet frames, or ATM cells, or any mixture or combination thereof.
Further preferably, wherein the apparatus is implemented using FPGAs.
Accordingly, in a fourth aspect, the invention provides a method for processing data in a telecommunications network, the method comprising: receiving data; storing a set of templates, each template having a strength value; comparing at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, increasing the strength value of the matched template.
Accordingly, in a fifth aspect, the invention provides a method for processing data in a telecommunications network, the method comprising: receiving data; storing a set of templates, each template having a strength value; comparing at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the strength value of the matched template is increased; or if the at least a portion of the data matches more than one of the templates in the set of templates, a winning template is chosen and the strength value of the winning template is increased.
Accordingly, in a sixth aspect, the invention provides a method for processing data in a telecommunications network, the method comprising: receiving data; storing a set of templates; comparing at least a portion of the received data to each template in the set of templates; and wherein based on the result of the comparison, creating a new template in the set of templates, the new template being stored in the storage.
Further preferable features can be found in the dependent claims.
One advantage of the method and apparatus for reducing transmission overhead in a communication system is that, when the "winning" templates are used to compress incoming data, a reduction in the bandwidth necessary to transmit information is achieved. This results in either higher data rates or an increased number of users over a given bandwidth.
Yet another advantage of the method and apparatus for reducing transmission overhead in a communication system is that latency is reduced for real-time applications, such as voice or video.
Another advantage is that the method and apparatus is protocol independent, as templates are created from the data received at the input to the compressor. Yet another advantage is that the way in which the compressor maintains the templates is adaptive to the data being received.
Brief Description of the Drawings
One embodiment of the invention will now be more fully described, by way of example, with reference to the drawings, of which:
Figure 1 illustrates a data packet used to transport information in a data network as known in the art;
Figure 2 illustrates the information contained in a typical IP header, as known in the art;
Figures 3a and 3b are diagrams showing two typical ATM cell formats as known in the art;
Figure 4 is a diagram showing a data compressor and decompressor, according to one embodiment of the present invention;
Figure 5 is a flow diagram showing a template matching and creation process, according to one embodiment of the present invention; and
Figure 6 is a state diagram showing a template management process, according to one embodiment of the present invention.
Detailed Description of the Drawings
The present invention is directed to a system and method for reducing transmission overhead in a communication system, but which is protocol independent. Although one embodiment of the present invention is described herein with respect to a wireless terrestrial communication system, it should be understood that other embodiments are possible, including use in a satellite communication system, or in a wire-based communication system as well.
Most header or payload compression techniques as known in the art replace the data packet by a label or compressed data packet at one end of the link, transmit the data with the label attached, then replace the label at the other end of the link by the original (reconstructed) data packet.
The Van Jacobson (VJ) technique and the ROHC technique work on the same principle. These schemes make use of the predictable behavior of the header sequence within one microflow. A "microflow" is disclosed in US 7,245,639 (Westphal) as a sequence of packets, such that two consecutive packets are within T units of time of each other.
However, all known prior art schemes require knowledge of the header or data packet structure being compressed.
In Westphal it appears that the compression scheme disclosed uses a "key" and packets are matched to this key. It appears the "key" is created using knowledge of the protocol, that is for example the key may be a destination address. Filters are also used which are a map of an IP packet's source IP address, source port, destination IP, destination port and protocol.
Therefore, Westphal discloses a protocol dependent technique, whilst the present invention is protocol independent. The present invention uses a "variable key", where all the static and dynamic areas of each packet are recognized and mapped, and each and every incoming packet is either matched to an existing template, or a new template is created.
In the Westphal disclosure, two tables are created, one for the most frequent matches and one for the most recent matches, with separate processes for managing those tables. The present invention uses evolutionary factors, such as conception, mutation and growth, in order to adapt and flex to the incoming microflows.
The compression scheme of the present invention is therefore protocol independent and is highly adaptive to the traffic in each microflow. In a brief overview of one embodiment of the present invention, there is shown in Figure 4 a compressor 401, a decompressor 410 and a link 413 between the two (which may be a cable or RF link for example).
The compressor 401 comprises a processor 402 and storage 403 capabilities and the decompressor 410 also comprises a processor 408 and storage 409 capabilities.
Received data 406 contained in Ethernet frames, for example, enters the compressor 401 via its input 404. The compressor 401 compresses the received data 406 and then sends the data 406 via its output 405 over a link 413 to the input 407 of the decompressor 410. The received data 406 is then decompressed by the decompressor 410 (if necessary, see below) and forwarded (transmitted) to the output 411 of the decompressor 410. Uncompressed data 412 then exits the decompressor 410.
It should be clear to someone skilled in the art that the compressor 401 and decompressor 410 are contained in a wireless communications device or network server for example and that other functions are required within each device in order for them to operate within a wired or wireless telecommunications network. The specific functions of the present invention could be implemented using FPGAs (Field Programmable Gate Arrays) for example.
Further, if it is assumed that the network is symmetrical, then compressor and decompressor functions will be required to carry out the process in the opposite direction (each device will therefore in practice contain both a compressor and decompressor). It should be noted that for the purposes of the present invention each directional link is assumed to be independent of the other.
The compressor 401 comprises a table (not shown) stored in the storage 403 function, the table comprising a set of templates (not shown). Each template has a depth defined as "d" which is the number of bytes that is to be matched in each Ethernet frame. Depth "d" can be as little as one byte, for example, or could be as large as the size of a full Ethernet frame. This is an implementation specific choice and also, the value of "d" can be chosen to adapt the compression technique to a particular protocol if required, or may remain totally independent.
There are "T" templates in each table and a template consists of (a) an N bit mask, indicating which bytes are static in the first N bytes of matching frames; (b) the values of those static bytes; (c) an assigned or instantaneous strength value; as well as (d) a record of the position and value of any dynamic bytes.
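One possible in-memory layout for such a template is sketched below. This is an assumption-laden illustration rather than the layout used by the embodiment: the class name `Template`, its field names and the helper `from_frame` are hypothetical, and the static flags are held as a list of booleans that `mask_bits` packs into an N bit integer.

```python
from dataclasses import dataclass


@dataclass
class Template:
    static: list       # (a) static[i] is True where byte i is static
    values: bytearray  # (b) and (d): last seen value of each of the N bytes
    strength: int      # (c) assigned or instantaneous strength

    def mask_bits(self) -> int:
        """Pack the static flags into an N bit mask, bit i for byte i."""
        return sum(1 << i for i, flag in enumerate(self.static) if flag)

    def static_bytes(self) -> dict:
        """Positions and values of the static bytes."""
        return {i: self.values[i] for i, f in enumerate(self.static) if f}

    def dynamic_bytes(self) -> dict:
        """Positions and values of the dynamic bytes."""
        return {i: self.values[i] for i, f in enumerate(self.static) if not f}


def from_frame(frame: bytes, n: int, strength: int) -> Template:
    """Build a template with no static bytes from the first n frame bytes.

    Frames shorter than n are zero padded, an assumption of this sketch.
    """
    head = bytearray(frame[:n].ljust(n, b"\x00"))
    return Template([False] * n, head, strength)
```

Keeping one value per byte position lets the same array serve both the static values of item (b) and the dynamic record of item (d), with the mask deciding which is which.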
The compressor 401 examines each of the Ethernet frames that is received at the compressor's input 404, looking to categorize the frames according to the positions and values of static bytes within the first N bytes of each frame. In operation, the compressor will operate on an actual data flow; in a laboratory environment, the compressor may operate on data that has previously been recorded. A new template is then formed from the profile of the incoming data to represent each new category (or microflow type) found. The template formation may either itself operate on the same timescale as the data examination, or on a much slower timescale, for example by "sniffing" samples of the incoming data stream.
The following "Evolutionary" techniques have been developed to maintain the template set:
• Growth. A template will gain in strength each time it fully matches a frame. The more static bytes there are in the template the stronger the template grows. There is also an "ageing" process which drains all templates over time (see Figures 5 and 6 for more details).
• Conception. The first frame in a microflow, or a frame which does not perfectly match any current template, is a good candidate for use as a new template. This constantly introduces new blood and primes the template set initially.
• Death. Templates which are not gaining strength by matching frames are discarded by the ageing process, maintaining a supply of unused template positions for the creation and mutation mechanisms. Newly created templates have an initial assigned strength to give them a chance to become established.
• Mutation. Templates may be duplicated with more (or fewer) static bytes by recently matching (or near matching) frames.
Therefore, templates effectively compete against each other to match incoming frames. Only the "winners" get stronger, "losers" age & die. When a new flow starts up it is usually soon captured by a conception. This template does not have any static bytes initially, but is likely to soon spawn a mutant which captures the essence of the flow. This may further mutate to track nuances in the flow. Both templates may coexist, or one may die, depending on the flow dynamics. When the flow stops the template inevitably and eventually "dies".
The process of template matching and creation is described in more detail below with reference to Figure 5.
The template management process is described in further detail with reference to Figure 6 following.
Figure 5 is a flow diagram showing a template matching and creation process, according to one embodiment of the present invention. The process as described in Figure 5 starts at the point marked "S" and finishes at point "F" and is executed for every received frame. The process comprises the following steps:
Step A0: Ageing - age all stored templates, decrease template strengths by 1 and kill any stored templates with no strength. Therefore, all templates lose (for example) 1 unit of strength for each frame received, and die when their strength reaches zero (see Figure 6 for a further discussion of the ageing process).
Step A1: Match the received frame against all "active" templates to determine if it matches all the static bytes in any of them. Active templates are described in more detail with reference to Figure 6 following.
Step A2: Is there a full static match? If the incoming frame has the same static byte pattern in the first N bytes as any of the active templates, a match is made and the process moves on to step A3. Otherwise the process moves on to step A7.
Step A3: If more than one template matches, then the one with the most matched static bytes is chosen as the best and the strength of this best matched template is increased (for example by the number of static bytes). If two or more templates have exactly the same number of static bytes when matched to the frame, then the compressor must arbitrarily choose a winner. It does not matter how the decision is made, but it is best if made consistently from decision to decision, for example the template with a higher position in the table is always chosen. It must therefore be the case that only one winning, best matched, template gains strength.
Step A4: This frame is therefore a candidate for compression using this "winning best matched" template. Compression involves removing the static bytes from the frame before sending it to the decompressor. The decompressor will then reinsert them to reconstruct the original frame. This requires the compressor to have previously notified the decompressor of the necessary template context using prior art methods such as those due to Van Jacobson.
Step A5: If the current template has been matching, or even persistently matching, some of its dynamic bytes as well as all its static ones, then there is now the opportunity to create a mutant template with MORE static bytes. Persistent here means in two consecutive full matches, for example. If so, the process moves on to step A6; if not, the process ends at point F.
Step A6: If the template has been matching, or persistently matching, some of its dynamic bytes as well as all its static ones, then a mutant template is created via this step with MORE static bytes. Then either the parent strength can be assigned, or an initial strength can be assigned which may, for example, be more or less than the parent strength.
Step A7: On the other hand, if there is no full static match at all for the frame (the "no" result from step A2), then we look for the best near miss, i.e. the template which matched the most static bytes. If a near miss is found, the process moves to step A8; if not, it moves to step A9.
Step A8: This template is a candidate for a mutant with FEWER static bytes. Then either the parent strength can be assigned, or an initial strength can be assigned which may be more or less than the parent strength.
Step A9: Finally, if there is not even a near miss, then there is the ability to conceive a new "conception" template, with no static bytes, which will spawn further mutations if the flow persists. This is how new flows are detected, and indeed how the mechanism is primed at startup. A new "conception" template is created in the next vacant position (left by a "dead" template for example) and the number of static bytes is set to zero. In this case all dynamic bytes are recorded. This scheme ensures that a single isolated frame cannot cause a new "non-conception" template to be created. An initial strength value is assigned to the new "conception" template in order that it can survive for a short time, but if packets do not start matching this new "conception" template, then it is sure to die (see Figure 6 for more details).
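Steps A0 to A9 can be drawn together in a compact sketch. What follows is one possible reading of the flow, not the embodiment itself: the names `Template`, `process_frame`, `DEPTH`, `MAX_TEMPLATES` and `streak` are hypothetical, the small constants are chosen for readability, "persistent" is taken to mean that a dynamic byte repeated on two consecutive full matches, ties are broken in favour of the earliest table entry, and new templates are simply skipped when the table is full. The initial strengths follow the example values of depth/2 and N plus depth/4 mentioned in the description.

```python
from dataclasses import dataclass, field

DEPTH = 8          # bytes examined per frame (assumed small for the sketch)
MAX_TEMPLATES = 4  # table size T (assumed small for the sketch)


@dataclass
class Template:
    static: list       # True where a byte position is static
    values: bytearray  # last seen value at every position
    strength: int
    streak: list = field(default_factory=list)  # dynamic hits last time


def process_frame(table, frame):
    """Run steps A0 to A9 for one frame; return the winner or None."""
    head = frame[:DEPTH].ljust(DEPTH, b"\x00")
    # A0: age every template and drop those with no strength left.
    for t in table:
        t.strength -= 1
    table[:] = [t for t in table if t.strength > 0]

    def hits(t):  # static bytes of t that this frame matches
        return sum(1 for i in range(DEPTH)
                   if t.static[i] and t.values[i] == head[i])

    # A1/A2: templates whose static bytes all match. A template with no
    # static bytes matches every frame vacuously.
    full = [t for t in table if hits(t) == sum(t.static)]
    if full:
        # A3: most static bytes wins; max() keeps the earliest on ties.
        winner = max(full, key=lambda t: sum(t.static))
        winner.strength += sum(winner.static)
        # A4: the frame is now a candidate for compression with `winner`.
        # A5: dynamic bytes repeated now and on the previous full match.
        same = [i for i in range(DEPTH)
                if not winner.static[i] and winner.values[i] == head[i]]
        persistent = [i for i in same if i in winner.streak]
        winner.streak = same
        if persistent and len(table) < MAX_TEMPLATES:
            # A6: mutant with MORE static bytes.
            static = [s or i in persistent
                      for i, s in enumerate(winner.static)]
            table.append(Template(static, bytearray(head),
                                  sum(static) + DEPTH // 4))
        winner.values[:] = head  # refresh dynamic bytes after mutation
        return winner
    # A7: look for the best near miss.
    near = max(table, key=hits, default=None)
    if near is not None and hits(near) > 0:
        if len(table) < MAX_TEMPLATES:
            # A8: mutant with FEWER static bytes (only those that matched).
            static = [near.static[i] and near.values[i] == head[i]
                      for i in range(DEPTH)]
            table.append(Template(static, bytearray(head),
                                  sum(static) + DEPTH // 4))
    elif len(table) < MAX_TEMPLATES:
        # A9: conception, a template with no static bytes.
        table.append(Template([False] * DEPTH, bytearray(head), DEPTH // 2))
    return None
```

Feeding this sketch a flow whose first three bytes never change produces a conception template on the first frame, a three byte mutant on the third, and the death of the conception template shortly afterwards. The sketch makes no attempt to suppress duplicate mutants, which a fuller implementation would probably want.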
The strength of a template is indicative of the total amount of bandwidth saved by using that template to compress the flow of packets through the compressor.
Therefore, the more static bytes in a template the better, and this makes for a "stronger" template (via steps A4, A5 and A6). The more often a template matches the incoming Ethernet frames, the stronger it gets, since more frequent matches save more bandwidth. At any one time after conception, each candidate template has an instantaneous strength which is a measure of the overall bandwidth that would be saved by using this template. A new template takes all its byte values (static & dynamic) from its creating, or parent, frame. Dynamic bytes in a template are then updated with those from every best matched frame, after the consideration of mutation.
All templates lose 1 unit of strength for each frame received, and die when their strength reaches zero. Any winning best match template gains "N" units, where "N" is the number of static bytes in that template. Mutation templates may be created with a strength equal to their N, plus a constant. This constant may, for example, be equal to the depth divided by a factor of 4 (depth/4). Conception templates are created with strength equal to depth/2. The number of bytes into a frame for which matches are considered (the depth) may typically be 96. The maximum number of concurrently active templates is typically 100. It is therefore also possible that there are more templates in the table than valid flow ids (as some are ageing). Further, the term "byte" is used here as a single example; it should be clear to someone skilled in the art that word lengths other than 8 bits are perfectly usable.
The compressor 401 therefore only picks up on the common bytes in consecutive frames; all random information is ignored. No knowledge of the incoming packet structure is required.
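The strength arithmetic described above (one unit lost per frame, N units gained by the winner, initial strengths of N + depth/4 for mutations and depth/2 for conceptions) can be sketched as follows. The function names, the use of integer division and the choice to test for death after the winner has been rewarded are assumptions made for illustration only.

```python
from typing import Dict, Hashable, Optional

DEPTH = 96  # typical number of bytes into a frame considered for matching


def conception_strength(depth: int = DEPTH) -> int:
    """Initial strength of a conception template: depth/2."""
    return depth // 2


def mutation_strength(n_static: int, depth: int = DEPTH) -> int:
    """Initial strength of a mutation template: N plus the example constant depth/4."""
    return n_static + depth // 4


def age_and_reward(
    strengths: Dict[Hashable, int],
    n_static: Dict[Hashable, int],
    winner: Optional[Hashable] = None,
) -> Dict[Hashable, int]:
    """Apply one received frame to the whole table and return the survivors."""
    survivors = {}
    for tid, strength in strengths.items():
        strength -= 1                   # every template loses 1 unit per frame
        if tid == winner:
            strength += n_static[tid]   # the winning template gains N units
        if strength > 0:                # a template dies when strength reaches zero
            survivors[tid] = strength
    return survivors
```

Under these rules a template with N static bytes that wins once in every k frames changes strength by N - k over those k frames, so in this sketch it persists indefinitely only if it wins at least once every N frames.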
A set of templates, known as "active" templates (see Figure 6), which offer the most compression gain are applied by the compressor to the incoming frames, removing the static bytes and replacing them with a single control byte (or label), as described in further detail below. These active templates are then deemed as being "in-use" (see Figure 6).
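As a rough illustration of replacing the static bytes with a single control byte, the sketch below drops the static positions of a frame and prefixes a one-byte label. The function name compress, the layout of the output (label, then dynamic bytes, then any bytes beyond the template depth unchanged) and the assumption that the frame has already been found to match all static bytes of the template are illustrative choices, not details given in the specification.

```python
from typing import List


def compress(frame: bytes, static: List[bool], label: int) -> bytes:
    """Remove the static bytes of a matched frame and prefix a single label byte.

    `static` marks, for each of the first len(static) bytes, whether the
    template holds that byte as static. The caller is assumed to have checked
    that the frame matches every static byte of the template identified by
    `label` (0 to 255).
    """
    dynamic = bytes(b for b, s in zip(frame, static) if not s)
    tail = frame[len(static):]  # bytes beyond the template depth pass through
    return bytes([label]) + dynamic + tail
```

For example, an 11-byte frame whose first two bytes are static in template 7 is reduced to 10 bytes: one label byte replaces the two static bytes.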
The decompressor 410 must have enough context information to replace these bytes and therefore reconstruct the original frames. In this case the Van Jacobson technique can be used to transmit the data packets with the labels created from the data stored in the frame and which is deemed most applicable to the packet in that microflow (i.e. taken from the set of in-use templates). It should be clear to someone skilled in the art that any transmission technique could be implemented in both the compressor 401 and decompressor 410 in order to utilize the information contained in the templates and reduce the size of the packets being transmitted over the link 413.
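The role of the context held at the decompressor can be sketched as follows. This is a generic label-plus-context illustration, not an implementation of the Van Jacobson technique itself; the function name decompress, the packet layout (one label byte, then the dynamic bytes, then any remaining bytes) and the shape of the context table are assumptions.

```python
from typing import Dict, List, Tuple

# Context assumed to be shared with the compressor: label -> (values, static map).
Context = Dict[int, Tuple[bytes, List[bool]]]


def decompress(packet: bytes, context: Context) -> bytes:
    """Rebuild the original frame from a label byte plus the dynamic bytes."""
    label, body = packet[0], packet[1:]
    values, static = context[label]
    out = bytearray()
    pos = 0
    for value, is_static in zip(values, static):
        if is_static:
            out.append(value)       # static byte restored from the stored template
        else:
            out.append(body[pos])   # dynamic byte carried in the packet
            pos += 1
    return bytes(out) + body[pos:]  # bytes beyond the template depth are unchanged
```

The sketch makes the dependency explicit: reconstruction is only possible if the decompressor holds the same static byte values and static/dynamic map as the compressor for the label received.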
Figure 6 is a state diagram showing template management processes, according to one embodiment of the present invention.
During its lifetime, a template may be in one of three different states, known as "dead" 501, "active" 502 or "in-use" 503.
Templates may move from one state to another via conception or mutation processes, by an ageing process or via a decision based on strength. All template positions in the table start life as "dead" 501 and will move to an "active" 502 status once a template position has been filled in with data via a conception or mutation decision 504 (described in steps A5, A6 & A7 with reference to Figure 5 previously).
The set of "active" templates which offer the most compression gain are applied by the compressor to the incoming frames and so, via a decision based on strength 506, this set of templates will move from the state of active 502 to in-use 503. The set of "in-use" templates could have a fixed or variable size, or have to exceed a minimum strength threshold, depending upon the exact implementation required.
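One possible form of the strength-based decision 506 is sketched below, combining the two options mentioned above: a maximum set size and a minimum strength threshold. The function name select_in_use, its parameters and the tie-breaking by template id are hypothetical.

```python
from typing import Dict, List


def select_in_use(strengths: Dict[int, int], max_in_use: int, min_strength: int = 1) -> List[int]:
    """Return the ids of the strongest templates, strongest first.

    Only templates whose strength is at least `min_strength` are eligible,
    and at most `max_in_use` of them are returned. Ties are broken by the
    lower template id (an arbitrary choice for this sketch).
    """
    eligible = [(s, tid) for tid, s in strengths.items() if s >= min_strength]
    eligible.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tid for _, tid in eligible[:max_in_use]]
```

Setting max_in_use to the table size gives a purely threshold-based, variable-size set, while setting min_strength to 1 gives a purely size-limited set.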
It may also be advantageous to delay packets within the compressor to help evaluate the value of an "active" template before changing its status to "in-use", in order to improve the efficiency of the transmission process.
A template moves from the state in-use 503 to the state dead 501 by a first ageing process 508. The ageing process 508 may be implemented using a combination of factors, for example a strength depletion or a fixed timeout. A strength depletion could occur every time a packet does not match an active template. A fixed timeout could be a specific count which is depleted over time, as known in the art. Any other combination of threshold, time or count could be used.
A template moves from the state of active 502 to dead 501 also by a second ageing process 505. This second ageing process 505 could be the same as the first one previously described (508), or it could be designed with different threshold, count or time parameters, if required.
It is not envisaged that a template would move from the state of in-use 503 back to the state of active 502, but if it did, this state change could be based on a strength decision 507 or be due to subtle changes in the microflow.
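The state changes of Figure 6 described above can be summarised as a small transition table. The sketch below is illustrative only: the event names are hypothetical, the enumeration values simply reuse the reference numerals 501 to 503, and any event that is not listed for a state is assumed to leave the state unchanged.

```python
from enum import Enum


class State(Enum):
    DEAD = 501
    ACTIVE = 502
    IN_USE = 503


# (current state, event) -> next state; comments give the Figure 6 numerals.
TRANSITIONS = {
    (State.DEAD, "conception_or_mutation"): State.ACTIVE,  # decision 504
    (State.ACTIVE, "ageing"): State.DEAD,                  # second ageing process 505
    (State.ACTIVE, "strength_decision"): State.IN_USE,     # strength decision 506
    (State.IN_USE, "strength_decision"): State.ACTIVE,     # strength decision 507 (not envisaged)
    (State.IN_USE, "ageing"): State.DEAD,                  # first ageing process 508
}


def step(state: State, event: str) -> State:
    """Return the next state, or the same state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

A template position therefore follows a path such as dead, active, in-use, dead, with the return from in-use to active kept in the table only because the description allows for it.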
Parameters such as maximum number of templates T, maximum header depth d, and strength rates & thresholds may be determined empirically, although an adaptive scheme could be considered.
The present invention provides a good compression ratio both when packets with many static bytes are sent infrequently and when packets with few static bytes are sent frequently. When used with smaller packets, compression ratios of up to 4:1 may be achieved.
The only assumption made by the present invention is that the incoming data consists of a stream of headers and payloads and that each payload is preceded by a header. The present invention can be used for any type of Ethernet frame, IP packet or ATM cell structure, or any mixture or combination thereof.
The present invention gives the advantages of being flexible, adaptable and protocol independent whilst still providing a good level of compression.
It will be appreciated that although only one particular embodiment of the invention has been described in detail, various modifications and improvements can be made by a person skilled in the art without departing from the scope of the present invention.

Claims

1. Apparatus for processing data in a telecommunications network, the apparatus comprising:
an input, wherein the input receives data;
storage, wherein the storage stores a set of templates, each template having a strength value;
a processor,
wherein the processor compares at least a portion of the received data to each template in the set of templates; and
wherein if the at least a portion of the data matches one of the templates in the set of templates, the processor increases the strength value of the matched template.
2. Apparatus as claimed in claim 1, wherein if the at least a portion of the data matches more than one of the templates in the set of templates, the processor chooses a winning template and increases the strength value of the winning template.
3. Apparatus for processing data in a telecommunications network, the apparatus comprising:
an input, wherein the input receives data;
storage, wherein the storage stores a set of templates, each template having a strength value;
a processor,
wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein if the at least a portion of the data matches one of the templates in the set of templates, the processor increases the strength value of the matched template;
or if the at least a portion of the data matches more than one of the templates in the set of templates, the processor chooses a winning template and increases the strength value of the winning template.
4. Apparatus as claimed in any preceding claim, wherein a template contains a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
5. Apparatus as claimed in any preceding claim, wherein the processor uses the matched, or winning template to compress received data.
6. Apparatus as claimed in any of claims 1 to 4, wherein the processor uses a template with the most matched static bytes to compress received data.
7. Apparatus as claimed in any preceding claim, wherein a successful template match requires all static bytes to be matched.
8. Apparatus for processing data in a telecommunications network, the apparatus comprising:
an input, wherein the input receives data;
storage, wherein the storage stores a set of templates;
a processor,
wherein the processor compares at least a portion of the received data to each template in the set of templates; and wherein based on the result of the comparison, the processor creates a new template in the set of templates, the new template being stored in the storage.
9. Apparatus as claimed in claim 8, wherein a template contains a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
10. Apparatus as claimed in claim 8 or 9, wherein if the compared data matches with all static bytes and at least one dynamic byte in a template, then a new template is created by the processor which then records the dynamic byte as a static byte, the matched template is then known as a parent template.
11. Apparatus as claimed in either claim 8 or 9, wherein if the compared data matches at least one static byte in a template, but not all static bytes in a template, then a new template is formed which records only the matched static bytes and the remaining bytes as dynamic, the matched template is then known as a parent template.
12. Apparatus as claimed in either claim 8 or 9, wherein if the compared data matches no templates, then a new template is formed which records all bytes as dynamic.
13. Apparatus as claimed in any of claims 8 to 12, wherein the new template is given an initial strength value.
14. Apparatus as claimed in any of claims 8 to 11, wherein the new template is assigned a strength value greater than the parent template strength value.
15. Apparatus as claimed in any of claims 8 to 11, wherein the new template is assigned a strength value less than the parent template strength value.
16. Apparatus as claimed in any preceding claim, wherein the processor compares the whole of the received packet to each template.
17. Apparatus as claimed in any preceding claim, wherein the processor decreases the strength of all templates stored in storage that do not match the received data.
18. Apparatus as claimed in claim 17, wherein the strength is decreased using a count depletion.
19. Apparatus as claimed in claim 17, wherein the strength is decreased using a timeout.
20. Apparatus as claimed in claim 17, 18 or 19, wherein templates that have no strength are replaced by new templates.
21. Apparatus as claimed in any preceding claim, wherein templates are stored in a table.
22. Apparatus as claimed in any preceding claim, wherein the processor uses a selection of templates with strength values above a threshold to compress received data.
23. Apparatus as claimed in any preceding claim, wherein data is transmitted to a decompressor using the Van Jacobson technique.
24. Apparatus as claimed in any preceding claim, wherein the data is selected from the list of: IP packets, TCP packets, UDP packets, RTP packets, Ethernet frames, or ATM cells, or any mixture or combination thereof.
25. Apparatus as claimed in any preceding claim, wherein the apparatus is implemented using FPGAs.
26. A method for processing data in a telecommunications network, the method comprising:
receiving data;
storing a set of templates, each template having a strength value; comparing at least a portion of the received data to each template in the set of templates; and
wherein if the at least a portion of the data matches one of the templates in the set of templates, increasing the strength value of the matched template.
27. A method as claimed in claim 26, wherein if the at least a portion of the data matches more than one of the templates in the set of templates, a winning template is chosen and the strength value of the winning template is increased.
28. A method for processing data in a telecommunications network, the method comprising:
receiving data;
storing a set of templates, each template having a strength value;
comparing at least a portion of the received data to each template in the set of templates; and
wherein if the at least a portion of the data matches one of the templates in the set of templates, the strength value of the matched template is increased;
or if the at least a portion of the data matches more than one of the templates in the set of templates, a winning template is chosen and the strength value of the winning template is increased.
29. A method as claimed in any of claims 26 to 28, wherein a template contains a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
30. A method as claimed in any of claims 26 to 29, wherein the matched, or winning template is used to compress received data.
31. A method as claimed in any of claims 26 to 29, wherein the template with the most matched static bytes is used to compress received data.
32. A method as claimed in any of claims 26 to 31, wherein a successful template match requires all static bytes to be matched.
33. A method for processing data in a telecommunications network, the method comprising:
receiving data;
storing a set of templates;
comparing at least a portion of the received data to each template in the set of templates; and
wherein based on the result of the comparison, creating a new template in the set of templates, the new template being stored in the storage.
34. A method as claimed in claim 33, wherein a template contains a map of the positions and values of static and dynamic bytes of the at least a portion of the received data.
35. A method as claimed in claim 33 or 34, wherein if the compared data matches with all static bytes and at least one dynamic byte in a template then a new template is created which then records the dynamic byte as a static byte, the matched template is then known as a parent template.
36. A method as claimed in either claim 33 or 34, wherein if the compared data matches at least one static byte in a template, but not all static bytes in a template, then a new template is formed which records only the matched static bytes and the remaining bytes as dynamic, the matched template is then known as a parent template.
37. A method as claimed in either claim 33 or 34, wherein if the compared data matches no templates, then a new template is formed which records all bytes as dynamic.
38. A method as claimed in any of claims 33 to 37, wherein the new template is given an initial strength value.
39. A method as claimed in any of claims 33 to 36, wherein the new template is assigned a strength value greater than the parent template strength value.
40. A method as claimed in any of claims 33 to 36, wherein the new template is assigned a strength value less than the parent template strength value.
41. A method as claimed in any of claims 26 to 40, wherein the whole of the received packet is compared to each template.
42. A method as claimed in any of claims 26 to 41, wherein the strength of all templates stored in storage that do not match the received data is decreased.
43. A method as claimed in claim 42, wherein the strength is decreased using a count depletion.
44. A method as claimed in claim 42, wherein the strength is decreased using a timeout.
45. A method as claimed in claim 42, 43 or 44, wherein templates that have no strength are replaced by new templates.
46. A method as claimed in any of claims 26 to 45, wherein templates are stored in a table.
47. A method as claimed in any of claims 26 to 46, wherein a selection of templates with strength values above a threshold are used to compress received data.
48. A method as claimed in any of claims 26 to 47, wherein data is transmitted to a decompressor using the Van Jacobson technique.
49. A method as claimed in any of claims 26 to 48, wherein the data is selected from the list of: IP packets, TCP packets, UDP packets, RTP packets, Ethernet frames, or ATM cells, or any mixture or combination thereof.
50. Apparatus for compressing data packets as substantially hereinbefore described, with reference to Figures 4, 5 & 6.
51. A method of compressing data packets as substantially hereinbefore described, with reference to Figures 4, 5 & 6.
PCT/GB2009/002317 2008-09-30 2009-09-29 Improved data compression WO2010038011A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP09736448A EP2353270A2 (en) 2008-09-30 2009-09-29 Improved data compression
ZA2011/02901A ZA201102901B (en) 2008-09-30 2011-04-18 Improved data compression

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0817947.5 2008-09-30
GB0817947.5A GB2463920B (en) 2008-09-30 2008-09-30 Improved data compression

Publications (2)

Publication Number Publication Date
WO2010038011A2 true WO2010038011A2 (en) 2010-04-08
WO2010038011A3 WO2010038011A3 (en) 2010-11-04

Family

ID=40019854

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2009/002317 WO2010038011A2 (en) 2008-09-30 2009-09-29 Improved data compression

Country Status (4)

Country Link
EP (1) EP2353270A2 (en)
GB (1) GB2463920B (en)
WO (1) WO2010038011A2 (en)
ZA (1) ZA201102901B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116527777A (en) * 2023-04-23 2023-08-01 国网湖南省电力有限公司 Measurement data compression acquisition method and system, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5778095A (en) * 1995-12-20 1998-07-07 Xerox Corporation Classification of scanned symbols into equivalence classes
US20050041660A1 (en) * 2003-07-08 2005-02-24 Pennec Jean-Francois Le Packet header compression system and method based upon a dynamic template creation
EP1517449A2 (en) * 2003-09-19 2005-03-23 NTT DoCoMo, Inc. Compression of XML documents

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6415061B1 (en) * 1997-06-13 2002-07-02 Cisco Technology, Inc. Method of updating dictionaries in a data transmission system using data compression
US7317724B2 (en) * 2003-07-08 2008-01-08 Cisco Technology, Inc. Performing compression of user datagram protocol packets

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116527777A (en) * 2023-04-23 2023-08-01 国网湖南省电力有限公司 Measurement data compression acquisition method and system, electronic equipment and storage medium
CN116527777B (en) * 2023-04-23 2024-04-23 国网湖南省电力有限公司 Measurement data compression acquisition method and system, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2010038011A3 (en) 2010-11-04
GB0817947D0 (en) 2008-11-05
GB2463920B (en) 2012-08-22
EP2353270A2 (en) 2011-08-10
GB2463920A (en) 2010-03-31
ZA201102901B (en) 2012-06-27

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2009736448

Country of ref document: EP