GB2515826A - Method and device for encoding headers of a message using reference header sets - Google Patents

Info

Publication number
GB2515826A
Authority
GB
United Kingdom
Prior art keywords
headers
reference set
header
encoding
encode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1312112.4A
Other versions
GB201312112D0 (en
Inventor
Romain Bellessort
Youenn Fablet
Hervé Ruellan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to GB1312112.4A priority Critical patent/GB2515826A/en
Priority to GB1312498.7A priority patent/GB2515839A/en
Publication of GB201312112D0 publication Critical patent/GB201312112D0/en
Publication of GB2515826A publication Critical patent/GB2515826A/en
Withdrawn legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3088 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing the use of a dictionary, e.g. LZ78
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • H03M7/42 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60 General implementation details not specific to a particular type of compression
    • H03M7/6058 Saving memory space in the encoder or decoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The definition of a reference set of headers to be used for encoding a set of headers is proposed, enabling the encoder to dynamically define the reference set to be used. This reference set is defined (320) as a region in the header table, and that region may be defined dynamically. Accordingly, flexibility is offered in defining the reference set for encoding a set of headers, by a method which improves the compression while minimizing the overhead, especially regarding memory size. Additionally, once a reference set has been defined by identifying a region of the header table to use (340), the differences between a set of headers to encode or decode and those in the reference set (330) may also be indicated (350). The arrangement may be used for the compression and decompression of headers in the SPDY or HTTP 2.0 protocols.

Description

METHOD AND DEVICE FOR ENCODING HEADERS OF A MESSAGE USING REFERENCE HEADER SETS
The invention belongs to the field of network communication, and in particular to the field of data compression used when sending messages over a communications network. The present invention concerns a method and a device for encoding headers of a message using reference header sets. More particularly, it concerns the definition of a reference set of headers to be used for encoding a set of headers.
In messages to be sent there are often lists or groups of items of information that are compressed at the encoder and decompressed at the decoder. This is for example the case for HTTP (standing for Hypertext Transfer Protocol). HTTP is commonly used to request and send web pages, and is based on a client/server architecture, wherein the client sends requests, namely HTTP requests, to the server, and the server replies to the client's requests with responses, namely HTTP responses.
The HTTP requests and HTTP responses are messages that comprise various parts of data, including headers and payload data. The HTTP headers of an HTTP message, and more generally the headers of a message, form a set of headers. An HTTP header generally consists of a name along with a corresponding value. For instance, in "Host: en.wikipedia.org", Host is the header name, and its value is "en.wikipedia.org". This header is used to indicate the host of the requested resource (for instance, the Wikipedia page describing HTTP, available at http://en.wikipedia.org/wiki/HTTP). HTTP headers are well-known to one skilled in the art, and therefore are not further detailed here. In the following the word header refers to both the header name and the header value.
Conventional HTTP provides compression of the HTTP payload data before the HTTP message is transmitted, while the set of HTTP headers is not compressed, i.e. it is encoded as text data. Usually, the headers tend to be redundant in successive messages. This is the case for HTTP. Textual encoding is not efficient in this situation, so several HTTP improvements have emerged with a view to defining more compact encodings.
The SPDY protocol, used as a working base for the drafting of HTTP 2.0, has been developed in this context and combines two main ideas. First, it enables several HTTP requests and responses to be sent over a single TCP/IP connection, thus defining a set of HTTP transmissions therein. In this way, all the components of a web page (HTML documents, images, JavaScript, etc.) may share the same TCP/IP connection, thus speeding up the web page loading. Second, SPDY implements compression of the HTTP headers exchanged over the shared TCP/IP connection, using the Deflate algorithm (also known through the "zip" format). This binary compression reduces the network load.
The compression of HTTP headers by SPDY is efficient but also has drawbacks. In particular, it requires too much processing for constrained devices with limited resources, for example on the decoder side.
Furthermore, network devices that want to access a few specific headers need to uncompress all the headers and recompress all of them if they make any change to one of them.
In HTTP, headers are encoded as text data. As headers tend to be redundant, textual encoding is not efficient. Therefore, proposals have been made in the context of HTTP/2.0 standardization in order to define more compact encodings.
Currently, the draft encoding for HTTP headers relies on the following concepts. The header table is a table comprising headers already encountered and stored as an indexed set of (name, value) pairs.
The literal representation of a header consists in encoding a header by encoding its name and its value as strings. Optionally, the header name might be encoded by using the index of a header name already present in the header table. A header is said to be indexed if it is added in the header table. A header encoded using literal representation can be indexed, i.e. can be added to the header table by setting appropriate flags. Two different kinds of indexing are available: incremental indexing where the header is appended to the header table and substitution indexing where the header replaces a header already present at a given index in the header table. This given index has to be encoded to fully define a substitution indexing.
The indexed representation consists in encoding a header by encoding the index of said header in the header table. This representation can be selected only for headers which have already been indexed.
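By way of illustration only, the concepts above (header table, literal representation, indexed representation, incremental and substitution indexing) may be sketched as follows. The class name `HeaderTable`, the function `encode_header` and the tuple-based output are assumptions made for this sketch and are not taken from the draft encoding itself.

```python
from typing import List, Optional, Tuple

Header = Tuple[str, str]  # (name, value)


class HeaderTable:
    """Hypothetical indexed list of (name, value) pairs."""

    def __init__(self) -> None:
        self.entries: List[Header] = []

    def find(self, header: Header) -> Optional[int]:
        """Return the index of an already indexed header, or None."""
        try:
            return self.entries.index(header)
        except ValueError:
            return None

    def add_incremental(self, header: Header) -> int:
        """Incremental indexing: append the header, return its new index."""
        self.entries.append(header)
        return len(self.entries) - 1

    def add_substitution(self, index: int, header: Header) -> None:
        """Substitution indexing: replace the header stored at `index`."""
        self.entries[index] = header


def encode_header(table: HeaderTable, header: Header) -> tuple:
    """Use the indexed representation when possible, else a literal one."""
    index = table.find(header)
    if index is not None:
        return ("indexed", index)
    # Literal representation with incremental indexing (symbolic output,
    # not an actual wire format).
    table.add_incremental(header)
    return ("literal_incremental", header[0], header[1])
```

In this sketch, a header encoded a second time is represented by its index alone, which is where the saving comes from.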
Successive HTTP messages exchanged between two nodes of a network are likely to share most of their headers. A set of headers corresponds typically to the whole set of headers of a given message but may also correspond to a subset of this whole set of headers. The set of headers is then encoded by taking the set of headers corresponding to the previous message as a reference. In other words, when encoding the set of headers of the message N+1, it is assumed that there are few differences with the set of headers of the message N. Hence, the set of headers of the message N is taken as the starting point to determine the set of headers of the message N+1.
Consequently, instead of encoding the set of headers of the message N+1 by encoding its headers, the set of headers of the message N+1 is encoded by encoding its differences with the set of headers of the message N. This is illustrated with regard to Figure 1, which is detailed below.
More generally, other representations could be added to this encoding scheme. For instance, some values may be encoded as typed values (e.g. using binary encoding for integers and dates). As another example, a delta representation may be defined by enabling a header to be encoded by reference to the header table even if its header value is not exactly the same as the one present in the table. In this case, the difference between the header value to encode and the header value in the table has to be encoded. Taking the example of URLs, a header ("url", "http://example.com/456") could thus be encoded as a reference to a header ("url", "http://example.com/123") already present in the header table. The following information would be encoded: the index of said header in the table; a length of 19, corresponding to the common prefix between the header values "http://example.com/456" and "http://example.com/123"; and the suffix to be added to said common prefix, "456".
As can be understood from this example, such a representation reduces the size of the encoded data by making efficient use of the header table.
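The delta representation of the URL example can be sketched as below. The function names and the table index 3 used in the usage note are hypothetical; only the prefix length of 19 and the suffix "456" come from the example above.

```python
def delta_encode(index, table_value, new_value):
    """Return (index, common prefix length, suffix) for a delta representation."""
    common = 0
    for a, b in zip(table_value, new_value):
        if a != b:
            break
        common += 1
    return index, common, new_value[common:]


def delta_decode(table_value, prefix_length, suffix):
    """Rebuild the header value from the table value and the encoded delta."""
    return table_value[:prefix_length] + suffix
```

With the table value "http://example.com/123" assumed to sit at index 3, `delta_encode(3, "http://example.com/123", "http://example.com/456")` yields `(3, 19, "456")`, and `delta_decode` restores the original value.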
Other processing may be applied to improve the encoding of headers, for instance Huffman encoding or Deflate.
The usage of the previous set of headers as a reference has been proposed for HTTP/2.0 headers encoding. It is worth noting that the wording "previous set of headers" is used as a shortcut that strictly means the "set of headers of the previous message". In this proposal, the use of a single reference set is generally advised (i.e. always the previous set of headers), but means are provided for defining up to 256 different reference sets. To do so, when encoding a new set of headers, one byte is used to indicate the reference set to be considered.
Initially (i.e. the first time its index is encoded in a set of headers), a reference set is empty. But the second time it is used, the previous set of headers for this index is used as a reference.
This solution provides flexibility, but it suffers from significant constraints: for each new reference set, a copy of the previous set has to be preserved.
From an implementation point of view, this copy has a cost both in terms of memory and processing. First, the set needs to be stored. Second, each time operations are made on the header table, those operations have to be applied to reference sets. In particular, if a header is removed from the header table, all sets have to be examined to determine whether said header is included in said sets, and if so, said header has to be removed from them.
As it can be understood, handling such sets has another drawback: reference sets are dependent on the header table. If a header present in the table is seldom used for a given reference set, but often used for another one, should it be removed from the table when room is needed? Each time an encoder has to decide how to manage the header table, it has to take all reference sets into account to avoid removing useful headers from the table.
Therefore, the management of the various reference sets is quite complex, and it is recommended to use only a single reference set, which can be seen as a consequence of this added complexity.
The present invention has been devised to address one or more of the foregoing concerns. It is proposed to enable the encoder to dynamically define the reference set of headers to be used for encoding a given set of headers.
This reference set is defined as a region in the header table that may be defined dynamically.
According to a first aspect of the invention there is provided a method of encoding a set of headers to encode using an indexed header table associating headers with respective coding indexes, the method comprising identifying in the header table a region comprising headers being part of the set of headers to encode; determining a reference set of headers in the header table based on said region; determining headers representing differences between the set of headers to encode and the reference set; and encoding the set of headers to encode by encoding both a definition of said reference set and said determined headers representing differences.
Accordingly, flexibility is offered in defining the reference set for encoding a set of headers by a method which improves the compaction while minimizing the overhead, especially regarding memory size.
In an embodiment, the method comprises determining the reference set of headers based on how headers to encode are distributed in the identified region.
Accordingly, the relevance of the determined reference set is improved.
In an embodiment, encoding the definition of said reference set comprises encoding a start index representing the lowest index among the indexes of the headers comprised in the reference set; and encoding a length representing the difference between the highest index and the lowest index among the indexes of the headers comprised in the reference set.
In an embodiment, encoding the definition of said reference set comprises encoding deletion information indicating that some headers belonging to the reference set should be deleted.
Accordingly, the number of differences to encode is decreased.
In an embodiment, the method comprises storing in an indexed table at least the start indexes in the header table of a given number of previously defined reference sets; and encoding the reference set using an index in said indexed table comprising at least the start indexes.
Accordingly, the memory size needed to encode the reference set is decreased.
In an embodiment, determining the reference set comprises initializing a reference set with the void set; determining a set of at least one header in the header table which reduces the size of encoded data when added to the reference set; and adding the determined set in the reference set.
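A minimal sketch of such an incremental construction is given below, assuming a deliberately simple cost model in which every deletion or addition costs one unit. The cost model and the function names are assumptions for illustration. The resulting set is not necessarily contiguous in the header table, so an actual encoder would still have to express it as a region, possibly together with exclusions.

```python
def diff_cost(reference, target):
    """Hypothetical cost: one unit per header to delete or to add."""
    return len(set(reference) ^ set(target))


def greedy_reference_set(table, target):
    """Start from the void set and keep each header that lowers the cost."""
    reference = []
    best = diff_cost(reference, target)
    for header in table:
        candidate = reference + [header]
        cost = diff_cost(candidate, target)
        if cost < best:
            reference, best = candidate, cost
    return reference
```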
In an embodiment, determining the reference set comprises determining all possible reference sets based on the identified region in the header table; and selecting the reference set which minimizes the size of the encoded data.
Accordingly, the reference set leading to the best compaction is used.
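As an illustrative sketch, the exhaustive selection may be restricted to contiguous regions of the header table, each scored by its number of differences with the set of headers to encode. Both the restriction to contiguous regions and the unit cost per difference are assumptions of this sketch; the table contents in the example are placeholders.

```python
def best_region(table, target):
    """Try every contiguous region; return (cost, start, length) of the best."""
    target_set = set(target)
    best = (len(target_set), 0, 0)  # empty region: every header is an addition
    for start in range(len(table)):
        for length in range(1, len(table) - start + 1):
            region = set(table[start:start + length])
            cost = len(region ^ target_set)  # deletions plus additions
            if cost < best[0]:
                best = (cost, start, length)
    return best


# Placeholder table: five unrelated entries, then three useful ones.
example_table = [("x%d" % i, "v") for i in range(5)] + [
    ("method", "GET"), ("cookie", "abcdefgh"), ("accept", "*/*")]
example_target = [("url", "/index.html"), ("method", "GET"),
                  ("cookie", "abcdefgh"), ("accept", "*/*"),
                  ("cache-control", "no-cache")]
```

Here `best_region(example_table, example_target)` selects the region starting at index 5 with length 3, leaving two additions to encode.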
In an embodiment, the method comprises comparing the size of the encoded data obtained for the determined reference set and for a default set comprising a set of headers previously encoded; and selecting the set which minimizes the size of the encoded data among the determined reference set and the default set.
Accordingly, the default reference set is used if leading to better compaction.
In an embodiment, the method comprises encoding a piece of information indicating that some of the already indexed headers from the reference set or from the differences between the reference set and the set of headers to encode should be added anew in the header table.
Accordingly, a reference set based on these newly added headers is likely to be relevant for encoding.
In an embodiment, the method comprises ordering headers from the set of headers to encode, some of these ordered headers being intended to be added to the header table.
Accordingly, the relevance of the reference set is improved.
In an embodiment, the method comprises determining a type of said set of headers to encode; and determining the reference set based also on this type.
Accordingly, a relevant reference set may be defined for each type of set of headers to encode.
In an embodiment, ordering of the headers is based on the determined type.
In an embodiment, the method comprises ordering of the headers to be added in the header table depending on their variability.
Accordingly, relevant reference sets for different messages may be defined based on the same region in the header table.
In an embodiment, the method comprises ordering of the headers from a first set of headers to encode by decreasing variability; and ordering of the headers from a second set of headers to encode by growing variability.
Accordingly, relevant reference sets may be defined for both sets of headers to encode.
In an embodiment, two different types of sets of headers to encode are defined, and the method comprises adding in the header table headers common to both types of sets of headers to encode between headers specific to the first type and headers specific to the second type.
Accordingly, different reference sets sharing the region comprising the common headers may be defined for both types of sets of headers to encode.
In an embodiment, the method comprises selecting the index in the header table where incrementally indexed headers are added.
Accordingly, free space may be reserved in the header table for subsequent addition.
In an embodiment, the method comprises determining the index in the header table where incrementally indexed headers are added based on the definition of the reference set.
According to another aspect of the invention there is provided a method of decoding a set of headers using an indexed header table associating headers with respective coding indexes, comprising: decoding a definition of a reference set of headers; creating the reference set corresponding to this definition; decoding differences; and obtaining the decoded set of headers by applying differences to the reference set.
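The decoding steps may be sketched as follows, assuming that the reference set definition is a (start, length) pair and that differences arrive as already parsed ("indexed", index) or ("literal", header) items. This representation is an assumption of the sketch, not a wire format. An indexed difference is treated as a deletion when the header is in the reference set and as an addition otherwise.

```python
def decode_header_set(table, start, length, differences):
    """Create the reference set from a table region, then apply differences."""
    result = list(table[start:start + length])  # the reference set
    for kind, payload in differences:
        if kind == "indexed":
            header = table[payload]
            if header in result:
                result.remove(header)  # already present: this is a deletion
            else:
                result.append(header)  # absent: this is an addition
        else:  # "literal": payload is a (name, value) pair to add
            result.append(payload)
    return result
```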
According to another aspect of the invention there is provided a device for encoding a set of headers to encode comprising an indexed header table associating headers with respective coding indexes; an identifying module for identifying in the header table a region comprising headers being part of the set of headers to encode; a reference set determining module for determining a reference set of headers in the header table based on said region; a difference determination module for determining headers representing differences between the set of headers to encode and the reference set; and an encoding module for encoding the set of headers to encode by encoding both a definition of said reference set and said determined headers representing differences.
According to another aspect of the invention there is provided a device for decoding a set of headers comprising an indexed header table associating headers with respective coding indexes, comprising a definition decoding module for decoding a definition of a reference set of headers; a reference set creating module for creating the reference set corresponding to this definition; a difference decoding module for decoding differences; and an obtaining module for obtaining the decoded set of headers by applying differences to the reference set.
According to another aspect of the invention there is provided a computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to the invention when loaded into and executed by the programmable apparatus.
According to another aspect of the invention there is provided a computer-readable storage medium storing instructions of a computer program for implementing a method according to the invention.
At least parts of the methods according to the invention may be computer implemented. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit", "module" or "system". Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer usable program code embodied in the medium.
Since the present invention can be implemented in software, the present invention can be embodied as computer readable code for provision to a programmable apparatus on any suitable carrier medium. A tangible carrier medium may comprise a storage medium such as a floppy disk, a CD-ROM, a hard disk drive, a magnetic tape device or a solid state memory device and the like. A transient carrier medium may include a signal such as an electrical signal, an electronic signal, an optical signal, an acoustic signal, a magnetic signal or an electromagnetic signal, e.g. a microwave or RF signal.
Embodiments of the invention will now be described, by way of example only, and with reference to the following drawings in which:
Figure 1 illustrates an example of encoding of a set of headers according to prior art;
Figure 2 illustrates the encoding of a set of headers according to an embodiment of the invention;
Figure 3 illustrates a flow chart of the main steps for encoding a set of headers according to an embodiment of the invention;
Figure 4 illustrates a flow chart of the main steps for determining a reference set according to an embodiment of the invention;
Figure 5a illustrates the encoding of the reference set according to an embodiment of the invention;
Figure 5b illustrates some optional steps that may occur for encoding a reference set according to an embodiment of the invention;
Figure 6 illustrates the filtering of a reference set according to an embodiment of the invention;
Figure 7 describes a pre-processing that is applied to the differences in some embodiments of the invention;
Figure 8 describes the decoding of a set of headers according to an embodiment of the invention;
Figure 9 is a schematic block diagram of a computing device for implementation of one or more embodiments of the invention.
Figure 1 illustrates an example of encoding a header set 1, referenced 110, after a header set 0, referenced 100, has already been encoded. Encoding of the header set 1 is made using the header set 0 as a reference set. Please note that this example is purposely very simple. In particular, it is not realistic to consider an HTTP request comprising only the three headers present in set 0 and set 1. However, this is not an issue as the mechanism illustrated in this example would work similarly with more realistic sets of headers.
It is considered in this example that the three headers of set 0 have been indexed in the header table 120. Hence, after encoding of set 0, the header table 120 comprises those three headers, with indexes ranging from 0 to 2.
When encoding set 1, set 0 is taken as the reference set of headers.
Then, differences between set 0 and set 1 are determined: "url" header from set 1 is different from "url" header from set 0. Therefore, two differences must be encoded, as illustrated in table 130. The "url" header from set 0, which is present in the reference set for set 1, has to be removed. This is done by encoding its index as illustrated by the second line of table 130. There is no need to indicate whether this index corresponds to an addition or a deletion.
Indeed, given that the header associated to said index is already present in the reference set, its presence in the list of differences necessarily corresponds to a deletion. A pair (name, value) cannot be present more than once in a set of headers.
The "url" header from set 1 is not present in the reference set: therefore, it has to be added. This is done by encoding said header as illustrated on third line of table 130, for instance using a literal representation. The presence of this header necessarily corresponds to an addition, given that said header is not present in the reference set.
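The determination of the two differences of this example can be sketched as below. The header names other than "url" and all header values are placeholders, since the content of the figure is not reproduced in the text. Deletions are emitted as indexes and additions as literals, in the spirit of table 130.

```python
def encode_differences(table, reference, current):
    """List deletions (by table index) then additions (as literals)."""
    ops = []
    for header in reference:
        if header not in current:
            ops.append(("indexed", table.index(header)))  # to be removed
    for header in current:
        if header not in reference:
            ops.append(("literal", header[0], header[1]))  # to be added
    return ops


# Placeholder contents for set 0 and set 1; only "url" differs.
set0 = [("method", "GET"), ("url", "/a"), ("accept", "text/html")]
set1 = [("method", "GET"), ("url", "/b"), ("accept", "text/html")]
header_table = list(set0)  # the three headers of set 0, at indexes 0 to 2
```

With these placeholder values, `encode_differences(header_table, set0, set1)` returns one index (the old "url" header) followed by one literal (the new "url" header).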
Finally, even though not illustrated in this example, not all the headers of set N are necessarily used in the reference set for set N+1. In some embodiments, only the indexed headers of set N are present in the reference set for set N+1. One of the reasons for doing so is that a header which is not indexed is presumably unlikely to occur again in the next set of headers. If it were, it would have been indexed.
While using the previous set of headers as a reference set is often efficient, it is not always the case. The question is how to bring more flexibility to the use of a reference set for the encoding of a given set of headers.
It is proposed to enable the encoder to dynamically define the reference set based on the header table. In contrast with prior art, which considers distinct sets that have to be managed, this solution minimizes the cost of using different reference sets by avoiding the creation of new constructs. The encoder defines the reference set by specifying a given region of the header table.
This region is basically encoded using a start index in the header table corresponding to the index of the first header belonging to the region and a length corresponding to the number of headers from the header table belonging to the region. In some embodiments, additional information may be encoded to provide more flexibility, for instance indications that some headers belonging to said region do not belong to the reference set.
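A short sketch of this region definition follows, with the length counted as a number of headers as described in the paragraph above. The `excluded` parameter, standing for the optional indication that some headers of the region do not belong to the reference set, and both function names are hypothetical.

```python
def region_definition(indexes):
    """Return (start, length) of the smallest region covering `indexes`."""
    start = min(indexes)
    length = max(indexes) - start + 1  # number of table entries in the region
    return start, length


def reference_set_from_region(table, start, length, excluded=()):
    """Headers at indexes start .. start+length-1, minus excluded indexes."""
    return [table[i] for i in range(start, start + length)
            if i not in excluded]
```

For instance, `region_definition([5, 6, 7])` gives a start index of 5 and a length of 3.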
This solution can be applied by any encoder, but it should also be noted that in some embodiments the encoder adapts its encoding strategy to take advantage of the new means provided. For instance, an encoder can distinguish different types of header sets to encode. For example, requests corresponding to HTML, images, JavaScript or CSS for a client, or responses corresponding to status 200, 304 and 404 for a server, may each define a type of message and therefore a type of header set to encode. Then, upon the first occurrence of a given type of header set, the encoder will adapt its encoding so that those headers, or some of those headers, occur adjacently in the header table. Indeed, by doing so, when future header sets belonging to the same type occur, the encoder will be able to use the corresponding region of the table to define a relevant reference set. The number of differences between a header set to encode and the reference set will therefore be minimized, which will result in an improved compaction. Grouping headers together in the header table according to the type of message they belong to improves the chance of being able to define an accurate reference set for further message headers to encode.
This solution requires the encoder to keep track of the regions of the indexing table corresponding to expected header sets. On the other side, the additional cost for the decoder is very limited: the decoder only needs to follow the encoder's instructions to change the reference set.
Figure 2 shows tables illustrating the encoding of a header set SE to encode, referenced 210, according to an embodiment of the invention. The encoding is done based on the previous set of headers 200 and the header table 220.
First, it can be noted that the second, third and fourth headers of SE are also present in the header table 220. Second, it can also be noted that those headers occur in the header table as adjacent headers, starting from index 5 and ending at index 7. In this example, the whole header table is not described, but it is assumed that no other region of said table comprises at least three adjacent headers belonging to SE. If the headers ranging from index 5 to index 7 in the indexing table are used as the reference set, two additions will have to be encoded (and no deletion).
The default reference set consists in the headers from previous set of headers 200. It is here assumed that all those headers are indexed in the header table. By comparing SE with the previous set of headers 200, it can be seen that there is only one header in common, namely the "cookie" header.
Therefore, if the default reference set were to be used, three deletions and four additions would have to be encoded.
Consequently, even though the definition of the reference set might have a small cost, typically 1 or 2 bytes, relying on a new definition of the reference set is more efficient than relying on the default reference set. Indeed, relying on the default reference set would imply having to encode two deletions and two additions more than relying on the new reference set. Those additional deletions and additions would typically require 4 to 8 bytes, which is more than the 1 or 2 bytes needed to define the new reference set.
Table 230 illustrates a possible encoding for the set of headers to encode SE. First, the default reference set flag is set to false. This indicates that a new reference set definition is provided. Second, the definition of said new reference set is provided by indicating a start index (5) and a length (3) which means that the new reference set comprises headers at index 5, 6 and 7 in the header table. Third, the headers present in SE but not in the reference set are encoded; two such headers are present in SE, the "url" one and the "cache-control" one.
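The choice between the default reference set and a region of the header table, and the resulting record in the spirit of table 230, can be sketched as follows. The dictionary layout and the one-unit `definition_cost` are assumptions made for illustration, and costs are counted in differences rather than in bytes.

```python
def differences(reference, target):
    """Return (deletions, additions) needed to turn reference into target."""
    deletions = [h for h in reference if h not in target]
    additions = [h for h in target if h not in reference]
    return deletions, additions


def encode_header_set(table, previous, target, start, length,
                      definition_cost=1):
    """Pick the cheaper reference set and build a table-230-like record."""
    region = table[start:start + length]
    d_del, d_add = differences(previous, target)
    r_del, r_add = differences(region, target)
    region_cost = len(r_del) + len(r_add) + definition_cost
    if region_cost < len(d_del) + len(d_add):
        return {"default_reference_set": False,
                "start": start, "length": length,
                "deletions": [table.index(h) for h in r_del],
                "additions": r_add}
    # Headers of the previous set are assumed to be indexed in the table.
    return {"default_reference_set": True,
            "deletions": [table.index(h) for h in d_del],
            "additions": d_add}
```

When the region at indexes 5 to 7 covers three of the five headers to encode and the previous set shares only one header with them, the record carries a false default flag, the start index 5, the length 3 and two additions.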
It is worth noting that, in this example, the encoder is assumed to have a specific encoding strategy that consists in distinguishing two types of headers sets: the ones comprising "method: GET", and the ones comprising "method: POST". Indeed, those two kinds of header sets tend to have different headers.
Therefore, an encoder can advantageously try to define two regions in the header table, one corresponding to a reference set used for GET header sets, and one corresponding to a reference set used for POST header sets.
To do so, the first time a header set of a given type is met, a specific set of headers is determined, typically the headers which are expected to remain identical for all header sets of said type. For instance, in the example, "method: GET", "cookie: abcdefgh" and "accept: I" are supposed to be present for all GET header sets. On the other hand, "method: POST", "cookie: abcdefgh" and "content-length: 64" are supposed to be present for all POST header sets.
Therefore, when the first GET header set is encoded, those headers are encoded in a way that enables adding them to the header table as adjacent headers. Furthermore, given that the encoder knows that the "cookie" header is used by both GET and POST header sets, it can order the headers so that "cookie" becomes the last header added to the header table. By doing so, when the first POST header set is encoded, headers "method: POST" and "content-length: 64" can be added right after "cookie: abcdefgh". Consequently, by virtue of this encoding strategy, two regions are defined in the header table, one for each of the two types of header sets. In addition, by virtue of the reordering of headers at the time of encoding, there is only a single occurrence in the header table of the header which is common to both types, for example the header "cookie: abcdefgh".
Later on, if GET and POST requests are alternately encoded, the appropriate reference set can be used, resulting in the saving of a few bytes.
On the other hand, without the invention, the default reference set would generally be inefficient in the case of alternating GET and POST requests. In the long term, a significant gain is therefore obtained by the invention, especially with an appropriate encoding strategy.
This ability to dynamically define the reference set of headers based on the header table, for each set of headers to encode, allows improving the compression rate. This is obtained without the burden of managing a list of possible reference sets. In some embodiments, by taking advantage of a priori knowledge of the different types of messages and their associated headers, the encoder may adopt a judicious policy for ordering the header table, so as to obtain a better fit between regions of the header table and expected sets of headers to encode. This also leads to a simple header table management.
When evaluating the removal of a header from the header table, there is no need to check whether said header belongs to different reference sets. This holds especially in embodiments that allow a same header to be stored multiple times in the table to fit different reference sets. This feature, while implying the duplication of some headers in the header table and therefore being non-optimal in terms of memory, is advantageous in terms of simplicity in the management of the header table.
Figure 3 describes the encoding of a set of headers according to an embodiment of the invention.
First, at step 300, said set of headers to encode is obtained. This is typically the set of headers of the next message to send. At step 310, the set of headers common to the set to encode and to the header table, is determined.
This determination is typically achieved by taking each header from the set of headers to encode and by checking whether it is already present in the header table or not. This operation does not add a cost to the global encoding as this process has to be made by an encoder in order to check whether a header to encode can be encoded using an index or not. Indeed, if a header is already indexed, encoding it using its index is generally the most efficient solution.
At step 320, a reference set is determined. Said reference set corresponds to a set of headers from the header table which is determined based on the headers common to the set of headers to encode and the header table. This step is further described with regard to Figure 4.
At step 330, the differences between the set of headers to encode and the reference set are determined. The headers present in the set of headers to encode but not in the reference set are determined. These headers need to be added to the reference set to obtain the set of headers to encode. The headers present in the reference set but not in the set of headers to encode must also be determined. These headers need to be removed from the reference set to obtain the set of headers to encode.
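As an illustration of step 330, the two kinds of differences can be computed with a pair of membership tests. The following is only a minimal sketch: the function name `header_differences`, the representation of headers as (name, value) tuples and the sample values are assumptions made for the example, not part of the described method.

```python
def header_differences(to_encode, reference_set):
    """Return (additions, deletions) relative to the reference set.

    Headers are modelled as (name, value) tuples (an assumption of this sketch).
    """
    # Headers to encode that the reference set lacks: they must be added.
    additions = [h for h in to_encode if h not in reference_set]
    # Headers of the reference set absent from the set to encode: they must be removed.
    deletions = [h for h in reference_set if h not in to_encode]
    return additions, deletions


# Hypothetical values loosely following the GET example of the description.
reference_set = [("method", "GET"), ("cookie", "abcdefgh"), ("accept", "text/html")]
to_encode = [
    ("url", "/index.html"),
    ("method", "GET"),
    ("cookie", "abcdefgh"),
    ("accept", "text/html"),
    ("cache-control", "no-cache"),
]
additions, deletions = header_differences(to_encode, reference_set)
```

With these sample values, two additions and no deletion are found, which mirrors the situation described for the reference set spanning indexes 5 to 7.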
At step 340, the definition of the reference set is encoded. This step is further described with regard to Figures 5a and 5b. Even though a single reference set is generally considered in this document, it may be noted that a plurality of reference sets may be determined and used. In this case, each reference set of the plurality would typically contain only headers occurring adjacently in the header table.
At step 350, the differences are encoded. This step consists in the encoding of the header set by using the reference set. Generally speaking, both additions and deletions are encoded. There is no need to indicate, for each encoded header belonging to the differences, whether it is an addition or a deletion. Indeed, if the encoded header is present in the reference set, it means that it should be removed from this reference set to obtain the set of headers to encode: it is a deletion. On the other hand, if it is not present in the reference set, it means that it should be added to the reference set to obtain the set of headers to encode: it is an addition. Finally, at step 390, the process ends.
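The reason no addition/deletion marker is needed can be seen from the decoder side: each received header simply toggles its membership in the reference set. The sketch below is a hypothetical illustration; `apply_differences` and the tuple representation of headers are assumed names, not taken from the description.

```python
def apply_differences(reference_set, encoded_differences):
    """Rebuild the header set by toggling each encoded header in the reference set."""
    result = list(reference_set)
    for header in encoded_differences:
        if header in result:
            # Present in the reference set: the header is a deletion.
            result.remove(header)
        else:
            # Absent from the reference set: the header is an addition.
            result.append(header)
    return result


# Hypothetical example: one deletion ("accept") and one addition ("url").
reference = [("method", "GET"), ("cookie", "abcdefgh"), ("accept", "text/html")]
differences = [("accept", "text/html"), ("url", "/index.html")]
decoded = apply_differences(reference, differences)
```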
Optionally, step 300 may be followed by a step enabling to define a subset of the set of headers to encode. The definition of this subset is described with regard to Figure 6. If the subset has been defined, this subset replaces the set of headers to encode at step 310. The advantage of this optional step is to enable working on a specific subset of headers from the set of headers to encode. This may for instance be useful when the set of headers to encode is identified by the encoder as comprising headers that should be added adjacently to the header table. Indeed, in this case, the encoder may want to exclude those headers from the reference set.
Similarly, another optional step may occur after the determination of differences at step 330, in order to pre-process those differences. This optional step is described with regard to Figure 7. This step is also useful when an encoder identifies a set of headers to encode as comprising headers that should be added adjacently to the header table. As an example, the differences are not ordered, and said pre-processing step may consist in ordering those differences so that identified headers can be added adjacently to the header table.
Figure 4 describes the determination of a reference set based on the determined set of headers in common between the set of headers to encode and the header table.
At step 400, groups of headers with adjacent indexes are determined in the headers in common. This can for instance be done by first ordering said headers according to their index in the header table, then by removing all headers whose index in the header table is not adjacent to an index of another header.
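One possible way to carry out step 400 is sketched below. It assumes that the headers in common are represented only by their indexes in the header table; the function name `adjacent_groups` and the (start index, end index) representation of a group are choices made for this example.

```python
def adjacent_groups(common_indexes):
    """Return (start, end) pairs for runs of at least two adjacent indexes."""
    groups = []
    run = []
    for idx in sorted(set(common_indexes)):
        if run and idx == run[-1] + 1:
            run.append(idx)  # extends the current run of adjacent indexes
        else:
            if len(run) >= 2:
                groups.append((run[0], run[-1]))
            run = [idx]  # isolated indexes are dropped unless a neighbour follows
    if len(run) >= 2:
        groups.append((run[0], run[-1]))
    return groups


# Hypothetical indexes of the headers common to the header table and the set to encode.
groups = adjacent_groups([7, 2, 5, 6, 11, 12, 9])
```

Because the indexes are sorted first, the resulting groups are already ordered by growing start index, as expected at step 405.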
Once those groups have been obtained, they can be ordered by growing start index. The start index of a header group is the lowest index of the headers belonging to the group. This is done at step 405, which also consists in initializing variables SI_MAX, EI_MAX and N_MAX to 0; SI_MAX is used to store the start index of the best known reference set, EI_MAX its end index, and N_MAX the number of headers present in both this best known reference set and the set of headers to encode.
At step 410, it is checked whether there remains a group not yet processed among the list of groups ordered at step 405. Initially, groups are considered as not processed; it is only after being processed as described hereafter that a group is marked as processed.
If there remains a group not yet processed, variable SI is initialized with the start index of this group, EI with the end index of the group, and variable N is initialized with the number of headers in this group in a step 415. SI, EI and N are used to build a potential reference set consisting of the group and potentially augmented with following groups.
Consequently, at step 420, the next group NG in the list of groups is obtained. At step 425, the number of headers comprised in the header table between this next group and the considered potential reference set RS is compared to the number of headers in this next group. Indeed, if there are more headers in this next group than the start index of this next group minus the end index of the reference set, i.e. NG_length > NG_SI - EI, extending RS up to the end of NG will reduce the number of differences to be encoded. Therefore, in this case, EI is set to NG_EI (end index of NG) and N is increased by the length of NG at step 430, which is followed by step 420. As a remark, some of the headers comprised between EI and NG_SI in the header table may also be present in the set of headers in common. Indeed, only groups of headers with adjacent indexes in the set of headers in common have been determined at step 400. In a variant, the presence of such "isolated" headers from the set of headers in common may be checked, and the value of N updated accordingly.
If extending the reference set up to the end of the next group does not reduce the number of differences, it is checked at step 435 whether N is greater than N_MAX, or whether N is equal to N_MAX and (EI_MAX - SI_MAX) is greater than (EI - SI). If so, it means that the currently processed reference set either comprises more headers from the set of headers to encode than the best known reference set, or comprises as many headers from the set of headers to encode as the best known reference set but fewer headers not in the set of headers to encode. In this case, at step 440, the best known reference set is replaced by the current reference set by setting SI_MAX to SI, EI_MAX to EI and N_MAX to N. In the negative of step 435 as well as after step 440, the current group is removed from the groups to be processed in a step 450. After this step, control returns to step 410.
When all groups have been processed, the reference set is defined as the set starting at index SI_MAX in the header table and ending at EI_MAX in a step 460. This step is followed by step 470, where it is determined whether using the default reference set is more efficient than using the determined reference set. This is typically done by computing the cost of encoding the header set using the reference set, computing the cost of encoding the header set using the default reference set, and comparing these costs. If the cost using the default reference set is lower, the default reference set is chosen as the reference set at step 480. After this step, or after the negative of step 470, the process ends in a step 490.
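The loop of steps 405 to 460 may be sketched as follows. This is only an illustrative reading of Figure 4 under stated assumptions: groups are given as (start index, end index) pairs already ordered by growing start index, the function name `select_reference_set` is hypothetical, and the cost comparison with the default reference set (steps 470 and 480) is deliberately left out.

```python
def select_reference_set(groups):
    """Return (SI_MAX, EI_MAX, N_MAX); N_MAX == 0 denotes the empty reference set."""
    si_max = ei_max = n_max = 0                      # step 405
    for i, (si, ei) in enumerate(groups):            # steps 410 and 415
        n = ei - si + 1
        for ng_si, ng_ei in groups[i + 1:]:          # step 420: next group NG
            ng_length = ng_ei - ng_si + 1
            if ng_length > ng_si - ei:               # step 425, test as stated in the text
                ei = ng_ei                           # step 430: extend up to the end of NG
                n += ng_length
            else:
                break
        # Step 435: more common headers, or as many within a narrower region.
        if n > n_max or (n == n_max and (ei_max - si_max) > (ei - si)):
            si_max, ei_max, n_max = si, ei, n        # step 440
    return si_max, ei_max, n_max                     # step 460


# Hypothetical groups: (5, 7) and (9, 12) are close enough to be merged, (20, 21) is not.
best = select_reference_set([(5, 7), (9, 12), (20, 21)])
```

In this sample run the selected region spans indexes 5 to 12 and contains seven headers of the set to encode, the single header at index 8 being the only one that would have to be encoded as a deletion.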
Generally speaking, Figure 4 determines a reference set based on the determined set of headers in common between the header table and the set of headers to encode, by considering an empty set as the initial reference set and by adding headers from the table, especially headers from said set of headers in common, to said reference set as long as they enable reducing the size of the encoded set of headers. It is worth noting that in some cases, this method may result in the selection of the empty set as the reference set.
Various alternatives may be used to determine a reference set. A simple solution consists in selecting the largest group of adjacent headers in the set of headers common to the header table and to the set of headers to encode.
Another solution consists in determining all the possible reference sets as well as their encoding cost, and to select a reference set which minimizes said cost.
The provided algorithm attempts to maximize the number N of headers comprised in both the set of headers to encode and in the reference set.
However, another approach may consist in also considering the number of deletions to be encoded for each potential reference set. In the algorithm described in the foregoing, this parameter is considered only to distinguish two possible reference sets comprising the same number of headers N common to set of headers to encode and to the considered reference set.
It is worth noting that the reference set determined by this algorithm may be empty. Finally, said algorithm may comprise, for example as an additional condition to test 425, 435 and also possibly 470, a filtering step to check that a potential reference set meets various criteria. Some of those criteria may be related to an encoding strategy defined for the encoder, as illustrated in Figure 6.
Figure 5a describes the encoding of the reference set according to an embodiment of the invention.
First, at step 500, it is checked whether the reference set is the default reference set or not. If so, at step 510, the default reference set flag is set to true and encoded. The process then ends at step 590. It is worth noting that other specific flags may be defined. As a first example, another flag may be defined to indicate whether the reference set is the empty set or not. As a second example, another flag may be defined to indicate that the reference set to be used is the same as the one used for the encoding of previous header set.
If the reference set is not the default reference set, the default reference set flag is set to false and encoded. At step 530, it is then determined whether the start index of the reference set has already been used as the start index of a reference set or not. If not, the start index and the length of the reference set are encoded at step 550. The process then ends at step 590.
If the reference set start index has already been used as the start index for another reference set, a reference to this start index is encoded at step 540, as well as the length of the reference set. The process then ends at step 590.
The interest of distinguishing those two cases comes from the fact that when a reference to a previously used start index can be encoded, the definition of the reference set can likely be encoded on a single byte, whereas the encoding of a start index along with the encoding of a length is more likely to require two bytes.
Indeed, the number of start indexes is lower than the number of indexes in the header table: as a consequence, encoding a reference to a previously encoded start index is likely to require fewer bits. Typically, it may be decided by the encoder and the decoder to store the last eight start indexes used to encode reference sets. In this case, only three bits are necessary to encode a reference to a previously used start index. Furthermore, given that the length of the reference set is likely to be smaller than sixteen, this length can generally be encoded on four bits.
Therefore, a possible format for encoding a definition of a reference set may consist in first encoding a 0 bit, then encoding the length of the reference set, for example on 5 bits, and then encoding the start index value, for example on the next 10 bits. In this case, 16 bits (2 bytes) are used. As an optimization, if the length is null, no start index is encoded, resulting in a single byte definition.
Accordingly, the encoding of a definition comprising a reference to a previously used start index may consist in first encoding a 1 bit, then encoding the index of the start index among the 8 last start indexes encoded, which needs 3 bits, then encoding the length of the reference set on 4 bits. In this case, 8 bits (1 byte) are used.
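The two formats described above can be illustrated with simple bit packing. In the sketch below, the function names are hypothetical, and the choice of padding the zero-length definition with two zero bits to fill a byte is an assumption, since the description only states that a single byte results.

```python
def encode_new_start(start_index, length):
    """Format '0 | length (5 bits) | start index (10 bits)': 2 bytes, or 1 byte if length is 0."""
    if not (0 <= length < 32 and 0 <= start_index < 1024):
        raise ValueError("length must fit on 5 bits and start index on 10 bits")
    if length == 0:
        # Leading 0 bit and 5 zero length bits, padded to one byte (padding is an assumption).
        return bytes([0])
    value = (length << 10) | start_index  # the leading flag bit stays 0
    return value.to_bytes(2, "big")


def encode_known_start(slot, length):
    """Format '1 | slot among the last 8 start indexes (3 bits) | length (4 bits)': 1 byte."""
    if not (0 <= slot < 8 and 0 <= length < 16):
        raise ValueError("slot must fit on 3 bits and length on 4 bits")
    return bytes([0x80 | (slot << 4) | length])


# Reference set of the example: start index 5, length 3.
two_bytes = encode_new_start(5, 3)
# Same length, but the start index is referenced as hypothetical slot 2 of the stored start indexes.
one_byte = encode_known_start(2, 3)
```

A decoder can tell the two formats apart from the first bit, and in the first format can stop after one byte when the 5-bit length is zero.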
Alternatively, instead of checking only whether start index has already been used at step 530, it may be checked whether both start index and length have already been used, and if so, a reference to the appropriate start index and length is encoded. Similarly, the check of step 530 may also include checking deletions, as described with regard to step 560. In this case, a single reference may be encoded to indicate a start index, a length and some complementary information regarding deletions.
Figure 5b describes some optional steps that may occur after steps 510, 540 or 550. As a matter of fact, those optional steps enable providing additional information about the reference set and/or about the headers comprised in the differences.
The first optional step 560 enables encoding deletions with regard to the reference set. In other words, it enables indicating that some headers comprised in the region defining the reference set do not belong to the reference set. Indicating that a header comprised in the reference set is in fact not present in the set of headers to be encoded is already possible without this optional step. Indeed, by encoding said header in the differences, it can be determined that this header should be removed from the reference set to obtain the set of encoded headers. However, doing so requires encoding the index in the header table of each header to be removed, which requires 1 or 2 bytes per header. Therefore, step 560 proposes to encode together all the deletions to be applied to the reference set. For instance, for each header of the reference set, one bit is encoded to indicate whether the corresponding header should be preserved or removed. Given that the reference set is very likely to comprise fewer than 16 headers, this additional data would generally require 1 or 2 bytes.
Therefore, as soon as at least two headers have to be removed from the reference set, this method is more efficient than encoding the index in the header table of each header to be removed. It should be noted that other ways of encoding deletions may be used, for instance by encoding the positions in the reference set of those headers. As another example, if a range of deletions occurs, for example headers 4 to 6 of the reference set, such deletions may be encoded by encoding corresponding start index and length.
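The one-bit-per-header variant of step 560 can be sketched as follows. The bit convention (1 for a header to remove, 0 for a header to preserve), the most-significant-bit-first order and the zero padding of the last byte are assumptions of this example, as is the function name `deletion_bitmap`.

```python
def deletion_bitmap(reference_set, to_encode):
    """Pack one bit per reference-set header: 1 = remove, 0 = preserve."""
    kept = set(to_encode)
    bits = [0 if header in kept else 1 for header in reference_set]
    n_bytes = (len(bits) + 7) // 8
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    # Left-align the bits and pad the last byte with zeros (assumed convention).
    value <<= n_bytes * 8 - len(bits)
    return value.to_bytes(n_bytes, "big")


# Hypothetical reference set of five headers, two of which are not in the set to encode.
reference_set = ["h1", "h2", "h3", "h4", "h5"]
to_encode = ["h1", "h3", "h5", "h9"]
bitmap = deletion_bitmap(reference_set, to_encode)
```

Here two deletions cost a single byte, whereas encoding each deleted header by its index in the header table would cost 1 or 2 bytes per header.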
The second optional step 570 consists in encoding a piece of information indicating that headers from the reference set should be re-indexed. In this context, re-indexation means indexed anew, namely added anew in the header table. This step may be useful in the case where an encoder wants to add some headers which are already present in the header table to the header table, for example because it plans reusing them for future reference sets. To do so, an encoder can select a specific representation, such as the literal representation with options indicating that said header should be added to the header table.
However, such means are not optimal from the compaction point of view if a header is already present in the header table. Therefore, if such a header is present in the reference set, step 570 enables indicating that said header should be re-indexed, i.e. added again in the table, leading to having two similar headers in the table. A possible way to encode this information consists in encoding, for each header of the reference set, a single bit indicating whether it should be re-indexed or not. Other ways of encoding such information could also be provided, for instance by encoding the positions in the reference set of those headers.
The third optional step 580 consists in encoding a piece of information indicating that headers from the differences should be re-indexed. Indeed, similarly to what has been described with regard to step 570, some headers present in the differences may already be indexed. If the encoder wants to add them to the table, it can select a representation enabling to add them to the header table, for example literal representation with options for indexation, but this may not be optimal from the compaction point of view as encoding the indexed representation of said headers would be more efficient. Yet, the indexed representation might not provide re-indexing option. Therefore, step 580 provides means for indicating which headers already indexed should be re-indexed, namely added another time in the header table. A possible way to encode this information consists in encoding, for each header of the differences, a single bit indicating whether it should be re-indexed or not. Other ways of encoding such information could also be provided, for instance by encoding the positions of those headers in the differences.
Finally, it should be noted that each of those steps is independent from the others. For instance, if the representations provided by the encoding format enable indicating in the indexed representation of a header that it should be re-indexed, an encoder may implement steps 560 and 570, but not 580.
Generally speaking, means for indicating whether such optional information is encoded or not should be available. For instance, 3 bits may be used to indicate whether each kind of optional information is available or not.
Figure 6 illustrates the filtering of a reference set according to an embodiment of the invention. Such filtering may in particular be applied when the encoder wants to ensure that some headers are added adjacently to the header table. Indeed, if those headers are already present in the header table but in different regions of this header table, defining a reference set comprising those headers may not be efficient. On the other hand, adding those headers adjacently in the header table enables defining an efficient reference set comprising those headers.
At step 600, the type of the set of headers to encode is determined. This determination is typically based on the definition of various types, each type corresponding to a list of header names associated with headers values. Those values may be regular expressions, for instance to enable list of values or wildcards. Such lists may be statically or dynamically defined. They may be known a priori by a developer, or be based on statistics on data previously encoded by the encoder or by similar encoders. In a first example, various lists may be defined by the encoder of a client, each list corresponding to a type of resource requested, for example HTML, images, JavaScript, CSS.... The same may be done by the encoder of a server. In a second example, a server may also define different lists for the different kinds of response statuses, for example status code 200, 304, 404.... In a third example, a proxy used by different clients may define different types corresponding to the different browsers used by clients. Indeed, different browsers tend to use headers in different ways. In particular, "user-agent" header and "accept" family of headers tend to be different between different web browsers. In a fourth example, a proxy being used by a client to access various services may define types based on the service being used, which can typically be done by relying on the "host" header. Finally, generally speaking, the definition of types may in particular comprise a default type, which is associated to an empty list of header names.
At step 610, the list of header names and values characterizing said type is obtained. Then, at step 620, the subset of the set of headers to encode comprising headers whose name belongs to said obtained list is determined. At step 630, it is checked whether this subset of headers is present in the reference set. If not, using the reference set does not prevent the encoder from adding the headers from the determined subset adjacently in the header table.
Therefore, at step 650, the reference set (RS) is allowed, and the process ends at step 690.
If the subset of headers is present in the reference set at step 630, it is checked at step 640 whether these headers of the subset are adjacent in the header table. More generally, the subset headers may not have to be adjacent.
Their presence in the same region of the header table may be sufficient, for example if they occur in two groups separated by one or two headers, especially if those headers also belong to the set of headers to encode. If so, the reference set is allowed at step 650 and the process ends at step 690.
In contrast, in the negative of step 640, using the reference set would prevent the encoder from defining efficiently a region comprising the subset of headers. Consequently, the reference set is rejected at step 660, and the process ends at step 690.
The criteria applied at step 630 and 620 are possible criteria for doing the filtering. More generally, the issue consists in determining whether a reference set is "appropriate enough" for representing a given type of header set. A proximity measurement may for instance be defined to evaluate this criterion, for example the number of headers present in the subset of headers but not in the reference set, and a threshold can be defined to determine whether a reference set is close enough for a given type, for example the number of headers present in the subset but not in the reference set should be less than or equal to 1.
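The proximity measurement mentioned above may be sketched as a simple count compared against a threshold. The function name `is_close_enough`, the tuple representation of headers and the sample type definition are hypothetical; only the measure itself (headers of the subset missing from the reference set) and the example threshold of 1 come from the description.

```python
def is_close_enough(type_header_names, to_encode, reference_set, threshold=1):
    """Check whether the reference set is close enough to the type's subset of headers."""
    # Subset of the headers to encode whose name belongs to the type definition (step 620).
    subset = [h for h in to_encode if h[0] in type_header_names]
    # Proximity: number of subset headers that the reference set does not contain.
    missing = [h for h in subset if h not in reference_set]
    return len(missing) <= threshold


# Hypothetical "GET" type characterised by three header names.
get_type = {"method", "cookie", "accept"}
to_encode = [
    ("url", "/a"),
    ("method", "GET"),
    ("cookie", "abcdefgh"),
    ("accept", "text/html"),
]
reference_set = [("method", "GET"), ("cookie", "abcdefgh")]
allowed = is_close_enough(get_type, to_encode, reference_set)
```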
Alternatively, this filtering step may be integrated in the process of determining the reference set as, for example, illustrated on Figure 4. For instance, the process of Figure 4 may be modified so that only groups comprising headers belonging to the subset are considered at step 400.
If the means illustrated on Figure 5b are available, especially the ones described at step 570 regarding the re-indexing, the filtering of reference set is not necessary. Indeed, in this case, the encoder has the ability to request re-indexing headers from the reference set which are already indexed.
It is worth noting that, if the set of headers to encode belongs to a defined type, other than the default type which is not associated to any header names / header values, and that the corresponding headers from the subset are already indexed in the table, the filtering may also be used to force the selection of such a reference set. By doing so, an encoder can ensure that the headers characterizing a group will not be added more than once adjacently to the header table. It may also be noted that in some cases, client and server may agree on the contents of the header table while setting up the connection. In such cases, client and server can populate the header table with appropriate headers, especially in order to group headers associated to a given type in the header table. When doing so, the encoder will typically encode the sets of headers in a way that preserves the proximity of the headers associated to a given type in the header table.
Finally, the definition of types may also depend on the amount of memory available for storing headers in the header table. For instance, when the memory is highly constrained (e.g. 512 bytes), some headers may be excluded from the definition of types; on the other hand, when more memory is available (e.g. 4096 bytes), more headers may be considered in group definitions given that more space is available. As an example, the "user-agent" header is generally quite large. If the memory is highly constrained, excluding "user-agent" from the headers defining a group avoids adding "user-agent" headers multiple times to the header table. Yet, this implies that "user-agent" will often not belong to the chosen reference set. On the other hand, if a lot of memory is available, "user-agent" may be added to the different type definitions. This will result in the "user-agent" header being added multiple times to the header table, but this is not an issue given that a lot of memory is available. Besides, this often allows including it in the reference set, which results in a better compaction.
Figure 7 describes a pre-processing that is applied to the differences determined at step 330 of Figure 3 in some embodiments of the invention.
At step 700, the type of the set of headers to encode and the headers associated to said type are determined. At step 710, a set of headers comprising headers from the differences that should occur adjacently in the table is determined. Those headers are then ordered at step 720 so that they occur successively in the differences. Consequently, when they are encoded at step 350 of Figure 3, they will occur successively. Furthermore, if some headers are more important than others to define a type, those headers may be grouped together, so that they form a group in the header table.
The reordering step may take further constraints into account. Indeed, as illustrated in the example with the "cookie" header and GET and POST header set types, some headers may belong to different types, and an appropriate ordering of headers may enable placing such headers at the end of the region of the header table corresponding to a given header set, and at the beginning of the region of the header table corresponding to another header set. By doing so, a single occurrence of said headers enables defining two regions of the header table, each corresponding to a type. Generally speaking, if two types of header sets are considered, headers from a header set of the first type can be sorted based on their variability so that the more variable headers occur at the beginning, whereas for the second type, the sorting is made so that the more variable headers occur at the end. Thus, a region of the table is created that represents both types, the core of this region being actually used for both types, while the beginning and end are in fact used only for a single type. This is efficient from the memory usage point of view. It also facilitates the encoder task when some space should be obtained in the header table: indeed, the core of said region should be preserved, while the beginning and the end may be used to store other headers. More generally, even in the absence of types, headers may be ordered based on their variability, either by decreasing variability or by growing variability. Similarly, even in the absence of types, alternating decreasing and growing variability ordering between successive sets of headers to encode may enable reducing the size of the data encoded.
As an example, such reordering may consist in distinguishing various sets of headers: variable headers, for example the URL header, which are likely to change for every set of headers; type-specific headers, for example method, accept, status..., which are likely to remain identical for all header sets of said type; and common headers, for example cookie, user-agent, which are likely to be common to all header sets. When encoding a first type of header set for the first time, variable headers are encoded first, then type-specific headers, then common headers. When encoding a second type of header set for the first time, the common headers do not have to be added to the table a second time given that those headers are already present; if new common headers occur, for example because a cookie has been defined, those headers are encoded first, followed by type-specific headers and variable headers. Consequently, after this process, one region of the table can be defined as type#1-specific + common headers, and another one as common + type#2-specific headers. Yet, both regions comprise a common region, corresponding to common headers. If more types are considered, the same process can be applied to other types, grouped by pairs. In other words, the headers to be added to the header table and pertaining to a set of headers to encode, said set of headers to encode being of a given type, are ordered depending on whether the header is common to most sets of headers, common to most sets of headers of the given type or specific to said set of headers to encode.
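The ordering described above may be sketched with a stable sort over three categories. The category table `CATEGORY`, the function name `order_for_indexing` and the sample headers are hypothetical and serve only to illustrate the two opposite orderings used for the first and second types.

```python
# Assumed classification of header names; unknown names are treated as variable.
CATEGORY = {
    "url": "variable",
    "method": "type",
    "accept": "type",
    "status": "type",
    "cookie": "common",
    "user-agent": "common",
}


def order_for_indexing(headers, first_type):
    """Order headers so that common headers end up between the two type-specific regions."""
    if first_type:
        order = ["variable", "type", "common"]   # common headers are added last
    else:
        order = ["common", "type", "variable"]   # new common headers are added first
    rank = {category: position for position, category in enumerate(order)}
    # sorted() is stable, so headers of a same category keep their relative order.
    return sorted(headers, key=lambda h: rank[CATEGORY.get(h[0], "variable")])


headers = [("cookie", "abcdefgh"), ("method", "GET"), ("url", "/a")]
first = order_for_indexing(headers, first_type=True)
second = order_for_indexing(headers, first_type=False)
```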
Finally, at step 740, the most compact representation enabling indexing or re-indexing is selected for each header belonging to the determined headers from step 710. In general, some representations may enable indexing, typically the literal representation, while others do not, typically the indexed representation. Here, given that determined headers from step 710 should occur adjacently in the header table, selecting a representation enabling indexing, and re-indexing if the header is already indexed, is advantageous.
However, if the means provided at step 580 of Figure 5b are available, re-indexing can be requested even for representations which are not compatible with indexing. Therefore, in this case, step 740 simply consists in selecting the most compact representation for each header from the determined headers from step 710. The process then ends at step 790.
Such pre-processing may be applied more generally, even in the absence of type management, namely in the absence of step 700. The most basic pre-processing simply consists in reordering the headers comprised in the differences so that the encoder may control the order in which they are added to the header table.
In addition, if some means are available for selecting the index in the header table where headers are added in case of incremental indexing, such means can advantageously be used by an encoder to reserve some indexes around a given set of headers indexed in the header table. Such a mechanism increases the flexibility of the encoding, for instance by making it possible to enrich an already defined region without removing already indexed headers.
Several approaches can be contemplated to grow reference sets incrementally within a header table. Two preferred embodiments are proposed here. First, an identified reference set may be surrounded by headers that are indexed for the purpose of being replaced later on, typically using indexing by substitution. Alternatively, incrementally indexed headers may be placed at various places of the header table. By default, incrementally indexed headers are added at the end of the header table. Rules may be defined so that the position of insertion is defined on a per-message basis. For instance, either by default or based on specific message instructions, all incrementally indexed headers are appended in the header table at the end of the reference set. Alternatively, the instruction may contain a specific insertion position in the header table. Note also that the same kind of rules may be applied to the removal of header entries in case an eviction policy is used to automatically remove headers from the table (e.g. when reaching the maximal size of the header table). For instance, the determined insertion position may also serve as a removal position.
Alternatively, if the reference set comprises headers whose indexes are not adjacent in the header table, incrementally indexed headers may first be added in the holes between headers from the reference set, and then appended after the last header from the reference set. In case a void reference set is used, an index for starting incremental indexing may be determined from the header table or defined in the encoded message, said index corresponding to the beginning of a region of the table comprising rarely used headers. Finally, note that selecting the index at which a header should be placed in the header table is also possible through substitution indexing; however, this solution requires indicating the selected index for each indexed header, which is less compact.
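The hole-filling rule above can be sketched as follows. This is a sketch under stated assumptions: the function name `insertion_positions` is hypothetical, and the text does not specify how entries already occupying the holes are handled, so the code only computes the sequence of target indexes.

```python
from typing import Iterable, List


def insertion_positions(reference_indexes: Iterable[int], count: int) -> List[int]:
    """Return `count` target indexes for incrementally indexed headers.

    Holes between the lowest and highest index of the reference set are
    used first, in increasing order; remaining headers are placed after
    the last header of the reference set.
    """
    ref = set(reference_indexes)
    if not ref:
        # A void reference set needs a start index from elsewhere
        # (header table or encoded message); not modelled here.
        raise ValueError("reference set is void")
    lowest, highest = min(ref), max(ref)
    holes = [i for i in range(lowest, highest + 1) if i not in ref]
    overflow = max(0, count - len(holes))
    tail = list(range(highest + 1, highest + 1 + overflow))
    return (holes + tail)[:count]


# Reference set at indexes 3, 4, 6 and 9: holes are 5, 7 and 8.
print(insertion_positions([3, 4, 6, 9], 5))
```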
Figure 8 describes the decoding of a set of headers according to an embodiment of the invention.
At step 800, the definition of the reference set is decoded. In particular, it is determined whether the reference set is the default one or not. If so, the previous decoded set of headers is used as the reference set; if not, the definition of the region of the table used as a reference set is decoded.
At step 810, the corresponding reference set is created, either from the previous decoded set of headers, or from the header table and the definition of the region to be used as a reference set.
At step 820, the differences are decoded. This decoding process may comprise updating the header table depending on the way headers have been encoded.
Those differences are taken into account at step 840 in order to obtain the decoded set of headers. In particular, headers present in both the reference set and the differences are removed from the reference set, while headers absent from the reference set but present in the differences are added to said reference set. By doing so, the decoded set of headers is finally obtained, and the process ends at step 890.
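The way differences are applied at step 840 amounts to a toggle (symmetric difference) on the reference set. A minimal sketch, assuming headers are modelled as (name, value) pairs and using the hypothetical function name `apply_differences`:

```python
from typing import List, Tuple

Header = Tuple[str, str]  # (name, value)


def apply_differences(reference_set: List[Header],
                      differences: List[Header]) -> List[Header]:
    """Obtain the decoded set of headers from a reference set.

    A header present in both the reference set and the differences is
    removed; a header present only in the differences is added.
    """
    decoded = list(reference_set)  # leave the caller's list untouched
    for header in differences:
        if header in decoded:
            decoded.remove(header)
        else:
            decoded.append(header)
    return decoded


reference = [("method", "GET"), ("cookie", "a=1"), ("url", "/a")]
differences = [("url", "/a"), ("url", "/b")]
print(apply_differences(reference, differences))
```

In this example the URL header of the reference set is cancelled by the first difference and replaced by the second, while the method and cookie headers are inherited unchanged.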
As can be seen, this process does not involve creating new constructs for storing sets. The decoder simply decodes the encoded stream by applying what was defined by the encoder. The only cost that may be added by the invention consists in storing a list of start indexes decoded in the definition of reference sets, corresponding to what has been defined with regard to Figure 5a. However, such means are just an optimization meant to improve compaction in some embodiments and they could be removed without preventing the usage of the invention.
In an embodiment of the invention, the encoding of a set of headers to encode comprises a flag to indicate whether the reference set to be used is given by its definition or if it is the default reference set. The default reference set corresponds to the previous set of headers to encode. Advantageously, this flag is encoded using one bit.
In an embodiment of the invention, the encoding of a set of headers to encode comprises a flag to indicate whether an optional deletion set is encoded or not.
In an embodiment of the invention, the index defining the beginning of the region defining the reference set, also called the start index of the reference set, is stored for a given number of previously defined reference sets. The previously decoded start indexes are stored in an indexed table. By doing so, a further reference set using a previously stored start index may be encoded using an index in this indexed table instead of the start index itself. For example, if the eight previously used start indexes are stored, a further start index corresponding to one of these start indexes may be encoded using only three bits, for encoding values from 0 to 7. Taking into account a one-bit flag to indicate this kind of encoding and a region in the header table having fewer than 16 headers, it is possible to encode the complete reference set using only one byte of eight bits.
For example, the first bit indicates the kind of encoding, the next four bits indicate the length of the region and the last three bits the index, in the indexed table of previously used start index, of the start index to be used.
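The one-byte layout described above can be sketched as follows. The function names, the flag value 1 for "start index taken from the indexed table", and the contents of `recent_starts` are assumptions made for illustration; only the 1 + 4 + 3 bit split comes from the text.

```python
from typing import List, Tuple


def pack_reference(length: int, slot: int) -> int:
    """Pack flag (1 bit), region length (4 bits) and table slot (3 bits)."""
    if not 0 <= length < 16:
        raise ValueError("length must fit in 4 bits")
    if not 0 <= slot < 8:
        raise ValueError("slot must fit in 3 bits")
    flag = 1  # assumption: 1 means "start index given by table slot"
    return (flag << 7) | (length << 3) | slot


def unpack_reference(byte: int) -> Tuple[int, int, int]:
    """Return (flag, length, slot) from one encoded byte."""
    flag = (byte >> 7) & 0x01
    length = (byte >> 3) & 0x0F
    slot = byte & 0x07
    return flag, length, slot


# Hypothetical table of the eight most recently used start indexes.
recent_starts: List[int] = [12, 40, 7, 0, 0, 0, 0, 0]

byte = pack_reference(length=5, slot=1)
flag, length, slot = unpack_reference(byte)
start = recent_starts[slot]
print(hex(byte), flag, length, start)
```

The decoder recovers the region of the header table from `start` and `length` without the start index itself being transmitted.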
Figure 9 is a schematic block diagram of a computing device 900 for implementation of one or more embodiments of the invention. The computing device 900 may be a device such as a micro-computer, a workstation or a light portable device. The computing device 900 comprises a communication bus connected to:
-a central processing unit 901, such as a microprocessor, denoted CPU;
-a random access memory 902, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method for encoding or decoding at least part of an image according to embodiments of the invention, the memory capacity thereof can be expanded by an optional RAM connected to an expansion port for example;
-a read only memory 903, denoted ROM, for storing computer programs for implementing embodiments of the invention;
-a network interface 904, typically connected to a communication network over which digital data to be processed are transmitted or received. The network interface 904 can be a single network interface, or composed of a set of different network interfaces (for instance wired and wireless interfaces, or different kinds of wired or wireless interfaces). Data packets are written to the network interface for transmission or are read from the network interface for reception under the control of the software application running in the CPU 901;
-a user interface 905, which may be used for receiving inputs from a user or to display information to a user;
-a hard disk 906, denoted HD, which may be provided as a mass storage device;
-an I/O module 907, which may be used for receiving/sending data from/to external devices such as a video source or display.
The executable code may be stored either in read only memory 903, on the hard disk 906 or on a removable digital medium such as for example a disk.
According to a variant, the executable code of the programs can be received by means of a communication network, via the network interface 904, in order to be stored in one of the storage means of the computing device 900, such as the hard disk 906, before being executed.
The central processing unit 901 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to embodiments of the invention, which instructions are stored in one of the aforementioned storage means. After powering on, the CPU 901 is capable of executing instructions from main RAM memory 902 relating to a software application after those instructions have been loaded from the program ROM 903 or the hard-disc (HD) 906 for example. Such a software application, when executed by the CPU 901, causes the steps of the flowcharts shown in Figures 3 to 8 to be performed.
Any step of the algorithms shown in Figures 3 to 8 may be implemented in software by execution of a set of instructions or program by a programmable computing machine, such as a PC ("Personal Computer"), a DSP ("Digital Signal Processor") or a microcontroller; or else implemented in hardware by a machine or a dedicated component, such as an FPGA ("Field-Programmable Gate Array") or an ASIC ("Application-Specific Integrated Circuit").
Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications which lie within the scope of the present invention will be apparent to a person skilled in the art.
Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular the different features from different embodiments may be interchanged, where appropriate.
In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.

Claims (24)

  1. A method of encoding a set of headers to encode using an indexed header table associating headers with respective coding indexes, the method comprising: -identifying in the header table a region comprising headers being part of the set of headers to encode; -determining a reference set of headers in the header table based on said region; -determining headers representing differences between the set of headers to encode and the reference set; and -encoding the set of headers to encode by encoding both a definition of said reference set and said determined headers representing differences.
  2. The method according to claim 1, comprising: -determining the reference set of headers based on how headers to encode are distributed in the identified region.
  3. The method according to claim 1 or 2, wherein encoding the definition of said reference set comprises: -encoding a start index representing the lowest index among the indexes of the headers comprised in the reference set; and -encoding a length representing the difference between the highest index and the lowest index among the indexes of the headers comprised in the reference set.
  4. The method according to any one of claims 1 to 3, wherein encoding the definition of said reference set comprises: -encoding deletion information indicating that some headers belonging to the reference set should be deleted.
  5. The method according to any one of claims 1 to 4, further comprising: -storing in an indexed table at least the start indexes in the header table of a given number of previously defined reference sets; and -encoding the reference set using an index in said indexed table comprising at least the start indexes.
  6. The method according to any one of claims 1 to 5, wherein determining the reference set comprises: -initializing a reference set with the void set; -determining a set of at least one header in the header table which reduces the size of encoded data when added to the reference set; and -adding the determined set in the reference set.
  7. The method according to any one of claims 1 to 5, wherein determining the reference set comprises: -determining all possible reference sets based on the identified region in the header table; and -selecting the reference set which minimizes the size of the encoded data.
  8. The method according to claim 6 or 7, comprising: -comparing the size of the encoded data obtained for the determined reference set and for a default set comprising a set of headers previously encoded; and -selecting the set which minimizes the size of the encoded data among the determined reference set and the default set.
  9. The method according to any one of claims 1 to 8, wherein the method further comprises: -encoding a piece of information indicating that some of the already indexed headers from the reference set or from the differences between the reference set and the set of headers to encode should be added anew in the header table.
  10. The method according to claim 9, further comprising: -ordering headers from the set of headers to encode, part of these ordered headers being to be added in the header table.
  11. The method according to any one of claims 1 to 10, comprising: -determining a type of said set of headers to encode; and -determining the reference set based also on this type.
  12. The method according to claims 10 and 11, wherein ordering of the headers is based on the determined type.
  13. The method according to claim 10 or 11, comprising: -ordering of the headers to be added in the header table depending on their variability.
  14. The method according to claim 13, comprising: -ordering of the headers from a first set of headers to encode by decreasing variability; and -ordering of the headers from a second set of headers to encode by growing variability.
  15. The method according to claim 13, wherein two different types of sets of headers to encode are defined and wherein the method comprises: -adding in the header table headers common to both types of sets of headers to encode between headers specific to the first type and headers specific to the second type.
  16. The method according to any one of claims 9 to 15, further comprising: -selecting the index in the header table where incrementally indexed headers are added.
  17. The method according to any one of claims 9 to 15, further comprising: -determining the index in the header table where incrementally indexed headers are added based on the definition of the reference set.
  18. A method of decoding a set of headers using an indexed header table associating headers with respective coding indexes, the method comprising: -decoding a definition of a reference set of headers; -creating the reference set corresponding to this definition; -decoding differences; and -obtaining the decoded set of headers by applying differences to the reference set.
  19. A device for encoding a set of headers to encode comprising: -an indexed header table associating headers with respective coding indexes; -an identifying module for identifying in the header table a region comprising headers being part of the set of headers to encode; -a reference set determining module for determining a reference set of headers in the header table based on said region; -a difference determination module for determining headers representing differences between the set of headers to encode and the reference set; and -an encoding module for encoding the set of headers to encode by encoding both a definition of said reference set and said determined headers representing differences.
  20. A device for decoding a set of headers comprising: -an indexed header table associating headers with respective coding indexes; -a definition decoding module for decoding a definition of a reference set of headers; -a reference set creating module for creating the reference set corresponding to this definition; -a difference decoding module for decoding differences; and -an obtaining module for obtaining the decoded set of headers by applying differences to the reference set.
  21. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing a method according to any one of claims 1 to 18, when loaded into and executed by the programmable apparatus.
  22. A computer-readable storage medium storing instructions of a computer program for implementing a method according to any one of claims 1 to 18.
  23. A method of encoding a set of headers to encode using an indexed header table associating headers with respective coding indexes, substantially as hereinbefore described with reference to, and as shown in Figures 2 to 7.
  24. A method of decoding a set of headers substantially as hereinbefore described with reference to, and as shown in Figure 8.
GB1312112.4A 2013-07-05 2013-07-05 Method and device for encoding headers of a message using reference header sets Withdrawn GB2515826A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB1312112.4A GB2515826A (en) 2013-07-05 2013-07-05 Method and device for encoding headers of a message using reference header sets
GB1312498.7A GB2515839A (en) 2013-07-05 2013-07-12 Method and device for encoding headers of a message using reference header sets

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1312112.4A GB2515826A (en) 2013-07-05 2013-07-05 Method and device for encoding headers of a message using reference header sets

Publications (2)

Publication Number Publication Date
GB201312112D0 GB201312112D0 (en) 2013-08-21
GB2515826A true GB2515826A (en) 2015-01-07

Family

ID=49033406

Family Applications (2)

Application Number Title Priority Date Filing Date
GB1312112.4A Withdrawn GB2515826A (en) 2013-07-05 2013-07-05 Method and device for encoding headers of a message using reference header sets
GB1312498.7A Withdrawn GB2515839A (en) 2013-07-05 2013-07-12 Method and device for encoding headers of a message using reference header sets

Family Applications After (1)

Application Number Title Priority Date Filing Date
GB1312498.7A Withdrawn GB2515839A (en) 2013-07-05 2013-07-12 Method and device for encoding headers of a message using reference header sets

Country Status (1)

Country Link
GB (2) GB2515826A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6542504B1 (en) * 1999-05-28 2003-04-01 3Com Corporation Profile based method for packet header compression in a point to point link
EP1334560A2 (en) * 2000-11-16 2003-08-13 TELEFONAKTIEBOLAGET LM ERICSSON (publ) Communication system and method for shared context compression
US20030182454A1 (en) * 2000-07-25 2003-09-25 Hans-Peter Huth Header compression method for network protocols
US20050055464A1 (en) * 2003-09-04 2005-03-10 International Business Machines Corp. Header compression in messages
EP1725943A2 (en) * 2003-12-19 2006-11-29 Nokia Corporation Method and system for header compression
EP2507968A1 (en) * 2009-11-30 2012-10-10 Qualcomm Incorporated Methods and apparatus for improving header compression
GB2496385A (en) * 2011-11-08 2013-05-15 Canon Kk Communicating compressed data packets using dictionary compression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HTTP Header Compression, draft-ietf-httpbis-header-compression-00, HTTPbis Working Group, Internet Draft, 25 June 2013, R. Peon & H. Ruellan *

Also Published As

Publication number Publication date
GB2515839A (en) 2015-01-07
GB201312112D0 (en) 2013-08-21
GB201312498D0 (en) 2013-08-28

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)