SE2150128A1 - Identification of compressed net resources

Identification of compressed net resources

Info

Publication number
SE2150128A1
SE2150128A1 SE2150128A SE2150128A SE2150128A1 SE 2150128 A1 SE2150128 A1 SE 2150128A1 SE 2150128 A SE2150128 A SE 2150128A SE 2150128 A SE2150128 A SE 2150128A SE 2150128 A1 SE2150128 A1 SE 2150128A1
Authority
SE
Sweden
Prior art keywords
http
server
client device
compression
resource
Application number
SE2150128A
Inventor
Carl Hasselskog
Original Assignee
Degoo Backup Ab
Application filed by Degoo Backup Ab
Priority to SE2150128A (SE2150128A1)
Priority to PCT/EP2022/052479 (WO2022167482A1)
Priority to EP22704520.0A (EP4288877A1)
Publication of SE2150128A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3088 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing the use of a dictionary, e.g. LZ78
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643 Hash functions, e.g. MD5, SHA, HMAC or f9 MAC

Abstract

There is provided a method for providing a http resource comprising the steps of
a) a client device sending a http request for a http resource to a compression server, the http request comprising information that identifies a http resource at a resource server, and where the compression server comprises compression software for compressing data for a http resource,
b) the compression server sending a request for the http resource to the resource server defined by the http request, and the resource server sending a first data set for the http resource to the compression server,
c) the compression server saving the first data set and using the first data set to make a first hash, and the compression server sending the first data set and the first hash to the client device,
d) the client device receiving the first data set and the first hash and storing them,
e) the client device sending a second http request to the compression server for the same http resource, and providing the first hash to the compression server,
f) the compression server receiving the second request from the client device and sending a second request for the same http resource or a similar http resource to the resource server, and the resource server sending a second data set for the http resource to the compression server,
g) the compression server saving the second data set and using the second data set to make a second hash,
h) the compression server comparing the first hash and the second hash and, if they are the same, providing a signal to the client device that causes the client device to use the first data set stored in the client device to provide the http resource to a user of the device.

Description

Identification of compressed net resources

Field of the invention

This invention relates to methods and systems for providing http resources, such as web pages.

Background

Typically, http resources, such as web pages, are provided by a server to a client after the server receives a http request from the client.
Data traffic over the internet is growing. For example, today's web pages require much more bandwidth than web pages twenty years ago. A problem is that network capacity in some areas is limited, which results in a poor experience for users, as fetching web pages or other http resources takes too long. Data traffic may also be expensive for users, and users of smart phones frequently have a limited monthly data traffic allowance.
Summary of invention

In a first aspect of the invention there is provided a method for providing a http resource comprising the steps of
a) a client device sending a first http request for a http resource to a compression server, the http request comprising information that identifies a http resource at a resource server, and where the compression server comprises compression software for compressing data for a http resource,
b) the compression server sending a request for the http resource to the resource server defined by the http request, and the resource server sending a first data set for the http resource to the compression server,
c) the compression server saving the first data set and using the first data set to make a first hash, and the compression server sending the first data set and the first hash to the client device,
d) the client device receiving the first data set and the first hash and storing them,
e) the client device sending a second http request to the compression server for the same http resource, and providing the first hash to the compression server,
f) the compression server receiving the second request from the client device and sending a second request for the same http resource or a similar http resource to the resource server, and the resource server sending a second data set for the http resource to the compression server,
g) the compression server saving the second data set and using the second data set to make a second hash,
h) the compression server comparing the first hash and the second hash and, if they are the same, providing a signal to the client device that causes the client device to use the first data set stored in the client device to provide the http resource to a user of the device.
The method may comprise the additional step:
i) if the compression server finds that the first and second hash are different, providing the second data set to the client device, where the data is compressed by the compression server and provided in compressed form to the client device.
The second data set preferably comprises text. The second data set is compressed using a diff compression algorithm, preferably using dictionary-based compression.
The compression server and the client device may independently create a dictionary for compression based on identical http resource data stored by the compression server and the client device respectively, and the dictionary may be used by the compression server to compress the file and used by the client device to decompress the file.
The client device and the compression server may have stored a plurality of datasets for providing the http resource, in which case the client device provides the hash values for the plurality of datasets in step e), and the plurality of datasets is used to compress the second data set in step i).
The dictionary may be based on at least one dataset that is likely to be similar to the dataset of the http resource that is compressed in step i).
There is also provided a system configured to carry out the method according to the first aspect of the invention.
There is also provided client software, for example a browser, for carrying out the client-side steps of the method of the first aspect of the invention.
There is also provided compression server software for carrying out the compression server-side steps of the method of the first aspect of the invention.
In a second aspect of the invention there is provided a method for providing a http resource comprising the steps of
a) a client device sending a http request for a http resource to a compression server, the http request comprising information that identifies a http resource at a resource server,
b) the compression server sending a request for the http resource to the resource server defined by the http request, and the resource server sending a data set for the http resource to the compression server,
c) the compression server receiving the data set and using compression software to compress the data set to obtain compressed data,
d) the compression server providing the compressed data to the client device,
e) the client device receiving the compressed data and using the compressed data to provide the http resource to a user of the device.
The data set may preferably comprise or be a text file, for example a file selected from a html file, a JavaScript file and a CSS file.
The data set is preferably compressed using a diff compression algorithm in step c), preferably dictionary-based compression.
Hence in a preferred embodiment the data set comprises a text file and dictionary-based compression is used in step c).
In a preferred embodiment the compression server and the client device independently create a dictionary for compression based on identical http resource data stored by the compression server and the client device respectively, and the dictionary is used by the compression server to compress the file and used by the client device to decompress the file.
The dictionary may be based on at least one dataset that is likely to be similar to the dataset of the http resource that is requested in step a). When the requested http resource is identified by a URL, the dataset used for creating the dictionary may be selected by comparing the URL of the requested http resource with URLs of other http resources for which the client device has stored data sets, and selecting a data set with a URL that is similar to the URL of the requested http resource.
Datasets with similar URLs are more likely to have similar content and hence are more likely to produce a useful dictionary, i.e. a dictionary that comprises abbreviations for many phrases in the dataset to be compressed.
Similarity may be determined by grouping URLs into groups according to predetermined criteria, where the groups have a predetermined ranking.
The compression server and the client device may use a cryptographically unique identifier, such as a hash, of the http resource data to identify the http resource data used for creating the dictionary. The hash may be included in the http request from the client device of step a).
The dictionary may be based on a plurality of http resource data sets.
There is also provided a system configured to carry out the method according to the second aspect of the invention.
There is also provided client software, for example a browser, for carrying out the client-side steps of the method of the second aspect of the invention.
There is also provided compression server software for carrying out the compression server-side steps of the method of the second aspect of the invention.

Drawings

The accompanying drawings form a part of the specification and schematically illustrate preferred embodiments of the invention, and serve to illustrate the principles of the invention.
Fig. 1 is a schematic overview of a system.
Fig. 2 is a schematic drawing of a client device where both software and hardware components are shown.
Fig. 3 is a schematic overview of a server.
Fig. 4 and 5 are flowcharts showing methods.
Fig. 6 is a schematic drawing of hardware.
Detailed description

System 1 comprises a client device 2 and a compression server 3. Resource server 4 is a server that provides http resources such as web pages, RSS feeds, video or the like. Hence resource server 4 may be a web server. Resource server 4 may or may not be a part of system 1, but typically system 1 does not comprise resource server 4 and is instead configured to communicate with resource server 4. Communication between client device 2 and compression server 3, as well as between compression server 3 and resource server 4, is done over the internet as is known in the art.
The client device 2 may be any type of client device such as a smart phone, a laptop, or a tablet computer, for example an iPhone or an Android smart phone. It is preferred that the client device 2 is able to provide http resources to a user, for example by displaying content on a display 5 or playing sounds on a speaker 6 or to headphones. It is preferred that the client device 2 is able to display web pages. The client device 2 may for this purpose have installed in its memory 7 a client software 8, for example a browser 8 for displaying web pages, described in more detail below. Client software 8 does not have to be a web browser but will be referred to as "browser 8" herein. Client software 8 may hence be arranged to send http requests, receive data for http resources from a server, and provide http resources to a user of a client device 2.
Browser 8 may be software adapted to display web pages, for example using html, CSS or JavaScript. The browser 8 is adapted to receive input from a user, for example using a keyboard, touch display or speech recognition, whereby a user can enter commands for fetching a http resource. For example, the user may type the address of a web page into the browser 8. The browser 8 then sends a request for the http resource to the compression server 3.
A http resource herein may be any type of http resource, such as a web page, a video, a sound file, or an RSS feed. In a preferred embodiment the data that enables the client device to provide the http resource to a user comprises text such as a html file or a JavaScript file. Data for the http resources is stored by the resource server 4. The data may be in a file format. Each http resource is identified with a unique identifier which may be a URL (Uniform Resource Locator) or an IP address, for example the web address of a web page, such as for example https://edition.cnn.com/. Social media such as Facebook pages or similar may also be provided as http resources. Of course, a web page may comprise text as well as images, video and sound, but preferably the data comprises at least some text.
According to the prior art, a client device 2 obtains data for the http resource by sending a http request to the resource server 4, which then provides the http resource to the client device 2.
However, according to the invention, the client device 2 provides the http request to the compression server 3 instead. Hence the browser 8 is configured to provide a http request to compression server 3. In a preferred embodiment the http resource is able to change over time without changing its unique identifier. The http resource may for example be a dynamic web page, or a feed for a social media platform.
The browser 8 may, however, in case it cannot reach compression server 3, for example due to some technical fault, send a http request directly to resource server 4 instead.
Client device 2 may comprise decompression software 9 for decompressing data that has been received from the compression server 3. Client device 2 may comprise cache 10 for storing data for http resources, and a database 11 for keeping track of various versions of http resources, for example by storing a hash and relating the hash to the cached data and the http resources.
Compression server 3 has compression software 12 for compressing various types of data. Different types of data such as text and video are preferably compressed using different algorithms. For text a lossless compression algorithm is preferred. It is preferred that the type of compression used is diff-based compression, in particular dictionary-based compression. This is particularly suited when the data is text-based, such as for example text files (html or JavaScript).
Dictionary-based compression may be based on a common dictionary that is independently constructed by the compression server 3 and the client device 2 using a text data set, such as a text file, as a starting point. Provided that the same algorithm is used by the client device 2 and the compression server 3, the client device 2 and the compression server 3 will arrive at the same dictionary in a deterministic fashion. Dictionary-based compression as such is known and may also be referred to as a dictionary coder. In dictionary-based compression a known file that is shared by both sender and receiver is used to create a dictionary. The dictionary comprises a list of phrases and abbreviations. A second data set is compressed by applying the dictionary to the second dataset and replacing phrases with abbreviations. Text in the second dataset that cannot be found in the dictionary is provided in non-compressed form.
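The shared-dictionary idea described above can be sketched as follows. This is only an illustration under stated assumptions: it swaps in the preset-dictionary feature of DEFLATE (Python's `zlib` with the `zdict` argument) rather than the phrase and abbreviation dictionary of the description, and the helper names and sample pages are hypothetical.

```python
import zlib


def compress_with_shared_data(shared: bytes, new_data: bytes) -> bytes:
    """Compress new_data using data both sides already hold as a preset dictionary.

    Assumption: zlib stands in for the dictionary coder; it only uses the
    last 32 KiB of the preset dictionary.
    """
    comp = zlib.compressobj(level=9, zdict=shared)
    return comp.compress(new_data) + comp.flush()


def decompress_with_shared_data(shared: bytes, payload: bytes) -> bytes:
    """Rebuild new_data on the receiving side from the same shared data."""
    decomp = zlib.decompressobj(zdict=shared)
    return decomp.decompress(payload) + decomp.flush()


# Hypothetical previous version of a page, held by both client and server.
old_page = (
    "<html><head><title>Daily news</title></head><body>"
    + "".join(f"<p>Story {i}: item number {i * 7919}</p>" for i in range(40))
    + "</body></html>"
).encode()

# Hypothetical new version that differs in one place.
new_page = old_page.replace(b"Story 3:", b"Update 3:")

packed = compress_with_shared_data(old_page, new_page)
restored = decompress_with_shared_data(old_page, packed)
print(len(new_page), len(zlib.compress(new_page, 9)), len(packed))
```

Because both sides derive the dictionary from the same bytes, nothing about the dictionary itself has to be transmitted; only the compressed payload travels over the network.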
The compression server 3 also comprises browser reply software 13 for handling http requests from the browser 8 of client device 2.
Compression server 3 may also comprise hashing software 14 adapted to apply a cryptographic hash function to data in order to provide a hash or a different type of cryptographically unambiguous identifier.
The term "cryptographically unambiguous identifier" (CUI) refers to a second set of data (the digest) derived from a first set of data, where the second set of data is deterministically determined by the first set of data, and where the first set typically cannot be determined from the second set of data. Even a small change of the first set of data results in a large change in the second set of data. Preferably the second set of data is much smaller than the first set of data (requires much less storage space). Preferably the second set of data has a fixed size. Examples of cryptographically unambiguous identifiers are checksum digits and a hash, where a hash is preferred. Applying a hash algorithm to data results in the output of a fixed-size bit string. One example of a frequently used hash algorithm is SHA-256. Other hash functions are SHA-1 and SHA-512. Such a CUI may serve as a "fingerprint" for the first set of data.
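As a minimal sketch of such a fingerprint, the following uses SHA-256 from Python's standard `hashlib` module. The function name `make_cui` and the sample data are assumptions made for illustration only.

```python
import hashlib


def make_cui(data: bytes) -> str:
    """Return a fixed-size fingerprint (SHA-256 hex digest) of a data set."""
    return hashlib.sha256(data).hexdigest()


# Two hypothetical versions of the same resource that differ by one character.
cui_v1 = make_cui(b"<html>version 1</html>")
cui_v2 = make_cui(b"<html>version 2</html>")

print(cui_v1)
print(cui_v2)
```

The digest always has the same length regardless of input size, and the two digests differ although the inputs differ by a single character.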
Compression server 3 may also comprise a cache 15 for storing data sets for http resources (resource data sets), and a database 16 that stores CUIs for various datasets for http resources, or other means for keeping track of different versions of http resources, such as time stamps. In the database 16 a hash (or other CUI) is associated with one and only one dataset for a http resource. Hence the hash (or other CUI) may be used to identify the dataset.
Fig. 4 is a flowchart that shows a method.
In step 100 a client device 2 sends a http request for a http resource to a compression server 3, where the http request comprises information that identifies a http resource at a resource server 4. Step 100 may for example be initiated by a user of client device 2 specifying a URL in some manner. Optionally the client device 2 in this step provides information to the compression server 3 in the http request that specifies which datasets for other http resources the client device has stored in its memory, in particular previous versions of the requested http resource or data sets for http resources that are similar, or likely to be similar, but not identical, to the data set for the requested http resource. The use of a hash (or other CUI) for tracking datasets for this purpose is described below with reference to Fig. 5. In general, browser reply software 13 of compression server 3 handles requests from client device 2 and provides messages from compression server 3 to client device 2.
In step 101 the compression server 3 sends a request for the http resource to the resource server 4 defined by the http request, and the resource server 4 sends a data set for the http resource to the compression server 3.
In step 102 the compression server 3 receives the data set and uses compression software 12 to compress the data set to obtain compressed data.
In step 103 the compression server 3 provides the compressed data to the client device 2.
In step 104 the client device 2 receives the compressed data and uses the compressed data to provide the http resource to a user of the device 2. For example, when the http resource is a web page the web page is displayed to the user, for example on the display 5 of the client device 2. When the http resource is a video the video is played by the client device 2. When the http resource is sound, for example a podcast, the podcast is played by the client device, for example using speaker 6.
The data preferably comprises text, for example a html file or a JavaScript file, and compression is preferably done using a diff compression algorithm. In particular dictionary-based compression may be used. As is known in the art, the compressed data is, when dictionary-based compression is used, provided as information about where coded phrases appear and typically some non-compressed data (phrases that are not present in the dictionary).
When dictionary-based compression is used, the compression server 3 and the client device 2 preferably independently create a dictionary for compression based on identical http resource data sets stored by the compression server 3 and the client device 2 respectively, and the dictionary is used by the compression server 3 to compress the data and used by the client device 2 to decompress the data. For example, the dictionary may be based on data that the compression server 3 has access to, and that the compression server 3 knows that the client device 2 has access to, for example a text file, such as html or JavaScript code for providing a web page. Different versions of the web page may be identified using a hash (or other CUI) as described herein. For example, and with reference to Fig. 5, the compression server 3 receives the hash values from the client device 2 and compares the hash values of the message from the client device 2 to hash values in the database 16. The compression server 3 then bases the dictionary on datasets that are present at both compression server 3 and client device 2. Timestamps may also be used for identifying different versions of http resources.
In a preferred embodiment the compressed data is provided from the compression server 3 with data that identifies the data, for example text data, used for building the dictionary, in particular the http resource data set that is used for building the dictionary.
In a preferred embodiment the dictionary is based on a plurality of http resource data sets. An advantage of basing the compression on a plurality of data sets is that compression can be made more efficient.
Compression may be based not only on data sets for the http resource but also on data sets for similar http resources, in particular similar text files. This has the advantage of providing even more data for building the dictionary.
For example, when the http resource is identified by a URL, compression may be based on data sets for http resources with similar, but not identical, URLs. Hence the similar dataset may be identified by comparing the URL of the requested http resource with the URLs of http resources that are stored by the compression server 3 and which the compression server 3 knows that the client device 2 has stored in its cache 10. The client device 2 may provide the compression server 3 with the CUI (hash) of datasets that client device 2 determines are likely to be similar to the dataset that is to be compressed. Hence, client device 2 may comprise logic for determining the similarity between URLs for datasets, and may select datasets with URLs that are similar to the URL of the requested http resource for creating the dictionary.
Similarity may be determined by grouping URLs into groups according to predetermined criteria, where the groups have a predetermined ranking. For example, the compression server may 1) first look for data sets for http resources with URLs that are identical to the requested web address, 2) then look for URLs that have the same domain address and path, but possibly a different query string, 3) then look for URLs in the same domain but otherwise different, 4) then look for URLs that have the same path, but in a different domain. The fourth option may be useful to reuse files that are used across multiple domains, such as advertising scripts. A request for such a script is typically made as a sub request.
Other groups and rankings may of course be used to determine similarity.
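The four ranked groups above could be sketched like this. It is only one possible reading: the function names, the example URLs, and the decision to treat "domain" as the full host part of the URL are assumptions, not details taken from the description.

```python
from typing import List, Optional
from urllib.parse import urlsplit


def similarity_rank(requested: str, candidate: str) -> Optional[int]:
    """Return the group rank (1 = best) of candidate relative to requested.

    Assumption: "domain" is compared as the URL's host part (netloc).
    Returns None when the candidate falls in none of the four groups.
    """
    if requested == candidate:
        return 1  # identical URL
    req, cand = urlsplit(requested), urlsplit(candidate)
    if req.netloc == cand.netloc and req.path == cand.path:
        return 2  # same domain and path, different query string
    if req.netloc == cand.netloc:
        return 3  # same domain, otherwise different
    if req.path == cand.path:
        return 4  # same path, different domain (e.g. shared scripts)
    return None


def pick_similar(requested: str, cached_urls: List[str]) -> List[str]:
    """Order cached URLs from most to least similar, dropping unrelated ones."""
    ranked = [(similarity_rank(requested, url), url) for url in cached_urls]
    kept = [(rank, url) for rank, url in ranked if rank is not None]
    kept.sort(key=lambda pair: pair[0])  # stable sort keeps input order within a group
    return [url for _, url in kept]


# Hypothetical cache contents.
cached = [
    "https://ads.example.net/world",
    "https://news.example.com/sport",
    "https://other.example.org/x",
    "https://news.example.com/world?page=1",
]
print(pick_similar("https://news.example.com/world?page=2", cached))
```

A real implementation would probably exclude trivial paths such as "/" from the fourth group, since almost every site has one.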
The extent of similarity between URLs may be determined using suitable techniques, such as determining the Levenshtein distance between two URLs, for example.
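For reference, the Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. A minimal sketch using the standard two-row dynamic programming formulation follows; the example URLs are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein (edit) distance between strings a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


url_a = "https://example.com/article?id=1"
url_b = "https://example.com/article?id=2"
print(levenshtein(url_a, url_b))
```

A smaller distance suggests more similar URLs, so cached data sets could be ordered by increasing distance to the requested URL.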
The system 1 or the browser 8 may be configured to provide updated data for http resources under certain conditions, in particular data for http resources that a user frequently uses.
The operating system of client device 2 may for example detect a cue from the hardware of client device 2, or from a part of the operating system itself, in order to trigger prefetching of data for a predetermined http resource. The cue may be provided to the browser 8 from the operating system of the client device 2. The cue may be a cue that indicates that the client device is currently not being used by a user. The cue may be a cue selected from: the client device is connected to Wi-Fi, the battery of the client device 2 is being charged, the display 5 of the client device 2 is switched off, the client device 2 is not playing sound, and a certain time of day, for example during night-time, such as between 1 AM and 5 AM. It may be useful, for example, if the client device updates its cache 10 with the http resource data set for cnn.com at night-time if a user frequently visits cnn.com in the morning, because then the web page will be displayed faster to the user.
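A prefetch decision based on such cues might look like the sketch below. The `DeviceState` fields and the choice to require all cues at once are assumptions made for the example; the description only lists the cues as alternatives and does not say how they are combined.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class DeviceState:
    """Hypothetical snapshot of cues reported by the operating system."""
    on_wifi: bool
    charging: bool
    display_on: bool
    playing_sound: bool
    local_time: time


def should_prefetch(
    state: DeviceState,
    window_start: time = time(1, 0),
    window_end: time = time(5, 0),
) -> bool:
    """Return True when every cue suggests the device is idle and cheap to use.

    Assumption: all cues must hold at the same time; a real policy could
    just as well trigger on any single cue.
    """
    idle = not state.display_on and not state.playing_sound
    in_window = window_start <= state.local_time < window_end
    return state.on_wifi and state.charging and idle and in_window


night = DeviceState(True, True, False, False, time(2, 30))
morning = DeviceState(True, False, True, False, time(8, 15))
print(should_prefetch(night), should_prefetch(morning))
```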
Fig. 5 shows a method wherein a client device 2 and compression server 3 use hashes (or other CUIs) to identify versions of http resources such as web pages. The hashes are also used as identities for data sets that are likely to be similar to data sets for requested http resources.
In step 200 a client device 2 sends a http request for a http resource to a compression server 3. The http request comprises information that identifies a http resource at a resource server 4, for example, a web address.
In step 201 the compression server 3 sends a request for the http resource to the resource server 4 defined by the http request, and the resource server 4 sends a data set (first data set) for the http resource to the compression server 3.
In step 202 the compression server 3 saves the first dataset in the cache 15. The compression server 3 then uses the hashing software 14 and the first data set to make a hash (digest) (first hash) of the first data set. The first hash is stored in database 16 of the compression server 3, where it is associated with the first dataset. The compression server 3 then sends the first data set and optionally the first hash to the client device 2. The first data set may be provided non-compressed to the client device 2 but may also be slightly compressed, such as with the use of Brotli.
In step 203 the client device 2 receives the first data set and optionally the first hash and stores them in memory 7. The first data set is stored in the cache 10 of the memory 7. The first hash is stored in the database 11 of the client device 2.
The client device 2 may then use the first data set to provide the http resource to a user of the client device 2. The client device may for example display a web page that is defined by the first data set.
In step 204 the client device 2 sends a second http request for the same http resource to the compression server 3. For example, the second http request comprises the same web address as the first http request. This may for example be caused by the user revisiting the same web page using browser 8. Dynamic http resources such as dynamic web pages may be updated automatically, without the user triggering a http request. The client device 2 provides the first hash together with the second request, for example in the http request message.
In step 205 the compression server 3 receives the second request from the client device 2.
The compression server 3 then, in step 206, optionally sends a second request for the same http resource to the resource server 4, and the resource server 4 sends a second data set for the http resource to the compression server 3. The compression server 3 saves the second data set in the cache 15. The compression server 3 uses the second data set to make a second hash. The hash is stored in the database 16 of the compression server.
Alternatively, the compression server 3 has already, before receiving the second request from the client device 2, asked the resource server 4 for an update. The compression server 3 has then suitably also made the second hash when receiving such an update from the resource server 4.
In step 207, the compression server 3 compares the first hash and the second hash. Logic for this may be comprised in browser reply software 13 or in database 16, for example. If the first hash and the second hash are the same, the first data set and the second data set are identical. This means that the http resource has not been changed between the requests (first and second http requests from the client device 2). Hence, client device 2 is able to use the first data set, which has been cached by the client device 2, to provide the http resource to the user of the client device 2. Compression server 3, therefore, in step 208, after concluding that the first hash and the second hash are identical, provides a signal to the client device 2 that causes the client device 2 to use the first data set stored in the cache 10 of client device 2 to provide the http resource to a user of the client device 2. For example, the client device 2 may use the first data set for displaying a web page on the display 5 of the client device 2.
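Steps 200 to 208 can be illustrated with a small in-memory simulation. Everything here is a simplified assumption made for the sketch: the class names, the status strings "full" and "use-cached", and the `fetch` callable that stands in for the resource server. No real network traffic or compression is involved.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple


class CompressionServer:
    """Toy stand-in for compression server 3 (cache 15 and database 16 merged)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self.fetch = fetch                     # stands in for the resource server
        self.datasets: Dict[str, bytes] = {}   # hash -> data set

    def handle(self, url: str, client_hash: Optional[str] = None
               ) -> Tuple[str, str, Optional[bytes]]:
        data = self.fetch(url)                       # steps 201 / 206
        digest = hashlib.sha256(data).hexdigest()    # steps 202 / 206
        self.datasets[digest] = data
        if client_hash == digest:                    # step 207
            return ("use-cached", digest, None)      # step 208: no body sent
        return ("full", digest, data)


class Client:
    """Toy stand-in for client device 2 (cache 10 and database 11 merged)."""

    def __init__(self, server: CompressionServer) -> None:
        self.server = server
        self.cache: Dict[str, Tuple[str, bytes]] = {}  # url -> (hash, data set)
        self.last_status = ""

    def get(self, url: str) -> bytes:
        known = self.cache.get(url)
        known_hash = known[0] if known else None
        status, digest, body = self.server.handle(url, known_hash)  # steps 200 / 204
        self.last_status = status
        if status == "use-cached" and known is not None:
            return known[1]                     # reuse the stored first data set
        assert body is not None
        self.cache[url] = (digest, body)        # step 203
        return body


origin = {"https://example.com/": b"<html>v1</html>"}
client = Client(CompressionServer(lambda url: origin[url]))
print(client.get("https://example.com/"), client.last_status)
print(client.get("https://example.com/"), client.last_status)
```

On the second request only a short signal crosses the link between compression server and client, which is where the bandwidth saving comes from.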
Hence the hash value (digest) (or other type of CUI) is used by the system 1 to keep track of which version of a http resource is cached by the client device 2.
If the first and second hash are not the same, the compression server 3 may optionally, in step 209, provide the second data set to the client device 2. This is preferably done in a compressed form as is described above with reference to Fig. 4. Hence when concluding that the first and second hash are not the same, the compression server 3 may proceed to compress the second dataset and provide the compressed data to the client device 2. The client device 2 may then decompress the second data set and use it to provide the http resource to the user of the client device 2. As discussed herein it is preferred that the compressed data is a text file. It is also preferred that a diff compression algorithm is used, such as dictionary-based compression.
Hence the compression of the second data set may be based on a dictionary based on the first data set. The compression server 3, which knows that the client device 2 has the first data set (because the client device 2 has sent the first hash), uses the first data set to construct a dictionary and uses the dictionary to compress the second dataset. The compressed file is provided from the compression server 3 including information about what data has been used to compress the file, i.e. the first hash. The client device 2 receives the compressed data and the hash and uses the first dataset (stored in memory 7) to construct an identical dictionary, in order to decompress the second data set. Because the dictionary is built in a deterministic way, it will be identical at client device 2 and compression server 3 although built independently by the compression server 3 and the client device 2. Hence hashing may be used to exchange information about what data the compression is based on.
In one embodiment, the client device 2 and the compression server 3 have stored a plurality of datasets for providing the http resource and the client device 2 provides the hash values for the plurality of datasets in step 204 and the plurality of datasets is used to compress the second data set in step 209. Hence the client device 2 attaches a plurality of hashes to the http request, saying "I have these previous versions of the http resource" to the compression server 3. An advantage of basing the compression on a plurality of data sets is that compression can be made more efficient.
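The dictionary-based exchange described above can be sketched in Python using the preset-dictionary support of the standard zlib module. This is only one possible realisation under stated assumptions: the description does not name zlib or SHA-256, and the function names build_dictionary, compress_update and decompress_update, as well as the rule of ordering data sets by their hash, are hypothetical.

```python
import hashlib
import zlib


def make_hash(data_set: bytes) -> str:
    """Digest identifying a data set (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data_set).hexdigest()


def build_dictionary(data_sets: list[bytes]) -> bytes:
    """Build the dictionary deterministically so both sides get the same bytes.

    Data sets are ordered by their hash (an assumed rule) and concatenated;
    only the last 32 KiB is kept, since deflate uses a 32 KiB window.
    """
    ordered = sorted(data_sets, key=make_hash)
    return b"".join(ordered)[-32768:]


def compress_update(server_store: dict[str, bytes],
                    client_hashes: list[str],
                    second_data_set: bytes) -> tuple[list[str], bytes]:
    """Server side: compress against the versions the client reports having."""
    base_hashes = [h for h in client_hashes if h in server_store]
    if not base_hashes:
        raise ValueError("no shared data set to build a dictionary from")
    zdict = build_dictionary([server_store[h] for h in base_hashes])
    compressor = zlib.compressobj(zdict=zdict)
    payload = compressor.compress(second_data_set) + compressor.flush()
    # The hashes tell the client which cached data sets the dictionary uses.
    return base_hashes, payload


def decompress_update(client_cache: dict[str, bytes],
                      base_hashes: list[str],
                      payload: bytes) -> bytes:
    """Client side: rebuild the same dictionary and decompress."""
    zdict = build_dictionary([client_cache[h] for h in base_hashes])
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(payload) + decompressor.flush()
```

Because both sides derive the dictionary only from data sets they already hold, the dictionary itself is never transmitted; only the hashes and the compressed payload are.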
Compression may be based not only on data sets for the http resource but also on data sets that are likely to be similar, in particular similar text files, such as similar web pages, as described above.
It is understood that the present methods and system are computer-implemented, using digital computer equipment. The various embodiments and components described herein, such as client device 2, compression server 3 and resource server 4, and communication between these components, use digital computer technology for storing and handling digital information and signals as well as suitable hardware and software, including for example suitable digital processors, digital memories, input means, output means, buses and communications interfaces. A user may be able to make input using for example a keyboard, a mouse or a touch screen. Output may be provided on for example a display 5.
The various components, such as client device 2 and compression server 3, may each have an operating system. With reference to Fig. 6, each of client device 2, compression server 3 and resource server 4 may comprise control circuitry comprising a memory 80, a processor 81, a bus 82 and a communication interface 83.
Compression server 3 or resource server 4 may be one physical server or may be a virtual server. The function of compression server 3 or resource server 4 may hence be distributed across several physical entities.
The methods herein can be implemented with any suitable combination of software and hardware. Any suitable programming language may be used for the software units and methods described. Data communication in system 1 and between system 1 and resource server 4 may be implemented using suitable networking technologies and protocols, including cellular communication such as 3G, 4G and 5G, LoRa, Wi-Fi or Bluetooth, or Ethernet. Data communication can be wireless or wire-bound. Information may be exchanged over a wide area net such as the internet. Data communication in system 1 may be encrypted.
Communication in system 1 between client device 2, compression server 3 and resource server 4 may be carried out using any suitable schedule. Typically, a reply to a http request is sent immediately. However, lazy loading may be used for saving data. Lazy loading means that the interactions of a user with a web page trigger loading of web resources such as for example an image. For example, advertisements may be loaded as a user scrolls towards the site where an advertisement is to be inserted.
In certain embodiments, the CUI may be calculated by the client device independently from the compression server.
It is realized that everything which has been described in connection to one embodiment is fully applicable to other embodiments, as compatible. Hence, the invention is not limited to the described embodiments, but can be varied within the scope of the enclosed claims. While the invention has been described with reference to specific exemplary embodiments, the description is in general only intended to illustrate the inventive concept and should not be taken as limiting the scope of the invention. The invention is generally defined by the claims.

Claims (8)

Claims
1. A method for providing a http resource comprising the steps of
a) a client device sending a first http request for a http resource to a compression server, the http request comprising information that identifies a http resource at a resource server, and where the compression server comprises compression software for compressing data for a http resource,
b) the compression server sending a request for the http resource to the resource server defined by the http request, and the resource server sending a first data set for the http resource to the compression server,
c) the compression server saving the first data set and using the first data set to make a first hash, and the compression server sending the first data set and the first hash to the client device,
d) the client device receiving the first data set and the first hash and storing them,
e) the client device sending a second http request to the compression server for the same http resource, and providing the first hash to the compression server,
f) the compression server receiving the second request from the client device and sending a second request for the same http resource or a similar http resource to the resource server, and the resource server sending a second data set for the http resource to the compression server,
g) the compression server saving the second data set and using the second data set to make a second hash,
h) the compression server comparing the first hash and the second hash and if they are the same, providing a signal to the client device that causes the client device to use the first data set stored in the client device to provide the http resource to a user of the device.
2. The method of claim 1 where the additional step is carried out: i) if the compression server finds that the first and second hash are different, providing the second data set to the client device where the data is compressed by the compression server and provided in a compressed form to the client device.
3. The method of claim 2 where the second data set comprises text.
4. The method of any one of claims 1 to 3 where the second data set is compressed using a diff compression algorithm.
5. The method of claim 4 where the diff compression algorithm is dictionary based compression.
6. The method of claim 5 where the compression server and the client device independently create a dictionary for compression based on identical http resource data stored by the compression server and the client device respectively, and where the dictionary is used by the compression server to compress the file and used by the client device to decompress the file.
7. The method of claim 6 where the client device and the compression server have stored a plurality of datasets for providing the http resource and the client device provides the hash values for the plurality of datasets in step e), and the plurality of datasets is used to compress the second data set in step i).
8. The method of claim 6 or 7 where the dictionary is based on at least one dataset that is likely to be similar to the dataset of the http resource that is compressed in step i).
SE2150128A 2021-02-03 2021-02-03 Identification of compressed net resources SE2150128A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
SE2150128A SE2150128A1 (en) 2021-02-03 2021-02-03 Identification of compressed net resources
PCT/EP2022/052479 WO2022167482A1 (en) 2021-02-03 2022-02-02 Identification of compressed net resources
EP22704520.0A EP4288877A1 (en) 2021-02-03 2022-02-02 Identification of compressed net resources

Publications (1)

Publication Number Publication Date
SE2150128A1 true SE2150128A1 (en) 2022-08-04

Family

ID=81324877

Family Applications (1)

Application Number Title Priority Date Filing Date
SE2150128A SE2150128A1 (en) 2021-02-03 2021-02-03 Identification of compressed net resources

Country Status (3)

Country Link
EP (1) EP4288877A1 (en)
SE (1) SE2150128A1 (en)
WO (1) WO2022167482A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5953503A (en) * 1997-10-29 1999-09-14 Digital Equipment Corporation Compression protocol with multiple preset dictionaries
US6178461B1 (en) * 1998-12-08 2001-01-23 Lucent Technologies Inc. Cache-based compaction technique for internet browsing using similar objects in client cache as reference objects
JP2002373106A (en) * 2001-06-13 2002-12-26 Toshiba Corp Device and method for transferring data, and program
US20130346483A1 (en) * 2012-06-25 2013-12-26 Radware, Ltd. System and method for creation, distribution, application, and management of shared compression dictionaries for use in symmetric http networks

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5864837A (en) * 1996-06-12 1999-01-26 Unisys Corporation Methods and apparatus for efficient caching in a distributed environment
CA2659285A1 (en) * 2006-08-03 2008-02-07 Citrix Systems, Inc. Systems and methods for using an http-aware client agent
US10044826B2 (en) * 2016-08-10 2018-08-07 Cloudflare, Inc. Method and apparatus for reducing network resource transmission size using delta compression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Simon Cooke, 'Data compression using pre-generated dictionaries', [Retrieved on 2021-10-15], Retrieved from the internet:https://www.tdcommons.org/dpubs_series/2876 *

Also Published As

Publication number Publication date
EP4288877A1 (en) 2023-12-13
WO2022167482A1 (en) 2022-08-11
