GB2602961A - Data compression and storage techniques - Google Patents
- Publication number
- GB2602961A (application number GB2019022.9A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- data
- remote
- user
- remote server
- user system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0608—Saving storage space on storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0643—Management of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/065—Replication mechanisms
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Disclosed herein is a computer-implemented method of transferring data from a user system to a remote data storage system, the method comprising: selecting, at a user system, one or more data files for remote storage; compressing the selected data files at the user system; transmitting the compressed data files to a remote server system, wherein the remote server system is a remote data storage system; and storing the compressed data files in the remote server system.
Description
Data compression and storage techniques
Field
The field of the invention is data compression and storage. Data in a user system may be transferred to a cloud based storage system. This may be, for example, to provide back-up storage of the data. Embodiments provide techniques for reducing the transmission bandwidth, as well as for reducing the storage and processing requirements in the cloud based storage system.
Background
There are many applications in which data in a user system may be transferred to a cloud based storage system. This may be, for example, to provide back-up storage of the data, or the cloud based storage system may be the only long term store of the data. There is a general need to improve on known techniques for providing remote storage of data.
Summary
Aspects of the invention are set out in the appended independent claims.
List of figures
Figure 1 shows a forwarded compression tool operating on a user system according to an embodiment.
Figure 2 shows a forwarded compression tool operating on a user system when retrieving and decompressing data according to an embodiment.
Figure 3 shows an interaction flow when a third party provides a forwarded compression tool in a web browser according to an embodiment.
Figure 4 shows a forwarded compression tool running on a user system, where decompression verification is provided by a storage provider according to an embodiment.
Figure 5 shows a storage provider that uses a forwarded compression tool according to an embodiment.
Description
A problem with known techniques for transferring data between a user system and a remote storage system, such as a cloud based storage system, is that a large transmission bandwidth may be required. A problem with storing data in such a remote storage system is that either the storage requirements are large, because the data is uncompressed, or the processing requirements are large, due to the remote storage system needing to compress received data.
Embodiments may solve one or more of the above problems with known techniques by compressing data at a user system. This reduces the amount of data that needs to be transmitted and allows compressed data to be stored in the remote storage system without compression being applied at the remote storage system.
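The effect described above can be illustrated with a minimal sketch. It assumes Python's standard `zlib` module as a stand-in for whatever lossless compressor an embodiment would use; the function names are hypothetical and are not taken from the patent.

```python
import zlib


def compress_for_upload(data: bytes, level: int = 6) -> bytes:
    """Losslessly compress a file's bytes on the user system before upload.

    zlib is only an illustrative choice of compressor.
    """
    return zlib.compress(data, level)


def bytes_saved(original: bytes, compressed: bytes) -> int:
    """Number of bytes that no longer need to be transmitted or stored."""
    return len(original) - len(compressed)


if __name__ == "__main__":
    original = b"sensor reading: 21.5 C\n" * 1000
    payload = compress_for_upload(original)
    print("original bytes:", len(original))
    print("payload bytes:", len(payload))
    print("bytes saved:", bytes_saved(original, payload))
```

The saving depends entirely on how compressible the data is; already-compressed inputs may shrink very little or even grow slightly.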
Embodiments are described in more detail below.
Throughout the description of embodiments, a user system and a remote server system are referred to. The user system may be any type of computer system that is used by a user.
For example, the user system may be any of a user's laptop computer, a user's mobile device (such as a tablet or smart phone) or a local server system of a company. The remote server system is a remote storage system. The remote server system may be any type of computer system and, in particular, may be a server system with a large data storage capacity. For example, the remote server system may be a cloud based storage system. The remote server system is located remotely from the user system.
Embodiments describe the compression and decompression of data files. The data files may be any type of data, such as image data, video data, voice data or other types of data.
Embodiments may advantageously combine three techniques. The first technique is to use lossless data compression to store data in a remote server system. This saves cost as the amount of data stored is reduced. The second technique is to move the computing effort expended on compressing data from the remote server system to the user system. This reduces the amount of data transferred between the user system and the remote server system, saving bandwidth for both. This may also reduce the time the user system needs to wait when uploading data. In addition, it reduces the size and cost of the remote server system, as no, or fewer, computing resources are required to compress data.
The third technique is to move the decompression of the data out of the remote server system when the data is requested by the user system. This further reduces the size and cost requirements of the remote server system, as no, or fewer, computing resources are required at the remote server system to decompress the data for user access. In addition, it reduces the amount of data transferred from the remote server system to the user system, saving bandwidth for both.
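The combination of the three techniques can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `RemoteStore` is a hypothetical in-memory stand-in for the remote server system, and `lzma` from the Python standard library is an arbitrary choice of lossless compressor.

```python
import lzma
from typing import Dict


class RemoteStore:
    """Hypothetical stand-in for the remote server system.

    It stores and returns bytes verbatim, so it spends no compute on
    compression or decompression.
    """

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, name: str, blob: bytes) -> None:
        self._objects[name] = blob  # stored as received

    def get(self, name: str) -> bytes:
        return self._objects[name]  # returned as stored

    def stored_size(self, name: str) -> int:
        return len(self._objects[name])


def upload(store: RemoteStore, name: str, data: bytes) -> None:
    """User-system side: compress locally, then transmit."""
    store.put(name, lzma.compress(data))


def download(store: RemoteStore, name: str) -> bytes:
    """User-system side: fetch compressed bytes, then decompress locally."""
    return lzma.decompress(store.get(name))
```

In this sketch only compressed bytes cross the boundary between the user-side functions and the store, in both directions.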
The above techniques of embodiments may enable a remote server system provider, and/or a third party, to seamlessly enable compression in the remote server system substantially without increasing the computing resources required at the remote server system. This may provide significant cost gains for remote server system providers, such as cloud based storage system providers. In addition, it enables third party providers to create an upload/download service for users, enabling a reduction in transfer time and storage costs for the user, while being scalable and low cost for the provider. The techniques according to embodiments may comprise performing, or emulating, parts of the remote server system tasks on a user system.
An implementation of an embodiment may be performed inside a browser without using any plugins. A user, of a user system, may go to a third party web page with a downloadable WebAssembly (WASM) version of a compressor. The user may then log into their cloud storage platform (i.e. the remote server system), and provide it with an access token to the third party web page with WASM code. Then the user may select one or more files to upload and upload the selected file(s). The WASM code may then perform the compression on the uploaded file(s) and send the compressed file(s) directly to the cloud storage platform. Later, when a user wants to access the same file(s), the same, or similar, third party web page provides a WASM version of the decompressor, downloads the file(s) from the cloud storage platform, performs decompression, and provides the file(s) to the user system for the user to see and/or store.
Alternatively, the user system may download a WebAssembly (WASM) version of a compressor from a third party web page. The user system may use the downloaded compressor to compress one or more files. The user system may then transmit the compressed one or more files to the cloud storage platform (i.e. the remote server system).
Embodiments may provide a forwarded compression provider. The forwarded compression provider may be, for example, the provider of the remote server system. The forwarded compression provider may alternatively be, for example, a third party that creates and/or handles the compression/decompression tool that is sent to the user.
Embodiments may provide a forwarded compression tool. This is a tool that is sent to a user system wanting to store data at a third party location. The tool may perform in-line compression and may add metadata, such as tags required by the storage provider, original data checksum(s), thumbnails, and/or decompression information of the data inside the user system on behalf of the forwarded compression provider. The compressed data is then stored at the third party location for later retrieval by the same, or other, users. When a user wants to access the data, the tool may retrieve the compressed data along with any of the metadata, tags provided by the storage provider, original data checksums, and/or decompression information and perform in-line decompression. The decompressed data is then presented to the user in the same manner as if no compression was used.
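One possible shape for such a tool is sketched below. The field names (`tags`, `original_sha256`, `decompression`) and the use of `zlib` and SHA-256 are assumptions made for illustration; the patent does not specify a metadata layout, checksum algorithm or compressor.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass
class StoredObject:
    payload: bytes   # compressed bytes kept at the storage location
    metadata: dict   # travels with the payload


def pack(data: bytes, tags: dict) -> StoredObject:
    """Compress in-line on the user system and attach metadata."""
    metadata = {
        "tags": tags,                                    # e.g. tags required by the provider
        "original_sha256": hashlib.sha256(data).hexdigest(),
        "original_size": len(data),
        "decompression": {"algorithm": "zlib"},          # decompression information
    }
    return StoredObject(payload=zlib.compress(data), metadata=metadata)


def unpack(obj: StoredObject) -> bytes:
    """Decompress in-line and check the result against the stored checksum."""
    if obj.metadata["decompression"]["algorithm"] != "zlib":
        raise ValueError("unsupported decompression algorithm")
    data = zlib.decompress(obj.payload)
    if hashlib.sha256(data).hexdigest() != obj.metadata["original_sha256"]:
        raise ValueError("checksum mismatch after decompression")
    return data
```

Because `unpack` returns the original bytes, the caller sees the same data it would have seen had no compression been used.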
The forwarded compression tool may be provided in several ways. First, it may be provided as a browser web page that the user accesses as a gateway in order to access data. Second, it may be provided via the use of an App, as is appropriate for use on mobile devices (e.g. tablets or smart phones). Third, it may be provided as a normal program or service running on a computer. The forwarded compression/decompression functionality may be included in the existing tools that the remote server system provider presents to their users. The forwarded compression/decompression functionality may be included in existing tools for upload/download/web acceleration.
Embodiments may provide metadata. In some cases, and for some file (or object) types, extra metadata and preview information might be required, or preferably received, by the storage provider (i.e. remote server system). For images, this may be smaller (thumbnail) version(s) of the images and properties like size, category, geolocation, AI analysis, or image classification. In this case, this data can be extracted, and/or created, by the forwarded compression tool. If the forwarded compression tool is provided by a third party, embodiments include this information being appended to the start of the compressed file so that the storage provider sees a file with a normal header. Many storage providers provide a low-level API that allows users, and third party tools, control beyond storing and retrieving files. Embodiments also include metadata information being provided to the storage provider by usage of these API functions.
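Placing metadata at the start of the compressed file could look like the sketch below. The container layout (a 4-byte marker, a 4-byte big-endian header length, a JSON header, then the compressed payload) and the marker value `FCT1` are hypothetical; a real embodiment would instead use whatever header format the storage provider expects.

```python
import json
import struct
import zlib

MAGIC = b"FCT1"  # hypothetical marker, not defined by the patent


def build_container(data: bytes, metadata: dict) -> bytes:
    """Prefix the compressed payload with a length-delimited JSON header."""
    header = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return MAGIC + struct.pack(">I", len(header)) + header + zlib.compress(data)


def _header_length(container: bytes) -> int:
    if container[:4] != MAGIC:
        raise ValueError("not a recognised container")
    (length,) = struct.unpack(">I", container[4:8])
    return length


def read_metadata(container: bytes) -> dict:
    """Read the header without touching (or decompressing) the payload."""
    length = _header_length(container)
    return json.loads(container[8:8 + length].decode("utf-8"))


def read_data(container: bytes) -> bytes:
    """Skip the header and decompress the payload."""
    length = _header_length(container)
    return zlib.decompress(container[8 + length:])
```

A design point worth noting: because the header is length-delimited, a storage provider could read properties such as image size or category without any decompression work.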
Implementations of embodiments may also include the below described techniques.
For some compression algorithms, such as image recompression, where the image is already compressed, extra safety measures such as decompression and verification may be used. This process may take place at the user system, optionally after transmission of the compressed version of the file(s). Embodiments also include the verification process being offloaded to the remote server system, and checksum(s) generated at the user side being compared with the checksum(s) at the server side. If the verification process fails, uncompressed versions of the file(s) may be transmitted to the remote server system.
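A user-side version of this verify-then-fall-back step might be sketched as follows. SHA-256 and `zlib` are illustrative assumptions, and `choose_payload` is a hypothetical helper name; the compressor and decompressor are passed in so that a recompression codec could be substituted.

```python
import hashlib
import zlib
from typing import Callable, Tuple


def choose_payload(
    data: bytes,
    compress: Callable[[bytes], bytes] = zlib.compress,
    decompress: Callable[[bytes], bytes] = zlib.decompress,
) -> Tuple[bytes, bool]:
    """Return (payload, is_compressed).

    The compressed candidate is decompressed again and its checksum is
    compared with the checksum of the original. If the round trip fails,
    the uncompressed original is returned for transmission instead.
    """
    expected = hashlib.sha256(data).digest()
    candidate = compress(data)
    try:
        restored = decompress(candidate)
    except Exception:
        return data, False
    if hashlib.sha256(restored).digest() != expected:
        return data, False
    return candidate, True
```

If verification is offloaded to the server instead, the same comparison would run there against a checksum sent by the user system.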
In some scenarios, the user system may lack the computing resources to efficiently fully compress all of the data. In this case, embodiments include the user system transmitting both compressed data and uncompressed data to the remote server system. Both the compressed and uncompressed data may be stored directly at the remote server system, giving a reduced overall compression ratio. Alternatively, the uncompressed data may be compressed at the remote server system, if there are computing resources available.
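The mixed upload can be sketched with a simple budget. Here `compress_budget_bytes` is a hypothetical proxy for the computing resources available on the user system (the patent does not say how that limit is measured), and `zlib` is an illustrative compressor.

```python
import zlib
from typing import Dict, List, Tuple

# (file name, payload to transmit, whether the payload is compressed)
BatchItem = Tuple[str, bytes, bool]


def prepare_batch(files: Dict[str, bytes], compress_budget_bytes: int) -> List[BatchItem]:
    """Compress files while the budget lasts; send the rest uncompressed."""
    batch: List[BatchItem] = []
    remaining = compress_budget_bytes
    for name, data in files.items():
        if len(data) <= remaining:
            batch.append((name, zlib.compress(data), True))
            remaining -= len(data)
        else:
            batch.append((name, data, False))
    return batch


def overall_ratio(files: Dict[str, bytes], batch: List[BatchItem]) -> float:
    """Original bytes divided by transmitted bytes for the whole batch."""
    original = sum(len(data) for data in files.values())
    sent = sum(len(payload) for _, payload, _ in batch)
    return original / sent if sent else 1.0
```

The per-item flag lets the remote server system know which items it may still compress itself if it has spare computing resources.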
In some scenarios, the user system may lack the capability to efficiently perform the required data decompression. Embodiments include the remote server system performing some, or all, of the decompression on behalf of the user system. Embodiments also include the decompression being performed by a separate third party service provider, i.e. on separate servers from those of the remote service provider or the user system.
In some scenarios, the data transfer may be performed on the original uncompressed file(s) first. Then, at a later time, the same file(s) may be compressed by the user's system and transmitted to the remote server provider. The uncompressed file(s) may then be replaced by the compressed version(s) of the file(s). While this approach increases the amount of transferred data, it still reduces the compression requirements at the server.
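A minimal sketch of this deferred replacement is shown below, assuming a hypothetical in-memory `Store` that records whether each object is compressed and `zlib` as an illustrative compressor.

```python
import zlib
from typing import Dict, Tuple


class Store:
    """Hypothetical remote store mapping a name to (blob, is_compressed)."""

    def __init__(self) -> None:
        self.objects: Dict[str, Tuple[bytes, bool]] = {}

    def put(self, name: str, blob: bytes, is_compressed: bool) -> None:
        # A second put under the same name replaces the earlier version.
        self.objects[name] = (blob, is_compressed)


def upload_now(store: Store, name: str, data: bytes) -> None:
    """First pass: transfer the original, uncompressed file."""
    store.put(name, data, False)


def compress_later(store: Store, name: str, local_copy: bytes) -> None:
    """Later pass: compress on the user system and replace the stored copy."""
    store.put(name, zlib.compress(local_copy), True)


def read(store: Store, name: str) -> bytes:
    """User-side read that works for either stored form."""
    blob, is_compressed = store.objects[name]
    return zlib.decompress(blob) if is_compressed else blob
```

The file is readable at every stage, and the server never performs the compression itself.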
The pricing model of the storage provider (i.e. provider of the remote server system), and/or any involved third party, may vary based on the degree that a user performs compression, or decompression, on their user system. A user may opt to compress none, parts or all of the data when uploading data. If the user agrees to compress a larger amount of their data a different (typically lower) price may be offered, representing the reduced storage costs, lower data transfer cost, and/or server compute cost. A user may opt to decompress none, parts or all the data when downloading data. If the user agrees to decompress a larger amount of their data a different (typically lower) price may be offered, representing the lower data transfer cost, and/or server compute cost.
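The dependence of price on the user's share of the compression work can be made concrete with a toy cost model. Everything in it is an assumption for illustration: the rates are placeholders, a single compression ratio is assumed for all data, and data the user does not compress is assumed to be compressed by the server after upload. The patent does not define any pricing formula.

```python
def estimated_cost(
    total_bytes: float,
    user_fraction: float,   # share of the data compressed on the user system, 0..1
    ratio: float,           # assumed compression ratio (original / compressed)
    storage_rate: float,    # hypothetical cost per stored byte
    transfer_rate: float,   # hypothetical cost per transferred byte
    compute_rate: float,    # hypothetical cost per byte compressed by the server
) -> float:
    """Toy upload cost: storage + transfer + server-side compression compute."""
    if not 0.0 <= user_fraction <= 1.0:
        raise ValueError("user_fraction must be between 0 and 1")
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    by_user = user_fraction * total_bytes
    by_server = total_bytes - by_user
    stored = total_bytes / ratio              # everything ends up compressed
    transferred = by_user / ratio + by_server  # server's share travels uncompressed
    return storage_rate * stored + transfer_rate * transferred + compute_rate * by_server
```

Under these assumptions the cost falls as `user_fraction` rises, which is the basis on which a lower price might be offered.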
Figures 1 to 5 show various implementations of embodiments.
Figure 1 shows operations that may be performed by, and/or modules that may be comprised by, a forwarded compression tool operating on a user system according to an embodiment.
Figure 2 shows operations that may be performed by, and/or modules that may be comprised by, a forwarded compression tool operating on a user system when retrieving and decompressing data according to an embodiment.
Figure 3 shows an example of an interaction flow when a third party provides a forwarded compression tool in a web browser according to an embodiment.
Figure 4 shows operations that may be performed by, and/or modules that may be comprised by, a forwarded compression tool running on a user system, where decompression verification is provided by a storage provider according to an embodiment.
Figure 5 shows operations that may be performed by, and/or modules that may be comprised by, a storage provider that uses a forwarded compression tool according to an embodiment.
The flow charts and descriptions thereof herein should not be understood to prescribe a fixed order of performing the method steps described therein. Rather, the method steps may be performed in any order that is practicable. Although the present invention has been described in connection with specific exemplary embodiments, it should be understood that various changes, substitutions, and alterations apparent to those skilled in the art can be made to the disclosed embodiments without departing from the spirit and scope of the invention as set forth in the appended claims.
Methods and processes described herein can be embodied as code (e.g., software code) and/or data. Such code and data can be stored on one or more computer-readable media, which may include any device or medium that can store code and/or data for use by a computer system. When a computer system reads and executes the code and/or data stored on a computer-readable medium, the computer system performs the methods and processes embodied as data structures and code stored within the computer-readable storage medium. In certain embodiments, one or more of the steps of the methods and processes described herein can be performed by a processor (e.g., a processor of a computer system or data storage system). It should be appreciated by those skilled in the art that computer-readable media include removable and non-removable structures/devices that can be used for storage of information, such as computer-readable instructions, data structures, program modules, and other data used by a computing system/environment. A computer-readable medium includes, but is not limited to, volatile memory such as random access memories (RAM, DRAM, SRAM); and non-volatile memory such as flash memory, various read-only-memories (ROM, PROM, EPROM, EEPROM), magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM), phase-change memory and magnetic and optical storage devices (hard drives, magnetic tape, CDs, DVDs); network devices; or other media now known or later developed that is capable of storing computer-readable information/data. Computer-readable media should not be construed or interpreted to include any propagating signals.
Claims (1)
- Claims:
1. A computer-implemented method of transferring data from a user system to a remote data storage system, the method comprising: selecting, at a user system, one or more data files for remote storage; compressing the selected data files at the user system; transmitting the compressed data files to a remote server system, wherein the remote server system is a remote data storage system; and storing the compressed data files in the remote server system.
2. The method according to claim 1, wherein the remote server system stores the received compressed data files substantially without further compressing the received compressed data files.
3. The method according to claim 1 or 2, wherein the remote server system is a cloud based storage system.
4. The method according to any preceding claim, wherein the user system is a mobile device, such as a tablet or smart phone.
5. The method according to any preceding claim, wherein the selected data files at the user system comprise any data, such as PDF data, text data, binary data, image data, video data and/or voice data.
6. The method according to any preceding claim, wherein transmitting the compressed data files from the user system to a remote server system comprises: receiving, by the user system, a compression tool from a third party system; using the received compression tool to compress the selected data files at the user system; and transmitting, by the user system, the compressed data files to the remote server system.
7. The method according to claim 6, wherein the third party system is a webpage.
8. The method according to any preceding claim, further comprising obtaining metadata and/or preview data of the one or more data files for remote storage at the user system; and transmitting the obtained metadata and/or preview data from the user system to the remote server system.
9. The method according to any preceding claim, wherein the user system is a desktop computer.
10. A computer-implemented method of transferring data from a remote data storage system to a user system, the method comprising: retrieving from storage, by a remote server system, compressed data files, wherein the remote server system is a remote data storage system; transmitting, by the remote server system, the compressed data files to a user system substantially without decompressing the retrieved compressed data files; and decompressing the received compressed data files at the user system.
11. A computer-implemented method of transferring data between a remote data storage system and a user system, the method comprising: a method of transferring data from the user system to the remote data storage system according to any of claims 1 to 9; and a method of transferring data from the remote data storage system to the user system according to claim 10.
12. A user system and remote server system configured to perform the method according to any preceding claim.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2019022.9A GB2602961A (en) | 2020-12-02 | 2020-12-02 | Data compression and storage techniques |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB2019022.9A GB2602961A (en) | 2020-12-02 | 2020-12-02 | Data compression and storage techniques |
Publications (2)
Publication Number | Publication Date |
---|---|
GB202019022D0 GB202019022D0 (en) | 2021-01-13 |
GB2602961A true GB2602961A (en) | 2022-07-27 |
Family
ID=74099815
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB2019022.9A Pending GB2602961A (en) | 2020-12-02 | 2020-12-02 | Data compression and storage techniques |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2602961A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10693660B2 (en) * | 2017-01-05 | 2020-06-23 | Serge Vilvovsky | Method and system for secure data storage exchange, processing, and access |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10693660B2 (en) * | 2017-01-05 | 2020-06-23 | Serge Vilvovsky | Method and system for secure data storage exchange, processing, and access |
Non-Patent Citations (3)
Title |
---|
ANONYMOUS: "24/USING-GZIP-COMPRESSION-ASP-NET- CORE/) BY WADE (HTTPS24/USING-GZIP-COMPRESSION-ASP-NET-CORE/#RESPOND) Enabling At The Code Level (Dynamic)", NET CORE JANUARY, 24 January 2017 (2017-01-24), pages 1 - 6, XP055907341, Retrieved from the Internet <URL:https://dotnetcoretutorials.com/> [retrieved on 20220330] * |
ANONYMOUS: "cp -Copy files and objects | Cloud Storage | Google Cloud The Wayback Machine -https", 20 July 2020 (2020-07-20), pages 1 - 11, XP055907440, Retrieved from the Internet <URL:https://web.archive.org/web/20200720090249/https://cloud.google.com/storage/docs/gsutil/commands/cp> [retrieved on 20220331] * |
ANONYMOUS: "Images API for Python 2 Overview The Wayback Machine -https Images API for Python 2 Overview", 18 March 2020 (2020-03-18), pages 1 - 9, XP055907447, Retrieved from the Internet <URL:https://web.archive.org/web/20200720090249/https://cloud.google.com/storage/docs/gsutil/commands/cp> [retrieved on 20220331] * |
Also Published As
Publication number | Publication date |
---|---|
GB202019022D0 (en) | 2021-01-13 |