GB2602961A - Data compression and storage techniques - Google Patents

Info

Publication number
GB2602961A
Authority
GB
United Kingdom
Prior art keywords
data
remote
user
remote server
user system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2019022.9A
Other versions
GB202019022D0 (en)
Inventor
Erlend Jensen Rune
Berg Stene Sindre
Babington Kjetil
Simonsen Per
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Memoscale As
Original Assignee
Memoscale As
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Memoscale As
Priority to GB2019022.9A
Publication of GB202019022D0
Publication of GB2602961A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Disclosed herein is a computer-implemented method of transferring data from a user system to a remote data storage system, the method comprising: selecting, at a user system, one or more data files for remote storage; compressing the selected data files at the user system; transmitting the compressed data files to a remote server system, wherein the remote server system is a remote data storage system; and storing the compressed data files in the remote server system.

Description

Data compression and storage techniques
Field
The field of the invention is data compression and storage. Data in a user system may be transferred to a cloud based storage system. This may be, for example, to provide back-up storage of the data. Embodiments provide techniques for reducing the transmission bandwidth, as well as for reducing the storage and processing requirements in the cloud based storage system.
Background
There are many applications in which data in a user system may be transferred to a cloud based storage system. This may be, for example, to provide back-up storage of the data, or the cloud based storage system may be the only long term store of the data. There is a general need to improve on known techniques for providing remote storage of data.
Summary
Aspects of the invention are set out in the appended independent claims.
List of figures
Figure 1 shows a forwarded compression tool operating on a user system according to an embodiment.
Figure 2 shows a forwarded compression tool operating on a user system when retrieving and decompressing data according to an embodiment.
Figure 3 shows an interaction flow when a third party provides a forwarded compression tool in a web browser according to an embodiment.
Figure 4 shows a forwarded compression tool running on a user system, where decompression verification is provided by a storage provider according to an embodiment.
Figure 5 shows a storage provider that uses a forwarded compression tool according to an embodiment.
Description
A problem with known techniques for transferring data between a user system and a remote storage system, such as a cloud based storage system, is that a large transmission bandwidth may be required. A problem with storing data in such a remote storage system is that either the storage requirements are large, because the data is uncompressed, or the processing requirements are large, due to the remote storage system needing to compress received data.
Embodiments may solve one or more of the above problems with known techniques by compressing data at a user system. This reduces the amount of data that needs to be transmitted and allows compressed data to be stored in the remote storage system without compression being applied at the remote storage system.
Embodiments are described in more detail below.
Throughout the description of embodiments, a user system and a remote server system are referred to. The user system may be any type of computer system that is used by a user.
For example, the user system may be any of a user's laptop computer, a user's mobile device (such as a tablet or smart phone) or a local server system of a company. The remote server system is a remote storage system. The remote server system may be any type of computer system and, in particular, may be a server system with a large data storage capacity. For example, the remote server system may be a cloud based storage system. The remote server system is located remotely from the user system.
Embodiments describe the compression and decompression of data files. The data files may be any type of data, such as image data, video data, voice data or other types of data.
Embodiments may advantageously combine three techniques. The first technique is to use lossless data compression to store data in a remote server system. This saves cost as the amount of data stored is reduced. The second technique is to move the computing effort expended on compressing data from the remote server system to the user system. This reduces the amount of data transferred between the user system and the remote server system, saving bandwidth for both. This may also reduce the time the user system needs to wait when uploading data. In addition, it reduces the size and cost of the remote server system, as no, or less, computing resources are required to compress data.
The third technique is to move the decompression of the data out of the remote server system when the data is requested by the user system. This further reduces the size and cost requirements of the remote server system, as no, or less, computing resources are required at the remote server system to decompress the data for user access. In addition, it reduces the amount of data transferred from the remote server system to the user system, saving bandwidth for both.
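The three techniques can be illustrated with a minimal Python sketch. It is not the patented implementation: `RemoteStore`, `upload` and `download` are hypothetical names, and zlib is used only as a readily available lossless compressor. The stand-in server stores and returns bytes unchanged, so all compression and decompression effort stays on the user system.

```python
import zlib


class RemoteStore:
    """Hypothetical stand-in for the remote server system: stores bytes as received."""

    def __init__(self):
        self.objects = {}
        self.bytes_received = 0
        self.bytes_sent = 0

    def put(self, name, blob):
        # No compression here: the server keeps exactly what the user system sent.
        self.objects[name] = blob
        self.bytes_received += len(blob)

    def get(self, name):
        # No decompression here either: the stored bytes go back unchanged.
        blob = self.objects[name]
        self.bytes_sent += len(blob)
        return blob


def upload(store, name, data):
    """Compress losslessly on the user system, then transmit."""
    store.put(name, zlib.compress(data, 9))


def download(store, name):
    """Fetch the compressed bytes and decompress them on the user system."""
    return zlib.decompress(store.get(name))


store = RemoteStore()
original = b"sensor reading: 20.5 C\n" * 1000
upload(store, "log.txt", original)
restored = download(store, "log.txt")
print(len(original), store.bytes_received, restored == original)
```

The byte counters make the bandwidth effect visible: for compressible input, both the upload and the download carry fewer bytes than the original file.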
The above techniques of embodiments may enable a remote server system provider, and/or a third party, to seamlessly enable compression in a remote server system substantially without increasing the computing resources required at the remote server system. This may provide significant cost gains for remote server system providers, such as cloud based storage system providers. In addition, it enables third party providers to create an upload/download service for users, enabling a reduction in transfer time and storage costs for the user, while being scalable and low cost for the provider. The techniques according to embodiments may comprise performing, or emulating, parts of the remote server system tasks on a user system.
An implementation of an embodiment may be performed inside a browser without using any plugins. A user, of a user system, may go to a third party web page with a downloadable WebAssembly (WASM) version of a compressor. The user may then log into their cloud storage platform (i.e. the remote server system), and provide it with an access token to the third party web page with WASM code. Then the user may select one or more files to upload and upload the selected file(s). The WASM code may then perform the compression on the uploaded file(s) and send the compressed file(s) directly to the cloud storage platform. Later, when a user wants to access the same file(s), the same, or similar, third party web page provides a WASM version of the decompressor, downloads the file(s) from the cloud storage platform, performs decompression, and provides the file(s) to the user system for the user to see and/or store.
Alternatively, the user system may download a WebAssembly (WASM) version of a compressor from a third party web page. The user system may use the downloaded compressor to compress one or more files. The user system may then transmit the compressed one or more files to the cloud storage platform (i.e. the remote server system).
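The token-based flow described above can be sketched as follows. This is a simplified Python model rather than WASM running in a browser, and it assumes one reading of the flow, namely that the third party tool holds an access token issued by the storage platform. `CloudStorage`, `ThirdPartyTool` and the placeholder login check are all hypothetical.

```python
import secrets
import zlib


class CloudStorage:
    """Hypothetical cloud storage platform that accepts requests only with a valid token."""

    def __init__(self):
        self._tokens = set()
        self._files = {}

    def login(self, user, password):
        # Placeholder check only; a real platform would authenticate properly.
        if not password:
            raise PermissionError("login failed")
        token = secrets.token_hex(16)
        self._tokens.add(token)
        return token

    def put(self, token, name, blob):
        if token not in self._tokens:
            raise PermissionError("invalid access token")
        self._files[name] = blob

    def get(self, token, name):
        if token not in self._tokens:
            raise PermissionError("invalid access token")
        return self._files[name]


class ThirdPartyTool:
    """Stands in for the compressor/decompressor code served by the third party page."""

    def __init__(self, storage, token):
        self.storage = storage
        self.token = token

    def upload(self, name, data):
        # Compression runs on the user system; only compressed bytes reach the platform.
        self.storage.put(self.token, name, zlib.compress(data))

    def download(self, name):
        return zlib.decompress(self.storage.get(self.token, name))


cloud = CloudStorage()
token = cloud.login("alice", "example-password")
tool = ThirdPartyTool(cloud, token)
tool.upload("notes.txt", b"hello " * 200)
print(tool.download("notes.txt") == b"hello " * 200)
```

In this model the third party never stores the file itself: it only supplies the code that runs on the user system and talks to the storage platform directly.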
Embodiments may provide a forwarded compression provider. The forwarded compression provider may be, for example, the provider of the remote server system. The forwarded compression provider may alternatively be, for example, a third party that creates and/or handles the compression/decompression tool that is sent to the user.
Embodiments may provide a forwarded compression tool. This is a tool that is sent to a user system wanting to store data at a third party location. The tool may perform in-line compression and may add metadata, such as tags required by the storage provider, original data checksum(s), thumbnails, and/or decompression information of the data inside the user system on behalf of the forwarded compression provider. The compressed data is then stored at the third party location for later retrieval by the same, or other, users. When a user wants to access the data, the tool may retrieve the compressed data along with any of the metadata, tags provided by the storage provider, original data checksums, and/or decompression information and perform in-line decompression. The decompressed data is then presented to the user in the same manner as if no compression was used.
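As a rough illustration of the metadata handled by such a tool, the Python sketch below builds a metadata record next to the compressed data and uses it again on retrieval. The field names (`tags`, `original_sha256`, `decompression`) and the choice of SHA-256 and zlib are assumptions made for this example; the source does not specify them.

```python
import hashlib
import zlib


def pack(data, tags=None):
    """Compress data and build the metadata the tool sends alongside it."""
    compressed = zlib.compress(data)
    metadata = {
        "tags": list(tags or []),
        "original_sha256": hashlib.sha256(data).hexdigest(),
        "original_size": len(data),
        "decompression": {"algorithm": "zlib"},
    }
    return compressed, metadata


def unpack(compressed, metadata):
    """Decompress using the stored decompression information and check the checksum."""
    if metadata["decompression"]["algorithm"] != "zlib":
        raise ValueError("unsupported algorithm")
    data = zlib.decompress(compressed)
    if hashlib.sha256(data).hexdigest() != metadata["original_sha256"]:
        raise ValueError("checksum mismatch")
    return data


blob, meta = pack(b"quarterly report " * 300, tags=["finance"])
print(meta["original_size"], len(blob), unpack(blob, meta) == b"quarterly report " * 300)
```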
The forwarded compression tool may be provided in several ways. First, it may be provided as a browser web page that the user accesses as a gateway in order to access data. Second, it may be provided via an App, as is appropriate for use on mobile devices (e.g. tablets or smart phones). Third, it may be provided as a normal program or service running on a computer. The forwarded compression/decompression functionality may be included in the existing tools that the remote server system provider presents to its users. The forwarded compression/decompression functionality may be included in existing tools for upload/download/web acceleration.
Embodiments may provide metadata. In some cases, and for some file (or object) types, extra metadata and preview information might be required, or preferably received, by the storage provider (i.e. remote server system). For images, this may be smaller (thumbnail) version(s) of the images and properties like size, category, geolocation, AI analysis, or image classification. In this case, this data can be extracted, and/or created, by the forwarded compression tool. If the forwarded compression tool is provided by a third party, embodiments include this information being appended to the start of the compressed file so that the storage provider sees a file with a normal header. Many storage providers provide a low-level API that allows users, and third party tools, control beyond storing and retrieving files. Embodiments also include metadata information being provided to the storage provider by usage of these API functions.
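One way to place metadata at the start of a compressed file is a length-prefixed header, sketched below in Python. The layout, the `MAGIC` marker and the function names are hypothetical and chosen only for illustration; the source does not define a file format.

```python
import json
import struct
import zlib

MAGIC = b"FCT1"  # hypothetical 4-byte marker, not defined by the source


def build_file(data, metadata):
    """Prepend a metadata header to the compressed payload."""
    header = json.dumps(metadata, sort_keys=True).encode("utf-8")
    # Layout: magic | 4-byte big-endian header length | header | compressed payload
    return MAGIC + struct.pack(">I", len(header)) + header + zlib.compress(data)


def read_header(blob):
    """Read only the metadata, without touching the compressed payload."""
    if blob[:4] != MAGIC:
        raise ValueError("unknown file format")
    (length,) = struct.unpack(">I", blob[4:8])
    return json.loads(blob[8:8 + length].decode("utf-8")), 8 + length


def read_payload(blob):
    """Skip the header and decompress the payload."""
    _, offset = read_header(blob)
    return zlib.decompress(blob[offset:])


blob = build_file(b"pixels" * 500, {"type": "image", "width": 64, "height": 64})
metadata, _ = read_header(blob)
print(metadata, read_payload(blob) == b"pixels" * 500)
```

With a layout like this, a storage provider could read the properties it needs from the header without having to decompress anything.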
Implementations of embodiments may also include the below described techniques.
For some compression algorithms, such as image recompression, where the image is already compressed, extra safety measures such as decompression and verification may be used. This process may take place at the user system, optionally after transmission of the compressed version of the file(s). Embodiments also include the verification process being offloaded to the remote server system, and checksum(s) generated at the user side being compared with the checksum(s) at the server side. If the verification process fails, uncompressed versions of the file(s) may be transmitted to the remote server system.
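The verify-then-fall-back behaviour can be sketched in Python as below. `prepare_upload` and `server_verify` are hypothetical helpers, and SHA-256 with zlib stands in for whatever checksum and compressor an embodiment actually uses.

```python
import hashlib
import zlib


def prepare_upload(data, compress=zlib.compress, decompress=zlib.decompress):
    """Return (payload, is_compressed, checksum); fall back to raw data if verification fails."""
    checksum = hashlib.sha256(data).hexdigest()
    candidate = compress(data)
    try:
        # Decompress again on the user system and compare against the original checksum.
        ok = hashlib.sha256(decompress(candidate)).hexdigest() == checksum
    except Exception:
        ok = False
    if ok:
        return candidate, True, checksum
    return data, False, checksum


def server_verify(payload, client_checksum):
    """Offloaded variant: the server decompresses and compares with the client's checksum."""
    return hashlib.sha256(zlib.decompress(payload)).hexdigest() == client_checksum


data = b"already compressed image bytes " * 100
payload, is_compressed, checksum = prepare_upload(data)
print(is_compressed, server_verify(payload, checksum))

# A deliberately faulty compressor that drops the last byte triggers the fallback.
faulty = prepare_upload(data, compress=lambda d: zlib.compress(d[:-1]))
print(faulty[1], faulty[0] == data)
```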
In some scenarios, the user system may lack the computing resources to efficiently fully compress all of the data. In this case, embodiments include the user system transmitting both compressed data and uncompressed data to the remote server system. Both the compressed and uncompressed data may be stored directly at the remote server system, giving a reduced overall compression ratio. Alternatively, the uncompressed data may be compressed at the remote server system, if there are computing resources available.
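A minimal Python sketch of this mixed upload follows. The compute limit is modelled as a budget of input bytes the user system is willing to compress, which is an assumption made for illustration; `plan_upload` and `server_store` are hypothetical names.

```python
import zlib


def plan_upload(files, budget_bytes):
    """Compress files until the input-byte budget is spent; send the rest uncompressed."""
    batch = []
    remaining = budget_bytes
    for name, data in files.items():
        if len(data) <= remaining:
            batch.append((name, zlib.compress(data), True))
            remaining -= len(data)
        else:
            batch.append((name, data, False))
    return batch


def server_store(batch, compress_on_server=False):
    """Store payloads as received, or compress the raw ones if the server has spare capacity."""
    stored = {}
    for name, payload, is_compressed in batch:
        if not is_compressed and compress_on_server:
            payload, is_compressed = zlib.compress(payload), True
        stored[name] = (payload, is_compressed)
    return stored


files = {"a.txt": b"a" * 1000, "b.bin": b"b" * 5000, "c.txt": b"c" * 500}
batch = plan_upload(files, budget_bytes=2000)
print([(name, flag) for name, _, flag in batch])
stored = server_store(batch, compress_on_server=True)
print(all(flag for _, flag in stored.values()))
```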
In some scenarios, the user system may lack the capability to efficiently perform the required data decompression. Embodiments include the remote server system performing some, or all, of the decompression on behalf of the user system. Embodiments also include the decompression being performed by a separate third party service provider, i.e. on separate servers from those of the remote service provider or the user system.
In some scenarios, the data transfer may be performed on the original uncompressed file(s) first. Then, at a later time, the same file(s) may be compressed by the user's system and transmitted to the remote server provider. The uncompressed file(s) may then be replaced by the compressed version(s) of the file(s). While this approach increases the amount of transferred data, it still reduces the compression requirements at the server.
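The deferred approach can be sketched in Python as follows, assuming a hypothetical `DeferredStore` that simply overwrites an object when a compressed version arrives later.

```python
import zlib


class DeferredStore:
    """Hypothetical remote store that tracks whether each object is compressed."""

    def __init__(self):
        self.objects = {}  # name -> (payload, is_compressed)
        self.bytes_received = 0

    def put(self, name, payload, is_compressed):
        # A later put for the same name replaces the earlier version.
        self.objects[name] = (payload, is_compressed)
        self.bytes_received += len(payload)


def upload_now(store, name, data):
    """First transfer: send the original uncompressed file."""
    store.put(name, data, False)


def compress_later(store, name, data):
    """Later transfer: compress on the user system and replace the stored copy."""
    store.put(name, zlib.compress(data), True)


store = DeferredStore()
video = b"frame " * 2000
upload_now(store, "clip.raw", video)
compress_later(store, "clip.raw", video)
payload, is_compressed = store.objects["clip.raw"]
print(is_compressed, len(payload), store.bytes_received)
```

The `bytes_received` counter ends up larger than the original file size, reflecting the extra transfer, while the server still performs no compression itself.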
The pricing model of the storage provider (i.e. provider of the remote server system), and/or any involved third party, may vary based on the degree that a user performs compression, or decompression, on their user system. A user may opt to compress none, parts or all of the data when uploading data. If the user agrees to compress a larger amount of their data a different (typically lower) price may be offered, representing the reduced storage costs, lower data transfer cost, and/or server compute cost. A user may opt to decompress none, parts or all the data when downloading data. If the user agrees to decompress a larger amount of their data a different (typically lower) price may be offered, representing the lower data transfer cost, and/or server compute cost.
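A pricing rule of this kind could, for example, be a linear discount on a base rate. The Python sketch below is purely illustrative: the function name, the base rate and the maximum discount are hypothetical placeholders, not values from the source.

```python
def monthly_price(stored_gb, fraction_user_compressed, base_per_gb=1.0, max_discount=0.5):
    """Price falls linearly as the user compresses a larger share of their data.

    base_per_gb and max_discount are placeholder values for illustration only.
    """
    if not 0.0 <= fraction_user_compressed <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return stored_gb * base_per_gb * (1.0 - max_discount * fraction_user_compressed)


for fraction in (0.0, 0.5, 1.0):
    print(fraction, monthly_price(100, fraction))
```

The same shape of rule could be applied to downloads, with the fraction describing how much of the data the user agrees to decompress on their own system.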
Figures 1 to 5 show various implementations of embodiments.
Figure 1 shows operations that may be performed by, and/or modules that may be comprised by, a forwarded compression tool operating on a user system according to an embodiment.
Figure 2 shows operations that may be performed by, and/or modules that may be comprised by, a forwarded compression tool operating on a user system when retrieving and decompressing data according to an embodiment.
Figure 3 shows an example of an interaction flow when a third party provides a forwarded compression tool in a web browser according to an embodiment.
Figure 4 shows operations that may be performed by, and/or modules that may be comprised by, a forwarded compression tool running on a user system, where decompression verification is provided by a storage provider according to an embodiment.
Figure 5 shows operations that may be performed by, and/or modules that may be comprised by, a storage provider that uses a forwarded compression tool according to an embodiment.
The flow charts and descriptions thereof herein should not be understood to prescribe a fixed order of performing the method steps described therein. Rather, the method steps may be performed in any order that is practicable. Although the present invention has been described in connection with specific exemplary embodiments, it should be understood that various changes, substitutions, and alterations apparent to those skilled in the art can be made to the disclosed embodiments without departing from the spirit and scope of the invention as set forth in the appended claims.
Methods and processes described herein can be embodied as code (e.g., software code) and/or data. Such code and data can be stored on one or more computer-readable media, which may include any device or medium that can store code and/or data for use by a computer system. When a computer system reads and executes the code and/or data stored on a computer-readable medium, the computer system performs the methods and processes embodied as data structures and code stored within the computer-readable storage medium. In certain embodiments, one or more of the steps of the methods and processes described herein can be performed by a processor (e.g., a processor of a computer system or data storage system). It should be appreciated by those skilled in the art that computer-readable media include removable and non-removable structures/devices that can be used for storage of information, such as computer-readable instructions, data structures, program modules, and other data used by a computing system/environment. A computer-readable medium includes, but is not limited to, volatile memory such as random access memories (RAM, DRAM, SRAM); and non-volatile memory such as flash memory, various read-only-memories (ROM, PROM, EPROM, EEPROM), magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM), phase-change memory and magnetic and optical storage devices (hard drives, magnetic tape, CDs, DVDs); network devices; or other media now known or later developed that are capable of storing computer-readable information/data. Computer-readable media should not be construed or interpreted to include any propagating signals.

Claims (1)

  1. A computer-implemented method of transferring data from a user system to a remote data storage system, the method comprising: selecting, at a user system, one or more data files for remote storage; compressing the selected data files at the user system; transmitting the compressed data files to a remote server system, wherein the remote server system is a remote data storage system; and storing the compressed data files in the remote server system.
  2. The method according to claim 1, wherein the remote server system stores the received compressed data files substantially without further compressing the received compressed data files.
  3. The method according to claim 1 or 2, wherein the remote server system is a cloud based storage system.
  4. The method according to any preceding claim, wherein the user system is a mobile device, such as a tablet or smart phone.
  5. The method according to any preceding claim, wherein the selected data files at the user system comprise any data, such as PDF data, text data, binary data, image data, video data and/or voice data.
  6. The method according to any preceding claim, wherein transmitting the compressed data files from the user system to a remote server system comprises: receiving, by the user system, a compression tool from a third party system; using the received compression tool to compress the selected data files at the user system; and transmitting, by the user system, the compressed data files to the remote server system.
  7. The method according to claim 6, wherein the third party system is a webpage.
  8. The method according to any preceding claim, further comprising obtaining metadata and/or preview data of the one or more data files for remote storage at the user system; and transmitting the obtained metadata and/or preview data from the user system to the remote server system.
  9. The method according to any preceding claim, wherein the user system is a desktop computer.
  10. A computer-implemented method of transferring data from a remote data storage system to a user system, the method comprising: retrieving from storage, by a remote server system, compressed data files, wherein the remote server system is a remote data storage system; transmitting, by the remote server system, the compressed data files to a user system substantially without decompressing the retrieved compressed data files; and decompressing the received compressed data files at the user system.
  11. A computer-implemented method of transferring data between a remote data storage system and a user system, the method comprising: a method of transferring data from the user system to the remote data storage system according to any of claims 1 to 9; and a method of transferring data from the remote data storage system to the user system according to claim 10.
  12. A user system and remote server system configured to perform the method according to any preceding claim.
GB2019022.9A 2020-12-02 2020-12-02 Data compression and storage techniques Pending GB2602961A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB2019022.9A GB2602961A (en) 2020-12-02 2020-12-02 Data compression and storage techniques

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2019022.9A GB2602961A (en) 2020-12-02 2020-12-02 Data compression and storage techniques

Publications (2)

Publication Number Publication Date
GB202019022D0 GB202019022D0 (en) 2021-01-13
GB2602961A GB2602961A (en) 2022-07-27

Family

ID=74099815

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2019022.9A Pending GB2602961A (en) 2020-12-02 2020-12-02 Data compression and storage techniques

Country Status (1)

Country Link
GB (1) GB2602961A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10693660B2 (en) * 2017-01-05 2020-06-23 Serge Vilvovsky Method and system for secure data storage exchange, processing, and access

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "24/USING-GZIP-COMPRESSION-ASP-NET- CORE/) BY WADE (HTTPS24/USING-GZIP-COMPRESSION-ASP-NET-CORE/#RESPOND) Enabling At The Code Level (Dynamic)", NET CORE JANUARY, 24 January 2017 (2017-01-24), pages 1 - 6, XP055907341, Retrieved from the Internet <URL:https://dotnetcoretutorials.com/> [retrieved on 20220330] *
ANONYMOUS: "cp -Copy files and objects | Cloud Storage | Google Cloud The Wayback Machine -https", 20 July 2020 (2020-07-20), pages 1 - 11, XP055907440, Retrieved from the Internet <URL:https://web.archive.org/web/20200720090249/https://cloud.google.com/storage/docs/gsutil/commands/cp> [retrieved on 20220331] *
ANONYMOUS: "Images API for Python 2 Overview The Wayback Machine -https Images API for Python 2 Overview", 18 March 2020 (2020-03-18), pages 1 - 9, XP055907447, Retrieved from the Internet <URL:https://web.archive.org/web/20200720090249/https://cloud.google.com/storage/docs/gsutil/commands/cp> [retrieved on 20220331] *

Also Published As

Publication number Publication date
GB202019022D0 (en) 2021-01-13
