CN102982046B - Web page data compression and storage method and system - Google Patents
- Publication number
- CN102982046B (application CN201110264127.0A; also published as CN102982046A)
- Authority
- CN
- China
- Prior art keywords
- block
- web page
- compressed data
- storage
- compression
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a web page data compression and storage method and system. When a web page needs to be compressed, the web page is divided into two or more blocks. For each block, it is determined whether the corresponding compressed data has already been stored; if not, the block is compressed and the compressed data is stored; if so, the block is not compressed again. The scheme of the present invention can improve compression efficiency and save storage space.
Description
Technical field
The present invention relates to data processing technology, and in particular to a web page data compression and storage method and system.
Background art
To improve data transmission efficiency and save storage space, data can be compressed using data compression techniques. Data can be compressed because it contains redundancy; a data compression technique uses some algorithm to reduce that redundancy as much as possible while keeping distortion as low as possible.
Data compression techniques are generally divided into lossless compression and lossy compression.
Lossless compression means that the data obtained by restoring the compressed data is identical to the original data. It is mainly used where the reconstructed signal must match the original signal exactly, such as the compression of text data and programs. The compression ratio of lossless compression is relatively low, typically 1/2 to 1/5. Typical lossless compression algorithms include Huffman coding, arithmetic coding, and run-length coding.
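The lossless property described above can be illustrated with a short sketch. It assumes Python's standard zlib module (DEFLATE, which combines LZ77 matching with Huffman coding) as a stand-in for "some lossless algorithm"; the sample text is invented for illustration and is not taken from the patent.

```python
import zlib

# Highly redundant sample text (hypothetical), similar to repeated page markup.
original = ("<li>navigation entry</li>\n" * 200).encode("utf-8")

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossless: the restored bytes are identical to the original bytes.
is_lossless = restored == original

# Compressed size divided by original size; smaller means better compression.
ratio = len(compressed) / len(original)
```

The ratio obtained depends entirely on how redundant the input is, so the figure for this artificial sample says nothing about typical web pages.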
Lossy compression means that the data obtained by restoring the compressed data differs from the original data, but the information expressed by the original data is not affected; the achievable compression ratio is therefore much higher. Lossy compression is mainly used for data such as voice, images, and video. Typical lossy compression algorithms include pulse code modulation (PCM), predictive coding, transform coding, and interpolation and extrapolation.
Web page data is generally compressed losslessly, with a single web page as the unit of compression. The specific implementation is as follows: the web page to be compressed is obtained and compressed according to some algorithm; the resulting compressed data is saved, together with the uniform resource locator (URL) of the web page. Later, when the web page needs to be read, its compressed data is found according to the URL of the web page and decompressed to recover the web page.
In practice, however, this approach has a problem. In some cases different web pages share common content, for example different web pages of the same website. Because the prior art compresses each web page as a unit, it does not take the content shared between web pages into account. For example, if 40% of the content of two web pages is identical, that identical 40% is compressed twice and stored twice, which both reduces compression efficiency and increases the storage space occupied.
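The prior-art behaviour criticised above can be sketched as follows. This is only an illustration under stated assumptions: zlib stands in for the unspecified compression algorithm, a Python dict stands in for the store, and the page contents and example.com URLs are hypothetical.

```python
import zlib

# Content shared by both pages (navigation), plus a different body for each.
shared = "<div id='nav'>" + "".join(
    f"<a href='/s{i}'>Section {i}</a>" for i in range(50)
) + "</div>"
page_1 = shared + "<div id='text'>Body of the first article.</div>"
page_2 = shared + "<div id='text'>Body of the second article.</div>"

# Prior-art style store: one compressed blob per URL, whole page as the unit.
store = {}
for url, html in (("http://example.com/1", page_1),
                  ("http://example.com/2", page_2)):
    store[url] = zlib.compress(html.encode("utf-8"))

# Size of the shared part compressed once, and total size actually stored.
shared_size = len(zlib.compress(shared.encode("utf-8")))
total_size = sum(len(blob) for blob in store.values())
```

Each blob in this store contains its own compressed copy of the shared navigation, so the shared content is both compressed and stored once per page.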
Summary of the invention
In view of this, the present invention provides a web page data compression and storage method and system that can improve compression efficiency and save storage space.
To achieve the above object, the technical solution of the present invention is realized as follows:
A web page data compression and storage method, comprising:
when a web page needs to be compressed, dividing the web page into two or more blocks;
for each block, determining whether the corresponding compressed data has already been stored; if not, compressing the block and storing the compressed data; if so, not compressing the block.
A web page data compression and storage system, comprising:
a compression server, configured to: when a web page needs to be compressed, divide the web page into two or more blocks; for each block, send a query request to a storage server to ask whether the compressed data corresponding to the block is already stored there; if a denial message is received, compress the block and store the compressed data in the storage server; if a confirmation message is received, not compress the block;
the storage server, configured to store compressed data and to return a confirmation or denial message to the compression server in response to the query request received from it.
As can be seen, with the scheme of the present invention, if the compressed data corresponding to a block of a web page already exists, that is, an earlier web page contained the same block and that block has already been compressed and stored, the block is not compressed again; otherwise it is compressed. This improves compression efficiency. Moreover, because complete compressed data need not be stored for every web page, storage space is saved.
Brief description of the drawings
Fig. 1 is a schematic diagram of a template.
Fig. 2 is a flowchart of an embodiment of the web page data compression and storage method of the present invention.
Fig. 3 is a schematic diagram of the DOM tree corresponding to the template shown in Fig. 1.
Fig. 4 is a flowchart of a preferred embodiment of the web page data compression and storage method of the present invention.
Fig. 5 is a schematic structural diagram of an embodiment of the web page data compression and storage system of the present invention.
Detailed description of the embodiments
To address the problems in the prior art, the present invention proposes an improved web page data compression and storage scheme that can improve compression efficiency and save storage space.
As noted above, in some cases different web pages share common content, for example different web pages of the same website.
Most web pages of a website are generated from one or a few types of template. Fig. 1 is a schematic diagram of a template. As shown in Fig. 1, according to this template a web page can be divided into five parts, A, B, C, D, and E, where parts A, B, C, and D contain information such as in-page navigation and advertisements, and part E contains the body text. For different web pages generated from the template shown in Fig. 1, parts A, B, C, and D are usually identical and only part E differs.
Thus, if web page 1 and web page 2 are both generated from the template shown in Fig. 1, and parts A, B, C, D, and E of web page 1 have already been compressed and stored, then when web page 2 later needs to be compressed and stored, parts A, B, C, and D of web page 2 need not be compressed and stored again; only part E, which differs from web page 1, needs to be compressed and stored.
To make the technical solution of the present invention clearer, the scheme is described in further detail below with reference to the drawings and embodiments.
Fig. 2 is a flowchart of an embodiment of the web page data compression and storage method of the present invention. As shown in Fig. 2, the method comprises the following steps:
Step 21: when any web page X needs to be compressed (for ease of description, web page X denotes any web page), web page X is divided into two or more blocks.
How to divide a web page into two or more blocks is prior art; for example, the web page can be parsed into its Document Object Model (DOM) tree, from which the blocks are then obtained.
Fig. 3 is a schematic diagram of the DOM tree corresponding to the template shown in Fig. 1. As shown in Fig. 3, the part other than parts A, B, C, and D is part E.
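The patent leaves the division into blocks to the prior art, so the following is only one possible sketch, not the patented procedure. It assumes that each direct child element of <body> is one block, that elements other than the listed void tags have explicit closing tags, and that text lying directly between blocks can be dropped. The class name BlockSplitter and the sample page are hypothetical; html.parser is Python's standard event-based HTML parser.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags with no closing tag

class BlockSplitter(HTMLParser):
    """Collects each direct child element of <body> as one block of markup."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.blocks = []       # finished blocks, in document order
        self._parts = []       # markup pieces of the block being built
        self._depth = 0        # element nesting depth below <body>
        self._in_body = False

    def _flush_if_top_level(self):
        if self._depth == 0 and self._parts:
            self.blocks.append("".join(self._parts))
            self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
            return
        if not self._in_body:
            return
        self._parts.append(self.get_starttag_text())  # raw text of the start tag
        if tag in VOID_TAGS:
            self._flush_if_top_level()
        else:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
            return
        if not self._in_body or tag in VOID_TAGS:
            return
        self._parts.append(f"</{tag}>")
        self._depth -= 1
        self._flush_if_top_level()

    def handle_data(self, data):
        if self._in_body and self._depth > 0:
            self._parts.append(data)

    def handle_entityref(self, name):
        self.handle_data(f"&{name};")   # keep entities such as &amp; unchanged

    def handle_charref(self, name):
        self.handle_data(f"&#{name};")

def split_into_blocks(html):
    splitter = BlockSplitter()
    splitter.feed(html)
    splitter.close()
    return splitter.blocks

page = ("<html><body><div id='A'>nav</div><div id='B'>ads</div>"
        "<div id='E'><p>text</p></div></body></html>")
blocks = split_into_blocks(page)
```

For the sample page, the three top-level div elements become three blocks. A real template would likely need a finer rule, for example treating specific subtrees as parts A to E.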
Step 22: for each block Y (for ease of description, block Y denotes any block), it is determined whether the corresponding compressed data has already been stored; if not, block Y is compressed and the compressed data is stored; if so, block Y is not compressed.
In this step, identification information is generated for each block Y obtained in step 21; in practice, the identification information can be signature information, and how to generate it is prior art. It is then determined whether the identification information has already been stored; if not, block Y is compressed, the compressed data is stored, and the identification information is stored in association with it; if so, block Y is not compressed.
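The check in step 22 can be sketched as a small in-memory store. The patent does not say how the signature is generated, so the SHA-256 digest used here is an assumption, as are zlib as the compressor and the name BlockStore.

```python
import hashlib
import zlib

class BlockStore:
    """Stores each distinct block once, keyed by its identification information."""

    def __init__(self):
        self._by_signature = {}   # signature -> compressed bytes
        self.compress_calls = 0   # counts how often compression actually ran

    def put(self, block):
        """Compress and store the block only if its signature is unknown."""
        data = block.encode("utf-8")
        signature = hashlib.sha256(data).hexdigest()  # assumed signature scheme
        if signature not in self._by_signature:
            self._by_signature[signature] = zlib.compress(data)
            self.compress_calls += 1
        return signature

    def get(self, signature):
        return zlib.decompress(self._by_signature[signature]).decode("utf-8")

store = BlockStore()
sig_first = store.put("<div id='A'>navigation</div>")    # block from one page
sig_second = store.put("<div id='A'>navigation</div>")   # same block, another page
sig_text = store.put("<div id='E'>article text</div>")
```

The second put finds the signature already present and skips compression, which is the saving the method aims for. A cryptographic digest makes accidental collisions between different blocks extremely unlikely, but that choice belongs to this sketch, not to the patent.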
In addition, if the compressed data corresponding to block Y has not been stored, then after the compressed data of block Y is stored, the correspondence between the storage location of the compressed data of block Y and the URL of web page X is recorded; if the compressed data corresponding to block Y has already been stored, the correspondence between the storage location of the compressed data of block Y and the URL of web page X is recorded directly.
After every block has been processed in this way, the correspondences between the URL of web page X and multiple storage locations have been recorded, where the number of storage locations equals the number of blocks into which web page X was divided.
Thus, when web page X needs to be read, each block of web page X can be found according to the storage locations corresponding to the URL of web page X; the blocks are decompressed separately and the decompressed blocks are spliced together to generate web page X.
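The correspondence between a URL and its storage locations, and the read path built on it, can be sketched as below. The representation is an assumption: a storage location is modelled as an index into a Python list, the signature is a SHA-256 digest, and the function names and URLs are hypothetical.

```python
import hashlib
import zlib

blobs = []         # compressed blocks; a storage location is an index into this list
location_of = {}   # signature -> storage location
url_index = {}     # page URL -> ordered list of storage locations

def store_page(url, blocks):
    """Store unseen blocks and record one storage location per block for the URL."""
    locations = []
    for block in blocks:
        data = block.encode("utf-8")
        signature = hashlib.sha256(data).hexdigest()
        if signature not in location_of:          # not stored yet: compress and store
            blobs.append(zlib.compress(data))
            location_of[signature] = len(blobs) - 1
        locations.append(location_of[signature])  # recorded in both cases
    url_index[url] = locations

def read_page(url):
    """Decompress each block found via the URL and splice the blocks in order."""
    return "".join(
        zlib.decompress(blobs[location]).decode("utf-8")
        for location in url_index[url]
    )

nav = "<div id='A'>navigation</div>"
store_page("http://example.com/1", [nav, "<div id='E'>first article</div>"])
store_page("http://example.com/2", [nav, "<div id='E'>second article</div>"])
```

Both URLs end up pointing at the same storage location for the navigation block, and each URL maps to as many locations as the page has blocks. Keeping the locations in block order is what lets the read path splice the page back together.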
The process shown in Fig. 2 is further described below through a preferred embodiment.
Fig. 4 is a flowchart of a preferred embodiment of the web page data compression and storage method of the present invention. As shown in Fig. 4, the method comprises the following steps:
Step 41: when any web page X needs to be compressed, web page X is divided into two or more blocks.
Step 42: for each block Y, its identification information is generated, and it is determined whether the identification information has already been stored; if not, step 43 is performed; if so, step 44 is performed.
Step 43: block Y is compressed, the compressed data is stored, and its identification information is stored in association with it; the correspondence between the storage location of the compressed data of block Y and the URL of web page X is also recorded; step 45 is then performed.
Step 44: the correspondence between the storage location of the compressed data of block Y and the URL of web page X is recorded; step 45 is then performed.
Step 45: when web page X needs to be read, each block of web page X is found according to the storage locations corresponding to the URL of web page X; the blocks are decompressed separately and the decompressed blocks are spliced together to generate web page X; the flow ends.
This completes the description of the method embodiments of the present invention.
Based on the above description, Fig. 5 is a schematic structural diagram of an embodiment of the web page data compression and storage system of the present invention. As shown in Fig. 5, the system comprises:
a compression server 51, configured to: when a web page needs to be compressed, divide the web page into two or more blocks; for each block, send a query request to a storage server 52 to ask whether the compressed data corresponding to the block is already stored there; if a denial message is received, compress the block and store the compressed data in the storage server 52; if a confirmation message is received, not compress the block;
a storage server 52, configured to store compressed data and to return a confirmation or denial message to the compression server 51 in response to the query request received from it.
The compression server 51 can be further configured to, when the web page needs to be read, obtain the stored blocks of the web page from the storage server 52, decompress them separately, and splice the decompressed blocks together to generate the web page.
In addition, the compression server 51 can be further configured to generate identification information for any block and carry it in the query request sent to the storage server 52. Correspondingly, the storage server 52 determines whether it has stored the identification information; if not, it returns a denial message to the compression server 51; if so, it returns a confirmation message to the compression server 51. If the compression server 51 receives a denial message for a block, it stores the identification information of the block, together with the compressed data of the block, in the storage server 52.
The query request can further carry the URL of the web page. Correspondingly, the storage server 52 can be further configured to: if the compressed data corresponding to a block has not been stored, record, after that compressed data is stored, the correspondence between the storage location of the compressed data of the block and the URL of the web page; if the compressed data corresponding to the block has already been stored, directly record the correspondence between the storage location of the compressed data of the block and the URL of the web page. The compression server 51 obtains the storage locations corresponding to the URL of the web page from the storage server 52 and finds each block of the web page according to those storage locations.
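The exchange between the two servers can be sketched with two in-process classes. This is an illustration under assumptions, not the patented implementation: method calls replace network messages, the message values "confirm" and "deny" are hypothetical, the signature is a SHA-256 digest, zlib is the compressor, and each URL is assumed to be saved only once.

```python
import hashlib
import zlib

CONFIRM, DENY = "confirm", "deny"   # hypothetical message values

class StorageServer:
    def __init__(self):
        self._blobs = []      # storage location = index into this list
        self._location = {}   # identification information -> storage location
        self._pages = {}      # URL -> ordered storage locations

    def query(self, signature, url):
        """Answer whether the block is stored; if so, link its location to the URL."""
        if signature in self._location:
            self._pages.setdefault(url, []).append(self._location[signature])
            return CONFIRM
        return DENY

    def store(self, signature, url, compressed):
        """Store new compressed data with its signature, then link it to the URL."""
        self._blobs.append(compressed)
        self._location[signature] = len(self._blobs) - 1
        self._pages.setdefault(url, []).append(self._location[signature])

    def locations(self, url):
        return list(self._pages[url])

    def fetch(self, location):
        return self._blobs[location]

class CompressionServer:
    def __init__(self, storage):
        self.storage = storage
        self.compressed_blocks = 0   # how many blocks were actually compressed

    def save_page(self, url, blocks):
        for block in blocks:
            data = block.encode("utf-8")
            signature = hashlib.sha256(data).hexdigest()
            if self.storage.query(signature, url) == DENY:
                self.storage.store(signature, url, zlib.compress(data))
                self.compressed_blocks += 1

    def read_page(self, url):
        parts = (zlib.decompress(self.storage.fetch(location))
                 for location in self.storage.locations(url))
        return b"".join(parts).decode("utf-8")

storage = StorageServer()
server = CompressionServer(storage)
nav = "<div id='A'>navigation</div>"
server.save_page("http://example.com/1", [nav, "<div id='E'>first</div>"])
server.save_page("http://example.com/2", [nav, "<div id='E'>second</div>"])
```

Only three of the four blocks are compressed here, because the query for the second page's navigation block is confirmed. Sending the signature rather than the block itself keeps the query small, which matters once the two servers sit on different machines.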
The identification information can be signature information.
For the specific workflow of the system embodiment shown in Fig. 5, refer to the corresponding description in the method embodiments above; it is not repeated here.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (8)
1. A web page data compression and storage method, characterized by comprising:
when a web page needs to be compressed, dividing the web page into two or more blocks;
for each block, determining whether the corresponding compressed data has already been stored; if not, compressing the block and storing the compressed data; if so, not compressing the block;
if the compressed data corresponding to a block has not been stored, recording, after the compressed data of the block is stored, the correspondence between the storage location of the compressed data of the block and the uniform resource locator (URL) of the web page; if the compressed data corresponding to the block has already been stored, directly recording the correspondence between the storage location of the compressed data of the block and the URL of the web page;
wherein said determining, for each block, whether the corresponding compressed data has already been stored and, if not, compressing the block and storing the compressed data comprises:
for any block, generating its identification information and determining whether the identification information has already been stored; if not, compressing the block, storing the compressed data, and storing the identification information in association with it.
2. The method according to claim 1, characterized in that the method further comprises:
when the web page needs to be read, obtaining the stored blocks of the web page, decompressing them separately, and splicing the decompressed blocks together to generate the web page.
3. The method according to claim 2, characterized in that
said obtaining the stored blocks of the web page comprises: finding each block of the web page according to the storage locations corresponding to the URL of the web page.
4. The method according to claim 1, characterized in that the identification information is signature information.
5. A web page data compression and storage system, characterized by comprising:
a compression server, configured to: when a web page needs to be compressed, divide the web page into two or more blocks; for each block, send a query request to a storage server to ask whether the compressed data corresponding to the block is already stored there; if a denial message is received, compress the block and store the compressed data in the storage server; if a confirmation message is received, not compress the block; if the compressed data corresponding to a block has not been stored, record, after the compressed data corresponding to the block is stored, the correspondence between the storage location of the compressed data of the block and the uniform resource locator (URL) of the web page; if the compressed data corresponding to the block has already been stored, directly record the correspondence between the storage location of the compressed data of the block and the URL of the web page; for any block, generate its identification information and carry it in the query request sent to the storage server; and, if a denial message is received for a block, store the identification information of the block, together with the compressed data of the block, in the storage server;
the storage server, configured to: store compressed data and return a confirmation or denial message to the compression server in response to the query request received from it, wherein the query request carries the URL of the web page; and determine whether it has stored the identification information, and if not, return a denial message to the compression server, and if so, return a confirmation message to the compression server.
6. The system according to claim 5, characterized in that the compression server is further configured to, when the web page needs to be read, obtain the stored blocks of the web page from the storage server, decompress them separately, and splice the decompressed blocks together to generate the web page.
7. The system according to claim 6, characterized in that
the compression server obtains the storage locations corresponding to the URL of the web page from the storage server and finds each block of the web page according to those storage locations.
8. The system according to claim 5, characterized in that the identification information is signature information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110264127.0A CN102982046B (en) | 2011-09-07 | 2011-09-07 | Web page data compression and storage method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110264127.0A CN102982046B (en) | 2011-09-07 | 2011-09-07 | Web page data compression and storage method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102982046A CN102982046A (en) | 2013-03-20 |
CN102982046B (en) | 2017-09-26 |
Family
ID=47856082
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201110264127.0A Expired - Fee Related CN102982046B (en) | 2011-09-07 | 2011-09-07 | Web page data compression and storage method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102982046B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104376584B (en) * | 2013-08-15 | 2018-02-13 | 华为技术有限公司 | Data compression method, computer system and device |
CN103473214B (en) * | 2013-09-06 | 2017-04-12 | 百度在线网络技术(北京)有限公司 | Method and device for displaying page characters |
EP3229444B1 (en) | 2015-12-29 | 2019-10-16 | Huawei Technologies Co., Ltd. | Server and method for compressing data by server |
CN113742335A (en) * | 2021-01-28 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Data compression management method and device |
WO2022198483A1 (en) * | 2021-03-24 | 2022-09-29 | 深圳市大疆创新科技有限公司 | Data compression method and apparatus, movable platform, and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101127044A (en) * | 2007-06-08 | 2008-02-20 | 北京大学 | Dynamic web page segmentation method |
CN101944109A (en) * | 2010-09-06 | 2011-01-12 | 华南理工大学 | System and method for extracting picture abstract based on page partitioning |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IL133888A0 (en) * | 2000-01-05 | 2001-04-30 | Keselman Alexander | Method and algorithm for viewing search results in the internet and multi-page system using the same |
CN1332527A (en) * | 2000-07-10 | 2002-01-23 | 刘明 | WAP-based transmitted data compressing process |
CN1182682C (en) * | 2001-09-24 | 2004-12-29 | 北京大学 | Multimedia web site spliting and reconstructing method |
CN101079895B (en) * | 2006-12-21 | 2010-12-01 | 腾讯科技(深圳)有限公司 | A method, system and proxy service device for quick access to Web page |
CN102148833A (en) * | 2011-04-18 | 2011-08-10 | 中国工商银行股份有限公司 | Method for transmitting data report, server, client and system |
Non-Patent Citations (3)
Title |
---|
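An online block-based caching method for dynamic web pages; You Chao et al.; Acta Electronica Sinica; 2009-05-31; Vol. 37, No. 5; full text * |
Improvement and implementation of a vision-based web page segmentation algorithm; Gao Le et al.; Computer Systems & Applications; 2009-04-30; No. 4; full text * |
A web page segmentation algorithm for mobile devices; Lu Songfeng et al.; Journal of Chinese Computer Systems; 2007-09-30; No. 9; full text * |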
Also Published As
Publication number | Publication date |
---|---|
CN102982046A (en) | 2013-03-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170926 |
CF01 | Termination of patent right due to non-payment of annual fee | |