CN1405705A - Intelligent compression method for file of computer - Google Patents

Publication number
CN1405705A
Authority
CN
China
Prior art keywords
file
document
discernible
compressed
layout
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN01124158A
Other languages
Chinese (zh)
Other versions
CN1139883C (en)
Inventor
王金波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiuzhou Computer Network Co., Ltd., Beijing
Original Assignee
JIUZHOU COMPUTER NETWORK CO Ltd BEIJING
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JIUZHOU COMPUTER NETWORK CO Ltd BEIJING
Priority to CNB011241586A
Publication of CN1405705A
Application granted
Publication of CN1139883C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Compression Of Band Width Or Redundancy In Fax (AREA)
  • Document Processing Apparatus (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The computer automatically recognizes the various types of data (such as text, image, and voice) in a computer file and automatically applies a suitable lossless or lossy coding method to compress them efficiently. The invention also includes the corresponding decompression method.

Description

Intelligent compression method for computer files
The technical field of the present invention is computing.
Raw data stored in a computer as computer files, such as text, images, and sound, is normally uncompressed. However, when you package such files to carry them elsewhere, or transmit them over the Internet or a telephone line, you often need to compress them appropriately.
At present, computer files are compressed in one of two operating modes. In the first, a person identifies the information type of each file (such as text, image, or speech) and runs the corresponding compression software on it (for example, ARJ for text files, LeadView for image files, and RealAudio for voice files). This yields the desired compression, but when a file contains several different kinds of information units, or when many files must be compressed, the operation takes a great deal of manual time and requires buying many kinds of compression software.
In the second mode, every file is compressed with lossless coding regardless of its type. For example, a modem with a V.42bis chip applies lossless compression to everything it transmits, and WinZip applies lossless compression to every kind of file. This mode avoids the manual effort, but it achieves only a relatively low compression ratio. It suits character data better than other information such as images and sound.
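The limitation of a single lossless coder can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: `zlib` stands in for a general-purpose lossless coder, and random bytes stand in for high-entropy media samples. Real images and sound are not purely random, so the gap shown here is an upper bound on the effect rather than a measurement.

```python
import os
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive character data compresses very well with a lossless coder.
text = b"The quick brown fox jumps over the lazy dog. " * 200

# Assumption: random bytes as a stand-in for high-entropy media samples.
noise = os.urandom(len(text))

text_ratio = ratio(text)
noise_ratio = ratio(noise)
print(f"text: {text_ratio:.3f}  noise: {noise_ratio:.3f}")
```

The text shrinks to a small fraction of its size, while the high-entropy input stays at roughly its original size.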
The purpose of the present invention is to provide an intelligent method for compressing computer files. With this method, the computer can automatically compress any single file or group of files, and can compress each of the different information types in a file (such as text, images, and sound) efficiently.
This intelligent compression method can be implemented in computer software, in computer hardware, or in a combination of the two. Its flow charts are shown in Fig. 1 and Fig. 2, which are described in turn below.
On the compression side, the flow chart is shown in Fig. 1. At the start of the flow, the user selects a group of files to compress. The computer system performs the remaining steps of Fig. 1 automatically according to the invention:
A. Choose any file from the group of files to be compressed, then verify whether its file format is recognizable.
For this purpose, the system needs a set of preconfigured file types with two properties. First, the file formats of these types are known, and these types cannot be compressed efficiently with lossless coding alone. Second, the file types are described by two lists: a file extension list, which gives the file extension of each file type, and a file control information list, which gives some of the file control information (file header information) corresponding to each file extension.
To verify whether the format of the selected file is recognizable, the system first checks whether the file's extension appears in the file extension list. If it does, the system then checks whether the file matches the corresponding entry in the file control information list. If both checks succeed, the file format is recognizable; if either check fails, it is not.
For example, the Bitmap file is an image file type. Its assigned file extension is bmp. Its corresponding control information is: the first two bytes of the file are BM, bytes 2-5 indicate the actual image data length of the file, bytes 10-13 indicate the starting position of the image data in the file, and so on.
Suppose a file named Picture.bmp is selected for compression. The system first checks whether bmp appears in the file extension list. If it does, the system next verifies whether the first two bytes of the file are exactly BM, and whether the actual length of the image data, starting at the position indicated by bytes 10-13 of the file, matches the length indicated by bytes 2-5. If all these checks succeed, the file format of Picture.bmp is recognizable; otherwise it is not.
B. If the file format is not recognizable, the system compresses the file with lossless coding.
C. If the file format is recognizable, the system determines whether the file is a simple file that contains only one type of data, or a composite file that contains more than one type of data.
D. For a simple file, the system automatically identifies the type of data the file contains from its format and automatically applies a suitable compression coding. For example, it automatically uses G.723 coding for speech data, JPEG coding for color image data, and so on.
E. For a composite file, such as an RTF or HTML file, the system automatically splits the file into multiple information units according to its file format, so that each unit contains only one type of data. The system can then compress the data in each information unit automatically in the manner of step D.
To reduce operational complexity, the system may instead compress a composite file automatically with lossless coding and skip the splitting step. This case is not shown in Fig. 1.
F. To store the compressed data and the corresponding control information of these files, the system needs to define its own file package format. This format is called the ICF format (Intelligent Compression Format) and has the file extension icf. The last step of Fig. 1 assembles each compressed file into an ICF file.
G. If not all of the files to be compressed have been compressed, the system repeats steps A to F until all of them have been compressed.
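The recognition check in step A and the codec choice in steps B to D can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. The names `KNOWN_FORMATS`, `bmp_header_ok`, `recognize`, `choose_codec`, and `make_bmp` are hypothetical. The header check assumes the standard BMP file header, in which (counting from 0) bytes 2-5 hold the declared file size and bytes 10-13 hold the offset of the pixel data. The codec names are only labels for the examples the description mentions; no actual JPEG or G.723 encoder is invoked.

```python
import struct
from pathlib import Path


def bmp_header_ok(data: bytes) -> bool:
    """Check control information, assuming the standard 14-byte BMP file header."""
    if len(data) < 14 or data[:2] != b"BM":
        return False
    (file_size,) = struct.unpack_from("<I", data, 2)      # declared file size
    (pixel_offset,) = struct.unpack_from("<I", data, 10)  # start of pixel data
    return file_size == len(data) and 14 <= pixel_offset <= len(data)


# Hypothetical stand-in for the two lists: extension -> (header check, data type).
KNOWN_FORMATS = {".bmp": (bmp_header_ok, "image")}


def recognize(name: str, data: bytes):
    """Step A: return the data type if both checks pass, otherwise None."""
    entry = KNOWN_FORMATS.get(Path(name).suffix.lower())
    if entry is None:
        return None                      # extension not in the list
    check, data_type = entry
    return data_type if check(data) else None


def choose_codec(name: str, data: bytes) -> str:
    """Steps B-D: lossless fallback unless the format is recognized."""
    data_type = recognize(name, data)
    if data_type is None:
        return "lossless"
    return {"image": "jpeg", "speech": "g723"}.get(data_type, "lossless")


def make_bmp(pixels: bytes) -> bytes:
    """Header-only test fixture (not a displayable image)."""
    header = b"BM" + struct.pack("<IHHI", 14 + len(pixels), 0, 0, 14)
    return header + pixels


sample = make_bmp(b"\x00" * 12)
print(choose_codec("Picture.bmp", sample))   # recognized image file
print(choose_codec("notes.xyz", sample))     # unknown extension
```

A file whose extension is listed but whose header fails the check falls back to lossless coding, which mirrors the rule that both verifications must succeed.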
The present invention also relates to a method for decompressing the compressed files, as shown in Fig. 2. At the start of Fig. 2, the user selects the files to be decompressed, which have the icf extension. The computer system performs all remaining actions automatically, as follows:
Choose any file from the files with the icf extension that are to be decompressed, then verify from the file's control information whether it is really an ICF file.
If the file is not a real ICF file, stop decompressing it and choose another file.
If the file is really an ICF file, the system further determines whether it is a simple file containing only one type of compressed data or a composite file containing several types of compressed data.
If the file is a compressed simple file, the system automatically identifies the type of data in it and decompresses the data with the decoder that corresponds to the compression coding used.
If the file is a compressed composite file that was not compressed entirely with lossless coding, the system automatically splits it into multiple information units, each containing only one type of compressed data, and decompresses each unit in the manner described above.
If a composite file was originally compressed entirely with a single lossless coding, it is decompressed with the corresponding lossless decoding. This case is not shown in Fig. 2.
The final step of Fig. 2 forms a decompressed file. If not all of the files with the icf extension have been decompressed, the system automatically repeats the operations above until all of them have been decompressed.
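The decompression flow can be sketched as a round trip through a container. The source does not specify the ICF byte layout, so everything about the layout below is an assumption made for illustration: the `ICF0` signature, the unit count, and the per-unit codec id and length fields are all hypothetical. Only a lossless `zlib` codec is registered so that the round trip is exact; a lossy decoder would be registered in the same table.

```python
import struct
import zlib

MAGIC = b"ICF0"  # hypothetical signature; the real ICF layout is not given

# Codec id -> (encode, decode). Only a lossless codec is registered here.
CODECS = {1: (zlib.compress, zlib.decompress)}


def pack_unit(codec_id: int, raw: bytes) -> bytes:
    """Encode one information unit and prefix it with its control fields."""
    payload = CODECS[codec_id][0](raw)
    return struct.pack("<BI", codec_id, len(payload)) + payload


def pack_icf(units) -> bytes:
    """units: list of (codec_id, raw_bytes); one unit models a simple file."""
    units = list(units)
    body = b"".join(pack_unit(codec_id, raw) for codec_id, raw in units)
    return MAGIC + struct.pack("<H", len(units)) + body


def unpack_icf(blob: bytes):
    """Return the decoded units, or None if the control information is wrong."""
    if len(blob) < 6 or blob[:4] != MAGIC:
        return None                      # not a real ICF file: skip it
    (count,) = struct.unpack_from("<H", blob, 4)
    pos, units = 6, []
    for _ in range(count):
        codec_id, size = struct.unpack_from("<BI", blob, pos)
        pos += 5
        decode = CODECS[codec_id][1]     # decoder matching the encoder used
        units.append(decode(blob[pos:pos + size]))
        pos += size
    return units


blob = pack_icf([(1, b"hello " * 50), (1, b"world")])
print(unpack_icf(blob) == [b"hello " * 50, b"world"])
print(unpack_icf(b"not an icf file"))
```

Storing the codec id beside each unit is what lets the decompressor pick the matching decoder without inspecting the payload. A container with one unit corresponds to a simple file, and one with several units to a composite file.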
Based on the method provided by the invention, a new compression program has been developed that makes compressing various computer files both easy and efficient, and that performs much better than existing computer file compression tools. The method provided by the invention can also be used in various application systems, such as e-mail, FTP, and modem systems.

Claims (8)

1. A method for automatically compressing different types of computer files by a computer system, comprising the following steps:
(1) Choose a file to be compressed, and verify whether its file format is recognizable by the following operations:
A. The system holds a set of preconfigured file types with two properties. First, the file formats of these types are known, and these types cannot be compressed efficiently with lossless coding alone. Second, the file types are described by two lists: a file extension list, which gives the file extension of each file type, and a file control information list, which gives some of the file control information (file header information) corresponding to each file extension.
B. To verify whether the format of the selected file is recognizable, the system first checks whether the file extension of the selected file appears in the file extension list. If it does, the system then checks whether the file matches the corresponding entry in the file control information list. If both checks succeed, the file format is recognizable; otherwise it is not.
(2) Compress the file according to the result of recognizing its file format, as follows:
a. If the file format is not recognizable, compress the file with lossless coding.
b. If the file format is recognizable and the file contains only one type of data, compress the file with a suitable lossless or lossy coding according to its information type.
c. If the file format is recognizable and the file contains several types of data, first split the file into multiple information units, each containing only one type of data, then compress each unit with a suitable lossless or lossy coding according to its data type.
2. A method for automatically compressing different types of computer files by a computer system, comprising the following steps:
(1) Choose a file to be compressed, and verify whether its file format is recognizable, as described in step (1) of claim 1.
(2) Compress the file according to the result of recognizing its file format, as follows:
a. If the file format is recognizable and the file contains only one type of data, compress the file with a suitable lossless or lossy coding according to its data type.
b. If the file format is not recognizable, or the file contains more than one type of data, compress the file directly with lossless coding.
3. Use of the method of claim 1 in computer software, hardware, or a combination of software and hardware.
4. Use of the method of claim 2 in computer software, hardware, or a combination of software and hardware.
5. Use of the method of claim 1 as a standalone compression tool.
6. Use of the method of claim 2 as a standalone compression tool.
7. Use of the method of claim 1 in any application system.
8. Use of the method of claim 2 in any application system.
CNB011241586A 2001-08-20 2001-08-20 Intelligent compression method for file of computer Expired - Fee Related CN1139883C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB011241586A CN1139883C (en) 2001-08-20 2001-08-20 Intelligent compression method for file of computer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB011241586A CN1139883C (en) 2001-08-20 2001-08-20 Intelligent compression method for file of computer

Publications (2)

Publication Number Publication Date
CN1405705A true CN1405705A (en) 2003-03-26
CN1139883C CN1139883C (en) 2004-02-25

Family

ID=4665545

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB011241586A Expired - Fee Related CN1139883C (en) 2001-08-20 2001-08-20 Intelligent compression method for file of computer

Country Status (1)

Country Link
CN (1) CN1139883C (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1318974C (en) * 2005-08-05 2007-05-30 北京九州汇宝软件有限公司 Method for compression and search of database backup data
CN100343851C (en) * 2004-11-03 2007-10-17 北京神舟航天软件技术有限公司 Database compression and decompression method
CN102054038A (en) * 2010-12-30 2011-05-11 东莞宇龙通信科技有限公司 File decompression method and device as well as mobile terminal
WO2011079796A1 (en) * 2009-12-30 2011-07-07 北京飞天诚信科技有限公司 Method for compressing.net document
CN102147818A (en) * 2011-05-17 2011-08-10 上海华岭集成电路技术股份有限公司 Test file compression method
CN1584875B (en) * 2004-06-01 2011-08-10 北京九州软件有限公司 Ergodic compressing and decompressing method for batched computer document
CN102693325A (en) * 2012-06-12 2012-09-26 腾讯科技(深圳)有限公司 File storing method and device
CN103403712A (en) * 2011-01-14 2013-11-20 苹果公司 Content based file chunking
CN103873860A (en) * 2014-03-18 2014-06-18 深信服网络科技(深圳)有限公司 Document transmission method and device
CN103902567A (en) * 2012-12-26 2014-07-02 联想(北京)有限公司 Data processing method, device and system
CN104125458A (en) * 2013-04-27 2014-10-29 展讯通信(上海)有限公司 Lossless stored data compression method and device
CN104868922A (en) * 2014-02-24 2015-08-26 华为技术有限公司 Data compression method and device
CN104978319A (en) * 2014-04-02 2015-10-14 东华软件股份公司 Method and equipment used for classified transmission of files
CN106470037A (en) * 2015-08-21 2017-03-01 博雅网络游戏开发(深圳)有限公司 Intelligent compression method and system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9912624B2 (en) 2015-09-25 2018-03-06 International Business Machines Corporation Lossy text source coding by word length
CN110286917A (en) * 2019-05-21 2019-09-27 深圳壹账通智能科技有限公司 File packing method, device, equipment and storage medium

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1584875B (en) * 2004-06-01 2011-08-10 北京九州软件有限公司 Ergodic compressing and decompressing method for batched computer document
CN100343851C (en) * 2004-11-03 2007-10-17 北京神舟航天软件技术有限公司 Database compression and decompression method
CN1318974C (en) * 2005-08-05 2007-05-30 北京九州汇宝软件有限公司 Method for compression and search of database backup data
WO2011079796A1 (en) * 2009-12-30 2011-07-07 北京飞天诚信科技有限公司 Method for compressing.net document
CN102054038B (en) * 2010-12-30 2014-05-28 东莞宇龙通信科技有限公司 File decompression method and device as well as mobile terminal
CN102054038A (en) * 2010-12-30 2011-05-11 东莞宇龙通信科技有限公司 File decompression method and device as well as mobile terminal
US9305008B2 (en) 2011-01-14 2016-04-05 Apple Inc. Content based file chunking
CN103403712A (en) * 2011-01-14 2013-11-20 苹果公司 Content based file chunking
CN102147818B (en) * 2011-05-17 2013-09-25 上海华岭集成电路技术股份有限公司 Test file compression method
CN102147818A (en) * 2011-05-17 2011-08-10 上海华岭集成电路技术股份有限公司 Test file compression method
CN102693325A (en) * 2012-06-12 2012-09-26 腾讯科技(深圳)有限公司 File storing method and device
WO2013185563A1 (en) * 2012-06-12 2013-12-19 腾讯科技(深圳)有限公司 File storage method and apparatus, and storage medium
US10013419B2 (en) 2012-06-12 2018-07-03 Tencent Technology (Shenzhen) Company Limited File storage method and apparatus, and storage medium
CN103902567A (en) * 2012-12-26 2014-07-02 联想(北京)有限公司 Data processing method, device and system
CN104125458A (en) * 2013-04-27 2014-10-29 展讯通信(上海)有限公司 Lossless stored data compression method and device
CN104125458B (en) * 2013-04-27 2017-08-08 展讯通信(上海)有限公司 Internal storage data lossless compression method and device
CN104868922A (en) * 2014-02-24 2015-08-26 华为技术有限公司 Data compression method and device
CN104868922B (en) * 2014-02-24 2018-05-29 华为技术有限公司 Data compression method and apparatus
CN103873860B (en) * 2014-03-18 2017-12-22 深信服网络科技(深圳)有限公司 Document transmission method and device
CN103873860A (en) * 2014-03-18 2014-06-18 深信服网络科技(深圳)有限公司 Document transmission method and device
CN104978319A (en) * 2014-04-02 2015-10-14 东华软件股份公司 Method and equipment used for classified transmission of files
CN106470037A (en) * 2015-08-21 2017-03-01 博雅网络游戏开发(深圳)有限公司 Intelligent compression method and system

Also Published As

Publication number Publication date
CN1139883C (en) 2004-02-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C57 Notification of unclear or unknown address
DD01 Delivery of document by public notice

Addressee: Wang Jinbo

Document name: payment instructions

ASS Succession or assignment of patent right

Owner name: BEIJING JIUZHOU SOFTWARE CO., LTD.

Free format text: FORMER OWNER: JIUZHOU COMPUTER NETWORK CO., LTD., BEIJING

Effective date: 20050218

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20050218

Address after: 100080, room 52, 1608 Haidian Road, Beijing, Haidian District

Patentee after: Jiuzhou Computer Network Co., Ltd., Beijing

Address before: 100080 Pacific Mansion, No. 52, Haidian Road, Beijing, Haidian District

Patentee before: Jiuzhou Computer Network Co., Ltd., Beijing

C56 Change in the name or address of the patentee

Owner name: BEIJING JIUZHOU HUIBAO SOFTWARE CO., LTD.

Free format text: FORMER NAME OR ADDRESS: BEIJING JIUZHOU SOFTWARE CO., LTD.

CP03 Change of name, title or address

Address after: 100080, room 52, 1608 Haidian Road, Beijing, Haidian District

Patentee after: Jiuzhou Huibao Software Co., Ltd., Beijing

Address before: 100080, room 52, 1608 Haidian Road, Beijing, Haidian District

Patentee before: Jiuzhou Computer Network Co., Ltd., Beijing

C56 Change in the name or address of the patentee

Owner name: BEIJING JIUZHOU SOFTWARE CO., LTD.

Free format text: FORMER NAME: BEIJING GLOGAL SOFTWARE CO., LTD.

CP03 Change of name, title or address

Address after: 100086 Beijing Haidian District Sanyi Temple No. 2 North Building 502-505

Patentee after: Jiuzhou Computer Network Co., Ltd., Beijing

Address before: 100029 Beijing city Chaoyang District Beitucheng West Road No. 3 building B block six layer Microelectronics

Patentee before: Jiuzhou Huibao Software Co., Ltd., Beijing

DD01 Delivery of document by public notice

Addressee: Wang Jinbo

Document name: Notification that Application Deemed not to be Proposed

C56 Change in the name or address of the patentee
CP02 Change in the address of a patent holder

Address after: 100044 Beijing city Xicheng District Xizhimen Street No. 135 Building No. 4 hospital 3

Patentee after: Jiuzhou Computer Network Co., Ltd., Beijing

Address before: 100086 Beijing Haidian District Sanyi Temple No. 2 North Building 502-505

Patentee before: Jiuzhou Computer Network Co., Ltd., Beijing

DD01 Delivery of document by public notice

Addressee: Jiuzhou Computer Network Co., Ltd., Beijing

Document name: Notification to Pay the Fees

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20040225

Termination date: 20140820

EXPY Termination of patent right or utility model
C56 Change in the name or address of the patentee
CP02 Change in the address of a patent holder

Address after: 100081, Haidian District, Beijing, Zhongguancun South Street, No. 52, China Foreign Exchange Building 902

Patentee after: Jiuzhou Computer Network Co., Ltd., Beijing

Address before: 100044 Beijing city Xicheng District Xizhimen Street No. 135 Building No. 4 hospital 3

Patentee before: Jiuzhou Computer Network Co., Ltd., Beijing

REIN Reinstatement of patent application or patent right
REIN Reinstatement of patent application or patent right
RR01 Reinstatement of patent right

Former decision: cessation of patent right due to non-payment of the annual fee

Former decision publication date: 20151028

DD01 Delivery of document by public notice

Addressee: Wang Jinbo

Document name: payment instructions

DD01 Delivery of document by public notice
DD01 Delivery of document by public notice

Addressee: Wang Jinbo

Document name: Notice of termination of patent right

DD01 Delivery of document by public notice
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20040225

Termination date: 20200820