MXPA01004341A

MXPA01004341A - Method for compressing digital documents with control of image quality and compression rate

Info

Publication number: MXPA01004341A
Application number: MXPA/A/2001/004341A
Authority: MX
Inventors: W Zeck Norman; A Crean Peter; Kang Sangchul; E Rumph David; E Nelson William; L Eldridge George
Original assignee: A Crean Peter; L Eldridge George; Kang Sangchul; E Nelson William; E Rumph David; W Zeck Norman
Priority date: 2000-05-01
Filing date: 2001-04-30
Publication date: 2003-11-07

Abstract

A method and apparatus are disclosed for compressing and decompressing electronic documents, with maximum intradocument independence and maximum flexibility in optimization of compression modes. The method includes receiving documents containing unknown combinations of a plurality of data types, including combinations of scanned data, computer rendered data, compressed data and/or rendering tags; dividing the received image into strips of blocks; determining, from the image itself, which data types are present in each block; and compressing data of each data type present in each block with a compression method optimized for its data type. Scanned data may be further segmented into plural scanned data types, where each data type is compressed in said compressing step with a compression method optimized for said scanned image data type. If the received data type is compressed data, the process may include the additional functions of determining a compression ratio thereof, and accepting the compressed data for use as is, or decompressing and recompressing the data, based on the acceptability of said compression ratio determination. An instruction set is generated that allows detailed decompression instruction data and image data to be combined with transmitted compressed data. A data structure is shown which segregates data types and instruction data, and allows for block-to-block and strip-to-strip processing independence.

Description

A METHOD FOR COMPRESSING DIGITAL DOCUMENTS WITH CONTROL OF IMAGE QUALITY AND COMPRESSION RATE
DESCRIPTION OF THE INVENTION
In digital systems, documents in image format are often compressed to save storage costs or to reduce transmission time through a transmission channel. Lossless compression applied to such documents can achieve very good compression in regions that are computer rendered, such as characters and graphics. However, areas of the document that contain scanned image data will not compress well. Compression technologies such as JPEG work well on scanned, continuous-tone areas of the document. With this technology, and with transform coding technologies in general, image quality problems arise at the high-contrast edges produced by computer-rendered objects. The solution to this problem is to apply different compression technologies within the document to optimize image quality and compressibility. A method is described for compressing digital raster images that uses different compression methods for selected parts of the image and that adjusts the compression and segmentation parameters to control the tradeoff between image quality and compression. The image, including the rendering tags that can accompany each pixel, is encoded in a single data stream so that it can be managed efficiently by disk, memory and I/O systems. The uniqueness of the system lies in the content-dependent separation of the image into lossy and lossless regimes, the transmission of only those blocks containing information, and the adjustable segmentation and compression parameters used to control the image data rate (average compression ratio) over extremely small intervals (typically eight scan lines).
The graphic arts world, and Scitex in particular, as exemplified by the TIFF/IT standard (ISO 12639:1997E, "Graphic Technology - Prepress Digital Data Exchange - Tag Image File Format for Image Technology (TIFF/IT)"), has separated documents into continuous tone (CT) and line work (LW) images, maintaining a different resolution for each and applying a different compression technique to each (JPEG and run-length coding, respectively). Information for merging the two image planes is carried in the LW channel.

US-A 5,225,911 to Buckley et al. uses similar coding but replaces the LW channel with several data streams, including mask, color, and rendering tags. Compression has been used in printing for a decade for binary images, using one of several standard or proprietary formats: CCITT, JBIG, Xerox Adaptive (as discussed in Buckley, Interpress). These single-plane schemes are lossless and, although they are often very effective (20:1), they can, for some images, give little or no compression. US Patent Application Serial No. 09/206487 also separates the image into two planes, but each plane is sent in its entirety, three data streams are used (two image planes and one separation mask), and there is no mechanism for controlling the local data rate.
JPEG is a standard for compressing continuous-tone images. The name refers to the Joint Photographic Experts Group. JPEG is divided into a baseline system that offers a limited set of capabilities, and a set of optional extended system features. JPEG provides the ability to encode and decode images with lossy compression. In addition to this lossy coding capability, JPEG incorporates progressive transmission and also a lossless scheme. JPEG uses a discrete cosine transform (DCT) as part of the coding process to provide a representation of the image that is better suited to lossy compression. The DCT transforms the image from a spatial representation to a frequency representation. Once in the frequency domain, the coefficients are quantized to achieve compression. Lossless encoding is applied after quantization to further improve compression performance. The decoder executes the inverse operations to reconstruct the image.
Dictionary-based compression methods work on the principle of replacing substrings in a data stream with a codeword that identifies those substrings in a dictionary. This dictionary can be static, if knowledge of the input stream and its statistics is available, or it can be adaptive.
Adaptive dictionary schemes are better at handling data streams whose statistics are unknown or vary. Many adaptive dictionary coders are based on two related techniques developed by Ziv and Lempel. The two methods are frequently referred to as LZ77 (or LZ1) and LZ78 (or LZ2). Both use a simple approach to achieve adaptive compression: a substring of text is replaced with a pointer to a place where the string has previously occurred. The dictionary is thus all or a portion of the input stream that has already been processed. Using previous strings from the input stream is often a good choice for the dictionary, since strings that have occurred are likely to occur again. The other advantage of this scheme is that the dictionary is transmitted at essentially no cost, since the decoder can generate the dictionary from the previously decoded input stream. The main variations of LZ coding differ chiefly in how the pointers are represented and in what the pointers are allowed to refer to.
LZ1 is a relatively easy-to-implement version of a dictionary encoder. The dictionary in this case is a sliding window that contains the previous data of the input stream. The encoder searches this window for the longest match to the current substring in the input stream. The search can be accelerated by indexing the prior substrings with a trie, hash table, or binary search tree. Decoding for LZ1 is very fast, since each code word is an array lookup and a length to copy to the (uncoded) output data stream. In contrast to LZ1, where pointers can refer to any substring in the window of prior data, the LZ2 method places restrictions on which substrings can be referenced. However, LZ2 does not have a window to limit how far back substrings can be referenced.
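As an illustration of the sliding-window (LZ1) scheme described above, the following is a minimal Python sketch, not the coder used by the invention. The function names, the token layout (literal or offset/length pointer), the window size and the minimum match length are all assumptions made for the example, and the brute-force search stands in for the indexed search the text mentions.

```python
def lz77_encode(data: bytes, window: int = 4096, min_len: int = 3):
    """Greedy sliding-window encoder (hypothetical token format).

    Emits ("lit", byte) for literals and ("ptr", offset, length) for
    back-references into the previously processed input.
    """
    out, i = [], 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Brute-force search of the window for the longest match.
        for j in range(max(0, i - window), i):
            k = 0
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_len:
            out.append(("ptr", best_off, best_len))
            i += best_len
        else:
            out.append(("lit", data[i]))
            i += 1
    return out


def lz77_decode(tokens) -> bytes:
    """Rebuild the stream; each pointer is an array lookup plus a copy."""
    buf = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            buf.append(tok[1])
        else:
            _, off, length = tok
            for _ in range(length):  # byte-wise copy handles overlapping matches
                buf.append(buf[-off])
    return bytes(buf)
```

With the input b"bacdef-acdef", the last token produced by this sketch is a pointer of offset 6 and length 5: the window search finds "acdef" even though the earlier occurrence is preceded by "b".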
This avoids the inefficiency of having more than one coded representation for the same string, which can occur frequently in LZ1. LZ2 builds the dictionary by matching the current substrings of the input stream against a stored dictionary. This stored dictionary is generated adaptively based on the content of the input stream. As each input substring is looked up in the dictionary, the longest match is located, starting at the current symbol in the input stream. Thus, if the character "a" were the first part of a substring, only substrings that start with "a" would be searched. In general, this leads to a good match of the input substring with substrings in the dictionary. However, if there were a substring "bacdef" in the dictionary, then "acdef" from the input stream would not match this entry, since the substring in the dictionary starts with "b". This differs from LZ1, which can find the best match anywhere in the window and could generate a pointer to "acdef".
US-A 5,479,587 discloses a print buffer minimization method in which the raster data is compressed by trying different compression procedures with increasing compression ratios until the raster data is compressed sufficiently to fit in a given print buffer. Each time, a compression procedure with a higher compression ratio is selected from a predefined repertoire of such procedures, ranging from lossless procedures such as run-length encoding to lossy procedures. In general, lossless coding is efficient on text and line art data, while lossy coding is effective on image data. However, this method can produce poor print quality when the nature of the raster page demands lossy compression to achieve a predetermined compression ratio.
This is because only one of the selected compression procedures is applied summarily across each band of the page, and when the band contains image data as well as text or line art data, the lossy compression procedure will generally blur the sharp lines that usually delineate text or line art data, or may introduce undesirable artifacts.
European Patent Publication No. 0597571 describes a method in which the types of objects on a page are first extracted and the boundary of each object is determined before rasterization. Appropriate compression procedures are then applied selectively to each type of object. In this way, lossless compression procedures can be applied optimally to text or line art objects, while lossy compression procedures can be applied to image objects. The method operates at the level of the display list, which is an intermediate form between the page description file and the rasterized page. The objects and their types are determined by parsing the high-level, implicit object-defining commands of the page description language (PDL) in the display list. This requires knowledge of the particular brand and version of the PDL commands, as well as of how to reconstruct an object from those implicit commands. In any case, it appears that only the simplest boundaries, such as those of objects enclosed in rectangular blocks, are practically determinable from such decoding at the display list level.
US-A 5,982,937 discloses a hybrid lossless/lossy compression process in which page or raster data is analyzed to distinguish text or line art objects from image or photo objects. This is achieved by a method that analyzes and recognizes structures in the raster data in the form of color patches. A patch is regarded as a spread of connected pixels of the same color.
Once the patches are recognized, each is classified as a Type 1 or Type 2 patch, depending on whether or not it can be efficiently compressed by the first type of compression procedure (typically lossless run-length coding). Each patch has a size measured by the number of pixels in it ("PatchPixelCount"). A Type 1 patch has a PatchPixelCount greater than or equal to a predetermined number, D1, and a Type 2 patch has a PatchPixelCount less than D1. In a preferred implementation, D1 is from 6 to 8. The first (lossless) compression procedure is then applied to the Type 1 patches and the second compression procedure (typically lossy JPEG) is applied to the Type 2 patches. In this way, compression procedures appropriate to each type of data are applied to achieve efficient and optimal compression while maintaining quality. The references described herein and above are incorporated by reference for their teachings.
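The patch-size test attributed above to US-A 5,982,937 can be sketched as follows. This is an illustrative reading only, not that patent's implementation. The function name, the use of 4-connectivity, the breadth-first flood fill and the default threshold of 6 (the low end of the stated 6 to 8 range) are assumptions.

```python
from collections import deque


def classify_patches(img, d1=6):
    """Return a grid of 1 (patch pixel count >= d1) or 2 (count < d1).

    img is a list of rows of color values; a patch is a 4-connected
    run of pixels with the same color (connectivity is an assumption).
    """
    h, w = len(img), len(img[0])
    types = [[0] * w for _ in range(h)]  # 0 means not yet visited
    for y in range(h):
        for x in range(w):
            if types[y][x]:
                continue
            color = img[y][x]
            patch, queue = [(y, x)], deque([(y, x)])
            types[y][x] = -1  # temporary visited marker
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and types[ny][nx] == 0 and img[ny][nx] == color):
                        types[ny][nx] = -1
                        patch.append((ny, nx))
                        queue.append((ny, nx))
            kind = 1 if len(patch) >= d1 else 2
            for py, px in patch:
                types[py][px] = kind
    return types
```

Under this sketch, a large flat region is routed to the lossless coder (Type 1), while isolated pixels of varying color, as found in photographic data, fall into Type 2.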

BRIEF DESCRIPTION OF THE INVENTION
According to the invention, there is provided a method and an apparatus for compressing and decompressing electronic documents, with maximum intradocument independence and maximum flexibility in the optimization of compression modes. According to one aspect of the invention, there is provided a method for compressing a received document, comprising: receiving documents containing unknown combinations of a plurality of data types, including combinations of scanned data, computer-rendered data, compressed data and/or rendering tags; dividing the received image into bands of blocks; determining, from the image itself, which data types are present in each block; and compressing the data of each data type present in each block with a compression method optimized for that data type. The described method also provides that scanned data may be further segmented into a plurality of scanned data types, with each type compressed in the compressing step with a compression method optimized for that scanned image data type. The described method also provides that, where a received data type is compressed data, the process may include the additional functions of determining its compression ratio and either accepting the compressed data for use as is, or decompressing and recompressing the data, based on the acceptability of the determined compression ratio. The described method also provides that, where some or all of the received data types are predetermined, the process may use this information to select a compression method for that data type.
According to another aspect of the invention, there is provided a method for compressing received documents, which includes: receiving documents containing unknown combinations of a plurality of data types, including combinations of scanned data, computer-rendered data, compressed data and/or rendering tags; classifying each data type present in the received document; determining the optimal compression for each data type present, which may include passing compressed data through without further compression; and, from the determination of optimal compression, generating a decompression instruction stream, useful in decompressing the document, that includes decompression instructions and document data. According to still another aspect of the present invention, a data structure is provided for describing a compressed document that includes unknown combinations of a plurality of data types, including combinations of scanned data, computer-rendered data, compressed data and/or rendering tags, comprising: segregation of the data according to the compression methods applied to it; and segregation of the data into block-wise and band-wise portions of the document, whereby each block portion and each band portion of the document can be decompressed without reference to any other block or band, respectively.
These and other aspects of the invention will become apparent from the following description, which illustrates a preferred embodiment of the invention, taken in conjunction with the accompanying drawings, in which:
Figure 1 illustrates the compression method of the invention;
Figure 2 illustrates the data structure of the invention, generated by the compression method of Figure 1, and its use in decompression;
Figures 3 and 4 illustrate the independence of the bands and blocks and its advantage with respect to the compression of multiple types of data;
Figures 5 and 6 illustrate, respectively, how the described compression and decompression methods of the invention can be time multiplexed, so that a plurality of segments can be compressed independently;
Figure 7 shows a system in which the present invention can find application.
Referring now to the drawings, where the showings are for the purpose of describing an embodiment of the invention and not for limiting it, Figure 7 illustrates a "digital front end" or DFE, a printer controller 10 that controls the flow of documents and images to a printer 11. It is commonly connected to a network 12 via a network connection 14. The printer controller includes a processing unit (CPU) 20, which could be a computer workstation or a computer/control processor. The CPU has access to a document memory 22, which may be integrated with the controller 10 or separate. The CPU 20 has associated with it a number of PCI cards which, according to the invention, provide the communication interface to external devices, including, in this case, the printer 11 and the memory 22. The PCI cards include the compression system of the invention 30 and the decompression systems 32, 34, 36 and 38 described herein. In general, data describing a document received at the CPU 20 or the memory 22 is directed through the compression system 30 for storage or processing.
Data describing the document stored in the memory 22 can be directed through the decompression systems 32, 34, 36 and 38, each directed to one separation channel of a printer. Of course, there are many ways to provide the same capabilities using the present invention, and this example is only one possible choice.
Figure 1 shows a functional diagram of the invention. The inputs to the compression system 90 are control tags 100, rendering tags 105, and contone raster data 110. The data is organized as a band of multiple scan lines (8, for example). The input raster data 110 may be precompressed as part of the raster generation step of a printing system. If this compression satisfies the compression goals for a band, then the compressed band is sent directly to the packer 198 via the bypass path 108 to be formatted into the output stream. Support for this type of bypass operation in the decompression step is a feature of this invention. Where the precompressed data does not satisfy the compression goals, raster-to-block conversion and analysis are performed in block 120. The control tags at 100 contain information about the classification of the raster pixel data. The control information can be obtained from structured, high-level descriptions of the document which, when parsed, can identify areas of the document as computer rendered or scanned. This information controls only one type of compression in this invention: the selection of lossless compression. The invention takes advantage of these control tags to direct the compression of a data block, but does not require the control information in order to compress the block successfully. In this way, the invention handles a wide range of methods for generating image data, which may or may not supply information about the image data useful for selecting compression methods.
In this case, the input raster data may be preseparated to simplify the use of more than one compression method in a single block. Compression of the rendering tags 105 is also provided by the invention. The rendering tags are an additional raster plane, or several bits per pixel, that accompanies the contone data and characterizes the type of object to which each pixel belongs. Rendering tags carry information so that the best rendering decisions can be made at print or display time.
The raster-to-block analyzer 120 is a raster-based processor that performs several processes to gather information about the raster data. To simplify implementation, the analyzer has been restricted to working on fixed-size blocks of pixels. Variable-size blocks are also possible. The classifier/separator 150 takes control information 140 and raster information 130 from the analyzer in block format. On the basis of this information, the classifier/separator 150 executes a decision tree that selects the best compression method for the class of raster data. The classifier/separator also separates the raster data into data types 162, 163, 164 and delivers the raster data to one of several compression processors 310, 311 and 312, respectively, each of which implements a compression process optimized for its data type. So that this multi-channel data stream can be decompressed, a map channel 170 is generated from the control information 160 received from the classifier/separator 150. The map channel generates information signals 188 indicating, for each block, which compression method was used and, where several methods were used, a pixel-by-pixel map that can be used to recombine the block. There are a large number of analyzer and classifier systems that could be employed in the present invention. The analyzer and classifier systems described in U.S. Patent Application Serial No.
09/562,143 to Zeck et al., filed May 1, 2000, may be used, and are incorporated herein by reference for their teachings. Some unique features of this part of the invention are the block independence of analysis and classification, and the ability to use multiple compression methods within a block (i.e., a plurality of image types within a single block). This feature of the invention allows image quality to be controlled by selecting different compression methods, each with a different degree of information loss, based on the analysis and classification. In addition to controlling image quality in the case of lossy compression methods, the analyzer and classifier processes are designed so that each block can be processed independently. This allows parallelism in the implementation of these two functions. A third unique feature is the ability to compress a block with more than one compression method and then select among and recombine the results in the decompression operation. In this way, one portion of the block can be compressed losslessly and another portion lossily. One aspect of this invention allows blocks containing a plurality of data types to be represented as separate blocks that have not been merged into one block, as would generally occur in a printing system such as that described in Figure 7. This technique also solves the problem of how to handle the pixels in the lossy block that are to be compressed with the lossless method. With transform methods, removing pixels from the block creates an edge that will affect image quality by generating "ringing" at the edge. This method allows the pixels in the block to remain, avoiding the creation of the artificial edge and maintaining the overall structure of the image. This natural structure will generally compress well with transform coding.
The map channel 170 carries additional information that must be created and passed to the decompressor so that the data can be recombined correctly. The information in the map channel is often highly correlated, so that a dictionary-based compression process will achieve good compression on this channel. Additionally, the map channel encodes constant blocks, eliminating the need to compress those blocks with one of the compression methods. This yields good compression and, in most cases, improves the performance of the implementation, since the constant blocks need not be compressed. A constant block occurs when a block has a uniform value throughout, such as a single color. The compression method M190 for the map channel 170 would be a lossless compression method, such as LZ1. Loss in this channel would prevent the decompression implementation from working properly.
The last step in the process is to pack the compressed data stream in the packer 198. According to another aspect of the invention, the packer takes the output of each compressor 191, 192, 193, 194 and the bypass 195 and organizes the data into a "packet" 200. This packet contains the compressed data for one band of blocks, in a format that is self-contained and independent of adjacent blocks of data, efficiently separated into compact elements such as compressed data, tags and maps. The packet is formatted as shown at packet 200, with a type field, a length field and the compressed data. Other format structures that store a type and length field could also be used, so that the packet can easily be separated into its components in the decompressor. A unique feature of this method is band independence. Each band can be compressed separately and decompressed separately. This feature of the invention allows scalable performance by replicating compressor implementations, so that each instance operates on one band, as in Figure 2.
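The type-and-length packet layout described above can be sketched in Python as follows. The text does not give field widths or type codes, so the one-byte type, the four-byte big-endian length, the example type codes and the function names are all assumptions made for illustration.

```python
import struct

# Hypothetical segment type codes; the document does not define them.
MAP, LOSSLESS, LOSSY, TAGS = 0, 1, 2, 3

_HEADER = ">BI"  # assumed layout: 1-byte type, 4-byte big-endian length
_HEADER_SIZE = struct.calcsize(_HEADER)


def pack_band(segments):
    """Concatenate (type_code, payload) pairs for one band into a packet."""
    out = bytearray()
    for type_code, payload in segments:
        out += struct.pack(_HEADER, type_code, len(payload))
        out += payload
    return bytes(out)


def parse_band(packet):
    """Split a packet back into its (type_code, payload) segments."""
    segments, pos = [], 0
    while pos < len(packet):
        type_code, length = struct.unpack_from(_HEADER, packet, pos)
        pos += _HEADER_SIZE
        segments.append((type_code, packet[pos:pos + length]))
        pos += length
    return segments
```

Because each packet carries everything needed for its own band, a parser of this kind can hand segments to the matching decompressors without consulting neighboring packets, which is the band independence the text relies on.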
Decompression is achieved by using the type and length fields in the packet 200 to separate the compressed data. The map data is decompressed and used to direct the recombination of the blocks. The compression method of the invention is optimized for continuous-tone images, typically eight bits/pixel/color. However, binary data can also be compressed by using the map channel information 170 to select pixels from one of two compression methods. Each compression method compresses a black or a white block. By selecting between the two methods, binary as well as contone data can be compressed with this method. Most compression methods that are designed to work with contone data operate well on binary pixel data. In this way, this invention can be used for contone and binary pixel data. Since the two compression methods are constant-block compression, the compression is high.
Figure 2 shows the method used for decompression. From this figure, it is clear how the map channel information is key to telling the decompression step how to reassemble the blocks of the packet 200. The design of the data structure of the packet 200, passing the data through channel 201, allows the parser 202 to separate the different compressed segments and deliver them to the respective decompression methods 203, 204, 205, 206. The map channel is decompressed by a lossless decompression method 207, which is the inverse of the method used in the compression implementation. This information is delivered on line 211 to the merger 215. The merger can then determine, on a block-by-block basis via lines 212, 213, 214, which of the compression methods to choose in order to regenerate the raster band. In the case of mixed blocks, the map channel contains a pixel-by-pixel selector for the block that allows the merger to merge the two blocks together. In the case of constant blocks, the map channel contains all the information the merger needs to regenerate the block.
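The recombination at 215 for mixed and constant blocks can be sketched as below. This is a minimal illustration under stated assumptions: the selector convention (0 picks the losslessly coded pixel, 1 the lossy one), the list-of-rows block representation and both function names are hypothetical and are not taken from the document.

```python
def merge_block(selector, lossless_block, lossy_block):
    """Rebuild a mixed block from two decompressed versions of it.

    selector holds one entry per pixel: 0 takes the pixel from
    lossless_block, anything else takes it from lossy_block
    (this encoding of the selector is an assumption).
    """
    return [
        [ll if s == 0 else ly for s, ll, ly in zip(srow, lrow, yrow)]
        for srow, lrow, yrow in zip(selector, lossless_block, lossy_block)
    ]


def constant_block(value, size=8):
    """Regenerate a constant block from the single value in the map channel."""
    return [[value] * size for _ in range(size)]
```

Note that both input blocks are full size: the lossy block keeps its unselected pixels rather than having them cut out, consistent with the earlier point about avoiding an artificial edge.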
The merger 215 can also convert block-ordered data back to raster order, depending on the compression method. In the case of precompressed bands, which would pass through the copy function 185 of Figure 1, the data need not be block ordered. For example, run-length compression is a preferred embodiment of this aspect of the invention and is not always block ordered. The merger directs the decompressed data to output 216, for example to memory, to a printing device as in Figure 7, or to additional processing.
Figure 3 shows a unique feature of this method that allows parallel implementation to scale performance. Bands of scan lines 300, each containing a number of scan lines equal to the height of the block, are given to replicated instances of the compression system of the invention 310, 311, 312. The bands are processed in an order such that bands 301, 302, 303 are given one to each instance, and then bands 304, 305, 306 are given to each instance again when the first bands are completed, for sequential processing of a page. In this example three instances of the compression system of the invention were chosen, but the invention scales to any number of instances. Likewise, the output stream is produced by the compression systems 310, 311 and 312, respectively, in the order 321, 322, 323, repeating with 324, 325, 326. The output stream is assembled in order 330, so that the decompressor can restore the bands in sequence. As an example, this type of parallelism could be implemented entirely in hardware or with a multiprocessor system, each processor implementing one instance of the invention.
Figure 4 shows how decompression can be scaled in the same way. In this case the compressed packets 330 are processed in the order 401, 402, 403 by the decompressor systems 410, 411 and 412. When the first packets have been decompressed, another set 404, 405, 406 is given to the decompressor systems.
The resulting bands of scan lines are stored in order at the output of the system in the format 300. In this example, three instances of the decompression system of the invention were chosen, but the invention scales to any number of instances. Likewise, the output stream is produced by the decompression systems 410, 411 and 412, respectively, in the order 421, 422, 423, repeating with 424, 425, 426. The output stream is assembled in the order 300.
For one possible embodiment, and with reference to Figure 1, the input data 110 is contone, at 8 bits per pixel, and is organized by the raster-to-block analyzer 120 into blocks of 8 x 8 pixels. The control 100 is a single bit that indicates whether a computer graphics process has rendered the data 110 or whether the data comes from an array of light-sensitive sensors, such as a digital camera or a scanning device. Multiple control signals are possible, to provide whatever identification is required. Pixel data that is computer rendered bypasses the analyzer, is classified by the control bit alone, and is assigned for compression by a lossless compression process 310. One of two processes is used in block 310: run length or the LZ1 dictionary compression method. For data received in compressed form, if the compression goals are satisfied, the run-length data is simply copied to the output packer 198 through the NULL compression method 185. Data that is not computer rendered, or that has an unknown source, is delivered to the analyzer 120. The classifier/separator 150 decides, based on the information from the analyzer, whether the block should be sent to the lossless LZ1 compressor, sent to one of several JPEG compressors, or handled as a mixed block. Several JPEG compressors with different quantization tables are used to allow better control of image quality and compression.
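The block organization used in this embodiment, a band of eight scan lines cut into 8 x 8 blocks, together with the constant-block test mentioned earlier, can be sketched as follows. The function names and the list-of-rows representation are assumptions, and the sketch assumes a band whose width is a multiple of the block size.

```python
def band_to_blocks(band, size=8):
    """Split a band (list of `size` scan lines) into size x size blocks.

    Blocks are returned left to right; the band width is assumed to be
    a multiple of `size` (padding is outside the scope of this sketch).
    """
    width = len(band[0])
    return [[row[x:x + size] for row in band] for x in range(0, width, size)]


def is_constant(block):
    """True when every pixel in the block has the same value."""
    first = block[0][0]
    return all(pixel == first for row in block for pixel in row)
```

A block that passes is_constant needs only its single value recorded in the map channel; any other block goes on to the classifier for a choice among the lossless, JPEG, or mixed paths.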
The mixed blocks, where the input data containing at least two data types in 110 that have not been fused into a single block, are sent to two compression methods and the map channel contains a pixel selector per pixel to indicate how to mount the block in the decompressor. The map channel is compressed by an LZl compression method as well. -. Figure 5 shows how the compression method can be multiplexed by time, so that a plurality of document segments can be compressed independently. Each document segment can be a portion of a page, a color separation or a complete page of a document for example. The bands i, 2 ... 6 of a document segment 300 are directed to the compression system 310. The output of • the compression for this band is stored in the list of packages 330. Next the band of another segment of '. document 300A is given to compression method 310. The output of the compression operation is stored in package list 330A. This process can be repeated for all document segments. This feature is useful where there are multiple sources that generate the bands for multiple pages and this process is slower than the implementation of the compression. An example This is found in a printing system where multiple processors are used to transform several document segments. It can take a long time for each segment to generate such a thing if the implementation of the compression method were operating only on one document segment, the resources of the implementation of the compression would be underutilized. Any sequencing of the input bands can be supported as long as the order of entry and exit is maintained. Figure 6 shows how the same method of Figure 5 can be applied to decompression. In this case the single decompressor is given packets 1 ... 6 of compressed data from a plurality of lists of compressed packets 330 and 330A. The first pack of 330 is decompressed by the decompression system 410 and the output is stored in the band store 300. 
The first packet of 330A is then delivered to the decompressor and the output is stored in the band store 300A. The process is repeated. This feature of the invention applies where the decompression implementation is faster than the process that consumes the output bands 300, 300A. It allows the resources of a single decompression implementation to be used to decompress multiple document segments. Each document segment can be, for example, a portion of a page, a color separation or a complete page of a document. Any sequencing of the compressed input packets can be supported as long as the input and output order is synchronized and the outputs are stored in the corresponding order.
The described method can be readily implemented in software using object-oriented software development environments that provide portable source code that can be used on a variety of computer or workstation hardware platforms. Alternatively, the described image processing system may be implemented partially or fully in hardware using standard logic circuits or, specifically, on a single chip using VLSI design. Whether software or hardware, or a combination thereof, is used to implement the system varies depending on the speed and efficiency requirements of the system, and also on the particular function and the particular software or hardware systems and the particular microprocessor or microcomputer systems being utilized. The image processing system, however, can be readily developed by those skilled in the applicable art without undue experimentation from the functional description provided herein, together with a general knowledge of the computer arts.
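As one illustration of the time multiplexing described for Figures 5 and 6, the sketch below shares a single compressor and a single decompressor among several document segments in round-robin order. It is a sketch under stated assumptions: `zlib` (DEFLATE) merely stands in for the compression system 310 and the decompression system 410, and the function names and the dictionary layout are hypothetical.

```python
import zlib


def compress_interleaved(segments):
    """Compress bands from several segments with one shared compressor.

    segments: dict mapping a segment name to its list of bands (bytes).
    Returns a dict mapping each name to its own ordered packet list.
    zlib is a stand-in here, not the compressor described in the text.
    """
    packet_lists = {name: [] for name in segments}
    longest = max((len(bands) for bands in segments.values()), default=0)
    # Round-robin: band i of every segment is handled before band i + 1.
    for i in range(longest):
        for name, bands in segments.items():
            if i < len(bands):
                # Each band is compressed on its own, so any packet can be
                # decompressed without reference to another band.
                packet_lists[name].append(zlib.compress(bands[i]))
    return packet_lists


def decompress_interleaved(packet_lists):
    """Decompress packets from several lists with one shared decompressor.

    Output bands are appended to the store of the segment they came from,
    so the per-segment order of input and output stays the same.
    """
    band_stores = {name: [] for name in packet_lists}
    longest = max((len(p) for p in packet_lists.values()), default=0)
    for i in range(longest):
        for name, packets in packet_lists.items():
            if i < len(packets):
                band_stores[name].append(zlib.decompress(packets[i]))
    return band_stores
```

Round-robin is only one possible sequencing; any other interleaving works equally well provided each packet is appended to the list of the segment it belongs to.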
It is noted that, as of this date, the best method known to the applicant for carrying out the aforementioned invention is that which is clear from the present description of the invention.

Claims

CLAIMS
Having described the invention as above, the content of the following claims is claimed as property:
1. A method for compressing a received document, characterized in that it comprises: receiving documents containing unknown combinations of a plurality of data types, including combinations of scanned data, computer-rendered data, compressed data and/or rendering tags; dividing the received image into bands of blocks; determining, from the image itself, which data types are present in each block; and compressing the data of each data type present in each block with a compression method optimized for its data type.
2. The method according to claim 1, characterized in that the scanned data is further segmented into a plurality of scanned data types, and each data type is compressed in the data compression step with a compression method optimized for said type of scanned image data.
3. The method according to claim 1, characterized in that, for received compressed data, it includes determining a compression ratio thereof, and accepting the compressed data for use as is, or decompressing and recompressing the data, based on the acceptability of said compression ratio determination.
4. A method for compressing received documents, characterized in that it includes: receiving documents containing unknown combinations of a plurality of data types, including combinations of scanned data, computer-rendered data, compressed data and/or rendering tags; classifying each data type present in the received document; determining the optimal compression for each data type present, which may include a no-compression pass-through step for compressed data; and, from the optimal compression determination, generating a stream of decompression instructions, useful in decompressing the document, that includes decompression instructions and document data.
5. A data structure for describing a compressed document that includes unknown combinations of a plurality of data types, including combinations of scanned data, computer-rendered data, compressed data and/or rendering tags, characterized in that it comprises: the segregation of data according to the compression methods thereof; and the segregation of data into document portions in the form of independent blocks and bands, whereby each block-shaped document portion and each band-shaped document portion can be compressed or decompressed without reference to any other block or band, respectively.

SUMMARY OF THE INVENTION

A method and apparatus for compressing and decompressing electronic documents, with maximum intradocument independence and maximum flexibility in the optimization of compression modes. The method includes receiving documents containing unknown combinations of a plurality of data types, including combinations of scanned data, computer-rendered data, compressed data and/or rendering tags; dividing the received image into bands of blocks; determining, from the image itself, which data types are present in each block; and compressing the data of each data type present in each block with a compression method optimized for its data type. The scanned data can be further segmented into a plurality of scanned data types, where each data type is compressed in the data compression step with a compression method optimized for that type of scanned image data. If the type of data received is compressed data, the process may include the additional functions of determining a compression ratio thereof, and accepting the compressed data for use as such, or decompressing and recompressing the data, based on the acceptability of the compression ratio determination. A set of instructions is generated that allows detailed decompression instruction data and image data to be combined with the transmitted compressed data.
A data structure is shown, which segregates data types and instruction data, and allows for block-to-block and band-to-band processing independence.