CN102289497A - Document preview image generating system and method - Google Patents

Document preview image generating system and method

Info

Publication number
CN102289497A
Authority
CN
China
Prior art keywords
document
image
palette
octree
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011102418973A
Other languages
Chinese (zh)
Inventor
金可伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI MEIHUA INFORMATION CO Ltd
Original Assignee
SHANGHAI MEIHUA INFORMATION CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI MEIHUA INFORMATION CO Ltd filed Critical SHANGHAI MEIHUA INFORMATION CO Ltd
Priority to CN2011102418973A priority Critical patent/CN102289497A/en
Publication of CN102289497A publication Critical patent/CN102289497A/en
Pending legal-status Critical Current

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a document preview image generating system and method. The system comprises a unified document conversion module and an image generation module. The unified document conversion module analyzes various document formats and converts them all to PDF. The image generation module extracts the PDF document properties and analyzes the number of pages and the page size; it generates an in-memory image using n-bit image technology and optimizes the palette of the n-bit color bitmap, reducing it to an m-bit color image, where n is larger than m. The invention effectively makes up for the deficiencies of online document browsing and suits browsers or operating system platforms that are incompatible with Flash-based browsing. The invention can also provide complete thumbnail previews.

Description

Document preview image generating system and method
Technical field
The invention belongs to the technical field of image processing and relates to an image generating system, in particular to a document preview image generating system; the invention also relates to a document preview image generating method.
Background
In the Internet era, many traditional client application technologies, such as customer relationship management and office management systems, have moved to the Internet, and most have adopted the SaaS (Software-as-a-Service) design model.
At present, the usual way to browse electronic documents such as PowerPoint, Word, TXT, and PDF files is for the computer user to install document reading software and open the file with it. In addition, some free document sharing websites offer online reading: the document does not need to be downloaded and is read directly in the browser, which is convenient and has changed the previous way of handling and reading documents.
Most document sharing websites use a Flash plug-in for reading documents. However, the Flash plug-in has several problems in use: 1) system compatibility, especially on mobile devices; 2) plug-in security; 3) the plug-in must be downloaded and installed.
Furthermore, most implementations of the two reading modes above do not offer a complete thumbnail preview; in most cases a preview is available only once the file has been opened. This is a shortcoming for a document that is meant to convey complete information.
Summary of the invention
The technical problem to be solved by the invention is to provide a document preview image generating system that effectively makes up for the deficiencies of online document browsing and suits browsers or operating system platforms that are incompatible with Flash-based browsing; the invention can also provide complete thumbnail previews.
In addition, the invention provides a document preview image generating method that effectively makes up for the deficiencies of online document browsing and suits browsers or operating system platforms that are incompatible with Flash-based browsing; the method can also provide complete thumbnail previews.
To solve the above technical problem, the invention adopts the following technical solution:
A document preview image generating system, comprising a unified document conversion module and an image generation module;
The unified document conversion module analyzes various document formats and converts all documents to PDF; it comprises an Excel conversion module, a PowerPoint conversion module, a Word conversion module, and a Txt conversion module;
The Excel conversion module converts all spreadsheet content into a PDF document: it reads the Excel content into memory through the open Office interface, then uses an Office plug-in module to save the workbook as a PDF document;
The PowerPoint conversion module converts presentation content into a PDF document: it reads the content of every slide into memory through the open Office interface, then uses an Office plug-in module to save the presentation as a PDF document;
The Word conversion module converts document content into a PDF document: it reads the Word document content into memory through the open Office interface, then uses an Office plug-in module to save the Word document as a PDF document;
The Txt conversion module converts plain-text (Notepad) content into a PDF document: it reads the Txt file content into memory through the open Office interface, then uses an Office plug-in module to save the Txt file as a PDF document;
The image generation module extracts the PDF document properties, analyzes the number of pages and the page dimensions, generates an in-memory image using 32-bit image technology, and applies the octree algorithm to optimize the palette of the 32-bit color bitmap, reducing it to an 8-bit color GIF; the image generation module comprises an image analysis module, an image conversion module, and an image generation submodule;
The image analysis module uses a plug-in to parse the PDF document content, counts the pages, computes the size of each page and determines its coordinate position, and copies the page into memory for the subsequent image conversion operation;
The image conversion module provides the octree algorithm for optimizing 32-bit bitmaps. The algorithm reduces the palette of a 32-bit color Bitmap to that of an 8-bit color GIF so that the generated image is lossless and its information complete. The octree algorithm finds, among the true colors, the 256 colors that represent the whole image and builds the palette from them. It has three main steps: 1) build the octree: each octree node has at most 8 child nodes, numbered 0-7; the tree is built from the RGB values by first creating the root node (Root), then combining the corresponding bits of R, G, and B at each bit position into a value from 0 to 7 and inserting these values into the tree level by level; 2) extract the palette: once the octree is built, take the average of the RGB components in each leaf node (component average = component sum / node count), which gives the palette color values; 3) match palette indices: for each original RGB value, find the index of the closest color in the palette;
The image generation submodule writes the result produced by the image conversion module to an image file.
A document preview image generating system, comprising a unified document conversion module and an image generation module;
The unified document conversion module analyzes various document formats and converts all documents to PDF;
The image generation module extracts the PDF document properties and analyzes the number of pages and the page dimensions; it generates an in-memory image using n-bit image technology and optimizes the palette of the n-bit color bitmap, reducing it to an m-bit color image, where n > m.
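As a rough illustration of what reducing n bits to m bits implies, the sketch below compares the raw pixel storage of an n-bit bitmap with that of an m-bit palette-indexed image. The function names, the thumbnail size, and the 3-byte RGB palette entry are illustrative assumptions rather than anything the patent specifies, and the figures ignore any compression applied by the output file format.

```python
def palette_size(bits: int) -> int:
    """Number of distinct colors addressable with `bits` bits per pixel."""
    return 1 << bits


def pixel_bytes(width: int, height: int, bits: int) -> int:
    """Bytes needed for the uncompressed pixel data, rounded up to a whole byte."""
    return (width * height * bits + 7) // 8


n, m = 32, 8               # bit depths named in the preferred embodiment
width, height = 200, 283   # hypothetical thumbnail size in pixels

before = pixel_bytes(width, height, n)
# An indexed image stores one m-bit index per pixel plus the palette itself
# (assumed here to hold 3 bytes, R, G and B, per entry).
after = pixel_bytes(width, height, m) + palette_size(m) * 3

print(palette_size(m), before, after)
```

With n = 32 and m = 8 the palette can address 256 colors, which is why the preferred embodiment speaks of finding 256 representative colors.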
As a preferred embodiment of the invention, the unified document conversion module comprises one or more of an Excel conversion module, a PowerPoint conversion module, a Word conversion module, and a Txt conversion module;
The Excel conversion module converts all spreadsheet content into a PDF document: it reads the Excel content into memory through the open Office interface, then uses an Office plug-in module to save the workbook as a PDF document;
The PowerPoint conversion module converts presentation content into a PDF document: it reads the content of every slide into memory through the open Office interface, then uses an Office plug-in module to save the presentation as a PDF document;
The Word conversion module converts document content into a PDF document: it reads the Word document content into memory through the open Office interface, then uses an Office plug-in module to save the Word document as a PDF document;
The Txt conversion module converts plain-text (Notepad) content into a PDF document: it reads the Txt file content into memory through the open Office interface, then uses an Office plug-in module to save the Txt file as a PDF document.
As a preferred embodiment of the invention, the image generation module generates an in-memory image using 32-bit image technology and applies the octree algorithm to optimize the palette of the 32-bit color bitmap, reducing it to an 8-bit color GIF; the image generation module comprises an image analysis module, an image conversion module, and an image generation submodule;
The image analysis module uses a plug-in to parse the PDF document content, counts the pages, computes the size of each page and determines its coordinate position, and copies the page into memory for the subsequent image conversion operation;
The image conversion module provides the octree algorithm for optimizing 32-bit bitmaps. The algorithm reduces the palette of a 32-bit color Bitmap to that of an 8-bit color GIF so that the generated image is lossless and its information complete. The octree algorithm finds, among the true colors, the 256 colors that represent the whole image and builds the palette from them;
The image generation submodule writes the result produced by the image conversion module to an image file.
As a preferred embodiment of the invention, the conversion method of the image conversion module comprises the following steps:
1) Build the octree. Each octree node has at most 8 child nodes, numbered 0-7. The tree is built from the RGB values: the root node (Root) is created first, then the corresponding bits of R, G, and B at each bit position are combined into a value from 0 to 7, and these values are inserted into the tree level by level;
2) Extract the palette. Once the octree is built, take the average of the RGB components in each leaf node (component average = component sum / node count); the averages are the palette color values;
3) Match palette indices: for each original RGB value, find the index of the closest color in the palette.
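Step 1 can be made concrete with a small sketch of how one RGB value yields a sequence of child indices, one per tree level. The function name `octree_path` and the choice of R as the high bit, G as the middle bit, and B as the low bit are assumptions for illustration; the text only says that the bits of R, G, and B together form a value from 0 to 7.

```python
def octree_path(r: int, g: int, b: int, depth: int = 8) -> list:
    """Return the child index (0-7) chosen at each level for one 8-bit RGB color.

    Level 0 uses the most significant bit of each component, level 1 the
    next bit, and so on. Each index packs one bit from R, G and B.
    """
    path = []
    for level in range(depth):
        shift = 7 - level
        index = (
            (((r >> shift) & 1) << 2)    # assumed: R supplies the high bit
            | (((g >> shift) & 1) << 1)  # assumed: G supplies the middle bit
            | ((b >> shift) & 1)         # assumed: B supplies the low bit
        )
        path.append(index)
    return path


# R = 10000000b, G = 01000000b, B = 00100000b
print(octree_path(128, 64, 32, depth=3))  # [4, 2, 1]
```

Inserting a color means starting at the root and following these indices downward, creating child nodes as needed.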
A document preview image generating method, comprising the following steps:
Step S1, unified document conversion: the various mainstream document formats are analyzed and all documents are converted to PDF. Depending on the document format, proceed to step S11, S12, S13, or S14:
S11, Excel conversion: all spreadsheet content is converted into a PDF document. The Excel content is read into memory through the open Office interface, and an Office plug-in module saves the workbook as a PDF document;
S12, PowerPoint conversion: the presentation content is converted into a PDF document. The content of every slide is read into memory through the open Office interface, and an Office plug-in module saves the presentation as a PDF document;
S13, Word conversion: the document content is converted into a PDF document. The Word document content is read into memory through the open Office interface, and an Office plug-in module saves the Word document as a PDF document;
S14, Txt conversion: the plain-text (Notepad) content is converted into a PDF document. The Txt file content is read into memory through the open Office interface, and an Office plug-in module saves the Txt file as a PDF document;
Step S2, image generation: the PDF document properties are extracted, the number of pages and the page dimensions are analyzed, an in-memory image is generated using 32-bit image technology, and the octree algorithm optimizes the palette of the 32-bit color bitmap, reducing it to an 8-bit color GIF. Step S2 comprises the following process:
S21, image analysis: a plug-in parses the PDF document content, counts the pages, computes the size of each page and determines its coordinate position, and copies the page into memory for the subsequent image conversion operation;
S22, image conversion: the octree algorithm for optimizing 32-bit bitmaps is provided. The algorithm reduces the palette of a 32-bit color Bitmap to that of an 8-bit color GIF so that the generated image is lossless and its information complete. The octree algorithm finds, among the true colors, the 256 colors that represent the whole image and builds the palette from them. It mainly comprises three steps: S221) build the octree: each octree node has at most 8 child nodes, numbered 0-7; the tree is built from the RGB values by first creating the root node (Root), then combining the corresponding bits of R, G, and B at each bit position into a value from 0 to 7 and inserting these values into the tree level by level; S222) extract the palette: once the octree is built, take the average of the RGB components in each leaf node (component average = component sum / node count), which gives the palette color values; S223) match palette indices: for each original RGB value, find the index of the closest color in the palette;
S23, image generation: the result of the image conversion is written to an image file.
A document preview image generating method, comprising the following steps:
Step S1: the various mainstream document formats are analyzed and all documents are converted to PDF;
Step S2, image generation: the PDF document properties are extracted and the number of pages and the page dimensions are analyzed; an in-memory image is generated using n-bit image technology, and the palette of the n-bit color bitmap is optimized, reducing it to an m-bit color image, where n > m.
As a preferred embodiment of the invention, step S1 proceeds to step S11, S12, S13, or S14 depending on the document format; if the document is already a PDF, no conversion is needed:
S11, Excel conversion: all spreadsheet content is converted into a PDF document. The Excel content is read into memory through the open Office interface, and an Office plug-in module saves the workbook as a PDF document;
S12, PowerPoint conversion: the presentation content is converted into a PDF document. The content of every slide is read into memory through the open Office interface, and an Office plug-in module saves the presentation as a PDF document;
S13, Word conversion: the document content is converted into a PDF document. The Word document content is read into memory through the open Office interface, and an Office plug-in module saves the Word document as a PDF document;
S14, Txt conversion: the plain-text (Notepad) content is converted into a PDF document. The Txt file content is read into memory through the open Office interface, and an Office plug-in module saves the Txt file as a PDF document.
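The format dispatch in step S1 might look like the following sketch. The extension-to-step table and the function name `select_conversion_step` are hypothetical; the patent names the four document types but does not list file extensions or say how the format is detected.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the conversion step in the text.
STEP_BY_EXTENSION = {
    ".xls": "S11", ".xlsx": "S11",   # Excel
    ".ppt": "S12", ".pptx": "S12",   # PowerPoint
    ".doc": "S13", ".docx": "S13",   # Word
    ".txt": "S14",                   # plain text
}


def select_conversion_step(filename: str):
    """Return the conversion step for `filename`, or None if it is already a PDF."""
    ext = Path(filename).suffix.lower()
    if ext == ".pdf":
        return None  # already PDF: no conversion needed
    try:
        return STEP_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"unsupported document format: {ext or filename!r}") from None


for name in ("budget.xlsx", "slides.PPT", "notes.txt", "paper.pdf"):
    print(name, "->", select_conversion_step(name))
```

Keeping the mapping in a table makes it straightforward to register further formats, which matches the stated goal of supporting extension to other document formats.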
As a preferred embodiment of the invention, in step S2 the PDF document properties are extracted, the number of pages and the page dimensions are analyzed, an in-memory image is generated using 32-bit image technology, and the octree algorithm optimizes the palette of the 32-bit color bitmap, reducing it to an 8-bit color GIF. Step S2 comprises the following process:
S21, image analysis: a plug-in parses the PDF document content, counts the pages, computes the size of each page and determines its coordinate position, and copies the page into memory for the subsequent image conversion operation;
S22, image conversion: the octree algorithm for optimizing 32-bit bitmaps is provided. The algorithm reduces the palette of a 32-bit color Bitmap to that of an 8-bit color GIF so that the generated image is lossless and its information complete. The octree algorithm finds, among the true colors, the 256 colors that represent the whole image and builds the palette from them;
S23, image generation: the result of the image conversion is written to an image file.
As a preferred embodiment of the invention, the image conversion step comprises the following process:
S221) Build the octree. Each octree node has at most 8 child nodes, numbered 0-7. The tree is built from the RGB values: the root node (Root) is created first, then the corresponding bits of R, G, and B at each bit position are combined into a value from 0 to 7, and these values are inserted into the tree level by level;
S222) Extract the palette. Once the octree is built, take the average of the RGB components in each leaf node (component average = component sum / node count); the averages are the palette color values;
S223) Match palette indices: for each original RGB value, find the index of the closest color in the palette.
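Steps S221 and S222 can be sketched as a small tree that accumulates component sums in its leaves and then averages them. This is a minimal illustration under stated assumptions: the `Node` class, the fixed insertion depth, and the R/G/B bit ordering are all hypothetical, and the sketch omits the node merging that a full quantizer would need to guarantee at most 256 leaves.

```python
class Node:
    """One octree node: up to 8 children plus running sums for its pixels."""
    __slots__ = ("children", "count", "r", "g", "b")

    def __init__(self):
        self.children = [None] * 8
        self.count = 0
        self.r = self.g = self.b = 0


def insert(root, color, depth):
    """Walk `depth` levels from the root and add `color` to the leaf reached."""
    r, g, b = color
    node = root
    for level in range(depth):
        shift = 7 - level
        i = (((r >> shift) & 1) << 2) | (((g >> shift) & 1) << 1) | ((b >> shift) & 1)
        if node.children[i] is None:
            node.children[i] = Node()
        node = node.children[i]
    node.count += 1
    node.r += r
    node.g += g
    node.b += b


def leaves(node):
    """Yield every node that has no children, in child-index order."""
    kids = [c for c in node.children if c is not None]
    if not kids:
        yield node
    for child in kids:
        yield from leaves(child)


def extract_palette(root):
    """Average the accumulated components of each leaf (sum / count)."""
    return [(n.r // n.count, n.g // n.count, n.b // n.count)
            for n in leaves(root) if n.count]


root = Node()
for c in [(255, 0, 0), (253, 0, 0), (0, 0, 255)]:
    insert(root, c, depth=2)
print(extract_palette(root))  # [(0, 0, 255), (254, 0, 0)]
```

The two near-identical reds share a leaf and are averaged into one palette entry, which is the effect the palette extraction step relies on.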
A document preview image generating method, comprising the following steps:
Step A, document classification: the document format is determined, and different formats are handled by different processing mechanisms;
Step B, conversion to PDF: the content of each document is read and the mainstream-format document is converted into a PDF document;
Step C, calculation and analysis of the PDF document, mainly the number of pages and the page size;
Step D, image conversion: a conversion algorithm is applied to improve the quality and the lossless rate of the saved bitmap;
Step E, image generation: the converted image is saved.
As a preferred embodiment of the invention, the conversion-to-PDF step uses the open Office interface and an Office plug-in to extract the document content and generate the PDF;
The image conversion step uses the octree algorithm to optimize the palette of the 32-bit color bitmap, reducing it to an 8-bit color GIF and improving the quality and precision of the generated image. It mainly comprises the following steps:
Step D1, build the octree. Each octree node has at most 8 child nodes, numbered 0-7. The tree is built from the RGB values: the root node (Root) is created first, then the corresponding bits of R, G, and B at each bit position are combined into a value from 0 to 7, and these values are inserted into the tree level by level;
Step D2, extract the palette: take the average of the RGB components in each leaf node (component average = component sum / node count) to obtain the palette color values;
Step D3, match palette indices: for each original RGB value, find the index of the closest color in the palette.
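Step D3 can be illustrated with a brute-force nearest-color search. The text does not define "closest"; squared Euclidean distance in RGB space is assumed here, and the function names are hypothetical.

```python
def nearest_index(color, palette):
    """Index of the palette entry closest to `color`.

    Closeness is measured by squared Euclidean distance in RGB space,
    which is an assumption; the source does not name a metric.
    """
    r, g, b = color
    return min(
        range(len(palette)),
        key=lambda i: (palette[i][0] - r) ** 2
        + (palette[i][1] - g) ** 2
        + (palette[i][2] - b) ** 2,
    )


def index_image(pixels, palette):
    """Replace every RGB pixel with the index of its nearest palette color."""
    return [nearest_index(p, palette) for p in pixels]


palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
print(index_image([(250, 10, 10), (20, 20, 20), (240, 240, 240)], palette))  # [2, 0, 1]
```

A linear scan costs one distance computation per palette entry for every pixel; an implementation could instead walk the octree to the matching leaf, but that refinement is not shown here.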
The beneficial effects of the invention are as follows. The proposed system and method for generating document preview images effectively make up for the deficiencies of online document browsing. For browsers or operating system platforms that are incompatible with Flash, they offer an effective solution, and they also remove the traditional need to install software and download a document before it can be read.
The invention can also be used on document sharing websites to generate lossless thumbnails of a document's front and back covers. Thumbnails let a document convey more complete information and style, encouraging users to download and read it.
In addition, the invention supports the mainstream document formats well and can be extended to other document formats.
Description of drawings
Fig. 1 is a schematic diagram of the composition of the document preview image generating system.
Fig. 2 is a flowchart of the document preview image generating method.
Fig. 3 is a flowchart of the lossless image algorithm.
Embodiment
The preferred embodiments of the invention are described in detail below with reference to the drawings.
Embodiment one
Referring to Fig. 1, the invention discloses a system for generating document preview images. The system mainly comprises a unified document conversion module 10 and an image generation module 20.
The unified document conversion module 10 distinguishes between the mainstream formats and mainly contains an Excel conversion module 11, a PowerPoint conversion module 12, a Word conversion module 13, and a Txt conversion module 14. The image generation module 20 comprises an image analysis module 21, an image conversion module 22, and an image generation module 23.
The unified document conversion module 10 analyzes the various mainstream document formats and converts them to a single format, PDF. PDF is a portable document format that is independent of the operating system platform. The mainstream formats comprise the Office formats (Excel spreadsheets, PowerPoint presentations, and Word documents) and Notepad Txt files.
The Excel conversion module 11 converts all spreadsheet content into a PDF document. It reads the Excel content into memory through the open Office interface, then uses an Office plug-in module to save the workbook as a PDF document.
The PowerPoint conversion module 12 converts presentation content into a PDF document. It reads the content of every slide into memory through the open Office interface, then uses an Office plug-in module to save the presentation as a PDF document.
The Word conversion module 13 converts document content into a PDF document. It reads the Word document content into memory through the open Office interface, then uses an Office plug-in module to save the Word document as a PDF document.
The Txt conversion module 14 converts plain-text (Notepad) content into a PDF document. It reads the Txt file content into memory through the open Office interface, then uses an Office plug-in module to save the Txt file as a PDF document.
In this embodiment, the image generation module 20 extracts the PDF document content, analyzes the number of pages and the page dimensions, generates an in-memory image using 32-bit image technology (other bit depths may of course be used), and applies the octree algorithm (an i-ary tree may of course be used instead, where i is a natural number greater than 1) to optimize the palette of the 32-bit color bitmap, reducing it to an 8-bit color GIF.
The image analysis module 21 uses a plug-in to parse the PDF document content, counts the pages, computes the size of each page and determines its coordinate position, and copies the page into memory for the subsequent image conversion operation.
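One calculation the page-size analysis makes possible is choosing thumbnail dimensions that preserve the page's aspect ratio. The sketch below is an assumption about how that could be done; the function name `fit_within` and the bounding-box approach are illustrative and do not come from the patent.

```python
def fit_within(page_w: float, page_h: float, max_w: int, max_h: int):
    """Scale a page to fit inside a max_w x max_h box, keeping its aspect ratio.

    Returns integer pixel dimensions, never smaller than 1 x 1.
    """
    if page_w <= 0 or page_h <= 0:
        raise ValueError("page dimensions must be positive")
    scale = min(max_w / page_w, max_h / page_h)
    return max(1, round(page_w * scale)), max(1, round(page_h * scale))


# An A4 page measured in PDF points (595 x 842) inside a 200 x 200 box.
print(fit_within(595, 842, 200, 200))  # (141, 200)
```

The same scale factor can be applied to any page coordinates that need to be mapped onto the thumbnail.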
The image conversion module 22 provides the octree algorithm for optimizing 32-bit bitmaps. The low quality of GIF images is caused by the palette: a Bitmap's internal palette defaults to 32-bit color, whereas a GIF palette supports at most 8-bit color, that is, 256 colors. Saving as GIF therefore involves a palette conversion, and the default conversion discards many colors, which lowers image quality. The octree algorithm instead reduces the 32-bit color Bitmap palette to an 8-bit color GIF palette in a way that essentially keeps the generated image lossless and its information complete. The algorithm finds, among the true colors, the 256 colors that represent the whole image and builds the palette from them. It has three main steps: 1) build the octree: each octree node has at most 8 child nodes, numbered 0-7; the tree is built from the RGB values by first creating the root node (Root), then combining the corresponding bits of R, G, and B at each bit position into a value from 0 to 7 and inserting these values into the tree level by level; 2) extract the palette: once the octree is built, take the average (component sum / node count) of the RGB components in each leaf node, which gives the palette color values; 3) match palette indices: for each original RGB value, find the index of the closest color in the palette.
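To show how a 256-color limit might be enforced end to end, the sketch below groups pixels by the first `depth` bits of each component, which is equivalent to truncating every octree path at that depth, and lowers the depth until at most `max_colors` groups remain. This uniform pruning is a deliberate simplification and an assumption: the patent does not say how the tree is reduced, and typical octree quantizers merge individual nodes instead. Each pixel takes the index of the group it fell into rather than running a separate nearest-color search.

```python
def quantize(pixels, max_colors=256):
    """Reduce RGB pixels to a palette of at most `max_colors` entries plus indices.

    Simplified octree-style quantization: pixels sharing the top `depth`
    bits of R, G and B land in the same leaf; `depth` shrinks until the
    leaf count fits. Requires max_colors >= 8, because depth 1 can still
    produce up to 8 leaves.
    """
    if max_colors < 8:
        raise ValueError("max_colors must be at least 8 for this sketch")
    for depth in range(8, 0, -1):
        shift = 8 - depth
        buckets = {}  # leaf key -> [sum_r, sum_g, sum_b, count]
        for r, g, b in pixels:
            acc = buckets.setdefault((r >> shift, g >> shift, b >> shift), [0, 0, 0, 0])
            acc[0] += r
            acc[1] += g
            acc[2] += b
            acc[3] += 1
        if len(buckets) <= max_colors:
            break
    keys = list(buckets)
    # Palette color = average of the components accumulated in each leaf.
    palette = [tuple(buckets[k][c] // buckets[k][3] for c in range(3)) for k in keys]
    index_of = {k: i for i, k in enumerate(keys)}
    indices = [index_of[(r >> shift, g >> shift, b >> shift)] for r, g, b in pixels]
    return palette, indices


pixels = [(i, i, i) for i in range(256)] + [(i, 0, 0) for i in range(256)]
palette, indices = quantize(pixels)
print(len(set(pixels)), "colors reduced to", len(palette))
```

In this example 511 distinct input colors exceed the limit at full depth, so the sketch drops one level and averages neighboring shades together.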
The image generation module 23 writes the result of the image conversion to an image file.
The document preview image generating system of the invention has been described above. Along with the system, the invention also discloses a method for generating document preview images; referring to the method flowchart in Fig. 2, the method comprises the following steps:
[Step A] Document classification: the document format is determined, and different formats are handled by different processing mechanisms;
[Step B] Conversion to PDF: the document content is read and the mainstream-format document is converted into a PDF document;
The conversion-to-PDF step can use the open Office interface and an Office plug-in to extract the document content and generate the PDF.
[Step C] Image analysis: the PDF document is calculated and analyzed, mainly the number of pages and the page size;
[Step D] Image conversion: a suitable algorithm is applied to improve the quality and the lossless rate of the saved bitmap;
The image conversion step can use the octree algorithm to optimize the palette of the 32-bit color bitmap, reducing it to an 8-bit color GIF and improving the quality and precision of the generated image. Referring to Fig. 3, it mainly comprises the following steps:
Step D1, build the octree. Each octree node has at most 8 child nodes, numbered 0-7. The tree is built from the RGB values: the root node (Root) is created first, then the corresponding bits of R, G, and B at each bit position are combined into a value from 0 to 7, and these values are inserted into the tree level by level;
Step D2, extract the palette: take the average (component sum / node count) of the RGB components in each leaf node to obtain the palette color values;
Step D3, match palette indices: for each original RGB value, find the index of the closest color in the palette.
[Step E] Image generation: the converted image is saved.
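The flow of steps A to E can be summarized as a small orchestration function. Everything here is a hypothetical skeleton: the five callables stand in for the classification, Office-based conversion, PDF rendering, quantization, and file-writing stages, none of whose concrete interfaces are given in the text.

```python
def generate_previews(source, classify, to_pdf, render_pages, quantize, save):
    """Run steps A-E for one document and return whatever `save` produces per page.

    All five callables are placeholders supplied by the caller; this
    function only fixes the order in which the stages run.
    """
    kind = classify(source)                                   # step A: classify
    pdf = source if kind == "pdf" else to_pdf(source, kind)   # step B: to PDF
    outputs = []
    for number, pixels in enumerate(render_pages(pdf), start=1):  # step C: analyze
        palette, indices = quantize(pixels)                   # step D: convert
        outputs.append(save(pdf, number, palette, indices))   # step E: generate
    return outputs


# Stub stages, purely to show the data flow.
result = generate_previews(
    "deck.ppt",
    classify=lambda path: "powerpoint",
    to_pdf=lambda path, kind: path + ".pdf",
    render_pages=lambda pdf: [[(0, 0, 0)], [(255, 255, 255)]],
    quantize=lambda pixels: (list(pixels), [0] * len(pixels)),
    save=lambda pdf, n, palette, indices: f"{pdf}-{n}.gif",
)
print(result)  # ['deck.ppt.pdf-1.gif', 'deck.ppt.pdf-2.gif']
```

Passing the stages in as parameters keeps the sequencing separate from any particular Office or PDF library.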
Embodiment two
This embodiment discloses a document preview image generating method comprising the following steps:
[Step S1] Unified document conversion: the various mainstream document formats are analyzed and all documents are converted to PDF.
Depending on the document format, proceed to step S11, S12, S13, or S14 (for an Excel document, step S11; for a PowerPoint document, step S12; for a Word document, step S13; for a Txt document, step S14):
S11, Excel conversion: all spreadsheet content is converted into a PDF document. The Excel content is read into memory through the open Office interface, and an Office plug-in module saves the workbook as a PDF document.
S12, PowerPoint conversion: the presentation content is converted into a PDF document. The content of every slide is read into memory through the open Office interface, and an Office plug-in module saves the presentation as a PDF document.
S13, Word conversion: the document content is converted into a PDF document. The Word document content is read into memory through the open Office interface, and an Office plug-in module saves the Word document as a PDF document.
S14, Txt conversion: the plain-text (Notepad) content is converted into a PDF document. The Txt file content is read into memory through the open Office interface, and an Office plug-in module saves the Txt file as a PDF document.
[step S2] image generates step, extract the PDF document properties, analytical documentation page number quantity and size dimension, use 32 bit image technology to generate the memory map picture, and utilize Octree Qctree algorithm that the bitmap of 32 looks is carried out palette optimization, the bitmap palette optimization of 32 looks is become the colored Gif of 8 looks.Step S2 comprises following process:
S21, image analysis step utilize plug-in unit to calculate the document content with analysis PDF, calculate the documentation page number of codes, calculate every page of size and determine coordinate position, and it is copied to internal memory to carry out next step image scale operation;
S22, image conversion step, the Octree Qctree algorithm that provides 32 bitmaps to optimize; Octree Qctree algorithm becomes the colored Gif of 8 looks with the Bitmap palette optimization of 32 looks, makes the harmless and information completely that generates image; Use the Octree algorithm from true color, to find out 256 kinds of colors representing whole image, set up palette; Mainly comprise three steps: S221) set up Octree, the characteristic of octree nodes is exactly that each node has 8 byte points at most, be numbered 0-7, set up Octree with rgb value, at first set up root node Root, form the value of a 0-7 respectively with each of RGB respectively then, insert successively in the tree; S222) extract palette, after Octree has been set up, take out the mean value of the RGB component in the leaf node, the mean value of RGB component=component summation/node counts promptly is the palette of colors value that obtains; S223) coupling palette index promptly according to original rgb value, finds out the index of immediate color in palette;
S23, image generate step, and the image image conversion result that module obtains that converts is generated as image file.
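The octree construction of step S221 and the palette extraction of step S222 can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the method as claimed: the names `Node`, `child_index`, `insert` and `build_palette` are hypothetical, the most significant bit is assumed to be used at the root level, and the merging of nodes that a full octree quantizer needs to cap the palette at 256 entries is omitted (the `depth` parameter only limits how many bit positions are used).

```python
class Node:
    def __init__(self):
        self.children = [None] * 8  # at most 8 child nodes, numbered 0-7
        self.sum = [0, 0, 0]        # running R, G, B sums (used at leaves)
        self.count = 0              # number of pixels stored in this leaf


def child_index(r, g, b, level):
    """Combine bit (7 - level) of R, G and B into a value from 0 to 7."""
    shift = 7 - level
    return (((r >> shift) & 1) << 2) | (((g >> shift) & 1) << 1) | ((b >> shift) & 1)


def insert(root, rgb, depth=8):
    """S221: walk down from the root, one level per bit position."""
    node = root
    for level in range(depth):
        i = child_index(*rgb, level)
        if node.children[i] is None:
            node.children[i] = Node()
        node = node.children[i]
    for k in range(3):
        node.sum[k] += rgb[k]
    node.count += 1


def leaves(node):
    """Yield every node that has accumulated at least one pixel."""
    if node.count:
        yield node
    for child in node.children:
        if child is not None:
            yield from leaves(child)


def build_palette(pixels, depth=8):
    """S222: palette color = component sum / count for each leaf."""
    root = Node()
    for rgb in pixels:
        insert(root, rgb, depth)
    return [tuple(s // leaf.count for s in leaf.sum) for leaf in leaves(root)]


pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255)]
palette = build_palette(pixels)
```

With a smaller `depth`, nearby colors fall into the same leaf and are averaged, which is the effect a complete quantizer achieves by merging leaves until at most 256 remain.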
Embodiment three
This embodiment discloses a document preview image generating system, which comprises a unified document conversion module and an image generation module.
The unified document conversion module analyzes the various document formats and unifies the documents into PDF.
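How such a module might choose among the Excel, PowerPoint, Word and Txt conversions described in embodiment two can be sketched in Python. The extension table, the function name `select_converter` and the returned labels are hypothetical; the text does not say how the document format is detected, and the actual conversion through Office's interface is not shown.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the conversion to apply.
CONVERTER_BY_EXTENSION = {
    ".xls": "excel", ".xlsx": "excel",
    ".ppt": "powerpoint", ".pptx": "powerpoint",
    ".doc": "word", ".docx": "word",
    ".txt": "txt",
}


def select_converter(filename):
    """Return the name of the conversion to run, or None if already PDF."""
    ext = Path(filename).suffix.lower()
    if ext == ".pdf":
        return None  # already PDF, no conversion needed
    if ext not in CONVERTER_BY_EXTENSION:
        raise ValueError(f"unsupported document format: {ext!r}")
    return CONVERTER_BY_EXTENSION[ext]
```

Supporting a further document format would then amount to adding one entry to the table and one conversion routine.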
The image generation module is used to extract the PDF document properties and analyze the document's page count and dimensions; it uses n-bit image technology to generate an in-memory image and performs palette optimization on the n-bit color bitmap, optimizing it into an m-bit color image, where n > m. The image generation module converts the document image: by finding the index of the closest (or a closer) color in the palette, the n-bit image is converted into an m-bit image. The concrete algorithm may follow the methods of embodiments one and two; other methods may of course also be used.
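The relationship n > m can be made concrete with a short Python calculation. The 800 x 600 page size and the 3-bytes-per-entry palette layout are assumptions chosen only for illustration, and the figures are for uncompressed pixel data, before any GIF compression.

```python
def palette_size(m):
    """Number of distinct colors an m-bit indexed image can reference."""
    return 2 ** m


def raw_bytes(width, height, bits_per_pixel):
    """Uncompressed pixel storage for an image of the given bit depth."""
    return width * height * bits_per_pixel // 8


n, m = 32, 8
colors = palette_size(m)                        # entries in the palette
before = raw_bytes(800, 600, n)                 # n-bit in-memory bitmap
after = raw_bytes(800, 600, m) + colors * 3     # indices plus RGB palette
```

For n = 32 and m = 8 the palette holds 256 colors and the pixel data shrinks to roughly a quarter of its original size.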
The present invention also discloses a document preview image generating method, which comprises the following steps:
[Step S1] Analyze the various mainstream document formats and unify the documents into PDF;
[Step S2] Image generation step: extract the PDF document properties and analyze the document's page count and dimensions; use n-bit image technology to generate an in-memory image and perform palette optimization on the n-bit color bitmap, optimizing it into an m-bit color image, where n > m.
In summary, the document preview image generating system and method proposed by the present invention effectively remedy the shortcomings of online document browsing. For browsers or operating system platforms that are incompatible with Flash-based browsing, the system and method are an effective solution, and they also remove the traditional need to install software and download a document before it can be read.
The present invention can also be used on document-sharing websites to generate lossless thumbnails of document front and back covers; thumbnails allow a document to convey more complete information and style, stimulating the user's desire to download and read it.
In addition, the present invention provides good support for mainstream document formats and also supports extension to other document formats.
The description and application of the invention given here are illustrative and are not intended to limit the scope of the invention to the above embodiments. Variations and changes of the embodiments disclosed here are possible, and replacements and equivalents of the various components of the embodiments are known to those of ordinary skill in the art. Those skilled in the art should note that, without departing from the spirit or essential characteristics of the present invention, the invention may be realized in other forms, structures, arrangements and proportions, and with other components, materials and parts. Other variations and changes may be made to the embodiments disclosed here without departing from the scope and spirit of the invention.

Claims (12)

1. A document preview image generating system, characterized in that the system comprises a unified document conversion module and an image generation module;
The unified document conversion module analyzes the various document formats and unifies the documents into PDF; it comprises an Excel conversion module, a PowerPoint conversion module, a Word conversion module and a Txt conversion module;
The Excel conversion module is used to convert all table content into a PDF document; it uses Office's open interface to read the content of the Excel file, reads the information into memory, and uses the Office plug-in module to save the Excel file as a PDF document;
The PowerPoint conversion module is used to convert the presentation content into a PDF document; it uses Office's open interface to read the content of all pages of the PowerPoint file, reads the information into memory, and uses the Office plug-in module to save the PowerPoint file as a PDF document;
The Word conversion module is used to convert the document content into a PDF document; it uses Office's open interface to read the Word document content, reads the information into memory, and uses the Office plug-in module to save the Word file as a PDF document;
The Txt conversion module is used to convert the Notepad (plain-text) content into a PDF document; it uses Office's open interface to read the Txt file content, reads the information into memory, and uses the Office plug-in module to save the Txt file as a PDF document;
The image generation module is used to extract the PDF document content and analyze the document's page count and dimensions; it uses 32-bit image technology to generate an in-memory image, and uses the Octree algorithm to perform palette optimization on the 32-bit color bitmap, optimizing it into an 8-bit color GIF; the image generation module comprises an image analysis module, an image conversion module and an image generation module;
The image analysis module uses a plug-in to compute and analyze the PDF document content, calculates the number of pages, calculates the size of each page and determines its coordinate position, and copies it into memory for the subsequent image conversion operation;
The image conversion module provides an Octree algorithm for optimizing 32-bit bitmaps; the Octree algorithm optimizes the palette of the 32-bit color bitmap into an 8-bit color GIF, so that the generated image is lossless and its information complete; the Octree algorithm is used to find, among the true colors, the 256 colors that represent the whole image and to build the palette; it is mainly divided into three steps: 1) build the octree: the characteristic of an octree is that each node has at most 8 child nodes, numbered 0-7; the octree is built from the RGB values by first creating the root node Root, then combining the bits at each bit position of the R, G and B components into a value from 0 to 7 and inserting these values into the tree in turn; 2) extract the palette: after the octree has been built, take the mean value of the RGB components in each leaf node, where mean value of an RGB component = component sum / node count; this is the resulting palette color value; 3) match the palette index: according to the original RGB value, find the index of the closest color in the palette;
The image generation module is used to generate an image file from the conversion result obtained by the image conversion module.
2. A document preview image generating system, characterized in that the system comprises a unified document conversion module and an image generation module;
The unified document conversion module analyzes the document format and unifies the document into PDF;
The image generation module is used to extract the PDF document properties and analyze the document's page count and dimensions; it uses n-bit image technology to generate an in-memory image and performs palette optimization on the n-bit color bitmap, optimizing it into an m-bit color image, where n > m.
3. The document preview image generating system according to claim 2, characterized in that:
The unified document conversion module comprises one or more of an Excel conversion module, a PowerPoint conversion module, a Word conversion module and a Txt conversion module;
The Excel conversion module is used to convert all table content into a PDF document; it uses Office's open interface to read the content of the Excel file, reads the information into memory, and uses the Office plug-in module to save the Excel file as a PDF document;
The PowerPoint conversion module is used to convert the presentation content into a PDF document; it uses Office's open interface to read the content of all pages of the PowerPoint file, reads the information into memory, and uses the Office plug-in module to save the PowerPoint file as a PDF document;
The Word conversion module is used to convert the document content into a PDF document; it uses Office's open interface to read the Word document content, reads the information into memory, and uses the Office plug-in module to save the Word file as a PDF document;
The Txt conversion module is used to convert the Notepad (plain-text) content into a PDF document; it uses Office's open interface to read the Txt file content, reads the information into memory, and uses the Office plug-in module to save the Txt file as a PDF document.
4. The document preview image generating system according to claim 2, characterized in that:
The image generation module uses 32-bit image technology to generate an in-memory image, and uses the Octree algorithm to perform palette optimization on the 32-bit color bitmap, optimizing it into an 8-bit color GIF; the image generation module comprises an image analysis module, an image conversion module and an image generation module;
The image analysis module uses a plug-in to compute and analyze the PDF document content, calculates the number of pages, calculates the size of each page and determines its coordinate position, and copies it into memory for the subsequent image conversion operation;
The image conversion module provides an Octree algorithm for optimizing 32-bit bitmaps; the Octree algorithm optimizes the palette of the 32-bit color bitmap into an 8-bit color GIF, so that the generated image is lossless and its information complete; the Octree algorithm is used to find, among the true colors, the 256 colors that represent the whole image and to build the palette;
The image generation module is used to generate an image file from the conversion result obtained by the image conversion module.
5. The document preview image generating system according to claim 4, characterized in that:
The conversion method of the image conversion module comprises the following steps:
1) build the octree: the characteristic of an octree is that each node has at most 8 child nodes, numbered 0-7; the octree is built from the RGB values by first creating the root node Root, then combining the bits at each bit position of the R, G and B components into a value from 0 to 7 and inserting these values into the tree in turn;
2) extract the palette: after the octree has been built, take the mean value of the RGB components in each leaf node, where mean value of an RGB component = component sum / node count; this is the resulting palette color value;
3) match the palette index: according to the original RGB value, find the index of the closest color in the palette.
6. A document preview image generating method, characterized in that the method comprises the following steps:
Step S1, unified document conversion step: analyze the various mainstream document formats and unify the documents into PDF; according to the document format, proceed to step S11, S12, S13 or S14:
S11, Excel conversion step: convert all table content into a PDF document; use Office's open interface to read the content of the Excel file, read the information into memory, and use the Office plug-in module to save the Excel file as a PDF document;
S12, PowerPoint conversion step: convert the presentation content into a PDF document; use Office's open interface to read the content of all pages of the PowerPoint file, read the information into memory, and use the Office plug-in module to save the PowerPoint file as a PDF document;
S13, Word conversion step: convert the document content into a PDF document; use Office's open interface to read the Word document content, read the information into memory, and use the Office plug-in module to save the Word file as a PDF document;
S14, Txt conversion step: convert the Notepad (plain-text) content into a PDF document; use Office's open interface to read the Txt file content, read the information into memory, and use the Office plug-in module to save the Txt file as a PDF document;
Step S2, image generation step: extract the PDF document properties and analyze the document's page count and dimensions; use 32-bit image technology to generate an in-memory image, and use the Octree algorithm to perform palette optimization on the 32-bit color bitmap, optimizing it into an 8-bit color GIF; step S2 comprises the following process:
S21, image analysis step: use a plug-in to compute and analyze the PDF document content, calculate the number of pages, calculate the size of each page and determine its coordinate position, and copy it into memory for the subsequent image conversion operation;
S22, image conversion step: provide an Octree algorithm for optimizing 32-bit bitmaps; the Octree algorithm optimizes the palette of the 32-bit color bitmap into an 8-bit color GIF, so that the generated image is lossless and its information complete; the Octree algorithm is used to find, among the true colors, the 256 colors that represent the whole image and to build the palette; it mainly comprises three steps: S221) build the octree: the characteristic of an octree is that each node has at most 8 child nodes, numbered 0-7; the octree is built from the RGB values by first creating the root node Root, then combining the bits at each bit position of the R, G and B components into a value from 0 to 7 and inserting these values into the tree in turn; S222) extract the palette: after the octree has been built, take the mean value of the RGB components in each leaf node, where mean value of an RGB component = component sum / node count; this is the resulting palette color value; S223) match the palette index: according to the original RGB value, find the index of the closest color in the palette;
S23, image generation step: generate an image file from the conversion result obtained by the image conversion module.
7. A document preview image generating method, characterized in that the method comprises the following steps:
Step S1, analyze the various mainstream document formats and unify the documents into PDF;
Step S2, image generation step: extract the PDF document properties and analyze the document's page count and dimensions; use n-bit image technology to generate an in-memory image and perform palette optimization on the n-bit color bitmap, optimizing it into an m-bit color image, where n > m.
8. The document preview image generating method according to claim 7, characterized in that:
In step S1, step S11, S12, S13 or S14 is selected according to the document format; if the document format is already PDF, no conversion is needed;
S11, Excel conversion step: convert all table content into a PDF document; use Office's open interface to read the content of the Excel file, read the information into memory, and use the Office plug-in module to save the Excel file as a PDF document;
S12, PowerPoint conversion step: convert the presentation content into a PDF document; use Office's open interface to read the content of all pages of the PowerPoint file, read the information into memory, and use the Office plug-in module to save the PowerPoint file as a PDF document;
S13, Word conversion step: convert the document content into a PDF document; use Office's open interface to read the Word document content, read the information into memory, and use the Office plug-in module to save the Word file as a PDF document;
S14, Txt conversion step: convert the Notepad (plain-text) content into a PDF document; use Office's open interface to read the Txt file content, read the information into memory, and use the Office plug-in module to save the Txt file as a PDF document.
9. The document preview image generating method according to claim 7, characterized in that:
In step S2, the PDF document properties are extracted and the document's page count and dimensions are analyzed; 32-bit image technology is used to generate an in-memory image, and the Octree algorithm is used to perform palette optimization on the 32-bit color bitmap, optimizing it into an 8-bit color GIF; step S2 comprises the following process:
S21, image analysis step: use a plug-in to compute and analyze the PDF document content, calculate the number of pages, calculate the size of each page and determine its coordinate position, and copy it into memory for the subsequent image conversion operation;
S22, image conversion step: provide an Octree algorithm for optimizing 32-bit bitmaps; the Octree algorithm optimizes the palette of the 32-bit color bitmap into an 8-bit color GIF, so that the generated image is lossless and its information complete; the Octree algorithm is used to find, among the true colors, the 256 colors that represent the whole image and to build the palette;
S23, image generation step: generate an image file from the conversion result obtained by the image conversion module.
10. The document preview image generating method according to claim 9, characterized in that:
The image conversion step comprises the following process:
S221) build the octree: the characteristic of an octree is that each node has at most 8 child nodes, numbered 0-7; the octree is built from the RGB values by first creating the root node Root, then combining the bits at each bit position of the R, G and B components into a value from 0 to 7 and inserting these values into the tree in turn;
S222) extract the palette: after the octree has been built, take the mean value of the RGB components in each leaf node, where mean value of an RGB component = component sum / node count; this is the resulting palette color value;
S223) match the palette index: according to the original RGB value, find the index of the closest color in the palette.
11. A document preview image generating method, characterized in that the method comprises the following steps:
Step A, document classification step: determine the document format and apply a different processing mechanism to each document format;
Step B, convert every type of document to PDF: read the document content and convert the mainstream document into a PDF document;
Step C, compute and analyze the PDF document, mainly its number of pages, size and dimensions;
Step D, image conversion step: perform the conversion with a conversion algorithm, improving the quality and lossless rate of the saved bitmap;
Step E, image generation step: save the converted image.
12. The document preview image generating method according to claim 11, characterized in that:
The step of converting to PDF uses Office's open interface and the Office plug-in to extract the document content and generate the PDF document;
The image conversion step uses the Octree algorithm to perform palette optimization on the 32-bit color bitmap, optimizing it into an 8-bit color GIF and improving the quality and precision of the generated image; it mainly comprises the following steps:
Step D1, build the octree: the characteristic of an octree is that each node has at most 8 child nodes, numbered 0-7; the octree is built from the RGB values by first creating the root node Root, then combining the bits at each bit position of the R, G and B components into a value from 0 to 7 and inserting these values into the tree in turn;
Step D2, extract the palette: take the mean value of the RGB components in each leaf node, where mean value of an RGB component = component sum / node count, to obtain the palette color values;
Step D3, match the palette index: according to the original RGB value, find the index of the closest color in the palette.
CN2011102418973A 2011-08-22 2011-08-22 Document preview image generating system and method Pending CN102289497A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011102418973A CN102289497A (en) 2011-08-22 2011-08-22 Document preview image generating system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011102418973A CN102289497A (en) 2011-08-22 2011-08-22 Document preview image generating system and method

Publications (1)

Publication Number Publication Date
CN102289497A true CN102289497A (en) 2011-12-21

Family

ID=45335923

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011102418973A Pending CN102289497A (en) 2011-08-22 2011-08-22 Document preview image generating system and method

Country Status (1)

Country Link
CN (1) CN102289497A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101470592A (en) * 2007-12-27 2009-07-01 佳能株式会社 Printing system, printing apparatus, and preview method for printing system
CN101500064A (en) * 2008-02-01 2009-08-05 索尼株式会社 Gradation converting device, gradation converting method and computer programme
US20090290185A1 (en) * 2008-05-23 2009-11-26 Canon Kabushiki Kaisha Information processing apparatus, preview method, and computer-readable storage medium
US20100088586A1 (en) * 2006-07-25 2010-04-08 ANDROMAQUE PREPRESSE ( Societe a Responsavilite Li Method and system of production and/or automatic conversion from heterogeneous content of at least one page make-up for achieving the fastest read with maximum retention
CN102004779A (en) * 2010-11-19 2011-04-06 百度在线网络技术(北京)有限公司 Document sharing platform and document processing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
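Gong Hui (宫辉): "Research on Color Image Processing Based on Digital Cameras" (基于数码相机的彩色图像处理研究), China Master's Theses Full-text Database, Information Science and Technology series *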

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092818A (en) * 2013-02-21 2013-05-08 用友软件股份有限公司 Thumbnail generation system of report table and thumbnail generation method
CN103092818B (en) * 2013-02-21 2016-05-04 用友网络科技股份有限公司 Thumbnail generation system and the reduced graph generating method of form
CN103729338A (en) * 2013-12-29 2014-04-16 国云科技股份有限公司 File on-line previewing method
CN105471829A (en) * 2014-09-05 2016-04-06 深圳市同盛绿色科技有限公司 Signal transmission method and system
CN105551069A (en) * 2015-11-30 2016-05-04 中国农业科学院棉花研究所 Method and system for real-time rapid generation of index image
CN105551069B (en) * 2015-11-30 2018-08-14 中国农业科学院棉花研究所 A kind of real-time generation method and system of thumbnail
CN111722771A (en) * 2019-03-20 2020-09-29 富士施乐实业发展(中国)有限公司 Image association display method and device and computer readable medium
CN113407501A (en) * 2021-06-29 2021-09-17 扬州能煜检测科技有限公司 Image display and processing method, device and equipment for multiple second files

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20111221
