CN112257387A - Document conversion method - Google Patents
- Publication number
- CN112257387A
- Authority
- CN
- China
- Prior art keywords
- processing
- steps
- ppt
- document conversion
- following
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/109—Font handling; Temporal or kinetic typography
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
Abstract
The invention relates to the technical field of document conversion, in particular to a document conversion method.
Description
Technical Field
The invention relates to the technical field of document conversion, in particular to a document conversion method.
Background
Tools currently on the market convert ppt, word, and pdf documents into h5 poorly. To address this problem, the internal source of the ppt, word, and pdf files is parsed so that the various effects and animations of the ppt are reproduced on an h5 page to the greatest possible extent. The documents are also made available over the network, which further reduces the learning cost for learners.
Technical solution of the first prior art:
1. A tool for converting document files into pictures, LibreOffice: LibreOffice is a derivative of the OpenOffice.org office suite. It is likewise free and open source, with its source code distributed under the Mozilla Public License v2.0, but it adds a number of features compared with OpenOffice. LibreOffice has strong data import and export functions: it can directly import PDF documents, Microsoft Works, and Lotus Word Pro files, and it supports the main OpenXML formats. In this approach, pdf and word documents are converted into pictures through LibreOffice, and the pictures are then displayed in a web page.
2. Drawback: LibreOffice can only convert the document into pictures, so the animations and steps in a ppt cannot be displayed, and the effect expected by the user cannot be achieved.
The document conversion functions in the prior art are relatively limited: many ppts cannot be parsed or recorded, files cannot be displayed well on web pages or mobile phones, and a ppt mobile app and a desktop application each need to be downloaded separately.
Disclosure of Invention
(I) Technical problem to be solved
To solve the above problems in the prior art, the invention provides a document conversion method that allows document files to be displayed well on web pages and mobile phones, without separately downloading a ppt mobile app or a desktop application.
(II) Technical solution
To achieve the above purpose, the invention adopts the following main technical solution, which comprises the following steps:
Step one: parsing the ppt file through java code: calling a ppt command on Windows to convert the ppt into a pptx file, changing the pptx file into a zip package, opening the zip package, and parsing all xml files in the zip;
Step two: processing the various elements in the ppt differently, where the element processing comprises text processing, animation processing, recording processing, audio and video processing, and picture processing;
Step three: rendering the parsed content to a page by means of an ftl template, thereby realizing h5 playback of the document file.
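Step one can be sketched as follows. This is a minimal illustration in Python rather than the Java the method describes, and it covers only the part after the pptx already exists: opening the file as a zip archive and reading the slide XML. The function names and the tiny in-memory archive are hypothetical and stand in for a real presentation; the archive contains a single slide part and is not a complete pptx.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# DrawingML namespace used by text runs (<a:t>) inside slide XML.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"


def read_slide_texts(pptx_bytes):
    """Open a pptx as a zip archive and collect the text runs of each slide part."""
    texts = {}
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if name.startswith("ppt/slides/") and name.endswith(".xml"):
                root = ET.fromstring(archive.read(name))
                texts[name] = [t.text or "" for t in root.iter(A_NS + "t")]
    return texts


def make_demo_archive():
    """Build a stand-in archive holding one slide part (hypothetical test data)."""
    slide = (
        '<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" '
        'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">'
        "<a:p><a:r><a:t>Hello</a:t></a:r><a:r><a:t>h5</a:t></a:r></a:p>"
        "</p:sld>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("ppt/slides/slide1.xml", slide)
    return buffer.getvalue()


demo_texts = read_slide_texts(make_demo_archive())
print(demo_texts)
```

Because a pptx is already a zip container, no renaming is strictly needed when reading it programmatically; the renaming described in step one matters mainly when the archive is opened with generic zip tools.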
Preferably, the text processing in step two is performed as follows: the size, font, bold, italic, color, background color, alignment, rotation, three-dimensional model, and shadow of the text are parsed, and the parsed properties are then reproduced in h5 so that the text appears as it does in the ppt.
Preferably, the animation processing in step two is performed as follows: the animation modes are extracted and combined with the animations on the ppt; one part of the animations is drawn with a 2d graphics tool, and the other part is rendered with native web code.
Preferably, the recording processing in step two is performed as follows: the recording function of the WeChat client is used; the recording is made through WeChat and downloaded to a local server; on the server, ffmpeg converts the recording into mp3, an audio format universally supported by web pages; and a Fourier transform algorithm is then applied to the mp3 file to remove part of the noise in the recording.
Preferably, the audio and video processing in step two is performed as follows: ffmpeg converts the audio format and encoding to mp3/aac and the video to mp4 with h264 encoding, so that the media can be played on all web pages.
Preferably, the picture processing in step two is performed as follows: the picture information is parsed first, and pictures that do not need cropping are compressed directly. The compression tool used is pngquant, which uses a modified version of the median cut quantization algorithm together with additional techniques that mitigate the deficiencies of median cut: boxes are selected so as to minimize the variance from their median value; the histogram is built with a basic perception model, which gives less weight to noisy areas of the picture; and colors are corrected using Voronoi iteration, which guarantees a locally optimal palette. pngquant works in a premultiplied alpha color space, which gives less weight to transparent colors. When remapping, error diffusion is applied only in areas where several neighboring pixels quantize to the same value and which are not edges, which avoids adding noise to areas that already have high visual quality without dithering.
Preferably, to further improve the colors, the histogram is adjusted in a process similar to gradient descent: the median cut is repeated multiple times, with more weight given to poorly represented colors.
Preferably, pictures that need cropping are first cropped by java and then compressed. Pictures that cannot be rendered because of three-dimensional rotation and the like have their content re-rendered into a picture on a windows machine.
(III) Advantageous effects
The invention provides a document conversion method. The method has the following beneficial effects:
(1) The method is suitable for producing courseware in enterprise training, schools, training institutions, and the like. A person without technical knowledge can quickly produce an online micro-lesson through h5 and recording; the user only needs to know how to make documents, which greatly improves courseware recording efficiency. People can learn what they want to know at any time and in any place.
Drawings
FIG. 1 is a process flow diagram of the present invention;
FIG. 2 is a functional block diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A document conversion method, comprising the steps of:
Step one: parsing the ppt file through java code: calling a ppt command on Windows to convert the ppt into a pptx file, changing the pptx file into a zip package, opening the zip package, and parsing all xml files in the zip;
Step two: processing the various elements in the ppt differently, where the element processing comprises text processing, animation processing, recording processing, audio and video processing, and picture processing;
Step three: rendering the parsed content to a page by means of an ftl template, thereby realizing h5 playback of the document file.
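The template rendering in step three might look roughly like the sketch below. It uses Python's `string.Template` purely as a stand-in for an ftl template engine, and the page layout, class names, and `render_page` function are assumptions made for illustration, not details from the method.

```python
from html import escape
from string import Template

# Stand-in templates; a real deployment would keep these in ftl files.
PAGE = Template(
    "<!DOCTYPE html>\n"
    '<html><head><meta charset="utf-8"><title>$title</title></head>\n'
    "<body>\n$slides\n</body></html>"
)
SLIDE = Template('<section class="slide" id="slide-$index">$body</section>')


def render_page(title, slides):
    """Render parsed slide text (a list of lines per slide) into one h5 page."""
    parts = []
    for index, lines in enumerate(slides, start=1):
        # Escape the text so characters such as "<" cannot break the markup.
        body = "".join("<p>%s</p>" % escape(line) for line in lines)
        parts.append(SLIDE.substitute(index=index, body=body))
    return PAGE.substitute(title=escape(title), slides="\n".join(parts))


page = render_page("Demo", [["Hello", "a < b"]])
print(page)
```

Keeping the markup in templates separates the parsing code from the page layout, so the layout can change without touching the parser.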
As a specific embodiment of the present invention, the text processing in step two is performed as follows: the size, font, bold, italic, color, background color, alignment, rotation, three-dimensional model, and shadow of the text are parsed, and the parsed properties are then reproduced in h5 so that the text appears as it does in the ppt.
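As an illustration of reproducing parsed text properties in h5, the sketch below maps a DrawingML run-properties element (`a:rPr`) to a CSS declaration string. It handles only size, bold, italic, solid color, and Latin typeface; the function name and the choice of CSS output are assumptions, and the remaining properties listed above (background color, alignment, rotation, shadow, and so on) are left out for brevity.

```python
import xml.etree.ElementTree as ET

A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"


def run_properties_to_css(rpr):
    """Translate a subset of <a:rPr> attributes and children into CSS."""
    css = []
    size = rpr.get("sz")  # font size in hundredths of a point
    if size is not None:
        css.append("font-size: %gpt" % (int(size) / 100))
    if rpr.get("b") in ("1", "true"):
        css.append("font-weight: bold")
    if rpr.get("i") in ("1", "true"):
        css.append("font-style: italic")
    color = rpr.find(A_NS + "solidFill/" + A_NS + "srgbClr")
    if color is not None:
        css.append("color: #" + color.get("val"))
    latin = rpr.find(A_NS + "latin")
    if latin is not None:
        css.append("font-family: '%s'" % latin.get("typeface"))
    return "; ".join(css)


sample = ET.fromstring(
    '<a:rPr xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" '
    'sz="2400" b="1" i="1">'
    '<a:solidFill><a:srgbClr val="FF0000"/></a:solidFill>'
    '<a:latin typeface="Arial"/>'
    "</a:rPr>"
)
css_text = run_properties_to_css(sample)
print(css_text)
```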
As a specific embodiment of the present invention, the animation processing in step two is performed as follows: the animation modes are extracted and combined with the animations on the ppt; one part of the animations is drawn with a 2d graphics tool, and the other part is rendered with native web code.
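For the part of the animations rendered with native web code, one plausible approach is to emit CSS keyframe rules per animated element. The sketch below is only an illustration: the effect names, the `KEYFRAMES` table, and the `animation_css` function are all hypothetical and do not come from the method, which does not say how effects are mapped.

```python
# Hypothetical table: effect name -> (start style, end style).
KEYFRAMES = {
    "fade": ("opacity: 0", "opacity: 1"),
    "fly-in-left": ("transform: translateX(-100%)", "transform: translateX(0)"),
}


def animation_css(element_id, effect, duration_ms, delay_ms=0):
    """Build a @keyframes rule plus the rule that applies it to one element."""
    start, end = KEYFRAMES[effect]
    name = "%s-%s" % (element_id, effect)
    return (
        "@keyframes %s { from { %s; } to { %s; } }\n"
        "#%s { animation: %s %dms ease %dms both; }"
        % (name, start, end, element_id, name, duration_ms, delay_ms)
    )


fade_rule = animation_css("shape1", "fade", 500)
print(fade_rule)
```

Effects that cannot be expressed as simple property transitions would fall to the other branch described above, drawing with a 2d graphics tool.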
As a specific embodiment of the present invention, the recording processing in step two is performed as follows: the recording function of the WeChat client is used; the recording is made through WeChat and downloaded to a local server; on the server, ffmpeg converts the recording into mp3, an audio format universally supported by web pages; and a Fourier transform algorithm is then applied to the mp3 file to remove part of the noise in the recording.
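The two stages of recording processing can be sketched as below. The first function only builds an ffmpeg command line (it assumes an ffmpeg build that includes the `libmp3lame` encoder, and the file names are placeholders). The second stage is a guess at what Fourier-based noise removal could mean: a plain spectral threshold that zeroes weak frequency bins. The method does not name a specific algorithm, so this is an assumption, and it operates on decoded samples rather than on mp3 bytes. A naive DFT is used to stay within the standard library; real audio would need an FFT.

```python
import cmath
import math


def mp3_command(source, target):
    """Command line asking ffmpeg to re-encode a recording as mp3."""
    return ["ffmpeg", "-y", "-i", source, "-codec:a", "libmp3lame", target]


def dft(samples):
    """Naive discrete Fourier transform, O(n^2); fine for a short demo."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def denoise(samples, keep_ratio=0.1):
    """Zero every frequency bin weaker than keep_ratio times the strongest bin."""
    spectrum = dft(samples)
    peak = max(abs(c) for c in spectrum)
    cleaned = [c if abs(c) >= keep_ratio * peak else 0 for c in spectrum]
    return inverse_dft(cleaned)


# Demo: a strong 4-cycle tone plus a weak 20-cycle disturbance over 64 samples.
N = 64
tone = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
noisy = [tone[t] + 0.05 * math.sin(2 * math.pi * 20 * t / N) for t in range(N)]
denoised = denoise(noisy)
print(mp3_command("recording.amr", "recording.mp3"))
```

A fixed threshold removes only noise that is much weaker than the speech; broadband noise at comparable levels would need a more careful method.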
As a specific embodiment of the present invention, the audio and video processing in step two is performed as follows: ffmpeg converts the audio format and encoding to mp3/aac and the video to mp4 with h264 encoding, so that the media can be played on all web pages.
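A possible ffmpeg invocation for the video case is sketched below. The text pairs "mp3 format" with "aac coding", which are two different codecs, so this sketch assumes AAC for the audio track inside the mp4 container. The `yuv420p` pixel format and `+faststart` flag are additions commonly used for browser playback, not settings named by the method, and the file names are placeholders.

```python
import shlex


def video_command(source, target):
    """Command line asking ffmpeg for an h264 + aac mp4 suited to web playback."""
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-c:v", "libx264",          # h264 video encoder
        "-pix_fmt", "yuv420p",      # widely supported pixel format (assumption)
        "-c:a", "aac",              # ffmpeg's built-in AAC encoder
        "-movflags", "+faststart",  # move the index to the front for streaming
        target,
    ]


command = video_command("lesson.avi", "lesson.mp4")
print(shlex.join(command))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems with file names that contain spaces.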
As a specific embodiment of the present invention, the picture processing in step two is performed as follows: the picture information is parsed first, and pictures that do not need cropping are compressed directly. The compression tool used is pngquant, which uses a modified version of the median cut quantization algorithm together with additional techniques that mitigate the deficiencies of median cut: boxes are selected so as to minimize the variance from their median value; the histogram is built with a basic perception model, which gives less weight to noisy areas of the picture; and colors are corrected using Voronoi iteration, which guarantees a locally optimal palette. pngquant works in a premultiplied alpha color space, which gives less weight to transparent colors. When remapping, error diffusion is applied only in areas where several neighboring pixels quantize to the same value and which are not edges, which avoids adding noise to areas that already have high visual quality without dithering.
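To make the quantization idea concrete, here is a small sketch of textbook median cut followed by Voronoi iteration (the k-means style refinement). It is deliberately the basic algorithm, not pngquant's modified version: it splits the box with the widest channel range, and it omits the perception model, premultiplied alpha, and selective error diffusion described above. All function names are illustrative.

```python
def channel_range(box, channel):
    values = [pixel[channel] for pixel in box]
    return max(values) - min(values)


def spread(box):
    """Widest channel range inside a box of RGB pixels."""
    return max(channel_range(box, c) for c in range(3))


def average(box):
    return tuple(sum(pixel[c] for pixel in box) / len(box) for c in range(3))


def median_cut(pixels, palette_size):
    """Basic median cut: split the widest box at its median until enough boxes exist."""
    boxes = [list(pixels)]
    while len(boxes) < palette_size:
        splittable = [i for i, box in enumerate(boxes) if len(box) > 1 and spread(box) > 0]
        if not splittable:
            break
        index = max(splittable, key=lambda i: spread(boxes[i]))
        box = boxes.pop(index)
        channel = max(range(3), key=lambda c: channel_range(box, c))
        box.sort(key=lambda pixel: pixel[channel])
        middle = len(box) // 2
        boxes.extend([box[:middle], box[middle:]])
    return [average(box) for box in boxes]


def nearest(color, palette):
    """Index of the palette entry closest to color (squared Euclidean distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(color, palette[i])),
    )


def voronoi_iteration(pixels, palette, rounds=3):
    """Reassign pixels to their nearest entry, then move each entry to its group mean."""
    for _ in range(rounds):
        groups = [[] for _ in palette]
        for pixel in pixels:
            groups[nearest(pixel, palette)].append(pixel)
        palette = [average(g) if g else palette[i] for i, g in enumerate(groups)]
    return palette


demo_pixels = [(250, 0, 0), (254, 0, 0), (0, 0, 250), (0, 0, 254)]
demo_palette = voronoi_iteration(demo_pixels, median_cut(demo_pixels, 2))
print(sorted(demo_palette))
```

Each Voronoi round can only lower or keep the total squared error, which is why the result is locally optimal but not necessarily the best possible palette.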
As a specific embodiment of the present invention, to further improve the colors, the histogram is adjusted in a process similar to gradient descent: the median cut is repeated multiple times, with more weight given to poorly represented colors.
As a specific embodiment of the present invention, pictures that need cropping are first cropped by java and then compressed; pictures that cannot be rendered because of three-dimensional rotation and the like have their content re-rendered into a picture on a windows machine.
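The cropping step could be driven by the crop rectangle stored with the picture. The sketch below assumes the crop comes from a DrawingML `a:srcRect` element, whose `l`, `t`, `r`, and `b` offsets are, to the best of my understanding, expressed in thousandths of a percent of the picture size; the method itself does not say where the crop information is read from. It is written in Python for illustration, whereas the method performs cropping in java, and pixels are modeled as a plain list of rows.

```python
def crop_box(width, height, l=0, t=0, r=0, b=0):
    """Convert srcRect-style offsets (1/1000 of a percent) to pixel bounds."""
    left = round(width * l / 100000)
    top = round(height * t / 100000)
    right = width - round(width * r / 100000)
    bottom = height - round(height * b / 100000)
    return left, top, right, bottom


def crop(pixels, box):
    """Cut a (left, top, right, bottom) box out of a row-major pixel grid."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


# Demo on a 4 x 4 grid whose cells are numbered 0..15 row by row.
grid = [[row * 4 + col for col in range(4)] for row in range(4)]
cropped = crop(grid, crop_box(4, 4, l=25000, t=50000, r=25000))
print(cropped)
```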
The method of use is as follows:
A. Install ffmpeg, a font library, GraphicsMagick, LibreOffice, fonttools, LTS (a distributed task system), tomcat, postgres, and redis on the linux system.
B. Install ffmpeg and GraphicsMagick on the windows system.
C. Execute the linux or windows commands using ruby files.
D. Configure the java environment, package the document conversion code onto the windows system, and place the service system on the linux system. Conversion code for audio, video, pictures, and the like can be deployed to multiple machines, not limited to linux or windows. Then start each system.
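Before starting the systems, it can help to confirm that the command-line tools from steps A and B are actually on the PATH. The check below is a small convenience sketch, not part of the method. The executable names are assumptions (`gm` for GraphicsMagick, `soffice` for LibreOffice), and services such as tomcat, postgres, and redis are not covered because they are usually verified by connecting to them rather than by looking up a binary.

```python
import shutil

# Assumed executable names for the tools the conversion workers call.
LINUX_TOOLS = ["ffmpeg", "gm", "soffice", "pngquant"]
WINDOWS_TOOLS = ["ffmpeg", "gm"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    absent = missing_tools(LINUX_TOOLS)
    print("missing:", ", ".join(absent) if absent else "none")
```

The `which` parameter exists only so the lookup can be replaced in tests; normal use relies on the default.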
Claims (8)
1. A document conversion method, characterized in that the method comprises the following steps:
Step one: parsing the ppt file through java code: calling a ppt command on Windows to convert the ppt into a pptx file, changing the pptx file into a zip package, opening the zip package, and parsing all xml files in the zip;
Step two: processing the various elements in the ppt differently, where the element processing comprises text processing, animation processing, recording processing, audio and video processing, and picture processing;
Step three: rendering the parsed content to a page by means of an ftl template, thereby realizing h5 playback of the document file.
2. The document conversion method according to claim 1, wherein the text processing in step two is performed as follows: the size, font, bold, italic, color, background color, alignment, rotation, three-dimensional model, and shadow of the text are parsed, and the parsed properties are then reproduced in h5 so that the text appears as it does in the ppt.
3. The document conversion method according to claim 1, wherein the animation processing in step two is performed as follows: the animation modes are extracted and combined with the animations on the ppt; one part of the animations is drawn with a 2d graphics tool, and the other part is rendered with native web code.
4. The document conversion method according to claim 1, wherein the recording processing in step two is performed as follows: the recording function of the WeChat client is used; the recording is made through WeChat and downloaded to a local server; on the server, ffmpeg converts the recording into mp3, an audio format universally supported by web pages; and a Fourier transform algorithm is then applied to the mp3 file to remove part of the noise in the recording.
5. The document conversion method according to claim 1, wherein the audio and video processing in step two is performed as follows: ffmpeg converts the audio format and encoding to mp3/aac and the video to mp4 with h264 encoding, so that the media can be played on all web pages.
6. The document conversion method according to claim 1, wherein the picture processing in step two is performed as follows: the picture information is parsed first, and pictures that do not need cropping are compressed directly; the compression tool used is pngquant, which uses a modified version of the median cut quantization algorithm together with additional techniques that mitigate the deficiencies of median cut; boxes are selected so as to minimize the variance from their median value; the histogram is built with a basic perception model; colors are corrected using Voronoi iteration; pngquant works in a premultiplied alpha color space; and when remapping, error diffusion is applied only in areas where several neighboring pixels quantize to the same value and which are not edges.
7. The document conversion method according to claim 6, wherein, to further improve the colors, the histogram is adjusted in a process similar to gradient descent: the median cut is repeated multiple times, with more weight given to poorly represented colors.
8. The document conversion method according to claim 6, wherein pictures that need cropping are first cropped by java and then compressed, and pictures that cannot be rendered because of three-dimensional rotation and the like have their content re-rendered into a picture on a windows machine.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011160314.XA CN112257387A (en) | 2020-10-27 | 2020-10-27 | Document conversion method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011160314.XA CN112257387A (en) | 2020-10-27 | 2020-10-27 | Document conversion method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112257387A true CN112257387A (en) | 2021-01-22 |
Family
ID=74262494
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011160314.XA Pending CN112257387A (en) | 2020-10-27 | 2020-10-27 | Document conversion method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112257387A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102156742A (en) * | 2011-04-19 | 2011-08-17 | 北京神州数码思特奇信息技术股份有限公司 | Method and middleware for supporting structured document display with own browser of mobile phone |
CN105630459A (en) * | 2014-10-25 | 2016-06-01 | 上海未达数码科技有限公司 | Method for converting PPT document to HTML page |
CN107015950A (en) * | 2017-03-20 | 2017-08-04 | 厦门云开云科技有限公司 | The generation method and device of a kind of SCORM coursewares |
CN108228843A (en) * | 2018-01-09 | 2018-06-29 | 闫健 | A kind of handout compression transmission and restoring method based on internet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||