CN112257387A - Document conversion method - Google Patents

Info

Publication number
CN112257387A
Authority
CN
China
Prior art keywords
processing
steps
ppt
document conversion
following
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011160314.XA
Other languages
Chinese (zh)
Inventor
田振
袁圆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Qinggu Information Technology Co ltd
Original Assignee
Hefei Qinggu Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Qinggu Information Technology Co ltd
Priority to CN202011160314.XA
Publication of CN112257387A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/109 Font handling; Temporal or kinetic typography
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of document conversion, in particular to a document conversion method.

Description

Document conversion method
Technical Field
The invention relates to the technical field of document conversion, in particular to a document conversion method.
Background
Products currently on the market convert ppt, word and pdf documents into h5 poorly. By parsing the internal source of ppt, word and pdf files, the various effects and animations of a ppt can be displayed on an h5 page to the greatest possible extent. Because the converted document is available over the network, the learning cost for users is further reduced.
Prior art 1:
1. Technical solution: LibreOffice, a tool for converting document files into pictures. LibreOffice is a derivative of the OpenOffice.org office suite. It is likewise free and open source, with its source code distributed under the Mozilla Public License v2.0, but it adds a number of features compared with OpenOffice. LibreOffice has strong data import and export functions: it can directly import PDF documents, Microsoft Works and Lotus Word Pro files, and it supports the main OpenXML formats. PDF and word documents are converted into pictures by LibreOffice, and the pictures are then displayed in a webpage.
2. Drawback: LibreOffice can only convert a document into pictures, so the animations and steps in a ppt cannot be displayed and the effect expected by the user is not achieved.
The document conversion function in the prior art is relatively limited: many ppts cannot be parsed or recorded, files cannot be displayed well on webpages or mobile phones, and a ppt mobile app and a desktop application each have to be downloaded separately.
Disclosure of Invention
(I) Technical problem to be solved
In order to solve the above problems in the prior art, the invention provides a document conversion method, which can solve the problem that document files cannot be well displayed on a webpage and a mobile phone end, and does not need to download a ppt mobile phone app and a desktop application separately.
(II) Technical solution
To achieve the above purpose, the invention adopts the following main technical solution, which comprises the following steps:
Step one: parse the ppt file with java code: call a ppt command on Windows to convert the ppt into a pptx file, rename the pptx file as a zip package, open the zip package, and parse all xml files in the zip;
Step two: process the various elements in the ppt differently, the element processing comprising character processing, animation processing, recording processing, audio and video processing, and picture processing;
Step three: render the parsed content to a page by means of an ftl template, thereby realizing h5 playback of the document file.
Preferably, the character processing in step two is as follows: the size, font, bold, italic, color, background color, alignment, rotation, three-dimensional model and shadow of the characters are parsed, and the parsed content is then restored from the ppt into a form usable in h5.
Preferably, the animation processing in step two is as follows: the animation modes are extracted and, according to the animations in the ppt, one part of the animations is drawn with a 2d graphics tool and the other part is rendered with native webpage code.
Preferably, the recording processing in step two is as follows: the recording function of the WeChat client is used; the recording is made through WeChat and downloaded to a local server, where ffmpeg converts it to mp3, a music format generally supported by webpages; a Fourier transform algorithm is then applied to the mp3 file to remove part of the noise in the recording.
Preferably, the audio and video processing in step two is as follows: ffmpeg converts the audio format and coding to the mp3 format with aac coding, and converts video to mp4 with h264 coding, so that the media can be played on all webpages.
Preferably, the picture processing in step two is as follows: the picture information is analyzed first, and pictures that do not need cropping are compressed directly. The compression tool used is pngquant. pngquant uses a modified version of the median cut quantization algorithm together with additional techniques to mitigate the deficiencies of median cut: boxes are selected so as to minimize the variance from their median value, and the histogram is built with a basic perception model so that noisy areas of the picture are given less weight. Colors are corrected using Voronoi iteration, which guarantees a locally optimal palette. pngquant works in a premultiplied alpha color space, which gives less weight to transparent colors. When remapping, error diffusion is applied only where several adjacent pixels quantize to the same value and are not edges, which avoids adding noise to areas that already have high visual quality without dithering.
Preferably, to further improve color, the histogram is adjusted in a process similar to gradient descent: the median cut is repeated multiple times, with more weight given to poorly represented colors.
Preferably, pictures that need cropping are first cropped with java and then compressed. Pictures that cannot be rendered because of three-dimensional rotation and the like are regenerated as pictures on a Windows machine.
(III) Advantageous effects
The invention provides a document conversion method. The method has the following beneficial effects:
(1) The method is suitable for making courseware for enterprise training, schools, training institutions and the like. A person without technical knowledge can quickly produce an online micro-course using h5 and audio recording; the user only needs to know how to prepare documents, which greatly improves the efficiency of courseware recording. People can learn what they want to know at any time and in any place.
Drawings
FIG. 1 is a process flow diagram of the present invention;
FIG. 2 is a functional block diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A document conversion method, comprising the following steps:
Step one: parse the ppt file with java code: call a ppt command on Windows to convert the ppt into a pptx file, rename the pptx file as a zip package, open the zip package, and parse all xml files in the zip;
Step two: process the various elements in the ppt differently, the element processing comprising character processing, animation processing, recording processing, audio and video processing, and picture processing;
Step three: render the parsed content to a page by means of an ftl template, thereby realizing h5 playback of the document file.
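As a rough illustration of the zip-and-xml part of step one, the following Java sketch reads a pptx stream as a zip package and parses every xml entry in it. The class name PptxXmlReader and the method readXmlRoots are hypothetical names introduced here for illustration and are not taken from the invention. The ppt-to-pptx conversion on Windows is not shown, and the sketch only records the root element of each xml file instead of extracting slide content.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: treats a pptx as a zip package and parses its xml entries.
class PptxXmlReader {

    /** Returns a map from each .xml entry name to the name of its root element. */
    static Map<String, String> readXmlRoots(InputStream pptx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        Map<String, String> roots = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(pptx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().endsWith(".xml")) {
                    continue; // media files and other parts are skipped here
                }
                // Buffer the entry first so the parser cannot close the zip stream.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                Document doc = builder.parse(new ByteArrayInputStream(buffer.toByteArray()));
                roots.put(entry.getName(), doc.getDocumentElement().getNodeName());
            }
        }
        return roots;
    }
}
```

A caller would pass a FileInputStream opened on the converted pptx file and then walk the parsed documents to collect the elements handled in step two.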
As a specific embodiment of the present invention, the character processing in step two is as follows: the size, font, bold, italic, color, background color, alignment, rotation, three-dimensional model and shadow of the characters are parsed, and the parsed content is then restored from the ppt into a form usable in h5.
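One way to picture the restoration of parsed character attributes for h5 is to map them onto an inline CSS declaration, as in the Java sketch below. It is only an illustration under stated assumptions: the class TextStyleCss and its parameters are hypothetical, the font size is assumed to arrive in hundredths of a point and the rotation in 60000ths of a degree (the units used by pptx drawing xml), and background color, three-dimensional model and shadow are left out.

```java
import java.util.Locale;

// Hypothetical helper: turns parsed text attributes into an inline CSS string.
class TextStyleCss {

    /**
     * @param sz     font size in hundredths of a point (assumed pptx unit)
     * @param font   font family name
     * @param rgbHex six-digit RGB color without the leading '#'
     * @param align  CSS alignment keyword such as "left", "center" or "right"
     * @param rot    rotation in 60000ths of a degree (assumed pptx unit)
     */
    static String toCss(int sz, String font, boolean bold, boolean italic,
                        String rgbHex, String align, int rot) {
        StringBuilder css = new StringBuilder();
        css.append(String.format(Locale.ROOT, "font-size:%.2fpt;", sz / 100.0));
        css.append("font-family:'").append(font).append("';");
        if (bold) {
            css.append("font-weight:bold;");
        }
        if (italic) {
            css.append("font-style:italic;");
        }
        css.append("color:#").append(rgbHex).append(";");
        css.append("text-align:").append(align).append(";");
        if (rot != 0) {
            css.append(String.format(Locale.ROOT, "transform:rotate(%.2fdeg);", rot / 60000.0));
        }
        return css.toString();
    }
}
```

The returned string could be placed in the style attribute of the element that the template emits for the text run.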
As a specific embodiment of the present invention, the animation processing in step two is as follows: the animation modes are extracted and, according to the animations in the ppt, one part of the animations is drawn with a 2d graphics tool and the other part is rendered with native webpage code.
As a specific embodiment of the present invention, the recording processing in step two is as follows: the recording function of the WeChat client is used; the recording is made through WeChat and downloaded to a local server, where ffmpeg converts it to mp3, a music format generally supported by webpages; a Fourier transform algorithm is then applied to the mp3 file to remove part of the noise in the recording.
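The text does not say which Fourier-based method removes the noise. The Java sketch below shows one simple possibility, a spectral gate, purely as an assumption: it transforms decoded audio samples with a discrete Fourier transform, zeroes every frequency bin whose magnitude is below a fraction of the strongest bin, and transforms back. The class SpectralGate, the relative threshold and the use of raw sample arrays are all hypothetical, and the O(n²) transform is chosen for readability rather than speed.

```java
// Hypothetical helper: Fourier-based noise reduction by zeroing weak frequency bins.
class SpectralGate {

    /**
     * @param x            decoded audio samples
     * @param relThreshold bins weaker than relThreshold * (strongest bin) are removed
     */
    static double[] denoise(double[] x, double relThreshold) {
        int n = x.length;
        double[] re = new double[n];
        double[] im = new double[n];

        // Forward DFT: X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n)
        for (int k = 0; k < n; k++) {
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re[k] += x[t] * Math.cos(angle);
                im[k] += x[t] * Math.sin(angle);
            }
        }

        // Find the strongest bin, then zero every bin below the relative threshold.
        double max = 0;
        for (int k = 0; k < n; k++) {
            max = Math.max(max, Math.hypot(re[k], im[k]));
        }
        for (int k = 0; k < n; k++) {
            if (Math.hypot(re[k], im[k]) < relThreshold * max) {
                re[k] = 0;
                im[k] = 0;
            }
        }

        // Inverse DFT; only the real part is needed because the input was real.
        double[] y = new double[n];
        for (int t = 0; t < n; t++) {
            double sum = 0;
            for (int k = 0; k < n; k++) {
                double angle = 2 * Math.PI * k * t / n;
                sum += re[k] * Math.cos(angle) - im[k] * Math.sin(angle);
            }
            y[t] = sum / n;
        }
        return y;
    }
}
```

In practice the mp3 would have to be decoded to samples first and processed in short overlapping frames with a fast Fourier transform; those steps are outside this sketch.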
As a specific embodiment of the present invention, the audio and video processing in step two is as follows: ffmpeg converts the audio format and coding to the mp3 format with aac coding, and converts video to mp4 with h264 coding, so that the media can be played on all webpages.
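The conversions described above could be driven from Java by launching ffmpeg as an external process, roughly as sketched below. The class FfmpegCommands and its method names are hypothetical. The sketch assumes an ffmpeg build that includes the libmp3lame and libx264 encoders, and it assumes that audio-only files are encoded with libmp3lame while aac is used for the audio track inside the mp4; the text itself does not list the exact options.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds and runs ffmpeg command lines for web playback formats.
class FfmpegCommands {

    /** Audio-only conversion to mp3 (-vn drops any video stream). */
    static List<String> toMp3(String input, String output) {
        return Arrays.asList("ffmpeg", "-y", "-i", input,
                "-vn", "-c:a", "libmp3lame", output);
    }

    /** Video conversion to mp4 with h264 video and aac audio. */
    static List<String> toMp4(String input, String output) {
        return Arrays.asList("ffmpeg", "-y", "-i", input,
                "-c:v", "libx264", "-c:a", "aac",
                "-movflags", "+faststart", output);
    }

    /** Runs the command and returns ffmpeg's exit code (0 means success). */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

For example, FfmpegCommands.run(FfmpegCommands.toMp4("lesson.avi", "lesson.mp4")) would convert a file, where both file names are placeholders.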
As a specific embodiment of the present invention, the picture processing in step two is as follows: the picture information is analyzed first, and pictures that do not need cropping are compressed directly. The compression tool used is pngquant. pngquant uses a modified version of the median cut quantization algorithm together with additional techniques to mitigate the deficiencies of median cut: boxes are selected so as to minimize the variance from their median value, and the histogram is built with a basic perception model so that noisy areas of the picture are given less weight. Colors are corrected using Voronoi iteration, which guarantees a locally optimal palette. pngquant works in a premultiplied alpha color space, which gives less weight to transparent colors. When remapping, error diffusion is applied only where several adjacent pixels quantize to the same value and are not edges, which avoids adding noise to areas that already have high visual quality without dithering.
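To make the Voronoi iteration step more concrete, the Java sketch below performs a single iteration on a small palette: every pixel is assigned to its nearest palette color, and each palette color is then moved to the mean of the pixels assigned to it. This is a simplified illustration and not pngquant's code: the class PaletteRefiner is hypothetical, distances are plain squared differences in RGB, and the perceptual weighting and premultiplied alpha handling described above are omitted.

```java
// Hypothetical helper: one Voronoi (k-means style) iteration over an RGB palette.
class PaletteRefiner {

    /** Returns the palette after one assign-and-average step. */
    static int[][] iterate(int[][] pixels, int[][] palette) {
        int k = palette.length;
        long[][] sum = new long[k][3];
        int[] count = new int[k];

        // Assignment step: each pixel joins the cell of its nearest palette color.
        for (int[] pixel : pixels) {
            int cell = nearest(pixel, palette);
            for (int c = 0; c < 3; c++) {
                sum[cell][c] += pixel[c];
            }
            count[cell]++;
        }

        // Update step: each palette color moves to the mean of its cell.
        int[][] next = new int[k][3];
        for (int i = 0; i < k; i++) {
            for (int c = 0; c < 3; c++) {
                next[i][c] = count[i] == 0
                        ? palette[i][c] // an empty cell keeps its old color
                        : (int) Math.round((double) sum[i][c] / count[i]);
            }
        }
        return next;
    }

    /** Index of the palette color with the smallest squared RGB distance. */
    static int nearest(int[] pixel, int[][] palette) {
        int best = 0;
        long bestDistance = Long.MAX_VALUE;
        for (int i = 0; i < palette.length; i++) {
            long distance = 0;
            for (int c = 0; c < 3; c++) {
                long diff = pixel[c] - palette[i][c];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }
}
```

Repeating iterate until the palette stops changing gives a palette that is locally optimal for the chosen distance measure.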
As a specific embodiment of the present invention, to further improve color, the histogram is adjusted in a process similar to gradient descent: the median cut is repeated multiple times, with more weight given to poorly represented colors.
As a specific embodiment of the present invention, pictures that need cropping are first cropped with java and then compressed. Pictures that cannot be rendered because of three-dimensional rotation and the like are regenerated as pictures on a Windows machine.
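The crop-then-compress order can be sketched in Java as follows. The class CropThenCompress and its method names are hypothetical, the crop rectangle is assumed to come from the parsed picture information, and the quality range passed to pngquant is only an example value. Regenerating pictures with three-dimensional rotation on a Windows machine is not covered.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: crops a picture in Java, then prepares a pngquant call.
class CropThenCompress {

    /** Copies the rectangle (x, y, width, height) of the source into a new image. */
    static BufferedImage crop(BufferedImage source, int x, int y, int width, int height) {
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        int[] argb = source.getRGB(x, y, width, height, null, 0, width);
        out.setRGB(0, 0, width, height, argb, 0, width);
        return out;
    }

    /** Command line that compresses a png with pngquant (quality range is an example). */
    static List<String> pngquantCommand(String input, String output) {
        return Arrays.asList("pngquant", "--force", "--quality=65-80",
                "--output", output, "--", input);
    }
}
```

The cropped image would be written to disk, for example with javax.imageio.ImageIO.write(image, "png", file), before the pngquant command is started with ProcessBuilder.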
The method of use is as follows:
A. Install ffmpeg, a font library, GraphicsMagick, LibreOffice, fonttools, LTS (a distributed task system), tomcat, postgres and redis on the linux system.
B. Install ffmpeg and GraphicsMagick on the windows system.
C. Execute linux or windows commands using a ruby file.
D. Configure the java environment, package the document conversion code onto the windows system, and place the service system on the linux system. The conversion code for audio, video, pictures and the like can be deployed on multiple machines and is not limited to linux or windows. Then start each system.

Claims (8)

1. A document conversion method, characterized by comprising the following steps:
Step one: parse the ppt file with java code: call a ppt command on Windows to convert the ppt into a pptx file, rename the pptx file as a zip package, open the zip package, and parse all xml files in the zip;
Step two: process the various elements in the ppt differently, the element processing comprising character processing, animation processing, recording processing, audio and video processing, and picture processing;
Step three: render the parsed content to a page by means of an ftl template, thereby realizing h5 playback of the document file.
2. The document conversion method according to claim 1, wherein the character processing in step two is as follows: the size, font, bold, italic, color, background color, alignment, rotation, three-dimensional model and shadow of the characters are parsed, and the parsed content is then restored from the ppt into a form usable in h5.
3. The document conversion method according to claim 1, wherein the animation processing in step two is as follows: the animation modes are extracted and, according to the animations in the ppt, one part of the animations is drawn with a 2d graphics tool and the other part is rendered with native webpage code.
4. The document conversion method according to claim 1, wherein the recording processing in step two is as follows: the recording function of the WeChat client is used; the recording is made through WeChat and downloaded to a local server, where ffmpeg converts it to mp3, a music format generally supported by webpages; a Fourier transform algorithm is then applied to the mp3 file to remove part of the noise in the recording.
5. The document conversion method according to claim 1, wherein the audio and video processing in step two is as follows: ffmpeg converts the audio format and coding to the mp3 format with aac coding, and converts video to mp4 with h264 coding, so that the media can be played on all webpages.
6. The document conversion method according to claim 1, wherein the picture processing in step two is as follows: the picture information is analyzed first, and pictures that do not need cropping are compressed directly; the compression tool used is pngquant; pngquant uses a modified version of the median cut quantization algorithm together with additional techniques to mitigate the deficiencies of median cut; boxes are selected so as to minimize the variance from their median value; the histogram is built with a basic perception model; colors are corrected using Voronoi iteration; pngquant works in a premultiplied alpha color space; and when remapping, error diffusion is applied only where several adjacent pixels quantize to the same value and are not edges.
7. The document conversion method according to claim 6, wherein, to further improve color, the histogram is adjusted in a process similar to gradient descent: the median cut is repeated multiple times, with more weight given to poorly represented colors.
8. The document conversion method according to claim 6, wherein pictures that need cropping are first cropped with java and then compressed, and pictures that cannot be rendered because of three-dimensional rotation and the like are regenerated as pictures on a Windows machine.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011160314.XA CN112257387A (en) 2020-10-27 2020-10-27 Document conversion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011160314.XA CN112257387A (en) 2020-10-27 2020-10-27 Document conversion method

Publications (1)

Publication Number Publication Date
CN112257387A (en) 2021-01-22

Family

ID=74262494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011160314.XA Pending CN112257387A (en) 2020-10-27 2020-10-27 Document conversion method

Country Status (1)

Country Link
CN (1) CN112257387A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102156742A (en) * 2011-04-19 2011-08-17 北京神州数码思特奇信息技术股份有限公司 Method and middleware for supporting structured document display with own browser of mobile phone
CN105630459A (en) * 2014-10-25 2016-06-01 上海未达数码科技有限公司 Method for converting PPT document to HTML page
CN107015950A (en) * 2017-03-20 2017-08-04 厦门云开云科技有限公司 The generation method and device of a kind of SCORM coursewares
CN108228843A (en) * 2018-01-09 2018-06-29 闫健 A kind of handout compression transmission and restoring method based on internet

Similar Documents

Publication Publication Date Title
US11418832B2 (en) Video processing method, electronic device and computer-readable storage medium
CN102368247B (en) Method for executing SWF (Small Web Format) file on handheld terminal
GB2593327A (en) Colour conversion within a hierarchical coding scheme
US11954455B2 (en) Method for translating words in a picture, electronic device, and storage medium
CN109147805B (en) Audio tone enhancement based on deep learning
US10600337B2 (en) Intelligent content parsing with synthetic speech and tangible braille production
CN108495174B (en) Method and system for generating video file by H5 page effect
US20230032417A1 (en) Game special effect generation method and apparatus, and storage medium and electronic device
CN114495102A (en) Text recognition method, and training method and device of text recognition network
CN112257387A (en) Document conversion method
CN114495977A (en) Speech translation and model training method, device, electronic equipment and storage medium
US11915458B1 (en) System and process for reducing time of transmission for single-band, multiple-band or hyperspectral imagery using machine learning based compression
CN111144071B (en) Cross-platform MathType formula conversion method and device
CN111554277B (en) Voice data recognition method, device, equipment and medium
CN113038134B (en) Picture processing method, intelligent terminal and storage medium
US20230046763A1 (en) Speech recognition apparatus, control method, and non-transitory storage medium
CA2521445A1 (en) Code conversion method and apparatus
CN111949234B (en) Drawing processing method and system, terminal equipment, computer equipment and medium
CN113343135A (en) Method and device for picture synthesis video and electronic equipment
CN116546272A (en) Method and device for generating visual media data, electronic equipment and storage medium
CN117762368A (en) Image display method and device, storage medium and electronic equipment
CN113034625B (en) Lossless compression method based on picture, intelligent terminal and storage medium
Redfern Computational analysis of a horror film trailer soundtrack with Python
CN118172496A (en) Three-dimensional reconstruction method, system, medium, device and program product
CN114818633A (en) PDF report generation system, method, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination