CN111881650A - PDF document generation method and device and electronic equipment - Google Patents

Info

Publication number: CN111881650A
Authority: CN (China)
Legal status (an assumption, not a legal conclusion): Pending
Application number: CN202010701841.0A
Other languages: Chinese (zh)
Inventor: 王晓博
Current Assignee (the listed assignee may be inaccurate): Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee: Beijing Baidu Netcom Science and Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/151 Transformation
    • G06F40/154 Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets

Abstract

The application discloses a PDF document generation method, a PDF document generation device, and an electronic device, and relates to the technical fields of data format conversion and large search. The specific implementation scheme is as follows: acquire source data to be converted; generate a Hypertext Markup Language (HTML) page according to the source data to be converted; call the browser engine QtWebkit to generate a directory tree of the HTML page; and generate a Portable Document Format (PDF) document according to the directory tree and the HTML page. Because the directory tree is generated by QtWebkit from the HTML page rather than located manually, the efficiency of generating a PDF document with a directory tree can be improved.

Description

PDF document generation method and device and electronic equipment
Technical Field
The present application relates to the field of data format conversion technology and large search technology in computer technology, and in particular, to a method and an apparatus for generating a PDF document, and an electronic device.
Background
A PDF (Portable Document Format) document is a common document type. Data in other formats, for example JSON (JavaScript Object Notation) data, which is a lightweight data exchange format, can be converted into a PDF document.
Currently, a commonly adopted method for converting JSON data into a PDF document is to convert the JSON data into XML (eXtensible Markup Language) by using the XML classes of the programming language in use, then construct XSL (eXtensible Stylesheet Language), locate the directory entries by manually searching for keywords, and then generate a PDF document with a directory.
Disclosure of Invention
The application provides a PDF document generation method and device and electronic equipment.
In a first aspect, an embodiment of the present application provides a PDF document generating method, where the method includes:
acquiring source data to be converted;
generating a hypertext markup language (HTML) page according to the source data to be converted;
calling a browser engine QtWebkit to generate a directory tree of the HTML page;
and generating a portable document format PDF document according to the directory tree and the HTML page.
According to the PDF document generation method, an HTML page is first generated according to the source data to be converted, QtWebkit is then called to generate a directory tree of the HTML page, and a PDF document is then generated according to the directory tree and the HTML page, so that the source data to be converted is converted into the PDF document. Because the directory tree is generated by QtWebkit rather than located manually, the efficiency of generating a PDF document with a directory tree can be improved.
In a second aspect, an embodiment of the present application provides a PDF document generating apparatus, including:
the first acquisition module is used for acquiring source data to be converted;
the page generation module is used for generating a hypertext markup language (HTML) page according to the source data to be converted;
the directory generation module is used for calling a browser engine QtWebkit to generate a directory tree of the HTML page;
and the document generating module is used for generating a portable document format PDF document according to the directory tree and the HTML page.
When the PDF document generating device in the embodiment of the present application generates a PDF document, an HTML page is first generated according to the source data to be converted, QtWebkit is then called to generate a directory tree of the HTML page, and a PDF document is then generated according to the directory tree and the HTML page, so that the source data to be converted is converted into the PDF document. Because the directory tree is generated by QtWebkit rather than located manually, the efficiency of generating a PDF document with a directory tree can be improved.
In a third aspect, an embodiment of the present application further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the methods provided by the embodiments of the present application.
In a fourth aspect, an embodiment of the present application further provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method provided by the embodiments of the present application.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flow chart of a PDF document generation method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a PDF document generation method implemented according to an embodiment provided herein;
FIG. 3 is an architecture diagram of a PDF document generating system implementing the PDF document generating method according to an embodiment of the present application;
FIG. 4 is a block diagram of a PDF document generating device according to an embodiment of the present application;
FIG. 5 is a block diagram of an electronic device for implementing a PDF document generating method according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As shown in FIG. 1, according to an embodiment of the present application, the present application provides a PDF document generating method, including:
Step S101: acquiring source data to be converted.
The PDF document generation method in the embodiment of the application can be applied to a server. The source data to be converted is the source data that needs to be converted into a PDF document; for example, it can be JSON data. As an example, the server may receive the source data to be converted sent by an external device through a Web service, and the Web service may encrypt the source data and write the encrypted source data into a file system of the server, so that the source data to be converted can later be obtained from the file system.
Step S102: generating a hypertext markup language (HTML) page according to the source data to be converted.
After the source data to be converted is obtained, a corresponding HTML page can be generated according to the source data to be converted, and the HTML page can be displayed through a browser.
Step S103: calling a browser engine QtWebkit to generate a directory tree of the HTML page.
QtWebkit is a toolkit that encapsulates the Webkit browser engine (an open source browser engine) by using Qt (an application development framework); it can also be understood as a browser engine running in a Qt environment. Parsing the HTML page with QtWebkit yields the directory tree, that is, the hierarchical table of contents of the page.
Step S104: generating a portable document format PDF document according to the directory tree and the HTML page.
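The idea of deriving a directory tree from a parsed page can be illustrated with a minimal sketch. It is not QtWebkit and makes no claim about how QtWebkit works internally; it assumes, for illustration only, that directory entries correspond to the `<h1>` to `<h6>` headings of the page, and the names `OutlineNode`, `HeadingOutlineParser`, and `build_outline` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class OutlineNode:
    """One entry of the directory tree: a heading and its nested sub-headings."""
    level: int
    title: str
    children: list[OutlineNode] = field(default_factory=list)


class HeadingOutlineParser(HTMLParser):
    """Collects <h1>..<h6> elements and nests them by heading level (assumed rule)."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__()
        self.root = OutlineNode(level=0, title="")
        self._stack = [self.root]    # path from the root to the most recent heading
        self._current_level = None   # level of the heading being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current_level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Only text inside a heading contributes to a directory entry.
        if self._current_level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current_level is not None:
            node = OutlineNode(self._current_level, "".join(self._text).strip())
            # Pop back to the nearest ancestor with a smaller heading level.
            while self._stack[-1].level >= node.level:
                self._stack.pop()
            self._stack[-1].children.append(node)
            self._stack.append(node)
            self._current_level = None


def build_outline(html: str) -> OutlineNode:
    """Parse an HTML string and return the root of its heading outline."""
    parser = HeadingOutlineParser()
    parser.feed(html)
    parser.close()
    return parser.root
```

Because the tree is computed from the markup itself, no keyword search or manual positioning is involved, which is the property the method relies on.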
After the directory tree is generated, the directory tree and the HTML page may be converted into a PDF document, and it can be understood that the generated PDF document includes the contents of the directory tree and the HTML page.
According to the PDF document generation method, an HTML page is first generated according to the source data to be converted, QtWebkit is then called to generate a directory tree of the HTML page, and a PDF document is then generated according to the directory tree and the HTML page, so that the source data to be converted is converted into the PDF document. Because the directory tree is generated by QtWebkit rather than located manually, the efficiency of generating a PDF document with a directory tree can be improved.
In one embodiment, obtaining the source data to be converted includes: calling an HTML renderer to acquire the source data to be converted.
In this embodiment, generating an HTML page according to the source data to be converted includes: converting the source data to be converted into HTML data through the HTML renderer; and parsing the HTML data through a non-interface browser (that is, a headless browser) to generate the HTML page.
The source data to be converted is obtained by an HTML renderer in the server and converted into HTML data; the HTML data is transmitted to a non-interface browser in the server, which parses the HTML data and generates the HTML page. The HTML renderer may be a server-side rendering (SSR) renderer.
In this embodiment, the source data to be converted is converted into HTML data by the HTML renderer, the HTML data is then parsed by the non-interface browser to generate the HTML page corresponding to the source data to be converted, and a PDF document is then generated according to the directory tree and the HTML page. In this way, in the process of generating the PDF document, the non-interface browser parses the HTML data that the HTML renderer obtained from the source data to be converted, and generates the HTML page.
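The conversion step performed by the HTML renderer can be sketched as follows. This is only an illustration under stated assumptions: the source describes an SSR renderer, whereas this sketch uses plain string templating, assumes the JSON is an array of flat objects, and introduces the hypothetical function name `render_html`.

```python
import json
from html import escape


def render_html(source_json: str, title: str = "Report") -> str:
    """Render a JSON array of flat objects as an HTML page containing one table."""
    rows = json.loads(source_json)
    # Column order is taken from the first object (an assumption of this sketch).
    columns = list(rows[0].keys()) if rows else []
    head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(row.get(c, '')))}</td>" for c in columns)
        + "</tr>"
        for row in rows
    )
    return (
        '<!DOCTYPE html><html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head><body>"
        f"<h1>{escape(title)}</h1>"
        f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
        "</body></html>"
    )
```

Escaping every value keeps data that happens to contain markup from altering the structure of the generated page.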
In one embodiment, invoking the HTML renderer to obtain the source data to be converted includes: when it is detected that the source data to be converted has been written into the file system, calling the HTML renderer through the non-interface browser, and acquiring the source data to be converted from the file system through the HTML renderer.
When it is detected that source data to be converted has been written into the file system, the non-interface browser is started first if it is not already running; the HTML renderer is then called through the non-interface browser, and the source data to be converted is acquired from the file system through the HTML renderer.
In this embodiment, the source data to be converted received from the external device is first written into the file system. When it is detected that the source data has been written into the file system, the HTML renderer is called through the non-interface browser, and the source data to be converted is acquired from the file system through the HTML renderer.
As an example, a data monitoring module may monitor the file system. When the data monitoring module detects that source data to be converted has been written into the file system, it may send a call instruction to the non-interface browser. If the non-interface browser is not running, it is started after the call instruction is received; the HTML renderer is then called through the non-interface browser, and the source data to be converted is obtained from the file system through the HTML renderer.
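The monitoring behaviour described above can be sketched with a simple polling loop. This is a minimal sketch, not the described system: it assumes detection by comparing directory listings, and `DirectoryWatcher`, `HeadlessBrowserStub`, and `run_demo` are hypothetical names; the stub only records calls and does not launch a real browser.

```python
from __future__ import annotations

import os
import tempfile
from typing import Callable


class HeadlessBrowserStub:
    """Hypothetical stand-in for the non-interface browser process."""

    def __init__(self) -> None:
        self.running = False
        self.start_count = 0
        self.handled: list[str] = []

    def handle(self, path: str) -> None:
        # Start the browser only if it is not already running.
        if not self.running:
            self.running = True
            self.start_count += 1
        self.handled.append(path)  # here the real system would call the HTML renderer


class DirectoryWatcher:
    """Reports files that appeared in a directory since the previous poll."""

    def __init__(self, directory: str, on_new_file: Callable[[str], None]) -> None:
        self.directory = directory
        self.on_new_file = on_new_file
        self._seen = set(os.listdir(directory))

    def poll(self) -> list[str]:
        current = set(os.listdir(self.directory))
        new_names = sorted(current - self._seen)
        self._seen = current
        for name in new_names:
            self.on_new_file(os.path.join(self.directory, name))
        return new_names


def run_demo() -> tuple[int, int]:
    """Write two files, polling after each; return (browser starts, files handled)."""
    with tempfile.TemporaryDirectory() as directory:
        browser = HeadlessBrowserStub()
        watcher = DirectoryWatcher(directory, browser.handle)
        for name in ("a.json", "b.json"):
            with open(os.path.join(directory, name), "w", encoding="utf-8") as handle:
                handle.write("{}")
            watcher.poll()
        return browser.start_count, len(browser.handled)
```

In the demo the browser stub is started once and then reused for the second file, matching the "start only if not started" rule in the text.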
In one embodiment, the HTML renderer calls the Web service to acquire the source data to be converted from the file system, and the source data to be converted is decrypted by the Web service.
In this embodiment, converting the source data to be converted into HTML data through the HTML renderer includes: converting the decrypted source data to be converted into HTML data through the HTML renderer.
The server receives the source data to be converted sent by an external device through the Web service, and the Web service can encrypt the source data to be converted and write the encrypted source data into the file system. The HTML renderer calls the Web service, which reads the encrypted source data to be converted from the file system, decrypts it, and transmits the decrypted source data to the HTML renderer.
The HTML renderer then converts the decrypted source data to be converted into HTML data.
In one embodiment, parsing the HTML data through the non-interface browser to generate an HTML page includes: generating a chart and/or a watermark through the non-interface browser according to the source data to be converted when the information of a first target field in the source data to be converted is first preset information and/or the information of a second target field in the source data to be converted is second preset information; and generating the HTML page through the non-interface browser according to the HTML data and the chart and/or the watermark.
The source data to be converted includes the contents required for generating the PDF, for example, an identification of each user in a user group, the name of the jacket required by the user, the size of the user, the jacket color required by the user, the jacket type required by the user, and the like. For example, the user group includes four users whose identifications are 1, 2, 3, and 4; the required jacket name is T-shirt, the size is 176, the color is red, and the type is adult clothes. The generated PDF document includes these contents. The source data to be converted also includes information of some special fields, for example, information of a first target field indicating whether to generate a chart and/or information of a second target field indicating whether to generate a watermark. The content of the generated PDF document does not include the information of the first target field or the second target field; that is, when converting the source data to be converted into HTML data, the HTML renderer needs to filter out the information of the first target field and the second target field and convert the filtered source data into HTML data.
It should be noted that the HTML renderer may still transmit the source data to be converted, including the information of the first target field and the information of the second target field, to the non-interface browser, and the non-interface browser determines whether to generate the chart and the watermark according to that information. That is, the non-interface browser generates a chart when the information of the first target field is the first preset information, and generates a watermark when the information of the second target field is the second preset information. For example, if the data to be converted includes the daily sales volume of a certain product over a week, a chart with time on the abscissa and sales volume on the ordinate may be generated. In addition, the watermark pattern may be a default pattern, or the data to be converted may further include information of a third target field indicating the watermark pattern, in which case the non-interface browser generates the watermark corresponding to the information of the third target field; if the data to be converted does not include the information of the third target field, the non-interface browser may generate the default watermark. The HTML page generated through the non-interface browser according to the HTML data and the chart and/or the watermark thus includes not only the parsed content of the HTML data but also the chart and/or the watermark.
In this embodiment, a watermark and/or a chart corresponding to the source data to be converted can be generated by the non-interface browser and added to the HTML page, which enriches the content of the HTML page and thus the content of the generated PDF.
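The handling of the target fields can be sketched as a small planning function. The source does not name the fields or the preset values, so every constant below (`generateChart`, `generateWatermark`, `watermarkStyle`, `"yes"`, `"default"`) is a hypothetical placeholder, as are `RenderPlan` and `plan_rendering`.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical field names and preset values; the source does not specify them.
CHART_FIELD = "generateChart"              # first target field
WATERMARK_FIELD = "generateWatermark"      # second target field
WATERMARK_STYLE_FIELD = "watermarkStyle"   # third target field
PRESET_ON = "yes"                          # assumed preset information
DEFAULT_WATERMARK_STYLE = "default"


@dataclass
class RenderPlan:
    content: dict                   # source data with the control fields filtered out
    chart: bool                     # whether a chart should be generated
    watermark_style: Optional[str]  # None means no watermark


def plan_rendering(source: dict) -> RenderPlan:
    """Separate PDF content from control fields and decide on chart and watermark."""
    control_fields = {CHART_FIELD, WATERMARK_FIELD, WATERMARK_STYLE_FIELD}
    content = {key: value for key, value in source.items() if key not in control_fields}
    chart = source.get(CHART_FIELD) == PRESET_ON
    watermark_style = None
    if source.get(WATERMARK_FIELD) == PRESET_ON:
        # Fall back to the default pattern when no third target field is present.
        watermark_style = source.get(WATERMARK_STYLE_FIELD, DEFAULT_WATERMARK_STYLE)
    return RenderPlan(content=content, chart=chart, watermark_style=watermark_style)
```

Keeping the filtered content and the decisions in one result object means the renderer sees only `content`, while the browser stage can act on `chart` and `watermark_style`.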
It should be noted that, in the process of generating a PDF document according to the directory tree and the HTML page, the directory tree and the HTML page may be converted into the PDF document by QtWebkit; that is, QtWebkit generates the corresponding PDF document according to the directory tree and the HTML page.
In one embodiment, after generating the portable document format PDF document, the method further comprises: deleting the source data to be converted stored in the file system.
After the portable document format PDF document is generated, the generated PDF document can be output, and in addition, because the source data to be converted received from the external equipment is stored in the file system, the source data to be converted stored in the file system can be deleted after the PDF document is generated, so the storage space of the file system can be saved.
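A minimal sketch of this cleanup step is shown below. Deleting the source only after a successful conversion is a design assumption of the sketch rather than something the source states, and `convert_then_delete`, `run_demo`, and the placeholder converter are hypothetical.

```python
import os
import tempfile
from typing import Callable


def convert_then_delete(source_path: str, convert: Callable[[str], bytes]) -> bytes:
    """Run the conversion, then delete the stored source data to free space."""
    pdf_bytes = convert(source_path)  # an exception here leaves the source file in place
    os.remove(source_path)
    return pdf_bytes


def run_demo() -> bool:
    """Convert a temporary file with a placeholder converter; report whether it still exists."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "report.json")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write('{"id": 1}')
        convert_then_delete(path, lambda p: b"%PDF-placeholder")
        return os.path.exists(path)
```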
The following describes the procedure of the PDF document generating method in a specific embodiment, taking the source data to be converted as JSON data as an example.
As shown in FIG. 2, the PDF document generation method is implemented as follows. First, JSON data sent by a service side (i.e., an external device) is received through the Web service, which encrypts the JSON data and writes the encrypted JSON data into the file system. When the data monitoring module detects that JSON data has been written into the file system, it sends a call instruction to the non-interface browser, and the non-interface browser is started. The non-interface browser calls the SSR renderer, and the SSR renderer calls the Web service, which reads the encrypted JSON data from the file system, decrypts it, and transmits the JSON data to the SSR renderer. The SSR renderer converts the JSON data into HTML data and transmits the HTML data to the non-interface browser. The non-interface browser determines, according to the information of the first target field and the information of the second target field in the JSON data, whether a chart and a watermark need to be generated. If the information of the first target field indicates that a chart needs to be generated and/or the information of the second target field indicates that a watermark needs to be generated, the non-interface browser generates the chart and/or the watermark and generates the HTML page according to the HTML data and the generated chart and/or watermark; that is, the chart and/or the watermark is added to the HTML. The non-interface browser then transmits the HTML page to QtWebkit, which parses the HTML page to generate the directory tree (whose style may be a default style or the like) and generates the PDF document according to the directory tree and the HTML page. The conversion from JSON data to the PDF document is thus completed, and the PDF document is output.
It should be noted that the PDF document generating system implementing the PDF document generating method is based on Node.js, a JavaScript (JS) execution environment; JavaScript is a lightweight, interpreted or just-in-time compiled high-level programming language. As shown in FIG. 3, the system mainly includes a data monitoring module, a Web service, a process manager, an SSR renderer, a non-interface browser, and a system library (including QtWebkit). The process manager guards the Web service and the data monitoring module independently, starts a plurality of processes to process data, and supports log tracking and overload restarting.
The system libraries include the QtWebkit engine, wkhtmltopdf (a command line tool that invokes the engine), and the dependent libraries for font sets, rendering, rasterization, etc. needed to generate PDFs, installed through yum (a package management tool that runs on the operating system).
The Web service mainly includes Express (a Node.js Web application framework, where Node.js is a JavaScript execution environment capable of running on the server side), CryptoJS (an encryption tool), and TypeScript (a static type system that provides static type checking for JavaScript (JS)).
The externally provided service interfaces are /SSR (providing the URL of the SSR renderer), /getReport (for acquiring JSON data), and /saveddf (for storing JSON files).
The SSR renderer mainly includes Vue Server Renderer (the server-side rendering framework of Vue, a front-end framework for building client applications), NextJS (a React server-side rendering framework, where React is a JavaScript library for building user interfaces), an LRU cache (a caching mechanism that caches server-side rendering results to improve system performance), and ECharts (a chart rendering engine).
The non-interface browser uses Puppeteer (a Node.js package released in 2017) to operate browser behavior.
The data monitoring module uses a cookie (which is a plug-in for monitoring file changes under Node.js). The process manager uses PM2 (a Node.js process management tool) as the daemon and pm2-logrotate (a PM2 plug-in that extends PM2 with log management) to cut the service logs.
The method and the device provide a general HTTP (Hypertext Transfer Protocol) service for generating PDF documents from JSON data, effectively address problems such as directory indexing, chart generation, and performance, and use an isomorphic rendering design to reuse the existing front-end page generation code to the greatest extent, which supports generating PDF documents that are more faithful to the rendered page.
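For orientation, the sketch below builds a wkhtmltopdf command line that requests a table of contents and a PDF outline. The source does not show how the system invokes wkhtmltopdf, so the argument list is an assumption based on the tool's documented `toc` object and `--outline` option; the function name `build_wkhtmltopdf_command` and the file names are hypothetical, and the command is only constructed here, not executed.

```python
def build_wkhtmltopdf_command(html_path: str, pdf_path: str, with_toc: bool = True) -> list:
    """Return an argv list of the form: wkhtmltopdf [global options] [objects] <output>."""
    command = ["wkhtmltopdf", "--outline"]  # --outline embeds a PDF outline (bookmarks)
    if with_toc:
        command.append("toc")               # 'toc' object: a generated table-of-contents page
    command += [html_path, pdf_path]        # page object (input) followed by the output file
    return command
```

On a machine where wkhtmltopdf is installed, the returned list could be passed to `subprocess.run(command, check=True)`; whether that matches the described system's actual invocation is not stated in the source.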
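The role of the LRU cache can be illustrated with a small least-recently-used cache around a render function. This is a sketch under assumptions: the source does not say what the cache is keyed on, so keying by a hash of the JSON input is hypothetical, as are the names `RenderCache` and `get`.

```python
import hashlib
from collections import OrderedDict
from typing import Callable


class RenderCache:
    """Caches rendering results and evicts the least recently used entry when full."""

    def __init__(self, render: Callable[[str], str], capacity: int = 128) -> None:
        self._render = render
        self._capacity = capacity
        self._entries = OrderedDict()  # key -> rendered HTML, oldest entry first
        self.hits = 0

    def get(self, source_json: str) -> str:
        # Assumed cache key: SHA-256 of the JSON text.
        key = hashlib.sha256(source_json.encode("utf-8")).hexdigest()
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        html = self._render(source_json)
        self._entries[key] = html
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # drop the least recently used entry
        return html
```

Repeated requests for the same input then skip the rendering step entirely, which is the performance benefit the text attributes to caching server-side rendering results.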
As shown in FIG. 4, according to an embodiment of the present application, the present application further provides a PDF document generating device 400, which includes:
a first obtaining module 401, configured to obtain source data to be converted;
a page generating module 402, configured to generate a hypertext markup language HTML page according to source data to be converted;
a directory generation module 403, configured to invoke a browser engine QtWebkit to generate a directory tree of an HTML page;
and a document generating module 404, configured to generate a portable document format PDF document according to the directory tree and the HTML page.
When the PDF document generating device in the embodiment of the present application generates a PDF document, an HTML page is first generated according to the source data to be converted, QtWebkit is then called to generate a directory tree of the HTML page, and a PDF document is then generated according to the directory tree and the HTML page, so that the source data to be converted is converted into the PDF document. Because the directory tree is generated by QtWebkit rather than located manually, the efficiency of generating a PDF document with a directory tree can be improved.
In one embodiment, obtaining source data to be transformed includes:
calling an HTML renderer to obtain source data to be converted;
a page generation module 402, comprising:
the conversion module is used for converting the source data to be converted into HTML data through the HTML renderer;
and the HTML page generation module is used for analyzing the HTML data through the non-interface browser to generate the HTML page.
In one embodiment, invoking the HTML renderer to obtain the source data to be converted includes:
when it is detected that the source data to be converted has been written into the file system, calling the HTML renderer through the non-interface browser, and acquiring the source data to be converted from the file system through the HTML renderer.
In one embodiment, the HTML renderer calls the Web service to acquire the source data to be converted from the file system, and the source data to be converted is decrypted by the Web service;
converting the source data to be converted into HTML data through the HTML renderer includes:
converting the decrypted source data to be converted into HTML data through the HTML renderer.
In one embodiment, the HTML page generation module comprises:
a first generation module, configured to generate a chart and/or a watermark through the non-interface browser according to the source data to be converted when the information of a first target field in the source data to be converted is first preset information and/or the information of a second target field in the source data to be converted is second preset information;
and a second generation module, configured to generate the HTML page through the non-interface browser according to the HTML data and the chart and/or the watermark.
The PDF document generating device according to each embodiment is a device for implementing the PDF document generating method according to each embodiment, and has corresponding technical features and technical effects, which are not described herein again.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
FIG. 5 is a block diagram of an electronic device for the PDF document generating method according to the embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in FIG. 5, the electronic device includes: one or more processors 501, memory 502, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In FIG. 5, one processor 501 is taken as an example.
Memory 502 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the PDF document generation method provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the PDF document generating method provided by the present application.
The memory 502, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the PDF document generating method in the embodiment of the present application (e.g., the first obtaining module 401, the page generating module 402, the catalog generating module 403, and the document generating module 404 shown in fig. 4). The processor 501 executes various functional applications of the server and data processing by running non-transitory software programs, instructions, and modules stored in the memory 502, that is, implements the PDF document generating method in the above-described method embodiment.
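The relationship between the four modules named above can be pictured as a simple pipeline. The following Python is only a sketch under stated assumptions: the class `PdfPipeline` and its field names are hypothetical, the modules are modelled as plain callables, and nothing here reflects how the application actually implements modules 401 to 404.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PdfPipeline:
    """Hypothetical wiring of modules 401-404; all names are illustrative."""

    acquire: Callable[[], Any]                    # first obtaining module 401
    render_page: Callable[[Any], str]             # page generating module 402
    build_catalog: Callable[[str], list]          # catalog generating module 403
    write_document: Callable[[list, str], bytes]  # document generating module 404

    def run(self) -> bytes:
        # Each stage consumes the output of the previous one.
        source = self.acquire()
        page = self.render_page(source)
        catalog = self.build_catalog(page)
        # The final stage needs both the catalog and the page.
        return self.write_document(catalog, page)
```

Keeping each stage as an injected callable means a stub can stand in for the real renderer or browser engine when exercising the flow.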
The memory 502 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to use of the electronic device for PDF document generation, and the like. Further, the memory 502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 502 may optionally include memory located remotely from the processor 501, which may be connected to the electronic device for PDF document generation over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the PDF document generating method may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function controls of the electronic device for PDF document generation. Examples of the input device include a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output device 504 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, special-purpose ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented using procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, an HTML page is generated from the source data to be converted, the browser engine QtWebkit is then called to generate a directory tree of the HTML page, and a PDF document is generated from the directory tree and the HTML page, thereby converting the source data to be converted into a PDF document. Because QtWebkit generates the directory tree from the HTML page, no manual positioning is needed to build the directory, so the efficiency of generating a PDF document that has a directory tree can be improved.
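The idea of deriving a directory tree from an HTML page can be illustrated with a short sketch. The application delegates this step to QtWebkit; the Python below is neither QtWebkit nor the applicant's code. It assumes, purely for illustration, that the directory tree mirrors the nesting of `h1` to `h6` headings, and the names `OutlineBuilder` and `build_outline` are hypothetical.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Collects h1..h6 headings into a nested outline (hypothetical helper)."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.root = {"level": 0, "title": "ROOT", "children": []}
        self._stack = [self.root]  # path from the root to the latest heading
        self._current = None       # heading whose text is currently being read

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            node = {"level": int(tag[1]), "title": "", "children": []}
            # Pop until the top of the stack is a shallower heading (or the root).
            while self._stack[-1]["level"] >= node["level"]:
                self._stack.pop()
            self._stack[-1]["children"].append(node)
            self._stack.append(node)
            self._current = node

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self._current = None


def build_outline(html):
    """Return the top-level outline entries for an HTML string."""
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.root["children"]
```

For example, `build_outline("<h1>A</h1><h2>B</h2><h1>C</h1>")` yields two top-level entries, with "B" nested under "A". A PDF writer could then map each entry to a bookmark, which is the role the directory tree plays in the generated document.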
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. A PDF document generation method, wherein the method comprises the following steps:
acquiring source data to be converted;
generating a hypertext markup language (HTML) page according to the source data to be converted;
calling a browser engine QtWebkit to generate a directory tree of the HTML page;
and generating a portable document format PDF document according to the directory tree and the HTML page.
2. The method of claim 1, wherein the acquiring source data to be converted comprises:
calling an HTML renderer to obtain source data to be converted;
generating an HTML page according to the source data to be converted comprises the following steps:
converting the source data to be converted into HTML data through the HTML renderer;
and parsing the HTML data through the non-interface browser to generate an HTML page.
3. The method of claim 2, wherein the calling the HTML renderer to obtain the source data to be converted comprises:
and under the condition that the source data to be converted is monitored to be written into the file system, calling the HTML renderer through the non-interface browser, and acquiring the source data to be converted from the file system through the HTML renderer.
4. The method of claim 3, wherein the HTML renderer calls a web service to acquire the source data to be converted from the file system, and decrypts the source data to be converted through the web service;
the converting the source data to be converted into HTML data through the HTML renderer comprises:
and converting the decrypted source data to be converted into the HTML data through the HTML renderer.
5. The method of claim 2, wherein the parsing HTML data through the non-interface browser to generate an HTML page comprises:
generating a chart or/and a watermark through a non-interface browser according to the source data to be converted under the condition that the information of a first target field in the source data to be converted is first preset information or/and the information of a second target field in the source data to be converted is second preset information;
and generating the HTML page through the non-interface browser according to the HTML data and the chart or/and the watermark.
6. A PDF document generating apparatus, wherein the apparatus comprises:
the first acquisition module is used for acquiring source data to be converted;
the page generation module is used for generating a hypertext markup language (HTML) page according to the source data to be converted;
the directory generation module is used for calling a browser engine QtWebkit to generate a directory tree of the HTML page;
and the document generating module is used for generating a portable document format PDF document according to the directory tree and the HTML page.
7. The apparatus of claim 6, wherein the acquiring source data to be converted comprises:
calling an HTML renderer to obtain source data to be converted;
the page generation module comprises:
the conversion module is used for converting the source data to be converted into HTML data through the HTML renderer;
and the HTML page generation module is used for parsing the HTML data through the non-interface browser to generate the HTML page.
8. The apparatus of claim 7, wherein the invoking of the HTML renderer to obtain the source data to be converted comprises:
and under the condition that the source data to be converted is monitored to be written into the file system, calling the HTML renderer through the non-interface browser, and acquiring the source data to be converted from the file system through the HTML renderer.
9. The apparatus of claim 8, wherein the HTML renderer calls a web service to acquire the source data to be converted from the file system and decrypts the source data to be converted through the web service;
the converting the source data to be converted into HTML data through the HTML renderer comprises:
and converting the decrypted source data to be converted into the HTML data through the HTML renderer.
10. The apparatus of claim 7, wherein the HTML page generation module comprises:
a first generating module, configured to generate a chart or/and a watermark through a non-interface browser according to the source data to be converted when information of a first target field in the source data to be converted is first preset information or/and information of a second target field in the source data to be converted is second preset information;
and the second generation module is used for generating the HTML page through the non-interface browser according to the HTML data and the chart or/and the watermark.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
CN202010701841.0A 2020-07-20 2020-07-20 PDF document generation method and device and electronic equipment Pending CN111881650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010701841.0A CN111881650A (en) 2020-07-20 2020-07-20 PDF document generation method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010701841.0A CN111881650A (en) 2020-07-20 2020-07-20 PDF document generation method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN111881650A true CN111881650A (en) 2020-11-03

Family

ID=73155153

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010701841.0A Pending CN111881650A (en) 2020-07-20 2020-07-20 PDF document generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111881650A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102855244A (en) * 2011-06-28 2013-01-02 北大方正集团有限公司 Method and device for file catalogue processing
US20140333947A1 (en) * 2013-05-13 2014-11-13 Xerox Corporation Client based splitting of pdf/vt dpart catalog
CN106354700A (en) * 2016-08-11 2017-01-25 广州爱九游信息技术有限公司 Page text conversion method and system
CN107358208A (en) * 2017-07-14 2017-11-17 北京神州泰岳软件股份有限公司 A kind of PDF document structured message extracting method and device
CN110083805A (en) * 2018-01-25 2019-08-02 北京大学 A kind of method and system that Word file is converted to EPUB file

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051504A (en) * 2021-03-23 2021-06-29 北京百度网讯科技有限公司 Document preview method, apparatus, device, storage medium and program product
CN113051504B (en) * 2021-03-23 2023-08-01 北京百度网讯科技有限公司 Document preview method, device, apparatus, storage medium and program product
CN113408248A (en) * 2021-06-08 2021-09-17 南京冰鉴信息科技有限公司 PDF directory generation method and device, computer equipment and readable storage medium
CN115587075A (en) * 2022-12-05 2023-01-10 北京合思信息技术有限公司 Layout file processing method and device, terminal equipment and storage medium
CN116070596A (en) * 2023-03-29 2023-05-05 深圳市奥思网络科技有限公司 PDF file generation method and device based on dynamic data and related medium
CN116070596B (en) * 2023-03-29 2023-06-09 深圳市奥思网络科技有限公司 PDF file generation method and device based on dynamic data and related medium
CN116303252A (en) * 2023-05-18 2023-06-23 北京探索者软件股份有限公司 Format conversion method and device for DWG file, storage medium and electronic device
CN116303252B (en) * 2023-05-18 2023-09-12 北京探索者软件股份有限公司 Format conversion method and device for DWG file, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination