CN109284453B - Data downloading method and device based on PDF document, storage medium and terminal

Info

Publication number
CN109284453B
Authority
CN
China
Prior art keywords
class object
pdf document
data
picture
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810796560.0A
Other languages
Chinese (zh)
Other versions
CN109284453A (en)
Inventor
罗先贤
龙觉刚
孙成
叶俊锋
赖云辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810796560.0A
Priority to PCT/CN2018/111697 (WO2020015220A1)
Publication of CN109284453A
Application granted
Publication of CN109284453B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/186: Templates
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a data downloading method and device based on a PDF document, a storage medium and a terminal. It relates to the technical field of data processing and mainly aims to solve the problem that, because the format and content of the data to be downloaded from an existing website are not fixed, adding the data to a fixed PDF document template increases the processing pressure on the website. The method comprises the following steps: when a data downloading request is received, acquiring text data and picture data according to the request content carried in the data downloading request; converting the text data and the picture data into a first class object and a second class object respectively, and reading a pre-created PDF document template; matching and adding the first class object and the second class object to the class object X and the class object Y, and establishing an output byte stream of the matched PDF document; and writing the established output byte stream of the PDF document into a pre-created empty compressed file, and storing the compressed file containing the PDF document in a temporary storage path of a server.

Description

Data downloading method and device based on PDF document, storage medium and terminal
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for downloading data based on a PDF document, a storage medium, and a terminal.
Background
With the spread of paperless distribution of data, more and more users download data from internet websites in order to view it. For example, on some big data websites, a user who wants to obtain relevant data from the website downloads a document in PDF format, which the user can then print.
At present, when a PDF document is generated from existing data and downloaded, the internet website establishes a fixed PDF document template in advance and, after receiving a data downloading request, adds the data to be downloaded to that template. However, because the format and content of the data to be downloaded are not fixed, adding the data to a fixed PDF document template increases the processing pressure on the website and lengthens the download time, while changing the PDF document template increases the complexity of data downloading and reduces its efficiency.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for downloading data based on a PDF document, a storage medium, and a terminal. The main aim is to solve the following problems: because the format and content of the data to be downloaded from a website are not fixed, adding the data to a fixed PDF document template increases the processing pressure on the website and lengthens the download time, while replacing the PDF document template increases the complexity of data downloading and reduces its efficiency.
According to one aspect of the present invention, there is provided a PDF document-based data downloading method, including:
when a data downloading request is received, acquiring text data and picture data according to request content carried in the data downloading request;
respectively converting the text data and the picture data into a first class object and a second class object, and reading a pre-established PDF document template, wherein a text field and a picture field in the PDF document template are respectively provided with a class object X and a class object Y;
according to the class object attributes of the first class object and the second class object, respectively matching and adding the first class object and the second class object to the class object X and the class object Y, and establishing an output byte stream of the matched PDF document;
writing the established output byte stream of the PDF document into a pre-created empty compressed file, and storing the compressed file into which the PDF document has been written in a temporary storage path of a server, so that a user can download the data of the PDF document through the temporary storage path.
Further, before obtaining the text data and the picture data according to the request content carried in the data downloading request, the method further includes:
establishing the PDF document template, and dividing a text field and a picture field in the PDF document template, wherein the text field comprises text areas with different names, the text areas with different names respectively comprise the quantity and the attribute of different text data, the picture field comprises different picture areas, and each picture area comprises a position coordinate corresponding to the picture area.
Further, before writing the output byte stream of the created PDF document into a pre-created empty compressed file, the method further includes:
and extracting the temporary storage path which is suspended in the server, establishing the empty compressed file, and storing the empty compressed file into the temporary file under the temporary storage path.
Further, the converting the text data and the picture data into a first class object and a second class object, respectively, and reading a pre-created PDF document template includes:
defining a first class object matched with a class object X in the text field according to the data attribute of the text data;
converting the identification code of the picture data into a binary code, and converting the binary code into a second class object matched with the class object Y;
respectively reading a class object X of a text field and a class object Y of a picture field in a pre-created PDF document template, wherein the class object X is a class object to which text data and attributes can be added, and the class object Y is a class object to which the coordinate position of picture data can be added.
Further, the matching and adding the first class object and the second class object to the class object X and the class object Y respectively according to the class object attributes of the first class object and the second class object, and establishing the output byte stream of the matched PDF document includes:
initializing the attribute of the class object X by using an AcroFields class object, and configuring the attribute in the first class object and the attribute of the class object X;
matching and configuring the identification code of the second class object with the coordinate position of the class object Y, wherein the matching and configuration are configured according to a preset mapping relation between the identification code and the coordinate position;
and establishing an output byte stream according to the PDF document after the class object X and the class object Y are matched and configured according to the PDF document template.
Further, after writing the established output byte stream of the PDF document into a pre-created empty compressed file and storing the compressed file containing the PDF document in a temporary storage path of the server, the method further includes:
and after the compression is finished, converting the compressed file into a binary output stream, so that a user can download the compressed file.
Further, the method further comprises:
and clearing the compressed files in the temporary files under the temporary storage path of the server according to a preset time interval.
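The periodic clearing step can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `clear_expired_archives`, the `.zip` suffix check and the age threshold are assumptions made for illustration, and a scheduler (not shown) would call the function at the preset time interval.

```python
import os
import time


def clear_expired_archives(temp_dir, max_age_seconds, now=None):
    """Delete .zip files in temp_dir whose last modification is older than max_age_seconds.

    Hypothetical helper: the patent only states that compressed files are cleared
    at a preset time interval; the age-based rule here is an assumption.
    """
    now = time.time() if now is None else now
    removed = []
    # Materialise the listing first so deletion does not interfere with iteration.
    for entry in list(os.scandir(temp_dir)):
        if not entry.is_file() or not entry.name.endswith(".zip"):
            continue
        if now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Files that are not compressed archives are left untouched, so the same temporary directory could hold other working files.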
According to an aspect of the present invention, there is provided a PDF document-based data downloading apparatus, comprising:
the acquisition unit is used for acquiring text data and picture data according to request content carried in a data downloading request when the data downloading request is received;
the conversion unit is used for respectively converting the text data and the picture data into a first class object and a second class object and reading a pre-created PDF document template, wherein a text field and a picture field in the PDF document template are respectively provided with a class object X and a class object Y;
the adding unit is used for respectively matching and adding the first class object and the second class object to the class object X and the class object Y according to the class object attributes of the first class object and the second class object, and establishing an output byte stream of the matched PDF document;
and the storage unit is used for writing the established output byte stream of the PDF document into a pre-established empty compressed file and storing the compressed file written into the PDF document into a temporary storage path of the server, so that a user can download the data of the PDF document through the temporary storage path.
Further, the apparatus further comprises:
the establishing unit is used for establishing the PDF document template and dividing a text field and a picture field in the PDF document template, wherein the text field comprises text areas with different names, the text areas with different names respectively comprise the quantity and the attributes of different text data, the picture field comprises different picture areas, and each picture area comprises a position coordinate corresponding to the picture area.
Further, the apparatus further comprises:
and the extracting unit is used for extracting the temporary storage path which is suspended in the server, establishing the empty compressed file, and storing the empty compressed file into the temporary file under the temporary storage path.
Further, the conversion unit includes:
the definition module is used for defining a first class object matched with the class object X in the text field according to the data attribute of the text data;
the conversion module is used for converting the identification code of the picture data into a binary code and converting the binary code into a second class object matched with the class object Y;
the reading module is used for respectively reading a class object X of a text field and a class object Y of a picture field in a pre-created PDF document template, wherein the class object X is a class object capable of adding text data and attributes, and the class object Y is a class object capable of adding coordinate positions of picture data.
Further, the adding unit includes:
the first configuration module is used for initializing the attribute of the class object X by using an AcroFields class object and configuring the attribute in the first class object and the attribute of the class object X;
the second configuration module is used for matching and configuring the identification code of the second class object with the coordinate position of the class object Y, and the matching configuration is configured according to a preset mapping relation between the identification code and the coordinate position;
and the establishing module is used for establishing an output byte stream according to the PDF document after the class object X and the class object Y are matched and configured according to the PDF document template.
Further, the conversion unit is further configured to convert the compressed file into a binary output stream after the compression is completed, so that the user can download the compressed file.
Further, the apparatus further comprises:
and the clearing unit is used for clearing the compressed file in the temporary file under the temporary storage path of the server according to a preset time interval.
According to still another aspect of the present invention, there is provided a storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the above PDF document based data downloading method.
According to still another aspect of the present invention, there is provided a terminal including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface are communicated with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the PDF document-based data downloading method.
By the technical scheme, the technical scheme provided by the embodiment of the invention at least has the following advantages:
the invention provides a PDF document-based data downloading method and device, a storage medium and a terminal. In the method, when a data downloading request is received, text data and picture data are first acquired according to the request content carried in the data downloading request; the text data and the picture data are converted into a first class object and a second class object respectively, and a pre-created PDF document template is read, wherein a text field and a picture field in the PDF document template are respectively provided with a class object X and a class object Y; according to the class object attributes of the first class object and the second class object, the first class object and the second class object are respectively matched and added to the class object X and the class object Y, and an output byte stream of the matched PDF document is established; the established output byte stream is written into a pre-created empty compressed file, and the compressed file containing the PDF document is stored in a temporary storage path of a server, so that a user can download the data of the PDF document through the temporary storage path.
Compared with existing websites, where the format and content of the data to be downloaded are not fixed, the embodiment of the invention converts the requested text data and picture data into a first class object and a second class object that can be added to the PDF document template, adds them by matching to the corresponding class object X and class object Y, writes the result into the empty compressed file in the form of the output byte stream of the PDF document, and stores the compressed file in the temporary storage path. Text data and picture data can thus be added to the PDF document flexibly, which reduces the processing pressure on the website and the data downloading time. The flexible text fields and picture fields also reduce the complexity of adding data to the PDF document template and downloading it, thereby improving data downloading efficiency.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various additional advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart illustrating a method for downloading data based on a PDF document according to an embodiment of the present invention;
FIG. 2 is a flow chart of another PDF document-based data downloading method according to an embodiment of the present invention;
FIG. 3 is a block diagram of a PDF document-based data downloading device according to an embodiment of the present invention;
FIG. 4 is a block diagram of another PDF document-based data downloading device according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
An embodiment of the present invention provides a PDF document-based data downloading method, as shown in FIG. 1, where the method includes:
101. When a data downloading request is received, text data and picture data are obtained according to request content carried in the data downloading request.
The request content is the specific data that needs to be downloaded; the data may be stored in a database or a cache of a server, which is not limited in the embodiments of the present invention. The text data is specific data such as characters and numbers, and the picture data is image data for display. When a data downloading request is received, the data is acquired according to the request content. For example, if the request content is the historical browsing record of website A, logged in to by a user, for October 3, the text data and the picture data for October 3 can be acquired from the historical browsing record stored in the server.
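As a rough illustration of step 101, the following Python sketch looks up text data and picture identification codes by request content. Everything named here is hypothetical: `DownloadRequest`, `HISTORY_STORE`, `fetch_request_data`, the field names and the sample values (including the year) are assumptions, and the in-memory dictionary merely stands in for the server's database or cache.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DownloadRequest:
    """Request content carried in a data downloading request (hypothetical shape)."""
    user_id: str
    site: str
    day: date


# Stand-in for the server-side database or cache; the sample values are invented.
HISTORY_STORE = {
    ("user-1", "A", date(2018, 10, 3)): {
        "text": {"title": "Order summary", "amount": "128.50"},
        "pictures": ["ABC123456"],  # identification codes, e.g. barcode strings
    },
}


def fetch_request_data(request):
    """Return (text_data, picture_codes); both are empty if nothing is stored."""
    record = HISTORY_STORE.get((request.user_id, request.site, request.day))
    if record is None:
        return {}, []
    return dict(record["text"]), list(record["pictures"])


text_data, picture_codes = fetch_request_data(
    DownloadRequest("user-1", "A", date(2018, 10, 3))
)
```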
It should be noted that the acquired text data is specific data such as characters and numbers, while the acquired picture data is an identification code of a picture, such as a barcode. The barcode itself is a character string composed of English letters and digits, which is converted into a barcode of the corresponding specification by open-source code.
102. And respectively converting the text data and the picture data into a first class object and a second class object, and reading a pre-created PDF document template.
A class object X and a class object Y are configured in the text field and the picture field of the PDF document template, respectively. Text fields and picture fields of different sizes and forms are pre-configured in the PDF document template. Each text field can be configured according to the quantity, size and attributes of the text data to be generated, and the different text fields are named, so that text data can be added to a text field by that field's name. The picture field is an area with coordinate position information, so that when picture data is added, the picture can be placed at a specific position according to that information. In the pre-created PDF document template, a blank PDF document is established, and the different text fields and picture fields are configured with the corresponding class object X for text and class object Y for pictures, so that matching addition can be performed when text and pictures are added.
It should be noted that, for text data, the specific characters and data can be converted directly into class objects; to distinguish them from the class objects converted from picture data, the text data is converted into a first class object. Picture data can be converted into the binary code of an image by means of its identification code, and the binary code is then converted into a second class object.
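The conversion into first and second class objects might look like the sketch below. The class names `TextObject` and `PictureObject`, their fields, and the type labels are assumptions. In particular, rendering an identification code into a real barcode image is outside this sketch; the UTF-8 bytes of the code are used as a placeholder for the image's binary code.

```python
from dataclasses import dataclass


@dataclass
class TextObject:
    """Hypothetical 'first class object' wrapping one piece of text data."""
    name: str
    value: str
    data_type: str  # attribute of the data, e.g. "string" or "number"


@dataclass
class PictureObject:
    """Hypothetical 'second class object' wrapping the binary code of a picture."""
    id_code: str
    binary: bytes


def to_text_objects(text_data):
    """Convert a {name: value} mapping into first class objects."""
    objects = []
    for name, value in text_data.items():
        data_type = "number" if isinstance(value, (int, float)) else "string"
        objects.append(TextObject(name, str(value), data_type))
    return objects


def to_picture_object(id_code):
    """Convert an identification code into a second class object.

    Placeholder only: a real system would render the code to a barcode image
    and store the image bytes; here the code's UTF-8 bytes stand in for them.
    """
    return PictureObject(id_code, id_code.encode("utf-8"))
```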
103. And according to the class object attributes of the first class object and the second class object, respectively matching and adding the first class object and the second class object to the class object X and the class object Y, and establishing an output byte stream of the matched PDF document.
In the embodiment of the invention, in order to accurately add the text data to the text field and the picture data to the picture field, the first class object converted from the text data is matched and added to the class object X in the text field, and the second class object converted from the picture data is matched and added to the class object Y in the picture field. The PDF document template is pre-provided with a fixed class object X, and then the class object X matched with the first class object is determined according to the attribute or the type of the class object. In order to facilitate the output of the PDF document to which the text data and the picture data are added, an output byte stream for the PDF document is created, so as to perform the step of writing the empty compressed file in step 104.
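The matching step and the creation of an output byte stream can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: matching is done by field name and data type, the function names are hypothetical, and because no PDF library is used, the "document" written to the stream is a JSON placeholder rather than real PDF content.

```python
import io
import json


def fill_template(template_fields, text_objects):
    """Match text objects to template fields by name and data type.

    template_fields: {field_name: expected_type}
    text_objects: iterable of (name, data_type, value) tuples
    Objects with no matching field, or with a mismatched type, are skipped.
    """
    filled = {}
    for name, data_type, value in text_objects:
        if template_fields.get(name) == data_type:
            filled[name] = value
    return filled


def to_output_stream(filled):
    """Serialise the filled fields into an in-memory byte stream.

    JSON is a stand-in for PDF rendering, which is outside this sketch.
    """
    stream = io.BytesIO()
    stream.write(json.dumps(filled, sort_keys=True).encode("utf-8"))
    stream.seek(0)  # rewind so a consumer can read from the start
    return stream
```

Keeping the result in memory as a byte stream means the next step can write it into an archive without creating an intermediate file.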
104. Writing the established output byte stream of the PDF document into a pre-created empty compressed file, and storing the compressed file containing the PDF document in a temporary storage path of a server.
In the embodiment of the invention, the user can download the data of the PDF document through the temporary storage path. The empty compressed file is a pre-created compressed file without any content, for which a position is reserved in the temporary storage path of the server. After the acquired data has been added to the PDF document, the output byte stream of the PDF document is written into the empty compressed file, and the written compressed file is stored in the reserved position, so that a user can download the PDF document from the temporary storage path at any time.
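A minimal sketch of this step, assuming a ZIP archive as the compressed file format (the patent does not name one) and using Python's standard `zipfile` module. The function names, file names and the placeholder PDF bytes are hypothetical.

```python
import io
import os
import tempfile
import zipfile


def create_empty_archive(temp_dir, name="download.zip"):
    """Pre-create an empty compressed file under the temporary storage path."""
    path = os.path.join(temp_dir, name)
    with zipfile.ZipFile(path, "w"):
        pass  # closing the archive without entries yields a valid empty ZIP
    return path


def write_pdf_stream(archive_path, pdf_stream, entry_name="document.pdf"):
    """Append the PDF output byte stream to the pre-created archive."""
    with zipfile.ZipFile(archive_path, "a", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(entry_name, pdf_stream.getvalue())


# Usage: the temporary directory stands in for the server's temporary storage path.
temp_dir = tempfile.mkdtemp(prefix="pdf_download_")
archive_path = create_empty_archive(temp_dir)
write_pdf_stream(archive_path, io.BytesIO(b"%PDF-1.4 placeholder bytes"))
```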
The invention provides a data downloading method based on a PDF document. Compared with existing websites, where the format and content of the data to be downloaded are not fixed, the embodiment of the invention converts the requested text data and picture data into a first class object and a second class object that can be added to a PDF document template, adds them by matching to the corresponding class object X and class object Y, writes the result into an empty compressed file in the form of the output byte stream of the PDF document, and stores the compressed file in a temporary storage path. Text data and picture data can thus be added to the PDF document flexibly, which reduces the processing pressure on the website and the data downloading time. The flexible and varied text fields and picture fields also reduce the complexity of adding data to the PDF document template and downloading it, thereby improving data downloading efficiency.
The embodiment of the invention provides another data downloading method based on a PDF document, and as shown in FIG. 2, the method comprises the following steps:
201. Establishing the PDF document template, and dividing a text field and a picture field in the PDF document template.
In the embodiment of the present invention, in order to avoid re-establishing the PDF template whenever text data and picture data are added, a PDF document template is established in advance in which the text fields and picture fields have already been divided. The text field includes text regions with different names, the text regions with different names respectively include the quantity and attributes of different text data, the picture field includes different picture regions, and each picture region includes a position coordinate corresponding to the picture region.
It should be noted that the PDF document template may include a plurality of text fields and a plurality of picture fields. Dividing the text fields means dividing regions of different sizes according to the quantity and attributes of the text data and naming each region. The quantity of text data refers to the amount of characters, numbers and similar data, that is, the size of the data forming the text; the data attributes include character string data, byte data, and the like. The picture field includes regions with different position coordinates, and different pictures may be added at different position coordinates: multiple pictures may be added to one region according to the position coordinates, or multiple picture fields may be divided in one PDF document template, which is not specifically limited in the embodiment of the present invention. The position of the barcode of the picture data is divided in advance and can be determined using coordinate axes; for example, when the requested content is picture data, the barcode is added at the pre-divided barcode position.
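The template layout described above can be modelled with a small data structure. The sketch below is only illustrative: the class names, the `max_chars` sizing attribute, the `(page, x, y)` coordinate form and the sample region names are all assumptions, and no actual PDF template is produced.

```python
from dataclasses import dataclass, field


@dataclass
class TextRegion:
    name: str
    max_chars: int   # quantity of text data the region is sized for
    data_type: str   # data attribute, e.g. "string" or "bytes"


@dataclass
class PictureRegion:
    name: str
    positions: list = field(default_factory=list)  # (page, x, y) tuples


@dataclass
class PdfTemplate:
    text_regions: dict = field(default_factory=dict)
    picture_regions: dict = field(default_factory=dict)

    def add_text_region(self, name, max_chars, data_type):
        """Divide a named text region; names must be unique for lookup by name."""
        if name in self.text_regions:
            raise ValueError(f"duplicate text region name: {name}")
        self.text_regions[name] = TextRegion(name, max_chars, data_type)

    def add_picture_position(self, region_name, page, x, y):
        """Add a picture slot; one region may hold several positions."""
        region = self.picture_regions.setdefault(region_name, PictureRegion(region_name))
        region.positions.append((page, x, y))


template = PdfTemplate()
template.add_text_region("customer_name", max_chars=40, data_type="string")
template.add_text_region("remarks", max_chars=400, data_type="string")
template.add_picture_position("barcodes", page=1, x=72.0, y=700.0)
template.add_picture_position("barcodes", page=1, x=72.0, y=620.0)
```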
202. And extracting the temporary storage path which is suspended in the server, establishing the empty compressed file, and storing the empty compressed file into the temporary file under the temporary storage path.
In the embodiment of the invention, so as not to affect the storage of other data or occupy resources used by normal data, the temporary storage path suspended in the server is extracted, and a blank compressed file is established under it and stored in the temporary file under the temporary storage path. When a PDF document is to be added to the blank compressed file, the blank compressed file can then be extracted directly from the temporary file.
203. When a data downloading request is received, text data and picture data are obtained according to request content carried in the data downloading request.
This step is the same as step 101 shown in FIG. 1, and is not described again here.
204. And respectively converting the text data and the picture data into a first class object and a second class object, and reading a pre-created PDF document template.
This step is the same as step 102 shown in FIG. 1, and is not described again here.
For the embodiment of the present invention, step 204 may specifically be: defining a first class object matched with a class object X in the text field according to the data attribute of the text data; converting the identification code of the picture data into a binary code, and converting the binary code into a second class object matched with the class object Y; respectively reading a class object X of a text field and a class object Y of a picture field in a pre-created PDF document template, wherein the class object X is a class object to which text data and attributes can be added, and the class object Y is a class object to which the coordinate position of picture data can be added.
In the embodiment of the invention, the defined first class object is the specific text data that was queried, and each attribute of the first class object is set to be the same as that of the text data, so that the first class object can be matched directly with the class object X. Because the picture data is obtained in the form of its identification code, such as a barcode, the identification code must first be converted into a binary code, and the binary code is then converted into the second class object matched with the class object Y.
In addition, in order to add the first class object and the second class object accurately to the pre-created PDF template, after the first class object and the second class object are read, the class object X and the class object Y in the PDF document template also need to be read, so that accurate matching addition can be performed.
205. And according to the class object attributes of the first class object and the second class object, respectively matching and adding the first class object and the second class object to the class object X and the class object Y, and establishing an output byte stream of the matched PDF document.
This step is the same as step 103 shown in FIG. 1, and is not described again here.
For the embodiment of the present invention, step 205 may specifically be: initializing the attribute of the class object X by using an AcroFields class object, and configuring the attribute in the first class object and the attribute of the class object X; matching and configuring the identification code of the second class object with the coordinate position of the class object Y, wherein the matching and configuration are configured according to a preset mapping relation between the identification code and the coordinate position; and establishing an output byte stream according to the PDF document after the class object X and the class object Y are matched and configured according to the PDF document template.
For the embodiment of the present invention, the AcroFields class object is a java class of a general pdf text field, and the initialization process is to initialize the attribute of the class object X to a text field variable corresponding to the AcroFields class object. The PDF document template may include a plurality of text field fields, and the text field fields are initialized to the attribute fields of the standard AcroFields class object, where the attribute fields of the AcroFields are read from the specified PDF template, and may further include additional fields for parameter transmission or recording information, such as a path name, a flag bit, a PDF file name, and the like.
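The matching of first class object attributes to the text fields of the template can be illustrated with a plain map standing in for the AcroFields object. The class `TemplateTextFields` and its methods are hypothetical; a real implementation would read the field names from the PDF template through a PDF library rather than take them as constructor arguments.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the text fields read from a PDF template (class object X). */
final class TemplateTextFields {
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Initializes every named text field of the template to an empty value. */
    TemplateTextFields(String... fieldNames) {
        for (String name : fieldNames) {
            fields.put(name, "");
        }
    }

    /** Sets a field only if the template defines it; returns whether the match succeeded. */
    boolean setField(String name, String value) {
        if (!fields.containsKey(name)) {
            return false;
        }
        fields.put(name, value);
        return true;
    }

    /** Copies matching attributes of the first class object into the template; returns the match count. */
    int fill(Map<String, String> firstClassObject) {
        int matched = 0;
        for (Map.Entry<String, String> attribute : firstClassObject.entrySet()) {
            if (setField(attribute.getKey(), attribute.getValue())) {
                matched++;
            }
        }
        return matched;
    }

    /** Returns the current value of a field, or null if the template has no such field. */
    String get(String name) {
        return fields.get(name);
    }
}
```

Attributes that have no counterpart in the template are skipped rather than added, which mirrors the requirement that the first class object carry the same attributes as the class object X.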
In addition, since the class object in the embodiment of the present invention is a data structure, the basic information of the class is stored: the method includes the steps that information such as page information, coordinate information and the like are stored in a class object Y, the page information and the coordinate information can be matched with an identification code, specifically, the class object Y is configured through a preset mapping relation between the identification code and a coordinate position, if the preset mapping relation exists between the identification code 1 and the coordinate position (a, b, c), the identification code 1 and the coordinate position (a, b, c) are configured, and when the identification code of a converted second class object can be matched and configured to the page information and the coordinate information of the class object Y, a picture is added into a PDF document.
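The preset mapping relation between identification codes and coordinate positions could be held in a simple lookup table. In the sketch below, the class `PictureSlotMapping` is hypothetical, and reading the triple (a, b, c) as a page number plus x and y coordinates is an assumption; the description only says that page information and coordinate information are stored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup from identification code to a picture position in class object Y. */
final class PictureSlotMapping {

    /** Reading (a, b, c) as page number plus x/y coordinates is an assumption. */
    static final class Position {
        final int page;
        final float x;
        final float y;

        Position(int page, float x, float y) {
            this.page = page;
            this.x = x;
            this.y = y;
        }
    }

    private final Map<String, Position> byCode = new HashMap<>();

    /** Registers the preset mapping relation for one identification code. */
    void register(String identificationCode, int page, float x, float y) {
        byCode.put(identificationCode, new Position(page, x, y));
    }

    /** Returns the configured position, or empty when no mapping exists for the code. */
    Optional<Position> resolve(String identificationCode) {
        return Optional.ofNullable(byCode.get(identificationCode));
    }
}
```

A picture would be added to the document only when `resolve` returns a position, which corresponds to the condition that the identification code can be matched to the page and coordinate information of the class object Y.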
It should be noted that the created output byte stream is the output byte stream of the PDF document to which the text data and the picture data have been added, and after the PDF document generates the output byte stream, the byte information of the output byte stream is the completed PDF document object.
206. Write the established output byte stream of the PDF document into a pre-created empty compressed file, and store the compressed file containing the PDF document in a temporary storage path of the server.
This step is the same as step 104 shown in fig. 1, and will not be described herein again.
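As an illustration of this step, the sketch below writes the bytes of a finished PDF document into a ZIP archive under a temporary directory using the standard `java.util.zip` classes. The class `PdfZipWriter`, the file names, and the choice of ZIP as the compression format are assumptions. As a simplification, the archive is created and filled in one call instead of being created empty in advance.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper that stores a PDF byte stream as one entry of a ZIP archive. */
final class PdfZipWriter {

    /** Writes pdfBytes as entryName into tempDir/zipName and returns the archive path. */
    static Path writeToZip(Path tempDir, String zipName, String entryName, byte[] pdfBytes) {
        try {
            Files.createDirectories(tempDir);
            Path zipPath = tempDir.resolve(zipName);
            try (OutputStream out = Files.newOutputStream(zipPath);
                 ZipOutputStream zip = new ZipOutputStream(out)) {
                zip.putNextEntry(new ZipEntry(entryName));
                zip.write(pdfBytes);
                zip.closeEntry();
            }
            return zipPath;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads one entry back, mainly to verify that the archive round-trips. */
    static byte[] readEntry(Path zipPath, String entryName) {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IllegalArgumentException("no such entry: " + entryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return buffer.toByteArray();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```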
207. After the compression is completed, convert the compressed file into a binary byte stream and write it to an output stream.
In the embodiment of the invention, this allows the user to download the compressed file. For example, once the compressed file containing the PDF document has been stored in the temporary path of the server, an input stream of the compressed file is read; that is, the compressed file is read into memory or a cache. The data is then converted into a binary byte stream, here a general stream of eight-bit bytes, and written to an output stream; that is, the compressed file is output from memory or the cache and transmitted to the client for downloading.
208. Clear the compressed files in the temporary files under the temporary storage path of the server at a preset time interval.
In the embodiment of the present invention, to prevent an excessive number or size of temporary files in the temporary storage path from degrading the processing efficiency of the server, the compressed files in the temporary storage path of the server are cleared at a fixed time interval, so that the temporary storage path always has room for new files. The preset time interval may be set according to the download volume of the data: for example, 10 minutes when the download volume is high, and 1 hour when the download volume is low.
The embodiment of the invention converts the requested text data and picture data into a first class object and a second class object that can be added to a PDF document template, matches and adds them to the corresponding class object X and class object Y, writes the resulting PDF document into an empty compressed file in the form of an output byte stream, and stores the compressed file in a temporary storage path. Text data and picture data can thus be added to the PDF document flexibly, which reduces the processing pressure on the website and shortens the data downloading time. The flexible and varied text fields and picture fields also reduce the complexity of adding data to the PDF document template and downloading it, thereby improving data downloading efficiency.
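The input-stream-to-output-stream transfer described above can be sketched with a buffered copy loop. The class `DownloadStreamer` and the 8 KiB buffer size are assumptions; in a web application the output stream would typically be the one provided by the HTTP response.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that streams a stored compressed file to a client as raw bytes. */
final class DownloadStreamer {

    /** Copies all bytes from in to out and returns the number of bytes transferred. */
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Opens the compressed file as an input stream and sends it to the client's output stream. */
    static long send(Path compressedFile, OutputStream clientOut) {
        try (InputStream in = Files.newInputStream(compressedFile)) {
            return copy(in, clientOut);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Copying through a fixed-size buffer keeps memory use constant regardless of the archive size.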
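A periodic cleanup of this kind might look like the sketch below. The class `TempFileCleaner`, the `busyThreshold` parameter, and the `.zip` suffix filter are hypothetical. Deleting only archives older than the interval, rather than everything in the directory, is also an assumption, made so that a file that has just been written is not removed before it can be downloaded.

```java
import java.io.File;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical periodic cleaner for compressed files in the temporary storage path. */
final class TempFileCleaner {

    /** Picks 10 minutes under heavy download volume and 1 hour otherwise (threshold is an assumption). */
    static Duration chooseInterval(long downloadsPerHour, long busyThreshold) {
        return downloadsPerHour >= busyThreshold ? Duration.ofMinutes(10) : Duration.ofHours(1);
    }

    /** Deletes .zip files last modified before (now - maxAge); returns how many were removed. */
    static int purgeOlderThan(File tempDir, Duration maxAge, Instant now) {
        File[] files = tempDir.listFiles();
        if (files == null) {
            return 0; // directory missing or unreadable
        }
        long cutoff = now.minus(maxAge).toEpochMilli();
        int deleted = 0;
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith(".zip")
                    && f.lastModified() < cutoff && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    /** Runs the purge repeatedly at the given interval on a single background thread. */
    static ScheduledExecutorService schedule(File tempDir, Duration interval) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        long millis = interval.toMillis();
        executor.scheduleAtFixedRate(
                () -> purgeOlderThan(tempDir, interval, Instant.now()),
                millis, millis, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

The caller is expected to shut down the returned executor when the server stops.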
Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides a PDF document-based data downloading device, as shown in fig. 3, where the device includes: an acquisition unit 31, a conversion unit 32, an addition unit 33, and a storage unit 34.
The acquiring unit 31 is configured to, when a data download request is received, acquire text data and picture data according to request content carried in the data download request. The acquiring unit 31 is the program module of the PDF document-based data downloading device that performs this acquisition.
The conversion unit 32 is configured to convert the text data and the picture data into a first class object and a second class object, respectively, and read a pre-created PDF document template, where a class object X and a class object Y are configured in a text field and a picture field in the PDF document template, respectively. The conversion unit 32 is the program module of the PDF document-based data downloading device that performs this conversion and reads the pre-created PDF document template.
The adding unit 33 is configured to match and add the first class object and the second class object to the class object X and the class object Y, respectively, according to the class object attributes of the first class object and the second class object, and establish an output byte stream of the matched PDF document. The adding unit 33 is the program module of the PDF document-based data downloading device that performs this matching and addition and establishes the output byte stream.
The storage unit 34 is configured to write the established output byte stream of the PDF document into a pre-created empty compressed file, and store the compressed file containing the PDF document in a temporary storage path of the server, so that the user downloads data of the PDF document through the temporary storage path. The storage unit 34 is the program module of the PDF document-based data downloading device that performs this writing and storing.
The invention provides a data downloading device based on a PDF document. Compared with the existing situation in which the format and content of the data to be downloaded from a website are not fixed, the embodiment of the invention converts the requested text data and picture data into a first class object and a second class object that can be added to a PDF document template, matches and adds them to the corresponding class object X and class object Y, writes the resulting PDF document into an empty compressed file in the form of an output byte stream, and stores the compressed file in a temporary storage path. Text data and picture data can thus be added to the PDF document flexibly, which reduces the processing pressure on the website and shortens the data downloading time. The flexible and varied text fields and picture fields also reduce the complexity of adding data to the PDF document template and downloading it, thereby improving data downloading efficiency.
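The cooperation of the four units can be pictured as a chain of functions, each consuming the output of the previous one. The sketch below is only one possible arrangement: the class `DownloadDevice`, its type parameters, and the use of `java.util.function.Function` for each unit are assumptions, not the structure disclosed in the figures.

```java
import java.util.function.Function;

/** Hypothetical composition of the acquiring, conversion, adding and storage units. */
final class DownloadDevice<R, D, O> {
    private final Function<R, D> acquisitionUnit;       // request -> text and picture data
    private final Function<D, O> conversionUnit;        // data -> first and second class objects
    private final Function<O, byte[]> addingUnit;       // objects -> output byte stream of the PDF
    private final Function<byte[], String> storageUnit; // byte stream -> path of the stored archive

    DownloadDevice(Function<R, D> acquisitionUnit,
                   Function<D, O> conversionUnit,
                   Function<O, byte[]> addingUnit,
                   Function<byte[], String> storageUnit) {
        this.acquisitionUnit = acquisitionUnit;
        this.conversionUnit = conversionUnit;
        this.addingUnit = addingUnit;
        this.storageUnit = storageUnit;
    }

    /** Runs the four units in order and returns the temporary storage path of the archive. */
    String handle(R request) {
        return acquisitionUnit
                .andThen(conversionUnit)
                .andThen(addingUnit)
                .andThen(storageUnit)
                .apply(request);
    }
}
```

Passing each unit in as a function keeps the units independently replaceable, for example to swap the storage unit in tests.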
Further, as an implementation of the method shown in fig. 2, another PDF document-based data downloading device is provided in the embodiment of the present invention, and as shown in fig. 4, the device includes: an acquisition unit 41, a conversion unit 42, an addition unit 43, a storage unit 44, a creation unit 45, an extraction unit 46, and a clearing unit 47.
An obtaining unit 41, configured to, when a data download request is received, obtain text data and picture data according to request content carried in the data download request;
a converting unit 42, configured to convert the text data and the picture data into a first class object and a second class object, respectively, and read a pre-created PDF document template, where a class object X and a class object Y are configured in a text field and a picture field in the PDF document template, respectively;
an adding unit 43, configured to match and add the first class object and the second class object to the class object X and the class object Y respectively according to the class object attributes of the first class object and the second class object, and establish an output byte stream of the matched PDF document;
the storage unit 44 is configured to write the created output byte stream of the PDF document into a pre-created empty compressed file, and store the compressed file written into the PDF document into a temporary storage path of the server, so that the user downloads data of the PDF document through the temporary storage path.
Further, the apparatus further comprises:
the establishing unit 45 is configured to establish the PDF document template, and divide a text field and a picture field in the PDF document template, where the text field includes text regions with different names, the text regions with different names respectively include the number and attributes of different text data, the picture field includes different picture regions, and each picture region includes a position coordinate corresponding to the picture region.
Further, the apparatus further comprises:
an extracting unit 46, configured to extract the temporary storage path suspended in the server, establish the empty compressed file, and store the empty compressed file into a temporary file under the temporary storage path.
Further, the conversion unit 42 includes:
a definition module 4201, configured to define, according to a data attribute of the text data, a first class object that matches a class object X in the text field;
a conversion module 4202, configured to convert the identification code of the picture data into a binary code, and convert the binary code into a second class object matching the class object Y;
a reading module 4203, configured to read a class object X in a text field and a class object Y in a picture field in a pre-created PDF document template, respectively, where the class object X is a class object to which text data and attributes can be added, and the class object Y is a class object to which a coordinate position of picture data can be added.
Further, the adding unit 43 includes:
a first configuration module 4301, configured to initialize the attribute of the class object X by using an AcroFields class object, and configure the attribute in the first class object and the attribute of the class object X;
a second configuration module 4302, configured to match and configure the identification code of the second class object to the coordinate position of the class object Y, where the matching configuration is performed according to a preset mapping relationship between the identification code and the coordinate position;
the establishing module 4303 is configured to establish an output byte stream according to the PDF document after the class object X and the class object Y are configured according to the PDF document template in a matching manner.
Further, the converting unit 42 is further configured to convert the compressed file into a binary byte stream and write it to an output stream after the compression is completed, so that the user downloads the compressed file.
Further, the apparatus further comprises:
and the clearing unit 47 is configured to clear the compressed file in the temporary file under the temporary storage path of the server according to a preset time interval.
The embodiment of the invention converts the requested text data and picture data into a first class object and a second class object that can be added to a PDF document template, matches and adds them to the corresponding class object X and class object Y, writes the resulting PDF document into an empty compressed file in the form of an output byte stream, and stores the compressed file in a temporary storage path. Text data and picture data can thus be added to the PDF document flexibly, which reduces the processing pressure on the website and shortens the data downloading time. The flexible and varied text fields and picture fields also reduce the complexity of adding data to the PDF document template and downloading it, thereby improving data downloading efficiency.
According to an embodiment of the present invention, a storage medium is provided, where the storage medium stores at least one executable instruction, and the computer executable instruction may execute the PDF document based data downloading method in any of the above method embodiments.
Fig. 5 is a schematic structural diagram of a terminal according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the terminal.
As shown in fig. 5, the terminal may include: a processor 502, a communication interface 504, a memory 506, and a communication bus 508.
Wherein: the processor 502, communication interface 504, and memory 506 communicate with each other via a communication bus 508.
A communication interface 504 for communicating with network elements of other devices, such as clients or other servers.
The processor 502 is configured to execute the program 510, and may specifically execute relevant steps in the above embodiment of the PDF document based data downloading method.
In particular, program 510 may include program code that includes computer operating instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The terminal comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used for storing the program 510. The memory 506 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 510 may be specifically configured to cause the processor 502 to perform the following operations:
when a data downloading request is received, acquiring text data and picture data according to request content carried in the data downloading request;
respectively converting the text data and the picture data into a first class object and a second class object, and reading a pre-established PDF document template, wherein a text field and a picture field in the PDF document template are respectively provided with a class object X and a class object Y;
according to the class object attributes of the first class object and the second class object, the first class object and the second class object are respectively matched and added to the class object X and the class object Y, and an output byte stream of the matched PDF document is established;
writing the output byte stream of the PDF document after being established into a pre-established empty compressed file, and storing the compressed file after being written into the PDF document into a temporary storage path of a server, so that a user can download the data of the PDF document through the temporary storage path.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A data downloading method based on a PDF document is characterized by comprising the following steps:
when a data downloading request is received, acquiring text data and picture data according to request content carried in the data downloading request;
respectively converting the text data and the picture data into a first class object and a second class object, and reading a pre-created PDF document template, wherein a text field and a picture field in the PDF document template are respectively provided with a class object X and a class object Y, the class object X is read based on the text field in the pre-created PDF document template, the class object Y is read based on the picture field in the pre-created PDF document template, the class object X is a class object to which text data and attributes can be added, and the class object Y is a class object to which the coordinate position of the picture data can be added;
according to the class object attributes of the first class object and the second class object, the first class object and the second class object are respectively matched and added to the class object X and the class object Y, and an output byte stream of the matched PDF document is established;
writing the output byte stream of the PDF document after being established into a pre-established empty compressed file, and storing the compressed file after being written into the PDF document into a temporary storage path of a server, so that a user can download the data of the PDF document through the temporary storage path;
the matching and adding the first class object and the second class object to the class object X and the class object Y respectively according to the class object attributes of the first class object and the second class object, and establishing the output byte stream of the matched PDF document comprises:
initializing the attribute of the class object X by using an AcroFields class object, and configuring the attribute in the first class object and the attribute of the class object X;
matching and configuring the identification code of the second class object with the coordinate position of the class object Y, wherein the matching and configuration are configured according to a preset mapping relation between the identification code and the coordinate position;
and establishing an output byte stream according to the PDF document after the class object X and the class object Y are matched and configured according to the PDF document template.
2. The method according to claim 1, wherein before acquiring the text data and the picture data according to the request content carried in the data download request, the method further comprises:
establishing the PDF document template, and dividing a text field and a picture field in the PDF document template, wherein the text field comprises text areas with different names, the text areas with different names respectively comprise the quantity and the attribute of different text data, the picture field comprises different picture areas, and each picture area comprises a position coordinate corresponding to the picture area.
3. The method of claim 1, wherein before writing the output byte stream of the built PDF document into the pre-built empty compressed file, the method further comprises:
and extracting the temporary storage path which is suspended in the server, establishing the empty compressed file, and storing the empty compressed file into the temporary file under the temporary storage path.
4. The method according to claim 1, wherein the converting the text data and the picture data into the first class object and the second class object respectively comprises:
defining a first class object matched with a class object X in the text field according to the data attribute of the text data;
and converting the identification code of the picture data into a binary code, and converting the binary code into a second class object matched with the class object Y.
5. The method according to claim 1, wherein after writing the output byte stream of the PDF document after being created into a pre-created empty compressed file and storing the compressed file after being written into the PDF document into a temporary storage path of a server, the method further comprises:
after the compression is completed, the compressed file is converted into a binary byte stream and written to an output stream, so that the user downloads the compressed file.
6. The method of claim 1, further comprising:
and clearing the compressed files in the temporary files under the temporary storage path of the server according to a preset time interval.
7. A PDF document based data downloading apparatus, comprising:
the acquisition unit is used for acquiring text data and picture data according to request content carried in a data downloading request when the data downloading request is received;
a conversion unit, configured to convert the text data and the picture data into a first class object and a second class object, respectively, and read a pre-created PDF document template, where a text field and a picture field in the PDF document template are configured with a class object X and a class object Y, respectively, where the class object X is read based on a text field in the pre-created PDF document template, the class object Y is read based on a picture field in the pre-created PDF document template, the class object X is a class object to which text data and attributes can be added, and the class object Y is a class object to which a coordinate position of the picture data can be added;
the adding unit is used for respectively matching and adding the first class object and the second class object to the class object X and the class object Y according to the class object attributes of the first class object and the second class object, and establishing an output byte stream of the matched PDF document;
the storage unit is used for writing the established output byte stream of the PDF document into a pre-established empty compressed file and storing the compressed file written into the PDF document into a temporary storage path of a server, so that a user can download the data of the PDF document through the temporary storage path;
the adding unit includes:
the first configuration module is used for initializing the attribute of the class object X by using an AcroFields class object and configuring the attribute in the first class object and the attribute of the class object X;
the second configuration module is used for matching and configuring the identification code of the second class object with the coordinate position of the class object Y, and the matching configuration is configured according to a preset mapping relation between the identification code and the coordinate position;
and the establishing module is used for establishing an output byte stream according to the PDF document after the class object X and the class object Y are matched and configured according to the PDF document template.
8. A storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the PDF document based data downloading method according to any one of claims 1 to 6.
9. A terminal, comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface are communicated with each other through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the PDF document-based data downloading method as claimed in any one of claims 1-6.
CN201810796560.0A 2018-07-19 2018-07-19 Data downloading method and device based on PDF document, storage medium and terminal Active CN109284453B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810796560.0A CN109284453B (en) 2018-07-19 2018-07-19 Data downloading method and device based on PDF document, storage medium and terminal
PCT/CN2018/111697 WO2020015220A1 (en) 2018-07-19 2018-10-24 Method and apparatus for downloading data based on pdf document, and storage medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810796560.0A CN109284453B (en) 2018-07-19 2018-07-19 Data downloading method and device based on PDF document, storage medium and terminal

Publications (2)

Publication Number Publication Date
CN109284453A CN109284453A (en) 2019-01-29
CN109284453B true CN109284453B (en) 2023-04-07

Family

ID=65182376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810796560.0A Active CN109284453B (en) 2018-07-19 2018-07-19 Data downloading method and device based on PDF document, storage medium and terminal

Country Status (2)

Country Link
CN (1) CN109284453B (en)
WO (1) WO2020015220A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111666745B (en) * 2020-06-03 2023-07-25 中国建设银行股份有限公司 File downloading method, device, server and medium
CN112380828A (en) * 2020-11-03 2021-02-19 前海飞算云创数据科技(深圳)有限公司 PDF document generation method and device, storage medium and electronic equipment
CN112651215B (en) * 2020-12-31 2023-11-03 中国农业银行股份有限公司 Method and device for determining document map, electronic equipment and storage medium
CN117807291B (en) * 2024-02-29 2024-04-26 南京三百云信息科技有限公司 Intelligent identification interaction processing method and platform for business materials

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090213A (en) * 2017-12-29 2018-05-29 福建南威软件有限公司 The method that mobile terminal rapid translating generates pdf document

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040153462A1 (en) * 2003-02-05 2004-08-05 Bardwell Steven J. Systems, methods, and computer program product for use in association with electronic documents
JP4756870B2 (en) * 2005-02-03 2011-08-24 キヤノン株式会社 Document processing apparatus, document processing method, and program
CN101777056B (en) * 2009-12-31 2012-01-04 成都市华为赛门铁克科技有限公司 Data storage method and device
US10506017B2 (en) * 2016-05-20 2019-12-10 Adobe Inc. Manipulation of PDF file content through HTTP requests
CN106776498A (en) * 2016-12-09 2017-05-31 山东浪潮商用系统有限公司 A kind of method that data export as PDF
CN108052491B (en) * 2017-11-22 2021-02-26 中贸促商事服务有限公司 Automatic processing method and device for certificate document

Also Published As

Publication number Publication date
CN109284453A (en) 2019-01-29
WO2020015220A1 (en) 2020-01-23

Similar Documents

Publication Publication Date Title
CN109284453B (en) Data downloading method and device based on PDF document, storage medium and terminal
CN108279932B (en) Method and device for dynamically configuring user interface of mobile terminal
CN110597500B (en) Method and device for serialization and deserialization of message structure
KR100262432B1 (en) Device independent and transfer optimized interactive client-server dialog system
CN109981595B (en) Resource acquisition method, resource return method, server and storage medium
US7860892B2 (en) Information processing apparatus, history file generation method and program
US20070174420A1 (en) Caching of web service requests
US20020093683A1 (en) Method and system for virtual machine rendering of non-latin1 unicode glyphs
EP3195143A1 (en) Remote font management
CN112650533B (en) Interface document generation method and device and terminal equipment
CN113518094B (en) Data processing method, device, robot and storage medium
US10783412B1 (en) Smart page encoding system including linearization for viewing and printing
CN111144402A (en) OCR recognition accuracy calculation method, device, equipment and storage medium
CN105589959A (en) Form processing method and form processing system
CN116382773A (en) Method for deploying PyFlink task
CN112581568B (en) Dynamic poster generation method, device, server and storage medium
US11087188B2 (en) Smart page decoding system including linearization for viewing and printing
US10956659B1 (en) System for generating templates from webpages
CN115712411A (en) Method and device for generating user-defined serial number
US11295072B2 (en) Autoform filling using text from optical character recognition and metadata for document types
CN113050921A (en) Webpage conversion method, device, storage medium and computer equipment
CN106487855B (en) File uploading method, file accessing method, file uploading device, file accessing device and equipment
CN114691712A (en) Method and device for generating bill and storage medium
CN109840080B (en) Character attribute comparison method and device, storage medium and electronic equipment
JP2015095092A (en) Information processing system, information processing device, information processing method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant