CN116050360A - Method and equipment for quickly manufacturing PDF (Portable document Format) form file - Google Patents

Info

Publication number
CN116050360A
Authority
CN
China
Prior art keywords
file
pdf
html
appearance
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310142316.3A
Other languages
Chinese (zh)
Inventor
徐红轮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kn Technologies Co ltd
Original Assignee
Beijing Kn Technologies Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kn Technologies Co ltd
Priority to CN202310142316.3A
Publication of CN116050360A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets

Abstract

The invention relates to a method and equipment for making a PDF form file. The method comprises: converting an original format file into a Hypertext Markup Language (HTML) file, wherein the original format file includes a table appearance; inserting form fields having a form appearance into the HTML file based on the form appearance of the original format file; converting the HTML file with the inserted form fields into a PDF file; and writing the information data of the form fields into the form fields in the PDF file. Embodiments of the invention allow an interactive form to be made in a PDF file quickly and conveniently.

Description

Method and equipment for quickly manufacturing PDF (Portable document Format) form file
Technical Field
The invention relates to processing of electronic files, in particular to a method and equipment for quickly manufacturing a PDF form file.
Background
Portable Document Format (PDF) is a commonly used electronic file format. A PDF file can encapsulate text, fonts, graphics, images, colors, formats, printing-device parameters, and so on in a single file, and it keeps page elements unchanged during network transmission, printing, and platemaking output. A PDF file may include a static PDF form (e.g., a form converted from a Word or Excel file) or an interactive PDF form (hereinafter simply referred to as a form). An interactive form contains a set of form fields through which information data can be collected interactively from the user and then updated, processed, and reused.
The difference between a form in a Word or Excel file and a PDF form is that the former does not contain structured form field information: all text and images entered or inserted in the Word or Excel file exist as part of the Word/Excel body text.
However, creating forms in PDF files is a cumbersome process that relies on a form designer. Existing form designers fall mainly into two types. The first is the professional PDF form designer, such as the Acrobat PDF form designer and the Phantom form designer from Foxit (Fuxin). This type is inconvenient to use, because each form field must be dragged into position on the fixed-layout PDF page and its attributes configured to complete its setup. The second is the Internet, HTML5-based "question" form designer, such as questionnaire tools, with which a form is typically designed from scratch: questions are the basic building blocks, and questions are added one by one until they form a questionnaire or form. This type of designer has two disadvantages. On the one hand, a large number of official documents in offices are in Word or Excel format, the designer interface is complex, designing a form from scratch is slow, and table and text editing in HTML still does not match the variety and generality of Word; on the other hand, data standards such as metadata and data code tables are often lacking.
Therefore, although PDF form files have better form data management and processing functions, the process of making forms in PDF files is not only time consuming but also inconvenient to use and error-prone, requiring further improvement.
Disclosure of Invention
The invention aims to provide a method and equipment for quickly manufacturing a PDF form file.
According to an embodiment of the present invention, there is provided a method for making a PDF form file, the method including: converting the original format file into a hypertext markup language (HTML) file; wherein the original format file includes a table appearance; inserting form fields with form appearances into the HTML file based on the form appearances of the original format file; converting the HTML file inserted into the form field into a PDF file; and writing the information data of the form fields into the form fields in the PDF file.
There is also provided, in accordance with an embodiment of the present invention, an apparatus for making a PDF form file, including: a first format conversion module configured to convert the original format file into a hypertext markup language (HTML) file; wherein the original format file includes a table appearance; a form design module configured to insert a form field having a form appearance in the HTML file based on the form appearance of the original format file; a second format conversion module configured to convert the HTML file inserted into the form field into a PDF file; and a form writing module configured to write information data of the form field into the form field in the PDF file.
There is further provided, in accordance with an embodiment of the present invention, an apparatus for making a PDF form file, including: a memory configured to store computer readable instructions; and at least one processor coupled to the memory; wherein the computer readable instructions, when executed by the processor, cause the processor to implement a method for making a PDF form file as described in the above embodiments.
According to an embodiment of the present invention, there is further provided a computer storage medium having stored thereon computer readable instructions which, when executed, cause a computer to implement the method for making a PDF form file as described in the above embodiments.
By using the method, the device and the computer storage medium of the embodiment of the invention, the interactive form can be quickly and conveniently manufactured in the PDF file.
Drawings
Fig. 1 shows a flowchart of a method for making a PDF form file according to an embodiment of the present invention.
Fig. 2 shows a schematic diagram of an example table appearance of an original format file.
Fig. 3 is a block diagram showing the construction of an apparatus for making a PDF form file according to an embodiment of the present invention.
Detailed Description
Fig. 1 shows a flowchart of a method for making a PDF form file according to an embodiment of the present invention. As shown in fig. 1, the method 100 includes: converting the original format file into a hypertext markup language (HTML) file, wherein the original format file includes a table appearance (step 110); inserting a form field having a form appearance in the HTML file based on the form appearance of the original format file (step 120); converting the HTML file with the inserted form field into a PDF file (step 130); and writing the information data of the form field into the form field in the PDF file (step 140).
In embodiments of the present invention, the original format file contains a table appearance, that is, the basic structure and fields of a table; the table typically contains no data, although it may. Fig. 2 is a schematic diagram showing an example of the table appearance of an original format file (e.g., a Word file); the "company device record table" shown is an empty table containing no data. Fig. 2 shows table fields such as device name, device model, device number, manufacturer, and device status, as well as the appearance style of the entire table. The table appearance may include the basic structure of the table (e.g., the number of rows and columns), shape (e.g., a rectangular table), size (e.g., the length of each field, line spacing), font size, and so on.
The original format file is typically a form file that a user has produced with commonly used application software (e.g., word processing or spreadsheet software); many users have a large number of form templates in word processing or spreadsheet format, all of which contain table appearances. HTML is a standard language for describing web pages; it allows objects such as images, sound, animation, and multimedia to be embedded, can be used to create interactive forms, and can also be used to structure information (e.g., titles, paragraphs, lists). By performing the file conversion step 110, a user can conveniently design the form on an HTML file through a simple user interface, without having to perform cumbersome form design on a PDF file. Embodiments of the invention can therefore conveniently reuse the table appearance in original format files that the user commonly uses or already has: the original format file is converted into an HTML file, form fields of an interactive form with the same appearance are created in the HTML file using the features of HTML, and a PDF file with the inserted form fields is then generated, so that a PDF form file is made quickly and conveniently.
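The four steps of method 100 can be sketched as a simple pipeline. This is a minimal illustration only: the function name `make_pdf_form` and the four callables passed to it are hypothetical stand-ins for whatever converter, form designer, virtual printer, and form writer an implementation actually uses; the patent does not prescribe these interfaces.

```python
from typing import Callable, Dict, List, Tuple


def make_pdf_form(
    source_path: str,
    to_html: Callable[[str], str],
    insert_fields: Callable[[str], str],
    to_pdf: Callable[[str], Tuple[str, List[Dict]]],
    write_fields: Callable[[str, List[Dict]], str],
) -> str:
    """Run steps 110 to 140 in order and return the path of the PDF form file.

    All four callables are hypothetical; each takes and returns file paths,
    and to_pdf additionally returns the intercepted form field information.
    """
    html_path = to_html(source_path)                # step 110: original file -> HTML
    form_html_path = insert_fields(html_path)       # step 120: insert form fields
    pdf_path, field_info = to_pdf(form_html_path)   # step 130: HTML -> PDF, capture fields
    return write_fields(pdf_path, field_info)       # step 140: write fields into the PDF
```

Passing the steps in as callables keeps the sketch independent of any particular conversion tool, which matches the statement that each step may be implemented in more than one way.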
In some embodiments, the original format file includes a word processing file (e.g., a Word file) or a spreadsheet file (e.g., an Excel file), and the step 110 of converting the original format file into an HTML file includes converting the original format file into an HTML file by calling an Object Linking and Embedding (OLE) interface. OLE, developed by Microsoft, is a comprehensive set of standards for transmitting and sharing information among user applications; it allows the creation of compound documents with links to applications, so that the user does not have to switch between applications when making modifications. In an embodiment of the present invention, the OLE interface is invoked mainly through OLE Automation, which calls the word processor (e.g., Word); OLE Automation is a mechanism that allows one Windows application to control another program, so the native interface of the word processing or spreadsheet software can be called directly. Converting a Word or Excel file through the OLE interface largely preserves the layout, content, style, and other information of the file, so that only fine adjustment of page content and insertion of form fields are needed in the subsequent HTML editing. However, embodiments of the present invention are not limited to using the OLE interface to convert the original format file; other methods may be used, such as file conversion via the OpenOffice interface. After step 110 is performed, the table appearance of the Word/Excel file is essentially preserved in the resulting HTML file, and only the form fields need to be added.
The most popular office documents at present are Office documents such as Word and Excel files. In addition, many organizations have a large number of Word/Excel form templates, which contain layout information, text, graphic, and image content, and various table appearance styles, covering forms for many kinds of business, such as leave requests, reimbursement forms, employee evaluation forms, and equipment record forms. Word/Excel forms can be conveniently edited, filled in, and printed during office work. The difference between an Office table and a PDF form is that the Office table does not contain structured form field information: all entered or inserted text and images exist as part of the Word/Excel body, and the Office table contains neither form fields nor any standardization of form field names, so later extraction and reuse of the data is troublesome or even impossible. A PDF form file made according to embodiments of the invention overcomes these defects, and the information data written in its form fields can be reused and processed.
According to embodiments of the invention, the table appearance of an existing original format file (such as a word processing or spreadsheet file) can be conveniently reused, or an original format file with a table appearance can be conveniently generated using word processing software (such as Word) or spreadsheet software (such as Excel). The PDF form file is made by converting the original format file into an HTML file, inserting form fields with the same appearance into the HTML file, and converting the HTML file into a PDF file; the information data of the form fields is then written into the form fields in the PDF form file.
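As an illustration of step 110, the following minimal sketch drives Word through OLE Automation from Python. It assumes a Windows machine with Microsoft Word and the third-party pywin32 package installed; the helper names `html_output_path` and `convert_word_to_html` and the rule of writing the HTML file next to the source are assumptions made for this example, not part of the described method.

```python
from pathlib import Path

# Value of Word's WdSaveFormat.wdFormatHTML constant.
WD_FORMAT_HTML = 8


def html_output_path(source: str) -> Path:
    """Return the .html path next to the source file (hypothetical naming rule)."""
    return Path(source).with_suffix(".html")


def convert_word_to_html(source: str) -> Path:
    """Open a Word file through OLE Automation and save it as HTML."""
    # pywin32 is imported lazily: it is only available on Windows.
    import win32com.client

    src = Path(source).resolve()
    out = html_output_path(str(src))
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(str(src))
        try:
            doc.SaveAs(str(out), FileFormat=WD_FORMAT_HTML)
        finally:
            doc.Close(False)  # close without saving changes to the source
    finally:
        word.Quit()  # always release the Word process
    return out
```

An Excel file could be handled in the same way through the "Excel.Application" object, using Workbooks.Open and SaveAs with the xlHtml file format (value 44).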
In some embodiments, the form fields inserted in the HTML file include form fields that contain only the appearance of the form fields, and/or form fields that contain the appearance of the form fields and metadata; wherein the form field's form appearance is generated based on the form appearance of the original format file. In one example, after converting the original format file into an HTML file, text, graphics, images, and forms corresponding to the form appearance appear on the pages of the HTML file. The form appearance of the form field generated based on the form appearance of the original format file facilitates the quick generation of interactive forms based on existing form appearances.
In some embodiments, the method 100 further comprises adjusting page content in the HTML file, wherein adjusting the page content includes editing at least one of text, graphics, images, and tables in a page of the HTML file. In one example, the user may fine-tune the HTML page content based on the needs of the form and the actual state of the table appearance in the converted HTML file. For example, the basic text, graphics, images, and tables in the page may be edited, and the editing may include: inserting, editing, and deleting text, graphics, images, etc.; applying style changes such as adjusting the font, font size, underline, and alignment of text; adding table rows; deleting table rows; adjusting the row height and column width of the table; and quick editing operations such as copy, paste, undo, and redo. The more important operation then follows: inserting the form fields at the proper positions in the page layout of the original form.
In one embodiment, form fields having the same appearance are inserted into the HTML file by a form designer, based on the form appearance of the original format file. The form designer performs form design on the basis of HTML and can use the rich form design mechanisms of HTML (such as HTML5) to provide an editing experience for form fields that includes dragging, clicking, and so on; it can also edit text, pictures, etc. Forms designed in the HTML file with the form designer are interactive forms.
A form field is used to collect data entered or selected by the user. An ordinary form field contains only the appearance of the form field, and its data is not standardized. The appearance of a form field includes its position, its type, and the font and font size of the text in it. Form field types include a text input box, a rich text (multi-line text) input box, an integer input box, a floating-point input box, a date selection box, a time selection box, a date-time selection box, a radio button, a check box, a combo box, a digital signature field, a handwritten signature field, and so on.
A form field with metadata is a form field that contains specific metadata: in addition to the basic attributes of an ordinary form field, the field's English name, Chinese name, data standard set, option code table, and so on are standardized. For example, in the CDISC data standard set: SEX (HWB/SEX), HEIGHT (HWB/HEIGHT), WEIGHT (HWB/WEIGHT), where the "HWB" before the "/" is the domain and the "SEX", "HEIGHT", or "WEIGHT" after the "/" is the English field name. Form field names generally adopt the metadata names; for example, the form field name HWB/HEIGHT corresponds to the metadata HWB/HEIGHT. However, form field names do not always correspond one-to-one to metadata names. Under the naming convention for multi-row forms, for example, the first row uses HWB/NAME_1, HWB/SEX_1, HWB/WEIGHT_1, HWB/HEIGHT_1; the second row uses HWB/NAME_2, HWB/SEX_2, HWB/WEIGHT_2; and so on. Here HWB/NAME_1 and HWB/NAME_2 use the same metadata HWB/NAME, HWB/WEIGHT_1 and HWB/WEIGHT_2 use the same metadata HWB/WEIGHT, and so on.
In some embodiments, the information data written to the form field includes at least one of the form field's name, form field type (as described above), metadata (e.g., a metadata dictionary), page number, position, border style (e.g., rectangular border, underline border), font size, alignment (e.g., whether the text in a text box is centered or left-aligned), option values, and auto-fill setting. Metadata is data that describes data, mainly information about the attributes of data; it is used to describe the name, default value, value range, unit, format, length, code, and so on of a data item.
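To make the form field types above concrete, the following minimal sketch maps several of them to standard HTML5 form controls. The type keys ("text", "integer", "combo", and so on) and the function `field_html` are hypothetical names chosen for this example; how the form designer actually represents a field in HTML is not specified here. Signature fields are left out because HTML has no native control for them.

```python
from html import escape

# Hypothetical mapping from form field types to HTML5 <input> types
# plus any extra attributes the control needs.
INPUT_TYPES = {
    "text": ("text", {}),
    "integer": ("number", {"step": "1"}),
    "float": ("number", {"step": "any"}),
    "date": ("date", {}),
    "time": ("time", {}),
    "datetime": ("datetime-local", {}),
    "checkbox": ("checkbox", {}),
}


def field_html(name, field_type, options=()):
    """Return the HTML markup for one form field of the given type."""
    n = escape(name, quote=True)
    if field_type == "rich_text":
        # Multi-line text maps to a textarea.
        return f'<textarea name="{n}"></textarea>'
    if field_type == "combo":
        opts = "".join(f"<option>{escape(o)}</option>" for o in options)
        return f'<select name="{n}">{opts}</select>'
    if field_type == "radio":
        # One radio input per option, all sharing the same field name.
        return "".join(
            f'<label><input type="radio" name="{n}" '
            f'value="{escape(o, quote=True)}">{escape(o)}</label>'
            for o in options
        )
    if field_type in INPUT_TYPES:
        html_type, extra = INPUT_TYPES[field_type]
        attrs = "".join(f' {k}="{v}"' for k, v in extra.items())
        return f'<input type="{html_type}" name="{n}"{attrs}>'
    raise ValueError(f"no HTML control defined for field type {field_type!r}")
```

For example, `field_html("HWB/HEIGHT", "float")` yields a numeric input that accepts decimal values, which could then be placed in the table cell next to the "HEIGHT" label.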
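The naming convention above, where several row-specific form field names share one metadata entry, can be sketched as a small parser. The regular expression and the names `FieldName` and `parse_field_name` are assumptions for this illustration; in particular, the sketch assumes that a trailing "_" followed by digits always denotes a row index, which would be ambiguous for a metadata name that itself ends that way.

```python
import re
from typing import NamedTuple, Optional


class FieldName(NamedTuple):
    domain: str          # e.g. "HWB"
    field: str           # English field name, e.g. "WEIGHT"
    row: Optional[int]   # row index for multi-row forms, else None

    @property
    def metadata_key(self) -> str:
        """Metadata name shared by all rows, e.g. "HWB/WEIGHT"."""
        return f"{self.domain}/{self.field}"


# DOMAIN/FIELD with an optional _<row> suffix (assumed convention).
_NAME_RE = re.compile(r"([A-Z0-9]+)/([A-Z][A-Z0-9_]*?)(?:_(\d+))?")


def parse_field_name(name: str) -> FieldName:
    """Split a form field name into domain, field name, and optional row."""
    m = _NAME_RE.fullmatch(name)
    if m is None:
        raise ValueError(f"unrecognized form field name: {name!r}")
    domain, field, row = m.groups()
    return FieldName(domain, field, int(row) if row is not None else None)
```

With this, `parse_field_name("HWB/WEIGHT_1")` and `parse_field_name("HWB/WEIGHT_2")` both report the metadata key HWB/WEIGHT while keeping their own row numbers.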
In some embodiments, the step 130 of converting the HTML file with the inserted form fields into a PDF file includes: converting the HTML file into a PDF file by virtual printing, wherein the HTML file is printed as a PDF file according to a predetermined page configuration; and intercepting the information data of the form fields during the virtual printing step. The virtual printing function may be a specially designed virtual printer program or a third-party virtual printing engine, to which a form field information interception function is added, so that the form field information in the HTML file is output at the same time as the HTML file is converted into a PDF file by virtual printing. During virtual printing, the HTML file containing the designed web form is printed into a PDF file according to the supplied page configuration (landscape/portrait orientation, page margins, page size, page breaks, and so on); at the same time, the form field information is intercepted or output during printing and stored in JSON or XML format. Each form field may contain the following information: form field name, form field type, metadata dictionary, page number, position, border style, font size, alignment, option values, auto-fill, and so on. In embodiments of the present invention, conversion of the HTML file into the PDF file is not limited to virtual printing; the file may be converted in other ways (e.g., by a file conversion application).
In the step 140 of writing the information data of the form field into the form field in the PDF file, the form field information intercepted in step 130 may be written into the converted PDF file by the form writing module, to obtain the final PDF form file with form fields. In step 140, the intercepted form field information is parsed, and the form fields are written into the converted PDF file one by one according to the description of each form field. The form writing module may be a specially designed software development kit (SDK) or a third-party SDK.
In embodiments of the present invention, combining the step 110 of converting the original format file into the HTML file, the step 120 of inserting the form field having the form appearance into the HTML file, the step 130 of converting the HTML file with the inserted form field into the PDF file, and the step 140 of writing the information data of the form field into the form field in the PDF file can significantly improve the efficiency of making a PDF form file, improve the user's form design experience, and reduce the cost of form design.
Fig. 3 is a block diagram showing the construction of an apparatus for making a PDF form file according to an embodiment of the present invention. As shown in fig. 3, an apparatus 300 for making a PDF form file includes: a first format conversion module 310 configured to convert the original format file into a hypertext markup language (HTML) file, wherein the original format file includes a table appearance (e.g., as shown in fig. 2); a form design module 320 configured to insert a form field having a form appearance in the HTML file based on the form appearance of the original format file; a second format conversion module 330 configured to convert the HTML file with the inserted form field into a PDF file; and a form writing module 340 configured to write information data of the form field into the form field in the PDF file.
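The form field information intercepted during virtual printing could be stored as JSON along the following lines. This is a sketch under stated assumptions: the class `FormFieldInfo`, its attribute names, the `[x, y, width, height]` position layout, and the default values are all hypothetical, since the text lists the kinds of information stored but does not define a schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class FormFieldInfo:
    """One intercepted form field (hypothetical schema)."""
    name: str                       # form field name, e.g. "HWB/WEIGHT_1"
    field_type: str                 # e.g. "text", "float", "combo"
    page: int                       # 1-based page number in the PDF
    rect: List[float]               # [x, y, width, height]; units are an assumption
    metadata: Optional[str] = None  # metadata name, e.g. "HWB/WEIGHT"
    border_style: str = "rectangle"
    font_size: float = 10.5
    alignment: str = "left"
    options: List[str] = field(default_factory=list)
    auto_fill: bool = False


def dump_fields(fields: List[FormFieldInfo]) -> str:
    """Serialize the intercepted fields to a JSON array."""
    return json.dumps([asdict(f) for f in fields], ensure_ascii=False, indent=2)
```

Setting `ensure_ascii=False` keeps Chinese field names readable in the stored file, which matters because form fields may carry both English and Chinese names.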
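The parse-then-write loop of step 140 can be sketched as follows. The actual PDF writing call belongs to whichever SDK is used and is therefore represented by a hypothetical `add_field` callback; the JSON keys ("name", "field_type", "page", "rect") are likewise assumptions made for this example rather than a defined format.

```python
import json
from typing import Callable, Dict

# Keys every field description must carry (assumed schema).
REQUIRED_KEYS = ("name", "field_type", "page", "rect")


def write_form_fields(fields_json: str, add_field: Callable[[Dict], None]) -> int:
    """Parse intercepted field info and write the fields one by one.

    add_field stands in for the SDK call that creates one form field
    in the converted PDF. Returns the number of fields written.
    """
    fields = json.loads(fields_json)
    # Validate everything first so a bad entry does not leave the PDF half written.
    for info in fields:
        missing = [k for k in REQUIRED_KEYS if k not in info]
        if missing:
            raise ValueError(f"field {info.get('name', '?')!r} is missing {missing}")
    # Write page by page; sorted() is stable, so order within a page is kept.
    for info in sorted(fields, key=lambda f: f["page"]):
        add_field(info)
    return len(fields)
```

In a test setting, `add_field` can simply be `some_list.append`, which makes it easy to check which fields would be written and in what order before wiring in a real PDF library.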
Those skilled in the art will appreciate that the apparatus 300 may be used to implement the various embodiments of the method for making a PDF form file described above (e.g., the method 100 described in connection with fig. 1 and 2) and may achieve the same technical effects as the various embodiments of the method 100. For example, form design module 320 corresponds to a form designer, which may be used to insert form fields and adjust page content in an HTML file; the second format conversion module 330 may be a virtual printing module that converts the HTML file inserted into the form field into a PDF file by means of virtual printing. The same parts of the various embodiments of the apparatus 300 as in the method 100 are not described in detail. The various modules in device 300 may be implemented by computer software, hardware, firmware, or combinations thereof.
There is also provided, in accordance with an embodiment of the present invention, an apparatus for making a PDF form file, including: a memory configured to store computer readable instructions; and at least one processor coupled to the memory; the computer readable instructions, when executed by a processor, cause the processor to implement the method for making a PDF form file described in the above embodiments (e.g., method 100). The device in this embodiment may be various computers, computing devices, or information processing devices.
There is further provided, in accordance with an embodiment of the present invention, a computer storage medium having stored thereon computer readable instructions that, when executed, cause a computer to implement the above-described method for making a PDF form file, such as the various embodiments of the method 100 described in connection with fig. 1 and 2. The storage medium may be any of a variety of media that may be used to store program instructions and/or data, including, but not limited to, optical disks, hard disks, flash memory, RAM, ROM, EPROM, and the like, volatile or non-volatile storage media.
What has been described above is a preferred embodiment of the present invention. It will be understood by those skilled in the art that the various embodiments described above are illustrative only and not limiting, and that various modifications and changes can be made by those skilled in the art without departing from the spirit of the invention, which should fall within the scope of the invention, which is defined by the appended claims.

Claims (9)

1. A method for making a PDF form file, comprising:
converting the original format file into a hypertext markup language (HTML) file; wherein the original format file includes a table appearance;
inserting form fields with form appearances into the HTML file based on the form appearances of the original format file;
converting the HTML file with the inserted form fields into a PDF file; and
writing the information data of the form fields into the form fields in the PDF file.
2. The method of claim 1, wherein the original format file comprises a word processing file or a spreadsheet file; wherein converting the original format file into the HTML file includes: converting the original format file into an HTML file by invoking an Object Linking and Embedding (OLE) interface.
3. The method of claim 1, wherein the inserted form fields include form fields that contain only the appearance of the form field, and/or form fields that contain the appearance of the form field and metadata; wherein the form field's form appearance is generated based on the original format file's form appearance.
4. The method of claim 1, wherein the information data of the form field includes at least one of a form field name, a form field type, metadata, a page number, a position, a border style, a font size, an alignment, an option value, and an auto-fill setting of the form field.
5. The method of claim 1 or 4, wherein converting the HTML file with the inserted form field into a PDF file comprises:
converting the HTML file with the inserted form field into a PDF file by virtual printing, wherein the HTML file is printed into the PDF file according to a preset page configuration; and
intercepting the information data of the form field in the printing step.
6. The method of claim 1, further comprising: adjusting page content in the HTML file; wherein the adjusting the page content includes editing at least one of text, graphics, images, and tables in a page of the HTML file.
7. An apparatus for making a PDF form file, comprising:
a first format conversion module configured to convert the original format file into a hypertext markup language (HTML) file; wherein the original format file includes a table appearance;
a form design module configured to insert a form field having a form appearance in the HTML file based on the form appearance of the original format file;
a second format conversion module configured to convert the HTML file with the inserted form field into a PDF file; and
a form writing module configured to write the information data of the form field into the form field in the PDF file.
8. An apparatus for making a PDF form file, comprising:
a memory configured to store computer readable instructions; and
at least one processor coupled to the memory;
wherein the computer readable instructions, when executed by the processor, cause the processor to implement the method of any one of claims 1 to 6.
9. A computer storage medium having stored thereon computer readable instructions which, when executed, cause a computer to implement the method of any of claims 1 to 6.
CN202310142316.3A 2023-02-09 2023-02-09 Method and equipment for quickly manufacturing PDF (Portable document Format) form file Pending CN116050360A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310142316.3A CN116050360A (en) 2023-02-09 2023-02-09 Method and equipment for quickly manufacturing PDF (Portable document Format) form file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310142316.3A CN116050360A (en) 2023-02-09 2023-02-09 Method and equipment for quickly manufacturing PDF (Portable document Format) form file

Publications (1)

Publication Number Publication Date
CN116050360A true CN116050360A (en) 2023-05-02

Family

ID=86127392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310142316.3A Pending CN116050360A (en) 2023-02-09 2023-02-09 Method and equipment for quickly manufacturing PDF (Portable document Format) form file

Country Status (1)

Country Link
CN (1) CN116050360A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination