CN113609820B - Method, device and equipment for generating word file based on extensible markup language file - Google Patents

Info

Publication number
CN113609820B
Authority
CN
China
Prior art keywords
word
file
template
placeholder
markup language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110872843.0A
Other languages
Chinese (zh)
Other versions
CN113609820A (en)
Inventor
陈凯鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202110872843.0A
Publication of CN113609820A
Application granted
Publication of CN113609820B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/154 Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method, a device and equipment for generating Word files based on extensible markup language files. The method comprises the following steps: creating a Word style template; converting the Word style template into an extensible markup language format file; converting the extensible markup language format file into a FreeMarker file; generating a FreeMarker template based on the FreeMarker file; and outputting the FreeMarker template as a Word file. With this method, Word is no longer difficult to support and can be driven entirely by data, with personalized and dynamic display. Word generation is no longer fixed by the template: one template can generate the Word files required by a business scenario from different data formats, and users can generate Word files as they see fit without being bound to a template generated in advance. This makes generating Word files more convenient and reduces maintenance cost.

Description

Method, device and equipment for generating word file based on extensible markup language file
Technical Field
The present invention relates to the field of document format processing, and in particular, to a method, an apparatus, and a computer device for generating a word file based on an extensible markup language file.
Background
With the development of informatization, many business systems need to output Word documents from templates in a prescribed format. Simple Word content output is not enough: the document must be generated according to a standard format suited to a specific setting, for example papers, official documents and texts, or business process approval forms. Word is not open source, which makes developing against Word from Java extremely difficult. Few third-party plug-ins support developing against Word from Java, and those that exist offer limited functionality and support only basic editing of Word files. Moreover, these plug-ins offer only simple support for Word styles and require a large number of APIs and parameters, so developing against Word from Java takes a great deal of time for learning and testing. Compatibility is also poor: each version of Word needs its own generation method. The same is true of Excel.
For data-driven generation of Word files, most solutions use Apache POI and generate the file by replacing string placeholders. The format, style, placeholders, number of rows and number of pages are all fixed, which is very inflexible. This is workable for fixed, simple data but hard to apply when the data format is dynamic: a corresponding template must be written in advance for each data format, complex business scenarios require many templates to be maintained, and once a data field, style or format changes, all the templates must be modified, so maintenance cost is high.
Disclosure of Invention
In view of this, it is necessary to provide a method, a device and a computer device for generating a Word file based on an extensible markup language file, to address the following problem: when Word files are generated, templates must be written in advance for each dynamic data format; complex business scenarios require many templates to be maintained; and once a data field, style or format changes, every template must be modified, so maintenance cost is high.
A method for generating a Word file based on an extensible markup language file, comprising:
creating a Word style template;
converting the Word style template into an extensible markup language format file;
converting the extensible markup language format file into a FreeMarker file;
generating a FreeMarker template based on the FreeMarker file;
and outputting the FreeMarker template as a Word file.
In one embodiment, the creating the word style template includes:
setting placeholders for a word initialization template, establishing an association relation between the placeholders and data, editing the word initialization template in a mode of directly filling the placeholders, and finishing setting of the template to obtain a word style template.
In one embodiment, the creating the Word style template includes:
setting a placeholder for marking the position of the corresponding data in the Word template;
establishing an association relation table matched with the placeholder, the association relation table being used for storing the association relation between the placeholders and the corresponding data;
setting up the Word template according to the requirements of the user and setting placeholders at the corresponding positions in the Word template, thereby completing the creation of a personalized Word template;
acquiring the personalized Word template, parsing the content of the Word template using POI's ability to read and write Word files, and identifying the paragraph part and the table part in the template;
and processing the placeholders in each region differently according to the region.
In one embodiment, the processing the placeholders in each region differently according to the region includes:
if the processing is directed at the paragraph part, identifying the placeholders of the paragraph part according to the placeholder tags and, by querying the association relation table of the universal placeholders, retrieving the corresponding data from the database according to the association relation between the placeholder and the data and directly replacing the placeholder with it;
if the processing is directed at the table part, identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders in the cells, to obtain the Word style template.
In one embodiment, the identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders in the cells, to obtain the Word style template, includes:
if the cells all contain common placeholders, querying the association relation table of the universal placeholders and, according to the association relation between each placeholder and its data, retrieving the unique piece of data from the database and directly replacing the common placeholder with it;
if the cells all contain group placeholders, querying the association relation table of the universal placeholders and, according to the association relation between each placeholder and its group of data, outputting the matched group of data row by row in a loop, using the format set for the placeholder in each cell;
if the row contains both common placeholders and group placeholders, the group placeholders output their matched group of data row by row in a loop according to the association relation between the placeholder and its data, and the common placeholders are output once for every row produced by the group placeholders, so that their unique data is repeated in each of those rows.
In one embodiment, the converting the word style template into the extensible markup language format file includes:
Extracting a format object from the word style template; wherein, the objects with different formats have different object format information;
Dividing the word style template into at least one level of file blocks according to the format information of the format object;
and converting the divided file blocks of at least one level into an extensible markup language format file according to the label information corresponding to each file block and a preset extensible markup language format library.
In one embodiment, the converting the word style template into the extensible markup language format file includes:
Converting the content part in the word style template into a first extensible markup language file with a preset format standard through a word structuring engine;
extracting a format file in the word style template;
and supplementing the format file into the first extensible markup language file to generate a second extensible markup language file.
An apparatus for generating a Word file based on an extensible markup language file, comprising:
a creation module, used for creating a Word style template;
a first conversion module, used for converting the Word style template into an extensible markup language format file;
a second conversion module, used for converting the extensible markup language format file into a FreeMarker file;
a generating module, used for generating a FreeMarker template based on the FreeMarker file;
and an output module, used for outputting the FreeMarker template as a Word file.
A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions that, when executed by the processor, cause the processor to perform the steps of any of the above methods of generating word files based on extensible markup language files.
A storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of any of the above methods of generating word files based on extensible markup language files.
According to the above method, device and equipment for generating a Word file based on an extensible markup language file, a Word style template is created, the Word style template is converted into an extensible markup language format file, the extensible markup language format file is converted into a FreeMarker file, a FreeMarker template is generated based on the FreeMarker file, and the FreeMarker template is output as a Word file. As a result, Word is no longer difficult to support and can be driven entirely by data, with personalized and dynamic display. Word generation is no longer fixed by the template: one template can generate the Word files required by a business scenario from different data formats, and users can generate Word files as they see fit without being bound to a template generated in advance, which makes generating Word files more convenient and reduces maintenance cost.
Drawings
FIG. 1 is a diagram of an implementation environment for a method of generating word files based on extensible markup language files, as provided in one embodiment;
FIG. 2 is a block diagram of the internal architecture of a computer device in one embodiment;
FIG. 3 is a flow diagram of a method of generating word files based on extensible markup language files in one embodiment;
FIG. 4 is a flow diagram of creating word style templates in one embodiment;
FIG. 5 is a flow diagram of identifying form part placeholders from placeholder tags, in one embodiment;
FIG. 6 is a flow diagram of converting word style templates to extensible markup language format files, in one embodiment;
FIG. 7 is a block diagram of an apparatus for generating word files based on extensible markup language files in one embodiment;
FIG. 8 is a block diagram of the architecture of an invasive modeling block in one embodiment;
FIG. 9 is a block diagram of a computer device in one embodiment;
FIG. 10 is a schematic diagram of a storage medium storing computer readable instructions in one embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It is to be understood that the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance. It will also be understood that, although the terms "first," "second," "third," etc. may be used in this document to describe various elements in some embodiments of the application, these elements should not be limited by these terms. These terms are only used to distinguish between various elements.
Fig. 1 is a diagram of an implementation environment of a method for generating a word file based on an extensible markup language file according to an embodiment, where, as shown in fig. 1, a computer device 110 is included in the implementation environment, and the computer device 110 may be used to implement the method for generating a word file based on an extensible markup language file according to the embodiment. It should be noted that the computer device 110 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, etc., but is not limited thereto.
FIG. 2 is a schematic diagram of the internal structure of a computer device in one embodiment. As shown in fig. 2, the computer device includes a processor, a storage medium, a memory, and a network interface connected by a system bus. The nonvolatile storage medium of the computer device stores an operating system, a database and computer readable instructions, the database can store a control information sequence, and when the computer readable instructions are executed by a processor, the processor can realize a method for generating word files based on extensible markup language files. The processor of the computer device is used to provide computing and control capabilities, supporting the operation of the entire computer device. The memory of the computer device may have stored therein computer readable instructions that, when executed by the processor, cause the processor to perform a method of generating a word file based on an extensible markup language file. The network interface of the computer device is for communicating with a terminal connection. It will be appreciated by persons skilled in the art that the architecture shown in fig. 2 is merely a block diagram of some of the architecture relevant to the present inventive arrangements and is not limiting as to the computer device to which the present inventive arrangements are applicable, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
As shown in fig. 3, in one embodiment, a method for generating a Word file based on an extensible markup language file is provided, which may specifically include the following steps:
S10, creating a Word style template.
The Word style template is used for describing information such as the framework and style of the Word document. The content of the Word style template may include format objects and the object format information corresponding to each format object. The format objects of the Word style template may include at least one of a text object, a table object, a picture object, and a formula object. The object format information corresponding to a text object may be referred to as text object format information, that corresponding to a table object as table object format information, that corresponding to a picture object as picture object format information, and that corresponding to a formula object as formula object format information.
For example, to ensure the correctness of the table object format information, a table format adding function may be newly added in the template making toolbar, and a table format extension attribute may be added to the table, so as to facilitate adding the table object format information in the word style template.
In order to avoid the missing of the text object format information, the formula object format information, the picture object format information or the table object format information, default text object format information, formula object format information, picture object format information and table object format information can be preset in the word style template. For example, if text object format information is not specifically set when a word style template is created, default text object format information is called when the word style template is created.
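The fallback to default format information described above can be sketched as follows. This is a minimal illustration in Python, not the patent's implementation: the `DEFAULT_FORMATS` table, its keys and values, and the `resolve_format` helper are all hypothetical names chosen for the example.

```python
# Hypothetical default format information per object type (values are illustrative only).
DEFAULT_FORMATS = {
    "text": {"font": "SimSun", "size": "12pt", "color": "black"},
    "table": {"border": "single"},
    "picture": {"align": "center"},
    "formula": {"align": "center"},
}


def resolve_format(object_type, explicit=None):
    """Return the explicit format settings layered over the defaults for object_type."""
    resolved = dict(DEFAULT_FORMATS[object_type])  # copy so the defaults stay untouched
    resolved.update(explicit or {})                # explicit settings win over defaults
    return resolved
```

Calling `resolve_format("text")` returns the defaults unchanged, while `resolve_format("text", {"size": "14pt"})` overrides only the size and keeps the default font and color.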
In certain embodiments, step S10 comprises: setting placeholders for a word initialization template, establishing an association relation between the placeholders and data, editing the word initialization template in a mode of directly filling the placeholders, and finishing setting of the template to obtain a word style template.
Referring to fig. 4, in some embodiments, step S10 further comprises the steps of:
Step S101: setting a placeholder for marking the position of the corresponding data in the Word template. In the Word template, a marker is placed immediately before and immediately after each placeholder.
For example, "*" may be placed on both sides of the placeholder as the marker, i.e. in the form "*placeholder*". Placeholders include universal placeholders and custom placeholders.
A universal placeholder is a preset placeholder of general applicability; a custom placeholder is one that the user later defines as needed, and custom placeholders can only be common placeholders.
Universal placeholders include common placeholders and group placeholders. Each common placeholder uniquely points to one piece of corresponding data. Examples of common placeholders are: *phoneNUMBER*, *provice*, *villagename*, *EMAILADDRESS*, *RECEIVERNAME*, *houseNUMBER*, and the like. When a common placeholder is used in the Word template, it is replaced by the matching data in the final generated file.
Each group placeholder is prefixed with "GROUP" as a tag and uniquely points to a corresponding group of data, for example an array. Examples of group placeholders are: *GROUP_number*, *GROUP_CODESET*, *GROUP_family*, *GROUP_membergroup*, *GROUP_team*, and so forth.
When group placeholders are used in Word templates, they are replaced in a loop with the corresponding group of data in the final generated file.
Step S102: establishing an association relation table matched with the placeholders; the association relation table is used for storing the association relation between the placeholders and the corresponding data so as to accurately display the data at the matched positions in the final file generation process. The association table may be, for example, a dictionary data table.
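A minimal sketch of steps S101 and S102 in Python may help. It assumes that "*" is the marker on both sides of a placeholder and that group placeholders carry a "GROUP_" prefix; the regular expression, the `ASSOCIATION_TABLE` contents and the function names are hypothetical, and a plain dictionary stands in for the dictionary data table.

```python
import re

# Assumption: "*" marks both sides of a placeholder, e.g. *RECEIVERNAME*.
PLACEHOLDER = re.compile(r"\*(\w+)\*")

# Hypothetical association relation table: placeholder name -> key of the data it points to.
ASSOCIATION_TABLE = {
    "RECEIVERNAME": "receiver_name",
    "GROUP_family": "family_members",
}


def find_placeholders(text):
    """Return (name, kind) pairs for every placeholder in text, in order of appearance."""
    found = []
    for match in PLACEHOLDER.finditer(text):
        name = match.group(1)
        # Assumption: the "GROUP_" prefix is the tag that identifies a group placeholder.
        kind = "group" if name.startswith("GROUP_") else "common"
        found.append((name, kind))
    return found


def data_key_for(name):
    """Look up which data a placeholder is associated with."""
    return ASSOCIATION_TABLE[name]
```

For example, `find_placeholders("Dear *RECEIVERNAME*, members: *GROUP_family*")` yields one common and one group placeholder, and `data_key_for` tells the generator which data to fetch for each.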
Step S103: setting word templates according to the user demands, setting placeholders at corresponding positions in the word templates, and completing the establishment of the personalized word templates.
Step S104: acquiring the personalized Word template created by the user, parsing the content of the Word template using POI's ability to read and write Word files, and identifying the paragraph part and the table part in the template; the placeholders in each region are processed differently according to the region: for the paragraph part, go to step S105; for the table part, go to step S106.
POI (Poor Obfuscation Implementation) is an open-source, cross-platform application program interface written in Java, through which Java programs can read and write Microsoft Office format documents.
Step S105: identifying the placeholders of the paragraph part according to the placeholder tags and, by querying the association relation table of the universal placeholders, retrieving the corresponding data from the database and directly replacing each placeholder with it.
The common placeholders and group placeholders of the paragraph part are identified from the placeholder tags.
For a common placeholder, the association relation table of the universal placeholders is queried and, according to the association relation between the placeholder and its data, the unique piece of data is retrieved from the database and directly replaces the placeholder.
For a group placeholder, the association relation table of the universal placeholders is queried and, according to the association relation between the placeholder and its group of data, the data are output in a loop downward from the current character position, using the format set at that position (font type, font color, cell fill, cell shading and the like).
In some embodiments, custom placeholders in the paragraph part are processed in the same way as universal placeholders in the paragraph part.
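The paragraph handling of step S105 can be sketched roughly as below. This is a simplified Python illustration under stated assumptions: "*" is assumed to be the marker, dictionaries stand in for the association relation table and the database, and formatting (font, color, shading) is ignored. `fill_paragraph` and its parameters are hypothetical names.

```python
import re

# Assumption: "*" marks both sides of a placeholder.
PLACEHOLDER = re.compile(r"\*(\w+)\*")


def fill_paragraph(text, association, database):
    """Replace every placeholder in a paragraph with its associated data.

    A single value replaces the placeholder directly; a group of values is
    written one item per line, continuing downward from the placeholder position.
    """
    def substitute(match):
        value = database[association[match.group(1)]]
        if isinstance(value, (list, tuple)):
            return "\n".join(str(item) for item in value)
        return str(value)

    return PLACEHOLDER.sub(substitute, text)
```

With `association = {"RECEIVERNAME": "name"}` and `database = {"name": "Li"}`, the paragraph `"To *RECEIVERNAME*"` becomes `"To Li"`.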
Step S106: identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders in the cells, to obtain the Word style template.
Referring to fig. 5, step S106 may include the following steps:
S1061, if the cells all contain common placeholders, querying the association relation table of the universal placeholders and, according to the association relation between each placeholder and its data, retrieving the unique piece of data from the database and directly replacing the common placeholder with it;
S1062, if the cells all contain group placeholders, querying the association relation table of the universal placeholders and, according to the association relation between each placeholder and its group of data, outputting the matched group of data row by row in a loop, using the format set for the placeholder in each cell, including font type and font color;
S1063, if the row contains both common placeholders and group placeholders, the group placeholders output their matched group of data row by row in a loop according to the association relation between the placeholder and its data, and the common placeholders are output once for every row produced by the group placeholders, so that their unique data is repeated in each of those rows.
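The three table cases can be illustrated with one small row-expansion routine. The Python sketch below is an illustration only: it assumes "*" markers and a "GROUP_" prefix, treats each cell as holding at most one placeholder, uses dictionaries in place of the association relation table and the database, and ignores cell formatting. `expand_row` is a hypothetical name.

```python
import re

# Assumption: a cell holds exactly one placeholder of the form *name*, or literal text.
CELL_PLACEHOLDER = re.compile(r"^\*(\w+)\*$")


def expand_row(cells, association, database):
    """Expand one template row into the output rows it produces."""
    names = []
    for cell in cells:
        match = CELL_PLACEHOLDER.match(cell)
        names.append(match.group(1) if match else None)

    def data_for(name):
        return database[association[name]]

    # Group placeholders decide how many rows are produced; with none, one row.
    group_sizes = [len(data_for(n)) for n in names if n and n.startswith("GROUP_")]
    row_count = max(group_sizes) if group_sizes else 1

    rows = []
    for index in range(row_count):
        row = []
        for cell, name in zip(cells, names):
            if name is None:
                row.append(cell)                      # literal text is copied as-is
            elif name.startswith("GROUP_"):
                values = data_for(name)               # one item of the group per row
                row.append(str(values[index]) if index < len(values) else "")
            else:
                row.append(str(data_for(name)))       # common value repeated per row
        rows.append(row)
    return rows
```

A row of only common placeholders produces a single output row, a row of group placeholders produces one row per group item, and in a mixed row the common value is repeated alongside each group item.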
S20, converting the word style template into an extensible markup language format file.
Extensible markup language, XML for short, is a subset of the Standard Generalized Markup Language (SGML); it is a markup language used to mark up electronic documents so that they are structured.
Referring to fig. 6, in some embodiments, S20 includes:
S201, extracting a format object from a word style template; wherein the different format objects have different object format information. The word style template may include at least one of a text object, a table object, a picture object, a formula object, and the like.
In the above step S201, the word style templates have different object format information between different format objects.
Text object format information may include text position, word spacing, paragraph format, text color, text font, font type, and font size, among others.
The picture object format information may include a picture position, a picture format, a picture size, a picture color, a picture shape, and the like.
The table object format information may include a table position, a table row number, a table column number, and an object in each cell, where the object in the cell may also include an object such as a text object, a formula object, and a picture object, and the object in the cell also has corresponding format information. For example, if the object in the cell is a text object, the text object in the cell also has format information such as a text position, a word spacing, a word color, a word font, a font type, and a word size of the text object. If the object in the cell is a picture object, the picture object in the cell may also have format information such as a picture position, a picture format, a picture size, a picture color, and a picture shape.
The formula object format information may include formula location, formula content, formula size, formula format, and the like.
If the format object in the cell is a formula object, the formula object in the cell also has format information such as formula position, formula content, formula size, and formula format of the formula object.
S202, dividing the word style template into at least one level of file blocks according to the format information of the format object; wherein the file block attributes are different among the file blocks of the same level.
In the above step S202, the word style templates are divided into text object blocks, picture object blocks, table object blocks, and formula object blocks according to the text objects, picture objects, table objects, and formula objects.
In some embodiments, the text object blocks may be further divided, for example, into abstract part file blocks, body part file blocks, and reference part file blocks. The abstract part file block can in turn be divided into parts such as title, abstract, author, affiliation and keywords; the body part file block is divided into sections and paragraphs; and the reference part file block is subdivided into the label number and the name of each reference, and each reference can be fragmented further.
S203, converting the divided file blocks of at least one level into an extensible markup language format file according to the label information corresponding to each file block and a preset extensible markup language format library.
In the above step S203, an extensible markup language format library including file block tag information and extensible markup language is stored in advance.
For each file block obtained by the division, the matching extensible markup language markup is looked up in the extensible markup language format library according to the tag information carried by the file block, and the file block is stored using that markup; once all the file blocks have been stored, the divided file blocks have been converted into the extensible markup language format file.
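Steps S201 to S203 can be pictured with the following Python sketch, which maps labeled file blocks to XML elements through a lookup table. It is an assumption-laden illustration: the `XML_FORMAT_LIBRARY` mapping, the element names, and the dictionary shape of a block (`label`, `format`, `content`, `children`) are hypothetical and are not the patent's actual format library.

```python
import xml.etree.ElementTree as ET

# Hypothetical format library: file block label -> XML element name.
XML_FORMAT_LIBRARY = {
    "text": "paragraph",
    "table": "table",
    "picture": "image",
    "formula": "formula",
}


def _append_block(parent, block):
    """Store one file block (and its nested blocks) under parent."""
    tag = XML_FORMAT_LIBRARY[block["label"]]            # look up markup by block label
    element = ET.SubElement(parent, tag, block.get("format", {}))
    element.text = block.get("content")
    for child in block.get("children", []):             # lower-level file blocks
        _append_block(element, child)


def blocks_to_xml(blocks):
    """Convert a list of file blocks into one XML document string."""
    root = ET.Element("document")
    for block in blocks:
        _append_block(root, block)
    return ET.tostring(root, encoding="unicode")
```

Format information is carried as attributes here, and nesting in `children` corresponds to the multiple levels of file blocks.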
In certain embodiments, S20 comprises:
1) Converting the content part in the Word style template into a first extensible markup language file with a preset format standard through a Word structuring engine.
Specifically, taking a word file in a paper format as an example, a paper content part in the word file can be converted into a first extensible markup language file meeting a preset standard through a word structuring engine.
2) Extracting the format file in the Word style template.
The format file in the word style template comprises a format object and corresponding object format information. According to one embodiment of the present invention, the format objects in the word style templates may include: at least one of a text object, a picture object, a form object, and a formula object. Each picture object has unique corresponding picture object format information, each table object has unique corresponding table object format information, and each formula object has unique corresponding formula object format information.
Specifically, the layout.xml file in the Word file is extracted.
3) The format file is appended to the first extensible markup language file to generate a second extensible markup language file.
Specifically, the layout. Extensible markup language file in the extracted word file is appended to the first extensible markup language file, and a second extensible markup language file is generated. For example, a layout. EXtensible markup language file is exported into a custom eXtensible markup language folder of a word file.
In an embodiment of the present invention, the step of supplementing the format file into the first extensible markup language file and generating the second extensible markup language file may specifically include: and establishing a matching relationship between at least one picture object, a table object and a formula object and corresponding format information. For example, each picture object, table object and formula object is marked with a unique identification, and the index of the identification is increased on the format information of the corresponding object.
And supplementing format information into the first extensible markup language file according to the matching relation, reading the assembly rule of the paper assembly metadata, generating a paper quotation format, a paper number and a DOI (digital object identification number, digital Object Identifier, abbreviated as DOI) according to the assembly rule, supplementing the paper quotation format, the paper number and the DOI into the first extensible markup language file, and generating a second extensible markup language file.
The typesetting file may include, for example, non-article information including data of header and side header areas of the paper and content static decoration data.
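The matching relationship described above can be illustrated with a minimal Python sketch. It assumes that each picture, table, or formula object carries a unique `id` attribute and that the format information is attached in a separate `layout` element whose entries point back to the object through a `ref` index. The element names, attribute names, and the function `append_format` are hypothetical; the text does not specify the XML schema.

```python
import xml.etree.ElementTree as ET

def append_format(first_xml: str, formats: dict) -> str:
    """Return a second XML string: the first XML plus format info keyed by object id."""
    root = ET.fromstring(first_xml)
    # Collect the identified objects first, so the tree is not modified while iterating.
    objects = [el for el in root.iter() if el.get("id") in formats]
    layout = ET.SubElement(root, "layout")  # hypothetical container for format info
    for obj in objects:
        obj_id = obj.get("id")
        # The "ref" attribute is the index back to the object's unique identifier.
        fmt = ET.SubElement(layout, "format", ref=obj_id)
        for key, value in formats[obj_id].items():
            fmt.set(key, value)
    return ET.tostring(root, encoding="unicode")

first_xml = '<paper><p>Intro</p><picture id="pic1"/><table id="tab1"/></paper>'
formats = {"pic1": {"width": "320", "align": "center"}, "tab1": {"border": "1"}}
second_xml = append_format(first_xml, formats)
print(second_xml)
```

Keeping the format information in its own element, linked only by identifier, leaves the content part untouched, which matches the idea of supplementing the first file rather than rewriting it.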
S30, converting the extensible markup language format file into a FreeMarker file.
A format conversion tool in the toolkit performs format conversion on the extensible markup language format file to generate a FreeMarker file, that is, text in .ftl format. FreeMarker is a template engine: a generic tool that generates output text from templates and changing data. It is not aimed at end users; it is a Java class library, a component that programmers can embed in the products they develop. The toolkit may be packaged as a JAR. JAR stands for Java Archive; a JAR file is a compressed file that is compatible with the common ZIP format, and is therefore also referred to as a JAR package.
The principle of FreeMarker is template + data model = output. The template is responsible only for how the data is presented in the page and does not involve any logic code; all logic is handled by the data model. FreeMarker can therefore completely separate the presentation layer from the business logic. The output that the user ultimately sees is produced by combining the template with the data model. Using FreeMarker thus overcomes the following technical drawback of the prior art: when JSP is used for development, pages contain a large amount of business logic code, which makes the page content messy and later modification and maintenance difficult.
In addition, in prior-art development practice, JSP pages are used to present data. A JSP must be converted into a Servlet class the first time it is executed, and when functions are adjusted during the development stage the JSP is modified frequently; each modification requires compilation and conversion, which wastes time. FreeMarker templates, by contrast, do not need to be compiled and converted, so this waste of time does not arise. FreeMarker can therefore greatly improve development efficiency.
FreeMarker also makes the division of labor clearer during development. Interface developers only need to concentrate on creating HTML files, images, and other visual aspects of Web pages, and do not need to concern themselves with the data; program developers concentrate on system implementation and are responsible for preparing the data to be displayed in the pages, which improves working efficiency.
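The principle template + data model = output can be illustrated with a small sketch. The example below uses Python's standard `string.Template` purely as a stand-in for a template engine; it is not FreeMarker, although FreeMarker templates use a similar `${...}` interpolation syntax. The field names and the helper `render` are hypothetical.

```python
from string import Template

def render(template_text: str, data_model: dict) -> str:
    """Combine a template with a data model to produce the output text."""
    return Template(template_text).substitute(data_model)

# The template only describes presentation; it contains no business logic.
template_text = "Policy holder: ${name}\nPremium: ${premium}"

# The data model is prepared separately by program code.
data_model = {"name": "Zhang San", "premium": "1200.00"}

output = render(template_text, data_model)
print(output)
```

Because the template and the data model are independent, the same template can be rendered again with a different data model without being edited.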
A classification tool in the toolkit classifies the FreeMarker files to obtain classified FreeMarker files. Specifically, the classification tool classifies the content of the FreeMarker files according to a preset order, which may be the priority order of the FreeMarker files or the chronological order in which the FreeMarker files were generated.
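As a rough sketch of ordering files by a preset sequence, the following Python example sorts file records either by priority or by generation time. The record type `TemplateFile`, its fields, and the function `classify` are hypothetical; the text does not describe how the classification tool stores priority or timestamps.

```python
from dataclasses import dataclass

@dataclass
class TemplateFile:
    name: str
    priority: int      # assumed: a smaller number means a higher priority
    created_at: float  # assumed: generation time as a Unix timestamp

def classify(files, by="priority"):
    """Return the files ordered by priority or by generation time."""
    if by == "priority":
        return sorted(files, key=lambda f: f.priority)
    return sorted(files, key=lambda f: f.created_at)

files = [
    TemplateFile("summary.ftl", priority=2, created_at=100.0),
    TemplateFile("cover.ftl", priority=1, created_at=300.0),
    TemplateFile("detail.ftl", priority=3, created_at=200.0),
]
print([f.name for f in classify(files, by="priority")])
print([f.name for f in classify(files, by="time")])
```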
S40, generating a FreeMarker template based on the FreeMarker file.
A formatting tool in the toolkit formats the FreeMarker file to generate a FreeMarker template. The FreeMarker file maps the pre-written key values to the incoming data, and determines which styles are used to present different data of the same data format and which modules are presented. The FreeMarker template is obtained once the FreeMarker file has been formatted, and a finished FreeMarker template can be reused many times. Setting the key values makes it convenient to set the format information.
S50, outputting the FreeMarker template as a word file.
The FreeMarker template is converted into a word file in doc format by a function of the Windows system and is then output. The Windows system provides a conversion tool that can convert the FreeMarker template into a word file in doc format.
With the method for generating a word file based on an extensible markup language file described above, word generation is no longer difficult to support: like a front end, it can be driven entirely by data, which allows personalized and dynamic display. Word generation is not tied to one fixed template; a single template can generate the files required by a business scenario from different data formats. Client users can generate word files according to their own ideas, choosing the colors and styles they like and arranging the layout as they wish, without being limited to a template generated in advance. However complex the data, it can be handled through key values set in advance. Generating word files therefore becomes more convenient and maintenance costs fall, which overcomes the following defect of the prior art: when a word file is generated, corresponding templates must be written in advance for each dynamic data format; complex business scenarios require many templates to be maintained; and once a data field, style, or format changes, all the templates must be modified, so maintenance costs are high.
Referring to FIG. 7, in one embodiment, an apparatus for generating a word file based on an extensible markup language file is provided, comprising:
the creation module is used for creating a word style template;
the first conversion module is used for converting the word style template into an extensible markup language format file;
the second conversion module is used for converting the extensible markup language format file into a FreeMarker file;
the generating module is used for generating a FreeMarker template based on the FreeMarker file;
and the output module is used for outputting the FreeMarker template as a word file.
In some embodiments, the creation module is specifically configured to:
setting placeholders for a word initialization template, establishing an association relation between the placeholders and data, editing the word initialization template in a mode of directly filling the placeholders, and finishing setting of the template to obtain a word style template.
Referring to FIG. 8, in some embodiments, the creation module includes:
the first setting unit is used for setting placeholders and marking the positions of the corresponding data in the word template;
The establishing unit is used for establishing an association relation table matched with the placeholder; the association relation table is used for storing association relation between the placeholders and the corresponding data;
The second setting unit is used for setting word templates according to the requirements of users, setting placeholders at corresponding positions in the word templates, and completing the establishment of personalized word templates;
the acquisition unit is used for acquiring the personalized word template, parsing the content of the word template using the ability of POI to read and write word files, and identifying the paragraph part and the table part in the template;
and the processing unit is used for processing the placeholders in each region differently according to the region.
In some embodiments, the processing unit comprises:
the first sub-processing unit is used for, when the paragraph part is being processed, identifying the placeholders of the paragraph part according to the placeholder tags, querying the association relation table of the universal placeholders, and retrieving the corresponding data from the database according to the association relation between each placeholder and its corresponding data, so as to directly replace the placeholder;
and the second sub-processing unit is used for, when the table part is being processed, identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells, to obtain the word style template.
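The paragraph-part replacement can be sketched as follows. This is a minimal Python illustration, not the POI-based implementation: the `{{name}}` tag syntax, the dictionary standing in for the association relation table, the dictionary standing in for the database, and the function `fill_paragraph` are all assumptions made for the example.

```python
import re

# Hypothetical placeholder tag syntax: {{placeholder_name}}
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def fill_paragraph(text: str, association: dict, database: dict) -> str:
    """Replace each placeholder with the data it is associated with."""
    def lookup(match):
        field = association[match.group(1)]  # placeholder -> data field
        return str(database[field])          # data field -> stored value
    return PLACEHOLDER.sub(lookup, text)

# Stand-in for the association relation table.
association = {"insured": "customer.name", "policy_no": "policy.number"}
# Stand-in for the database.
database = {"customer.name": "Li Lei", "policy.number": "PA-0001"}

paragraph = "Insured: {{insured}}, policy {{policy_no}}."
filled = fill_paragraph(paragraph, association, database)
print(filled)
```

Going through the association table, rather than naming database fields directly in the template, lets the template stay unchanged when the underlying data source is renamed.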
In some embodiments, identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells to obtain the word style template includes:
if the cells all contain common placeholders, querying the association relation table of the common placeholders and retrieving a unique piece of data from the database, according to the association relation between each placeholder and its corresponding data, to directly replace the common placeholder;
if the cells all contain group placeholders, querying the association relation table of the universal placeholders and, according to the format set for the placeholder in each cell and the association relation between each placeholder and its corresponding group of data, sequentially outputting the matched group of data in a loop;
if the row contains both common placeholders and group placeholders, the group placeholders sequentially output the matched group of data in a loop according to the association relation between the placeholders and the corresponding data, and the common placeholders are output as many times as the group placeholders loop, so that the unique data corresponding to each common placeholder is repeated in every row.
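The three table cases above can be sketched as a single row-expansion routine. The Python example below is illustrative only: the `(kind, key)` cell representation, the two dictionaries standing in for associated data, and the function `expand_row` are hypothetical, and it assumes that all group placeholders in one row are bound to groups of the same length.

```python
def expand_row(cells, common_data, group_data):
    """Expand one template row into output rows.

    cells: list of (kind, key) pairs, where kind is "common" or "group".
    common_data: key -> single value; group_data: key -> list of values.
    """
    group_keys = [key for kind, key in cells if kind == "group"]
    if not group_keys:
        # Only common placeholders: one row, each cell replaced directly.
        return [[common_data[key] for _, key in cells]]
    # Assumption: every group in the row has the same number of items.
    count = len(group_data[group_keys[0]])
    return [
        [group_data[key][i] if kind == "group" else common_data[key]
         for kind, key in cells]
        for i in range(count)  # common values repeat once per group item
    ]

common_data = {"policy_no": "PA-0001"}
group_data = {"item": ["Glass", "Tyre"], "amount": [300, 450]}

mixed = expand_row(
    [("common", "policy_no"), ("group", "item"), ("group", "amount")],
    common_data, group_data,
)
print(mixed)
```

In the mixed case the common value is repeated on every generated row, while each group placeholder advances through its own list.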
In certain embodiments, the first conversion module comprises:
the extraction unit is used for extracting a format object from the word style template; wherein, the objects with different formats have different object format information;
The dividing unit is used for dividing the word style template into at least one level of file blocks according to the format information of the format object;
the conversion unit is used for converting the divided file blocks of at least one level into extensible markup language format files according to the label information corresponding to each file block and a preset extensible markup language format library.
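A minimal Python sketch of converting file blocks to XML through a tag library is shown below. It handles a single level of blocks only, and the tag names in `TAG_LIBRARY`, the `document` root element, and the function `blocks_to_xml` are hypothetical stand-ins for the preset extensible markup language format library, whose contents the text does not specify.

```python
import xml.etree.ElementTree as ET

# Hypothetical format library: format object type -> XML tag name.
TAG_LIBRARY = {"text": "p", "picture": "img", "table": "tbl", "formula": "math"}

def blocks_to_xml(blocks) -> str:
    """Convert a flat list of (object_type, content) blocks into an XML string."""
    root = ET.Element("document")
    for object_type, content in blocks:
        # The tag for each block is looked up from its format object type.
        block = ET.SubElement(root, TAG_LIBRARY[object_type])
        block.text = content
    return ET.tostring(root, encoding="unicode")

blocks = [("text", "Hello"), ("picture", "logo.png")]
xml_text = blocks_to_xml(blocks)
print(xml_text)
```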
In certain embodiments, the first conversion module comprises:
The conversion sub-module is used for converting the content part in the word style template into a first extensible markup language file with a preset format standard through a word structuring engine;
the extraction submodule is used for extracting the format file in the word style template;
And the supplementing sub-module is used for supplementing the format file into the first extensible markup language file and generating a second extensible markup language file.
In one embodiment, a computer device is presented, the computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
Creating a word style template;
converting the word style template into an extensible markup language format file;
converting the extensible markup language format file into a FreeMarker file;
generating a FreeMarker template based on the FreeMarker file;
and outputting the FreeMarker template as a word file.
In one embodiment, the step of creating word style templates performed by the processor comprises: setting placeholders for a word initialization template, establishing an association relation between the placeholders and data, editing the word initialization template in a mode of directly filling the placeholders, and finishing setting of the template to obtain a word style template.
In one embodiment, the creating word style templates performed by the processor comprises:
setting a placeholder for marking the position of the corresponding data in the word template;
Establishing an association relation table matched with the placeholder; the association relation table is used for storing association relation between the placeholders and the corresponding data;
setting word templates according to the requirements of users, setting placeholders at corresponding positions in the word templates, and completing the establishment of personalized word templates;
Acquiring the personalized word template, parsing the content of the word template using the ability of POI to read and write word files, and identifying the paragraph part and the table part in the template;
processing the placeholders in each region differently according to the region.
In one embodiment, the step, performed by the processor, of processing the placeholders in each region differently according to the region includes:
If the paragraph part is being processed, identifying the placeholders of the paragraph part according to the placeholder tags, querying the association relation table of the universal placeholders, and retrieving the corresponding data from the database according to the association relation between each placeholder and its corresponding data, so as to directly replace the placeholder;
If the table part is being processed, identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells, to obtain the word style template.
In one embodiment, the step, performed by the processor, of identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells to obtain the word style template includes:
if the cells all contain common placeholders, querying the association relation table of the common placeholders and retrieving a unique piece of data from the database, according to the association relation between each placeholder and its corresponding data, to directly replace the common placeholder;
if the cells all contain group placeholders, querying the association relation table of the universal placeholders and, according to the format set for the placeholder in each cell and the association relation between each placeholder and its corresponding group of data, sequentially outputting the matched group of data in a loop;
if the row contains both common placeholders and group placeholders, the group placeholders sequentially output the matched group of data in a loop according to the association relation between the placeholders and the corresponding data, and the common placeholders are output as many times as the group placeholders loop, so that the unique data corresponding to each common placeholder is repeated in every row.
In one embodiment, the converting the word style templates to extensible markup language format files performed by the processor includes:
Extracting a format object from the word style template; wherein, the objects with different formats have different object format information;
Dividing the word style template into at least one level of file blocks according to the format information of the format object;
and converting the divided file blocks of at least one level into an extensible markup language format file according to the label information corresponding to each file block and a preset extensible markup language format library.
In one embodiment, the converting the word style templates to extensible markup language format files performed by the processor includes:
Converting the content part in the word style template into a first extensible markup language file with a preset format standard through a word structuring engine;
extracting a format file in the word style template;
and supplementing the format file into the first extensible markup language file to generate a second extensible markup language file.
Referring to fig. 9, the computer device 10 of one implementation of the present embodiment may include: a processor 100, a memory 101, a bus 102 and a communication interface 103, the processor 100, the communication interface 103 and the memory 101 being connected by the bus 102; the memory 101 stores a computer program executable on the processor 100, and the processor 100 executes the method according to any of the foregoing embodiments of the present application when the computer program is executed.
The memory 101 may include a high-speed random access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. The communication connection between this system network element and at least one other network element is implemented through at least one communication interface 103 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
Bus 102 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. The memory 101 is configured to store a program, and the processor 100 executes the program after receiving an execution instruction, and the method disclosed in any of the foregoing embodiments of the present application may be applied to the processor 100 or implemented by the processor 100.
The processor 100 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 100 or by instructions in the form of software. The processor 100 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor can implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 101; the processor 100 reads the information in the memory 101 and, in combination with its hardware, performs the steps of the method described above.
The computer device provided by the embodiments of the present application is based on the same inventive concept as the method provided by the embodiments of the present application, and has the same beneficial effects as the method it adopts, runs, or implements.
In one embodiment, a storage medium storing computer-readable instructions is provided; when executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the following steps:
Creating a word style template;
converting the word style template into an extensible markup language format file;
converting the extensible markup language format file into a FreeMarker file;
generating a FreeMarker template based on the FreeMarker file;
and outputting the FreeMarker template as a word file.
In one embodiment, the step of creating word style templates performed by the one or more processors includes: setting placeholders for a word initialization template, establishing an association relation between the placeholders and data, editing the word initialization template in a mode of directly filling the placeholders, and finishing setting of the template to obtain a word style template.
In one embodiment, the creating word style templates performed by the one or more processors includes:
setting a placeholder for marking the position of the corresponding data in the word template;
Establishing an association relation table matched with the placeholder; the association relation table is used for storing association relation between the placeholders and the corresponding data;
setting word templates according to the requirements of users, setting placeholders at corresponding positions in the word templates, and completing the establishment of personalized word templates;
Acquiring the personalized word template, parsing the content of the word template using the ability of POI to read and write word files, and identifying the paragraph part and the table part in the template;
processing the placeholders in each region differently according to the region.
In one embodiment, the step, performed by the one or more processors, of processing the placeholders in each region differently according to the region includes:
If the paragraph part is being processed, identifying the placeholders of the paragraph part according to the placeholder tags, querying the association relation table of the universal placeholders, and retrieving the corresponding data from the database according to the association relation between each placeholder and its corresponding data, so as to directly replace the placeholder;
If the table part is being processed, identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells, to obtain the word style template.
In one embodiment, the step, performed by the one or more processors, of identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells to obtain the word style template includes:
if the cells all contain common placeholders, querying the association relation table of the common placeholders and retrieving a unique piece of data from the database, according to the association relation between each placeholder and its corresponding data, to directly replace the common placeholder;
if the cells all contain group placeholders, querying the association relation table of the universal placeholders and, according to the format set for the placeholder in each cell and the association relation between each placeholder and its corresponding group of data, sequentially outputting the matched group of data in a loop;
if the row contains both common placeholders and group placeholders, the group placeholders sequentially output the matched group of data in a loop according to the association relation between the placeholders and the corresponding data, and the common placeholders are output as many times as the group placeholders loop, so that the unique data corresponding to each common placeholder is repeated in every row.
In one embodiment, the converting the word style templates to extensible markup language format files performed by the one or more processors includes:
Extracting a format object from the word style template; wherein, the objects with different formats have different object format information;
Dividing the word style template into at least one level of file blocks according to the format information of the format object;
and converting the divided file blocks of at least one level into an extensible markup language format file according to the label information corresponding to each file block and a preset extensible markup language format library.
In one embodiment, the converting the word style templates to extensible markup language format files performed by the one or more processors includes:
Converting the content part in the word style template into a first extensible markup language file with a preset format standard through a word structuring engine;
extracting a format file in the word style template;
and supplementing the format file into the first extensible markup language file to generate a second extensible markup language file.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored in a computer-readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Referring to fig. 10, a storage medium is shown as an optical disc 20 having stored thereon a computer program (i.e. program product) which, when executed by a processor, performs the method provided by any of the embodiments described above.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
The storage medium provided by the above embodiments of the present application is based on the same inventive concept as the method provided by the embodiments of the present application, and has the same beneficial effects as the method adopted, run, or implemented by the application program stored on it.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The foregoing examples represent only a few embodiments of the invention. They are described in specific detail, but should not therefore be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and improvements without departing from the concept of the invention, all of which fall within the scope of protection of the invention. Accordingly, the scope of protection of the present invention is determined by the appended claims.

Claims (8)

1. A method for generating a word file based on an extensible markup language file, comprising:
Creating a word style template;
converting the word style template into an extensible markup language format file;
converting the extensible markup language format file into a FreeMarker file;
generating a FreeMarker template based on the FreeMarker file;
outputting the FreeMarker template as a word file;
the creating word style templates comprises the following steps:
setting a placeholder for marking the position of the corresponding data in the word template;
Establishing an association relation table matched with the placeholder; the association relation table is used for storing association relation between the placeholders and the corresponding data;
setting word templates according to the requirements of users, setting placeholders at corresponding positions in the word templates, and completing the establishment of personalized word templates;
Acquiring the personalized word template, parsing the content of the word template using the ability of POI to read and write word files, and identifying the paragraph part and the table part in the template;
processing the placeholders in each region differently according to the region.
2. The method for generating a word file based on an extensible markup language file of claim 1, wherein processing the placeholders in each region differently according to the region comprises:
If the paragraph part is being processed, identifying the placeholders of the paragraph part according to the placeholder tags, querying the association relation table of the universal placeholders, and retrieving the corresponding data from the database according to the association relation between each placeholder and its corresponding data, so as to directly replace the placeholder;
If the table part is being processed, identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells, to obtain the word style template.
3. The method for generating a word file based on an extensible markup language file of claim 2, wherein identifying the placeholders of the table part according to the placeholder tags and, row by row, forming different data displays according to the types of the placeholders distributed in the cells to obtain the word style template comprises:
if the cells all contain common placeholders, querying the association relation table of the common placeholders and retrieving a unique piece of data from the database, according to the association relation between each placeholder and its corresponding data, to directly replace the common placeholder;
if the cells all contain group placeholders, querying the association relation table of the universal placeholders and, according to the format set for the placeholder in each cell and the association relation between each placeholder and its corresponding group of data, sequentially outputting the matched group of data in a loop;
if the row contains both common placeholders and group placeholders, the group placeholders sequentially output the matched group of data in a loop according to the association relation between the placeholders and the corresponding data, and the common placeholders are output as many times as the group placeholders loop, so that the unique data corresponding to each common placeholder is repeated in every row.
4. The method for generating a word file based on an extensible markup language file of claim 1, wherein converting the word style template into an extensible markup language format file comprises:
Extracting a format object from the word style template; wherein, the objects with different formats have different object format information;
Dividing the word style template into at least one level of file blocks according to the format information of the format object;
And converting the divided file blocks of at least one level into an extensible markup language format file according to the label information corresponding to each file block and a preset extensible markup language format library.
5. The method for generating a word file based on an extensible markup language file of claim 1, wherein converting the word style template into an extensible markup language format file comprises:
converting the content part of the word style template into a first extensible markup language file with a preset format standard through a word structuring engine;
extracting a format file from the word style template;
and supplementing the format file into the first extensible markup language file to generate a second extensible markup language file.
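The two-stage conversion in claim 5 can be sketched as follows, assuming the first XML file marks elements with a `style` attribute and the extracted format file can be represented as a mapping from style id to formatting attributes. Both representations, and the function name `supplement_format`, are hypothetical; the claim does not specify them.

```python
import xml.etree.ElementTree as ET


def supplement_format(first_xml, formats):
    """Return a second XML string: first_xml plus formatting attributes.

    formats maps a style id to a dict of attribute names and values.
    """
    root = ET.fromstring(first_xml)
    for element in root.iter():
        style_id = element.get("style")
        # Elements without a known style id are left unchanged.
        for name, value in formats.get(style_id, {}).items():
            element.set(name, value)
    return ET.tostring(root, encoding="unicode")
```

Keeping content and formatting separate until this step means the content XML can be produced and checked on its own before any style information is merged in.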
6. An apparatus for generating a word file based on an extensible markup language file, comprising:
the creation module is used for creating a word style template;
the first conversion module is used for converting the word style template into an extensible markup language format file;
the second conversion module is used for converting the extensible markup language format file into a FreeMarker file;
the generating module is used for generating a FreeMarker template based on the FreeMarker file;
the output module is used for outputting the FreeMarker template as a word file;
the creation module is further used for:
setting a placeholder for marking the position of the corresponding data in the word template;
establishing an association relation table matched with the placeholder, wherein the association relation table is used for storing the association relation between the placeholders and the corresponding data;
setting up a word template according to the user's requirements and setting placeholders at the corresponding positions in the word template, thereby completing a personalized word template;
acquiring the personalized word template, parsing the content of the word template using POI's ability to read and write word files, and identifying the paragraph parts and table parts in the template;
and processing each placeholder differently according to the region in which it is located.
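The final output step, filling a template derived from the XML file with data, can be approximated with the sketch below. It deliberately uses Python's `string.Template` as a stand-in for FreeMarker, since both use `${name}` interpolation; it is not FreeMarker itself, and the function name `render_word_xml` and the sample element are assumptions.

```python
from string import Template
from xml.sax.saxutils import escape


def render_word_xml(template_xml, data):
    """Fill ${name} placeholders in an XML template string.

    Values are XML-escaped so that characters such as & and < cannot
    break the generated document. A missing key raises KeyError.
    """
    safe = {key: escape(str(value)) for key, value in data.items()}
    return Template(template_xml).substitute(safe)
```

For example, `render_word_xml("<w:t>${holder}</w:t>", {"holder": "A & B"})` returns `<w:t>A &amp; B</w:t>`, which stays well-formed when written out as a Word-readable XML file.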
7. A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions that, when executed by the processor, cause the processor to perform the steps of the method of generating word files based on extensible markup language files of any one of claims 1 to 5.
8. A storage medium storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method of generating word files based on extensible markup language files of any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110872843.0A CN113609820B (en) 2021-07-30 2021-07-30 Method, device and equipment for generating word file based on extensible markup language file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110872843.0A CN113609820B (en) 2021-07-30 2021-07-30 Method, device and equipment for generating word file based on extensible markup language file

Publications (2)

Publication Number Publication Date
CN113609820A CN113609820A (en) 2021-11-05
CN113609820B CN113609820B (en) 2024-04-30

Family

ID=78338763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110872843.0A Active CN113609820B (en) 2021-07-30 2021-07-30 Method, device and equipment for generating word file based on extensible markup language file

Country Status (1)

Country Link
CN (1) CN113609820B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113779953B (en) * 2021-11-10 2022-04-01 航天宏图信息技术股份有限公司 Automatic document generation method and system
CN115051904B (en) * 2022-03-23 2023-09-12 武汉烽火技术服务有限公司 Method and device for managing single disk state based on markup language
CN115293123A (en) * 2022-07-19 2022-11-04 盐城金堤科技有限公司 Document template generation method, report online generation method and device
CN116227455A (en) * 2023-04-25 2023-06-06 深圳代码兄弟技术有限公司 File generation method and system and electronic equipment
CN116226053B (en) * 2023-05-05 2024-03-22 中国民航信息网络股份有限公司 Text processing method, device and equipment
CN116562258B (en) * 2023-07-05 2023-10-10 商飞软件有限公司 Method for generating aircraft histories based on element templates

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005352880A (en) * 2004-06-11 2005-12-22 Antenna House Kk Xml document creation system
CN101976235A (en) * 2010-09-21 2011-02-16 天津神舟通用数据技术有限公司 Extensible Word report automatically-generating method based on dynamic web page
CN109933752A (en) * 2017-12-15 2019-06-25 北京京东尚科信息技术有限公司 A kind of method and apparatus exporting electronic document
CN111126010A (en) * 2019-12-20 2020-05-08 深圳前海环融联易信息科技服务有限公司 Freemarker template file repairing method and device, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
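Luo Rong; Huang Jun; Li Maofeng; Liu Zhiqin. A rapid generation method for complex documents based on Word templates. Computer Applications and Software. 2020, Vol. 37 (No. 10), 57-63. *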

Also Published As

Publication number Publication date
CN113609820A (en) 2021-11-05

Similar Documents

Publication Publication Date Title
CN113609820B (en) Method, device and equipment for generating word file based on extensible markup language file
US9418315B1 (en) Systems, methods, and computer readable media for extracting data from portable document format (PDF) files
US20010014900A1 (en) Method and system for separating content and layout of formatted objects
CN110879937A (en) Method and device for generating webpage from document, computer equipment and storage medium
EP2291010A1 (en) Structure processing method and apparatus for layout file
CN111797595A (en) Method and device for generating OFD format page based on XML template
CN114020256A (en) Front-end page generation method, device and equipment and readable storage medium
CN111752565A (en) Interface generation method and device, computer equipment and readable storage medium
CN109658485B (en) Webpage animation drawing method, device, computer equipment and storage medium
CN113283228A (en) Document generation method and device, electronic equipment and storage medium
CN116610304B (en) Page code generation method, device, equipment and storage medium
CN113723063A (en) Method for converting RTF (Rich Text Format) into HTML (HyperText Markup Language) and realizing effect on PDF (Portable Document Format) file
US8656371B2 (en) System and method of report representation
CN111241096A (en) Text extraction method, system, terminal and storage medium for EXCEL document
CN113139145B (en) Page generation method and device, electronic equipment and readable storage medium
CN113297425B (en) Document conversion method, device, server and storage medium
CN114037828A (en) Component identification method and device, electronic equipment and storage medium
CN114385167A (en) Front-end page generation method, device, equipment and medium
CN115048920A (en) Front-end data exporting method, device, equipment and storage medium
CN112965772A (en) Web page display method and device and electronic equipment
CN113971044A (en) Component document generation method, device, equipment and readable storage medium
CN114637505A (en) Page content extraction method and device
CN113343663A (en) Bill structuring method and device
CN112328246A (en) Page component generation method and device, computer equipment and storage medium
CN114489895B (en) Batch poster generation method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant