CN112016290A - Automatic document typesetting method, device, equipment and storage medium - Google Patents

Info

Publication number
CN112016290A
Authority
CN
China
Prior art keywords
typesetting
document
configuration file
file
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010909982.1A
Other languages
Chinese (zh)
Inventor
李威
张勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Writehelp Technology Co ltd
Original Assignee
Hunan Writehelp Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Writehelp Technology Co ltd filed Critical Hunan Writehelp Technology Co ltd
Priority to CN202010909982.1A priority Critical patent/CN112016290A/en
Publication of CN112016290A publication Critical patent/CN112016290A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/189 Automatic justification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method, a device, equipment and a storage medium for automatically typesetting a document. The method comprises the following steps: obtaining a source file to be processed; selecting a configuration file corresponding to the source file to be processed from a preset configuration file library, wherein the configuration file represents the configuration parameter information of the file; typesetting the source file to be processed according to the configuration file; and generating an automatically typeset file according to the typesetting result. By reading and typesetting the document automatically, the method reduces interaction between the user and the system and improves typesetting efficiency. It can also automatically configure the format of non-text elements, which widens the range of application of automatic typesetting and the scenarios in which it can be used.

Description

Automatic document typesetting method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of document making, in particular to a method, a device, equipment and a storage medium for automatically typesetting a document.
Background
The existing automatic thesis typesetting tool (the Wuhan University undergraduate thesis automatic typesetting tool) requires the user to manually enter each part of the graduation thesis in the tool interface, including the first-level, second-level and third-level titles, the Chinese and English abstracts, the body text of each chapter, the references, the acknowledgements and so on. The software outputs the content entered by the user to a Word document, automatically sets the format of each part according to the parameters, and automatically generates the page numbers and the table of contents.
Its disadvantages are: 1. The tool cannot directly read an existing Word document; the thesis content must be entered manually, part by part (first-level, second-level and third-level titles, Chinese and English abstracts, body text of each chapter, references, acknowledgements and so on), at the positions specified by the interface. 2. The tool accepts only the text content of the thesis and does not support pictures, tables or formulas, so the user has to add them afterwards to the Word document generated by the tool.
Therefore, a new technical solution is needed to solve the problems in the prior art.
Disclosure of Invention
In view of the foregoing problems in the prior art, an object of the present invention is to provide a method, an apparatus, a device and a storage medium for automatically typesetting a document, which can automatically typeset document contents.
In order to solve the technical problems, the specific technical scheme of the invention is as follows:
in one aspect, the invention provides a method for automatically typesetting a document, which comprises the following steps:
obtaining a source file to be processed;
selecting a configuration file corresponding to the source file to be processed from a preset configuration file library, wherein the configuration file represents the configuration parameter information of the file;
typesetting the source file to be processed according to the configuration file;
and generating an automatic typesetting file according to the typesetting processing result.
Further, before the obtaining of the source file to be processed, a preset configuration file library is established, wherein the preset configuration file library comprises a plurality of groups of configuration files,
the establishing of the preset configuration file library comprises the following steps:
acquiring corresponding document specification requirements from different specification units, wherein the document specification requirements comprise configuration parameter information of a document;
and establishing a corresponding configuration file aiming at each document specification requirement, so that the source file to be processed can be automatically typeset according to the configuration file.
Further, the typesetting processing of the source file to be processed according to the configuration file includes:
acquiring a structural element type of the source file to be processed, wherein the structural element type comprises one or more of a text, a picture, a table and a formula;
typesetting the source file to be processed according to the type of the structural element;
when the structural element type is a text, extracting and classifying the text information according to a preset regular expression to obtain text information of different text types;
and setting a specific format for the classified text information according to the parameters of the configuration file.
Further, before the acquiring of the structural element type of the source file to be processed, the method further includes normalizing the source file,
specifically, the source file normalization process includes:
acquiring page structure information of a document, and formatting the page structure information;
acquiring page number structure information of a document, and formatting the page number structure information;
acquiring directory structure information of a document, and formatting the directory structure information;
acquiring title and text structure information of a document, and formatting;
and acquiring section break and page break information of the document, and formatting it.
Further, the setting of the specific format of the classified text information according to the parameters of the configuration file includes:
acquiring text information, and matching a corresponding configuration file according to a text type corresponding to the text information, wherein the text type comprises a title and a body;
and configuring the text information according to the configuration file, wherein the configuration file comprises the font, the font size, the alignment format and the text sequence of the text.
Further, setting a specific format of the classified text information according to the parameters of the configuration file further includes:
generating directory information, the generated directory information including a format and a location of a directory.
Further, setting a specific format of the classified text information according to the parameters of the configuration file further includes:
adding page headers and generating page numbers, wherein the page headers and the page numbers comprise the content, format and position of the page headers and the page numbers.
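The header and page-number step can be illustrated with a minimal sketch. It assumes a document is modelled as a list of page bodies, and every name in it (`add_headers_and_page_numbers`, `header_config`, `number_config`, the `template` and `position` keys) is hypothetical and not taken from the patent.

```python
def add_headers_and_page_numbers(pages, header_config, number_config):
    """Attach a header and a page number to each page body.

    `pages` is a list of page bodies; both config dicts are hypothetical
    stand-ins for parameters read from the configuration file.
    """
    result = []
    for index, body in enumerate(pages):
        number = number_config["start"] + index
        result.append({
            # Content, format and position of the header come from the configuration.
            "header": {"text": header_config["text"], "align": header_config["align"]},
            "body": body,
            # Content, format and position of the page number come from the configuration.
            "page_number": {
                "text": number_config["template"].format(n=number),
                "position": number_config["position"],
                "align": number_config["align"],
            },
        })
    return result


pages = add_headers_and_page_numbers(
    ["first page text", "second page text"],
    {"text": "Graduation Thesis", "align": "center"},
    {"start": 1, "template": "- {n} -", "position": "footer", "align": "center"},
)
```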
On the other hand, the invention also provides a document automatic typesetting device, which comprises:
the source file acquisition module is used for acquiring a source file to be processed;
the configuration file acquisition module is used for selecting a configuration file corresponding to the source file to be processed from a preset configuration file library, wherein the configuration file represents the configuration parameter information of the file;
the typesetting processing module is used for typesetting the source file to be processed according to the configuration file;
and the result generation module is used for generating an automatic typesetting file according to the typesetting processing result.
In a third aspect, the present invention further provides an apparatus, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the method for automatically typesetting a document as described above.
In a fourth aspect, the present invention further provides a storage medium, where at least one instruction, at least one program, a code set, or a set of instructions is stored, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the method for automatically typesetting a document as described above.
By adopting the technical scheme, the method, the device, the equipment and the storage medium for automatically typesetting the document have the following beneficial effects that:
1. The method, device, equipment and storage medium for automatically typesetting a document reduce interaction between the user and the system and improve the efficiency of document typesetting.
2. They can automatically configure the format of non-text elements, which widens the range of application of automatic typesetting and the scenarios in which it can be used.
3. They reduce manual participation and operation, thereby lowering the error rate, improving the accuracy of document typesetting and reducing labor cost.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings used in the description of the embodiment or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram of an application environment of a document automatic typesetting method according to the invention;
FIG. 2 is a schematic flow chart of a document automatic typesetting method in this specification;
FIG. 3 is a flowchart illustrating an automatic document layout method according to another embodiment of the present disclosure;
FIG. 4 is a detailed step diagram of step S105 in the embodiment of the present specification;
FIG. 5 is a detailed step diagram of step S509 in the embodiment of the present specification;
FIG. 6 is a schematic diagram of an automatic document layout apparatus according to the present disclosure;
FIG. 7 is a schematic diagram of the structure of an apparatus according to the present invention;
FIG. 8 is a schematic diagram of a storage medium according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or device.
Example 1
Referring to fig. 1, fig. 1 is a schematic diagram of an application environment according to an embodiment of the present invention, which may include a client 01 and a server 02, where the client and the server may be directly or indirectly connected through wired or wireless communication. The user accesses the service through the client. When the service is updated, the client can report the corresponding data to the server. It should be noted that fig. 1 is only an example. Specifically, the client 01 may include a smart phone, a desktop computer, a tablet computer, a notebook computer, an Augmented Reality (AR)/Virtual Reality (VR) device, a digital assistant, a smart speaker, a smart wearable device, and other types of physical devices, and may also include software running in the physical devices, such as a computer program. The operating system running on the client 01 may include, but is not limited to, Android, iOS (a mobile operating system developed by Apple Inc.), Linux, Microsoft Windows, and the like. Specifically, the server 02 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like. The server 02 may comprise a network communication unit, a processor and a memory, etc. The server 02 may provide background services for the clients.
In a specific embodiment, when the client corresponds to an entity device, a computer program provided by a service provider and pointing to a certain product is run in the entity device. When the client corresponds to a computer program running in the physical device, the computer program is provided by the service provider and directed to a product.
The computer program comprises a normal service program corresponding to the service main logic and a program corresponding to the configuration file parameter. The server receiving the configuration file parameters may be a specific data server. In practical application, the client may also point to the program corresponding to the configuration file parameters, and a document layout system may be constructed based on the program corresponding to the configuration file parameters.
In order to better implement the automatic document layout process, a specific embodiment of the automatic document layout method according to the present invention is described below, and fig. 2 is a flowchart of an automatic document layout method according to an embodiment of the present invention, where the method operation steps described in the embodiment or the flowchart are provided in this specification, but more or fewer operation steps may be included based on conventional or non-creative labor. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. In practice, the system or server product may be implemented in a sequential or parallel manner (e.g., parallel processor or multi-threaded environment) according to the embodiments or methods shown in the figures. Specifically, as shown in fig. 2, the method may include:
S101: obtaining a source file to be processed;
S103: selecting a configuration file corresponding to the source file to be processed from a preset configuration file library, wherein the configuration file represents the configuration parameter information of the file;
S105: typesetting the source file to be processed according to the configuration file;
S107: generating an automatic typesetting file according to the typesetting processing result.
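The four steps above can be sketched as a small pipeline. This is only an illustration under simplifying assumptions: the source file is modelled as a list of paragraph strings, the configuration file library as an in-memory dictionary, and all names (`LIBRARY`, `obtain_source`, `select_profile`, `typeset`, `generate`, `auto_typeset`) are hypothetical.

```python
# Hypothetical preset configuration file library, keyed by institution name.
LIBRARY = {
    "example-university": {"font": "SimSun", "size_pt": 12.0},
    "example-journal": {"font": "Times New Roman", "size_pt": 10.5},
}


def obtain_source(paragraphs):
    """S101: obtain the source file (here, a list of paragraph strings)."""
    return list(paragraphs)


def select_profile(key):
    """S103: select the matching configuration file from the preset library."""
    return LIBRARY[key]


def typeset(paragraphs, profile):
    """S105: attach the configured formatting parameters to every paragraph."""
    return [{"text": p, **profile} for p in paragraphs]


def generate(result):
    """S107: produce the output; a real system would write a Word file instead."""
    return {"paragraphs": result, "count": len(result)}


def auto_typeset(paragraphs, key):
    """Run S101 to S107 in order."""
    source = obtain_source(paragraphs)
    profile = select_profile(key)
    return generate(typeset(source, profile))


output = auto_typeset(["Abstract", "Body text"], "example-journal")
```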
In actual work, the source file to be processed may be the first draft of a paper, for example a graduation thesis or a paper submitted for publication. Each college or submission venue (magazine, journal and the like) has its own requirements for the paper format, and a first draft usually does not follow them. Bringing the paper into the required format is generally done by hand, which wastes manpower and makes efficiency and accuracy hard to guarantee.
Specifically, the configuration file may capture the typesetting requirements of a college or submission venue, such as the font, font size, line spacing and alignment of the text. The configuration files for the typesetting requirements of the colleges or submission venues are collected into a preset configuration file library, so that when a source file to be processed is obtained, the configuration file matching its requirements can be applied and the corresponding compliant file obtained.
Therefore, in some other embodiments, as shown in fig. 3, before obtaining the source file to be processed, the method may further include:
S100: establishing a preset configuration file library, wherein the preset configuration file library comprises a plurality of groups of configuration files;
wherein the establishing of the preset configuration file library comprises:
acquiring corresponding document specification requirements from different specification units, wherein the document specification requirements comprise configuration parameter information of a document;
and establishing a corresponding configuration file aiming at each document specification requirement, so that the source file to be processed can be automatically typeset according to the configuration file.
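Building the library can be sketched as follows, assuming each document specification requirement has already been transcribed into a dictionary. The field names `unit` and `params` and the sample values are hypothetical.

```python
def build_library(requirements):
    """Create one configuration file (a dict) per document specification requirement."""
    library = {}
    for requirement in requirements:
        # 'unit' names the specification unit; 'params' holds its layout parameters.
        library[requirement["unit"]] = dict(requirement["params"])
    return library


# Hypothetical requirements collected from two specification units.
requirements = [
    {"unit": "college-a",
     "params": {"font": "SimSun", "size_pt": 12, "line_spacing": 1.5, "align": "justify"}},
    {"unit": "journal-b",
     "params": {"font": "Times New Roman", "size_pt": 10.5, "line_spacing": 1.0, "align": "left"}},
]
library = build_library(requirements)
```

Copying each `params` dictionary keeps later edits to a requirement from silently changing a configuration file that is already in the library.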
Illustratively, the writing specifications for graduation theses and submitted papers are first downloaded from the website of each college or periodical; specifications for undergraduate theses are typically downloaded from the academic affairs office website, and those for postgraduate degree theses from the graduate school website. A routine for reading the configuration file is written in a high-level programming language, so that the system can automatically read the configuration file and construct the corresponding thesis structure template. The language may be Java, Python, C or the like. The configuration file may be a script written in that language and contains the corresponding configuration parameters, namely the formats of the text, tables, pictures and formulas that the document must follow.
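A minimal sketch of reading such a configuration file is given below. It assumes the file is stored as JSON with one section per element type; the patent does not specify a file format, and the section names, keys and the helper `load_config` are hypothetical.

```python
import json

# Element types for which the configuration is assumed to carry parameters.
REQUIRED_SECTIONS = ("text", "table", "picture", "formula")


def load_config(raw):
    """Parse a configuration file and check that every element type has a section."""
    config = json.loads(raw)
    missing = [section for section in REQUIRED_SECTIONS if section not in config]
    if missing:
        raise ValueError(f"configuration file lacks sections: {missing}")
    return config


raw = """
{
  "text":    {"font": "SimSun", "size_pt": 12},
  "table":   {"align": "center"},
  "picture": {"max_width_cm": 14},
  "formula": {"align": "center"}
}
"""
config = load_config(raw)
```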
Because the structural elements of a thesis are complex (a college graduation thesis in particular contains not only ordinary text but also pictures, tables, formulas and the like, which cannot be stored in a txt file), the source file of a graduation thesis is generally stored as a Word document. To reduce human interaction during automatic typesetting, the system must be able to read a Word document directly. To read all structural elements in a Word document, the embodiment of the present specification uses the component Spire.Doc. As a completely independent component, Spire.Doc can perform a variety of Word document processing tasks without Microsoft Office installed in its operating environment, including generating, reading, converting and printing Word documents, inserting pictures, adding headers and footers, creating tables, adding form fields and mail merge fields, adding bookmarks, adding text and picture watermarks, setting background colors and background pictures, adding footnotes and endnotes, adding hyperlinks, encrypting and decrypting Word documents, adding annotations, adding shapes, and the like.
Thus, in the present illustrative embodiment, as shown in fig. 4, the step S105 includes:
S503: acquiring a structural element type of the source file to be processed, wherein the structural element type comprises one or more of a text, a picture, a table and a formula;
S505: typesetting the source file to be processed according to the type of the structural element;
S507: when the structural element type is a text, extracting and classifying the text information according to a preset regular expression to obtain text information of different text types;
S509: and setting a specific format for the classified text information according to the parameters of the configuration file.
Because different structural elements have different specification requirements, and the requirements also differ between parts of the text, the types of structural elements in the source file to be processed can be determined through preset judgment rules. For example, extraction rules can be set through regular expressions: for a text element, the pure character information is extracted; for a picture element, information such as the picture boundary and picture pixels serves as the extraction condition; for a table element, the attributes of the table serve as the extraction criterion; and for a formula element, information such as the formula symbols serves as the extraction criterion. The specific judgment conditions are set as the case requires. In this embodiment of the present specification, a configuration interface may be provided by the Spire.Doc component.
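The judgment rules can be illustrated with a short sketch that sorts elements by type. It assumes each element is a dictionary carrying a marker field for its kind (`rows` for a table, `pixels` for a picture, `latex` for a formula); these field names and both functions are hypothetical, and a real system would take this information from the Word-processing component.

```python
def element_type(element):
    """Judge the structural element type from hypothetical marker fields."""
    if "rows" in element:       # table attributes present
        return "table"
    if "pixels" in element:     # picture boundary / pixel information present
        return "picture"
    if "latex" in element:      # formula symbols present
        return "formula"
    return "text"               # otherwise treat as pure character information


def split_by_type(elements):
    """Group the elements of a document by structural element type."""
    groups = {"text": [], "picture": [], "table": [], "formula": []}
    for element in elements:
        groups[element_type(element)].append(element)
    return groups


document = [
    {"content": "1 Introduction"},
    {"rows": [["a", "b"], ["c", "d"]]},
    {"pixels": (640, 480)},
    {"latex": "E = mc^2"},
    {"content": "Typesetting by hand is slow."},
]
groups = split_by_type(document)
```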
The processing rules and complexity differ between elements. Pictures, tables and formulas are relatively simple: for pictures, the size, position and resolution; for tables, the size and position and the attributes of the characters in the table; for formulas, the size and position, and so on. Typesetting text information is relatively complex, because the text part includes different paragraphs and contents. For example, when the source file to be processed is a graduation thesis, the text part may include a Chinese abstract title, Chinese abstract text, Chinese keywords, an English abstract title, English abstract text, English keywords, the thesis body, the reference text, an acknowledgement title, the acknowledgement text, and the like. Each has its own naming and format requirements.
Therefore, regular expressions can be used to extract and classify the text information. Because authors do not always name these parts strictly as required, several similar extraction conditions can be set for each regular-expression extraction; for example, when extracting the abstract, "Chinese abstract title", "Chinese title" and "abstract title" can all be accepted.
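A minimal sketch of this classification is shown below, with several alternative spellings accepted per text type. The patterns, the type names and the function `classify` are illustrative assumptions; the patent does not disclose its actual regular expressions.

```python
import re

# Several similar patterns per text type, since authors name sections loosely.
PATTERNS = {
    "cn_abstract_title": re.compile(r"^\s*(中文摘要|摘\s*要)\s*$"),
    "en_abstract_title": re.compile(r"^\s*abstract\s*$", re.IGNORECASE),
    "cn_keywords": re.compile(r"^\s*关键[词字]\s*[:：]"),
    "en_keywords": re.compile(r"^\s*key\s*words?\s*[:：]", re.IGNORECASE),
    "references_title": re.compile(r"^\s*(参考文献|references)\s*$", re.IGNORECASE),
    "acknowledgement_title": re.compile(r"^\s*(致\s*谢|acknowledge?ments?)\s*$", re.IGNORECASE),
}


def classify(paragraph):
    """Return the first text type whose pattern matches, or 'body' if none does."""
    for text_type, pattern in PATTERNS.items():
        if pattern.search(paragraph):
            return text_type
    return "body"
```

Anchoring the title patterns to the whole line keeps an ordinary sentence that merely mentions the word "abstract" from being classified as a title.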
The format of the text information of different text types obtained by extracting and classifying through the regular expression can be adjusted through a preset configuration file.
Word documents submitted by users differ in structure: a document may or may not include a table of contents, a cover or headers, and basic parts of the graduation thesis may be missing. So that the program can process all documents uniformly, the source file needs to be normalized after the Word document is loaded; specifically, the source file can be normalized by configuring parameters of the Spire.Doc component.
Therefore, in the embodiment of the present specification, step S503 is preceded by:
S501: normalizing the source file.
Wherein the source file normalization process comprises:
acquiring page structure information of a document, and formatting the page structure information;
acquiring page number structure information of a document, and formatting the page number structure information;
acquiring directory structure information of a document, and formatting the directory structure information;
acquiring title and text structure information of a document, and formatting;
and acquiring section break and page break information of the document, and formatting it.
It should be noted that the purpose of normalizing the source file is to obtain thesis structure and content that are as consistent as possible. Besides the steps above, normalization can also be applied to pictures, tables and formulas, so that when the normalized document is then formatted, processing efficiency and speed are improved and the error rate is reduced.
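The normalization sub-steps can be sketched as a chain of small functions over a dictionary model of the document. The sketch assumes that "formatting" a piece of structure means resetting it to a uniform state so that it can be regenerated later from the configuration file; that reading, the dictionary keys and all function names are assumptions.

```python
def normalize_page(doc):
    # Page structure: fall back to a uniform page setup when none is recorded.
    doc.setdefault("page", {"size": "A4", "margins_cm": 2.5})
    return doc


def normalize_page_numbers(doc):
    doc["page_numbers"] = None  # cleared; regenerated later from the configuration
    return doc


def normalize_toc(doc):
    doc["toc"] = None  # any hand-made table of contents is dropped and rebuilt later
    return doc


def normalize_paragraphs(doc):
    # Titles and body text: strip stray whitespace and drop empty paragraphs.
    doc["paragraphs"] = [p.strip() for p in doc.get("paragraphs", []) if p.strip()]
    return doc


def normalize_breaks(doc):
    doc["breaks"] = []  # section and page breaks removed, to be reinserted uniformly
    return doc


STEPS = (normalize_page, normalize_page_numbers, normalize_toc,
         normalize_paragraphs, normalize_breaks)


def normalize(doc):
    """Apply every normalization step in order and return the document."""
    for step in STEPS:
        doc = step(doc)
    return doc


normalized = normalize({"paragraphs": ["  Title ", "", "Body"],
                        "toc": ["old entry"], "breaks": ["page"]})
```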
In addition, the embodiment of the present specification may further include a manual document adjustment step. Specifically, after step S501, the normalized document may still contain parts that were not normalized; for example, the information of a certain page may not be identified accurately, so that it cannot be formatted, in which case it can be adjusted manually. Alternatively, a useful part of the document may have been formatted by mistake because of a parameter setting problem, in which case the corresponding part needs to be supplemented manually. This manual review step avoids accidents in the typesetting process and improves the accuracy of document processing.
In this embodiment, as shown in fig. 5, step S509 may further include:
S5091: acquiring text information, and matching a corresponding configuration file according to a text type corresponding to the text information, wherein the text type comprises a title and a body;
S5093: and configuring the text information according to the configuration file, wherein the configuration file comprises the font, the font size, the alignment format and the text sequence of the text.
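These two steps can be sketched as a lookup from text type to format. The sketch assumes the text has already been classified into `(text_type, text)` pairs; `CONFIG`, its values and `apply_format` are hypothetical.

```python
# Hypothetical configuration: one format per text type.
CONFIG = {
    "title": {"font": "SimHei", "size_pt": 16, "align": "center"},
    "body": {"font": "SimSun", "size_pt": 12, "align": "justify"},
}


def apply_format(items, config):
    """Attach to each (text_type, text) pair the format configured for its type."""
    formatted = []
    for order, (text_type, text) in enumerate(items, start=1):
        # 'order' records the text sequence; the remaining keys come from the configuration.
        formatted.append({"order": order, "type": text_type, "text": text,
                          **config[text_type]})
    return formatted


formatted = apply_format(
    [("title", "1 Introduction"), ("body", "Typesetting by hand is slow.")],
    CONFIG,
)
```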
When the text part extracted from the normalized document through regular expressions is configured, both body text and titles need to be processed. The processing of body text is relatively simple and uniform. Title requirements vary more: they are relatively simple for some types of paper and relatively complex for others, and different colleges may impose different requirements. For example, the titles of a graduation thesis generally comprise first-level, second-level and third-level titles, and the notation and format differ between levels. Therefore, a title is not only identified through a regular expression; its level is also judged from its relationship to the surrounding context. Further, when the title numbering of the paper is wrong, it can be handled through the Spire.Doc component so that the format is aligned.
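One way to judge title levels and repair wrong numbering is sketched below. It assumes titles are numbered in the decimal style "1", "1.1", "1.1.1"; the patent does not state the numbering scheme, and the pattern `NUMBERED` and both functions are hypothetical.

```python
import re

# A number with up to three dot-separated parts, then whitespace, then the title text.
NUMBERED = re.compile(r"^\s*(\d+(?:\.\d+){0,2})\s+(\S.*)$")


def title_level(line):
    """Return 1, 2 or 3 for titles numbered like '1', '1.1', '1.1.1'; otherwise None."""
    match = NUMBERED.match(line)
    return match.group(1).count(".") + 1 if match else None


def renumber(titles):
    """Rewrite title numbers so they run sequentially within each level."""
    counters = [0, 0, 0]
    fixed = []
    for line in titles:
        match = NUMBERED.match(line)
        if match is None:
            fixed.append(line)  # unnumbered lines pass through unchanged
            continue
        level = match.group(1).count(".") + 1
        counters[level - 1] += 1
        for deeper in range(level, 3):
            counters[deeper] = 0  # a new title resets the counters of deeper levels
        number = ".".join(str(c) for c in counters[:level])
        fixed.append(f"{number} {match.group(2)}")
    return fixed


fixed = renumber(["1 Intro", "1.1 Background", "1.3 Scope", "3 Method", "3.1 Data"])
```

The level of each title comes from its own number, while the corrected number comes from the titles before it, which is one simple reading of "judging by context". A body line that happens to start with a number would be mistaken for a title, so a real system would combine this with other cues.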
After the specification setting is performed on the content part of the paper, the method may further include:
S5095: generating directory (table of contents) information, wherein the generated directory information comprises the format and the position of the directory;
S5097: reading parameters from the configuration file, and generating a directory based on the titles of each level by using an interface provided by the Spire.Doc component.
To make the typeset output document more complete, parameters can also be configured according to the cover requirements of the college or submission venue, and the cover document of the graduation thesis and the structured document of the graduation thesis are merged through the interface provided by the Spire.Doc component.
With the automatic document typesetting method, automatic reading and typesetting of the document reduces interaction between the user and the system and improves typesetting efficiency. Non-text elements can also be formatted automatically, which widens the range of application of automatic typesetting and the scenarios in which it can be used. Manual participation and operation are further reduced, which lowers the error rate, improves typesetting accuracy and reduces labor cost.
On the basis of the above-mentioned document automatic typesetting method, an embodiment of the present specification further provides a document automatic typesetting apparatus, as shown in fig. 6, the apparatus includes:
the source file acquisition module is used for acquiring a source file to be processed;
the configuration file acquisition module is used for selecting a configuration file corresponding to the source file to be processed from a preset configuration file library, wherein the configuration file represents the configuration parameter information of the file;
the typesetting processing module is used for typesetting the source file to be processed according to the configuration file;
and the result generation module is used for generating an automatic typesetting file according to the typesetting processing result.
It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
In a specific embodiment, fig. 7 shows a schematic structural diagram of an electronic device provided in an embodiment of the present invention. The electronic device 800 may include components such as a memory 810 comprising one or more computer-readable storage media, a processor 820 comprising one or more processing cores, an input unit 830, a display unit 840, a Radio Frequency (RF) circuit 850, a wireless fidelity (WiFi) module 860, and a power supply 870. Those skilled in the art will appreciate that the configuration shown in fig. 7 does not limit the electronic device 800, which may include more or fewer components than shown, combine some components, or arrange the components differently. Wherein:
The memory 810 may be used to store software programs and modules, and the processor 820 executes various functional applications and data processing by running or executing the software programs and modules stored in the memory 810 and calling data stored in the memory 810. The memory 810 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like; the data storage area may store data created through use of the electronic device, and the like. Further, the memory 810 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device. Accordingly, the memory 810 may also include a memory controller to provide the processor 820 with access to the memory 810.
The processor 820 is the control center of the electronic device 800. It connects the various parts of the whole electronic device through various interfaces and lines, and performs the functions of the electronic device 800 and processes data by running or executing the software programs and/or modules stored in the memory 810 and calling data stored in the memory 810, thereby monitoring the electronic device 800 as a whole. The processor 820 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor or any conventional processor.
The input unit 830 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. Specifically, the input unit 830 may include an image input device 831 and other input devices 832. The image input device 831 may be a camera or a photoelectric scanning device. The other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
The display unit 840 may be used to display information input by the user or provided to the user, as well as the various graphical user interfaces of the electronic device, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 840 may include a display panel 841, which may optionally be configured as a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED) display, or the like.
The RF circuit 850 may be used for receiving and transmitting signals during a message transmission or communication process, and in particular, for receiving downlink messages from a base station and then processing the received downlink messages by the one or more processors 820; in addition, data relating to uplink is transmitted to the base station. In general, the RF circuitry 850 includes, but is not limited to, an antenna, at least one Amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 850 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
WiFi belongs to short-range wireless transmission technology, and the electronic device 800 can help the user send and receive e-mails, browse web pages, access streaming media, etc. through the WiFi module 860, and it provides the user with wireless broadband internet access. Although fig. 7 shows WiFi module 860, it is understood that it does not belong to the essential components of electronic device 800, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The electronic device 800 also includes a power supply 870 (e.g., a battery) for powering the various components. The power supply 870 may be logically coupled to the processor 820 through a power management system, so that charging, discharging, and power consumption are managed through the power management system. The power supply 870 may also include any of one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
It should be noted that, although not shown, the electronic device 800 may further include a bluetooth module, and the like, which is not described herein again.
An embodiment of the present invention further provides a storage medium, as shown in fig. 8, where at least one instruction, at least one program, a code set, or an instruction set is stored in the storage medium, and the at least one instruction, the at least one program, the code set, or the instruction set is executable by a processor of an electronic device to perform any one of the automatic document typesetting methods described above.
Optionally, in an embodiment of the present invention, the storage medium may include, but is not limited to, various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, it will be appreciated by those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the invention can be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution. The specification is written this way for clarity only. Those skilled in the art should treat the specification as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

Claims (10)

1. An automatic document typesetting method is characterized by comprising the following steps:
obtaining a source file to be processed;
selecting a configuration file corresponding to the source file to be processed from a preset configuration file library, wherein the configuration file represents the configuration parameter information of the file;
typesetting the source file to be processed according to the configuration file;
and generating an automatic typesetting file according to the typesetting processing result.
2. The automatic typesetting method for documents according to claim 1, further comprising establishing a preset configuration file library before the obtaining of the source files to be processed, wherein the preset configuration file library comprises a plurality of groups of configuration files,
the establishing of the preset configuration file library comprises the following steps:
acquiring corresponding document specification requirements from different specification units, wherein the document specification requirements comprise configuration parameter information of a document;
and establishing a corresponding configuration file aiming at each document specification requirement, so that the source file to be processed can be automatically typeset according to the configuration file.
3. The method for automatically typesetting the document according to claim 1, wherein the typesetting the source file to be processed according to the configuration file comprises:
acquiring a structural element type of the source file to be processed, wherein the structural element type comprises one or more of a text, a picture, a table and a formula;
typesetting the source file to be processed according to the type of the structural element;
when the structural element type is a text, extracting and classifying the text information according to a preset regular expression to obtain text information of different text types;
and setting a specific format for the classified text information according to the parameters of the configuration file.
4. The method for automatically typesetting the document according to claim 3, wherein before the obtaining of the structural element type of the source file to be processed, the method further comprises: standardizing the source file,
the source file standardization process comprises:
acquiring page structure information of a document, and formatting the page structure information;
acquiring page number structure information of a document, and formatting the page number structure information;
acquiring directory structure information of a document, and formatting the directory structure information;
acquiring title and text structure information of a document, and formatting the title and text structure information;
and acquiring section break and page break information of the document, and formatting the section break and page break information.
5. The method of claim 3, wherein the setting of the specific format of the classified text information according to the parameters of the configuration file comprises:
acquiring text information, and matching a corresponding configuration file according to a text type corresponding to the text information, wherein the text type comprises a title and a body;
and configuring the text information according to the configuration file, wherein the configuration file comprises the font, the font size, the alignment format and the text sequence of the text.
6. The method of automatic typesetting for documents according to claim 5, wherein the setting of the specific format for the classified text information according to the parameters of the configuration file further comprises:
generating directory information, the generated directory information including a format and a location of a directory.
7. The method of automatic typesetting for documents according to claim 5, wherein the setting of the specific format for the classified text information according to the parameters of the configuration file further comprises:
adding page headers and generating page numbers, wherein the page headers and the page numbers comprise the content, format and position of the page headers and the page numbers.
8. An automatic document typesetting apparatus, the apparatus comprising:
the source file acquisition module is used for acquiring a source file to be processed;
the configuration file acquisition module is used for selecting a configuration file corresponding to the source file to be processed from a preset configuration file library, wherein the configuration file represents the configuration parameter information of the file;
the typesetting processing module is used for typesetting the source file to be processed according to the configuration file;
and the result generation module is used for generating an automatic typesetting file according to the typesetting processing result.
9. An apparatus comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the automatic document typesetting method according to any one of claims 1 to 7.
10. A storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the automatic document typesetting method according to any one of claims 1 to 7.
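
For illustration only, the text handling recited in claims 3, 5 and 6 (classifying text with preset regular expressions, setting a format per text type from configuration parameters, and generating directory information) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the patterns in `TITLE_PATTERNS`, the values in `FORMAT_CONFIG`, and the function names `classify`, `apply_format` and `build_directory` are all hypothetical, and paragraphs are modelled as plain strings.

```python
import re

# Hypothetical preset regular expressions for recognising title levels.
TITLE_PATTERNS = [
    ("title_1", re.compile(r"^\d+\s+\S")),       # e.g. "1 Introduction"
    ("title_2", re.compile(r"^\d+\.\d+\s+\S")),  # e.g. "1.1 Background"
]

# Hypothetical configuration parameters: font, font size and alignment per text type.
FORMAT_CONFIG = {
    "title_1": {"font": "SimHei", "size": 16, "align": "center"},
    "title_2": {"font": "SimHei", "size": 14, "align": "left"},
    "body": {"font": "SimSun", "size": 12, "align": "justify"},
}


def classify(paragraph):
    """Returns the text type of a paragraph; anything that matches no title pattern is body text."""
    for text_type, pattern in TITLE_PATTERNS:
        if pattern.match(paragraph):
            return text_type
    return "body"


def apply_format(paragraphs, config=FORMAT_CONFIG):
    """Attaches the configured format to each paragraph according to its text type."""
    formatted = []
    for paragraph in paragraphs:
        text_type = classify(paragraph)
        formatted.append({"text": paragraph, "type": text_type, **config[text_type]})
    return formatted


def build_directory(formatted):
    """Collects the title paragraphs, in document order, as directory entries."""
    return [item["text"] for item in formatted if item["type"].startswith("title")]
```

With the paragraphs "1 Introduction", "1.1 Background" and "Typesetting is tedious.", the sketch assigns the types `title_1`, `title_2` and `body`, and the directory contains only the two titles. The simple patterns would also treat a body sentence that begins with a number followed by a space as a title, so a practical pattern set would need to be stricter.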

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010909982.1A CN112016290A (en) 2020-09-02 2020-09-02 Automatic document typesetting method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010909982.1A CN112016290A (en) 2020-09-02 2020-09-02 Automatic document typesetting method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112016290A true CN112016290A (en) 2020-12-01

Family

ID=73516732

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010909982.1A Pending CN112016290A (en) 2020-09-02 2020-09-02 Automatic document typesetting method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112016290A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779118A (en) * 2012-07-17 2012-11-14 周广超 Paper typesetting method and system
CN104239284A (en) * 2014-09-15 2014-12-24 广州市西美信息科技有限公司 Method and device for automatic image-text composition
CN110390091A (en) * 2018-04-18 2019-10-29 成都野望数码科技有限公司 A kind of typesetting document structure tree method, device and equipment
CN111597771A (en) * 2019-02-21 2020-08-28 珠海金山办公软件有限公司 Method, device, electronic equipment and medium for adjusting document content format

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507666A (en) * 2020-12-21 2021-03-16 北京百度网讯科技有限公司 Document conversion method and device, electronic equipment and storage medium
CN112507666B (en) * 2020-12-21 2023-07-11 北京百度网讯科技有限公司 Document conversion method, device, electronic equipment and storage medium
CN112966484A (en) * 2021-03-01 2021-06-15 维沃移动通信有限公司 Chart typesetting method and device, electronic equipment and readable storage medium
CN112966484B (en) * 2021-03-01 2024-06-07 维沃移动通信有限公司 Chart typesetting method, device, electronic equipment and readable storage medium
CN113312317A (en) * 2021-05-18 2021-08-27 珠海金山办公软件有限公司 File processing method and device, electronic equipment and storage medium
CN113312317B (en) * 2021-05-18 2023-12-26 珠海金山办公软件有限公司 File processing method and device, electronic equipment and storage medium
CN113297832A (en) * 2021-05-25 2021-08-24 北京北大方正电子有限公司 Method, device and equipment for optimizing folding position and storage medium
CN115062584A (en) * 2022-06-28 2022-09-16 杭州数梦工场科技有限公司 Document style generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination