CN115630618A - Intelligent scene editing method, system, equipment and medium for PDF document - Google Patents

Info

Publication number
CN115630618A
Authority
CN
China
Prior art keywords
editing
information
target file
file
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211352604.3A
Other languages
Chinese (zh)
Inventor
王培兵
杨朗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Wangbang Technology Co ltd
Original Assignee
Sichuan Wangbang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Wangbang Technology Co ltd filed Critical Sichuan Wangbang Technology Co ltd
Priority to CN202211352604.3A priority Critical patent/CN115630618A/en
Publication of CN115630618A publication Critical patent/CN115630618A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/149 Adaptation of the text data for streaming purposes, e.g. Efficient XML Interchange [EXI] format
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an intelligent scene editing method, system, equipment and medium for a PDF document, and relates to the technical field of document editing. The method obtains editing scene selection information in response to a user's editing scene selection requirement, and then selects a processing scheme for a target file based on the source information of the target file and the editing scene selection information, so that the target file can subsequently be processed accurately according to that scheme. In other words, the method optimizes the flow for converting a PDF document into an editable form: different decoding and encoding algorithms can be used for PDF documents from different sources, so that the resource objects in the PDF document are accurately extracted and new data stream information is formed. This avoids the misreading, missed reading and text recombination errors caused by the fixed, single algorithm of traditional editing methods.

Description

Intelligent scene editing method, system, equipment and medium for PDF document
Technical Field
The invention relates to the technical field of document editing, in particular to an intelligent scene editing method, system, equipment and medium for a PDF document.
Background
PDF (Portable Document Format) is a file format developed by Adobe Systems for exchanging files independently of application program, operating system, and hardware. Based on the PostScript imaging model, a PDF file can faithfully reproduce every character, color, and image of the original on any printer. The format can encapsulate text, fonts, formatting, colors, and device- and resolution-independent graphics and images in a single file. It can also contain electronic information such as hypertext links, sound, and moving images, supports very long files, and offers a high degree of integration, security, and reliability. PDF supports cross-platform publishing of integrated multimedia information and, in particular, publishing over networks. PDF is therefore one of the most widely used document formats in daily office work, information exchange, business activities, and similar fields.
PDF was invented to be portable, high-fidelity, and cross-platform. Convenient editing was never a goal of PDF, and being hard to edit has even become one of its "advantages". However, because PDF is one of the most widely used document formats, editing and modification are essential needs in creating, distributing, storing, and sharing PDFs, and demand for them has always been strong. At the same time, the "inconvenient to edit" nature of PDF makes PDF editing functions technically difficult to develop. To convert a PDF into an editable form, the prior art uses a fixed conversion algorithm to turn the PDF into an editable text stream. In practice, however, PDF is usually not the original format of a document but an intermediary for distributing it, so the sources of PDFs are diverse, and converting them into an editable form is highly unpredictable. The PDF conversion methods of the prior art are therefore prone to misreading, missed reading, and conversion failure.
Disclosure of Invention
The invention aims to provide an intelligent scene editing method, system, equipment and medium for a PDF document. By optimizing the flow for converting a PDF document into an editable form, different decoding and encoding algorithms can be used for PDF documents from different sources, so that the resource objects in the PDF document are accurately extracted and new data stream information is formed, avoiding the misreading, missed reading and text recombination errors caused by the fixed, single algorithm of traditional editing methods.
The embodiment of the invention is realized by the following steps:
in a first aspect, an embodiment of the present application provides an intelligent scenario editing method for a PDF document, including the following steps:
step S101: acquiring a target file and acquiring source information of the target file;
step S102: responding to the editing scene selection requirement of a user to obtain corresponding editing scene selection information;
step S103: selecting a processing scheme for the target file based on the source information of the target file and the editing scene selection information to obtain a corresponding processing scheme;
step S104: reading structure data and an encoding algorithm aiming at the target file according to the processing scheme;
step S105: reading data information of a target file based on the structural data and the coding algorithm;
step S106: coding is carried out on the basis of the data information to form data stream information of a target format, and a corresponding new format file of the target format is obtained on the basis of the data stream information;
step S107: judging the structural difference information of the target file and the new format file, if the structural difference information is smaller than a preset threshold value, entering the next step, and otherwise, returning to the step S103 to reselect other processing schemes;
step S108: and sending the data stream information or the new format file to a corresponding editing environment based on the editing scene selection information to respond to the document editing requirement of the user.
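The retry logic of steps S103 to S108 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: `Scheme`, `run_pipeline`, the `convert` callable, and the `structural_difference` callable are hypothetical names introduced here, and the candidate schemes are simply tried in order.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass
class Scheme:
    """Hypothetical processing scheme: a name plus a conversion function."""
    name: str
    convert: Callable[[bytes], bytes]  # stands in for steps S104 to S106


def run_pipeline(
    target: bytes,
    schemes: Iterable[Scheme],
    structural_difference: Callable[[bytes, bytes], float],
    threshold: float,
) -> Optional[Tuple[str, bytes]]:
    """Try candidate schemes (S103) until one passes the S107 check."""
    for scheme in schemes:
        new_file = scheme.convert(target)               # S104 to S106
        diff = structural_difference(target, new_file)  # S107
        if diff < threshold:
            return scheme.name, new_file                # ready for S108
    return None  # no candidate scheme met the threshold
```

Returning `None` when every candidate fails is a design choice of this sketch; the text does not say what happens when no scheme satisfies the threshold.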
In some embodiments of the present invention, the step S103 specifically includes:
step S201: reading file header information of a target file to obtain version information of the target file;
step S202: reading the file tail information of the target file to obtain the cross reference table information of the target file;
step S203: obtaining the information of the content description object of the target file according to the cross reference table information;
step S204: and obtaining the structural characteristics of the target file based on the information of the content description object, and selecting a corresponding processing scheme based on the structural characteristics, the version information and preset empirical data.
In some embodiments of the present invention, the content description object includes a boolean value object, a number object, a string object, a name object, an array object, a dictionary object, a stream object, and a null object.
In some embodiments of the present invention, the step S102 specifically includes:
step S301: responding to the editing scene selection demand information of a user to obtain first editing scene selection information;
step S302: obtaining second editing scene selection information based on the source information of the target file;
step S303: and if the first editing scene selection information and the second editing scene selection information are consistent in data, obtaining a corresponding processing scheme according to the first editing scene selection information, otherwise, generating corresponding prompt information to be displayed to the user, and responding to the confirmation selection information of the user to obtain final editing scene selection information.
In some embodiments of the present invention, the step S107 further includes: detecting the compatibility of the data stream information or the new format file with the corresponding browser or editor; if they are compatible, entering the next step, and otherwise returning to the step S103 to reselect another processing scheme.
In some embodiments of the present invention, the editing scene selection information includes a Docx scene edit, a PPT scene edit, an Excel scene edit, an OCR scene edit, and an AIS intelligent judgment scene edit.
In some embodiments of the present invention, the sources of the target files include Word documents, PPT documents, excel form documents, graphic photo documents, and scan-to-generate documents.
In a second aspect, an embodiment of the present application provides an intelligent scenario editing system for a PDF document, which includes:
the file acquisition module is used for acquiring a target file and acquiring source information of the target file;
the scene selection module is used for responding to the editing scene selection requirement of a user and obtaining corresponding editing scene selection information;
the processing scheme module is used for selecting a processing scheme for the target file based on the source information of the target file and the editing scene selection information to obtain a corresponding processing scheme;
the algorithm reading module is used for reading the structural data and the coding algorithm aiming at the target file according to the processing scheme;
the data reading module is used for reading data information of the target file based on the structural data and the coding algorithm;
the file generating module is used for coding the data information to form data stream information of a target format and obtaining a new format file of the corresponding target format based on the data stream information;
the difference judging module is used for judging the structural difference information between the target file and the new format file; if the structural difference information is smaller than a preset threshold value, the next step is entered, and otherwise the flow returns to the step S103 to reselect another processing scheme;
and the document editing module is used for sending the data stream information or the new format file to the corresponding editing environment based on the editing scene selection information to respond to the document editing requirement of the user.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory for storing one or more programs, and a processor. The one or more programs, when executed by the processor, implement the method as described in any one of the above first aspects.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method as described in any one of the above first aspects.
Compared with the prior art, the embodiment of the invention has at least the following advantages or beneficial effects:
The embodiment of the application provides an intelligent scene editing method for a PDF document. The method obtains the source information of the target file that the user needs to process and responds to the user's editing scene selection requirement to obtain the corresponding editing scene selection information. Based on the source information and the editing scene selection information, a targeted and accurate processing scheme is obtained. The structural data and the coding algorithm for the target file are read according to that scheme, and the data information of the target file is then read and encoded to obtain a new format file with clear and accurate data. The structural difference between the new format file and the target file is then judged: while the difference is too large, another scheme is selected and the conversion is repeated; once the structural difference information is smaller than the preset threshold value, the new format file is sent to the corresponding editing environment in response to the user's document editing requirement. This effectively ensures the fidelity of the data in the target file. That is, the method optimizes the flow for converting a PDF document into an editable form, so that different decoding and encoding algorithms can be used for PDF documents from different sources, the resource objects in the PDF documents are accurately extracted, and new data stream information is formed, avoiding the misreading, missed reading and text recombination errors caused by the fixed, single algorithm of traditional editing methods.
In addition, the method takes the user's editing scene selection requirement into account, so that it can subsequently provide an editing scene better suited to the user and effectively improve the user experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flowchart illustrating an embodiment of a method for intelligently editing a PDF document according to the present invention;
FIG. 2 is a flowchart illustrating a PDF document editing method according to another embodiment of the present invention;
FIG. 3 is a flowchart illustrating another embodiment of an intelligent scene editing method for PDF documents according to the present invention;
FIG. 4 is a flowchart illustrating a PDF document intelligent scene editing method according to yet another embodiment of the present invention;
FIG. 5 is a block diagram illustrating an embodiment of an intelligent scene editing system for PDF documents according to the present invention;
fig. 6 is a block diagram of an electronic device according to an embodiment of the present invention.
Reference numerals: 1. file acquisition module; 2. scene selection module; 3. processing scheme module; 4. algorithm reading module; 5. data reading module; 6. file generation module; 7. difference judgment module; 8. document editing module; 9. memory; 10. processor; 11. communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the individual features of the embodiments can be combined with one another without conflict.
Examples
Referring to fig. 1 and fig. 2, the intelligent scene editing method for a PDF document includes the following steps:
Step S101: acquiring the target file and acquiring the source information of the target file.
The target file in the above step is the PDF file that the user uploads for editing. In the prior art, the target file is read and decoded directly with a fixed, single algorithm flow, which easily leads to misreading and missed reading. The main reason is that target files come from various sources, and target files from different sources need different, targeted decoding and encoding algorithms so that their resource objects can subsequently be extracted accurately. Acquiring the source information of the target file in the above step therefore provides the raw data needed to analyze and determine a specific processing scheme later.
The sources of the target file include common PDF-generating sources such as Word documents, PPT documents, Excel spreadsheet documents, graphic or photo documents, and scan-generated documents. Once the source information is obtained, the target file can be processed in a targeted manner.
Step S102: responding to the editing scene selection requirement of the user to obtain corresponding editing scene selection information.
Because users' editing habits differ, their editing scene selection requirements differ as well. Obtaining the editing scene selection information in response to the user's requirement allows that requirement to be taken into account in the subsequent processing of the target file, which improves the user experience.
Illustratively, the editing scene selection information may include a Docx scene edit, a PPT scene edit, an Excel scene edit, an OCR scene edit, and an AIS intelligent judgment scene edit. The prior art generally offers only a Word-like editing scene, but a user may sometimes need a different one. Because the editing scene selection information includes the Docx, PPT, Excel, and OCR scene edits, the content of the target file can be presented in the editing scene the user selected, so that the user can work in a familiar or expected editing scene. In addition, because target files come from various sources, a user may not know the source of the target file or which editing scene is most appropriate. In that case the user can select the AIS intelligent judgment scene edit, and the target file is then analyzed automatically to determine the best editing scene.
Specifically, referring to fig. 4, in some embodiments of the present invention, the step S102 specifically includes:
step S301: responding to the editing scene selection demand information of a user to obtain first editing scene selection information;
step S302: obtaining second editing scene selection information based on the source information of the target file;
step S303: if the first editing scene selection information and the second editing scene selection information are consistent in data, obtaining a corresponding processing scheme according to the first editing scene selection information, otherwise, generating corresponding prompt information to be displayed to a user, and responding to the selection information confirmed by the user to obtain final editing scene selection information.
Because users' editing habits differ, their editing scene selection requirements differ; and because target files differ in source, different editing scenes affect the editing of a target file differently. The editing scene a user selects may therefore not be the best one, which would degrade the subsequent editing experience. In the above steps, responding to the user's editing scene selection requirement information yields the first editing scene selection information, which contains the editing scene the user most wants, and the source information of the target file yields the second editing scene selection information, which contains the editing scene that actually suits the target file best. The two are then compared. When they are inconsistent, the second editing scene selection information is used to generate an optimal-scene prompt that is displayed to the user to help the user weigh the options and make a final selection. The user may still choose the originally desired editing scene or may choose another editing scene in light of the prompt; either way, the confirmed choice becomes the final editing scene selection information.
Illustratively, the generated prompt message may include an optimal editing scenario prompt message (including what the optimal editing scenario is, reasons and corresponding advantages), and editing defects that may occur in other editing scenarios, so as to better assist the user in selecting and confirming the final editing scenario.
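The comparison in steps S301 to S303 can be illustrated with a short sketch. The mapping `SUGGESTED_SCENE`, the function `choose_scene`, and the `confirm` callback are hypothetical names introduced here; the text does not specify how the second editing scene selection information is derived from the source information, so the table below is only an assumed example.

```python
from typing import Callable, Dict

# Assumed mapping from file source to suggested editing scene (S302).
SUGGESTED_SCENE: Dict[str, str] = {
    "word": "docx",
    "ppt": "ppt",
    "excel": "excel",
    "photo": "ocr",
    "scan": "ocr",
}


def choose_scene(user_scene: str, source: str,
                 confirm: Callable[[str], str]) -> str:
    """Return the final editing scene, prompting only on disagreement (S303)."""
    suggested = SUGGESTED_SCENE.get(source, user_scene)
    if user_scene == suggested:
        return user_scene  # first and second selections are consistent
    prompt = (f"The file appears to come from '{source}'; "
              f"'{suggested}' may suit it better than '{user_scene}'.")
    return confirm(prompt)  # the user's confirmed choice is final
```

Passing the confirmation step in as a callback keeps the sketch independent of any particular user interface.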
Step S103: and selecting a processing scheme for the target file based on the source information of the target file and the editing scene selection information to obtain a corresponding processing scheme.
In the above steps, by integrating the source information of the target file and the editing scene selection information, a real and effective data support can be provided for the selection of the processing scheme, so that a more accurate and suitable processing scheme can be obtained.
Specifically, in some embodiments of the present invention, the step S103 may specifically include:
Step S201: reading the file header information of the target file to obtain the version information of the target file.
The header is the first line of the PDF file and indicates the version of the PDF specification that the file conforms to. The version information of the target file is thus obtained from the header information, which facilitates subsequent adjustment of the processing scheme.
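As a minimal sketch of reading the version from the header, the following function looks for the `%PDF-x.y` marker on the first line of the file. The function name `read_pdf_version` is an assumption made for illustration, and real files may need more lenient handling (for example, leading junk bytes before the marker).

```python
import re
from typing import Optional

_HEADER_RE = re.compile(rb"%PDF-(\d+)\.(\d+)")


def read_pdf_version(data: bytes) -> Optional[str]:
    """Return the version declared in the PDF header, e.g. '1.7', or None."""
    if not data:
        return None
    first_line = data[:1024].splitlines()[0]
    match = _HEADER_RE.match(first_line)
    if match is None:
        return None
    return f"{int(match.group(1))}.{int(match.group(2))}"
```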
Step S202: reading the file tail information of the target file to obtain the cross reference table information of the target file.
The file tail information is the file trailer (Trailer). The trailer contains the start address of the cross reference table, the total number of objects in the cross reference table, the object number of the Catalog object of the document, and security information such as encryption. From the information in the trailer, an application that reads the PDF can locate the cross reference table and the Catalog object of the whole file and thereby control the whole PDF document. The cross reference table (Cross-reference Table) is an address index set up for random access to indirect objects; it gives the entry address, i.e. the byte offset, of every object used in the current file, so that the system can access different objects at random. Reading the file tail information of the target file in the above step therefore yields the cross reference table information, which facilitates the subsequent acquisition of the content information of the target file.
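The lookup described above can be sketched for the classic, plain-text form of the cross reference table. The function names `find_startxref` and `parse_xref_table` are assumptions for illustration. The sketch does not handle cross-reference streams (introduced in PDF 1.5), incremental updates, or damaged files.

```python
import re
from typing import Dict, Optional, Tuple


def find_startxref(data: bytes) -> Optional[int]:
    """Return the byte offset stored after the last 'startxref' keyword."""
    tail = data[-2048:]
    pos = tail.rfind(b"startxref")
    if pos == -1:
        return None
    match = re.match(rb"startxref\s+(\d+)", tail[pos:])
    return int(match.group(1)) if match else None


def parse_xref_table(data: bytes, offset: int) -> Dict[int, Tuple[int, int, bool]]:
    """Parse a classic xref table: object number -> (offset, generation, in_use)."""
    lines = data[offset:].splitlines()
    if not lines or lines[0].strip() != b"xref":
        raise ValueError("no classic xref table at this offset")
    entries: Dict[int, Tuple[int, int, bool]] = {}
    i = 1
    while i < len(lines):
        header = lines[i].split()
        if len(header) != 2 or not all(p.isdigit() for p in header):
            break  # reached 'trailer' or something unexpected
        first, count = int(header[0]), int(header[1])
        for k in range(count):
            off, gen, flag = lines[i + 1 + k].split()
            entries[first + k] = (int(off), int(gen), flag == b"n")
        i += 1 + count
    return entries
```

Each entry maps an object number to its byte offset, which is what allows objects to be read in any order.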
Step S203: obtaining the information of the content description objects of the target file according to the cross reference table information.
The above step obtains the information of the content description objects of the target file, namely its Boolean objects, number objects, string objects, name objects, array objects, dictionary objects, stream objects, and null object, which facilitates subsequent processing based on those objects. A Boolean object represents a logical value and is written with the keyword true or false. A number object is either an integer or a real number: an integer object represents a mathematical integer, and a real object represents a mathematical real number. A string object consists of 0 or more bytes. A name object is an atomic symbol uniquely defined by a sequence of any characters except the null character. An array object is a one-dimensional collection of objects; its elements may be heterogeneous, i.e. any combination of numbers, strings, dictionaries, or any other objects, including other arrays. A dictionary object is an associative table of key-value pairs (dictionary entries); each key is a name object, each value may be any type of object, including another dictionary, and no two entries in the same dictionary should have the same key. A stream object is a byte sequence of unlimited length; it consists of a dictionary followed by 0 or more bytes enclosed between the keywords stream and endstream. The null object has a type and value unequal to those of any other object; there is only one object of type null, denoted by the keyword null.
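To make the eight object types concrete, the sketch below classifies a serialized PDF object by its leading syntax. It is a deliberately simplified illustration, not a PDF parser: the function name `classify_pdf_token` is an assumption, and comments, indirect references, and nested structures are not analyzed.

```python
import re

_NUMBER_RE = re.compile(rb"[+-]?(\d+\.?\d*|\.\d+)$")


def classify_pdf_token(token: bytes) -> str:
    """Classify a serialized PDF object by its leading syntax (simplified)."""
    t = token.strip()
    if t in (b"true", b"false"):
        return "boolean"
    if t == b"null":
        return "null"
    if _NUMBER_RE.match(t):
        return "number"
    if t.startswith(b"<<"):
        # A dictionary followed by stream ... endstream is a stream object.
        return "stream" if t.endswith(b"endstream") else "dictionary"
    if t.startswith(b"(") or t.startswith(b"<"):
        return "string"  # literal (...) or hexadecimal <...> string
    if t.startswith(b"/"):
        return "name"
    if t.startswith(b"["):
        return "array"
    return "unknown"
```

The `<<` check must come before the `<` check, because a dictionary and a hexadecimal string share the same first character.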
Step S204: obtaining the structural characteristics of the target file based on the information of the content description objects, and selecting a corresponding processing scheme based on the structural characteristics, the version information and preset empirical data.
In the above step, the processing scheme is selected from the acquired structural characteristics, the version information, and the preset empirical data. Because the selection is supported by several kinds of data, the processing scheme is both easier to obtain (and easier to upgrade and optimize later) and more accurate (the supporting data are effective and reliable).
Step S104: reading the structure data and the coding algorithm for the target file according to the processing scheme.
After the specific processing scheme is obtained, the structure data and the coding algorithm for the target file are read based on that scheme. The structure data and coding algorithm read in this way are specific to the target file, instead of the fixed structure data and coding algorithm called in the prior art, which effectively improves the efficiency and accuracy of subsequently reading the data information of the target file.
Step S105: and reading the data information of the target file based on the structure data and the coding algorithm.
In the above steps, because the structure data and the coding algorithm are the structure data and the coding algorithm which have pertinence to the target file and are obtained based on the processing scheme, the data information of the target file read by the method can be more accurate and effective.
Step S106: encoding based on the data information to form data stream information of a target format, and obtaining a new format file of the corresponding target format based on the data stream information. The target format is the editable format that the user wishes to generate, for example the Docx, PPT, JPG or Excel format.
Step S107: judging the structural difference information between the target file and the new format file; if the structural difference information is smaller than a preset threshold value, entering the next step; otherwise, returning to step S103 to reselect another processing scheme.
In the above step, the structural difference information between the target file and the new format file is obtained and required to be smaller than the preset threshold value. This avoids the data loss caused by a large structural difference between the target file and the new format file, and effectively ensures the consistency and accuracy of the data.
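The threshold test and the return to scheme selection can be sketched as a loop over candidate schemes. The text does not define how the structural difference is measured, so the metric below (a normalized sum of count differences per element type) is purely an assumption, as are the names `structural_difference` and `convert_with_retry`.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Structure = Dict[str, int]  # hypothetical summary: element type -> count


def structural_difference(source: Structure, converted: Structure) -> float:
    """Assumed metric: total count mismatch divided by the source element count."""
    total = sum(source.values())
    if total == 0:
        return 0.0 if not any(converted.values()) else 1.0
    keys = set(source) | set(converted)
    mismatch = sum(abs(source.get(k, 0) - converted.get(k, 0)) for k in keys)
    return mismatch / total


def convert_with_retry(
    schemes: Iterable[str],
    convert: Callable[[str], Structure],
    source: Structure,
    threshold: float,
) -> Optional[Tuple[str, Structure]]:
    """Try each scheme in turn; accept the first result below the threshold."""
    for scheme in schemes:
        converted = convert(scheme)
        if structural_difference(source, converted) < threshold:
            return scheme, converted
    return None  # every candidate scheme exceeded the threshold
```

For example, a source with 10 paragraphs, 2 tables and 3 images compared against a result that kept only 6 paragraphs and no tables gives a difference of 6/15 = 0.4, which a threshold of 0.1 rejects, so the next scheme is tried.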
Referring to fig. 3, in some embodiments of the present invention, step S107 is further followed by: detecting the compatibility of the data stream information or the new format file with the corresponding browser or editor; if they are compatible, entering the next step; otherwise, returning to step S103 to reselect another processing scheme.
In the above step, the finally generated data stream information or new format file may not be compatible with the corresponding browser or editor. If it were sent to the browser or editor without a compatibility check, data could be lost or damaged. The above step effectively avoids this.
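The text does not say how compatibility is detected. One simple possibility, shown here only as an assumption, is to confirm that the editor accepts the target format and that the generated bytes begin with the signature expected for that format. Office Open XML files (docx, pptx, xlsx) are ZIP containers and JPEG files start with the bytes FF D8 FF; the editor names and the table `EDITOR_FORMATS` are hypothetical.

```python
from typing import Dict, Set

# Leading bytes expected for each target format.
MAGIC_BYTES: Dict[str, bytes] = {
    "docx": b"PK\x03\x04",
    "pptx": b"PK\x03\x04",
    "xlsx": b"PK\x03\x04",
    "jpg": b"\xff\xd8\xff",
}

# Hypothetical registry of which editing environment accepts which formats.
EDITOR_FORMATS: Dict[str, Set[str]] = {
    "docx_editor": {"docx"},
    "slide_editor": {"pptx"},
    "sheet_editor": {"xlsx"},
    "image_viewer": {"jpg"},
}


def is_compatible(editor: str, fmt: str, data: bytes) -> bool:
    """Return True if the editor accepts fmt and data carries fmt's signature."""
    if fmt not in EDITOR_FORMATS.get(editor, set()):
        return False
    magic = MAGIC_BYTES.get(fmt)
    return magic is not None and data.startswith(magic)
```

A signature check only catches gross mismatches such as a truncated or wrongly typed output; a fuller check would have to open the file with the target editor's own parser.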
Step S108: sending the data stream information or the new format file into the corresponding editing environment based on the editing scene selection information, in response to the document editing requirement of the user.
In the above step, after the data stream information or the new format file is sent to the corresponding editing environment, the user can modify and edit the text therein, and a new file in the corresponding format is obtained once the modification and editing are completed, thereby completing the modification, editing and conversion of the target file.
Based on the same inventive concept, and referring to fig. 5, the present invention further provides an intelligent scene editing system for PDF documents, comprising:
the file acquisition module 1 is used for acquiring a target file and acquiring source information of the target file;
the scene selection module 2 is used for responding to the editing scene selection requirement of the user and obtaining corresponding editing scene selection information;
the processing scheme module 3 is used for selecting a processing scheme for the target file based on the source information of the target file and the editing scene selection information to obtain a corresponding processing scheme;
the algorithm reading module 4 is used for reading the structure data and the coding algorithm aiming at the target file according to the processing scheme;
the data reading module 5 is used for reading data information of the target file based on the structural data and the coding algorithm;
the file generation module 6 is used for encoding based on the data information to form data stream information of a target format and obtaining a new format file of the corresponding target format based on the data stream information;
the difference judgment module 7 is used for judging the structure difference information of the target file and the new format file; if the structure difference information is smaller than a preset threshold value, the next step is carried out; otherwise, the process returns to step S102 and another processing scheme is reselected;
and the document editing module 8 is used for sending the data stream information or the new format file into the corresponding editing environment based on the editing scene selection information to respond to the document editing requirement of the user.
For a specific implementation process of the system, please refer to the intelligent scene editing method for the PDF document provided in the embodiment of the present application, which is not described herein again.
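The way the eight modules hand data to one another can be sketched as a small pipeline whose stages are injected as callables. Everything here is a hypothetical arrangement for illustration: the class name `ScenePipeline`, the stage signatures, the grouping of modules 4 to 6 into one `convert` callable, and the default threshold are assumptions and are not taken from the description.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ScenePipeline:
    acquire: Callable[[str], str]                  # module 1: path -> source info
    choose_scene: Callable[[str], str]             # module 2: source info -> scene
    list_schemes: Callable[[str, str], List[str]]  # module 3: candidate schemes
    convert: Callable[[str, str], bytes]           # modules 4-6: (path, scheme) -> output
    difference: Callable[[str, bytes], float]      # module 7: structural difference
    open_editor: Callable[[str, bytes], None]      # module 8: hand over to the editor
    threshold: float = 0.1                         # assumed preset threshold

    def run(self, path: str) -> Optional[str]:
        """Return the scheme that produced an accepted result, or None."""
        source = self.acquire(path)
        scene = self.choose_scene(source)
        for scheme in self.list_schemes(source, scene):
            output = self.convert(path, scheme)
            if self.difference(path, output) < self.threshold:
                self.open_editor(scene, output)
                return scheme
        return None
```

Injecting the stages keeps each module replaceable on its own, which matches the statement later in the description that modules may exist separately or be integrated together.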
Referring to fig. 6, fig. 6 is a block diagram of an electronic device according to an embodiment of the present invention. The electronic device comprises a memory 9, a processor 10 and a communication interface 11, wherein the memory 9, the processor 10 and the communication interface 11 are electrically connected with each other directly or indirectly to realize the transmission or interaction of data. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 9 may be used to store software programs and modules, such as program instructions/modules corresponding to the intelligent scene editing system for PDF documents provided in the embodiments of the present application, and the processor 10 executes various functional applications and data processing by executing the software programs and modules stored in the memory 9. The communication interface 11 may be used for communication of signaling or data with other node devices.
The memory 9 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 10 may be an integrated circuit chip having signal processing capabilities. The processor 10 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It will be appreciated that the configuration shown in fig. 6 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 6 or have a different configuration than shown in fig. 6. The components shown in fig. 6 may be implemented in hardware, software, or a combination thereof.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The above-described functions, if implemented in the form of software functional modules and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the present application, or the portions thereof that substantially contribute to the prior art, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (10)

1. An intelligent scene editing method of a PDF document is characterized by comprising the following steps:
step S101: acquiring a target file and acquiring source information of the target file;
step S102: responding to the editing scene selection requirement of a user to obtain corresponding editing scene selection information;
step S103: selecting a processing scheme for the target file based on the source information of the target file and the editing scene selection information to obtain a corresponding processing scheme;
step S104: reading structure data and an encoding algorithm aiming at the target file according to the processing scheme;
step S105: reading data information of a target file based on the structural data and the coding algorithm;
step S106: coding is carried out on the basis of the data information to form data stream information of a target format, and a corresponding new format file of the target format is obtained on the basis of the data stream information;
step S107: judging the structure difference information of the target file and the new format file, if the structure difference information is smaller than a preset threshold value, entering the next step, otherwise, returning to the step S103 to reselect other processing schemes;
step S108: and sending the data stream information or the new format file into the corresponding editing environment based on the editing scene selection information to respond to the document editing requirement of the user.
2. The intelligent scene editing method of a PDF document according to claim 1, wherein said step S103 specifically comprises:
step S201: reading file header information of a target file to obtain version information of the target file;
step S202: reading file tail information of the target file to obtain cross reference table information of the target file;
step S203: obtaining the information of the content description object of the target file according to the cross reference table information;
step S204: and obtaining the structural characteristics of the target file based on the information of the content description object, and selecting a corresponding processing scheme based on the structural characteristics, the version information and preset empirical data.
3. The method of claim 2, wherein the content description objects include boolean objects, numeric objects, string objects, name objects, array objects, dictionary objects, stream objects, and null objects.
4. The intelligent scene editing method of a PDF document according to claim 1, wherein said step S102 specifically comprises:
step S301: responding to the editing scene selection demand information of a user to obtain first editing scene selection information;
step S302: obtaining second editing scene selection information based on the source information of the target file;
step S303: and if the first editing scene selection information and the second editing scene selection information are consistent in data, obtaining a corresponding processing scheme according to the first editing scene selection information, otherwise, generating corresponding prompt information to be displayed to the user, and responding to the confirmation selection information of the user to obtain final editing scene selection information.
5. The method for intelligently scene-editing of a PDF document according to claim 1, wherein said step S107 is followed by further comprising: and detecting the compatibility of the data stream information or the new format file and the corresponding browser or editor, if the compatibility is realized, entering the next step, and otherwise, returning to the step S103 to reselect other processing schemes.
6. The intelligent scene editing method of a PDF document according to claim 1, wherein said editing scene selection information includes a Docx scene editing, a PPT scene editing, an Excel scene editing, an OCR scene editing, and an AIS intelligent judgment scene editing.
7. The intelligent scenarized editing method of a PDF document according to claim 1, wherein the source of the target file comprises a Word document, a PPT document, an Excel form document, a graphic photo document, and a scan-to-generate document.
8. An intelligent scene editing system of a PDF document is characterized by comprising:
the file acquisition module is used for acquiring a target file and acquiring source information of the target file;
the scene selection module is used for responding to the editing scene selection requirement of the user and obtaining corresponding editing scene selection information;
the processing scheme module is used for selecting a processing scheme for the target file based on the source information of the target file and the editing scene selection information to obtain a corresponding processing scheme;
the algorithm reading module is used for reading the structural data and the coding algorithm aiming at the target file according to the processing scheme;
the data reading module is used for reading data information of the target file based on the structural data and the coding algorithm;
the file generation module is used for coding based on the data information to form data stream information of a target format and obtaining a new format file of the corresponding target format based on the data stream information;
the difference judging module is used for judging the structure difference information of the target file and the new format file, if the structure difference information is smaller than a preset threshold value, the next step is carried out, and if not, the step S102 is carried out, and other processing schemes are reselected;
and the document editing module is used for sending the data stream information or the new format file into the corresponding editing environment based on the editing scene selection information to respond to the document editing requirement of the user.
9. An electronic device, comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202211352604.3A 2022-11-01 2022-11-01 Intelligent scene editing method, system, equipment and medium for PDF document Pending CN115630618A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211352604.3A CN115630618A (en) 2022-11-01 2022-11-01 Intelligent scene editing method, system, equipment and medium for PDF document

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211352604.3A CN115630618A (en) 2022-11-01 2022-11-01 Intelligent scene editing method, system, equipment and medium for PDF document

Publications (1)

Publication Number Publication Date
CN115630618A true CN115630618A (en) 2023-01-20

Family

ID=84909521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211352604.3A Pending CN115630618A (en) 2022-11-01 2022-11-01 Intelligent scene editing method, system, equipment and medium for PDF document

Country Status (1)

Country Link
CN (1) CN115630618A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116170527A (en) * 2023-02-16 2023-05-26 南京金阵微电子技术有限公司 Message editing method, message editing device, medium and electronic equipment
CN116170527B (en) * 2023-02-16 2023-11-07 南京金阵微电子技术有限公司 Message editing method, message editing device, medium and electronic equipment
CN116661767A (en) * 2023-07-28 2023-08-29 亚信科技(中国)有限公司 File generation method, device, equipment and storage medium
CN116661767B (en) * 2023-07-28 2023-10-27 亚信科技(中国)有限公司 File generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination