CN112784527A - Document merging method and device and electronic equipment - Google Patents

Publication number
CN112784527A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011069543.0A
Other languages
Chinese (zh)
Inventor
薛俊峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Original Assignee
Beijing Kingsoft Office Software Inc
Zhuhai Kingsoft Office Software Co Ltd
Application filed by Beijing Kingsoft Office Software Inc, Zhuhai Kingsoft Office Software Co Ltd
Priority to CN202011069543.0A
Publication of CN112784527A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/106 Display of layout of documents; Previewing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/189 Automatic justification

Abstract

Embodiments of the invention provide a document merging method, a document merging device, and electronic equipment. The method comprises: obtaining a first document to be merged and a second document to be merged; performing data category analysis on the data contained in the second document to be merged to obtain the display order of the data categories in the second document to be merged; extracting the data contained in the first document to be merged and generating a third document to be merged according to that display order; and merging the third document to be merged with the second document to be merged to obtain a merged document. In the resulting merged document, the data from the first document to be merged is displayed in the same data category order as the data from the second document to be merged, which improves the efficiency of merging documents in different formats and makes it easier for a reader to find and compare data.

Description

Document merging method and device and electronic equipment
Technical Field
The present invention relates to the field of electronic document processing technologies, and in particular, to a document merging method and apparatus, and an electronic device.
Background
In the field of electronic document processing there are many document formats, and work often requires merging several documents in different formats into one document.
Currently, document merging generally proceeds as follows: the format conversion function of the document editing software is used to convert each document to be merged into a target format, and the converted documents are then merged to obtain the merged document. Alternatively, the content of one document is copied and pasted into the corresponding other document.
Either way, the process must be repeated for each document to be merged, which is cumbersome and time-consuming, so document merging is inefficient.
Disclosure of Invention
Embodiments of the invention aim to provide a document merging method, a document merging device, and electronic equipment, so as to improve the efficiency of document merging. The specific technical solution is as follows:
In a first aspect, an embodiment of the present invention provides a document merging method, including:
acquiring a first document to be merged and a second document to be merged;
performing data category analysis on data contained in the second document to be merged to obtain a display sequence of data categories in the second document to be merged;
extracting data contained in the first document to be merged, and generating a third document to be merged according to the display sequence;
and merging the third document to be merged with the second document to be merged to obtain a merged document.
Further, the extracting data included in the first document to be merged and generating a third document to be merged according to the presentation order includes:
establishing an array container set comprising a plurality of array containers, wherein one array container corresponds to one data category;
respectively adding the data contained in the first to-be-merged document into corresponding array containers according to different data types;
and creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged.
Further, before merging the third document to be merged with the second document to be merged to obtain a merged document, the method further includes:
judging whether the second document to be merged is in the final merged format; if not, performing one of the following:
merging the third document to be merged with the second document to be merged to obtain a merged document, and performing target format conversion on the merged document to obtain a merged document in the target format; or,
converting the second document to be merged into a fourth document, and merging the third document to be merged with the fourth document to obtain a merged document.
Further, when the formats of the third document to be merged and the second document to be merged are the same, merging the third document to be merged and the second document to be merged to obtain a merged document, including:
receiving a first insertion instruction; the first insertion instruction comprises insertion position information;
and inserting the third document to be merged into the second document to be merged according to the insertion position information contained in the first insertion instruction to obtain a merged document.
Further, when the formats of the third document to be merged and the second document to be merged are the same, the adding the data included in the first document to be merged to the corresponding array containers respectively according to the different data types includes:
dividing the data contained in the first document to be merged into a plurality of data subsets according to a preset dividing mode;
adding the data contained in each data subset, together with identification information of the data subset to which the data belongs, to the corresponding array containers according to the different data types;
the newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged, including:
respectively creating a third document for each data subset, and sequentially adding the data belonging to the data subset in each array container to the third document corresponding to the data subset according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the data subset;
the merging the third document to be merged with the second document to be merged to obtain a merged document, including:
receiving a second insertion instruction; the second insertion instruction comprises insertion position information of each third document to be merged;
and inserting the third documents to be merged into the second document to be merged according to the insertion position information of the third documents to be merged, which is contained in the second insertion instruction, to obtain the merged document.
Further, the number of the first documents to be merged is multiple;
the establishing of the array container set comprising a plurality of array containers comprises the following steps:
respectively establishing an array container set containing a plurality of array containers for each first document to be merged;
the adding the data contained in the first document to be merged to the corresponding array containers according to different data types comprises:
for each first document to be merged, in an array container set corresponding to the first document to be merged, adding data contained in the first document to be merged to corresponding array containers respectively according to different data types;
the newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged, including:
and respectively creating a third document for each first document to be merged, and sequentially adding data in each array container in the array container set corresponding to the first document to be merged to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
Further, the number of the first documents to be merged is multiple;
the establishing of the array container set comprising a plurality of array containers comprises the following steps:
establishing an array container set comprising a plurality of array containers;
the adding the data contained in the first document to be merged to the corresponding array containers according to different data types comprises:
for each first document to be merged, adding the data contained in the document and the identification information of the first document to be merged into a corresponding array container according to different data types;
the newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged, including:
and respectively creating a third document for each first document to be merged, and sequentially adding the data belonging to the first document to be merged in each array container to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
In a second aspect, an embodiment of the present invention provides a document merging apparatus, including:
the acquiring module is used for acquiring a first document to be merged and a second document to be merged;
a display order obtaining module, configured to perform data category analysis on data included in the second document to be merged to obtain a display order of data categories in the second document to be merged;
the third document to be merged generating module is used for extracting data contained in the first document to be merged and generating a third document to be merged according to the display sequence;
and the document merging module is used for merging the third document to be merged and the second document to be merged to obtain a merged document.
Further, the third to-be-merged document generating module includes:
the array container set establishing submodule is used for establishing an array container set comprising a plurality of array containers, wherein one array container corresponds to one data category;
the data adding submodule is used for respectively adding the data contained in the first to-be-merged document into the corresponding array containers according to different data types;
and the document new sub-module is used for newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence to obtain a third document to be merged corresponding to the first document to be merged.
Further, the apparatus further comprises:
the judging module is used for judging whether the second document to be merged is a final merged format document or not before merging the third document to be merged with the second document to be merged to obtain a merged document; if not, triggering the document merging module to execute the following steps: merging the third document to be merged with the second document to be merged to obtain a merged document; performing target format conversion on the merged document to obtain a merged target format document; or converting the second document to be merged into a fourth document, and merging the third document to be merged and the fourth document to obtain a merged document.
Further, when the third document to be merged is in the same format as the second document to be merged, the document merging module is specifically configured to:
receiving a first insertion instruction; the first insertion instruction comprises insertion position information;
and inserting the third document to be merged into the second document to be merged according to the insertion position information contained in the first insertion instruction to obtain a merged document.
Further, when the third document to be merged is in the same format as the second document to be merged, the data adding sub-module is specifically configured to: divide the data contained in the first document to be merged into a plurality of data subsets according to a preset dividing mode; and add the data contained in each data subset, together with identification information of the data subset to which the data belongs, to the corresponding array containers according to the different data types;
the document new sub-module is specifically configured to: respectively creating a third document for each data subset, and sequentially adding the data belonging to the data subset in each array container to the third document corresponding to the data subset according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the data subset;
the document merging module is specifically configured to: receiving a second insertion instruction; the second insertion instruction comprises insertion position information of each third document to be merged; and inserting the third documents to be merged into the second document to be merged according to the insertion position information of the third documents to be merged, which is contained in the second insertion instruction, to obtain the merged document.
Further, the number of the first documents to be merged is multiple;
the array container set establishing submodule is specifically configured to: respectively establishing an array container set containing a plurality of array containers for each first document to be merged;
the data adding submodule is specifically configured to: for each first document to be merged, in an array container set corresponding to the first document to be merged, adding data contained in the first document to be merged to corresponding array containers respectively according to different data types;
the document new sub-module is specifically configured to: and respectively creating a third document for each first document to be merged, and sequentially adding data in each array container in the array container set corresponding to the first document to be merged to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
Further, the number of the first documents to be merged is multiple;
the array container set establishing submodule is specifically configured to: establishing an array container set comprising a plurality of array containers;
the data adding submodule is specifically configured to: for each first document to be merged, adding the data contained in the document and the identification information of the first document to be merged into a corresponding array container according to different data types;
the document new sub-module is specifically configured to: and respectively creating a third document for each first document to be merged, and sequentially adding the data belonging to the first document to be merged in each array container to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the steps of any document merging method when executing the program stored in the memory.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where instructions are stored, and when the instructions are executed on a computer, the computer is caused to execute any one of the above-mentioned document merging methods.
In a fifth aspect, an embodiment of the present invention further provides a computer program product containing instructions, which when run on a computer, cause the computer to perform any of the above-mentioned document merging methods.
The embodiment of the invention has the following beneficial effects:
the document merging method, the document merging device and the electronic equipment provided by the embodiment of the invention are used for acquiring a first document to be merged and a second document to be merged; performing data category analysis on data contained in the second document to be merged to obtain a display sequence of data categories in the second document to be merged; extracting data contained in the first document to be merged, and generating a third document to be merged according to the display sequence; and merging the third document to be merged with the second document to be merged to obtain a merged document.
In the embodiment of the invention, before the documents are merged, the display sequence of the data categories in the second document to be merged is obtained through analysis, then the data contained in the first document to be merged is extracted, a third document to be merged is generated according to the display sequence, and then the third document to be merged and the second document to be merged are merged to obtain the merged document. The third document to be merged is established according to the display sequence of the data categories in the second document to be merged, so that the display sequence of the data categories in the third document to be merged is consistent with the display sequence of the data categories in the second document to be merged, and correspondingly, the display sequence of the data categories of the data corresponding to the first document to be merged and the display sequence of the data categories of the data corresponding to the second document to be merged in the merged document are also consistent, so that the merging efficiency of documents with different formats is improved, a reader can conveniently search and compare the data in the merged document, and the user experience is improved.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating a document merging method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of another document merging method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a document merging device according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic flow chart of a document merging method provided by an embodiment of the present invention, which specifically includes the following steps:
Step 101, obtaining a first document to be merged and a second document to be merged.
The format of the first document to be merged may be the same as or different from that of the second document to be merged. For example, the first document to be merged may be in any document format, such as Word, PPT, Excel, PDF, or a notes document, and the second document to be merged may be in any document format, such as Word, PPT, Excel, or PDF.
There may be one or more first documents to be merged.
Step 102, performing data category analysis on the data contained in the second document to be merged to obtain the display order of the data categories in the second document to be merged.
The data categories may include pictures, text, links, audio, video, and the like.
In this step, existing methods such as machine learning may be used to identify the data category of the data contained in the second document to be merged, so as to obtain the display order of the data categories in the second document to be merged. The specific method used to obtain this display order is not limited here.
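As a minimal sketch of what step 102 produces, the snippet below assumes that the data categories have already been identified (for example by a classifier) and that the "display order" is the order in which each category first appears in the second document. Both points are assumptions made for illustration; the patent leaves the identification method open and does not define the order more precisely. The `(category, payload)` item representation and the `display_order` name are hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: each item is (category, payload).
Item = Tuple[str, str]


def display_order(items: Iterable[Item]) -> List[str]:
    """Return the data categories in order of first appearance."""
    order: List[str] = []
    seen = set()
    for category, _payload in items:
        if category not in seen:
            seen.add(category)
            order.append(category)
    return order


# Illustrative second document to be merged (categories already labeled).
second_doc = [
    ("text", "Quarterly summary"),
    ("picture", "chart.png"),
    ("text", "Details"),
    ("link", "https://example.com"),
]
order = display_order(second_doc)
print(order)
```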
Step 103, extracting the data contained in the first document to be merged, and generating a third document to be merged according to the display order.
Further, the third document to be merged may be generated by:
establishing an array container set comprising a plurality of array containers, wherein one array container corresponds to one data category;
respectively adding data contained in the first to-be-merged document into corresponding array containers according to different data types;
and creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged.
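The three sub-steps above can be sketched as follows, with one list per data category standing in for an "array container". The item representation and the function name `build_third_document` are hypothetical, and the handling of categories that occur in the first document but not in the second (appended at the end here) is an assumption; the patent does not address that case.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical representation: each item is (category, payload).
Item = Tuple[str, str]


def build_third_document(first_doc: List[Item], order: List[str]) -> List[Item]:
    """Regroup first_doc by category, following the given display order."""
    # Array container set: one container (list) per data category.
    containers: Dict[str, List[str]] = defaultdict(list)
    for category, payload in first_doc:
        containers[category].append(payload)

    # Assumption: categories missing from `order` are appended afterwards
    # so that no data from the first document is lost.
    leftovers = [c for c in containers if c not in order]

    # New third document, filled container by container.
    third_doc: List[Item] = []
    for category in list(order) + leftovers:
        for payload in containers.get(category, []):
            third_doc.append((category, payload))
    return third_doc


first_doc = [
    ("picture", "a.png"),
    ("text", "hello"),
    ("picture", "b.png"),
    ("audio", "c.mp3"),
]
third_doc = build_third_document(first_doc, ["text", "picture"])
print(third_doc)
```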
In this step, the format of the third document to be merged may be the same as the format of the second document to be merged, or may be different from the format of the second document to be merged.
When the third document to be merged has the same format as the second document to be merged, the third document to be merged can be inserted directly into the second document to be merged in the subsequent step 104. When the formats differ, step 104 may either convert the third document to be merged into the format of the second document to be merged and then insert it into the second document to be merged, or convert the second document to be merged into a fourth document that has the same format as the third document to be merged and then insert the third document to be merged into the fourth document.
In the process of generating the third document to be merged, when the number of the first documents to be merged is multiple, the following two ways may be used to establish the array container set:
First, for each first document to be merged, an array container set including a plurality of array containers is respectively established.
Second, a set of array containers is created that includes a plurality of array containers. That is, a plurality of first documents to be merged share one array container set.
Corresponding to the first array container set establishing manner, for each first document to be merged, in the array container set corresponding to the first document to be merged, adding data contained in the first document to be merged to the corresponding array container respectively according to different data types; then, a third document may be created for each first document to be merged, and the data in each array container in the array container set corresponding to the first document to be merged may be sequentially added to the third document corresponding to the first document to be merged according to the display order of the data categories in the second document to be merged, so as to obtain a third document to be merged corresponding to the first document to be merged.
Corresponding to the second array container set establishing manner, the data contained in each first document to be merged and the identification information of the first document to be merged can be added to the array container corresponding to the same array container set according to different data types; and then, according to the identification information of the first documents to be merged, respectively creating a third document for each first document to be merged, and sequentially adding the data belonging to the first document to be merged in each array container to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
Comparing the two modes: in the first mode, a separate array container set is established for each first document to be merged, so each array container only receives data from the document that owns the set, and no identification information needs to be stored alongside the data, which keeps the data adding operation simple. In the second mode, all first documents to be merged share one array container set, so fewer array container sets need to be established and the establishment of the array container set is simple.
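The trade-off between the two modes can be illustrated with a short sketch. Document identifiers, the item representation, and all function names below are hypothetical; the code only shows that the first mode stores bare payloads in per-document sets, while the second mode stores `(doc_id, payload)` pairs in one shared set and filters by identifier when each third document is built.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # (category, payload), hypothetical representation


def per_document_sets(docs: Dict[str, List[Item]]) -> Dict[str, Dict[str, List[str]]]:
    """Mode 1: one array container set per first document, no IDs stored."""
    sets: Dict[str, Dict[str, List[str]]] = {}
    for doc_id, items in docs.items():
        containers: Dict[str, List[str]] = defaultdict(list)
        for category, payload in items:
            containers[category].append(payload)
        sets[doc_id] = dict(containers)
    return sets


def shared_set(docs: Dict[str, List[Item]]) -> Dict[str, List[Tuple[str, str]]]:
    """Mode 2: one shared set; every entry carries its document ID."""
    containers: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for doc_id, items in docs.items():
        for category, payload in items:
            containers[category].append((doc_id, payload))
    return dict(containers)


def third_docs_from_shared(containers, order: List[str], doc_ids: List[str]):
    """Build one third document per first document by filtering on the ID."""
    return {
        doc_id: [
            (category, payload)
            for category in order
            for owner, payload in containers.get(category, [])
            if owner == doc_id
        ]
        for doc_id in doc_ids
    }


docs = {
    "A": [("picture", "a.png"), ("text", "a-text")],
    "B": [("text", "b-text")],
}
order = ["text", "picture"]
mode1 = per_document_sets(docs)
mode2 = shared_set(docs)
thirds = third_docs_from_shared(mode2, order, list(docs))
print(thirds)
```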
Step 104, merging the third document to be merged and the second document to be merged to obtain a merged document.
In this step, when the third document to be merged and the second document to be merged have the same format, the merged document may be obtained in any of the following ways:
First, an insertion position, that is, a preset position, may be set in advance; after the third document to be merged corresponding to the first document to be merged is obtained, it is inserted directly at the preset position in the second document to be merged to obtain the merged document. The preset position may be the position before the data contained in the second document to be merged or the position after that data.
Second, after the third document to be merged corresponding to the first document to be merged is obtained, a first insertion instruction containing insertion position information may be received, and the third document to be merged is then inserted into the second document to be merged according to that insertion position information to obtain the merged document. For example, the insertion position information may be determined by the user according to actual needs, and the insertion position may be any position in the second document to be merged.
Third, in step 103, the data contained in the first document to be merged is divided into a plurality of data subsets according to a preset dividing mode, which may be any division based on the document structure, such as by chapter, by page break, or by the heading type to which the data belongs. The data contained in each data subset, together with identification information of the data subset to which it belongs, is added to the corresponding array containers according to the different data types. A third document is then created for each data subset, and the data belonging to that data subset in each array container is added in turn to the third document corresponding to the data subset, following the display order of the data categories in the second document to be merged, to obtain the third document to be merged corresponding to the data subset. Then, in this step (step 104), a second insertion instruction is received, which contains insertion position information for each third document to be merged, and each third document to be merged is inserted into the second document to be merged according to its insertion position information to obtain the merged document. For example, the insertion position information of each third document to be merged may be determined by the user according to actual needs, and each insertion position may be any position in the second document to be merged.
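The three insertion ways above can be sketched with plain lists. Here a document is modeled as a list of payload strings, an insertion position is an index into the original second document, and the "insertion instruction" is a dictionary mapping subset identifiers to positions; all of these modeling choices and names are assumptions for illustration. A preset position corresponds to index `0` (before the existing data) or `len(second)` (after it).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Tagged = Tuple[str, str, str]  # (subset_id, category, payload), hypothetical


def third_docs_by_subset(first_doc: List[Tagged], order: List[str]) -> Dict[str, List[str]]:
    """Build one third document per data subset, ordered by category."""
    containers: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    subset_ids: List[str] = []
    for subset_id, category, payload in first_doc:
        # Each container entry keeps the ID of the subset it came from.
        containers[category].append((subset_id, payload))
        if subset_id not in subset_ids:
            subset_ids.append(subset_id)
    return {
        sid: [p for cat in order for owner, p in containers.get(cat, []) if owner == sid]
        for sid in subset_ids
    }


def insert_all(second: List[str], thirds: Dict[str, List[str]],
               positions: Dict[str, int]) -> List[str]:
    """Insert every third document at its position in the second document."""
    merged = list(second)
    # Work from the largest position down so that earlier insertions do not
    # shift the positions still to be used.
    for sid in sorted(thirds, key=lambda s: positions[s], reverse=True):
        pos = positions[sid]
        if not 0 <= pos <= len(second):
            raise ValueError(f"insertion position {pos} is outside the document")
        merged[pos:pos] = thirds[sid]
    return merged


first_doc = [
    ("ch1", "picture", "p1.png"),
    ("ch1", "text", "t1"),
    ("ch2", "text", "t2"),
]
thirds = third_docs_by_subset(first_doc, ["text", "picture"])
second = ["s-text", "s-picture"]
merged = insert_all(second, thirds, {"ch1": 0, "ch2": 2})
print(merged)
```

With a single subset, `insert_all` reduces to the first and second ways: the position comes either from a preset value or from the first insertion instruction.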
In this step, when the formats of the third document to be merged and the second document to be merged are different, the two documents may be merged to obtain the merged document in the following manner:
converting the second document to be merged into a fourth document, and merging the third document to be merged with the fourth document to obtain a merged document, wherein the third document to be merged and the fourth document are documents in the same format.
In another embodiment of the present invention, before step 104 (merging the third document to be merged with the second document to be merged to obtain the merged document), the method may further include:
judging whether the second document to be merged is in the final merged document format; if not, executing the following steps:
merging the third document to be merged with the second document to be merged to obtain a merged document, and performing target format conversion on the merged document to obtain a merged target format document, wherein the third document to be merged and the second document to be merged are documents in the same format; or,
converting the second document to be merged into a fourth document, and merging the third document to be merged with the fourth document to obtain a merged document, wherein the third document to be merged and the fourth document are documents in the same format.
When the second document to be merged is already in the final merged document format and the formats of the third document to be merged and the second document to be merged are the same, the two documents are merged in this step in the first, second, or third manner described above, which is not described herein again.
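The format judgment described above can be sketched as follows. This is a hedged illustration only: the `Doc` type, the `convert` and `merge` helpers, and the `convert_first` flag are hypothetical names introduced here, and `convert` merely relabels the format instead of performing a real conversion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    fmt: str            # format label, e.g. "docx" or "pdf" (illustrative values)
    blocks: List[str]   # document content, modelled as a list of blocks


def convert(doc: Doc, target_fmt: str) -> Doc:
    # Hypothetical stand-in for a real format converter: it only relabels the format.
    return Doc(target_fmt, list(doc.blocks))


def merge(third: Doc, second: Doc) -> Doc:
    # Merging is only defined here for documents that share a format.
    if third.fmt != second.fmt:
        raise ValueError("documents must share a format before merging")
    # Appending at the end is an assumption; any insertion position could be used.
    return Doc(second.fmt, second.blocks + third.blocks)


def merge_with_format_check(third: Doc, second: Doc, final_fmt: str,
                            convert_first: bool = False) -> Doc:
    if second.fmt == final_fmt:
        # The second document is already in the final merged document format.
        return merge(third, second)
    if convert_first:
        # Alternative: convert the second document into a fourth document, then merge.
        fourth = convert(second, final_fmt)
        return merge(third, fourth)
    # Otherwise merge first, then convert the merged document to the target format.
    return convert(merge(third, second), final_fmt)
```

Both branches end with a document in the final format; which one is applicable depends on whether the third document to be merged shares its format with the second document or with the converted fourth document.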
In the document merging method shown in fig. 1 provided in the embodiment of the present invention, before merging documents, a display order of data categories in a second document to be merged is obtained through analysis, then, data included in a first document to be merged is extracted, a third document to be merged is generated according to the display order, and then, the third document to be merged and the second document to be merged are merged to obtain a merged document. The third document to be merged is established according to the display sequence of the data categories in the second document to be merged, so that the display sequence of the data categories in the third document to be merged is consistent with the display sequence of the data categories in the second document to be merged, and correspondingly, the display sequence of the data categories of the data corresponding to the first document to be merged and the display sequence of the data categories of the data corresponding to the second document to be merged in the merged document are also consistent, so that the merging efficiency of documents with different formats is improved, a reader can conveniently search and compare the data in the merged document, and the user experience is improved.
In addition, in the document merging process, after the first document to be merged and the second document to be merged are obtained, the user does not need to give instructions of conversion, merging and the like, that is, manual participation is not needed, and document merging can be automatically realized to obtain the merged document. Therefore, the user experience can be further improved.
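The method of fig. 1 can be sketched as follows. This is a minimal sketch under stated assumptions: a document is modelled as a list of (data category, payload) pairs, the display sequence is taken to be the order in which categories first appear in the second document, and categories absent from the second document are placed last. None of these choices, nor the function names, is specified by the embodiment.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # (data category, payload); an illustrative data model


def display_order(second_doc: List[Item]) -> List[str]:
    """Data categories in the order of their first appearance in the second document."""
    order: List[str] = []
    for category, _ in second_doc:
        if category not in order:
            order.append(category)
    return order


def build_third_document(first_doc: List[Item], order: List[str]) -> List[Item]:
    """Regroup the data of the first document by category, following `order`."""
    # One array container per data category, created in display order.
    containers: Dict[str, List[Item]] = {category: [] for category in order}
    for category, payload in first_doc:
        # Categories missing from the second document get a container at the end.
        containers.setdefault(category, []).append((category, payload))
    third_doc: List[Item] = []
    for category in containers:  # dicts preserve insertion order
        third_doc.extend(containers[category])
    return third_doc


second = [("text", "intro"), ("picture", "fig1"), ("text", "more")]
first = [("picture", "p1"), ("text", "t1"), ("picture", "p2"), ("audio", "a1")]
third = build_third_document(first, display_order(second))
```

In this example the third document lists the text item first and the pictures second, matching the second document, with the audio item appended afterwards.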
Referring to fig. 2, fig. 2 is another schematic flow chart of a document merging method provided in the embodiment of the present invention, and the specific process includes:
Step 201, a first document to be merged and a second document to be merged are obtained.
Step 202, performing data category analysis on the data contained in the second document to be merged to obtain a display sequence of the data categories in the second document to be merged.
Step 203, an array container set including a plurality of array containers is established, wherein one array container corresponds to one data category.
Step 204, dividing the data contained in the first document to be merged into a plurality of data subsets according to a preset dividing mode.
The preset dividing mode may be as follows: the data contained in the first document to be merged may be divided into a plurality of data subsets according to the chapters to which the data belong; or according to page breaks or page numbers; or according to the types of titles to which the data belong. The specific content of the preset dividing mode is not limited here.
Step 205, according to the difference of the data types, adding the data contained in each data subset and the identification information of the data subset to which the data belong to the corresponding array container.
The data categories may include: pictures, text, links, audio, video, and the like.
Step 206, a third document is created for each data subset, and the data belonging to the data subset in each array container is sequentially added to the third document corresponding to the data subset according to the display sequence of the data categories in the second document to be merged, so as to obtain the third document to be merged corresponding to the data subset.
In this step, the format of the third document to be merged and the format of the second document to be merged are the same format.
Step 207, receiving a second insert instruction; the second insertion instruction includes insertion position information of each third document to be merged.
In this step, the insertion positions of the third documents to be merged may be different insertion positions, each insertion position may be determined by the user according to actual needs, and may also be any position in the second document to be merged.
Step 208, inserting each third document to be merged into the second document to be merged according to the insertion position information of each third document to be merged included in the second insertion instruction to obtain a merged document.
Step 209, performing target format conversion on the merged document to obtain a merged target format document.
After the merged document is obtained, the format of the merged document may not be the target format required by the user, and in this case, the merged document may be subjected to target format conversion through step 209, so as to obtain the merged target format document. The method for obtaining the merged document may be any one of the foregoing methods.
In the document merging method shown in fig. 2 provided in the embodiment of the present invention, before merging documents, a display order of data categories in a second document to be merged is obtained through analysis, and data included in a first document to be merged is divided into a plurality of data subsets according to a preset dividing manner, then, according to the display order of the data categories in the second document to be merged, a third document to be merged corresponding to each data subset is newly created, and then, each third document to be merged is merged with the second document to be merged, so that a merged document is obtained. Because each third document to be merged is established according to the display sequence of the data categories in the second document to be merged, the display sequence of the data categories in each third document to be merged is consistent with the display sequence of the data categories in the second document to be merged, and correspondingly, the display sequence of the data categories of the data corresponding to each data subset in the first document to be merged and the display sequence of the data categories of the data corresponding to the second document to be merged in the finally merged document are also consistent, so that a reader can conveniently search and compare the data in the merged document, and the user experience is improved.
Meanwhile, data contained in the first document to be merged is divided into a plurality of data subsets, after a third document to be merged corresponding to each data subset is newly built, the third document to be merged is inserted into a proper position in the second document to be merged according to the insertion position information of each third document to be merged, and therefore after the first document to be merged is split, the fragment segments are inserted into the proper positions in the second document to be merged according to user requirements, the obtained merged document is more in line with the viewing habits of users, and therefore user experience is further improved.
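Steps 204 to 208 can be sketched as follows. This is an illustrative sketch only: each data item is assumed to already carry the identification of its data subset (for example a chapter label), insertion positions are modelled as block indices into the original second document, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Item = Tuple[str, str, str]  # (subset id, data category, payload); illustrative model


def build_subset_documents(first_doc: List[Item], order: List[str]) -> Dict[str, List[str]]:
    """Create one third document per data subset, ordered by data category."""
    # One array container per category; every entry keeps its subset identification.
    containers: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    subset_ids: List[str] = []
    for subset_id, category, payload in first_doc:
        containers[category].append((subset_id, payload))
        if subset_id not in subset_ids:
            subset_ids.append(subset_id)
    # Categories unknown to the second document are placed after the known ones.
    categories = order + [c for c in containers if c not in order]
    thirds: Dict[str, List[str]] = {}
    for subset_id in subset_ids:
        thirds[subset_id] = [payload
                             for category in categories
                             for sid, payload in containers.get(category, [])
                             if sid == subset_id]
    return thirds


def insert_all(second_doc: List[str], thirds: Dict[str, List[str]],
               positions: Dict[str, int]) -> List[str]:
    """Insert every third document at its own position in the second document."""
    merged = list(second_doc)
    # Work from the highest position down so that lower indices stay valid.
    for subset_id in sorted(thirds, key=lambda s: positions[s], reverse=True):
        pos = positions[subset_id]
        merged[pos:pos] = thirds[subset_id]
    return merged


first = [("ch1", "picture", "p1"), ("ch1", "text", "t1"),
         ("ch2", "text", "t2"), ("ch2", "picture", "p2")]
thirds = build_subset_documents(first, ["text", "picture"])
merged = insert_all(["S0", "S1", "S2"], thirds, {"ch1": 1, "ch2": 3})
```

Inserting from the highest position downwards is a design choice of this sketch; it lets all positions refer to the second document as it was before any insertion.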
Based on the same inventive concept, according to the document merging method provided by the above embodiment of the present invention, accordingly, an embodiment of the present invention provides a document merging device, a schematic structural diagram of which is shown in fig. 3, including:
an obtaining module 301, configured to obtain a first document to be merged and a second document to be merged;
a display order obtaining module 302, configured to perform data category analysis on data included in the second document to be merged to obtain a display order of data categories in the second document to be merged;
a third to-be-merged document generating module 303, configured to extract data included in the first to-be-merged document, and generate a third to-be-merged document according to the display order;
and the document merging module 304 is configured to merge the third document to be merged with the second document to be merged to obtain a merged document.
Further, the third to-be-merged document generating module 303 includes:
the array container set establishing submodule is used for establishing an array container set comprising a plurality of array containers, wherein one array container corresponds to one data category;
the data adding submodule is used for respectively adding the data contained in the first document to be merged into the corresponding array containers according to different data types;
and the document new sub-module is used for newly building a third document, and sequentially adding the data in each array container to the third document according to the display sequence to obtain a third document to be merged corresponding to the first document to be merged.
Further, the apparatus further comprises:
the judging module is used for judging whether the second document to be merged is the final merged format document or not before merging the third document to be merged with the second document to be merged to obtain a merged document; if not, triggering the document merging module to execute the following steps: merging the third document to be merged with the second document to be merged to obtain a merged document; performing target format conversion on the merged document to obtain a merged target format document; or converting the second document to be merged into a fourth document, and merging the third document to be merged and the fourth document to obtain a merged document.
Further, when the format of the third document to be merged is the same as that of the second document to be merged, the document merging module 304 is specifically configured to:
receiving a first insertion instruction; the first insertion instruction comprises insertion position information;
and inserting the third document to be merged into the second document to be merged according to the insertion position information contained in the first insertion instruction to obtain the merged document.
Further, when the format of the third document to be merged is the same as that of the second document to be merged, the data adding sub-module is specifically configured to: dividing data contained in a first document to be merged into a plurality of data subsets according to a preset dividing mode; respectively adding the data contained in each data subset and the identification information of the data subset to which the data belong to the data subsets into corresponding array containers according to different data types;
the document new sub-module is specifically used for: respectively creating a third document for each data subset, and sequentially adding the data belonging to the data subset in each array container to the third document corresponding to the data subset according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the data subset;
the document merging module 304 is specifically configured to: receiving a second insertion instruction; the second insertion instruction comprises insertion position information of each third document to be merged; and inserting the third documents to be merged into the second documents to be merged according to the insertion position information of the third documents to be merged, which is contained in the second insertion instruction, so as to obtain the merged documents.
Further, the number of the first documents to be merged is multiple;
the array container set establishing submodule is specifically used for: respectively establishing an array container set containing a plurality of array containers for each first document to be merged;
the data adding submodule is specifically used for: for each first document to be merged, in an array container set corresponding to the first document to be merged, adding data contained in the first document to be merged to corresponding array containers respectively according to different data types;
the document new sub-module is specifically used for: and respectively creating a third document for each first document to be merged, and sequentially adding data in each array container in the array container set corresponding to the first document to be merged to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
Further, the number of the first documents to be merged is multiple;
the array container set establishing submodule is specifically used for: establishing an array container set comprising a plurality of array containers;
the data adding submodule is specifically used for: for each first document to be merged, adding the data contained in the document and the identification information of the first document to be merged into a corresponding array container according to different data types;
the document new sub-module is specifically used for: and respectively creating a third document for each first document to be merged, and sequentially adding the data belonging to the first document to be merged in each array container to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
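The two ways of handling a plurality of first documents to be merged can be contrasted in a short sketch. The data model (a mapping from a document identifier to a list of (data category, payload) pairs) and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # (data category, payload); illustrative data model


def per_document_sets(first_docs: Dict[str, List[Item]], order: List[str]) -> Dict[str, List[str]]:
    """Variant 1: a separate array container set for each first document."""
    thirds: Dict[str, List[str]] = {}
    for doc_id, items in first_docs.items():
        containers: Dict[str, List[str]] = {category: [] for category in order}
        for category, payload in items:
            containers.setdefault(category, []).append(payload)
        thirds[doc_id] = [p for category in containers for p in containers[category]]
    return thirds


def shared_set(first_docs: Dict[str, List[Item]], order: List[str]) -> Dict[str, List[str]]:
    """Variant 2: one array container set; entries carry document identification."""
    containers: Dict[str, List[Tuple[str, str]]] = {category: [] for category in order}
    for doc_id, items in first_docs.items():
        for category, payload in items:
            containers.setdefault(category, []).append((doc_id, payload))
    # The identification information is used to pick out each document's data again.
    return {doc_id: [p for category in containers
                     for d, p in containers[category] if d == doc_id]
            for doc_id in first_docs}


docs = {"a": [("picture", "pa"), ("text", "ta")],
        "b": [("text", "tb1"), ("text", "tb2")]}
order = ["text", "picture"]
```

For categories that appear in `order`, both variants yield the same third documents; for categories outside `order`, the relative placement in this sketch may differ between the two variants.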
In the document merging device shown in fig. 3 provided in the embodiment of the present invention, before merging documents, a display order of data categories in a second document to be merged is obtained through analysis, then, data included in a first document to be merged is extracted, a third document to be merged is generated according to the display order, and then, the third document to be merged and the second document to be merged are merged to obtain a merged document. The third document to be merged is established according to the display sequence of the data categories in the second document to be merged, so that the display sequence of the data categories in the third document to be merged is consistent with the display sequence of the data categories in the second document to be merged, and correspondingly, the display sequence of the data categories of the data corresponding to the first document to be merged and the display sequence of the data categories of the data corresponding to the second document to be merged in the merged document are also consistent, so that the merging efficiency of documents with different formats is improved, a reader can conveniently search and compare the data in the merged document, and the user experience is improved.
In addition, in the document merging process, after the first document to be merged and the second document to be merged are obtained, the user does not need to give instructions of conversion, merging and the like, that is, manual participation is not needed, and document merging can be automatically realized to obtain the merged document. Therefore, the user experience can be further improved.
Based on the same inventive concept, according to the document merging method provided by the above embodiment of the present invention, correspondingly, the embodiment of the present invention further provides an electronic device, as shown in fig. 4, including a processor 401, a communication interface 402, a memory 403, and a communication bus 404, where the processor 401, the communication interface 402, and the memory 403 complete mutual communication through the communication bus 404.
A memory 403 for storing a computer program;
the processor 401, when executing the program stored in the memory 403, at least implements the following steps:
acquiring a first document to be merged and a second document to be merged;
performing data category analysis on data contained in the second document to be merged to obtain a display sequence of the data categories in the second document to be merged;
extracting data contained in the first document to be merged, and generating a third document to be merged according to the display sequence;
and merging the third document to be merged and the second document to be merged to obtain a merged document.
Further, other processing flows in the above document merging method provided by the embodiment of the present invention may also be included, and are not described in detail here.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the above-mentioned document merging methods.
In yet another embodiment, a computer program product containing instructions is provided, which when run on a computer, causes the computer to perform any of the above-described document merging methods.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus and device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. A method for merging documents, the method comprising:
acquiring a first document to be merged and a second document to be merged;
performing data category analysis on data contained in the second document to be merged to obtain a display sequence of data categories in the second document to be merged;
extracting data contained in the first document to be merged, and generating a third document to be merged according to the display sequence;
and merging the third document to be merged with the second document to be merged to obtain a merged document.
2. The method according to claim 1, wherein the extracting data contained in the first document to be merged and generating a third document to be merged according to the display sequence comprises:
establishing an array container set comprising a plurality of array containers, wherein one array container corresponds to one data category;
respectively adding the data contained in the first to-be-merged document into corresponding array containers according to different data types;
and creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged.
3. The method according to claim 1, wherein before merging the third document to be merged with the second document to be merged to obtain a merged document, the method further comprises:
judging whether the second document to be merged is a final merged format document; if not, the following steps are executed:
merging the third document to be merged with the second document to be merged to obtain a merged document; performing target format conversion on the merged document to obtain a merged target format document; or,
and converting the second document to be merged into a fourth document, and merging the third document to be merged and the fourth document to obtain a merged document.
4. The method according to claim 1 or 2, wherein when the formats of the third document to be merged and the second document to be merged are the same, merging the third document to be merged and the second document to be merged to obtain a merged document, includes:
receiving a first insertion instruction; the first insertion instruction comprises insertion position information;
and inserting the third document to be merged into the second document to be merged according to the insertion position information contained in the first insertion instruction to obtain a merged document.
5. The method according to claim 2, wherein when the third document to be merged is in the same format as the second document to be merged, the adding the data contained in the first document to be merged to the corresponding array containers according to different data categories comprises:
dividing the data contained in the first document to be merged into a plurality of data subsets according to a preset dividing mode;
respectively adding the data contained in each data subset and the identification information of the data subset to which the data belong to the data subsets into corresponding array containers according to different data types;
the newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged, including:
respectively creating a third document for each data subset, and sequentially adding the data belonging to the data subset in each array container to the third document corresponding to the data subset according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the data subset;
the merging the third document to be merged with the second document to be merged to obtain a merged document, including:
receiving a second insertion instruction; the second insertion instruction comprises insertion position information of each third document to be merged;
and inserting the third documents to be merged into the second document to be merged according to the insertion position information of the third documents to be merged, which is contained in the second insertion instruction, to obtain the merged document.
6. The method according to claim 2, wherein the number of the first to-be-merged documents is plural;
the establishing of the array container set comprising a plurality of array containers comprises the following steps:
respectively establishing an array container set containing a plurality of array containers for each first document to be merged;
the adding the data contained in the first document to be merged to the corresponding array containers according to different data types comprises:
for each first document to be merged, in an array container set corresponding to the first document to be merged, adding data contained in the first document to be merged to corresponding array containers respectively according to different data types;
the newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged, including:
and respectively creating a third document for each first document to be merged, and sequentially adding data in each array container in the array container set corresponding to the first document to be merged to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
7. The method according to claim 2, wherein the number of the first to-be-merged documents is plural;
the establishing of the array container set comprising a plurality of array containers comprises the following steps:
establishing an array container set comprising a plurality of array containers;
the adding the data contained in the first document to be merged to the corresponding array containers according to different data types comprises:
for each first document to be merged, adding the data contained in the document and the identification information of the first document to be merged into a corresponding array container according to different data types;
the newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence of the data types in the second document to be merged to obtain a third document to be merged corresponding to the first document to be merged, including:
and respectively creating a third document for each first document to be merged, and sequentially adding the data belonging to the first document to be merged in each array container to the third document corresponding to the first document to be merged according to the display sequence of the data categories in the second document to be merged to obtain the third document to be merged corresponding to the first document to be merged.
8. A document merge device, comprising:
the acquiring module is used for acquiring a first document to be merged and a second document to be merged;
a display order obtaining module, configured to perform data category analysis on data included in the second document to be merged to obtain a display order of data categories in the second document to be merged;
the third document to be merged generating module is used for extracting data contained in the first document to be merged and generating a third document to be merged according to the display sequence;
and the document merging module is used for merging the third document to be merged and the second document to be merged to obtain a merged document.
9. The apparatus of claim 8, wherein the third to-be-merged document generation module comprises:
the array container set establishing submodule is used for establishing an array container set comprising a plurality of array containers, wherein one array container corresponds to one data category;
the data adding submodule is used for respectively adding the data contained in the first to-be-merged document into the corresponding array containers according to different data types;
and the document new sub-module is used for newly creating a third document, and sequentially adding the data in each array container to the third document according to the display sequence to obtain a third document to be merged corresponding to the first document to be merged.
10. The apparatus of claim 8, further comprising:
a judging module, configured to judge, before the third document to be merged is merged with the second document to be merged, whether the second document to be merged is a document in the final merged format; and, if not, to trigger the document merging module to perform either of the following: merging the third document to be merged with the second document to be merged to obtain a merged document, and performing target format conversion on the merged document to obtain a merged document in the target format; or converting the second document to be merged into a fourth document, and merging the third document to be merged with the fourth document to obtain a merged document.
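The two alternatives in this claim (merge and then convert, or convert and then merge) can be sketched as a single branch. Everything here is hypothetical: the `Doc` type, the format labels, and a `convert` function that merely relabels the format as a stand-in for a real converter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    fmt: str                                   # e.g. "docx" or "pdf"; hypothetical labels
    items: List[str] = field(default_factory=list)


def convert(doc: Doc, target_fmt: str) -> Doc:
    """Placeholder for a real format converter: it only relabels the format."""
    return Doc(target_fmt, list(doc.items))


def merge(third: Doc, other: Doc) -> Doc:
    """Append third's items after other's; the result keeps other's format."""
    return Doc(other.fmt, other.items + third.items)


def merge_to_target(third: Doc, second: Doc, target_fmt: str,
                    convert_first: bool = False) -> Doc:
    """Judge the second document's format, then pick a merge path."""
    if second.fmt == target_fmt:
        return merge(third, second)            # already in the final merged format
    if convert_first:
        fourth = convert(second, target_fmt)   # second document -> fourth document
        return merge(third, fourth)
    merged = merge(third, second)              # merge first ...
    return convert(merged, target_fmt)         # ... then convert the merged result
```

With this trivial converter both paths give the same result; with a real converter they could differ in cost or fidelity, which may be why the claim leaves both open.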
11. The apparatus according to claim 8 or 9, wherein when the third document to be merged is in the same format as the second document to be merged, the document merging module is specifically configured to:
receive a first insertion instruction, wherein the first insertion instruction comprises insertion position information;
and insert the third document to be merged into the second document to be merged according to the insertion position information contained in the first insertion instruction, to obtain a merged document.
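A minimal sketch of position-based insertion, assuming that a document is a list of items and that the insertion position is a zero-based index into the second document. The `InsertInstruction` type and that interpretation of "insertion position information" are assumptions, not details given in the claim.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InsertInstruction:
    position: int  # hypothetical: index in the second document to insert before


def insert_document(second: List[str], third: List[str],
                    instruction: InsertInstruction) -> List[str]:
    """Return a merged document with third inserted at the requested position."""
    pos = instruction.position
    if not 0 <= pos <= len(second):
        raise ValueError("insertion position out of range")
    # Slicing builds a new list, so neither input document is modified.
    return second[:pos] + third + second[pos:]
```

For example, inserting at position 0 places the third document before the second, and inserting at `len(second)` appends it.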
12. The apparatus according to claim 9, wherein, when the third document to be merged is in the same format as the second document to be merged, the data adding submodule is specifically configured to: divide the data contained in the first document to be merged into a plurality of data subsets according to a preset division mode; and add the data contained in each data subset, together with the identification information of the data subset to which the data belongs, to the corresponding array containers according to data category;
the document new sub-module is specifically configured to: create a third document for each data subset, and, according to the display order of the data categories in the second document to be merged, sequentially add the data in each array container that belongs to that data subset to the third document corresponding to that data subset, to obtain the third document to be merged corresponding to that data subset;
the document merging module is specifically configured to: receive a second insertion instruction, wherein the second insertion instruction comprises insertion position information for each third document to be merged; and insert each third document to be merged into the second document to be merged according to its insertion position information contained in the second insertion instruction, to obtain the merged document.
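The subset variant can be sketched as follows. Several points are assumptions chosen for illustration: the preset division mode is taken to be fixed-size consecutive chunks, the subset identifier is the chunk index, categories absent from the display order are skipped, and insertion positions are read as indices into the original second document.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, str]  # (data category, content); hypothetical representation


def split_into_subsets(first_doc: List[Item], size: int) -> Dict[int, List[Item]]:
    """Preset division (assumed): consecutive chunks of `size` items."""
    return {i // size: first_doc[i:i + size] for i in range(0, len(first_doc), size)}


def fill_containers(subsets: Dict[int, List[Item]]) -> Dict[str, List[Tuple[int, str]]]:
    """Store each content with its subset id in the container for its category."""
    containers: Dict[str, List[Tuple[int, str]]] = {}
    for subset_id, items in subsets.items():
        for category, content in items:
            containers.setdefault(category, []).append((subset_id, content))
    return containers


def third_docs(containers: Dict[str, List[Tuple[int, str]]], order: List[str],
               subset_ids: List[int]) -> Dict[int, List[Item]]:
    """Build one third document per subset, following the display order."""
    docs: Dict[int, List[Item]] = {sid: [] for sid in subset_ids}
    for category in order:  # categories outside `order` are skipped in this sketch
        for sid, content in containers.get(category, []):
            docs[sid].append((category, content))
    return docs


def insert_all(second: List[Item], docs: Dict[int, List[Item]],
               positions: Dict[int, int]) -> List[Item]:
    """Insert every third document at its own position in the second document."""
    merged = list(second)
    # Insert from the largest position down so earlier indices stay valid.
    for sid in sorted(positions, key=positions.get, reverse=True):
        pos = positions[sid]
        merged[pos:pos] = docs[sid]
    return merged
```

Processing positions in descending order is a design choice of this sketch: it lets every position refer to the untouched second document rather than to a partially merged one.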
13. The apparatus according to claim 9, wherein there are a plurality of first documents to be merged;
the array container set establishing submodule is specifically configured to: establish, for each first document to be merged, an array container set comprising a plurality of array containers;
the data adding submodule is specifically configured to: for each first document to be merged, add the data contained in that first document to be merged to the corresponding array containers, according to data category, in the array container set corresponding to that first document to be merged;
the document new sub-module is specifically configured to: create a third document for each first document to be merged, and, according to the display order of the data categories in the second document to be merged, sequentially add the data in each array container of the array container set corresponding to that first document to be merged to its corresponding third document, to obtain the third document to be merged corresponding to that first document to be merged.
14. The apparatus according to claim 9, wherein there are a plurality of first documents to be merged;
the array container set establishing submodule is specifically configured to: establish one array container set comprising a plurality of array containers;
the data adding submodule is specifically configured to: for each first document to be merged, add the data contained in that document, together with the identification information of that first document to be merged, to the corresponding array containers according to data category;
the document new sub-module is specifically configured to: create a third document for each first document to be merged, and, according to the display order of the data categories in the second document to be merged, sequentially add the data in each array container that belongs to that first document to be merged to its corresponding third document, to obtain the third document to be merged corresponding to that first document to be merged.
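Claims 13 and 14 differ only in how the array containers are organized when there are several first documents: one container set per document, or one shared set whose entries carry a document identifier. The sketch below contrasts the two layouts. The document ids, the (category, content) item representation, and the function names are all hypothetical, and items whose category is missing from the display order are simply left out.

```python
from typing import Dict, List, Tuple

Item = Tuple[str, str]        # (data category, content); hypothetical representation
Docs = Dict[str, List[Item]]  # document id -> items


def per_document_sets(first_docs: Docs, order: List[str]) -> Docs:
    """Claim 13 layout: a separate container set for every first document."""
    result: Docs = {}
    for doc_id, items in first_docs.items():
        containers: Dict[str, List[str]] = {c: [] for c in order}
        for category, content in items:
            containers.setdefault(category, []).append(content)
        result[doc_id] = [(c, x) for c in order for x in containers[c]]
    return result


def shared_set(first_docs: Docs, order: List[str]) -> Docs:
    """Claim 14 layout: one container set; each entry carries its document id."""
    containers: Dict[str, List[Tuple[str, str]]] = {c: [] for c in order}
    for doc_id, items in first_docs.items():
        for category, content in items:
            containers.setdefault(category, []).append((doc_id, content))
    result: Docs = {doc_id: [] for doc_id in first_docs}
    for c in order:
        # The stored id routes each entry back to its own third document.
        for doc_id, content in containers[c]:
            result[doc_id].append((c, content))
    return result
```

Both layouts yield the same third documents; the shared set uses fewer containers at the cost of storing an identifier with every entry.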
15. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 7 when executing the program stored in the memory.
16. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps of any one of claims 1 to 7.
CN202011069543.0A 2020-09-30 2020-09-30 Document merging method and device and electronic equipment Pending CN112784527A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011069543.0A CN112784527A (en) 2020-09-30 2020-09-30 Document merging method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112784527A true CN112784527A (en) 2021-05-11

Family

ID=75750476

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011069543.0A Pending CN112784527A (en) 2020-09-30 2020-09-30 Document merging method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112784527A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115688711A (en) * 2022-12-30 2023-02-03 中化现代农业有限公司 Document merging method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0550694A (en) * 1991-08-28 1993-03-02 Nec Corp Form mergence
JP2006309398A (en) * 2005-04-27 2006-11-09 Nec Fielding Ltd Document merge system, data processor, document merge method and program
CN103390005A (en) * 2012-05-11 2013-11-13 北大方正集团有限公司 Method and system for merging documents
US20160232226A1 (en) * 2015-02-06 2016-08-11 International Business Machines Corporation Identifying categories within textual data
CN109697281A (en) * 2018-12-17 2019-04-30 万兴科技股份有限公司 The online method, apparatus and electronic equipment for merging document
JP2019133534A (en) * 2018-02-02 2019-08-08 富士通株式会社 Merging method, merging device, and merging program
CN110909123A (en) * 2019-10-23 2020-03-24 深圳价值在线信息科技股份有限公司 Data extraction method and device, terminal equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
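Pang Shilong (庞世隆): "A Brief Discussion of Layout Techniques for Mixed Long Documents in Word" (浅谈Word混合长文档的编排技巧), 无线互联科技 (Wireless Internet Technology), no. 07, page 51 *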

Similar Documents

Publication Publication Date Title
CN109344241B (en) Information recommendation method and device, terminal and storage medium
US9495347B2 (en) Systems and methods for extracting table information from documents
CN108108342B (en) Structured text generation method, search method and device
CN111444750B (en) PDF document identification method and device and electronic equipment
US10853564B2 (en) Operation for copied content
CN113377653B (en) Method and device for generating test cases
CN105843800A (en) DOI-based language information display method and device
CN115982376B (en) Method and device for training model based on text, multimode data and knowledge
CN110968314B (en) Page generation method and device
CN104899203B (en) Webpage generation method and device and terminal equipment
CN105260459A (en) Search method and apparatus
JP2019522847A (en) Method, device and terminal device for extracting data
CN107729491B (en) Method, device and equipment for improving accuracy rate of question answer search
CN112784527A (en) Document merging method and device and electronic equipment
CN110232155B (en) Information recommendation method for browser interface and electronic equipment
CN107168627B (en) Text editing method and device for touch screen
CN114741144B (en) Web-side complex form display method, device and system
US20220301285A1 (en) Processing picture-text data
CN115563942A (en) Contract generation method and device, electronic equipment and computer readable medium
CN114239501A (en) Contract generation method, apparatus, device and medium
CN114911753A (en) Method and device for generating presentation document, electronic equipment and storage medium
CN113420042A (en) Data statistics method, device, equipment and storage medium based on presentation
CN110895924B (en) Method and device for reading document content aloud, electronic equipment and readable storage medium
CN111666522A (en) Information processing method, device, equipment and storage medium
CN112417832A (en) Format conversion method and device for spreadsheet document and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination