CN109740125A - Update lookup method, device, storage medium and equipment for Documents Comparison - Google Patents

Update lookup method, device, storage medium and equipment for Documents Comparison

Info

Publication number
CN109740125A
Authority
CN
China
Prior art keywords
file
common
gap
common element
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811625251.3A
Other languages
Chinese (zh)
Other versions
CN109740125B (en
Inventor
韩志刚
宋洋
于广伟
姜楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp
Priority to CN201811625251.3A
Publication of CN109740125A
Application granted
Publication of CN109740125B
Active
Anticipated expiration

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present disclosure relates to an update lookup method, device, storage medium and equipment for file comparison. The method comprises: taking the content of each preset unit in a first file and a second file as one element, and comparing the first file with the second file to obtain their longest common subsequence; and, after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file. The comparison of the files and the alignment of their common content can thus be achieved without relying on a complex algorithm, so that the updated content between the files can be found from the aligned common content, which lowers the implementation difficulty.

Description

Update lookup method, device, storage medium and equipment for Documents Comparison
Technical field
The present disclosure relates to the technical field of text processing, and in particular to an update lookup method, device, storage medium and electronic equipment for file comparison.
Background
In everyday use, comparing files or texts is a common requirement in many fields, for example comparing the articles in two files (such as two Word documents) or comparing the code in two files (such as the code differences between two scripts). File comparison is typically performed to align the lines or paragraphs of two files, so as to find where their content is related and where it differs.
Because file comparison helps a user quickly find the related and differing parts of two files, it is an important everyday function for individual work and multi-person collaboration alike, and can improve the user's working efficiency. In current software development, for example, work is mostly completed through multi-person collaboration; for a file modified by someone else, file comparison quickly locates the identical content and the positions that differ, which facilitates subsequent processing by the collaborators and reduces the developers' workload.
At present, existing methods for finding the updated parts between two files by file comparison are mostly based on relatively complex algorithms and are difficult to implement. Here, an update can be understood as content that is entirely different between the two files, or content that is not exactly the same, i.e. parts whose content is partly identical in the two files but which also contain differing content or attributes.
Summary of the invention
The purpose of the present disclosure is to provide an update lookup method, device, storage medium and electronic equipment for file comparison, so as to solve the problem that existing update lookup methods use complex algorithms and are difficult to implement.
To achieve the above purpose, a first aspect of the present disclosure provides an update lookup method for file comparison, the method comprising:
taking the content of each preset unit in a first file and a second file as one element, and comparing the first file with the second file to obtain the longest common subsequence of the first file and the second file;
after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file.
Optionally, the method further comprises:
performing a character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical;
performing a character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters and updated characters of the updated elements in each pair of common-element gaps.
Optionally, the step of determining the updated elements in the first file and the second file, after the index alignment, according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file comprises:
determining, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file, the deleted elements being the elements of the first file other than the common elements, and the added elements being the elements of the second file other than the common elements;
index-aligning the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
after the index alignment, determining the positional correspondence between the common-element gaps in the first file and the common-element gaps in the second file;
determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file.
Optionally, the step of determining the updated elements according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file comprises:
when n deleted elements exist in a first gap between common elements in the first file and m added elements exist in a second gap between common elements in the second file, determining the n deleted elements in the first gap and the m added elements in the second gap as updated elements, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
establishing a correspondence between the indexes of the n deleted elements in the first gap and the indexes of the m added elements in the second gap.
Optionally, the step of performing a character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical, comprises:
performing an attribute comparison between each character of a first common element in the first file and the corresponding character of a second common element in the second file, to determine whether the first common element and the second common element contain characters with different attributes, the first common element being any common element in the first file and the second common element being the common element in the second file that is index-aligned with the first common element;
when at least one character has different attributes in the first common element and the second common element, determining the first common element and the second common element as an attribute update;
when no character has different attributes in the first common element and the second common element, determining the first common element and the second common element as identical.
Optionally, the step of performing a character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common content, deleted content, added content and updated content of the updated elements in each pair of common-element gaps, comprises:
merging the n deleted elements in the first gap between common elements in the first file into a first element;
merging the m added elements in the second gap between common elements in the second file into a second element, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
performing a character comparison on the first element and the second element, to determine the common characters, deleted characters, added characters and updated characters of the first element and the second element.
A second aspect provides an update lookup device for file comparison, the device comprising:
a comparison module, configured to take the content of each preset unit in a first file and a second file as one element and compare the first file with the second file, to obtain the longest common subsequence of the first file and the second file;
an update determining module, configured to, after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determine the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file.
Optionally, the device further comprises:
a first update identification module, configured to perform a character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical;
a second update identification module, configured to perform a character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters and updated characters of the updated elements in each pair of common-element gaps.
Optionally, the update determining module comprises:
an element identification submodule, configured to determine, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file, the deleted elements being the elements of the first file other than the common elements, and the added elements being the elements of the second file other than the common elements;
an element alignment submodule, configured to index-align the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
a gap correspondence submodule, configured to determine, after the index alignment, the positional correspondence between the common-element gaps in the first file and the common-element gaps in the second file;
an element determining submodule, configured to determine the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file.
Optionally, the element determining submodule is configured to:
when n deleted elements exist in a first gap between common elements in the first file and m added elements exist in a second gap between common elements in the second file, determine the n deleted elements in the first gap and the m added elements in the second gap as updated elements, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
establish a correspondence between the indexes of the n deleted elements in the first gap and the indexes of the m added elements in the second gap.
Optionally, the first update identification module comprises:
an attribute comparison submodule, configured to perform an attribute comparison between each character of a first common element in the first file and the corresponding character of a second common element in the second file, to determine whether the first common element and the second common element contain characters with different attributes, the first common element being any common element in the first file and the second common element being the common element in the second file that is index-aligned with the first common element;
a determining submodule, configured to determine the first common element and the second common element as an attribute update when at least one character has different attributes in the first common element and the second common element;
the determining submodule being further configured to determine the first common element and the second common element as identical when no character has different attributes in the first common element and the second common element.
Optionally, the second update identification module comprises:
a merging submodule, configured to merge the n deleted elements in the first gap between common elements in the first file into a first element;
the merging submodule being further configured to merge the m added elements in the second gap between common elements in the second file into a second element, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
a character comparison submodule, configured to perform a character comparison on the first element and the second element, to determine the common characters, deleted characters, added characters and updated characters of the first element and the second element.
A third aspect provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method of the first aspect.
A fourth aspect provides an electronic device, comprising: a memory storing a computer program; and
a processor, configured to execute the computer program in the memory to implement the steps of the method of the first aspect.
In the above technical solution, the content of each preset unit in the first file and the second file is taken as one element, and the first file is compared with the second file to obtain their longest common subsequence. After the common elements in the first file are index-aligned with the common elements in the second file according to the longest common subsequence, the updated elements in the first file and the second file are determined according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file. The comparison of the files and the alignment of their common content can thus be achieved without relying on a complex algorithm, so that the updated content between the files can be found from the aligned common content, which lowers the implementation difficulty.
Other features and advantages of the present disclosure are described in detail in the detailed description that follows.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the present disclosure and form part of the specification. Together with the following detailed description they serve to explain the present disclosure, but do not limit it. In the drawings:
Fig. 1 is a flow diagram of an update lookup method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 2 is a flow diagram of another update lookup method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 3 is a flow diagram of another update lookup method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 4 is a schematic diagram of a method for determining updated lines according to an exemplary embodiment of the present disclosure;
Fig. 5 is a flow diagram of yet another update lookup method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 6 is a flow diagram of yet another update lookup method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 7 is a schematic diagram of a method for marking updated lines according to an exemplary embodiment of the present disclosure;
Fig. 8 is a block diagram of an update lookup device for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 9 is a block diagram of an update determining module according to an exemplary embodiment of the present disclosure;
Fig. 10 is a block diagram of a first update identification module according to an exemplary embodiment of the present disclosure;
Fig. 11 is a block diagram of a second update identification module according to an exemplary embodiment of the present disclosure;
Fig. 12 is a block diagram of an electronic device according to an exemplary embodiment.
Detailed description
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here merely describe and explain the present disclosure and do not limit it.
Fig. 1 is a flow diagram of an update lookup method for file comparison according to an exemplary embodiment of the present disclosure. As shown in Fig. 1, the method comprises:
Step 101: taking the content of each preset unit in the first file and the second file as one element, compare the first file with the second file to obtain the longest common subsequence of the first file and the second file.
Before the first file and the second file are compared, they first need to be serialized. Serialization can be understood as treating the content of each preset unit in a file as one element, so that a file can be regarded as an element sequence composed of these elements in order. For example, the first file and the second file may be files containing text or code, and the preset unit may be a word, a sentence, a line or a paragraph, set as required; that is, a word, a sentence, a line or a paragraph is treated as a whole, as one element.
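The serialization described above can be sketched as follows. This is only a minimal illustration: the function name `serialize`, the `unit` parameter and the rule that paragraphs are separated by blank lines are assumptions made for the example, not details given in the disclosure.

```python
import re


def serialize(text: str, unit: str = "line") -> list[str]:
    """Split text into an ordered sequence of elements, one per preset unit."""
    if unit == "line":
        return text.splitlines()
    if unit == "paragraph":
        # Assumption: paragraphs are separated by one or more blank lines.
        return [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    if unit == "word":
        # Assumption: words are separated by whitespace.
        return text.split()
    raise ValueError(f"unsupported unit: {unit}")
```

Each returned list plays the role of the element sequence of one file; the position of an element in the list can later serve as its index number.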
Step 102: after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determine the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file.
Here, a common-element gap is the gap between two adjacent common elements; the position before the first common element and the position after the last common element are also gaps.
For example, since the content of each preset unit in the first file and the second file is taken as one element, an element sequence corresponding to the first file and an element sequence corresponding to the second file are obtained. Comparing the two files on the basis of these element sequences determines their longest common subsequence, i.e. the largest common part of the first file and the second file in which the elements appear in the same order.
As an illustration, assume that the line is the preset unit, so that each line in the first file and the second file is one element. If each line is represented by a letter, the serialized first file and second file can be expressed as the following sequences:
First file = AAACCGTGAGTTATTCGTTCTAGA
Second file=CACCCCTAAGGTACCTTTGGTT
Each letter in the first file and the second file represents one line of the file, and the order of the letters in the sequence is the order of the lines they represent in the file. Comparing the sequence of the first file with the sequence of the second file determines their longest common subsequence, i.e. the longest sequence of lines that have identical content and appear in the same order in both files. Here the longest common subsequence of the first file and the second file is S = ACCTAGTACTTTG. The preset unit may also be a word, a sentence or a paragraph; the longest common subsequence is then determined in the same way as when the line is the unit, which is not repeated here.
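The disclosure does not prescribe how the longest common subsequence is computed. One common choice is the textbook dynamic-programming approach, sketched below; the function name `lcs` is an assumption made for the example. A longest common subsequence is generally not unique, so for the sequences above the function may return a subsequence that differs from S.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b as a list."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the start to recover one subsequence.
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out
```

The function works on any indexable sequence, so it can be applied to a list of lines as well as to a string of characters. Its running time and memory are proportional to the product of the two sequence lengths.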
The elements in the longest common subsequence are the common elements of the first file and the second file (they can be marked as Common elements). The elements of the first file other than the common elements are deleted elements, and the elements of the second file other than the common elements are added elements; each kind can be marked accordingly. Once the common elements, deleted elements and added elements have been determined, the common elements of the first file and the second file can be aligned (also called a handshake). After the common elements are aligned, the positional correspondence between the common-element gaps of the two files is also determined, i.e. for each common-element gap in the first file, the corresponding common-element gap in the second file is known, so that the updated elements (which can be marked as Changed elements) can be determined from the elements in the common-element gaps of the first file and the second file.
In the embodiments of the present disclosure, if a common-element gap of the first file contains at least one deleted element and the positionally corresponding common-element gap of the second file contains at least one added element, those deleted elements and added elements are the updated elements.
Further, as shown in Fig. 2, the method may also comprise:
Step 103: perform a character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical.
File comparison usually ignores character attributes at first, so common elements are elements whose content is identical; whether the characters in them are also identical in attributes has not yet been determined. After the common elements of the two files have been determined, the characters in each pair of common elements can therefore be compared. When the character attributes of the two common elements in a pair are also identical, the pair is determined to be identical (and can be marked as such); when at least one character attribute differs between the two common elements, the pair is determined to be an attribute update (and can be marked as Updated). A pair of positionally corresponding common elements means two common elements in the first file and the second file that have been index-aligned (i.e. for which an index correspondence has been established).
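The attribute comparison of step 103 might look like the sketch below. The disclosure does not say how character attributes are stored, so the example assumes that each character carries a set of attribute names (for instance bold); the names `StyledChar` and `compare_common_pair` and the returned labels are likewise hypothetical.

```python
from typing import NamedTuple


class StyledChar(NamedTuple):
    """A character with its attributes (hypothetical representation)."""
    char: str
    attrs: frozenset = frozenset()  # e.g. frozenset({"bold"})


def compare_common_pair(first, second):
    """Classify a pair of index-aligned common elements.

    Both elements are lists of StyledChar with the same text, so their
    characters correspond by position. Returns "updated" as soon as one
    character differs in attributes, otherwise "same".
    """
    for c1, c2 in zip(first, second):
        if c1.attrs != c2.attrs:
            return "updated"
    return "same"
```

For example, a line whose text is unchanged but whose first character has been made bold would be reported as an attribute update rather than as identical.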
Step 104: perform a character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters and updated characters of the updated elements in each pair of common-element gaps.
For example, as explained in the steps above, a pair of common-element gaps means two positionally corresponding common-element gaps in the first file and the second file, and the updated elements are a group of deleted elements and added elements found in such a pair of gaps. Being determined as updated elements means that the deleted elements and added elements in the group may be entirely different or not exactly the same. Taking the line as the preset unit, let A denote the deleted line in any common-line gap in the first file, and let B denote the added line in the second file at the position corresponding to A (i.e. the common-line gap containing A in the first file corresponds to the common-line gap containing B in the second file). After A and B are determined to be updated lines, the characters of A can be compared with the characters of B using the same method as for the file comparison above: A and B are first compared with the character as the unit to obtain their character-level longest common subsequence. The characters in that longest common subsequence are the common characters of A and B; the remaining characters of A are deleted characters, and the remaining characters of B are added characters. When a common-character gap of A (the gap between two common characters, or the position before the first or after the last common character) contains deleted characters and the corresponding common-character gap of B contains added characters, the deleted characters and added characters in that pair of gaps are updated characters.
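The character-level comparison of step 104 can be sketched as below for two updated lines A and B. It assumes a dynamic-programming longest common subsequence, which the disclosure does not mandate, and the function name `char_diff` and the tuple layout of the result are illustrative choices only.

```python
def char_diff(a: str, b: str):
    """Tag the characters of a and b as common, deleted, added or updated.

    Returns a list of (kind, text_from_a, text_from_b) tuples in order.
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            dp[i][j] = (dp[i + 1][j + 1] + 1 if a[i] == b[j]
                        else max(dp[i + 1][j], dp[i][j + 1]))

    ops, gap_del, gap_add = [], "", ""

    def flush():
        # Close the current pair of common-character gaps.
        nonlocal gap_del, gap_add
        if gap_del and gap_add:
            ops.append(("updated", gap_del, gap_add))
        elif gap_del:
            ops.append(("deleted", gap_del, ""))
        elif gap_add:
            ops.append(("added", "", gap_add))
        gap_del, gap_add = "", ""

    i = j = 0
    while i < n or j < m:
        if i < n and j < m and a[i] == b[j]:
            flush()
            ops.append(("common", a[i], b[j]))
            i += 1
            j += 1
        elif j == m or (i < n and dp[i + 1][j] >= dp[i][j + 1]):
            gap_del += a[i]
            i += 1
        else:
            gap_add += b[j]
            j += 1
    flush()
    return ops
```

With A = "cat" and B = "cut", the characters c and t are common, and the gap between them holds a deleted "a" in A and an added "u" in B, so that pair is reported as updated characters.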
Through the above technical solution, the comparison of the files and the alignment of their common content can be achieved without relying on a complex algorithm, so that the updated content between the files can be found from the aligned common content, which lowers the implementation difficulty.
Fig. 3 is a flow diagram of another update lookup method for file comparison according to an exemplary embodiment of the present disclosure. As shown in Fig. 3, the operation of step 102, in which the updated elements in the first file and the second file are determined after the index alignment according to the positional correspondence between the common-element gaps containing the deleted elements in the first file and the common-element gaps containing the added elements in the second file, may comprise the following steps:
Step 1021: according to the longest common subsequence, determine the common elements and deleted elements in the first file and the common elements and added elements in the second file, the deleted elements being the elements of the first file other than the common elements, and the added elements being the elements of the second file other than the common elements.
Here, the common elements in the first file and in the second file are the elements of the longest common subsequence; the deleted elements are the elements of the first file other than the common elements, and the added elements are the elements of the second file other than the common elements.
Again taking the line as the preset unit, the common elements are common lines, the deleted elements are deleted lines and the added elements are added lines. With the sequences of the first file and the second file given above, the common lines of the two files are ACCTAGTACTTTG; the deleted lines are the lines of the first file other than the common lines, i.e. deleted lines = AAGGTTTGCAA; and the added lines are the lines of the second file other than the common lines, i.e. added lines = CCCAGCGTT.
Step 1022: index-align the common elements in the first file with the common elements in the second file by establishing an index correspondence between them.
After the common elements, deleted elements and added elements have been determined, the common elements of the first file and the second file can be index-aligned (also called a handshake). For example, an index number can be assigned in advance to each element of the first file and of the second file according to the order in which it appears in the file. Taking the line as the preset unit, an index number (i.e. a line number) is set for every line of the first file, and likewise for every line of the second file, so that every line in each file has a unique index number. After the common lines of the first file and the second file have been determined, the index alignment is achieved by establishing a correspondence between the index numbers of corresponding common lines. Corresponding lines are the common lines that occupy the same position in the common-line sequence of each file; for example, the first A in the common lines ACCTAGTACTTTG of the first file corresponds to the first A in the common lines ACCTAGTACTTTG of the second file.
Step 1023: after the common elements in the first file have been index-aligned with the common elements in the second file, determine the position correspondence between the common-element gaps in the first file and the common-element gaps in the second file.
Step 1024: according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file, determine the updated elements in the first file and the second file.
Once the common elements of the two files are aligned, the position correspondence between their common-element gaps is also determined; that is, for each common-element gap in the first file, the corresponding common-element gap in the second file is known. The updated elements can then be determined from the elements lying in the common-element gaps of the two files.
The common-element gaps include the position between any two adjacent common elements, the position before the first common element, and the position after the last common element. The position correspondence of common-element gaps can be understood as follows: if a common-element gap in the first file and a common-element gap in the second file occupy the same position relative to the aligned common elements, the two gaps correspond. Therefore, once the deleted elements of the first file and the added elements of the second file have been determined, the relationship between each deleted element and its adjacent common elements can be recorded, which records the common-element gap containing that deleted element; the common-element gap containing each added element can be recorded in the same way.
For two position-corresponding common-element gaps in the two files, if only deleted elements exist without added elements, or only added elements exist without deleted elements, the deleted or added elements in that gap need no further processing. If the position-corresponding gaps contain both deleted elements and added elements, these can be determined to be "updates", which are the updated elements to be found.
Therefore, step 1024 can be implemented as follows:
When n deleted elements exist at a first gap of the common elements in the first file and m added elements exist at a second gap of the common elements in the second file, the n deleted elements in the first gap and the m added elements in the second gap are determined to be updated elements, where the first gap is any common-element gap in the first file and the second gap is the common-element gap whose position corresponds to the first gap. Furthermore, a correspondence can be established between the indices of the n deleted elements in the first gap and the indices of the m added elements in the second gap; that is, the n deleted elements and the m added elements are handshake-aligned as updated elements.
Taking the row as the preset unit, the common-element gaps are common-row gaps. For example, the position before the first A in the common rows ACCTAGTACTTTG of the first file corresponds to the position before the first A in the common rows ACCTAGTACTTTG of the second file, and the position between the first A and the first C in the common rows of the first file corresponds to the position between the first A and the first C in the common rows of the second file.
Therefore, for the first file = AAACCGTGAGTTATTCGTTCTAGA and the second file = CACCCCTAAGGTACCTTTGGTT, the deleted rows AA at the position before the first A of the common rows of the first file correspond to the added row C at the position before the first A of the common rows of the second file; that is, the deleted rows AA and the added row C lie in position-corresponding common-row gaps. There are no deleted rows between the first A and the first C of the common rows of the first file, while the added rows CC exist between the first A and the first C of the common rows of the second file, so this gap contains only added rows. By analogy, the position correspondence of all deleted rows and added rows of the two files across the common-row gaps can be obtained.
For example, Fig. 4 is a schematic diagram of a method for determining updated rows according to an exemplary embodiment of the disclosure. As shown in Fig. 4, each letter in the figure represents one row, so the first file and the second file are shown as sequences. Above the letters of the first file and below the letters of the second file, "=" denotes a common row, "-" a deleted row, "+" an added row, and "c" an updated row (c stands for change). Deleted rows of the first file and added rows of the second file that cannot be aligned with each other are left unprocessed. The "c" symbols show that the two files contain three groups of updated rows in total: AA and C, G and C, and A and GTT. Furthermore, a correspondence can be established between the index of AA and the index of C in the first group, between the index of G and the index of C in the second group, and between the index of A and the index of GTT in the third group.
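The gap correspondence of steps 1023 and 1024 can be sketched as below. The helper names `gap_contents` and `find_updates` are hypothetical, and the common-row positions are hard-coded (0-based) to match the alignment described in the text rather than computed, because a longest common subsequence can usually be aligned in more than one way. The first sequence is taken with G as its tenth character, consistent with the listed deleted rows.

```python
from typing import Dict, List, Sequence, Tuple


def gap_contents(seq: Sequence, common_idx: List[int]) -> Dict[int, List[int]]:
    """Map gap number -> indices of the non-common elements lying in that gap.

    Gap 0 is before the first common element, gap k lies between the k-th and
    (k+1)-th common elements, and gap len(common_idx) is after the last one.
    """
    common = set(common_idx)
    gaps: Dict[int, List[int]] = {}
    gap = 0
    for idx in range(len(seq)):
        if idx in common:
            gap += 1  # passing a common element moves on to the next gap
        else:
            gaps.setdefault(gap, []).append(idx)
    return gaps


def find_updates(a: Sequence, b: Sequence,
                 pairs: List[Tuple[int, int]]) -> List[Tuple[List[int], List[int]]]:
    """Return (deleted indices, added indices) for every gap that holds both kinds."""
    del_gaps = gap_contents(a, [i for i, _ in pairs])
    add_gaps = gap_contents(b, [j for _, j in pairs])
    shared = sorted(del_gaps.keys() & add_gaps.keys())
    return [(del_gaps[g], add_gaps[g]) for g in shared]


first = "AAACCGTGAGTTATTCGTTCTAGA"
second = "CACCCCTAAGGTACCTTTGGTT"
# Assumed alignment of the common rows ACCTAGTACTTTG, following the text
first_common = [2, 3, 4, 6, 8, 9, 10, 12, 15, 17, 18, 20, 22]
second_common = [1, 4, 5, 6, 7, 9, 11, 12, 13, 15, 16, 17, 18]
pairs = list(zip(first_common, second_common))

updates = find_updates(first, second, pairs)
update_text = [("".join(first[i] for i in d), "".join(second[j] for j in a))
               for d, a in updates]
print(update_text)  # [('AA', 'C'), ('G', 'C'), ('A', 'GTT')]
```

Gaps that hold only deleted rows or only added rows never appear in the result, which mirrors the rule that such rows need no further processing.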
Through the above steps, the common elements, deleted elements, added elements, and updated elements of the first file and the second file have been determined. Furthermore, step 103 can be used to check whether each pair of common elements is identical.
Fig. 5 is a flow diagram of yet another update lookup method for file comparison according to an exemplary embodiment of the disclosure. As shown in Fig. 5, step 103 may specifically include the following steps:
Step 1031: compare the attributes of each character in a first common element of the first file with the attributes of the corresponding character in a second common element of the second file, to determine whether the first common element and the second common element contain characters whose attributes differ.
Here, the first common element is any common element in the first file, and the second common element is the common element in the second file that is index-aligned with the first common element.
Step 1032: when at least one character of the first common element differs in attributes from the corresponding character of the second common element, determine the first common element and the second common element to be an attribute update.
Step 1033: when no character of the first common element differs in attributes from the corresponding character of the second common element, determine the first common element and the second common element to be identical.
Character attributes usually include the font, color, font size, whether the character is bold, whether it is underlined, whether it has a font effect (and, if so, the type of effect), and so on. For example, take the first A in the common rows ACCTAGTACTTTG of the first file as the first common element and the first A in the common rows ACCTAGTACTTTG of the second file as the second common element, and assume that the content of both is "123456789". If the "2" in the first common element is red and the "2" in the second common element is blue, the first common element and the second common element can be marked as an attribute update. If the font, color, font size, underlining, bolding, and other attributes of every character of "123456789" are completely consistent between the two, the first common element and the second common element can be marked as identical.
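Steps 1031 to 1033 can be illustrated with a small sketch. The `StyledChar` class, its default attribute values, and the helper names are assumptions made for this example only; the disclosure does not specify how characters and their attributes are stored.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StyledChar:
    """One character plus a few of the display attributes named in the text."""
    char: str
    font: str = "serif"   # hypothetical default values
    color: str = "black"
    size: int = 12
    bold: bool = False
    underline: bool = False

    def attributes(self) -> Tuple:
        return (self.font, self.color, self.size, self.bold, self.underline)


def compare_common_pair(first: List[StyledChar], second: List[StyledChar]) -> str:
    """Return "!=" if any corresponding characters differ in attributes, else "=="."""
    for c1, c2 in zip(first, second):
        if c1.attributes() != c2.attributes():
            return "!="  # step 1032: attribute update
    return "=="          # step 1033: identical


def styled(text: str, colors: Dict[str, str]) -> List[StyledChar]:
    """Build a row; `colors` maps a character to a non-default color."""
    return [StyledChar(ch, color=colors.get(ch, "black")) for ch in text]


row_first = styled("123456789", {"2": "red"})
row_second = styled("123456789", {"2": "blue"})
result_changed = compare_common_pair(row_first, row_second)
result_same = compare_common_pair(row_first, styled("123456789", {"2": "red"}))
print(result_changed, result_same)  # != ==
```

Because a pair of common elements already has the same character content, the sketch compares attributes position by position and stops at the first difference.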
On the other hand, step 104 can be used to further compare the characters of the updated elements, to find the common characters, deleted characters, added characters, and updated characters within a group of updated elements. Fig. 6 is a flow diagram of yet another update lookup method for file comparison according to an exemplary embodiment of the disclosure. As shown in Fig. 6, step 104 may include:
Step 1041: merge the n deleted elements in a first gap of the common elements in the first file into a first element.
Step 1042: merge the m added elements in a second gap of the common elements in the second file into a second element. Here, the first gap is any common-element gap in the first file, and the second gap is the common-element gap whose position corresponds to the first gap.
Step 1043: perform character comparison on the first element and the second element to determine their common characters, deleted characters, added characters, and updated characters.
It can be understood that the n deleted elements may be one or more deleted elements and the m added elements may be one or more added elements; n may or may not equal m.
Whether or not n and m are equal, the character content of the n deleted elements in the first gap and the m added elements in the second gap can be compared as follows:
First, the n deleted elements are merged into one element X and the m added elements are merged into one element Y. The longest common subsequence of X and Y is then computed with the character as the unit. The characters in this longest common subsequence are the common characters of X and Y; the remaining characters of X are deleted characters, and the remaining characters of Y are added characters. When a gap between common characters in X contains deleted characters and the corresponding gap in Y contains added characters, the deleted and added characters in that pair of gaps are updated characters. The common characters, deleted characters, added characters, and updated characters can then each be marked, so that the comparison result is displayed with clearer contrast.
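The character-level comparison just described can be sketched as follows. Plain concatenation is assumed as the way the elements are merged, since the text does not state how the merge is done, and the names `lcs_pairs` and `diff_chars` as well as the operation codes "=", "c", "-", and "+" are illustrative choices.

```python
from typing import List, Sequence, Tuple


def lcs_pairs(a: Sequence, b: Sequence) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), a[i] == b[j], forming one longest common subsequence."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def diff_chars(deleted_rows: List[str], added_rows: List[str]) -> List[Tuple[str, str, str]]:
    """Return (op, text_from_X, text_from_Y) tuples in reading order.

    op is "=" for a common character, "c" for updated characters,
    "-" for deleted-only characters and "+" for added-only characters.
    """
    x = "".join(deleted_rows)  # merged element X (assumed: plain concatenation)
    y = "".join(added_rows)    # merged element Y
    ops = []
    i = j = 0
    # A sentinel pair past both ends flushes the gap after the last common character
    for ci, cj in lcs_pairs(x, y) + [(len(x), len(y))]:
        removed, inserted = x[i:ci], y[j:cj]
        if removed and inserted:
            ops.append(("c", removed, inserted))
        elif removed:
            ops.append(("-", removed, ""))
        elif inserted:
            ops.append(("+", "", inserted))
        if ci < len(x):
            ops.append(("=", x[ci], y[cj]))
        i, j = ci + 1, cj + 1
    return ops


ops = diff_chars(["cat ", "sat"], ["cut sat"])
changes = [op for op in ops if op[0] != "="]
print(changes)  # [('c', 'a', 'u')]
```

In this example two deleted rows are compared with a single added row (n = 2, m = 1); after merging, only the characters "a" and "u" are reported as updated, and every other character is common.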
For example, Fig. 7 is a schematic diagram of a method for marking updated rows according to an exemplary embodiment of the disclosure. Fig. 7 shows the contents of the two files and the recognition result after comparison. Positions at which the first file and the second file differ are outlined with boxes; "<>" marks an updated row (Changed), "!=" marks a common row that has an attribute update (Updated), and "==" marks an identical common row (Identical). Once the common rows have been determined, the common-row gaps are determined as well, so the positions of the updated rows can be determined from the common-row gaps. As Fig. 7 shows, the content of row 1 differs between the first file and the second file, so row 1 of neither file is counted among the common rows during comparison. Row 1 of the first file is a deleted row and row 1 of the second file is an added row, which makes row 1 of both files an updated row, marked "<>". Rows 2-3, row 6, row 8, and rows 10-16 of the two files are common rows whose character content and attributes are identical, so they are marked "==". Rows 5 and 7 are also common rows, but they contain characters whose attributes differ between the two files, so they are marked "!=". Furthermore, characters that differ in content or in attributes can be marked with boxes when the result is displayed, as shown in Fig. 7. Alternatively, when the comparison result is displayed, the marks on characters that differ in content or attributes can be hidden at first and shown only when the user triggers them (for example, by moving the mouse over those positions).
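The row marks described for Fig. 7 could be generated along the following lines. This sketch does not reproduce the figure: the function name `mark_rows`, the four-row sample data, and the extra "-" and "+" marks for unpaired deleted and added rows (borrowed from the symbols of Fig. 4) are assumptions for illustration.

```python
from typing import List, Set, Tuple

Mark = Tuple[str, List[int], List[int]]


def mark_rows(n_first: int, n_second: int,
              pairs: List[Tuple[int, int]],
              attr_changed: Set[Tuple[int, int]]) -> List[Mark]:
    """Return (mark, rows of first file, rows of second file), 0-based, in display order.

    pairs        : index-aligned common rows (i, j)
    attr_changed : the subset of pairs whose characters differ in attributes
    """
    marks: List[Mark] = []
    i = j = 0
    # The sentinel pair flushes whatever follows the last common row
    for ci, cj in list(pairs) + [(n_first, n_second)]:
        dels = list(range(i, ci))
        adds = list(range(j, cj))
        if dels and adds:
            marks.append(("<>", dels, adds))  # updated rows (Changed)
        elif dels:
            marks.append(("-", dels, []))     # deleted rows only
        elif adds:
            marks.append(("+", [], adds))     # added rows only
        if ci < n_first:
            mark = "!=" if (ci, cj) in attr_changed else "=="
            marks.append((mark, [ci], [cj]))
        i, j = ci + 1, cj + 1
    return marks


# Hypothetical data: row 0 differs in content, row 2 differs only in attributes
marks = mark_rows(4, 4, [(1, 1), (2, 2), (3, 3)], {(2, 2)})
for mark, rows_first, rows_second in marks:
    print(mark, rows_first, rows_second)
```

A display layer could keep these marks hidden and reveal them on a user trigger, as the alternative described above suggests.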
With the above technical solution, the comparison between files and the alignment of their common content can be achieved without relying on complex algorithms, and the updated content between the files can then be found from the aligned common content, which reduces implementation difficulty and makes the method easy to implement.
Fig. 8 is a block diagram of an update lookup device for file comparison according to an exemplary embodiment of the disclosure. As shown in Fig. 8, the device 100 may include:
a comparison module 110, configured to take the content of each preset unit in the first file and the second file as one element and compare the first file with the second file to obtain the longest common subsequence of the first file and the second file;
an update determining module 120, configured to index-align the common elements in the first file with the common elements in the second file according to the longest common subsequence, and then determine the updated elements in the first file and the second file according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file.
Optionally, the device 100 may also include:
a first update identification module 130, configured to perform character comparison on each pair of position-corresponding common elements in the first file and the second file to determine whether each pair of common elements is identical;
a second update identification module 140, configured to perform character comparison on the updated elements in each pair of position-corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters, and updated characters of the updated elements in each pair of common-element gaps.
Optionally, Fig. 9 is a block diagram of an update determining module according to an exemplary embodiment of the disclosure. As shown in Fig. 9, the update determining module 120 may include:
an element identification submodule 121, configured to determine, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file, where the deleted elements are the elements of the first file other than the common elements and the added elements are the elements of the second file other than the common elements;
an element alignment submodule 122, configured to index-align the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
a gap correspondence submodule 123, configured to determine, after the common elements in the first file have been index-aligned with the common elements in the second file, the position correspondence between the common-element gaps in the first file and the common-element gaps in the second file;
an element determining submodule 124, configured to determine the updated elements in the first file and the second file according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file.
Optionally, the element determining submodule 124 is configured to:
when n deleted elements exist at a first gap of the common elements in the first file and m added elements exist at a second gap of the common elements in the second file, determine the n deleted elements in the first gap and the m added elements in the second gap to be updated elements, where the first gap is any common-element gap in the first file and the second gap is the common-element gap whose position corresponds to the first gap;
establish a correspondence between the indices of the n deleted elements in the first gap and the indices of the m added elements in the second gap.
Optionally, Fig. 10 is a block diagram of a first update identification module according to an exemplary embodiment of the disclosure. As shown in Fig. 10, the first update identification module 130 may include:
an attribute comparison submodule 131, configured to compare the attributes of each character in a first common element of the first file with the attributes of the corresponding character in a second common element of the second file, to determine whether the first common element and the second common element contain characters whose attributes differ, where the first common element is any common element in the first file and the second common element is the common element in the second file that is index-aligned with the first common element;
a determining submodule 132, configured to determine the first common element and the second common element to be an attribute update when at least one character of the first common element differs in attributes from the corresponding character of the second common element;
the determining submodule 132 being further configured to determine the first common element and the second common element to be identical when no character of the first common element differs in attributes from the corresponding character of the second common element.
Optionally, Fig. 11 is a block diagram of a second update identification module according to an exemplary embodiment of the disclosure. As shown in Fig. 11, the second update identification module 140 may include:
a merging submodule 141, configured to merge the n deleted elements in a first gap of the common elements in the first file into a first element;
the merging submodule 141 being further configured to merge the m added elements in a second gap of the common elements in the second file into a second element, where the first gap is any common-element gap in the first file and the second gap is the common-element gap whose position corresponds to the first gap;
a character comparison submodule 142, configured to perform character comparison on the first element and the second element to determine their common characters, deleted characters, added characters, and updated characters.
With the above technical solution, the comparison between files and the alignment of their common content can be achieved without relying on complex algorithms, and the updated content between the files can then be found from the aligned common content, which reduces implementation difficulty and makes the device easy to implement.
As for the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and is not elaborated here.
Fig. 12 is a block diagram of an electronic device according to an exemplary embodiment. As shown in Fig. 12, the electronic device 200 may include a processor 201 and a memory 202. The electronic device 200 may also include one or more of a multimedia component 203, an input/output (I/O) interface 204, and a communication component 205.
The processor 201 controls the overall operation of the electronic device 200 to complete all or part of the steps of the above update lookup method for file comparison. The memory 202 stores various types of data to support operation of the electronic device 200; such data may include, for example, instructions of any application program or method operating on the electronic device 200, as well as application-related data such as contact data, sent and received messages, pictures, audio, and video. The memory 202 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The multimedia component 203 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. A received audio signal may be further stored in the memory 202 or sent through the communication component 205. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 204 provides an interface between the processor 201 and other interface modules, such as a keyboard, a mouse, or buttons. The buttons may be virtual buttons or physical buttons. The communication component 205 is used for wired or wireless communication between the electronic device 200 and other devices. Wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, and so on, or a combination of one or more of them, which is not limited here. Accordingly, the communication component 205 may include a Wi-Fi module, a Bluetooth module, an NFC module, and so on.
In an exemplary embodiment, the electronic device 200 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for executing the above update lookup method for file comparison.
In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided; when the program instructions are executed by a processor, the steps of the above update lookup method for file comparison are implemented. For example, the computer-readable storage medium may be the above memory 202 including program instructions, and the program instructions may be executed by the processor 201 of the electronic device 200 to complete the above update lookup method for file comparison.

Claims (10)

1. An update lookup method for file comparison, characterized in that the method comprises:
taking the content of each preset unit in a first file and a second file as one element, and comparing the first file with the second file to obtain the longest common subsequence of the first file and the second file;
index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, and then determining the updated elements in the first file and the second file according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file.
2. The method according to claim 1, characterized in that the method further comprises:
performing character comparison on each pair of position-corresponding common elements in the first file and the second file to determine whether each pair of common elements is identical;
performing character comparison on the updated elements in each pair of position-corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters, and updated characters of the updated elements in each pair of common-element gaps.
3. The method according to claim 1, characterized in that index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, and then determining the updated elements in the first file and the second file according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file, comprises:
determining, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file, the deleted elements being the elements of the first file other than the common elements and the added elements being the elements of the second file other than the common elements;
index-aligning the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
after the common elements in the first file have been index-aligned with the common elements in the second file, determining the position correspondence between the common-element gaps in the first file and the common-element gaps in the second file;
determining the updated elements in the first file and the second file according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file.
4. The method according to claim 3, characterized in that determining the updated elements in the first file and the second file according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file comprises:
when n deleted elements exist at a first gap of the common elements in the first file and m added elements exist at a second gap of the common elements in the second file, determining the n deleted elements in the first gap and the m added elements in the second gap to be updated elements, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
establishing a correspondence between the indices of the n deleted elements in the first gap and the indices of the m added elements in the second gap.
5. The method according to claim 2, characterized in that performing character comparison on each pair of position-corresponding common elements in the first file and the second file to determine whether each pair of common elements is identical comprises:
comparing the attributes of each character in a first common element of the first file with the attributes of the corresponding character in a second common element of the second file, to determine whether the first common element and the second common element contain characters whose attributes differ, the first common element being any common element in the first file and the second common element being the common element in the second file that is index-aligned with the first common element;
when at least one character of the first common element differs in attributes from the corresponding character of the second common element, determining the first common element and the second common element to be an attribute update;
when no character of the first common element differs in attributes from the corresponding character of the second common element, determining the first common element and the second common element to be identical.
6. An update lookup device for file comparison, characterized in that the device comprises:
a comparison module, configured to take the content of each preset unit in a first file and a second file as one element and compare the first file with the second file to obtain the longest common subsequence of the first file and the second file;
an update determining module, configured to index-align the common elements in the first file with the common elements in the second file according to the longest common subsequence, and then determine the updated elements in the first file and the second file according to the position correspondence between the common-element gaps containing the deleted elements of the first file and the common-element gaps containing the added elements of the second file.
7. The device according to claim 6, characterized in that the device further comprises:
a first update identification module, configured to perform character comparison on each pair of position-corresponding common elements in the first file and the second file to determine whether each pair of common elements is identical;
a second update identification module, configured to perform character comparison on the updated elements in each pair of position-corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters, and updated characters of the updated elements in each pair of common-element gaps.
8. The device according to claim 6, characterized in that the update determining module comprises:
An element identification submodule, configured to determine, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file, where the deleted elements are the elements in the first file other than the common elements, and the added elements are the elements in the second file other than the common elements;
An element alignment submodule, configured to index-align the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
A gap correspondence submodule, configured to determine, after the common elements in the first file and the common elements in the second file have been index-aligned, the positional correspondence between the common-element gaps in the first file and the common-element gaps in the second file;
An element determination submodule, configured to determine the update elements in the first file and the second file according to the positional correspondence between the common-element gaps that contain the deleted elements in the first file and the common-element gaps that contain the added elements in the second file.
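The gap correspondence and element determination submodules of claim 8 can be sketched as below. The sketch assumes the index-aligned common elements are already available as a list of `(i, j)` index pairs, and it reads the claim as follows: when a gap holding deleted elements in the first file corresponds in position to a gap holding added elements in the second file, the elements of both gaps are treated as update elements. That reading, the function name `classify_gaps` and the result layout are assumptions rather than the patent's own definitions.

```python
from typing import Dict, List, Sequence, Tuple


def classify_gaps(
    first: Sequence[str],
    second: Sequence[str],
    pairs: Sequence[Tuple[int, int]],
) -> List[Dict[str, object]]:
    """Pair up the gaps between aligned common elements and label each pair."""
    # Sentinels turn the regions before the first and after the last
    # common element into ordinary gaps.
    bounds = [(-1, -1), *pairs, (len(first), len(second))]
    result: List[Dict[str, object]] = []
    for (i0, j0), (i1, j1) in zip(bounds, bounds[1:]):
        deleted = list(first[i0 + 1 : i1])  # non-common elements of the first file
        added = list(second[j0 + 1 : j1])   # non-common elements of the second file
        if deleted and added:
            kind = "updated"  # both corresponding gaps are occupied
        elif deleted:
            kind = "deleted"
        elif added:
            kind = "added"
        else:
            continue  # nothing changed between these two common elements
        result.append({"kind": kind, "first": deleted, "second": added})
    return result


# Hypothetical input: common elements "a", "c", "d" aligned by index.
old_lines = ["a", "b", "c", "d"]
new_lines = ["a", "x", "c", "d", "e"]
aligned = [(0, 0), (2, 2), (3, 3)]
for gap in classify_gaps(old_lines, new_lines, aligned):
    print(gap)
```

In this example the gap between "a" and "c" is occupied in both files, so "b" and "x" are reported as an update, while the trailing "e" appears only in the second file and is reported as an addition.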
9. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-5.
10. An electronic device, characterized by comprising:
A memory storing a computer program;
A processor configured to execute the computer program in the memory, so as to implement the steps of the method according to any one of claims 1-5.
CN201811625251.3A 2018-12-28 2018-12-28 Update search method, device, storage medium and equipment for file comparison Active CN109740125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811625251.3A CN109740125B (en) 2018-12-28 2018-12-28 Update search method, device, storage medium and equipment for file comparison

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811625251.3A CN109740125B (en) 2018-12-28 2018-12-28 Update search method, device, storage medium and equipment for file comparison

Publications (2)

Publication Number Publication Date
CN109740125A (en) 2019-05-10
CN109740125B (en) 2023-06-27

Family

ID=66361944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811625251.3A Active CN109740125B (en) 2018-12-28 2018-12-28 Update search method, device, storage medium and equipment for file comparison

Country Status (1)

Country Link
CN (1) CN109740125B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589838A (en) * 2015-12-24 2016-05-18 中国电子科技集团公司第三十三研究所 Electronic official document trace reserving method based on file comparison
CN106372040A (en) * 2016-08-24 2017-02-01 长园深瑞继保自动化有限公司 Difference comparison system of intelligent substation configuration file
CN106469219A (en) * 2016-09-09 2017-03-01 武汉长光科技有限公司 A kind of method that embedded device configuration file synchronously compares
CN107273359A (en) * 2017-06-20 2017-10-20 北京四海心通科技有限公司 A kind of text similarity determines method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104788A (en) * 2019-12-05 2020-05-05 东软集团股份有限公司 Document differential content alignment method and device, storage medium and electronic equipment
CN111104788B (en) * 2019-12-05 2023-09-22 东软集团股份有限公司 Alignment method and device of document differential content, storage medium and electronic equipment
CN114356245A (en) * 2022-01-12 2022-04-15 济南点量软件有限公司 Method and system for rapidly comparing and updating mass files
CN114356245B (en) * 2022-01-12 2023-09-22 济南点量软件有限公司 Method and system for fast comparing and updating mass files

Also Published As

Publication number Publication date
CN109740125B (en) 2023-06-27

Similar Documents

Publication Publication Date Title
US9628423B2 (en) Electronic sticky note system, information processing terminal, method for processing electronic sticky note, medium storing program, and data structure of electronic sticky note
CN109690481A (en) The customization of dynamic function row
CN104350493B (en) Transform the data into consumable content
CN106796518A (en) Based on the feedback being intended to
CN106796582A (en) The dynamic presentation of suggestion content
CN104471565A (en) Abstract relational model for transforming data into consumable content
CN103176705B (en) A kind of mobile terminal and method for previewing thereof
WO2021008334A1 (en) Data binding method, apparatus, and device of mini program, and storage medium
CN104813312B (en) Stateful editor is carried out to abundant content using basic text box
CN106855748A (en) A kind of data inputting method, device and intelligent terminal
US11314408B2 (en) Computationally efficient human-computer interface for collaborative modification of content
US20150293975A1 (en) Method and device for searching for contact object, and storage medium
US20200065052A1 (en) Enhanced techniques for merging content from separate computing devices
CN106663091A (en) Summary data autofill
CN113518026A (en) Message processing method and device and electronic equipment
CN109740125A (en) Update lookup method, device, storage medium and equipment for Documents Comparison
US20200065361A1 (en) Selectively controlling modification states for user-defined subsets of objects within a digital document
CN109740124A (en) Difference output method, device, storage medium and the electronic equipment of document comparison
US20170124051A1 (en) Extensibility of compound data objects
JP2011197983A (en) Information display device and information display program
JP5353872B2 (en) Information display device and information display program
CN109684437B (en) Content alignment method, device, storage medium and equipment for file comparison
JP2007219940A (en) Menu control device, mobile phone, and program for menu control device
CN109815446A (en) Page boundary processing method, device, storage medium and electronic equipment
CN105867763A (en) Information processing method and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant