CN109740125A - Update lookup method, device, storage medium and equipment for Documents Comparison - Google Patents
- Publication number: CN109740125A
- Application number: CN201811625251.3A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The present disclosure relates to an update search method, apparatus, storage medium and device for file comparison. The method comprises: taking the content of each preset unit in a first file and a second file as one element, and comparing the first file with the second file to obtain their longest common subsequence; and, after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file. File comparison and alignment of common content can thus be achieved without relying on a complex algorithm, so that the updated content between the files can be found from the aligned common content, which reduces implementation difficulty.
Description
Technical field
The present disclosure relates to the technical field of text processing, and in particular to an update search method, apparatus, storage medium and electronic device for file comparison.
Background
Comparing files or texts is a common need in many fields of everyday use, for example comparing the articles in two files (such as two Word documents) or the code in two files (such as the code differences between two scripts). File comparison is typically performed to align the rows or paragraphs of two files so that related and differing content can be found.
Because file comparison helps a user quickly find what two files share and where they differ, it is an important everyday function for both individual work and multi-person collaboration, and it improves working efficiency. For example, software today is mostly developed collaboratively. For a file that someone else has modified, file comparison quickly locates the identical content and the positions that differ, which makes subsequent processing easier for collaborators and reduces developers' workload.
At present, finding the updated parts between two files by file comparison is mostly implemented with fairly complex algorithms and is difficult to implement. Here, an update can be understood as content that is neither entirely different nor exactly identical between the two files: part of the content is the same in both files, while some content or attributes differ.
Summary of the invention
The purpose of the present disclosure is to provide an update search method, apparatus, storage medium and electronic device for file comparison, so as to solve the problem that existing update search methods use complex algorithms and are difficult to implement.
To achieve the above purpose, a first aspect of the present disclosure provides an update search method for file comparison, the method comprising:
taking the content of each preset unit in a first file and a second file as one element, and comparing the first file with the second file to obtain the longest common subsequence of the first file and the second file;
after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file.
Optionally, the method further comprises:
performing character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether the common elements of each pair are identical;
performing character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters and updated characters of the updated elements in each pair of common-element gaps.
Optionally, the step of index-aligning the common elements according to the longest common subsequence and then determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file comprises:
determining, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file, the deleted elements being the elements of the first file other than the common elements, and the added elements being the elements of the second file other than the common elements;
index-aligning the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
after the common elements in the first file are index-aligned with the common elements in the second file, determining the positional correspondence between the common-element gaps in the first file and the common-element gaps in the second file;
determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file.
Optionally, determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file comprises:
when n deleted elements exist in a first gap of the common elements in the first file and m added elements exist in a second gap of the common elements in the second file, determining the n deleted elements in the first gap and the m added elements in the second gap as updated elements, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
establishing a correspondence between the indexes of the n deleted elements in the first gap and the indexes of the m added elements in the second gap.
Optionally, performing character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether the common elements of each pair are identical, comprises:
comparing the attributes of each character in a first common element in the first file with those of the corresponding character in a second common element in the second file, to determine whether the first common element and the second common element contain characters whose attributes differ, the first common element being any common element in the first file, and the second common element being the common element in the second file that is index-aligned with the first common element;
when at least one character with differing attributes exists between the first common element and the second common element, determining the first common element and the second common element as an attribute update;
when no character with differing attributes exists between the first common element and the second common element, determining the first common element and the second common element as identical.
Optionally, performing character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters and updated characters of the updated elements in each pair of common-element gaps, comprises:
merging the n deleted elements in a first gap of the common elements in the first file into a first element;
merging the m added elements in a second gap of the common elements in the second file into a second element, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
performing character comparison on the first element and the second element, to determine the common characters, deleted characters, added characters and updated characters of the first element and the second element.
A second aspect provides an update search apparatus for file comparison, the apparatus comprising:
a comparison module, configured to take the content of each preset unit in a first file and a second file as one element and to compare the first file with the second file, to obtain the longest common subsequence of the first file and the second file;
an update determination module, configured to, after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determine the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file.
Optionally, the apparatus further comprises:
a first update identification module, configured to perform character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether the common elements of each pair are identical;
a second update identification module, configured to perform character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters and updated characters of the updated elements in each pair of common-element gaps.
Optionally, the update determination module comprises:
an element identification submodule, configured to determine, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file, the deleted elements being the elements of the first file other than the common elements, and the added elements being the elements of the second file other than the common elements;
an element alignment submodule, configured to index-align the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
a gap correspondence submodule, configured to determine, after the common elements in the first file are index-aligned with the common elements in the second file, the positional correspondence between the common-element gaps in the first file and the common-element gaps in the second file;
an element determination submodule, configured to determine the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file.
Optionally, the element determination submodule is configured to:
when n deleted elements exist in a first gap of the common elements in the first file and m added elements exist in a second gap of the common elements in the second file, determine the n deleted elements in the first gap and the m added elements in the second gap as updated elements, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
establish a correspondence between the indexes of the n deleted elements in the first gap and the indexes of the m added elements in the second gap.
Optionally, the first update identification module comprises:
an attribute comparison submodule, configured to compare the attributes of each character in a first common element in the first file with those of the corresponding character in a second common element in the second file, to determine whether the first common element and the second common element contain characters whose attributes differ, the first common element being any common element in the first file, and the second common element being the common element in the second file that is index-aligned with the first common element;
a determination submodule, configured to determine the first common element and the second common element as an attribute update when at least one character with differing attributes exists between them;
the determination submodule being further configured to determine the first common element and the second common element as identical when no character with differing attributes exists between them.
Optionally, the second update identification module comprises:
a merging submodule, configured to merge the n deleted elements in a first gap of the common elements in the first file into a first element;
the merging submodule being further configured to merge the m added elements in a second gap of the common elements in the second file into a second element, the first gap being any common-element gap in the first file and the second gap being the common-element gap whose position corresponds to the first gap;
a character comparison submodule, configured to perform character comparison on the first element and the second element, to determine the common characters, deleted characters, added characters and updated characters of the first element and the second element.
A third aspect provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of the method of the first aspect when executed by a processor.
A fourth aspect provides an electronic device, comprising: a memory on which a computer program is stored; and a processor configured to execute the computer program in the memory to implement the steps of the method of the first aspect.
In the above technical solution, the content of each preset unit in the first file and the second file is taken as one element, and the first file is compared with the second file to obtain their longest common subsequence. After the common elements in the first file are index-aligned with the common elements in the second file according to the longest common subsequence, the updated elements in the first file and the second file are determined according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file. With this technical solution, file comparison and alignment of common content are achieved without relying on a complex algorithm, so that the updated content between the files can be found from the aligned common content, which reduces implementation difficulty.
Other features and advantages of the present disclosure are described in detail in the detailed description below.
Brief description of the drawings
The accompanying drawings are provided for further understanding of the present disclosure and constitute a part of the specification. Together with the following detailed description they serve to explain the present disclosure, but do not limit it. In the drawings:
Fig. 1 is a flowchart of an update search method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 2 is a flowchart of another update search method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 3 is a flowchart of another update search method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 4 is a schematic diagram of a method for determining updated rows according to an exemplary embodiment of the present disclosure;
Fig. 5 is a flowchart of yet another update search method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 6 is a flowchart of yet another update search method for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 7 is a schematic diagram of a method for marking updated rows according to an exemplary embodiment of the present disclosure;
Fig. 8 is a block diagram of an update search apparatus for file comparison according to an exemplary embodiment of the present disclosure;
Fig. 9 is a block diagram of an update determination module according to an exemplary embodiment of the present disclosure;
Fig. 10 is a block diagram of a first update identification module according to an exemplary embodiment of the present disclosure;
Fig. 11 is a block diagram of a second update identification module according to an exemplary embodiment of the present disclosure;
Fig. 12 is a block diagram of an electronic device according to an exemplary embodiment.
Detailed description of the embodiments
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to describe and explain the present disclosure and do not limit it.
Fig. 1 is a flowchart of an update search method for file comparison according to an exemplary embodiment of the present disclosure. As shown in Fig. 1, the method comprises:
Step 101: take the content of each preset unit in the first file and the second file as one element, and compare the first file with the second file to obtain the longest common subsequence of the first file and the second file.
Before the first file and the second file are compared, they must first be serialized. Serialization can be understood as taking the content of each preset unit in a file as one element, so that a file can be regarded as an element sequence composed of such elements in order. For example, the first file and the second file may be files containing text or code, and the preset unit may be a word, a sentence, a row or a paragraph, set as needed. That is, one word, one sentence, one row or one paragraph is treated as a whole and serves as one element.
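As a minimal sketch of the serialization described above, the following Python function splits a text into elements by a chosen preset unit. The function name `serialize`, the unit names, and the splitting rules (whitespace for words, blank lines for paragraphs) are assumptions made for illustration; the disclosure does not prescribe them.

```python
import re


def serialize(text, unit="line"):
    """Split text into a list of elements, one element per preset unit."""
    if unit == "line":
        return text.splitlines()
    if unit == "word":
        # Assumption: words are separated by whitespace.
        return text.split()
    if unit == "paragraph":
        # Assumption: paragraphs are separated by one or more blank lines.
        return [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    raise ValueError(f"unsupported unit: {unit}")


sample = "alpha beta\ngamma\n\ndelta"
for unit in ("line", "word", "paragraph"):
    print(unit, serialize(sample, unit))
```

Whatever unit is chosen, the later steps only see a sequence of opaque elements, which is why the same comparison can be reused for rows and for characters.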
Step 102: after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determine the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file.
Here, a common-element gap is the gap between two common elements. It also includes the position before the first common element and the position after the last common element.
For example, since the content of each preset unit in the first file and the second file is taken as one element, an element sequence corresponding to the first file and an element sequence corresponding to the second file are obtained. Comparing these two element sequences then determines the longest common subsequence of the first file and the second file. The longest common subsequence is the largest common part of the first file and the second file whose elements appear in the same order in both.
For example, assume that the row is the preset unit, so that each row in the first file and the second file is one element. If each row is denoted by a letter, the serialized first file and second file can be expressed as the following sequences:
First file=AAACCGTGAGTTATTCGTTCTAGA
Second file=CACCCCTAAGGTACCTTTGGTT
Each letter in the first file and the second file denotes one row, and the order of the letters in a sequence is the order of the corresponding rows in the file. Comparing the two sequences determines the longest common subsequence of the first file and the second file, that is, the longest sequence of rows that have identical content in both files and appear in the same order. The longest common subsequence of the first file and the second file is therefore S=ACCTAGTACTTTG. The preset unit may also be a word, a sentence or a paragraph. In those cases the longest common subsequence is determined in the same way as when the row is the unit, and the details are not repeated.
The elements in the longest common subsequence are the common elements of the first file and the second file (they may be marked as Common elements). The elements of the first file other than the common elements are deleted elements, and the elements of the second file other than the common elements are added elements. Each of these kinds may likewise be given its own mark. After the common elements, deleted elements and added elements have been determined, the common elements of the first file and the second file can be aligned (also called a handshake). Once the common elements are aligned, the positional correspondence between the common-element gaps of the two files is also determined, that is, for each common-element gap in the first file, the corresponding common-element gap in the second file is known. The updated elements (which may be marked as Changed elements) can then be determined from the elements in the common-element gaps of the first file and the second file.
In the embodiments of the present disclosure, if at least one deleted element exists in a common-element gap of the first file and at least one added element exists in the positionally corresponding common-element gap of the second file, those deleted elements and added elements are the updated elements.
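The gap rule above can be sketched as follows. The function `classify_gaps` and its labels are hypothetical. It assumes the aligned common elements are already available as a list of `(i, j)` index pairs in increasing order, and it walks each pair of corresponding gaps, including the gap before the first common element and the gap after the last one.

```python
def classify_gaps(a, b, pairs):
    """Label each pair of corresponding common-element gaps.

    a, b  : element sequences of the first and second file
    pairs : aligned common elements as increasing (index_in_a, index_in_b) pairs
    """
    result = []
    # Sentinels add the gap before the first and after the last common element.
    bounds = [(-1, -1)] + list(pairs) + [(len(a), len(b))]
    for (i0, j0), (i1, j1) in zip(bounds, bounds[1:]):
        deleted = list(range(i0 + 1, i1))  # indexes of gap elements in a
        added = list(range(j0 + 1, j1))    # indexes of gap elements in b
        if deleted and added:
            # Both corresponding gaps are occupied: these are updated elements.
            result.append(("updated", deleted, added))
        elif deleted:
            result.append(("deleted", deleted, []))
        elif added:
            result.append(("added", [], added))
    return result


a = ["x", "A", "y", "B", "C", "z"]
b = ["A", "p", "q", "B", "r", "C"]
pairs = [(1, 0), (3, 3), (4, 5)]  # common elements A, B, C
for label, deleted, added in classify_gaps(a, b, pairs):
    print(label, deleted, added)
```

In this example only the gap between A and B is occupied on both sides, so `y` in the first file and `p`, `q` in the second file are reported together as updated elements, with their indexes kept side by side.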
Further, as shown in Fig. 2, the method may also comprise:
Step 103: perform character comparison on each pair of positionally corresponding common elements in the first file and the second file, to determine whether the common elements of each pair are identical.
File comparison usually ignores character attributes at first, so common elements are elements whose content is identical, while whether their characters also have identical attributes has not yet been determined. Therefore, after the common elements of the two files have been determined, the characters in each pair of common elements can be compared. When the character attributes of the two common elements in a pair are also identical, the pair is determined to be identical (and may be marked accordingly). If at least one character attribute differs between the two common elements, the pair is determined to be an attribute update (which may be marked as Updated). A pair of positionally corresponding common elements means two common elements in the first file and the second file that have been index-aligned (that is, for which an index correspondence has been established).
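Step 103 can be illustrated with a small sketch. The representation is an assumption: each character is modelled as a `(char, attrs)` tuple in which `attrs` is a frozenset of attribute names such as `"bold"`. The disclosure does not specify how attributes are stored, and `compare_common_pair` is a hypothetical name.

```python
def compare_common_pair(first, second):
    """Compare two index-aligned common elements character by character.

    Each element is a list of (char, attrs) tuples. Returns "identical"
    when every character has the same attributes in both elements, and
    "attribute-updated" when at least one character differs in attributes.
    """
    if [c for c, _ in first] != [c for c, _ in second]:
        raise ValueError("common elements must have identical text")
    for (_, attrs_1), (_, attrs_2) in zip(first, second):
        if attrs_1 != attrs_2:
            return "attribute-updated"
    return "identical"


plain = [("H", frozenset()), ("i", frozenset())]
bold_i = [("H", frozenset()), ("i", frozenset({"bold"}))]
print(compare_common_pair(plain, plain))
print(compare_common_pair(plain, bold_i))
```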
Step 104: perform character comparison on the updated elements in each pair of positionally corresponding common-element gaps that contain updated elements in the first file and the second file, to determine the common characters, deleted characters, added characters and updated characters of the updated elements in each pair of common-element gaps.
For example, as explained in the steps above, a pair of common-element gaps means two positionally corresponding common-element gaps in the first file and the second file, and the updated elements are the group of deleted elements and added elements present in such a pair of gaps. Being determined as updated elements indicates that the deleted elements and added elements in the group may be different or not exactly the same. Taking the row as the preset unit, let A denote the deleted rows in any common-row gap of the first file and B denote the added rows at the corresponding position in the second file (that is, the common-row gap containing A in the first file corresponds to the common-row gap containing B in the second file). After A and B are determined to be updated rows, the characters of A and B can be compared further. The comparison uses the same method as the file comparison described above: A and B are first compared with the character as the unit to obtain their longest common subsequence at character level. The characters in that subsequence are the common characters of A and B, the remaining characters of A are deleted characters, and the remaining characters of B are added characters. When a deleted character exists in a common-character gap of A (the gap between two common characters, the position before the first common character, or the position after the last common character) and an added character exists in the corresponding common-character gap of B, the deleted characters and added characters in that pair of gaps are updated characters.
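The character-level pass of step 104 reuses the row-level idea with the character as the unit. The sketch below is one possible reading, not the disclosure's implementation: `char_diff` is a hypothetical name, the longest common subsequence is computed with the textbook dynamic-programming method, and joining merged rows with a newline is an assumption.

```python
def char_diff(old, new):
    """Classify the characters of two merged updated elements."""
    n, m = len(old), len(new)
    # dp[i][j] holds the LCS length of old[i:] and new[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Recover the aligned common characters as index pairs.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if old[i] == new[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    report = {"common": "".join(old[i] for i, _ in pairs),
              "deleted": [], "added": [], "updated": []}
    # Inspect every pair of corresponding common-character gaps.
    bounds = [(-1, -1)] + pairs + [(n, m)]
    for (i0, j0), (i1, j1) in zip(bounds, bounds[1:]):
        gone, fresh = old[i0 + 1:i1], new[j0 + 1:j1]
        if gone and fresh:
            report["updated"].append((gone, fresh))
        elif gone:
            report["deleted"].append(gone)
        elif fresh:
            report["added"].append(fresh)
    return report


deleted_rows = ["cat"]  # the n deleted rows found in one gap
added_rows = ["cut"]    # the m added rows found in the corresponding gap
print(char_diff("\n".join(deleted_rows), "\n".join(added_rows)))
```

For `"cat"` and `"cut"` the common characters are `c` and `t`, and the gap between them holds `a` on one side and `u` on the other, so that pair is reported as an updated character.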
With the above technical solution, file comparison and alignment of common content are achieved without relying on a complex algorithm, so that the updated content between the files can be found from the aligned common content, which reduces implementation difficulty.
Fig. 3 is a flowchart of another update search method for file comparison according to an exemplary embodiment of the present disclosure. As shown in Fig. 3, the step described in step 102, of index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence and then determining the updated elements in the first file and the second file according to the positional correspondence between the common-element gaps containing deleted elements in the first file and the common-element gaps containing added elements in the second file, may comprise the following steps:
Step 1021: determine, according to the longest common subsequence, the common elements and deleted elements in the first file and the common elements and added elements in the second file. The deleted elements are the elements of the first file other than the common elements, and the added elements are the elements of the second file other than the common elements.
Here, the common elements in the first file and the common elements in the second file are exactly the elements of the longest common subsequence.
Again taking the row as the preset unit, the common elements are common rows, the deleted elements are deleted rows and the added elements are added rows. For the sequences of the first file and the second file given above, the common rows of the two files are ACCTAGTACTTTG. The deleted rows in the first file are the rows other than the common rows, so deleted rows=AAGGTTTGCAA. The added rows in the second file are the rows other than the common rows, so added rows=CCCAGCGTT.
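The added rows quoted above can be reproduced with a short sketch that separates a sequence into common elements and remaining elements once the longest common subsequence is known. `split_by_subsequence` is a hypothetical helper that matches the subsequence greedily from the left. This is an assumption: an actual comparison may align equal elements at different positions, although the set of leftover letters is the same.

```python
def split_by_subsequence(seq, common):
    """Split seq into indexes that match `common` and indexes that do not.

    `common` must be a subsequence of `seq`; matching is greedy, left to right.
    """
    common_idx, other_idx = [], []
    k = 0  # position of the next unmatched element in `common`
    for i, elem in enumerate(seq):
        if k < len(common) and elem == common[k]:
            common_idx.append(i)
            k += 1
        else:
            other_idx.append(i)
    if k != len(common):
        raise ValueError("`common` is not a subsequence of `seq`")
    return common_idx, other_idx


second_file = "CACCCCTAAGGTACCTTTGGTT"  # second-file sequence from the text
s = "ACCTAGTACTTTG"                     # longest common subsequence from the text
_, added_idx = split_by_subsequence(second_file, s)
print("".join(second_file[i] for i in added_idx))
```

Running the same helper on the first file yields its deleted rows in the same way.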
Step 1022, corresponding by the way that the common element in the first file is established index with the common element in the second file
Common element in first file is indexed with the common element in the second file and is aligned by relationship.
After the common elements, deletion elements and addition elements have been determined, the common elements of the first file and the second file can be index-aligned (also referred to as a handshake). For example, an index number may be assigned in advance to each element in the first file and the second file according to its order of appearance in the file. Taking a row as the default unit, an index number (i.e. a line number) is set for every row of the first file, and likewise for every row of the second file, so that every row in each file has a unique index number. Once the common rows in the two files are determined, index alignment of the common rows is achieved by establishing a correspondence between the index numbers of corresponding rows. Corresponding rows are rows that occupy the same position in the order of the common rows of the two files; for example, the first A in the common rows ACCTAGTACTTTG of the first file corresponds to the first A in the common rows ACCTAGTACTTTG of the second file.
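The index alignment described above can be sketched in Python as follows. This is a minimal illustration and not the patented implementation: the function name `align_common`, the dynamic-programming table and the sample rows are assumptions introduced here. When several longest common subsequences exist, the sketch simply returns one of them.

```python
from typing import List, Sequence, Tuple


def align_common(first: Sequence[str], second: Sequence[str]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where first[i] and second[j] are aligned common elements."""
    n, m = len(first), len(second)
    # length[i][j] holds the LCS length of first[i:] and second[j:].
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if first[i] == second[j]:
                length[i][j] = length[i + 1][j + 1] + 1
            else:
                length[i][j] = max(length[i + 1][j], length[i][j + 1])

    # Walk forward through the table, recording one index pair per common element.
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < n and j < m:
        if first[i] == second[j]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif length[i + 1][j] >= length[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


# Hypothetical rows; each list item stands for one row of a file.
first_rows = ["title", "intro", "body", "end"]
second_rows = ["title", "preface", "body", "end", "appendix"]
print(align_common(first_rows, second_rows))  # [(0, 0), (2, 2), (3, 3)]
```

Each returned pair plays the role of one index correspondence: row `i` of the first file is "shaken hands" with row `j` of the second file.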
Step 1023: after the common elements in the first file are index-aligned with the common elements in the second file, determine the position correspondence between the common element gaps in the first file and the common element gaps in the second file.
Step 1024: according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located, determine the update elements in the first file and the second file.
Once the common elements of the first file are aligned with those of the second file, the position correspondence between the common element gaps of the two files is also determined; that is, for each common element gap in the first file, the corresponding common element gap in the second file is known. The update elements can then be determined from the elements lying in the common element gaps of the first file and the second file.
Here, the common element gaps may include: the position between any two adjacent common elements, the position before the first common element, and the position after the last common element. The position correspondence of common element gaps can be understood as follows: if a common element gap in the first file and a common element gap in the second file occupy the same position relative to the aligned common elements, the two gaps are considered to correspond. Therefore, after the deletion elements in the first file and the addition elements in the second file have been determined, the relationship between each deletion element and its adjacent common elements can be recorded, which records the common element gap in which that deletion element is located. The common element gap in which each addition element is located can be recorded in the same way.
For two position-corresponding common element gaps in the two files, if the gaps contain only deletion elements and no addition elements, or only addition elements and no deletion elements, the deletion or addition elements in those gaps require no further processing. If the position-corresponding gaps contain both deletion elements and addition elements, these elements can be determined to be an "update", i.e. the update elements that need to be found.
Therefore, step 1024 may be implemented as follows:
When n deletion elements exist in a first gap of the common elements in the first file and m addition elements exist in a second gap of the common elements in the second file, the n deletion elements in the first gap and the m addition elements in the second gap are determined to be update elements. The first gap is any common element gap in the first file, and the second gap is the common element gap whose position corresponds to the first gap. Further, a correspondence may be established between the indexes of the n deletion elements in the first gap and the indexes of the m addition elements in the second gap; that is, the n deletion elements and the m addition elements are handshake-aligned as update elements.
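The gap rule of step 1024 can be illustrated with a short Python sketch. It assumes the aligned index pairs of the common elements are already available; the function names `gap_contents` and `find_updates` and the gap numbering (gap 0 before the first common element, gap k after the k-th common element) are conventions chosen for this illustration only.

```python
from typing import Dict, List, Tuple


def gap_contents(length: int, common: List[int]) -> List[List[int]]:
    """Group the indexes of non-common elements by common element gap.

    Gap 0 lies before the first common element and gap k (k >= 1) follows
    the k-th common element, giving len(common) + 1 gaps in total.
    """
    bounds = [-1] + common + [length]
    return [list(range(lo + 1, hi)) for lo, hi in zip(bounds, bounds[1:])]


def find_updates(
    first_len: int, second_len: int, pairs: List[Tuple[int, int]]
) -> Dict[int, Tuple[List[int], List[int]]]:
    """Map gap number -> (deletion indexes, addition indexes) for update gaps."""
    deleted = gap_contents(first_len, [i for i, _ in pairs])
    added = gap_contents(second_len, [j for _, j in pairs])
    updates = {}
    for gap, (dels, adds) in enumerate(zip(deleted, added)):
        if dels and adds:  # n > 0 deletions and m > 0 additions in matching gaps
            updates[gap] = (dels, adds)
    return updates


# First file has 4 rows, second has 5; rows 0, 2 and 3 of each are aligned common rows.
print(find_updates(4, 5, [(0, 0), (2, 2), (3, 3)]))  # {1: ([1], [1])}
```

In this toy input, the trailing row of the second file sits in a gap with no deletion counterpart, so it stays a plain addition and is not reported as an update.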
Taking a row as the above default unit, a common element gap is a common row gap. For example, the position before the first A in the common rows ACCTAGTACTTTG of the first file corresponds to the position before the first A in the common rows ACCTAGTACTTTG of the second file, and the position between the first A and the first C in the common rows of the first file corresponds to the position between the first A and the first C in the common rows of the second file.
Therefore, for the first file = AAACCGTGAGTTATTCGTTCTAGA and the second file = CACCCCTAAGGTACCTTTGGTT, the deletion rows AA located before the first A of the common rows in the first file correspond to the addition row C located before the first A of the common rows in the second file; that is, the deletion rows AA and the addition row C lie in position-corresponding common row gaps. There is no deletion row between the first A and the first C of the common rows in the first file, while the addition rows CC exist between the first A and the first C of the common rows in the second file, so the gap between the first A and the first C of the common rows contains only addition rows and no deletion rows. By analogy, the position correspondence of all deletion rows and addition rows of the first file and the second file across the common row gaps can be obtained.
For example, Fig. 4 is a schematic diagram of a method for determining update rows according to an exemplary embodiment of the present disclosure. As shown in Fig. 4, each letter in the figure represents one row, so the first file and the second file are each shown as a sequence. Above the letters of the first file and below the letters of the second file, "=" indicates a common row, "-" a deletion row, "+" an addition row, and "c" an update row (c stands for change). Deletion rows in the first file and addition rows in the second file that cannot be aligned with each other are not processed. From the "c" symbols in the first file and the second file, there are three groups of update rows in total: AA and C, G and C, and A and GTT. Further, a correspondence may be established between the index of AA and the index of C in the first group, between the index of G and the index of C in the second group, and between the index of A and the index of GTT in the third group.
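A marking scheme of this kind ("=", "-", "+", "c") could be produced along the following lines. The sketch assumes the aligned index pairs of the common rows are given; the function name `mark_rows` and the toy input are hypothetical, and the output is a plain string of symbols rather than the layout of the figure.

```python
from typing import List, Tuple


def mark_rows(first_len: int, second_len: int, pairs: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Return one symbol per row for each file: '=', '-', '+' or 'c'."""
    marks1 = ["-"] * first_len   # every row of the first file starts as a deletion row
    marks2 = ["+"] * second_len  # every row of the second file starts as an addition row
    for i, j in pairs:
        marks1[i] = marks2[j] = "="  # aligned common rows

    # Visit each pair of position-corresponding gaps between consecutive common rows.
    bounds = [(-1, -1)] + list(pairs) + [(first_len, second_len)]
    for (i0, j0), (i1, j1) in zip(bounds, bounds[1:]):
        if i1 - i0 > 1 and j1 - j0 > 1:  # gap holds both deletions and additions
            for i in range(i0 + 1, i1):
                marks1[i] = "c"
            for j in range(j0 + 1, j1):
                marks2[j] = "c"
    return "".join(marks1), "".join(marks2)


# Rows 0, 2 and 3 of a 4-row file align with rows 0, 2 and 3 of a 5-row file.
print(mark_rows(4, 5, [(0, 0), (2, 2), (3, 3)]))  # ('=c==', '=c==+')
```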
Through the above steps, the common elements, deletion elements, addition elements and update elements of the first file and the second file have been determined. Further, step 103 may be used to compare whether the common elements are identical.
Fig. 5 is a schematic flowchart of another update lookup method for file comparison according to an exemplary embodiment of the present disclosure. As shown in Fig. 5, the above step 103 may specifically include the following steps:
Step 1031: compare the attributes of each character in a first common element in the first file with those of the corresponding character in a second common element in the second file, to determine whether the first common element and the second common element contain characters with different attributes.
The first common element is any common element in the first file, and the second common element is the common element in the second file that is index-aligned with the first common element.
Step 1032: when at least one character of the first common element differs in attributes from the corresponding character of the second common element, determine the first common element and the second common element to be an attribute update.
Step 1033: when no character of the first common element differs in attributes from the corresponding character of the second common element, determine the first common element and the second common element to be identical.
Character attributes may typically include: font, color, font size, whether the character is bold, whether it is underlined, and whether it has a font effect (and, if so, the type of effect). For example, take the first A in the common rows ACCTAGTACTTTG of the first file as the first common element and the first A in the common rows ACCTAGTACTTTG of the second file as the second common element, and assume the content of both is "123456789". If the "2" in the first common element is red while the "2" in the second common element is blue, the first common element and the second common element can be marked as an attribute update. If the font, color, font size, underline, bold and other attributes of every character of "123456789" are exactly the same in both elements, the first common element and the second common element can be marked as identical.
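The attribute comparison of steps 1031 to 1033 can be sketched as follows. The `StyledChar` record, its field names and default values, and the result strings are assumptions made for this illustration; the disclosure does not prescribe a data structure for character attributes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StyledChar:
    """One character together with a (hypothetical) subset of its attributes."""
    text: str
    font: str = "default"
    color: str = "black"
    size: int = 12
    bold: bool = False
    underline: bool = False


def compare_common(first: List[StyledChar], second: List[StyledChar]) -> str:
    """Classify a pair of index-aligned common elements.

    Common elements have the same text, so any inequality between
    corresponding characters comes from an attribute difference.
    """
    for a, b in zip(first, second):
        if a != b:
            return "attribute_updated"
    return "identical"


# The "2" is red in the first element and blue in the second.
first_elem = [StyledChar(ch, color="red" if ch == "2" else "black") for ch in "123456789"]
second_elem = [StyledChar(ch, color="blue" if ch == "2" else "black") for ch in "123456789"]
print(compare_common(first_elem, second_elem))  # attribute_updated
print(compare_common(first_elem, first_elem))   # identical
```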
On the other hand, the characters of the update elements can be further compared in step 104, to find the common characters, deletion characters, addition characters and update characters within a group of update elements. Fig. 6 is a schematic flowchart of yet another update lookup method for file comparison according to an exemplary embodiment of the present disclosure. As shown in Fig. 6, the above step 104 may include:
Step 1041: merge the n deletion elements in a first gap of the common elements in the first file into a first element.
Step 1042: merge the m addition elements in a second gap of the common elements in the second file into a second element. The first gap is any common element gap in the first file, and the second gap is the common element gap whose position corresponds to the first gap.
Step 1043: perform character comparison on the first element and the second element, to determine the common characters, deletion characters, addition characters and update characters of the first element and the second element.
It can be understood that the n deletion elements may be one or more deletion elements and the m addition elements may be one or more addition elements; n may or may not be equal to m.
Regardless of whether n and m are equal, the following method may be used to compare the character content of the n deletion elements in the first gap with that of the m addition elements in the second gap:
First, the n deletion elements are merged into one element X and the m addition elements are merged into one element Y. Then, treating each character as a unit, the longest common subsequence of element X and element Y is calculated. The characters in this longest common subsequence are the common characters of X and Y; the remaining characters of X are deletion characters, and the remaining characters of Y are addition characters. When deletion characters exist in a common character gap of X and addition characters exist in the corresponding common character gap of Y, the deletion characters and addition characters in that pair of gaps are update characters. The common characters, deletion characters, addition characters and update characters can then be marked separately, so that the comparison result is clearer when displayed.
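The character-level comparison just described can be sketched in Python as below. This is an illustration under stated assumptions: the elements are merged by plain concatenation (the disclosure does not say how they are joined), and the names `lcs_pairs` and `compare_update_group` as well as the dictionary keys are invented for this example.

```python
from typing import Dict, List, Sequence, Tuple


def lcs_pairs(x: str, y: str) -> List[Tuple[int, int]]:
    """Index pairs (i, j) of one longest common subsequence of x and y."""
    n, m = len(x), len(y)
    length = [[0] * (m + 1) for _ in range(n + 1)]  # LCS length of x[i:], y[j:]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if x[i] == y[j]:
                length[i][j] = length[i + 1][j + 1] + 1
            else:
                length[i][j] = max(length[i + 1][j], length[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if x[i] == y[j]:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif length[i + 1][j] >= length[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def compare_update_group(deleted: Sequence[str], added: Sequence[str]) -> Dict[str, object]:
    """Classify the characters of one group of update elements."""
    x, y = "".join(deleted), "".join(added)  # merge into elements X and Y (assumed: concatenation)
    pairs = lcs_pairs(x, y)
    result: Dict[str, object] = {
        "common": "".join(x[i] for i, _ in pairs),
        "deleted": [], "added": [], "updated": [],
    }
    # Inspect each pair of corresponding common character gaps.
    bounds = [(-1, -1)] + pairs + [(len(x), len(y))]
    for (i0, j0), (i1, j1) in zip(bounds, bounds[1:]):
        dels, adds = x[i0 + 1:i1], y[j0 + 1:j1]
        if dels and adds:
            result["updated"].append((dels, adds))
        elif dels:
            result["deleted"].append(dels)
        elif adds:
            result["added"].append(adds)
    return result


print(compare_update_group(["abc", "d"], ["axc", "dy"]))
```

For the toy input, X is "abcd" and Y is "axcdy"; the sketch reports "acd" as common characters, the pair ("b", "x") as update characters, and "y" as an addition character.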
For example, Fig. 7 is a schematic diagram of a method for marking update rows according to an exemplary embodiment of the present disclosure. Fig. 7 shows the content of the two files and the recognition result after comparison. Positions where the first file and the second file differ are outlined with boxes; "<>" indicates an update row (i.e. Changed), "!=" indicates a common row with an attribute update (i.e. Updated), and "==" indicates an identical common row (i.e. Identical). As shown in Fig. 7, identical common rows are marked with "==". Once the common rows are determined, the common row gaps are determined as well, so the positions of the update rows can be determined from the common row gaps. As Fig. 7 shows, the content of row 1 differs between the first file and the second file, so row 1 of each file is not counted among the common rows during comparison; that is, row 1 of the first file is a deletion row and row 1 of the second file is an addition row. Row 1 of the two files is therefore an update row and is marked "<>". Rows 2-3, 6, 8 and 10-16 of the first file and the second file are common rows whose character content and attributes are identical, and are therefore marked "==". Rows 5 and 7 are also common rows, but because rows 5 and 7 of the two files contain characters with different attributes, they are marked "!=". Further, when the result is displayed, characters with different content or different attributes can be marked with boxes, as shown in Fig. 7. Alternatively, when the comparison result is displayed, the marks on characters with different content or attributes can be hidden at first and shown only when triggered by the user (for example, when the mouse moves over those positions).
Through the above technical solution, file comparison and alignment of common content can be achieved without relying on complex algorithms, so that the updated content between the files can be found based on the aligned common content. This reduces implementation difficulty and makes the method easy to implement.
Fig. 8 is a block diagram of an update lookup device for file comparison according to an exemplary embodiment of the present disclosure. As shown in Fig. 8, the device 100 may include:
a comparison module 110, configured to take the content of each default unit in the first file and the second file as one element and compare the first file with the second file, to obtain the longest common subsequence of the first file and the second file;
an update determination module 120, configured to, after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determine the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located.
Optionally, the device 100 may further include:
a first update identification module 130, configured to perform character comparison on each pair of position-corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical;
a second update identification module 140, configured to perform character comparison on the update elements in each pair of position-corresponding common element gaps in which update elements exist in the first file and the second file, to determine the common characters, deletion characters, addition characters and update characters of the update elements in each pair of common element gaps.
Optionally, Fig. 9 is a block diagram of an update determination module according to an exemplary embodiment of the present disclosure. As shown in Fig. 9, the update determination module 120 may include:
an element identification submodule 121, configured to determine, according to the longest common subsequence, the common elements and deletion elements in the first file and the common elements and addition elements in the second file, the deletion elements being the elements in the first file other than the common elements and the addition elements being the elements in the second file other than the common elements;
an element alignment submodule 122, configured to index-align the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
a gap correspondence submodule 123, configured to determine, after the common elements in the first file are index-aligned with the common elements in the second file, the position correspondence between the common element gaps in the first file and the common element gaps in the second file;
an element determination submodule 124, configured to determine the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located.
Optionally, the element determination submodule 124 is configured to:
when n deletion elements exist in a first gap of the common elements in the first file and m addition elements exist in a second gap of the common elements in the second file, determine the n deletion elements in the first gap and the m addition elements in the second gap to be update elements, the first gap being any common element gap in the first file and the second gap being the common element gap whose position corresponds to the first gap;
establish a correspondence between the indexes of the n deletion elements in the first gap and the indexes of the m addition elements in the second gap.
Optionally, Fig. 10 is a block diagram of a first update identification module according to an exemplary embodiment of the present disclosure. As shown in Fig. 10, the first update identification module 130 may include:
an attribute comparison submodule 131, configured to compare the attributes of each character in a first common element in the first file with those of the corresponding character in a second common element in the second file, to determine whether the first common element and the second common element contain characters with different attributes, the first common element being any common element in the first file and the second common element being the common element in the second file that is index-aligned with the first common element;
a determination submodule 132, configured to determine the first common element and the second common element to be an attribute update when at least one character of the first common element differs in attributes from the corresponding character of the second common element;
the determination submodule 132 being further configured to determine the first common element and the second common element to be identical when no character of the first common element differs in attributes from the corresponding character of the second common element.
Optionally, Fig. 11 is a block diagram of a second update identification module according to an exemplary embodiment of the present disclosure. As shown in Fig. 11, the second update identification module 140 may include:
a merging submodule 141, configured to merge the n deletion elements in a first gap of the common elements in the first file into a first element;
the merging submodule 141 being further configured to merge the m addition elements in a second gap of the common elements in the second file into a second element, the first gap being any common element gap in the first file and the second gap being the common element gap whose position corresponds to the first gap;
a character comparison submodule 142, configured to perform character comparison on the first element and the second element, to determine the common characters, deletion characters, addition characters and update characters of the first element and the second element.
Through the above technical solution, file comparison and alignment of common content can be achieved without relying on complex algorithms, so that the updated content between the files can be found based on the aligned common content. This reduces implementation difficulty and makes the device easy to implement.
Regarding the device in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
Fig. 12 is a block diagram of an electronic device according to an exemplary embodiment. As shown in Fig. 12, the electronic device 200 may include a processor 201 and a memory 202. The electronic device 200 may further include one or more of a multimedia component 203, an input/output (I/O) interface 204 and a communication component 205.
The processor 201 controls the overall operation of the electronic device 200 to complete all or part of the steps of the above update lookup method for file comparison. The memory 202 stores various types of data to support operation of the electronic device 200; such data may include, for example, instructions of any application program or method operating on the electronic device 200, as well as application-related data such as contact data, sent and received messages, pictures, audio and video. The memory 202 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc. The multimedia component 203 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in the memory 202 or sent through the communication component 205. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 204 provides an interface between the processor 201 and other interface modules, which may be a keyboard, a mouse, buttons and the like. The buttons may be virtual buttons or physical buttons. The communication component 205 is used for wired or wireless communication between the electronic device 200 and other devices. Wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or a combination of one or more of them, which is not limited here. Accordingly, the communication component 205 may include a Wi-Fi module, a Bluetooth module, an NFC module and so on.
In an exemplary embodiment, the electronic device 200 may be implemented by one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for executing the above update lookup method for file comparison.
In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided; when the program instructions are executed by a processor, the steps of the above update lookup method for file comparison are implemented. For example, the computer-readable storage medium may be the above memory 202 including program instructions, and the program instructions may be executed by the processor 201 of the electronic device 200 to complete the above update lookup method for file comparison.
Claims (10)
1. An update lookup method for file comparison, characterized in that the method comprises:
taking the content of each default unit in a first file and a second file as one element, and comparing the first file with the second file to obtain a longest common subsequence of the first file and the second file;
after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determining the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located.
2. The method according to claim 1, characterized in that the method further comprises:
performing character comparison on each pair of position-corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical;
performing character comparison on the update elements in each pair of position-corresponding common element gaps in which update elements exist in the first file and the second file, to determine the common characters, deletion characters, addition characters and update characters of the update elements in each pair of common element gaps.
3. The method according to claim 1, characterized in that, after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determining the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located comprises:
determining, according to the longest common subsequence, the common elements and deletion elements in the first file and the common elements and addition elements in the second file, the deletion elements being the elements in the first file other than the common elements and the addition elements being the elements in the second file other than the common elements;
index-aligning the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
after the common elements in the first file are index-aligned with the common elements in the second file, determining the position correspondence between the common element gaps in the first file and the common element gaps in the second file;
determining the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located.
4. The method according to claim 3, characterized in that determining the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located comprises:
when n deletion elements exist in a first gap of the common elements in the first file and m addition elements exist in a second gap of the common elements in the second file, determining the n deletion elements in the first gap and the m addition elements in the second gap to be update elements, the first gap being any common element gap in the first file and the second gap being the common element gap whose position corresponds to the first gap;
establishing a correspondence between the indexes of the n deletion elements in the first gap and the indexes of the m addition elements in the second gap.
5. The method according to claim 2, characterized in that performing character comparison on each pair of position-corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical, comprises:
comparing the attributes of each character in a first common element in the first file with those of the corresponding character in a second common element in the second file, to determine whether the first common element and the second common element contain characters with different attributes, the first common element being any common element in the first file and the second common element being the common element in the second file that is index-aligned with the first common element;
when at least one character of the first common element differs in attributes from the corresponding character of the second common element, determining the first common element and the second common element to be an attribute update;
when no character of the first common element differs in attributes from the corresponding character of the second common element, determining the first common element and the second common element to be identical.
6. An update lookup device for file comparison, characterized in that the device comprises:
a comparison module, configured to take the content of each default unit in a first file and a second file as one element and compare the first file with the second file, to obtain a longest common subsequence of the first file and the second file;
an update determination module, configured to, after index-aligning the common elements in the first file with the common elements in the second file according to the longest common subsequence, determine the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located.
7. The device according to claim 6, characterized in that the device further comprises:
a first update identification module, configured to perform character comparison on each pair of position-corresponding common elements in the first file and the second file, to determine whether each pair of common elements is identical;
a second update identification module, configured to perform character comparison on the update elements in each pair of position-corresponding common element gaps in which update elements exist in the first file and the second file, to determine the common characters, deletion characters, addition characters and update characters of the update elements in each pair of common element gaps.
8. The device according to claim 6, characterized in that the update determination module comprises:
an element identification submodule, configured to determine, according to the longest common subsequence, the common elements and deletion elements in the first file and the common elements and addition elements in the second file, the deletion elements being the elements in the first file other than the common elements and the addition elements being the elements in the second file other than the common elements;
an element alignment submodule, configured to index-align the common elements in the first file with the common elements in the second file by establishing an index correspondence between them;
a gap correspondence submodule, configured to determine, after the common elements in the first file are index-aligned with the common elements in the second file, the position correspondence between the common element gaps in the first file and the common element gaps in the second file;
an element determination submodule, configured to determine the update elements in the first file and the second file according to the position correspondence between the common element gap in which a deletion element of the first file is located and the common element gap in which an addition element of the second file is located.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the steps of the method according to any one of claims 1-5 are implemented when the computer program is executed by a processor.
10. An electronic device, characterized by comprising:
A memory storing a computer program;
A processor configured to execute the computer program in the memory, so as to implement the steps
of the method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811625251.3A CN109740125B (en) | 2018-12-28 | 2018-12-28 | Update search method, device, storage medium and equipment for file comparison |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109740125A (en) | 2019-05-10 |
CN109740125B (en) | 2023-06-27 |
Family
ID=66361944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811625251.3A Active CN109740125B (en) | 2018-12-28 | 2018-12-28 | Update search method, device, storage medium and equipment for file comparison |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109740125B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105589838A (en) * | 2015-12-24 | 2016-05-18 | 中国电子科技集团公司第三十三研究所 | Electronic official document trace reserving method based on file comparison |
CN106372040A (en) * | 2016-08-24 | 2017-02-01 | 长园深瑞继保自动化有限公司 | Difference comparison system of intelligent substation configuration file |
CN106469219A (en) * | 2016-09-09 | 2017-03-01 | 武汉长光科技有限公司 | A kind of method that embedded device configuration file synchronously compares |
CN107273359A (en) * | 2017-06-20 | 2017-10-20 | 北京四海心通科技有限公司 | A kind of text similarity determines method |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111104788A (en) * | 2019-12-05 | 2020-05-05 | 东软集团股份有限公司 | Document differential content alignment method and device, storage medium and electronic equipment |
CN111104788B (en) * | 2019-12-05 | 2023-09-22 | 东软集团股份有限公司 | Alignment method and device of document differential content, storage medium and electronic equipment |
CN114356245A (en) * | 2022-01-12 | 2022-04-15 | 济南点量软件有限公司 | Method and system for rapidly comparing and updating mass files |
CN114356245B (en) * | 2022-01-12 | 2023-09-22 | 济南点量软件有限公司 | Method and system for fast comparing and updating mass files |
Also Published As
Publication number | Publication date |
---|---|
CN109740125B (en) | 2023-06-27 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US9628423B2 (en) | Electronic sticky note system, information processing terminal, method for processing electronic sticky note, medium storing program, and data structure of electronic sticky note | |
CN109690481A (en) | The customization of dynamic function row | |
CN104350493B (en) | Transform the data into consumable content | |
CN106796518A (en) | Based on the feedback being intended to | |
CN106796582A (en) | The dynamic presentation of suggestion content | |
CN104471565A (en) | Abstract relational model for transforming data into consumable content | |
CN103176705B (en) | A kind of mobile terminal and method for previewing thereof | |
WO2021008334A1 (en) | Data binding method, apparatus, and device of mini program, and storage medium | |
CN104813312B (en) | Stateful editor is carried out to abundant content using basic text box | |
CN106855748A (en) | A kind of data inputting method, device and intelligent terminal | |
US11314408B2 (en) | Computationally efficient human-computer interface for collaborative modification of content | |
US20150293975A1 (en) | Method and device for searching for contact object, and storage medium | |
US20200065052A1 (en) | Enhanced techniques for merging content from separate computing devices | |
CN106663091A (en) | Summary data autofill | |
CN113518026A (en) | Message processing method and device and electronic equipment | |
CN109740125A (en) | Update lookup method, device, storage medium and equipment for Documents Comparison | |
US20200065361A1 (en) | Selectively controlling modification states for user-defined subsets of objects within a digital document | |
CN109740124A (en) | Difference output method, device, storage medium and the electronic equipment of document comparison | |
US20170124051A1 (en) | Extensibility of compound data objects | |
JP2011197983A (en) | Information display device and information display program | |
JP5353872B2 (en) | Information display device and information display program | |
CN109684437B (en) | Content alignment method, device, storage medium and equipment for file comparison | |
JP2007219940A (en) | Menu control device, mobile phone, and program for menu control device | |
CN109815446A (en) | Page boundary processing method, device, storage medium and electronic equipment | |
CN105867763A (en) | Information processing method and terminal |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |