CN109783454A - A kind of super large text file comparison method - Google Patents
- Publication number
- CN109783454A (application number CN201910062450.6A)
- Authority
- CN
- China
- Prior art keywords
- row
- file
- text file
- data
- line number
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a super large text file comparison method. The method comprises extracting comparison information from a source text file, extracting comparison information from a target text file, calculating the comparison result using a multi-threaded synchronous comparison method, and dynamically displaying the compared files using a memory buffer mechanism. First, when analyzing a super large text file, the invention manages the text by address and size and avoids loading large volumes of data, which keeps memory usage low and effectively solves the problem of insufficient memory when processing super large text. Second, paging and row-splitting management mechanisms are combined with multi-threaded processing, which greatly increases the comparison speed for super large text files. Finally, text display combines row management with a buffer and loads text content dynamically according to the user's needs, which keeps content loading real-time and continuous, keeps the text display smooth, and greatly improves the user experience.
Description
Technical field
The invention belongs to the technical field of file comparison, and specifically relates to a super large text file comparison method.
Background art
At present, many software tools for text files are in common use in the industry, for example Notepad, Notepad++, UltraEdit and bes. When processing super large text files, especially files of GB size and above, these tools mainly face the following difficulties:
1. Opening the file requires a very large memory overhead that the system memory cannot support.
2. They can only display data and cannot compare and analyze two or more files.
3. They have no bit error rate statistics function.
4. User parameters cannot be configured.
5. File comparison is too slow.
Summary of the invention
The object of the invention is: in order to solve the above problems in the prior art, the invention proposes a super large text file comparison method.
The technical solution of the invention is a super large text file comparison method comprising the following steps:
A. Obtain the source text file, analyze its content, and extract the source text file comparison information;
B. Obtain the target text file, analyze its content, and extract the target text file comparison information;
C. Set the text comparison parameters and calculate the comparison result using a multi-threaded synchronous comparison method;
D. Dynamically display the compared files using a memory buffer mechanism.
Further, analyzing the source text file content and extracting the source text file comparison information in step A is specifically:
Analyze the source text file content and, using a paging management mechanism, divide the source text file into multiple pages of file data according to the page size parameter and extract the page information of the source text file; then analyze the page information of the source text file and, using a row-splitting management mechanism, divide each page of file data into multiple rows of data and extract the row information of the source text file.
Further, analyzing the target text file content and extracting the target text file comparison information in step B is specifically:
Analyze the target text file content and, using a paging management mechanism, divide the target text file into multiple pages of file data according to the page size parameter and extract the page information of the target text file; then analyze the page information of the target text file and, using a row-splitting management mechanism, divide each page of file data into multiple rows of data and extract the row information of the target text file.
Further, the page information includes the file name, file size, file page size, file page number, file page start address, and file page byte length.
Further, the row information includes the line number, start address, offset address, byte length, and frame number.
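The two records listed above can be pictured as small immutable data structures. The following is only an illustrative sketch: the field names, the reading of "file page number" as a 0-based page index, and the reading of the row "start address" as absolute and the "offset address" as relative to the page start are all assumptions, since the text does not define them further.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageInfo:
    file_name: str
    file_size: int       # total file size in bytes
    page_size: int       # configured page size parameter in bytes
    page_number: int     # assumed here: 0-based index of this page
    start_address: int   # absolute byte address where the page begins
    byte_length: int     # bytes in this page (the last page may be shorter)


@dataclass(frozen=True)
class RowInfo:
    line_number: int     # line number within the file
    start_address: int   # assumed: absolute byte address of the row
    offset_address: int  # assumed: address relative to the start of its page
    byte_length: int     # bytes in the row, including the line terminator
    frame_number: int    # frame sequence number parsed from the row


# Hypothetical values, chosen only to show how the fields relate.
page = PageInfo("source.txt", file_size=10_000, page_size=4096,
                page_number=1, start_address=4096, byte_length=4096)
row = RowInfo(line_number=57, start_address=4200,
              offset_address=4200 - page.start_address,
              byte_length=32, frame_number=57)
```

Because each record holds only addresses and lengths, an index of this kind stays small even when the file itself is several gigabytes.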
Further, calculating the comparison result using the multi-threaded synchronous comparison method in step C is specifically:
Parse the row data according to the data frame format, then compare the row information of the source text file with the row information of the target text file, and record the address, length, and type of each difference.
Further, dynamically displaying the compared files using the memory buffer mechanism in step D is specifically:
Display the configured initial rows of the file using the memory buffer mechanism, and load row data dynamically by sliding so that the other comparison data can be viewed.
Further, when the user drags the scroll bar in the direction of increasing line numbers, dynamically loading row data by sliding specifically comprises the following sub-steps:
S11. Display the first N rows of the comparison data, and set the current displayable start line number to 1 and the displayable end line number to N;
S12. When the user drags the scroll bar into the first buffer mark position, the row display manager loads the next N rows of data from the file into its memory, and updates the current displayable start line number to 1 and the displayable end line number to 2N;
S13. When the user drags the scroll bar further into the second buffer mark position, the row display manager loads the next N rows of data from the file into its memory and at the same time deletes the first N rows held in the row display manager, updating the current displayable start line number to N and the displayable end line number to 3N;
S14. When the user drags the scroll bar further into the third buffer mark position or any later mark position, row data are displayed according to the method of step S13;
S15. When the user drags the scroll bar to the last buffer mark position, the row display manager loads all remaining row data from the file into its memory, then deletes the first N rows held in the row display manager, and updates the current displayable start and end line numbers.
Further, when the user drags the scroll bar in the direction of decreasing line numbers, dynamically loading row data by sliding specifically comprises the following sub-steps:
S11. When the user drags the scroll bar into the fourth buffer mark position, the row display manager loads the preceding N rows of data from the file and inserts them before the lowest line number in its memory; at the same time the last K rows held in the row display manager are deleted, and the current displayable start and end line numbers are updated;
S12. When the user drags the scroll bar into the fifth buffer mark position, the row display manager loads the preceding N rows of data from the file and inserts them before the lowest line number in its memory; at the same time the last N rows held in the row display manager are deleted, and the current displayable start and end line numbers are updated;
S13. When the user drags the scroll bar into the sixth buffer mark position or any earlier buffer mark position, row data are displayed according to the method of step S12;
S14. When the user drags the scroll bar to the first buffer mark position, the row display manager loads all remaining row data from the file into its memory, then deletes the last K rows held in the row display manager, so that the current maximum number of displayed rows is 2N, the displayable start line number is 1, and the displayable end line number is 2N.
The beneficial effects of the invention are as follows. First, when analyzing a super large text file, the invention manages the text by address and size and avoids loading large volumes of data, which keeps memory usage low and effectively solves the problem of insufficient memory when processing super large text. Second, paging and row-splitting management mechanisms are combined with multi-threaded processing, which greatly increases the comparison speed for super large text files. Finally, text display combines row management with a buffer and loads text content dynamically according to the user's needs, which keeps content loading real-time and continuous, keeps the text display smooth, and greatly improves the user experience.
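The memory saving described above comes from keeping only addresses and lengths in memory and reading row content on demand. A minimal sketch of that idea follows; the function name `read_row` and the in-memory `io.BytesIO` stand-in for a large file are illustrative assumptions, not part of the patent text.

```python
import io


def read_row(f, start_address, byte_length):
    """Read one row from a seekable binary file without loading the rest."""
    f.seek(start_address)
    return f.read(byte_length)


# Stand-in for a large file; real code would pass open(path, "rb") instead.
data = io.BytesIO(b"alpha\nbeta\ngamma\n")

# The only state kept in memory: (start address, byte length) per row.
index = [(0, 6), (6, 5), (11, 6)]

second_row = read_row(data, *index[1])
```

Each index entry costs a few bytes regardless of how long the row is, so the index for a multi-gigabyte file remains far smaller than the file.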
Brief description of the drawings
Fig. 1 is a flow diagram of the super large text file comparison method of the invention;
Fig. 2 is a flow diagram of text comparison in the super large text file comparison method of the invention;
Fig. 3 is a schematic diagram of file display processing in the super large text file comparison method of the invention.
Detailed description of the embodiments
In order to make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Fig. 1 shows a flow diagram of the super large text file comparison method of the invention. The super large text file comparison method comprises the following steps:
A. Obtain the source text file, analyze its content, and extract the source text file comparison information;
B. Obtain the target text file, analyze its content, and extract the target text file comparison information;
C. Set the text comparison parameters and calculate the comparison result using a multi-threaded synchronous comparison method;
D. Dynamically display the compared files using a memory buffer mechanism.
In step A, the invention first obtains the source text file, then analyzes the content of the obtained source text file and extracts its page information and row information, specifically:
Create a file analysis thread and analyze the source text file content. Using a paging management mechanism, divide the source text file into multiple pages of file data according to the page size parameter and extract the page information of the source text file. Then analyze the page information of the source text file and, using a row-splitting management mechanism, divide each page of file data into multiple rows of data and extract the row information of the source text file.
The page information of the source text file specifically includes the file name, file size, file page size, file page number, file page start address, and file page byte length.
The row information of the source text file specifically includes the line number, start address, offset address, byte length, and frame number.
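The paging and row-splitting analysis described above can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the names `Page`, `build_page_index`, and `index_rows` are hypothetical, rows are assumed to end with `\n`, and a row that straddles a page boundary is assigned to the page in which it starts (the text does not say how such rows are handled).

```python
import io
from dataclasses import dataclass


@dataclass
class Page:
    index: int
    start: int   # absolute byte address of the page
    length: int  # byte length of the page


def build_page_index(file_size, page_size):
    """Divide a file of file_size bytes into pages using arithmetic only."""
    return [Page(i, start, min(page_size, file_size - start))
            for i, start in enumerate(range(0, file_size, page_size))]


def index_rows(f, page):
    """Return (start address, byte length) for each row that begins in page."""
    rows = []
    end = page.start + page.length
    if page.start > 0:
        f.seek(page.start - 1)
        if f.read(1) != b"\n":
            f.readline()  # skip the tail of a row that began in an earlier page
    else:
        f.seek(0)
    pos = f.tell()
    while pos < end:
        line = f.readline()
        if not line:
            break
        rows.append((pos, len(line)))
        pos += len(line)
    return rows


# Stand-in for a large file; real code would pass open(path, "rb") instead.
data = b"aaa\nbb\ncccccc\nd\n"
f = io.BytesIO(data)
pages = build_page_index(len(data), page_size=5)
rows_per_page = [index_rows(f, p) for p in pages]
```

Because each page can be indexed independently of the others, this step lends itself to one worker per page, which is what the later multi-threaded comparison relies on.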
In step B, the invention first obtains the target text file, then analyzes the content of the obtained target text file and extracts its page information and row information, specifically:
Create a file analysis thread and analyze the target text file content. Using a paging management mechanism, divide the target text file into multiple pages of file data according to the page size parameter and extract the page information of the target text file. Then analyze the page information of the target text file and, using a row-splitting management mechanism, divide each page of file data into multiple rows of data and extract the row information of the target text file.
The page information of the target text file specifically includes the file name, file size, file page size, file page number, file page start address, and file page byte length.
The row information of the target text file specifically includes the line number, start address, offset address, byte length, and frame number.
In step C, Fig. 2 shows a flow diagram of text comparison in the super large text file comparison method of the invention. The invention first selects the data frame format parsing file and determines whether the file exists. If the file exists, the comparison parameters are configured; if it does not, the data frame format parsing file is selected again. The invention then determines whether the maximum number of threads has been reached: if so, it waits for a thread to complete; if not, it creates a page data parsing and comparison thread. After all threads have completed, the comparison result is calculated using the multi-threaded synchronous comparison method, specifically:
Parse the row data according to the data frame format, then compare the row information of the source text file with the row information of the target text file, and record the address, length, and type of each difference. The difference types here include frame loss, bit error, normal, and so on.
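The flow above can be sketched with a bounded thread pool, where the `max_workers` limit plays the role of the maximum thread count check. Everything specific here is an assumption for illustration: the frame format `b"<frame_no>,<payload>"`, the names `parse_row`, `compare_page`, and `compare_pages`, and the premise that corresponding source and target pages cover the same frame range. The patent does not publish its data frame format.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_row(line):
    """Parse a row in a hypothetical frame format: b'<frame_no>,<payload>'."""
    frame_no, payload = line.rstrip(b"\r\n").split(b",", 1)
    return int(frame_no), payload


def compare_page(task):
    """Compare one source page against one target page.

    Each page is a list of (address, raw_line) tuples. Returns a list of
    (address, length, difference_type) records for the source rows.
    """
    src_rows, dst_rows = task
    dst = dict(parse_row(line) for _, line in dst_rows)
    diffs = []
    for address, line in src_rows:
        frame_no, payload = parse_row(line)
        if frame_no not in dst:
            kind = "frame_loss"
        elif dst[frame_no] != payload:
            kind = "bit_error"
        else:
            kind = "normal"
        diffs.append((address, len(line), kind))
    return diffs


def compare_pages(tasks, max_threads=4):
    """Run page comparisons on at most max_threads worker threads."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() yields results in task order, so records stay sorted by page.
        return [d for page_diffs in pool.map(compare_page, tasks)
                for d in page_diffs]


tasks = [
    ([(0, b"1,AAAA\n"), (7, b"2,BBBB\n")],
     [(0, b"1,AAAA\n"), (7, b"2,BBXB\n")]),
    ([(14, b"3,CCCC\n")], []),
]
diffs = compare_pages(tasks, max_threads=2)
```

In CPython, threads mainly help here when the work is dominated by file reads; a process pool may be a better fit if parsing dominates, though that is a design choice outside what the text describes.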
In step D, Fig. 3 shows a schematic diagram of file display processing in the super large text file comparison method of the invention. The invention dynamically displays the compared files using a memory buffer mechanism, specifically:
Display the configured initial rows of the file using the memory buffer mechanism, and load row data dynamically by sliding so that the other comparison data can be viewed.
When the user drags the scroll bar in the direction of increasing line numbers, the dynamic loading of row data by sliding described above specifically comprises the following sub-steps:
S11. Display the first N rows of the comparison data, and set the current displayable start line number to 1 and the displayable end line number to N. Preferably, the invention sets N to 100, that is, the first 100 rows of the comparison data are displayed.
S12. When the user drags the scroll bar into the first buffer mark position (1), the row display manager loads the next N rows of data from the file into its memory, and updates the current displayable start line number to 1 and the displayable end line number to 2N;
S13. When the user drags the scroll bar further into the second buffer mark position (2), the row display manager loads the next N rows of data from the file into its memory and at the same time deletes the first N rows held in the row display manager, updating the current displayable start line number to N and the displayable end line number to 3N;
S14. When the user drags the scroll bar further into the third buffer mark position (3) or any later mark position (except the last buffer mark position), row data are displayed according to the method of step S13;
S15. When the user drags the scroll bar to the last buffer mark position, the row display manager loads all remaining row data (the number of remaining rows being less than or equal to 100) from the file into its memory, then deletes the first N rows held in the row display manager, and updates the current displayable start and end line numbers.
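Sub-steps S11 to S15 amount to a sliding window over the row index. The sketch below is one possible reading, not the patented implementation: the class layout and the `load_rows(first, last)` callback (1-based, inclusive) are hypothetical. Note that after the first eviction the sketch reports the window as N+1 to 3N, which is what deleting the first N of 3N buffered rows yields, whereas the text states the start line number as N.

```python
from collections import deque


class RowDisplayManager:
    """Keeps a bounded window of rows while scrolling toward larger line numbers."""

    def __init__(self, load_rows, total_rows, n=100):
        self.load_rows = load_rows  # hypothetical callback: (first, last) -> rows
        self.total = total_rows
        self.n = n
        self.first = 1
        self.last = min(n, total_rows)
        self.rows = deque(load_rows(self.first, self.last))  # S11

    def scroll_down(self):
        """Call when the scroll bar enters the next buffer mark position."""
        if self.last >= self.total:
            return  # nothing left to load
        new_last = min(self.last + self.n, self.total)
        self.rows.extend(self.load_rows(self.last + 1, new_last))
        self.last = new_last
        if len(self.rows) > 2 * self.n:  # S13 and S15: drop the first N rows
            for _ in range(self.n):
                self.rows.popleft()
            self.first += self.n


# Demo with a fake 350-row file whose rows are just their line numbers.
manager = RowDisplayManager(lambda a, b: list(range(a, b + 1)),
                            total_rows=350, n=100)
history = [(manager.first, manager.last)]
for _ in range(4):
    manager.scroll_down()
    history.append((manager.first, manager.last))
```

The window never holds more than 3N rows momentarily and 2N rows at rest, so display memory is independent of file size.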
When the user drags the scroll bar in the direction of decreasing line numbers, the dynamic loading of row data by sliding described above is handled in the opposite way to dragging in the direction of increasing line numbers. Assuming the scroll bar is currently at the last-row position, it specifically comprises the following sub-steps:
S11. When the user drags the scroll bar into the fourth buffer mark position (4), the row display manager loads the preceding N rows of data from the file and inserts them before the lowest line number in its memory; at the same time the last K rows held in the row display manager are deleted, and the current displayable start and end line numbers are updated. Preferably, the invention sets K = (total number of rows in the row display manager) - 200.
S12. When the user drags the scroll bar into the fifth buffer mark position (5), the row display manager loads the preceding N rows of data from the file and inserts them before the lowest line number in its memory; at the same time the last N rows held in the row display manager are deleted, and the current displayable start and end line numbers are updated;
S13. When the user drags the scroll bar into the sixth buffer mark position or any earlier buffer mark position (except the first buffer mark position), row data are displayed according to the method of step S12;
S14. When the user drags the scroll bar to the first buffer mark position, the row display manager loads all remaining row data from the file into its memory, then deletes the last K rows held in the row display manager, so that the current maximum number of displayed rows is 2N, the displayable start line number is 1, and the displayable end line number is 2N.
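The backward direction can be sketched as a single function that prepends rows and trims the tail. This is a reading under assumptions: the name `scroll_up` and the `load_rows(first, last)` callback are hypothetical, and K is computed after the insertion as the number of buffered rows minus 2N, which reduces to K = N in the steady state of step S12 and matches the preferred K = total rows - 200 when N is 100.

```python
from collections import deque


def scroll_up(rows, first, n, load_rows):
    """Handle one buffer mark while scrolling toward smaller line numbers.

    rows is a deque of buffered rows and first is the line number of rows[0].
    Returns the new (start line number, end line number) of the window.
    """
    new_first = max(1, first - n)
    if new_first < first:
        # extendleft() reverses its input, so reverse first to keep row order.
        rows.extendleft(reversed(load_rows(new_first, first - 1)))
    k = max(0, len(rows) - 2 * n)  # assumed: K measured after the insertion
    for _ in range(k):
        rows.pop()
    return new_first, new_first + len(rows) - 1


# Demo: start at the end of a fake 350-row file, with rows 201..350 buffered.
def load(a, b):
    return list(range(a, b + 1))


rows = deque(range(201, 351))
window = (201, 350)
history = [window]
for _ in range(3):
    window = scroll_up(rows, window[0], 100, load)
    history.append(window)
```

With these numbers the first call deletes K = 50 rows and the second deletes K = 100, ending at the window 1 to 2N that step S14 requires.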
Those of ordinary skill in the art will appreciate that the embodiments described here are intended to help the reader understand the principles of the invention, and that the scope of protection of the invention is not limited to these specific statements and embodiments. Based on the technical teachings disclosed by the invention, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the invention, and such variations and combinations remain within the scope of protection of the invention.
Claims (9)
1. A super large text file comparison method, characterized by comprising the following steps:
A. Obtain the source text file, analyze its content, and extract the source text file comparison information;
B. Obtain the target text file, analyze its content, and extract the target text file comparison information;
C. Set the text comparison parameters and calculate the comparison result using a multi-threaded synchronous comparison method;
D. Dynamically display the compared files using a memory buffer mechanism.
2. The super large text file comparison method according to claim 1, characterized in that analyzing the source text file content and extracting the source text file comparison information in step A is specifically:
Analyze the source text file content and, using a paging management mechanism, divide the source text file into multiple pages of file data according to the page size parameter and extract the page information of the source text file; then analyze the page information of the source text file and, using a row-splitting management mechanism, divide each page of file data into multiple rows of data and extract the row information of the source text file.
3. The super large text file comparison method according to claim 1, characterized in that analyzing the target text file content and extracting the target text file comparison information in step B is specifically:
Analyze the target text file content and, using a paging management mechanism, divide the target text file into multiple pages of file data according to the page size parameter and extract the page information of the target text file; then analyze the page information of the target text file and, using a row-splitting management mechanism, divide each page of file data into multiple rows of data and extract the row information of the target text file.
4. The super large text file comparison method according to claim 2 or 3, characterized in that the page information includes the file name, file size, file page size, file page number, file page start address, and file page byte length.
5. The super large text file comparison method according to claim 2 or 3, characterized in that the row information includes the line number, start address, offset address, byte length, and frame number.
6. The super large text file comparison method according to claim 1, characterized in that calculating the comparison result using the multi-threaded synchronous comparison method in step C is specifically:
Parse the row data according to the data frame format, then compare the row information of the source text file with the row information of the target text file, and record the address, length, and type of each difference.
7. The super large text file comparison method according to claim 1, characterized in that dynamically displaying the compared files using the memory buffer mechanism in step D is specifically:
Display the configured initial rows of the file using the memory buffer mechanism, and load row data dynamically by sliding so that the other comparison data can be viewed.
8. The super large text file comparison method according to claim 7, characterized in that, when the user drags the scroll bar in the direction of increasing line numbers, dynamically loading row data by sliding specifically comprises the following sub-steps:
S11. Display the first N rows of the comparison data, and set the current displayable start line number to 1 and the displayable end line number to N;
S12. When the user drags the scroll bar into the first buffer mark position, the row display manager loads the next N rows of data from the file into its memory, and updates the current displayable start line number to 1 and the displayable end line number to 2N;
S13. When the user drags the scroll bar further into the second buffer mark position, the row display manager loads the next N rows of data from the file into its memory and at the same time deletes the first N rows held in the row display manager, updating the current displayable start line number to N and the displayable end line number to 3N;
S14. When the user drags the scroll bar further into the third buffer mark position or any later mark position, row data are displayed according to the method of step S13;
S15. When the user drags the scroll bar to the last buffer mark position, the row display manager loads all remaining row data from the file into its memory, then deletes the first N rows held in the row display manager, and updates the current displayable start and end line numbers.
9. The super large text file comparison method according to claim 7, characterized in that, when the user drags the scroll bar in the direction of decreasing line numbers, dynamically loading row data by sliding specifically comprises the following sub-steps:
S11. When the user drags the scroll bar into the fourth buffer mark position, the row display manager loads the preceding N rows of data from the file and inserts them before the lowest line number in its memory; at the same time the last K rows held in the row display manager are deleted, and the current displayable start and end line numbers are updated;
S12. When the user drags the scroll bar into the fifth buffer mark position, the row display manager loads the preceding N rows of data from the file and inserts them before the lowest line number in its memory; at the same time the last N rows held in the row display manager are deleted, and the current displayable start and end line numbers are updated;
S13. When the user drags the scroll bar into the sixth buffer mark position or any earlier buffer mark position, row data are displayed according to the method of step S12;
S14. When the user drags the scroll bar to the first buffer mark position, the row display manager loads all remaining row data from the file into its memory, then deletes the last K rows held in the row display manager, so that the current maximum number of displayed rows is 2N, the displayable start line number is 1, and the displayable end line number is 2N.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910062450.6A CN109783454A (en) | 2019-01-23 | 2019-01-23 | A kind of super large text file comparison method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109783454A true CN109783454A (en) | 2019-05-21 |
Family
ID=66502187
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910062450.6A Pending CN109783454A (en) | 2019-01-23 | 2019-01-23 | A kind of super large text file comparison method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109783454A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130117246A1 (en) * | 2011-11-03 | 2013-05-09 | Sebastien Cabaniols | Methods of processing text data |
CN104133705A (en) * | 2014-07-31 | 2014-11-05 | 武汉邮电科学研究院 | System and method for loading PowerPC system guide file through serial port |
CN107463541A (en) * | 2017-07-31 | 2017-12-12 | 武汉斗鱼网络科技有限公司 | File difference comparative approach, storage medium, electronic equipment and system |
CN108920436A (en) * | 2018-06-29 | 2018-11-30 | 郑州云海信息技术有限公司 | A kind of file data comparison method, tool and equipment |
CN109039804A (en) * | 2018-07-12 | 2018-12-18 | 武汉斗鱼网络科技有限公司 | A kind of file reading and electronic equipment |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110413960A (en) * | 2019-06-19 | 2019-11-05 | 平安银行股份有限公司 | File control methods, device, computer equipment and computer readable storage medium |
CN111723052A (en) * | 2020-05-09 | 2020-09-29 | 厦门亿联网络技术股份有限公司 | Editing method and device for large file data |
CN111723052B (en) * | 2020-05-09 | 2022-05-24 | 厦门亿联网络技术股份有限公司 | Editing method and device for large file data |
CN111723229A (en) * | 2020-06-24 | 2020-09-29 | 重庆紫光华山智安科技有限公司 | Data comparison method and device, computer readable storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109783454A (en) | A kind of super large text file comparison method | |
US10169036B2 (en) | Synchronizing comments in source code with text documents | |
US7743317B2 (en) | Automated document formatting tool | |
CA1268553A (en) | Preservation of previously defined text formats | |
CN107729526B (en) | Text structuring method | |
CN107358208B (en) | A kind of PDF document structured message extracting method and device | |
CN109842629B (en) | Method for realizing self-defined protocol based on protocol analysis framework | |
DE112013004769T5 (en) | Space prediction for text input | |
US20140019841A1 (en) | Method for handling excessive input characters in a field | |
CN107391457B (en) | Document segmentation method and device based on text line | |
US8832543B2 (en) | Automated document formatting tool | |
CN112632960A (en) | Log analysis method and system based on dynamic field template | |
WO2018032698A1 (en) | Page turning method and device, and writing terminal | |
CN100407159C (en) | Method for recovering files deleted from FAT32 document system | |
US9658988B2 (en) | Systems and methods to segment text for layout and rendering | |
CN111159497A (en) | Regular expression generation method and regular expression-based data extraction method | |
US8930808B2 (en) | Processing rich text data for storing as legacy data records in a data storage system | |
KR101690075B1 (en) | Method for materialization issues in the source code files based on log | |
CN114020717A (en) | Method, device, equipment and medium for acquiring performance data of distributed storage system | |
US11036693B2 (en) | Apparatus of continuous profiling for multicore embedded system and method of the same | |
CN112380173B (en) | Intelligent correction rapid PCM decoding calculation method | |
US20180225348A1 (en) | Database processing method and database processing device | |
US10613839B2 (en) | Source code display device, source code display method, and computer readable recording medium having program for performing the same | |
JP2016218743A (en) | Operation candidate providing program, operation candidate providing apparatus, and operation candidate providing method | |
CN109726166B (en) | Electronic book display method and device, computer equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190521 |