CN109783454A - A method for comparing very large text files - Google Patents

A method for comparing very large text files

Info

Publication number
CN109783454A
Authority
CN
China
Prior art keywords
row
file
text file
data
line number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910062450.6A
Other languages
Chinese (zh)
Inventor
王成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Yihaitong Technology Co Ltd
Original Assignee
Chengdu Yihaitong Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Yihaitong Technology Co Ltd filed Critical Chengdu Yihaitong Technology Co Ltd
Priority to CN201910062450.6A priority Critical patent/CN109783454A/en
Publication of CN109783454A publication Critical patent/CN109783454A/en
Pending legal-status Critical Current

Links

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for comparing very large text files. The method comprises extracting comparison information from a source text file, extracting comparison information from a target text file, calculating the comparison result using a multi-thread synchronous comparison method, and dynamically displaying the compared files using a memory buffer mechanism. First, when analyzing a very large text file, the invention manages text addresses and text sizes instead of loading large volumes of data, which keeps memory usage small and effectively addresses the problem of insufficient memory when processing very large text. Second, it combines paging and row-splitting management mechanisms with multi-threaded processing, which greatly increases the comparison speed for very large text files. Finally, the text display combines row management with a buffer to load text content dynamically according to the user's needs, so that content loads in real time and continuously, the text displays smoothly, and the user experience is greatly improved.

Description

A method for comparing very large text files
Technical field
The invention belongs to the technical field of file comparison, and specifically relates to a method for comparing very large text files.
Background art
At present, many software tools for text files are in common use in the industry, for example Notepad, Notepad++, UltraEdit and bes. When processing very large text files, especially files of GB size and above, these tools mainly face the following difficulties:
1. Opening a file requires a very large memory overhead, which the system memory cannot support.
2. They can only display data and cannot comparatively analyze two or more files.
3. They have no bit error rate statistics function.
4. User parameters cannot be set.
5. File comparison is too slow.
Summary of the invention
The object of the invention is as follows: to solve the above problems in the prior art, the invention proposes a method for comparing very large text files.
The technical solution of the invention is a method for comparing very large text files, comprising the following steps:
A. Obtain a source text file, analyze its content, and extract the comparison information of the source text file;
B. Obtain a target text file, analyze its content, and extract the comparison information of the target text file;
C. Set the text comparison parameters and calculate the comparison result using a multi-thread synchronous comparison method;
D. Dynamically display the compared files using a memory buffer mechanism.
Further, analyzing the source text file content and extracting the comparison information of the source text file in step A specifically comprises:
Analyzing the source text file content; using a paging management mechanism, dividing the source text file into multiple pages of file data according to a page size parameter and extracting the page information of the source text file; then analyzing the page information of the source text file and, using a row-splitting management mechanism, dividing each page of file data into multiple rows of data and extracting the row information of the source text file.
Further, analyzing the target text file content and extracting the comparison information of the target text file in step B specifically comprises:
Analyzing the target text file content; using a paging management mechanism, dividing the target text file into multiple pages of file data according to a page size parameter and extracting the page information of the target text file; then analyzing the page information of the target text file and, using a row-splitting management mechanism, dividing each page of file data into multiple rows of data and extracting the row information of the target text file.
Further, the page information includes the file name, file size, file page size, number of file pages, file page start address, and file page byte length.
Further, the row information includes the line number, start address, offset address, byte length, and frame sequence number.
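The page information and row information listed above can be pictured as two small records. The following Python sketch is illustrative only and is not taken from the patent: the class names, the field names, the helper `page_table`, and the reading of "start address" as an absolute byte offset and "offset address" as an offset within the page are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageInfo:
    # Field names are hypothetical; they mirror the page information list.
    file_name: str
    file_size: int      # bytes
    page_size: int      # configured page size parameter, bytes
    page_count: int     # number of pages in the file
    page_start: int     # byte address where this page begins
    page_length: int    # bytes actually covered by this page


@dataclass(frozen=True)
class RowInfo:
    # Field names are hypothetical; they mirror the row information list.
    line_number: int
    start_address: int   # assumed: absolute byte offset in the file
    offset_address: int  # assumed: byte offset inside the page
    byte_length: int
    frame_number: int


def page_table(file_name: str, file_size: int, page_size: int) -> list[PageInfo]:
    """Describe every page by address and length only; no file data is read."""
    page_count = (file_size + page_size - 1) // page_size
    pages = []
    for i in range(page_count):
        start = i * page_size
        length = min(page_size, file_size - start)
        pages.append(PageInfo(file_name, file_size, page_size, page_count, start, length))
    return pages
```

Because each record holds only addresses and lengths, the memory cost grows with the number of pages and rows rather than with the amount of text in the file.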
Further, calculating the comparison result using the multi-thread synchronous comparison method in step C specifically comprises:
Parsing the row data according to the data frame format, then comparing the row information of the source text file with the row information of the target text file, and recording the address, length, and type of each difference.
Further, dynamically displaying the compared files using the memory buffer mechanism in step D specifically comprises:
Displaying the configured initial rows of file data using the memory buffer mechanism, and loading row data dynamically as the user scrolls in order to view the rest of the comparison data.
Further, when the user drags the scroll bar in the direction of increasing line numbers while row data is loaded dynamically by scrolling, the method specifically comprises the following sub-steps:
S11. Display the first N rows of the comparison data, and set the current displayable start line number to 1 and the displayable end line number to N;
S12. When the user drags the scroll bar into the first buffer marker position, the row display manager loads the next N rows of data from the file into its memory and updates the current displayable start line number to 1 and the displayable end line number to 2N;
S13. When the user drags the scroll bar further into the second buffer marker position, the row display manager again loads the next N rows of data from the file into its memory while deleting the first N rows it holds, and updates the current displayable start line number to N and the displayable end line number to 3N;
S14. When the user drags the scroll bar further into the third buffer marker position or any later marker position, row data is displayed according to the method of step S13;
S15. When the user drags the scroll bar to the last buffer marker position, the row display manager loads all remaining rows of data from the file into its memory, then deletes the first N rows it holds, and updates the current displayable start and end line numbers.
Further, when the user drags the scroll bar in the direction of decreasing line numbers while row data is loaded dynamically by scrolling, the method specifically comprises the following sub-steps:
S11. When the user drags the scroll bar into the fourth buffer marker position, the row display manager loads the preceding N rows of data from the file and inserts them before the smallest line number in its memory, deletes the last K rows it holds, and updates the current displayable start and end line numbers;
S12. When the user drags the scroll bar into the fifth buffer marker position, the row display manager again loads the preceding N rows of data from the file and inserts them before the smallest line number in its memory, deletes the last N rows it holds, and updates the current displayable start and end line numbers;
S13. When the user drags the scroll bar into the sixth buffer marker position or any earlier buffer marker position, row data is displayed according to the method of step S12;
S14. When the user drags the scroll bar to the first buffer marker position, the row display manager loads all remaining rows of data from the file into its memory, then deletes the last K rows it holds, ensuring that the current maximum number of displayed rows is 2N, with a displayable start line number of 1 and a displayable end line number of 2N.
The beneficial effects of the invention are as follows. First, when analyzing a very large text file, the invention manages text addresses and text sizes instead of loading large volumes of data, which keeps memory usage small and effectively addresses the problem of insufficient memory when processing very large text. Second, it combines paging and row-splitting management mechanisms with multi-threaded processing, which greatly increases the comparison speed for very large text files. Finally, the text display combines row management with a buffer to load text content dynamically according to the user's needs, so that content loads in real time and continuously, the text displays smoothly, and the user experience is greatly improved.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for comparing very large text files of the invention;
Fig. 2 is a flow diagram of the text comparison in the method for comparing very large text files of the invention;
Fig. 3 is a schematic diagram of the file display processing in the method for comparing very large text files of the invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Fig. 1 shows the flow diagram of the method for comparing very large text files of the invention. The method comprises the following steps:
A. Obtain a source text file, analyze its content, and extract the comparison information of the source text file;
B. Obtain a target text file, analyze its content, and extract the comparison information of the target text file;
C. Set the text comparison parameters and calculate the comparison result using a multi-thread synchronous comparison method;
D. Dynamically display the compared files using a memory buffer mechanism.
In step A, the invention first obtains the source text file, then analyzes the obtained source text file content and extracts the page information and row information of the source text file, specifically:
A file analysis thread is created to analyze the source text file content. Using the paging management mechanism, the source text file is divided into multiple pages of file data according to the page size parameter, and the page information of the source text file is extracted. The page information of the source text file is then analyzed and, using the row-splitting management mechanism, each page of file data is divided into multiple rows of data and the row information of the source text file is extracted.
The page information of the source text file specifically includes the file name, file size, file page size, number of file pages, file page start address, and file page byte length.
The row information of the source text file specifically includes the line number, start address, offset address, byte length, and frame sequence number.
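The two-level analysis described above (pages first, then rows within each page) can be sketched as a single pass that reads one page at a time and keeps only row addresses and lengths. This is a minimal Python illustration under stated assumptions, not the patented implementation: the function names `index_rows` and `read_row` are hypothetical, rows are assumed to end with a newline byte, and the data frame format is ignored.

```python
def index_rows(path: str, page_size: int = 1 << 20) -> list[tuple[int, int]]:
    """Return (start_address, byte_length) for every row in the file.

    Only one page of page_size bytes is held in memory at a time; a row that
    straddles a page boundary is closed when its newline is eventually seen.
    """
    rows = []
    row_start = 0   # absolute address where the current row begins
    pos = 0         # absolute address of the current page
    with open(path, "rb") as f:
        while True:
            page = f.read(page_size)
            if not page:
                break
            i = page.find(b"\n")
            while i != -1:
                end = pos + i + 1            # address just past the newline
                rows.append((row_start, end - row_start))
                row_start = end
                i = page.find(b"\n", i + 1)
            pos += len(page)
    if pos > row_start:                       # final row without a newline
        rows.append((row_start, pos - row_start))
    return rows


def read_row(path: str, start: int, length: int) -> bytes:
    """Fetch a single row on demand using its recorded address and length."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(length)
```

With the index in hand, any row can be fetched by seeking to its address, so the full text never needs to be resident in memory.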
In step B, the invention first obtains the target text file, then analyzes the obtained target text file content and extracts the page information and row information of the target text file, specifically:
A file analysis thread is created to analyze the target text file content. Using the paging management mechanism, the target text file is divided into multiple pages of file data according to the page size parameter, and the page information of the target text file is extracted. The page information of the target text file is then analyzed and, using the row-splitting management mechanism, each page of file data is divided into multiple rows of data and the row information of the target text file is extracted.
The page information of the target text file specifically includes the file name, file size, file page size, number of file pages, file page start address, and file page byte length.
The row information of the target text file specifically includes the line number, start address, offset address, byte length, and frame sequence number.
In step C, Fig. 2 shows the text comparison flow in the method for comparing very large text files of the invention. The invention first selects a data frame format parsing file and checks whether the file exists. If the file exists, the comparison parameters are configured; if it does not, a data frame format parsing file is selected again. The invention then checks whether the maximum number of threads has been reached: if so, it waits for a thread to complete; if not, it creates a page data parsing and comparison thread. After all threads have completed, the comparison result is calculated using the multi-thread synchronous comparison method, specifically:
The row data is parsed according to the data frame format, then the row information of the source text file is compared with the row information of the target text file, and the address, length, and type of each difference are recorded. Here the difference types include frame loss, bit error, normal, and so on.
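The thread-limited, page-by-page comparison described above might be sketched with a bounded thread pool. Everything below is an assumption made for illustration: the names `compare_page` and `compare_pages`, the tuple layout, and especially the classification rule (a missing target row counts as frame loss, differing bytes as a bit error). The patent does not define the data frame format or how difference types are decided, and rows are passed in memory here only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor


def compare_page(page):
    """Compare one page of rows.

    page: list of (line_number, address, source_row, target_row_or_None).
    Returns (line_number, address, length, difference_type) per difference.
    """
    diffs = []
    for line_no, address, src, dst in page:
        if dst is None:
            # Assumed rule: no matching target row means the frame was lost.
            diffs.append((line_no, address, len(src), "frame_loss"))
        elif src != dst:
            # Assumed rule: same row present but bytes differ.
            diffs.append((line_no, address, max(len(src), len(dst)), "bit_error"))
        # Identical rows are "normal" and produce no difference record.
    return diffs


def compare_pages(pages, max_threads=4):
    """Compare pages concurrently with at most max_threads worker threads."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() yields results in page order, so differences stay sorted.
        results = pool.map(compare_page, pages)
        return [d for page_diffs in results for d in page_diffs]
```

The pool's `max_workers` plays the role of the maximum thread count check in the flow: extra pages wait until a worker is free, and leaving the `with` block waits for all threads to finish.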
In step D, Fig. 3 shows the file display processing in the method for comparing very large text files of the invention. The invention dynamically displays the compared files using the memory buffer mechanism, specifically:
The configured initial rows of file data are displayed using the memory buffer mechanism, and row data is loaded dynamically as the user scrolls in order to view the rest of the comparison data.
When the user drags the scroll bar in the direction of increasing line numbers while row data is loaded dynamically by scrolling, the method specifically comprises the following sub-steps:
S11. Display the first N rows of the comparison data, and set the current displayable start line number to 1 and the displayable end line number to N; preferably, the invention sets N to 100, that is, the first 100 rows of the comparison data are displayed.
S12. When the user drags the scroll bar into the first buffer marker position ①, the row display manager loads the next N rows of data from the file into its memory and updates the current displayable start line number to 1 and the displayable end line number to 2N;
S13. When the user drags the scroll bar further into the second buffer marker position ②, the row display manager again loads the next N rows of data from the file into its memory while deleting the first N rows it holds, and updates the current displayable start line number to N and the displayable end line number to 3N;
S14. When the user drags the scroll bar further into the third buffer marker position ③ or any later marker position (except the last buffer marker position), row data is displayed according to the method of step S13;
S15. When the user drags the scroll bar to the last buffer marker position, the row display manager loads all remaining rows of data (no more than 100 rows) from the file into its memory, then deletes the first N rows it holds, and updates the current displayable start and end line numbers.
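Sub-steps S11 to S15 describe a sliding window that never holds more than 2N rows. The Python sketch below is one possible reading, not the patented implementation: the class name `RowWindow` and the `load_rows(first, last)` callback (1-based, inclusive) are hypothetical, and each call to `scroll_down` stands for the scroll bar entering the next buffer marker position. Step S13 gives the start line number after the first deletion as N; this sketch reports N + 1, the first row actually retained under 1-based numbering, which is an interpretation.

```python
class RowWindow:
    """Keeps at most 2 * n rows in memory while scrolling toward larger line numbers."""

    def __init__(self, total_rows, load_rows, n=100):
        self.total = total_rows
        self.load = load_rows            # hypothetical: load_rows(first, last) -> list
        self.n = n
        self.first = 1                   # displayable start line number
        self.last = min(n, total_rows)   # displayable end line number
        self.rows = load_rows(self.first, self.last)   # S11: first N rows

    def scroll_down(self):
        """Called when the scroll bar enters the next buffer marker position."""
        if self.last >= self.total:
            return                       # nothing left to load
        new_last = min(self.last + self.n, self.total)
        self.rows.extend(self.load(self.last + 1, new_last))   # next N (or remaining) rows
        self.last = new_last
        if len(self.rows) > 2 * self.n:  # S13/S15: drop the first N rows held
            del self.rows[: self.n]
            self.first += self.n
```

For a 250-row comparison with N = 100, the window covers rows 1 to 100, then 1 to 200, then 101 to 250, so the memory held stays bounded regardless of file size.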
When the user drags the scroll bar in the direction of decreasing line numbers while row data is loaded dynamically by scrolling, the processing mirrors that for the increasing direction. Assuming the scroll bar is currently at the last row, the method specifically comprises the following sub-steps:
S11. When the user drags the scroll bar into the fourth buffer marker position ④, the row display manager loads the preceding N rows of data from the file and inserts them before the smallest line number in its memory, deletes the last K rows it holds, and updates the current displayable start and end line numbers; preferably, the invention sets K = (total number of rows in the row display manager) - 200.
S12. When the user drags the scroll bar into the fifth buffer marker position ⑤, the row display manager again loads the preceding N rows of data from the file and inserts them before the smallest line number in its memory, deletes the last N rows it holds, and updates the current displayable start and end line numbers;
S13. When the user drags the scroll bar into the sixth buffer marker position or any earlier buffer marker position (except the first buffer marker position), row data is displayed according to the method of step S12;
S14. When the user drags the scroll bar to the first buffer marker position, the row display manager loads all remaining rows of data from the file into its memory, then deletes the last K rows it holds, ensuring that the current maximum number of displayed rows is 2N, with a displayable start line number of 1 and a displayable end line number of 2N.
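The reverse direction can be sketched the same way: earlier rows are inserted at the front and the tail is trimmed so that at most 2N rows remain. This is a hedged illustration only; the function name `scroll_up` and the `load_rows(first, last)` callback (1-based, inclusive) are hypothetical, and K is computed as the number of rows held minus 2N, which equals the preferred K = total - 200 when N = 100.

```python
def scroll_up(rows, first, load_rows, n=100):
    """Insert up to n earlier rows at the front, then trim the tail to 2 * n rows.

    rows:  rows currently held, starting at line number `first` (1-based).
    Returns the new (rows, start_line_number, end_line_number).
    """
    if first <= 1:
        return rows, first, first + len(rows) - 1   # already at the top
    new_first = max(1, first - n)
    rows = load_rows(new_first, first - 1) + rows   # insert before the smallest line number
    k = max(0, len(rows) - 2 * n)                   # K: rows beyond the 2N limit
    if k:
        rows = rows[:-k]                            # delete the last K rows held
    return rows, new_first, new_first + len(rows) - 1
```

Starting from rows 201 to 350 with N = 100, two calls yield rows 101 to 300 and then rows 1 to 200, matching the final state in which the start line number is 1 and the end line number is 2N.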
Those of ordinary skill in the art will understand that the embodiments described here are intended to help the reader understand the principles of the invention, and that the scope of protection of the invention is not limited to these specific statements and embodiments. Based on the technical teachings disclosed by the invention, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the invention, and these variations and combinations remain within the scope of protection of the invention.

Claims (9)

1. A method for comparing very large text files, characterized by comprising the following steps:
A. Obtain a source text file, analyze its content, and extract the comparison information of the source text file;
B. Obtain a target text file, analyze its content, and extract the comparison information of the target text file;
C. Set the text comparison parameters and calculate the comparison result using a multi-thread synchronous comparison method;
D. Dynamically display the compared files using a memory buffer mechanism.
2. The method for comparing very large text files according to claim 1, characterized in that analyzing the source text file content and extracting the comparison information of the source text file in step A specifically comprises:
Analyzing the source text file content; using a paging management mechanism, dividing the source text file into multiple pages of file data according to a page size parameter and extracting the page information of the source text file; then analyzing the page information of the source text file and, using a row-splitting management mechanism, dividing each page of file data into multiple rows of data and extracting the row information of the source text file.
3. The method for comparing very large text files according to claim 1, characterized in that analyzing the target text file content and extracting the comparison information of the target text file in step B specifically comprises:
Analyzing the target text file content; using a paging management mechanism, dividing the target text file into multiple pages of file data according to a page size parameter and extracting the page information of the target text file; then analyzing the page information of the target text file and, using a row-splitting management mechanism, dividing each page of file data into multiple rows of data and extracting the row information of the target text file.
4. The method for comparing very large text files according to claim 2 or 3, characterized in that the page information includes the file name, file size, file page size, number of file pages, file page start address, and file page byte length.
5. The method for comparing very large text files according to claim 2 or 3, characterized in that the row information includes the line number, start address, offset address, byte length, and frame sequence number.
6. The method for comparing very large text files according to claim 1, characterized in that calculating the comparison result using the multi-thread synchronous comparison method in step C specifically comprises:
Parsing the row data according to the data frame format, then comparing the row information of the source text file with the row information of the target text file, and recording the address, length, and type of each difference.
7. The method for comparing very large text files according to claim 1, characterized in that dynamically displaying the compared files using the memory buffer mechanism in step D specifically comprises:
Displaying the configured initial rows of file data using the memory buffer mechanism, and loading row data dynamically as the user scrolls in order to view the rest of the comparison data.
8. The method for comparing very large text files according to claim 7, characterized in that, when the user drags the scroll bar in the direction of increasing line numbers while row data is loaded dynamically by scrolling, the method specifically comprises the following sub-steps:
S11. Display the first N rows of the comparison data, and set the current displayable start line number to 1 and the displayable end line number to N;
S12. When the user drags the scroll bar into the first buffer marker position, the row display manager loads the next N rows of data from the file into its memory and updates the current displayable start line number to 1 and the displayable end line number to 2N;
S13. When the user drags the scroll bar further into the second buffer marker position, the row display manager again loads the next N rows of data from the file into its memory while deleting the first N rows it holds, and updates the current displayable start line number to N and the displayable end line number to 3N;
S14. When the user drags the scroll bar further into the third buffer marker position or any later marker position, row data is displayed according to the method of step S13;
S15. When the user drags the scroll bar to the last buffer marker position, the row display manager loads all remaining rows of data from the file into its memory, then deletes the first N rows it holds, and updates the current displayable start and end line numbers.
9. The method for comparing very large text files according to claim 7, characterized in that, when the user drags the scroll bar in the direction of decreasing line numbers while row data is loaded dynamically by scrolling, the method specifically comprises the following sub-steps:
S11. When the user drags the scroll bar into the fourth buffer marker position, the row display manager loads the preceding N rows of data from the file and inserts them before the smallest line number in its memory, deletes the last K rows it holds, and updates the current displayable start and end line numbers;
S12. When the user drags the scroll bar into the fifth buffer marker position, the row display manager again loads the preceding N rows of data from the file and inserts them before the smallest line number in its memory, deletes the last N rows it holds, and updates the current displayable start and end line numbers;
S13. When the user drags the scroll bar into the sixth buffer marker position or any earlier buffer marker position, row data is displayed according to the method of step S12;
S14. When the user drags the scroll bar to the first buffer marker position, the row display manager loads all remaining rows of data from the file into its memory, then deletes the last K rows it holds, ensuring that the current maximum number of displayed rows is 2N, with a displayable start line number of 1 and a displayable end line number of 2N.
CN201910062450.6A 2019-01-23 2019-01-23 A kind of super large text file comparison method Pending CN109783454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910062450.6A CN109783454A (en) 2019-01-23 2019-01-23 A kind of super large text file comparison method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910062450.6A CN109783454A (en) 2019-01-23 2019-01-23 A kind of super large text file comparison method

Publications (1)

Publication Number Publication Date
CN109783454A true CN109783454A (en) 2019-05-21

Family

ID=66502187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910062450.6A Pending CN109783454A (en) 2019-01-23 2019-01-23 A kind of super large text file comparison method

Country Status (1)

Country Link
CN (1) CN109783454A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130117246A1 (en) * 2011-11-03 2013-05-09 Sebastien Cabaniols Methods of processing text data
CN104133705A (en) * 2014-07-31 2014-11-05 武汉邮电科学研究院 System and method for loading PowerPC system guide file through serial port
CN107463541A (en) * 2017-07-31 2017-12-12 武汉斗鱼网络科技有限公司 File difference comparative approach, storage medium, electronic equipment and system
CN108920436A (en) * 2018-06-29 2018-11-30 郑州云海信息技术有限公司 A kind of file data comparison method, tool and equipment
CN109039804A (en) * 2018-07-12 2018-12-18 武汉斗鱼网络科技有限公司 A kind of file reading and electronic equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110413960A (en) * 2019-06-19 2019-11-05 平安银行股份有限公司 File control methods, device, computer equipment and computer readable storage medium
CN111723052A (en) * 2020-05-09 2020-09-29 厦门亿联网络技术股份有限公司 Editing method and device for large file data
CN111723052B (en) * 2020-05-09 2022-05-24 厦门亿联网络技术股份有限公司 Editing method and device for large file data
CN111723229A (en) * 2020-06-24 2020-09-29 重庆紫光华山智安科技有限公司 Data comparison method and device, computer readable storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190521
