A Method and System for Processing Big-Data-Scale Pcap Files
Technical Field
The present invention relates to the technical field of computer networks, and more particularly to a method and system for processing big-data-scale Pcap files.
Background
A Pcap file is a data file that stores the packets captured from a network. Typically, Wireshark is used to inspect the contents of a Pcap file, to select the packets of interest with a filter, and to analyze the network traffic. In the prior art, however, the size of system memory generally limits processing to Pcap files on the order of gigabytes. For big-data-scale Pcap files on the order of terabytes, the processing speed, the efficiency, and the integrity of the processed data are all poor.
Summary of the Invention
To address the above defects of the prior art, the present invention proposes a method and system for processing big-data-scale Pcap files. System memory is calculated first, and the system memory usage is then calculated dynamically while the Pcap file is read sequentially. When memory usage reaches a specified value, reading stops and a marker is set. The data read in that pass is analyzed, the memory is released, and reading of the Pcap file resumes from the marker, until the Pcap file has been completely processed.
Specifically, the invention comprises:
A method for processing big-data-scale Pcap files, comprising the following steps:
Step 1: Obtain system memory information and calculate the system capacity;
Step 2: Begin reading data sequentially from the start of the Pcap file;
Step 3: Dynamically calculate memory usage, and pause reading when memory usage reaches a specified value;
Step 4: Set a marker at the position where reading was paused, calculate a characteristic value of the data read in this pass, and store the marker and the characteristic value in a log file;
Step 5: Analyze the data read in this pass, extract feature information, and store the feature information as specified;
Step 6: Return to the position of the marker in the Pcap file, erase the marker, release system memory, resume sequential reading, and perform steps 3 through 5 again;
Step 7: Repeat step 6 until the Pcap file has been completely processed.
Further, the method also includes: using the same calculation that produced the characteristic value of each pass, calculating the characteristic value of the data between each pair of adjacent markers recorded in the log file. If the results match the characteristic values in the log file exactly, the processed Pcap file is complete. If they do not, the data preceding the marker whose characteristic value failed to match is incomplete, and it is necessary to return to the corresponding marker position in the Pcap file and reacquire the data.
Further, storing the feature information as specified means, specifically: storing the feature information in a file named after the marker.
Further, the feature information includes: source IP, destination IP, URL, protocol type, and port information.
Further, the system memory information includes: total system memory, free system memory, block device buffer size, and file cache size.
A system for processing big-data-scale Pcap files, including:
A system capacity calculation module, for obtaining system memory information and calculating the system capacity;
A memory usage calculation module, for dynamically calculating memory usage and pausing reading when memory usage reaches a specified value;
A marker setting module, for setting a marker at the position where reading was paused, calculating a characteristic value of the data read in this pass, and storing the marker and the characteristic value in a log file;
A data analysis module, for analyzing the data read, extracting feature information, and storing the feature information as specified;
A file reading module, for reading the Pcap file data sequentially and dynamically invoking the memory usage calculation module, the marker setting module, and the data analysis module until the Pcap file has been completely processed.
Further, the system also includes a data integrity verification module, which uses the same calculation that produced the characteristic value of each pass to calculate the characteristic value of the data between each pair of adjacent markers recorded in the log file. If the results match the characteristic values in the log file exactly, the processed Pcap file is complete. If they do not, the data preceding the marker whose characteristic value failed to match is incomplete, and it is necessary to return to the corresponding marker position in the Pcap file and reacquire the data.
Further, storing the feature information as specified means, specifically: storing the feature information in a file named after the marker.
Further, the feature information includes: source IP, destination IP, URL, protocol type, and port information.
Further, the system memory information includes: total system memory, free system memory, block device buffer size, and file cache size.
The beneficial effects of the invention are as follows:
The present invention reads a big-data-scale Pcap file in batches sized according to the state of system memory, which keeps the system processing data at an effective speed and improves the efficiency of data analysis. Further, the invention makes it possible to verify the integrity of the processed Pcap file, ensuring the integrity of the data and the accuracy of the results.
Brief Description of the Drawings
To illustrate the technical solutions of the present invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments recorded in the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for processing big-data-scale Pcap files according to the present invention;
Fig. 2 is a structural diagram of a system for processing big-data-scale Pcap files according to the present invention.
Detailed Description of the Embodiments
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, and to make the above objects, features, and advantages of the invention clearer, the technical solutions of the invention are described in further detail below with reference to the drawings.
The present invention provides an embodiment of a method for processing big-data-scale Pcap files which, as shown in Fig. 1, includes:
S101: Obtain system memory information and calculate the system capacity;
S102: Begin reading data sequentially from the start of the Pcap file;
S103: Dynamically calculate memory usage, and pause reading when memory usage reaches a specified value;
Memory usage (MEMUsedPerc) can be calculated as follows:
MEMUsedPerc = 100 * (MemTotal - MemFree - Buffers - Cached) / MemTotal
where
MemTotal: total system memory
MemFree: free system memory
Buffers: block device buffer size
Cached: file cache size
The specified value can be set according to the specific data processing requirements, the system environment, and so on; in general, it does not exceed 90%;
S104: Set a marker at the position where reading was paused, calculate a characteristic value of the data read in this pass, and store the marker and the characteristic value in a log file;
S105: Analyze the data read in this pass, extract feature information, and store the feature information as specified; the analysis and the extraction are carried out according to the specific data analysis requirements;
S106: Return to the position of the marker in the Pcap file, erase the marker, release system memory, and resume sequential reading;
S107: Determine whether the Pcap file has been completely processed; if not, go to S103; if so, end.
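Steps S102 to S107 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `process_in_batches`, the `mem_usage` and `analyze` callables, the JSON log layout, and the use of SHA-256 as the characteristic value are all hypothetical, since the specification does not name a particular calculation. The sketch also treats the file as a plain byte stream; a practical version would presumably align batch boundaries with Pcap record boundaries.

```python
import hashlib
import json


def process_in_batches(path, log_path, mem_usage, analyze,
                       threshold=90.0, block_size=1 << 20):
    """Read `path` sequentially in memory-bounded batches.

    mem_usage: callable returning current memory usage in percent (assumed).
    analyze:   callable(marker, data) invoked once per batch (assumed).
    Returns the (marker, digest) pairs that were written to the log.
    """
    entries = []
    with open(path, "rb") as src, open(log_path, "w") as log:
        while True:
            buf = bytearray()
            # S102/S103: keep reading until the threshold or end of file.
            while True:
                block = src.read(block_size)
                if not block:
                    break
                buf += block
                if mem_usage() >= threshold:
                    break
            if not buf:
                break  # S107: nothing left, the file is fully processed.
            # S104: the marker is the offset at which reading paused.
            marker = src.tell()
            digest = hashlib.sha256(buf).hexdigest()  # assumed characteristic value
            log.write(json.dumps({"marker": marker, "digest": digest}) + "\n")
            entries.append((marker, digest))
            # S105: analyze this batch; analyze() should not keep a reference.
            analyze(marker, buf)
            # S106: drop the batch so its memory can be reclaimed, then resume.
            del buf
    return entries
```

Passing `mem_usage` as a callable keeps the loop independent of how memory usage is measured, which also makes the batching logic easy to exercise with a fixed value.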
Preferably, the method also includes: using the same calculation that produced the characteristic value of each pass, calculating the characteristic value of the data between each pair of adjacent markers recorded in the log file. If the results match the characteristic values in the log file exactly, the processed Pcap file is complete. If they do not, the data preceding the marker whose characteristic value failed to match is incomplete, and it is necessary to return to the corresponding marker position in the Pcap file and reacquire the data.
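The integrity check described above might look like the following sketch. It assumes the characteristic value is a SHA-256 digest and that the log has already been loaded as a list of `(marker, digest)` pairs in file order; both the digest choice and the name `verify_batches` are assumptions made for illustration only.

```python
import hashlib


def verify_batches(path, entries, block_size=1 << 20):
    """Recompute the digest of each segment between adjacent markers.

    entries: (marker, expected_digest) pairs in file order; the first
             segment is assumed to start at offset 0.
    Returns the markers whose preceding segment failed to match.
    """
    failed = []
    start = 0
    with open(path, "rb") as src:
        for marker, expected in entries:
            src.seek(start)
            h = hashlib.sha256()
            remaining = marker - start
            # Hash the segment in blocks so it never has to fit in memory.
            while remaining > 0:
                block = src.read(min(remaining, block_size))
                if not block:
                    break
                h.update(block)
                remaining -= len(block)
            if h.hexdigest() != expected:
                failed.append(marker)
            start = marker
    return failed
```

An empty result corresponds to the case in which every characteristic value matches; each marker in a non-empty result identifies a segment that would need to be read again.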
Preferably, storing the feature information as specified means, specifically: storing the feature information in a file named after the marker.
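As a small illustration of this storage rule, the sketch below writes the feature records of one batch to a file whose name is derived from the marker. The function name `store_features`, the `.jsonl` extension, and the one-JSON-object-per-line layout are assumptions; the specification only requires that the file be named after the marker.

```python
import json
import os


def store_features(out_dir, marker, records):
    """Write one batch of feature records to a file named after its marker."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{marker}.jsonl")  # file name taken from the marker
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")  # one record per line
    return path
```

Naming the file after the marker ties each set of results to the corresponding entry in the log file, so a batch that fails verification can be located and regenerated on its own.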
Preferably, the feature information includes: source IP, destination IP, URL, protocol type, and port information.
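To make the feature fields concrete, the following sketch extracts the source IP, destination IP, protocol, and ports from a single captured frame. It assumes an Ethernet II frame carrying IPv4 without VLAN tags, and the function name `extract_features` and the dictionary keys are hypothetical. URL extraction is omitted because it requires parsing the application-layer payload.

```python
import socket
import struct

PROTOCOLS = {6: "TCP", 17: "UDP"}  # IANA protocol numbers


def extract_features(frame):
    """Return a feature dict for an Ethernet II / IPv4 frame, else None."""
    if len(frame) < 34:  # 14-byte Ethernet header + 20-byte minimal IPv4 header
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:  # not IPv4
        return None
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or len(frame) < 14 + ihl:
        return None
    proto = frame[23]  # IPv4 protocol field
    features = {
        "src_ip": socket.inet_ntoa(frame[26:30]),
        "dst_ip": socket.inet_ntoa(frame[30:34]),
        "protocol": PROTOCOLS.get(proto, str(proto)),
    }
    l4 = 14 + ihl  # start of the transport header
    if proto in PROTOCOLS and len(frame) >= l4 + 4:
        src_port, dst_port = struct.unpack("!HH", frame[l4:l4 + 4])
        features["src_port"] = src_port
        features["dst_port"] = dst_port
    return features
```

The port fields are only meaningful for unfragmented packets or first fragments; a fuller implementation would check the IPv4 fragment offset before reading them.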
Preferably, the system memory information includes: total system memory, free system memory, block device buffer size, and file cache size.
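The four quantities above correspond to the `MemTotal`, `MemFree`, `Buffers`, and `Cached` fields used in the MEMUsedPerc formula. The sketch below computes memory usage from text in the format of the Linux `/proc/meminfo` file. Reading from `/proc/meminfo` is an assumption, since the specification does not say where the values come from, and the function names are illustrative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: size in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def mem_used_perc(info):
    """MEMUsedPerc = 100 * (MemTotal - MemFree - Buffers - Cached) / MemTotal."""
    used = info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]
    return 100.0 * used / info["MemTotal"]


def read_mem_used_perc(path="/proc/meminfo"):
    """Convenience wrapper for Linux hosts (assumed data source)."""
    with open(path) as f:
        return mem_used_perc(parse_meminfo(f.read()))
```

For example, with MemTotal 16000000 kB, MemFree 4000000 kB, Buffers 1000000 kB, and Cached 3000000 kB, the formula gives 100 * 8000000 / 16000000 = 50, well below a 90% threshold.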
A system for processing big-data-scale Pcap files, including:
A system capacity calculation module 201, for obtaining system memory information and calculating the system capacity;
A memory usage calculation module 202, for dynamically calculating memory usage and pausing reading when memory usage reaches a specified value;
A marker setting module 203, for setting a marker at the position where reading was paused, calculating a characteristic value of the data read in this pass, and storing the marker and the characteristic value in a log file;
A data analysis module 204, for analyzing the data read, extracting feature information, and storing the feature information as specified;
A file reading module 205, for reading the Pcap file data sequentially and dynamically invoking the memory usage calculation module, the marker setting module, and the data analysis module until the Pcap file has been completely processed.
Preferably, the system also includes a data integrity verification module, which uses the same calculation that produced the characteristic value of each pass to calculate the characteristic value of the data between each pair of adjacent markers recorded in the log file. If the results match the characteristic values in the log file exactly, the processed Pcap file is complete. If they do not, the data preceding the marker whose characteristic value failed to match is incomplete, and it is necessary to return to the corresponding marker position in the Pcap file and reacquire the data.
Preferably, storing the feature information as specified means, specifically: storing the feature information in a file named after the marker.
Preferably, the feature information includes: source IP, destination IP, URL, protocol type, and port information.
Preferably, the system memory information includes: total system memory, free system memory, block device buffer size, and file cache size.
In this specification, the method embodiment is described in a progressive manner. Because the system embodiment is substantially similar to the method embodiment, it is described more briefly; for the relevant details, refer to the description of the method embodiment.
To address the technical deficiency that the prior art cannot ensure effective processing of big-data-scale Pcap files, the present invention proposes a method and system for processing such files. System memory is calculated first, and the system memory usage is then calculated dynamically while the Pcap file is read sequentially. When memory usage reaches a specified value, reading stops and a marker is set. The data read in that pass is analyzed, the memory is released, and reading of the Pcap file resumes from the marker, until the Pcap file has been completely processed. The present invention reads a big-data-scale Pcap file in batches sized according to the state of system memory, which keeps the system processing data at an effective speed and improves the efficiency of data analysis. Further, the invention makes it possible to verify the integrity of the processed Pcap file, ensuring the integrity of the data and the accuracy of the results.
Although the present invention has been described by way of embodiments, those of ordinary skill in the art will appreciate that the invention admits many variations and modifications that do not depart from its spirit, and it is intended that the appended claims cover such variations and modifications.