CN106547810B - A method and system for fast indexing of stored traffic - Google Patents

A method and system for fast indexing of stored traffic

Info

Publication number
CN106547810B
Authority
CN
China
Prior art keywords
session
stream
file
information
index number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610193639.5A
Other languages
Chinese (zh)
Other versions
CN106547810A (en)
Inventor
邱勇良
张栗伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ahtech Network Safe Technology Ltd
Original Assignee
Beijing Ahtech Network Safe Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ahtech Network Safe Technology Ltd
Priority to CN201610193639.5A
Publication of CN106547810A
Application granted
Publication of CN106547810B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and system for fast indexing of stored traffic. Each packet in the captured traffic is first assigned to the stream and/or session it belongs to according to the TCP and UDP protocols; an incrementing index number and a stream and/or session information structure are established for each stream and/or session; the information in the streams and/or sessions is stored in order of index number; and the required information is subsequently retrieved through an index whose primary key is the index number. The method addresses the limitation of conventional approaches, which can provide either large storage capacity or fast retrieval, but not both.

Description

A method and system for fast indexing of stored traffic
Technical field
The present invention relates to the field of information security, and in particular to a method and system for fast indexing of stored traffic.
Background
In the prior art, the storage capacity of a file system can be expanded, but retrieving data inside it is extremely difficult. Relational databases are convenient to query, but their cost rises quickly as the volume of stored data grows. KV databases retrieve conveniently by hash, but have difficulty maintaining relatively complex relationships for retrieval. With low resource consumption and modest hardware, the present invention meets the requirements of storing large volumes of information, retrieving streams and/or sessions quickly, and retrieving traffic data under multiple combined conditions.
Summary of the invention
In view of the above technical problems, the present invention provides a method and system for fast indexing of stored traffic. The invention assigns each packet in the captured traffic to the stream and/or session it belongs to according to the TCP and UDP protocols, establishes an incrementing index number and a stream and/or session information structure for each stream and/or session, stores the information in the streams and/or sessions by index number, and retrieves the required information through an index whose primary key is the index number.
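As a rough illustration of assigning packets to streams, the sketch below groups packets by their five-tuple. The patent does not say how the two directions of a connection are matched; sorting the two endpoints so that both directions share one key is an assumption, and `FlowKey`, `flow_key`, and `assign_streams` are hypothetical names.

```python
from typing import NamedTuple

class FlowKey(NamedTuple):
    proto: str    # "TCP" or "UDP"
    ep_a: tuple   # (ip, port) of the smaller endpoint
    ep_b: tuple   # (ip, port) of the other endpoint

def flow_key(proto, src_ip, src_port, dst_ip, dst_port):
    """Build a direction-independent key (assumption: both directions are one stream)."""
    endpoints = sorted(((src_ip, src_port), (dst_ip, dst_port)))
    return FlowKey(proto, *endpoints)

def assign_streams(packets):
    """Label each packet (a five-tuple) with a stream number in order of first appearance."""
    table, labels = {}, []
    for pkt in packets:
        key = flow_key(*pkt)
        # setdefault returns the existing label, or registers a new one.
        labels.append(table.setdefault(key, len(table)))
    return labels
```

A real capture path would parse these fields out of packet headers; here the five-tuples are passed in directly to keep the sketch self-contained.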
A method for fast indexing of stored traffic, comprising:
Capturing traffic;
Identifying the stream and/or session to which each TCP and UDP packet in the traffic belongs;
Establishing an incrementing index number and a stream and/or session information structure for each stream and/or session;
Storing the information in the streams and/or sessions in order of index number;
Establishing an index with the index number as the primary key;
Retrieving the required information according to the index number of the stream and/or session;
Wherein storing the information in the streams and/or sessions in order of index number specifically comprises: storing the information in the streams and/or sessions in stream files and/or session files as an array of fixed-size structures, with the records placed in each stream file and/or session file in ascending order of index number; once the number of bytes stored in a stream file and/or session file reaches a user-preset limit, automatically storing the information of subsequent streams and/or sessions in the next stream file and/or session file; the file name of each stream file and/or session file contains the smallest stream and/or session index number in that file; and the relationships among the file names, the streams and/or sessions in the files, and the index numbers are saved in a separate index file.
Further, establishing an incrementing index number for a stream and/or session specifically comprises: generating, for the stream and/or session to which the first packet found in the traffic belongs, an index number composed of a timestamp and an incrementing sequence number.
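One possible way to build such an index number is sketched below. The source only states that the number combines a timestamp with an incrementing sequence number; the 64-bit layout (seconds in the high 32 bits, sequence in the low 32 bits) and the names `IndexNumberGenerator` and `split_index_number` are assumptions made for illustration.

```python
import itertools
import time

class IndexNumberGenerator:
    """Issue increasing index numbers: timestamp in the high bits, sequence in the low bits."""

    def __init__(self, clock=time.time):
        self._clock = clock              # injectable so the sketch can be tested
        self._seq = itertools.count()    # incrementing sequence number
        self._last = -1

    def next(self):
        ts = int(self._clock()) & 0xFFFFFFFF
        seq = next(self._seq) & 0xFFFFFFFF
        # Clamp so the result never decreases, even if the clock steps backwards.
        value = max((ts << 32) | seq, self._last + 1)
        self._last = value
        return value

def split_index_number(value):
    """Recover the (timestamp, sequence) pair under the assumed layout."""
    return value >> 32, value & 0xFFFFFFFF
```

Because later numbers are always larger, appending records as they are created keeps each file sorted by index number, which is what the ordered storage relies on.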
Further, the stream information structure comprises an index number, a five-tuple, a start time, an end time, the numbers of upstream and downstream packets, and a byte count; the session information structure comprises an index number, a start time, an end time, the numbers of upstream and downstream packets, a byte count, the index number of the next session, and a storage location.
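The two structures could be laid out as fixed-size binary records, for example with Python's `struct` module. The field widths, the IPv4-only addresses, and the split of the storage location into a file number plus an offset are assumptions; the source lists only the field names.

```python
import socket
import struct

# Stream record: index number, src IP, dst IP, src port, dst port, protocol,
# start time, end time, upstream packets, downstream packets, byte count.
STREAM = struct.Struct("<Q4s4sHHBIIIIQ")

# Session record: index number, start time, end time, upstream packets,
# downstream packets, byte count, next session index number,
# storage location as (file number, offset).
SESSION = struct.Struct("<QIIIIQQIQ")

def pack_stream(index_number, src_ip, dst_ip, src_port, dst_port, proto,
                start, end, up_packets, down_packets, byte_count):
    """Serialize one stream information structure into a fixed-size record."""
    return STREAM.pack(index_number,
                       socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                       src_port, dst_port, proto, start, end,
                       up_packets, down_packets, byte_count)

def record_offset(position):
    """Byte offset of the record at a given position in a stream file."""
    return position * STREAM.size
```

Fixed-size records are what make positional access possible: the offset of any record follows directly from its position in the array.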
Further, retrieving the required information according to the index number of the stream and/or session specifically comprises: quickly locating, from the index file, the stream file and/or session file that holds the information of the stream and/or session, and then quickly locating, from the index number of the stream and/or session, the position of that information within the stream file and/or session file.
Further, establishing an index with the index number as the primary key may be replaced by: establishing an index with an IP address, port, domain name, URL, or file hash value as the primary key.
Further, the method also comprises: based on an information extraction strategy, when storing the raw traffic information of a session, storing multiple items of raw traffic information of the same type in the same storage file in ascending order of extraction time.
Further, the method also comprises: based on an information extraction strategy, when storing the restored files of a session, storing multiple restored files of the same type in the same storage file in ascending order of extraction time.
A system for fast indexing of stored traffic, comprising:
A capture module, for capturing traffic;
An identification module, for identifying the stream and/or session to which each TCP and UDP packet in the traffic belongs;
A first establishing module, for establishing an incrementing index number and a stream and/or session information structure for each stream and/or session;
A storage module, for storing the information in the streams and/or sessions in order of index number;
A second establishing module, for establishing an index with the index number as the primary key;
A retrieval module, for retrieving the required information according to the index number of the stream and/or session;
The storage module is specifically used for: storing the information in the streams and/or sessions in stream files and/or session files as an array of fixed-size structures, with the records placed in each stream file and/or session file in ascending order of index number; once the number of bytes stored in a stream file and/or session file reaches a user-preset limit, automatically storing the information of subsequent streams and/or sessions in the next stream file and/or session file; the file name of each stream file and/or session file contains the smallest stream and/or session index number in that file; and the relationships among the file names, the streams and/or sessions in the files, and the index numbers are saved in a separate index file.
Further, in the first establishing module, establishing an incrementing index number for a stream and/or session specifically comprises: generating, for the stream and/or session to which the first packet found in the traffic belongs, an index number composed of a timestamp and an incrementing sequence number.
Further, in the first establishing module, the stream information structure comprises an index number, a five-tuple, a start time, an end time, the numbers of upstream and downstream packets, and a byte count; the session information structure comprises an index number, a start time, an end time, the numbers of upstream and downstream packets, a byte count, the index number of the next session, and a storage location.
Further, the retrieval module is specifically used for: quickly locating, from the index file, the stream file and/or session file that holds the information of the stream and/or session, and then quickly locating, from the index number of the stream and/or session, the position of that information within the stream file and/or session file.
Further, establishing an index with the index number as the primary key may be replaced by: establishing an index with an IP address, port, domain name, URL, or file hash value as the primary key.
Further, the system also comprises: a raw traffic information storage module, for storing, based on an information extraction strategy, multiple items of raw traffic information of the same type from a session in the same storage file in ascending order of extraction time.
Further, the system also comprises: a restored file storage module, for storing, based on an information extraction strategy, multiple restored files of the same type from a session in the same storage file in ascending order of extraction time.
The present invention provides a method and system for fast indexing of stored traffic, comprising: capturing traffic; identifying the stream and/or session to which each TCP and UDP packet in the traffic belongs; establishing an incrementing index number and a stream and/or session information structure for each stream and/or session; storing the information in the streams and/or sessions in order of index number; establishing an index with the index number as the primary key; and retrieving the required information according to the index number of the stream and/or session. With low resource consumption and modest hardware, the present invention meets the requirements of storing large volumes of information, retrieving streams and/or sessions quickly, and retrieving traffic data under multiple combined conditions.
By establishing an incrementing index number and an information structure for each stream and/or session, storing the information in the streams and/or sessions in order, and retrieving it by index number, the present invention achieves the technical effect of both large-capacity storage and fast indexing.
Brief description of the drawings
To explain the technical solution of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. Evidently, the drawings described below show only some of the embodiments recorded in the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method embodiment of fast indexing of stored traffic provided by the present invention;
Fig. 2 is a structural diagram of a system embodiment of fast indexing of stored traffic provided by the present invention.
Detailed description
The present invention provides a method and system for fast indexing of stored traffic. To help those skilled in the art better understand the technical solutions in the embodiments, and to make the above objects, features, and advantages of the invention more apparent, the technical solution is described in further detail below with reference to the drawings:
The present invention first provides a method for fast indexing of stored traffic, as shown in Fig. 1, comprising:
S101: capture traffic;
S102: identify the stream and/or session to which each TCP and UDP packet in the traffic belongs;
S103: establish an incrementing index number and a stream and/or session information structure for each stream and/or session;
For the stream and/or session to which the first packet found in the traffic belongs, an index number composed of a timestamp and an incrementing sequence number is generated;
The stream information structure comprises an index number, a five-tuple, a start time, an end time, the numbers of upstream and downstream packets, and a byte count;
The session information structure comprises an index number, a start time, an end time, the numbers of upstream and downstream packets, a byte count, the index number of the next session, and a storage location;
S104: store the information in the streams and/or sessions in order of index number;
The information in the streams and/or sessions is stored in stream files and/or session files as an array of fixed-size structures, with the records placed in each stream file and/or session file in ascending order of index number. Once the number of bytes stored in a stream file and/or session file reaches a user-preset limit, the information of subsequent streams and/or sessions is automatically stored in the next stream file and/or session file. The file name of each stream file and/or session file contains the smallest stream and/or session index number in that file, and the relationships among the file names, the streams and/or sessions in the files, and the index numbers are saved in a separate index file;
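The storage scheme above can be sketched as follows. This is a minimal sketch, not the patented implementation: the two-field record, the `stream_<index>.dat` naming pattern, the text format of the index file, and the class name `RecordStore` are all assumptions, and the sketch rolls over just before a file would exceed the limit.

```python
import os
import struct

RECORD = struct.Struct("<QQ")   # hypothetical record: index number, byte count

class RecordStore:
    """Append fixed-size records to size-limited files named after their first index number."""

    def __init__(self, directory, max_bytes):
        self.directory = directory
        self.max_bytes = max_bytes       # user-preset size limit per file
        self.current = None              # path of the file being appended to
        self.index_path = os.path.join(directory, "files.idx")

    def append(self, index_number, byte_count):
        full = (self.current is not None and
                os.path.getsize(self.current) + RECORD.size > self.max_bytes)
        if self.current is None or full:
            # The new file is named after the smallest index number it will hold.
            name = f"stream_{index_number:020d}.dat"
            self.current = os.path.join(self.directory, name)
            with open(self.index_path, "a") as idx:
                idx.write(f"{index_number} {name}\n")
        with open(self.current, "ab") as f:
            f.write(RECORD.pack(index_number, byte_count))
```

Callers are expected to pass index numbers in increasing order, so every file stays sorted and the index file lists files in ascending order of their smallest index number.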
Based on an information extraction strategy, when the raw traffic information of a session is stored, multiple items of raw traffic information of the same type are stored in the same storage file in ascending order of extraction time. The storage file is named with an incrementing number N. On storage, a header structure is attached to each item of raw traffic information; the header contains the index number of the session to which the item belongs, the length of the item, and the location information of the next item of raw traffic information of the same type. After storage, the offset O of the item within the storage file is recorded, and the storage file name N and the offset O together form a location information structure L. If the item is the first raw traffic information of its session, L is saved in the information structure of that session; otherwise, the header structure of the previous item of raw traffic information in the session is looked up and L is saved in that header. Once the number of bytes stored in a storage file reaches a user-preset limit, subsequent raw traffic information is automatically stored in the next storage file;
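The chained layout described here behaves like a linked list threaded through the storage files. The sketch below keeps the files in memory for brevity; the header field widths, the sentinel used for "no next item", the in-memory `last` table used to find the previous header, and the class name `RawTrafficStore` are assumptions. Grouping by type is omitted.

```python
import struct

HEADER = struct.Struct("<QIIQ")   # session index, payload length, next file N, next offset O
NO_NEXT = (0xFFFFFFFF, 0)         # assumed sentinel meaning "no next item yet"

class RawTrafficStore:
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes       # user-preset size limit per storage file
        self.files = {1: bytearray()}    # storage file name N -> contents
        self.current = 1
        self.first = {}                  # session index -> location L of its first item
        self.last = {}                   # session index -> location L of its latest item

    def store(self, session, payload):
        if len(self.files[self.current]) >= self.max_bytes:
            self.current += 1            # storage file names are incrementing numbers
            self.files[self.current] = bytearray()
        buf = self.files[self.current]
        loc = (self.current, len(buf))   # location information structure L = (N, O)
        buf += HEADER.pack(session, len(payload), *NO_NEXT) + payload
        if session not in self.first:
            self.first[session] = loc    # would live in the session information structure
        else:
            n, o = self.last[session]
            # Patch the "next" fields (byte 12 onward) of the previous header.
            struct.pack_into("<IQ", self.files[n], o + 12, *loc)
        self.last[session] = loc
        return loc

    def read_session(self, session):
        """Follow the chain from the session's first item and return all payloads."""
        loc, out = self.first.get(session), []
        while loc is not None and loc[0] != NO_NEXT[0]:
            n, o = loc
            _, length, next_n, next_o = HEADER.unpack_from(self.files[n], o)
            start = o + HEADER.size
            out.append(bytes(self.files[n][start:start + length]))
            loc = (next_n, next_o)
        return out
```

Reading a session then costs one seek per item, regardless of how many other sessions are interleaved in the same storage files.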
Based on an information extraction strategy, when the restored files of a session are stored, multiple restored files of the same type are stored in the same storage file in ascending order of extraction time. The storage file is named with an incrementing number N1. On storage, a header structure is attached to each restored file; the header contains the index number of the session to which the restored file belongs, the length of the restored file, its file type, and its hash value. After storage, the offset O1 of the restored file within the storage file is recorded, and the storage file name N1 and the offset O1 together form a location information structure L1. L1 is saved in the information structure of the session to which the restored file belongs, and the hash value of the restored file together with L1 is stored in a file hash library. When a restored file is to be stored, the file hash library is checked for the hash value of the current restored file: if it is absent, the file is stored; otherwise, the location information structure L1 recorded for that file in the file hash library is written into the information structure of the session to which the restored file belongs. Once the number of bytes stored in a storage file reaches a user-preset limit, subsequent restored files are automatically stored in the next storage file;
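The hash check amounts to content deduplication: identical restored files are written once and every session that saw them shares the same location. A minimal sketch follows, assuming SHA-256 as the hash (the source does not name an algorithm), a single in-memory storage file, and hypothetical names throughout; headers and file rollover are left out.

```python
import hashlib

class RestoredFileStore:
    def __init__(self):
        self.blob = bytearray()     # stands in for one storage file
        self.hash_library = {}      # file hash -> location (file name N1, offset O1)
        self.session_files = {}     # session index -> list of locations L1

    def store(self, session, content, file_no=1):
        digest = hashlib.sha256(content).hexdigest()
        loc = self.hash_library.get(digest)
        if loc is None:
            # First time this content is seen: write it and remember where.
            loc = (file_no, len(self.blob))
            self.blob += content
            self.hash_library[digest] = loc
        # In both cases the session records the location of the file.
        self.session_files.setdefault(session, []).append(loc)
        return loc
```

With this arrangement a file downloaded in many sessions occupies storage once, while each session still resolves to it through its own information structure.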
S105: establish an index with the index number as the primary key;
Instead of the index number, an IP address, port, domain name, URL, or file hash value may be used as the primary key when creating the index;
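An index keyed on an attribute such as an IP address or port can be pictured as a map from attribute value to index numbers. The sketch below is an in-memory illustration only; the class name `SecondaryIndex`, the keyword-argument interface, and the use of set intersection for combined conditions are assumptions rather than anything the source specifies.

```python
from collections import defaultdict

class SecondaryIndex:
    """Map attribute values (IP, port, domain, URL, file hash) to index numbers."""

    def __init__(self):
        # field name -> value -> list of index numbers
        self._by_field = defaultdict(lambda: defaultdict(list))

    def add(self, index_number, **attrs):
        for field, value in attrs.items():
            self._by_field[field][value].append(index_number)

    def lookup(self, **conditions):
        """Return the index numbers that satisfy every condition."""
        sets = [set(self._by_field[field].get(value, ()))
                for field, value in conditions.items()]
        return sorted(set.intersection(*sets)) if sets else []
```

The index numbers returned by `lookup` can then be resolved to stored records in the same way as a direct lookup by index number.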
S106: retrieve the required information according to the index number of the stream and/or session;
The stream file and/or session file that holds the information of the stream and/or session is quickly located from the index file, and the position of that information within the stream file and/or session file is then quickly located from the index number of the stream and/or session.
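The two-stage lookup can be sketched with two binary searches: one over the smallest index number of each file, and one over the fixed-size records inside the chosen file. The source does not state which search is used within a file; binary search is an assumption that fits records sorted by index number. The record layout and function names are hypothetical.

```python
import bisect
import io
import struct

RECORD = struct.Struct("<QQ")   # hypothetical fixed-size record: index number, payload

def locate_file(min_indexes, index_number):
    """min_indexes: sorted smallest index number of each file. Return its position or None."""
    pos = bisect.bisect_right(min_indexes, index_number) - 1
    return pos if pos >= 0 else None

def find_record(f, index_number):
    """Binary-search a seekable file of records sorted by index number."""
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell() // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD.size)
        key, payload = RECORD.unpack(f.read(RECORD.size))
        if key == index_number:
            return payload
        if key < index_number:
            lo = mid + 1
        else:
            hi = mid
    return None
```

For example, with files whose smallest index numbers are 100 and 200, a query for 103 selects the first file and then needs only a logarithmic number of seeks inside it.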
The present invention also provides a system for fast indexing of stored traffic, as shown in Fig. 2, comprising:
A capture module 201, for capturing traffic;
An identification module 202, for identifying the stream and/or session to which each TCP and UDP packet in the traffic belongs;
A first establishing module 203, for establishing an incrementing index number and a stream and/or session information structure for each stream and/or session;
A storage module 204, for storing the information in the streams and/or sessions in order of index number;
A second establishing module 205, for establishing an index with the index number as the primary key;
A retrieval module 206, for retrieving the required information according to the index number of the stream and/or session.
In conclusion the present invention first to capture flow in message according to Transmission Control Protocol and udp protocol indicate belonging to Stream and/or session, and for stream and/or session establishment incremental index number and stream and/or session information structure, by stream and/or meeting Information in words is stored according to stream and/or the call number size order of session, is gone out according to by the indexed search of major key of call number Required information.In conventional method, large storage capacity and quick-searching can only meet first, and the present invention can consumption compared with Under low-resource and the lower situation of hardware version, meet the requirement of bulk information amount of storage, stream and/or session quick-searching are wanted Ask and many condition it is compound retrieval data on flows requirement.
Above embodiments are to illustrative and not limiting technical solution of the present invention.Appointing for spirit and scope of the invention is not departed from What modification or part replacement, are intended to be within the scope of the claims of the invention.

Claims (14)

1. A method for fast indexing of stored traffic, characterized by comprising:
Capturing traffic;
Identifying the stream and/or session to which each TCP and UDP packet in the traffic belongs;
Establishing an incrementing index number and a stream and/or session information structure for each stream and/or session;
Storing the information in the streams and/or sessions in order of index number;
Establishing an index with the index number as the primary key;
Retrieving the required information according to the index number of the stream and/or session;
Wherein storing the information in the streams and/or sessions in order of index number specifically comprises: storing the information in the streams and/or sessions in stream files and/or session files as an array of fixed-size structures, with the records placed in each stream file and/or session file in ascending order of index number; once the number of bytes stored in a stream file and/or session file reaches a user-preset limit, automatically storing the information of subsequent streams and/or sessions in the next stream file and/or session file; the file name of each stream file and/or session file contains the smallest stream and/or session index number in that file; and the relationships among the file names, the streams and/or sessions in the files, and the index numbers are saved in a separate index file.
2. The method according to claim 1, characterized in that establishing an incrementing index number for a stream and/or session specifically comprises: generating, for the stream and/or session to which the first packet found in the traffic belongs, an index number composed of a timestamp and an incrementing sequence number.
3. The method according to claim 1, characterized in that the stream information structure comprises an index number, a five-tuple, a start time, an end time, the numbers of upstream and downstream packets, and a byte count; and the session information structure comprises an index number, a start time, an end time, the numbers of upstream and downstream packets, a byte count, the index number of the next session, and a storage location.
4. The method according to claim 1, characterized in that retrieving the required information according to the index number of the stream and/or session specifically comprises: quickly locating, from the index file, the stream file and/or session file that holds the information of the stream and/or session, and then quickly locating, from the index number of the stream and/or session, the position of that information within the stream file and/or session file.
5. The method according to claim 1, characterized in that establishing an index with the index number as the primary key is replaced by: establishing an index with an IP address, port, domain name, URL, or file hash value as the primary key.
6. The method according to claim 1, characterized by further comprising: based on an information extraction strategy, when storing the raw traffic information of a session, storing multiple items of raw traffic information of the same type in the same storage file in ascending order of extraction time.
7. The method according to claim 1, characterized by further comprising: based on an information extraction strategy, when storing the restored files of a session, storing multiple restored files of the same type in the same storage file in ascending order of extraction time.
8. A system for fast indexing of stored traffic, characterized by comprising:
A capture module, for capturing traffic;
An identification module, for identifying the stream and/or session to which each TCP and UDP packet in the traffic belongs;
A first establishing module, for establishing an incrementing index number and a stream and/or session information structure for each stream and/or session;
A storage module, for storing the information in the streams and/or sessions in order of index number;
A second establishing module, for establishing an index with the index number as the primary key;
A retrieval module, for retrieving the required information according to the index number of the stream and/or session;
The storage module is specifically used for: storing the information in the streams and/or sessions in stream files and/or session files as an array of fixed-size structures, with the records placed in each stream file and/or session file in ascending order of index number; once the number of bytes stored in a stream file and/or session file reaches a user-preset limit, automatically storing the information of subsequent streams and/or sessions in the next stream file and/or session file; the file name of each stream file and/or session file contains the smallest stream and/or session index number in that file; and the relationships among the file names, the streams and/or sessions in the files, and the index numbers are saved in a separate index file.
9. The system according to claim 8, characterized in that, in the first establishing module, establishing an incrementing index number for a stream and/or session specifically comprises: generating, for the stream and/or session to which the first packet found in the traffic belongs, an index number composed of a timestamp and an incrementing sequence number.
10. The system according to claim 8, characterized in that, in the first establishing module, the stream information structure comprises an index number, a five-tuple, a start time, an end time, the numbers of upstream and downstream packets, and a byte count; and the session information structure comprises an index number, a start time, an end time, the numbers of upstream and downstream packets, a byte count, the index number of the next session, and a storage location.
11. The system according to claim 8, characterized in that the retrieval module is specifically used for: quickly locating, from the index file, the stream file and/or session file that holds the information of the stream and/or session, and then quickly locating, from the index number of the stream and/or session, the position of that information within the stream file and/or session file.
12. The system according to claim 8, characterized in that establishing an index with the index number as the primary key is replaced by: establishing an index with an IP address, port, domain name, URL, or file hash value as the primary key.
13. The system according to claim 8, characterized by further comprising: a raw traffic information storage module, for storing, based on an information extraction strategy, multiple items of raw traffic information of the same type from a session in the same storage file in ascending order of extraction time.
14. The system according to claim 8, characterized by further comprising: a restored file storage module, for storing, based on an information extraction strategy, multiple restored files of the same type from a session in the same storage file in ascending order of extraction time.
CN201610193639.5A 2016-03-31 2016-03-31 A kind of method and system of flow storage quick indexing Active CN106547810B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610193639.5A CN106547810B (en) 2016-03-31 2016-03-31 A kind of method and system of flow storage quick indexing

Publications (2)

Publication Number Publication Date
CN106547810A CN106547810A (en) 2017-03-29
CN106547810B true CN106547810B (en) 2019-07-02

Family

ID=58364942

Country Status (1)

Country Link
CN (1) CN106547810B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073726A (en) * 2011-01-11 2011-05-25 百度在线网络技术(北京)有限公司 Search engine system and structured data import method for search engine system
CN102073725A (en) * 2011-01-11 2011-05-25 百度在线网络技术(北京)有限公司 Method for searching structured data and search engine system for implementing same
CN103714134A (en) * 2013-12-18 2014-04-09 中国科学院计算技术研究所 Network flow data index method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8706710B2 (en) * 2011-05-24 2014-04-22 Red Lambda, Inc. Methods for storing data streams in a distributed environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100195 Beijing city Haidian District minzhuang Road No. 3, Tsinghua Science Park Building 1 Yuquan Huigu a

Applicant after: Beijing ahtech network Safe Technology Ltd

Address before: 100080 Zhongguancun Haidian District street, No. 14, layer, 1 1415-16

Applicant before: Beijing Antiy Electronic Installation Co., Ltd.

GR01 Patent grant