CN106850504A - Harmful code detection method and device based on HTTP static compress data flows - Google Patents

Publication number
CN106850504A
Authority
CN
China
Prior art keywords
character
string
matched
pointer
compressed data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510884627.2A
Other languages
Chinese (zh)
Other versions
CN106850504B (en)
Inventor
李博
彭浩
孙佩源
刘哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201510884627.2A
Publication of CN106850504A
Application granted
Publication of CN106850504B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/02: Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227: Filtering policies
    • H04L63/0245: Filtering by information in the payload

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention provides a method and device for detecting harmful code in HTTP statically compressed data streams. The method includes: searching a gzip compressed data stream for the target binary string corresponding to a harmful pattern string, obtaining the position at which the target binary string first occurs, and performing Huffman decoding on the gzip compressed data stream after that position to obtain a first compressed data stream; and triggering the multi-pattern matching window to slide, which drives the LZ77 decompression sliding window along the first compressed data stream. For a long pointer in the first compressed data stream, only boundary matching is performed, and the interim stack is checked for an index position within the reference character string pointed to by the pointer; if one exists, the corresponding index position at the pointer's target position is determined and stored in the interim stack. LZ77 decompression and skip-based multi-pattern matching thus proceed side by side, which shortens intrusion and harmful code detection time and improves the user's browsing speed.

Description

Harmful code detection method and device based on HTTP static compress data flows
Technical field
The present invention relates to the field of communication technology, and in particular to a method and device for detecting harmful code in HTTP statically compressed data streams.
Background
At present, to improve the security of network transmission, a firewall or gateway server located between an Internet server and a client must perform intrusion detection and harmful code detection on each TCP compressed fragment of the gzip compressed data stream that the Internet server sends over the Hypertext Transfer Protocol (HTTP). A gzip compressed data stream is obtained by applying LZ77 compression and then Huffman coding to a data stream. Each TCP compressed fragment is forwarded to the client only after it passes detection.
In the prior art, when a firewall or gateway server inspects a compressed data stream, it must first perform Huffman decoding and then LZ77 decompression on the stream, and only then run intrusion or harmful code detection on the decompressed data. However, Huffman decoding and LZ77 decompression of the compressed data stream consume a large amount of time and storage, lengthen the transmission time of the compressed data stream, and slow down the client user's browsing.
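The prior-art flow described above (decompress everything, then scan) can be sketched as follows. This is only an illustrative baseline using Python's standard `gzip` module; the function name `baseline_detect`, the sample page, and the sample patterns are assumptions made for the example and do not come from the patent.

```python
import gzip

def baseline_detect(compressed: bytes, patterns: list[bytes]) -> list[bytes]:
    """Prior-art style check: fully decompress, then scan the plaintext."""
    # gzip.decompress performs both Huffman decoding and LZ77 decompression.
    plaintext = gzip.decompress(compressed)
    # Naive scan: one substring search per harmful pattern string.
    return [p for p in patterns if p in plaintext]

# Hypothetical page and patterns, for illustration only.
page = b"<html><script>evil_payload()</script></html>"
stream = gzip.compress(page)
found = baseline_detect(stream, [b"evil_payload", b"benign"])
```

Every byte of the stream is decoded and buffered before any matching starts, which is the time and storage cost the invention aims to reduce.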
Summary of the invention
The present invention provides a method and device for detecting harmful code in HTTP statically compressed data streams, to solve the problem that the existing detection process consumes a large amount of time and storage.
A first aspect of the present invention provides a harmful code detection method based on HTTP statically compressed data streams, including:
Obtaining the gzip compressed data stream that an Internet server sends to a client, the gzip compressed data stream being the compressed data stream that the Internet server obtains by applying LZ77 compression and Huffman coding to original plaintext data;
Querying a preset Huffman tree to obtain the binary Huffman code corresponding to each character in a harmful pattern string, and generating the target binary string corresponding to the harmful pattern string;
Searching the gzip compressed data stream for the target binary string, and obtaining the position at which the target binary string first occurs in the gzip compressed data stream;
Performing Huffman decoding on the compressed data stream after the occurrence position in the gzip compressed data stream to obtain a first compressed data stream that is LZ77-compressed, the first compressed data stream including at least one TCP compressed fragment;
Triggering the multi-pattern matching window to slide, which drives the LZ77 decompression sliding window along the first compressed data stream, so that while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream, the multi-pattern matching window matches harmful pattern strings against the original plaintext data obtained by decompression. During the LZ77 decompression, for each pointer in the first compressed data stream, a preset interim stack is queried to judge whether it holds a first index position within the reference character string pointed to by the pointer; if so, a second index position, within the reference character string copied to the pointer's target position, is determined from the first index position and stored in the interim stack. It is also judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1); if so, the skip distance parameter Length-2(Lmin-1)+1 is set at character Lmin-1 of the reference character string at the pointer's target position, so that when the multi-pattern matching window matches harmful pattern strings against the decompressed original plaintext data, matching jumps Length-2(Lmin-1)+1 characters from character Lmin-1 of the reference character string and resumes at character Length-(Lmin-1)+1. The distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-pattern matching window is less than Lmin-1. Lmin denotes the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by a pointer during LZ77 decompression. A pointer comprises a first distance, from the pointer's target position to the position of the reference character string it points to, and the length of the reference character string;
If, after all of the at least one TCP compressed fragment has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string, sending the gzip compressed data stream corresponding to the TCP compressed fragment to the client.
Further, matching harmful pattern strings against the decompressed original plaintext data using the multi-pattern matching window includes:
Obtaining a first character block to be matched, consisting of byte i through byte i+N-1 of the original plaintext data;
Judging whether a skip distance parameter is set on any character in the first character block to be matched;
If no skip distance parameter is set on any character in the first character block to be matched, judging whether the first character block to be matched has a corresponding step value in a preset SHIFT table; if it has no corresponding step value in the preset SHIFT table, adding Lmin-1 to i and repeating the above steps until the judgment is complete;
Where i is a positive integer, initially equal to 1, and N denotes the byte length of the character block to be matched and is a positive integer.
Further, the method also includes:
If the first character block to be matched has a corresponding step value in the preset SHIFT table, judging whether that step value is the preset step value;
If the step value of the character block to be matched in the preset SHIFT table is the preset step value, obtaining the Lmin-N bytes that precede byte i;
Prepending the Lmin-N bytes to the first character block to be matched to form a second character block to be matched;
Querying a preset hash table with the first character block to be matched to obtain a first harmful pattern string that contains the first character block to be matched;
Judging whether the second character block to be matched matches the first harmful pattern string;
Adding 1 to i and repeating the above steps until the judgment is complete;
Where N denotes the byte length of the character block to be matched and is a positive integer, and Lmin denotes the minimum length of the harmful pattern strings and is a positive integer.
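The SHIFT table and hash table used in the steps above resemble the tables of the classic Wu-Manber multi-pattern algorithm. The following is a minimal sketch of standard Wu-Manber, offered only as an illustration of that family of techniques: it is not the patent's exact stepping rule (for instance, it indexes blocks by their end position and uses a default shift of Lmin-N+1), and the names `build_tables`, `wu_manber_search`, and the sample text are assumptions.

```python
def build_tables(patterns, block=2):
    """Build Wu-Manber SHIFT and HASH tables over the first lmin chars of each pattern."""
    lmin = min(len(p) for p in patterns)
    if lmin < block:
        raise ValueError("shortest pattern must be at least one block long")
    default = lmin - block + 1            # shift for blocks that occur in no pattern
    shift, hashes = {}, {}
    for p in patterns:
        prefix = p[:lmin]
        for end in range(block, lmin + 1):
            b = prefix[end - block:end]
            # Distance from this block to the end of the prefix.
            shift[b] = min(shift.get(b, default), lmin - end)
        # Patterns are bucketed by the last block of their prefix.
        hashes.setdefault(prefix[lmin - block:], []).append(p)
    return lmin, default, shift, hashes

def wu_manber_search(text, patterns, block=2):
    """Return (start, pattern) for every occurrence of any pattern in text."""
    lmin, default, shift, hashes = build_tables(patterns, block)
    hits = []
    i = lmin                              # exclusive end of the current window
    while i <= len(text):
        b = text[i - block:i]
        s = shift.get(b, default)
        if s == 0:                        # block ends some pattern prefix: verify
            start = i - lmin
            for p in hashes.get(b, []):
                if text.startswith(p, start):
                    hits.append((start, p))
            i += 1
        else:
            i += s                        # safe to skip s characters
    return hits

hits = wu_manber_search("say hello there world", ["there", "hello"])
```

In this sketch a shift value of 0 plays the role that the preset step value plays in the text: it marks blocks for which the preceding bytes must be fetched and compared against candidate harmful pattern strings from the hash table.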
Further, the method also includes:
If a skip distance parameter is set on a character in the first character block to be matched, adding Length-2(Lmin-1)+1 to i, and obtaining the first character block to be matched again for matching.
Further, the method also includes:
If the second character block to be matched matches the first harmful pattern string, storing the index position of the second character block to be matched in the interim stack.
Further, judging whether the interim stack holds a first index position within the reference character string pointed to by the pointer includes:
Obtaining the index positions, stored in the interim stack, of character strings that match harmful pattern strings, each index position including the index address of the character string and the length of the character string;
Judging whether the index position of the character string and the pointer satisfy the following formula:
(target position of the pointer - first distance in the pointer) <= index position of the character string <= (target position of the pointer - first distance in the pointer + length of the reference character string in the pointer);
If the index position of the character string and the pointer satisfy the above formula, determining that the interim stack holds a first index position within the reference character string pointed to by the pointer.
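The range check above can be sketched as a small helper. The inequality is taken directly from the formula; the way the second index position is derived (first index position plus the pointer's distance) is an assumption about what copying the reference character string to the target position implies, and the name `remap_matches` and the sample values are hypothetical.

```python
def remap_matches(target_pos, distance, length, interim_stack):
    """Find stored matches inside the reference string and shift them to the copy.

    interim_stack holds (index_address, match_length) entries for strings
    that already matched a harmful pattern string.
    """
    ref_start = target_pos - distance     # where the reference string begins
    ref_end = ref_start + length          # upper bound used by the formula
    remapped = []
    for index_addr, match_len in interim_stack:
        if ref_start <= index_addr <= ref_end:
            # Assumed mapping: the copy sits `distance` bytes after the original.
            remapped.append((index_addr + distance, match_len))
    return remapped

# Pointer at position 100 referring back 98 bytes to a 20-byte reference string.
second = remap_matches(100, 98, 20, [(3, 5), (40, 5)])
```

Only the entry at address 3 falls inside the reference range [2, 22], so only it is carried forward to the pointer's target position.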
In the present invention, the gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained; a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string; the gzip compressed data stream is searched for the target binary string to obtain the position at which it first occurs; and Huffman decoding is performed on the compressed data stream after that position to obtain the LZ77-compressed first compressed data stream. The multi-pattern matching window is then triggered to slide, which drives the LZ77 decompression sliding window along the first compressed data stream, so that while LZ77 decompression proceeds, the multi-pattern matching window matches harmful pattern strings against the decompressed original plaintext data. For a long pointer in the first compressed data stream, only boundary matching is performed, and the interim stack is checked for an index position within the reference character string pointed to by the pointer; if one exists, the corresponding index position at the pointer's target position is determined and stored in the interim stack. The distance between the LZ77 decompression sliding window and the multi-pattern matching window is kept below a preset distance threshold. LZ77 decompression and skip-based multi-pattern matching therefore proceed side by side, which shortens intrusion and harmful code detection time for the compressed data stream and improves the client user's browsing speed.
Another aspect of the present invention provides a harmful code detection device based on HTTP statically compressed data streams, including:
An acquisition module, configured to obtain the gzip compressed data stream that an Internet server sends to a client, the gzip compressed data stream being the compressed data stream that the Internet server obtains by applying LZ77 compression and Huffman coding to original plaintext data;
A query module, configured to query a preset Huffman tree to obtain the binary Huffman code corresponding to each character in a harmful pattern string, and to generate the target binary string corresponding to the harmful pattern string;
The query module being further configured to search the gzip compressed data stream for the target binary string and to obtain the position at which the target binary string first occurs in the gzip compressed data stream;
A decompression module, configured to perform Huffman decoding on the compressed data stream after the occurrence position in the gzip compressed data stream to obtain a first compressed data stream that is LZ77-compressed, the first compressed data stream including at least one TCP compressed fragment;
A trigger module, configured to trigger the multi-pattern matching window to slide, which drives the LZ77 decompression sliding window along the first compressed data stream;
A matching module, configured to match harmful pattern strings against the original plaintext data obtained by decompression using the multi-pattern matching window while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream. During the LZ77 decompression, for each pointer in the first compressed data stream, a preset interim stack is queried to judge whether it holds a first index position within the reference character string pointed to by the pointer; if so, a second index position, within the reference character string copied to the pointer's target position, is determined from the first index position and stored in the interim stack. It is also judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1); if so, the skip distance parameter Length-2(Lmin-1)+1 is set at character Lmin-1 of the reference character string at the pointer's target position, so that when the multi-pattern matching window matches harmful pattern strings against the decompressed original plaintext data, matching jumps Length-2(Lmin-1)+1 characters from character Lmin-1 of the reference character string and resumes at character Length-(Lmin-1)+1. The distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-pattern matching window is less than Lmin-1. Lmin denotes the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by a pointer during LZ77 decompression. The interim stack stores the index positions of the pattern strings in the at least one TCP compressed fragment that match harmful pattern strings. A pointer comprises a first distance, from the pointer's target position to the position of the reference character string it points to, and the length of the reference character string;
A sending module, configured to send the gzip compressed data stream corresponding to the TCP compressed fragment to the client when, after all of the at least one TCP compressed fragment has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string.
Further, the matching module includes an acquisition sub-module and a judging sub-module;
When the matching module matches harmful pattern strings against the decompressed original plaintext data using the multi-pattern matching window, the acquisition sub-module is configured to obtain a first character block to be matched, consisting of byte i through byte i+N-1 of the original plaintext data;
The judging sub-module is configured to judge whether a skip distance parameter is set on any character in the first character block to be matched;
The judging sub-module is further configured, when no skip distance parameter is set on any character in the first character block to be matched, to judge whether the first character block to be matched has a corresponding step value in a preset SHIFT table; if it has no corresponding step value in the preset SHIFT table, Lmin-1 is added to i and the above steps are repeated until the judgment is complete;
Where i is a positive integer, initially equal to 1, and N denotes the byte length of the character block to be matched and is a positive integer.
Further, the matching module also includes an adding sub-module;
The judging sub-module is further configured, when the first character block to be matched has a corresponding step value in the preset SHIFT table, to judge whether that step value is the preset step value;
The acquisition sub-module is further configured, when the step value of the character block to be matched in the preset SHIFT table is the preset step value, to obtain the Lmin-N bytes that precede byte i;
The adding sub-module is configured to prepend the Lmin-N bytes to the first character block to be matched to form a second character block to be matched;
The acquisition sub-module is further configured to query a preset hash table with the first character block to be matched to obtain a first harmful pattern string that contains the first character block to be matched;
The judging sub-module is further configured to judge whether the second character block to be matched matches the first harmful pattern string;
1 is added to i and the above steps are repeated until the judgment is complete;
Where N denotes the byte length of the character block to be matched and is a positive integer, and Lmin denotes the minimum length of the harmful pattern strings and is a positive integer.
Further, the acquisition sub-module is further configured, when a skip distance parameter is set on a character in the first character block to be matched, to add Length-2(Lmin-1)+1 to i and to obtain the first character block to be matched again for matching.
Brief description of the drawings
Fig. 1 is a flowchart of one embodiment of the harmful code detection method based on HTTP statically compressed data streams provided by the present invention;
Fig. 2 is a schematic diagram of a pointer in the first compressed data stream querying the preset interim stack to obtain the first index position and storing the second index position;
Fig. 3 is a flowchart of another embodiment of the harmful code detection method based on HTTP statically compressed data streams provided by the present invention;
Fig. 4 is a structural diagram of one embodiment of the harmful code detection device based on HTTP statically compressed data streams provided by the present invention;
Fig. 5 is a structural diagram of another embodiment of the harmful code detection device based on HTTP statically compressed data streams provided by the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of one embodiment of the harmful code detection method based on HTTP statically compressed data streams provided by the present invention. As shown in Fig. 1, the method includes the following steps:
101. Obtain the gzip compressed data stream that an Internet server sends to a client, the gzip compressed data stream being the compressed data stream that the Internet server obtains by applying LZ77 compression and Huffman coding to original plaintext data.
The method provided by the present invention is executed by a harmful code detection device based on HTTP statically compressed data streams, which may specifically be a firewall or gateway server located between the Internet server and the client.
LZ77 is an adaptive compression algorithm based on back-reference pointers. Its core is to search the history characters in the LZ77 compression sliding window for repeated bytes; when repeated bytes are found, they are replaced by a short pointer (distance, length). The distance in the pointer is the distance back to the repeated bytes and ranges from 1 to 32 KB; the length is a number between 3 and 258 that gives the length of the repeated bytes. For example, the string "applefapplt" can be compressed to "applef(6,4)t". The objects of LZ77 compression in the original plaintext data may specifically be HTML, JavaScript, CSS files, and the like.
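The pointer mechanism can be illustrated with a minimal decoder for the "applef(6,4)t" example. The token representation (single characters mixed with (distance, length) tuples) and the name `lz77_decode` are assumptions chosen for the sketch; real gzip streams encode these tokens as Huffman-coded bits.

```python
def lz77_decode(tokens):
    """Expand a list of literal characters and (distance, length) pointers."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, length = tok
            start = len(out) - distance   # where the reference string begins
            # Copy one character at a time so overlapping references work.
            for k in range(length):
                out.append(out[start + k])
        else:
            out.append(tok)               # literal character
    return "".join(out)

decoded = lz77_decode(list("applef") + [(6, 4), "t"])
```

The pointer (6, 4) copies the four characters "appl" that start six positions back, reproducing "applefapplt".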
After the Internet server applies LZ77 compression to the original plaintext data, the LZ77-compressed data is further compressed with Huffman coding as follows: the pointers and characters produced by LZ77 compression are compressed again with Huffman coding, yielding the information of two Huffman trees and two classes of Huffman-coded bit streams; finally, the data sequence information of the two Huffman trees is compressed using run-length coding and Huffman coding.
After the Internet server has compressed the original plaintext data into a gzip compressed data stream, it may split the stream into multiple consecutive gzip compressed data packets, assign an HTTP header to each gzip compressed data packet, and transmit them.
102. Query a preset Huffman tree to obtain the binary Huffman code corresponding to each character in a harmful pattern string, and generate the target binary string corresponding to the harmful pattern string.
In the Huffman tree, each character corresponds to a Huffman code value. For example, the Huffman code value of the character "a" may be 0100; that of "r" may be 10111; that of "e" may be 001; that of "o" may be 10110; that of "t" may be 1000; and that of "h" may be 10011. If a harmful pattern string is "there", its corresponding target binary string is "10001001100110111001".
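The construction of the target binary string, and the search for its first occurrence, can be sketched as below. The code table is the example table from the text; representing the compressed stream as a Python string of "0" and "1" characters is a simplification assumed for illustration (real DEFLATE data packs bits into bytes in a specific order), and the names `target_binary_string` and `first_occurrence` are hypothetical.

```python
# Example Huffman code values from the text.
CODES = {"a": "0100", "r": "10111", "e": "001",
         "o": "10110", "t": "1000", "h": "10011"}

def target_binary_string(pattern, codes=CODES):
    """Concatenate the Huffman code of every character in the pattern."""
    return "".join(codes[ch] for ch in pattern)

def first_occurrence(bitstream, pattern, codes=CODES):
    """Bit offset of the first occurrence of the pattern's code, or -1."""
    return bitstream.find(target_binary_string(pattern, codes))

target = target_binary_string("there")
# Toy stream: the codes for "o" and "a" followed by the codes for "there".
toy_stream = target_binary_string("oa") + target
offset = first_occurrence(toy_stream, "there")
```

A plain bit-level search like this can also hit offsets that do not fall on a code boundary, so a match only marks a candidate position from which decoding starts.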
103. Search the gzip compressed data stream for the target binary string, and obtain the position at which the target binary string first occurs in the gzip compressed data stream.
104. Perform Huffman decoding on the compressed data stream after the occurrence position in the gzip compressed data stream to obtain a first compressed data stream that is LZ77-compressed, the first compressed data stream including at least one TCP compressed fragment.
The gzip compressed data stream that the firewall or gateway server obtains may specifically be multiple consecutive gzip compressed data packets carrying HTTP headers, each of which includes one TCP compressed fragment. After obtaining the gzip compressed data stream, the firewall or gateway server may remove the HTTP header of each compressed data packet and assemble the packets into a compressed data stream according to their sequence numbers. Compressed data packets generally range from 64 bytes to 1518 bytes in size. The Huffman tree information is stored in one of the gzip compressed data packets. In general, different gzip compressed data streams have different Huffman tree information.
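The reassembly step can be sketched as follows. The packet representation, a (sequence_number, raw_bytes) tuple, and the assumption that each header ends at the first blank line (`\r\n\r\n`) are illustrative choices that the text does not specify; the name `reassemble` is hypothetical.

```python
def reassemble(packets):
    """Strip the header from each packet and join payloads in sequence order.

    packets: iterable of (sequence_number, raw_bytes) tuples (assumed format).
    """
    body = bytearray()
    for _, raw in sorted(packets, key=lambda p: p[0]):
        # Assumption: the header is everything up to the first blank line.
        _, sep, payload = raw.partition(b"\r\n\r\n")
        body += payload if sep else raw   # no header found: keep the bytes as-is
    return bytes(body)

stream = reassemble([
    (2, b"H: 2\r\n\r\nworld"),
    (1, b"H: 1\r\n\r\nhello "),
])
```

Packets arriving out of order are sorted by sequence number before their payloads are concatenated.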
105. Trigger the multi-pattern matching window to slide, which drives the LZ77 decompression sliding window along the first compressed data stream, so that while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream, the multi-pattern matching window matches harmful pattern strings against the original plaintext data obtained by decompression. During the LZ77 decompression, for each pointer in the first compressed data stream, a preset interim stack is queried to judge whether it holds a first index position within the reference character string pointed to by the pointer; if so, a second index position, within the reference character string copied to the pointer's target position, is determined from the first index position and stored in the interim stack. It is also judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1); if so, the skip distance parameter Length-2(Lmin-1)+1 is set at character Lmin-1 of the reference character string at the pointer's target position, so that when the multi-pattern matching window matches harmful pattern strings against the decompressed original plaintext data, matching jumps Length-2(Lmin-1)+1 characters from character Lmin-1 of the reference character string and resumes at character Length-(Lmin-1)+1. The distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-pattern matching window is less than Lmin-1. Lmin denotes the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by a pointer during LZ77 decompression. The interim stack stores the index positions of the pattern strings in the at least one TCP compressed fragment that match harmful pattern strings. A pointer comprises a first distance, from the pointer's target position to the position of the reference character string it points to, and the length of the reference character string.
Multiple harmful pattern strings are stored in advance on the firewall or gateway server. The LZ77 decompression sliding window may be 32 KB long, and the multi-mode matching window may also be 32 KB long. Fig. 2 illustrates how, during LZ77 decompression of the first compressed data stream, a pointer in that stream is used to query the preset interim stack, obtain the first index position, and store the second index position. In Fig. 2, the first index position in the reference character string pointed to by the pointer is marked by the black vertical line in the reference character string. The second index position, in the reference character string substituted at the pointer's target position, is marked by the black vertical line in the first repeated character string, that is, the position indicated by the dashed arrow from the interim stack. The middle region of the first repeated character string is the region to be jumped over; the regions at its two ends are the regions that still need to be matched.
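To make the pointer terminology concrete, here is a minimal sketch of LZ77 pointer resolution in Python. The token representation (single-character literals and `(distance, length)` tuples) and the name `lz77_decode` are assumptions for illustration; this is not the gzip bit format.

```python
def lz77_decode(tokens):
    """Expand a list of literals and (distance, length) pointers into plaintext."""
    out = []
    for tok in tokens:
        if isinstance(tok, str):
            out.append(tok)                  # literal character
        else:
            distance, length = tok
            start = len(out) - distance      # position of the reference string
            for k in range(length):
                # Copy one character at a time so that a pointer may overlap
                # the text it is producing (length > distance).
                out.append(out[start + k])
    return "".join(out)


print(lz77_decode(list("abc") + [(3, 6)]))   # abcabcabc
```

The position at which a pointer's copy begins is what the description calls the pointer's target position; the region it copies from is the reference character string.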
Before decompression, the interim stack may contain no index position information. Specifically, the process in step 103 of judging whether the interim stack holds a first index position located in the reference character string pointed to by the pointer may include: obtaining the index positions, stored in the interim stack, of character strings matching harmful pattern strings, where each index position includes the index address of the character string and the length of the character string; and judging whether the index position of the character string and the pointer satisfy the following formula:
target position of the pointer - first distance in the pointer <= index position of the character string <= target position of the pointer - first distance in the pointer + length of the reference character string in the pointer;
If the index position of the character string and the pointer satisfy the above formula, it is determined that the interim stack holds a first index position located in the reference character string pointed to by the pointer.
For example, suppose the interim stack stores the index positions (10,3), (20,4), (25,1) and (40,2), and the pointers (40,20) and (50,10) occur at current positions 70 and 100 respectively. The index position (40,2) in the interim stack lies inside the position interval (30,50) of the reference character string pointed to by pointer (40,20), since 70-40 <= 40 <= 70-40+20. The pointer (40,20) at position 70 therefore contains a harmful pattern string at index position (80,2), and the index position (80,2) is inserted into the interim stack. Similarly, the position interval of the reference character string pointed to by pointer (50,10) at position 100 contains no index position of a harmful pattern string.
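The lookup in this example can be sketched in a few lines of Python. The function name `propagate` and the use of a plain list of `(index, length)` tuples for the interim stack are assumptions for illustration; the interval test is the formula given above.

```python
def propagate(stack, target, distance, length):
    """Map matches recorded inside a pointer's reference string onto its copy.

    stack    : list of (index, match_length) entries already found
    target   : position where the pointer's copy begins
    distance : first distance stored in the pointer
    length   : length of the reference string stored in the pointer
    """
    ref_start = target - distance
    ref_end = ref_start + length
    new_entries = [
        (target + (index - ref_start), match_len)
        for index, match_len in stack
        if ref_start <= index <= ref_end     # the formula from the description
    ]
    stack.extend(new_entries)
    return new_entries


stack = [(10, 3), (20, 4), (25, 1), (40, 2)]
print(propagate(stack, 70, 40, 20))    # [(80, 2)]
print(propagate(stack, 100, 50, 10))   # []
```

The second index position is obtained by keeping the same offset inside the copy as the first index position has inside the reference string: 40 is 10 characters into the interval starting at 30, so the copy starting at 70 holds the match at 80.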
106. If, after all of the at least one TCP compression burst has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string, the gzip compressed data stream corresponding to the TCP compression burst is sent to the client.
If the corresponding original plaintext data does contain a character string matching a harmful pattern string, the firewall or gateway server discards the gzip compressed data stream corresponding to the TCP compression burst.
In step 104, the original plaintext data corresponding to the at least one TCP compression burst is the original plaintext data corresponding to the first compressed data stream that includes the at least one TCP compression burst.
In this embodiment, a gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained; a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string; the gzip compressed data stream is searched with the target binary string to find the position of its first occurrence; and the compressed data after that position is Huffman-decoded to obtain the first compressed data stream compressed by LZ77. Sliding of the multi-mode matching window is then triggered to advance the LZ77 decompression sliding window over the first compressed data stream, so that the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data while LZ77 decompression is still in progress. For a long pointer in the first compressed data stream, only the boundaries are matched, and the interim stack is checked for an index position located in the reference character string pointed to by the pointer; if one exists, the corresponding index position at the pointer's target position is determined and stored in the interim stack. The distance between the LZ77 decompression sliding window and the multi-mode matching window is kept below a preset distance threshold. LZ77 decompression and jump-based multi-pattern matching thus proceed side by side, which shortens intrusion or harmful code detection time for compressed data streams and speeds up the client user's browsing experience.
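The Huffman lookup summarized above can be sketched as follows. This is a minimal illustration, not the patented implementation: the code table `codes` is a hypothetical prefix code, the bit stream is held as a Python string of '0' and '1' characters for readability, and the function names are assumptions.

```python
def target_bits(pattern, codes):
    """Concatenate the Huffman code of each character of the pattern."""
    return "".join(codes[ch] for ch in pattern)


def first_occurrence(stream_bits, pattern, codes):
    """Bit offset of the first occurrence of the pattern's code string, or -1."""
    return stream_bits.find(target_bits(pattern, codes))


codes = {"a": "0", "b": "10", "c": "110", "d": "111"}   # hypothetical prefix code
stream = target_bits("cabad", codes)                     # stands in for the compressed stream

print(target_bits("bad", codes))                # 100111
print(first_occurrence(stream, "bad", codes))   # 4
```

A plain bit-level search like this can also report offsets that do not fall on a code boundary, so a real implementation would need to confirm alignment before decoding from the reported position.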
Fig. 3 is a flow chart of another embodiment of the harmful code detection method based on HTTP static compressed data streams provided by the present invention. On the basis of the embodiment shown in Fig. 1, the process in step 105 of using the multi-mode matching window to match harmful pattern strings against the original plaintext data obtained after decompression may specifically include the following steps:
1051. Obtain the first character block to be matched, consisting of the i-th byte through the (i+N-1)-th byte of the original plaintext data.
Here i is a positive integer, initially equal to 1; N denotes the byte length of the character block to be matched and is a positive integer.
1052. Judge whether a skip distance parameter is set on any character in the first character block to be matched. If no skip distance parameter is set on any character in the block, perform step 1053; if a skip distance parameter is set on a character in the block, perform step 1055.
1053. Judge whether the first character block to be matched has a corresponding step value in the preset SHIFT table. If it does not, perform step 1054; if it does, perform step 1056.
Before judging whether the first character block to be matched has a corresponding step value in the preset SHIFT table, the firewall or gateway server may create the SHIFT table and the hash table from the harmful pattern strings.
The SHIFT table is created as follows: obtain the minimum length Lmin of the harmful pattern strings; starting from the beginning of each harmful pattern string, truncate it to its first Lmin characters; then, for each character block in those first Lmin characters, generate the SHIFT table from the step of that character block within the first Lmin characters. The hash table is then generated according to which harmful pattern strings include each character block; it stores each character block together with the corresponding harmful pattern strings that include it.
For example, suppose the harmful pattern strings are: Rainbow, shine, river, version, brush. The truncated prefixes, the character blocks, and the resulting SHIFT table are shown in Table 1.
Table 1
A step value of 0 indicates that the current character block coincides with the ending characters of a truncated harmful pattern string.
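The construction of the SHIFT table and hash table for these five pattern strings can be sketched as below. The contents of Table 1 are not reproduced in this text, so the sketch rests on assumptions: a block length of N = 2, the conventional Wu-Manber definition of the step (distance from the block's end to the end of the truncated prefix), a default step of Lmin-N+1 for blocks absent from the table, and a hash table keyed by the ending block of each truncated prefix. The function name `build_tables` is hypothetical.

```python
def build_tables(patterns, n=2):
    """Build a Wu-Manber style SHIFT table and hash table (illustrative only)."""
    lmin = min(len(p) for p in patterns)
    default = lmin - n + 1                    # step for blocks not in the table
    shift, table = {}, {}
    for p in patterns:
        prefix = p[:lmin]                     # truncate to the first Lmin characters
        for end in range(n, lmin + 1):
            block = prefix[end - n:end]
            # Keep the smallest distance from this block to the prefix end.
            shift[block] = min(shift.get(block, default), lmin - end)
        # Blocks with step 0 are prefix endings; index the patterns by them.
        table.setdefault(prefix[-n:], []).append(p)
    return lmin, default, shift, table


lmin, default, shift, table = build_tables(
    ["Rainbow", "shine", "river", "version", "brush"])
print(lmin, default)               # 5 4
print(shift["ve"], shift["er"])    # 1 0
print(table["sh"])                 # ['brush']
```

Under these assumptions the default step equals Lmin-1, which agrees with the increment used in step 1054 when a block has no entry in the SHIFT table.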
1054. Add Lmin-1 to i and repeat from step 1051 until the judging is complete.
1055. Add Length-2(Lmin-1)+1 to i and repeat from step 1051 until the judging is complete.
1056. Judge whether the step value of the character block to be matched in the preset SHIFT table is the preset step value. If it is, perform step 1057; if the character block has a corresponding step value in the preset SHIFT table but that value is not the preset step value, perform step 1051.
The preset step value here may be 0.
1057. Obtain the Lmin-N bytes preceding the i-th byte.
1058. Prepend the Lmin-N bytes to the first character block to be matched to form the second character block to be matched.
1059. Query the preset hash table with the first character block to be matched to obtain the first harmful pattern string, which includes the first character block to be matched.
1060. Judge whether the second character block to be matched matches the first harmful pattern string, then perform step 1061; if the second character block to be matched matches the first harmful pattern string, also perform step 1062.
Here N denotes the byte length of the character block to be matched and is a positive integer; Lmin denotes the minimum length of the harmful pattern strings and is a positive integer.
1061. Add 1 to i and repeat from step 1051 until the judging is complete.
1062. Store the index position of the second character block to be matched in the interim stack.
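Steps 1051 to 1062 can be sketched as a single scanning loop. This is a minimal Python sketch under stated assumptions, not the patented implementation: indices are 0-based, the block length is N = 2, the tables follow the conventional Wu-Manber construction, a non-zero step advances i by that step (the description does not state the increment for this case), and a candidate is verified against the full pattern string. The names `scan`, `_tables`, and the `jumps` mapping from a position to its skip distance are hypothetical.

```python
def _tables(patterns, n):
    """Wu-Manber style SHIFT table and hash table (assumed construction)."""
    lmin = min(len(p) for p in patterns)
    default = lmin - n + 1
    shift, table = {}, {}
    for p in patterns:
        prefix = p[:lmin]
        for end in range(n, lmin + 1):
            block = prefix[end - n:end]
            shift[block] = min(shift.get(block, default), lmin - end)
        table.setdefault(prefix[-n:], []).append(p)
    return lmin, default, shift, table


def scan(text, patterns, n=2, jumps=None):
    """Return (index, length) of each harmful pattern string found in text."""
    lmin, default, shift, table = _tables(patterns, n)
    jumps = jumps or {}
    hits = []
    i = lmin - n                              # block ending the first Lmin-window
    while i + n <= len(text):
        # Step 1052/1055: a skip distance set on a character of the block.
        jump = next((jumps[k] for k in range(i, i + n) if k in jumps), None)
        if jump is not None:
            i += jump
            continue
        block = text[i:i + n]                 # step 1051
        step = shift.get(block, default)      # steps 1053/1054
        if step > 0:                          # step 1056, non-zero step
            i += step
            continue
        start = i + n - lmin                  # steps 1057/1058: extend to Lmin bytes
        for p in table.get(block, []):        # step 1059
            if text.startswith(p, start):     # step 1060
                hits.append((start, len(p)))  # step 1062
        i += 1                                # step 1061
    return hits


patterns = ["Rainbow", "shine", "river", "version", "brush"]
print(scan("Rainbow over the river, shine on", patterns))
# [(0, 7), (17, 5), (24, 5)]
```

Passing `jumps={3: 8}` for the text "xxxxriverxxxx" makes the loop skip over "river" entirely, which mirrors how the interior of a long pointer is not rescanned and relies on the interim stack to carry its matches forward.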
In this embodiment, a gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained; a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string; the gzip compressed data stream is searched with the target binary string to find the position of its first occurrence; and the compressed data after that position is Huffman-decoded to obtain the first compressed data stream compressed by LZ77. Sliding of the multi-mode matching window is then triggered to advance the LZ77 decompression sliding window over the first compressed data stream, so that the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data while LZ77 decompression is still in progress. For a long pointer in the first compressed data stream, only the boundaries are matched, and the interim stack is checked for an index position located in the reference character string pointed to by the pointer; if one exists, the corresponding index position at the pointer's target position is determined and stored in the interim stack. When the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data, the first character block to be matched, consisting of the i-th byte through the (i+N-1)-th byte of the original plaintext data, is obtained. It is first judged whether a skip distance parameter is set on any character in the first character block to be matched, and the result decides whether to jump. It is then judged whether the first character block to be matched has a corresponding step value in the preset SHIFT table, and the result decides whether a harmful pattern string exists that matches the second character block corresponding to the first character block to be matched. The distance between the LZ77 decompression sliding window and the multi-mode matching window is kept below a preset distance threshold. LZ77 decompression and jump-based multi-pattern matching thus proceed side by side, which shortens intrusion or harmful code detection time for compressed data streams and speeds up the client user's browsing experience.
One of ordinary skill in the art will appreciate that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium; when executed, it performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Fig. 4 is a structural schematic diagram of one embodiment of the harmful code detection device based on HTTP static compressed data streams provided by the present invention. As shown in Fig. 4, the device includes:
Acquisition module 41, for obtaining the gzip compressed data stream that an Internet server sends to a client for receipt, the gzip compressed data stream being the compressed data stream obtained after the Internet server applies LZ77 compression and Huffman coding to original plaintext data;
Enquiry module 42, for querying the preset Huffman tree to obtain the binary Huffman code of each character in a harmful pattern string and generate the target binary string corresponding to the harmful pattern string;
Enquiry module 42 is further used for searching the gzip compressed data stream with the target binary string to obtain the position of the first occurrence of the target binary string in the gzip compressed data stream;
Decompression module 43, for Huffman-decoding the compressed data after that position in the gzip compressed data stream to obtain the first compressed data stream compressed by LZ77, the first compressed data stream including at least one TCP compression burst;
Trigger module 44, for triggering sliding of the multi-mode matching window so as to advance the LZ77 decompression sliding window over the first compressed data stream;
Matching module 45, for using the multi-mode matching window to match harmful pattern strings against the original plaintext data obtained after decompression while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream. During LZ77 decompression of the first compressed data stream, the preset interim stack is queried for each pointer in the first compressed data stream to judge whether the interim stack holds a first index position located in the reference character string pointed to by the pointer. If it does, the second index position, located in the reference character string substituted at the pointer's target position, is determined from the first index position and stored in the interim stack. It is also judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1). If it is, a skip distance parameter of Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference character string at the pointer's target position, so that when the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data, it jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference character string and resumes pattern string matching at the (Length-(Lmin-1)+1)-th character. The distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-mode matching window is less than Lmin-1. Lmin denotes the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by the pointer during LZ77 decompression. The interim stack stores the index positions of the character strings in the at least one TCP compression burst that match harmful pattern strings. A pointer includes the first distance, from the pointer's target position to the position of the reference character string it points to, and the length of the reference character string;
Sending module 46, for sending the gzip compressed data stream corresponding to the TCP compression burst to the client when, after all of the at least one TCP compression burst has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string.
The harmful code detection device based on HTTP static compressed data streams provided by the present invention may specifically be a firewall or gateway server located between the Internet server and the client.
Specifically, the process by which matching module 45 judges whether the interim stack holds a first index position located in the reference character string pointed to by the pointer may include: obtaining the index positions, stored in the interim stack, of character strings matching harmful pattern strings, where each index position includes the index address of the character string and the length of the character string; and judging whether the index position of the character string and the pointer satisfy the following formula:
target position of the pointer - first distance in the pointer <= index position of the character string <= target position of the pointer - first distance in the pointer + length of the reference character string in the pointer;
If the index position of the character string and the pointer satisfy the above formula, it is determined that the interim stack holds a first index position located in the reference character string pointed to by the pointer.
In this embodiment, a gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained; a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string; the gzip compressed data stream is searched with the target binary string to find the position of its first occurrence; and the compressed data after that position is Huffman-decoded to obtain the first compressed data stream compressed by LZ77. Sliding of the multi-mode matching window is then triggered to advance the LZ77 decompression sliding window over the first compressed data stream, so that the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data while LZ77 decompression is still in progress. For a long pointer in the first compressed data stream, only the boundaries are matched, and the interim stack is checked for an index position located in the reference character string pointed to by the pointer; if one exists, the corresponding index position at the pointer's target position is determined and stored in the interim stack. The distance between the LZ77 decompression sliding window and the multi-mode matching window is kept below a preset distance threshold. LZ77 decompression and jump-based multi-pattern matching thus proceed side by side, which shortens intrusion or harmful code detection time for compressed data streams and speeds up the client user's browsing experience.
Further, referring to Fig. 5, on the basis of the embodiment shown in Fig. 4, matching module 45 includes: acquisition submodule 451, judging submodule 452 and addition submodule 453;
When matching module 45 uses the multi-mode matching window to match harmful pattern strings against the original plaintext data obtained after decompression, acquisition submodule 451 is used for obtaining the first character block to be matched, consisting of the i-th byte through the (i+N-1)-th byte of the original plaintext data;
Judging submodule 452, for judging whether a skip distance parameter is set on any character in the first character block to be matched;
Judging submodule 452 is further used for, when no skip distance parameter is set on any character in the first character block to be matched, judging whether the first character block to be matched has a corresponding step value in the preset SHIFT table; if it does not, Lmin-1 is added to i and the above steps are repeated until the judging is complete;
Here i is a positive integer, initially equal to 1; N denotes the byte length of the character block to be matched and is a positive integer;
Judging submodule 452 is further used for, when the first character block to be matched has a corresponding step value in the preset SHIFT table, judging whether that step value is the preset step value;
Acquisition submodule 451 is further used for obtaining the Lmin-N bytes preceding the i-th byte when the step value of the character block to be matched in the preset SHIFT table is the preset step value;
Addition submodule 453, for prepending the Lmin-N bytes to the first character block to be matched to form the second character block to be matched;
Acquisition submodule 451 is further used for querying the preset hash table with the first character block to be matched to obtain the first harmful pattern string, which includes the first character block to be matched;
Judging submodule 452 is further used for judging whether the second character block to be matched matches the first harmful pattern string;
1 is then added to i and the above steps are repeated until the judging is complete;
Here N denotes the byte length of the character block to be matched and is a positive integer; Lmin denotes the minimum length of the harmful pattern strings and is a positive integer;
Before judging submodule 452 judges whether the first character block to be matched has a corresponding step value in the preset SHIFT table, the firewall or gateway server may create the SHIFT table and the hash table from the harmful pattern strings.
The SHIFT table is created as follows: obtain the minimum length Lmin of the harmful pattern strings; starting from the beginning of each harmful pattern string, truncate it to its first Lmin characters; then, for each character block in those first Lmin characters, generate the SHIFT table from the step of that character block within the first Lmin characters. The hash table is then generated according to which harmful pattern strings include each character block; it stores each character block together with the corresponding harmful pattern strings that include it.
Further, acquisition submodule 451 is also used for, when a skip distance parameter is set on a character in the first character block to be matched, adding Length-2(Lmin-1)+1 to i and obtaining the first character block to be matched again for matching.
Further, the device may also include a memory module, used for storing the index position of the second character block to be matched in the interim stack when the second character block to be matched matches the first harmful pattern string.
Further, when judging whether the interim stack holds a first index position located in the reference character string pointed to by the pointer, judging submodule 452 is specifically used for obtaining the index positions, stored in the interim stack, of character strings matching harmful pattern strings, each index position including the index address of the character string and the length of the character string;
and for judging whether the index position of the character string and the pointer satisfy the following formula:
target position of the pointer - first distance in the pointer <= index position of the character string <= target position of the pointer - first distance in the pointer + length of the reference character string in the pointer;
If the index position of the character string and the pointer satisfy the above formula, it is determined that the interim stack holds a first index position located in the reference character string pointed to by the pointer.
In this embodiment, a gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained; a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string; the gzip compressed data stream is searched with the target binary string to find the position of its first occurrence; and the compressed data after that position is Huffman-decoded to obtain the first compressed data stream compressed by LZ77. Sliding of the multi-mode matching window is then triggered to advance the LZ77 decompression sliding window over the first compressed data stream, so that the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data while LZ77 decompression is still in progress. For a long pointer in the first compressed data stream, only the boundaries are matched, and the interim stack is checked for an index position located in the reference character string pointed to by the pointer; if one exists, the corresponding index position at the pointer's target position is determined and stored in the interim stack. When the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data, the first character block to be matched, consisting of the i-th byte through the (i+N-1)-th byte of the original plaintext data, is obtained. It is first judged whether a skip distance parameter is set on any character in the first character block to be matched, and the result decides whether to jump. It is then judged whether the first character block to be matched has a corresponding step value in the preset SHIFT table, and the result decides whether a harmful pattern string exists that matches the second character block corresponding to the first character block to be matched. The distance between the LZ77 decompression sliding window and the multi-mode matching window is kept below a preset distance threshold. LZ77 decompression and jump-based multi-pattern matching thus proceed side by side, which shortens intrusion or harmful code detection time for compressed data streams and speeds up the client user's browsing experience.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A harmful code detection method based on HTTP static compressed data streams, characterized by comprising:
obtaining the gzip compressed data stream that an Internet server sends to a client for receipt, the gzip compressed data stream being the compressed data stream obtained after the Internet server applies LZ77 compression and Huffman coding to original plaintext data;
querying a preset Huffman tree to obtain the binary Huffman code of each character in a harmful pattern string, and generating the target binary string corresponding to the harmful pattern string;
searching the gzip compressed data stream with the target binary string to obtain the position of the first occurrence of the target binary string in the gzip compressed data stream;
Huffman-decoding the compressed data after said position in the gzip compressed data stream to obtain a first compressed data stream compressed by LZ77, the first compressed data stream including at least one TCP compression burst;
triggering sliding of the multi-mode matching window so as to advance the LZ77 decompression sliding window over the first compressed data stream, so that while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream, the multi-mode matching window matches harmful pattern strings against the original plaintext data obtained after decompression; wherein, during LZ77 decompression of the first compressed data stream, the preset interim stack is queried for each pointer in the first compressed data stream to judge whether the interim stack holds a first index position located in the reference character string pointed to by the pointer; if it does, the second index position, located in the reference character string substituted at the pointer's target position, is determined from the first index position and stored in the interim stack; it is also judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1), and if it is, a skip distance parameter of Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference character string at the pointer's target position, so that when the multi-mode matching window matches harmful pattern strings against the decompressed original plaintext data, it jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference character string and resumes pattern string matching at the (Length-(Lmin-1)+1)-th character; the distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-mode matching window is less than Lmin-1; Lmin denotes the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by the pointer during LZ77 decompression; the pointer includes the first distance, from the pointer's target position to the position of the reference character string it points to, and the length of the reference character string;
if, after all of the at least one TCP compression burst has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string, sending the gzip compressed data stream corresponding to the TCP compression burst to the client.
2. The method according to claim 1, characterized in that matching the original plaintext data obtained after decompression against the harmful pattern strings using the multi-mode matching window comprises:
Obtaining a first character block to be matched, consisting of the i-th byte through the (i+N-1)-th byte of the original plaintext data;
Judging whether a skip distance parameter is set on the characters in the first character block to be matched;
If no skip distance parameter is set on the characters in the first character block to be matched, judging whether the first character block to be matched has a corresponding step value in a preset SHIFT table; if the first character block to be matched has no corresponding step value in the preset SHIFT table, adding Lmin-1 to i and repeating the above steps until the judgment is complete;
Wherein i is a positive integer whose initial value is 1; N denotes the byte length of the character block to be matched and is a positive integer.
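The SHIFT-table scan in claim 2 resembles the shift phase of Wu-Manber-style multi-pattern matching. The following is a sketch under that assumption, not the patented procedure itself: the helper names `build_shift_table` and `candidate_positions`, the 0-based indexing, and the way the table is built are illustrative. Blocks absent from the table use the default step Lmin-N+1, which for N = 2 equals the Lmin-1 named in the claim.

```python
def build_shift_table(patterns, n):
    """Map each n-byte block in the first Lmin bytes of any pattern to a step value."""
    lmin = min(len(p) for p in patterns)
    shift = {}
    for p in patterns:
        for j in range(lmin - n + 1):
            block = p[j:j + n]
            dist = lmin - n - j          # distance from block end to prefix end
            shift[block] = min(shift.get(block, dist), dist)
    return shift, lmin


def candidate_positions(text, patterns, n=2):
    """Return 0-based block positions whose step value is 0 (candidates to verify)."""
    shift, lmin = build_shift_table(patterns, n)
    default = lmin - n + 1               # step for blocks not in the SHIFT table
    i = lmin - n                         # block sits at the end of the first window
    hits = []
    while i + n <= len(text):
        step = shift.get(text[i:i + n], default)
        if step == 0:
            hits.append(i)               # needs hash-table verification
            i += 1
        else:
            i += step                    # safe to slide the matching window
    return hits


hits = candidate_positions("xxevilxx", ["evil", "hack"])
```

A step value of 0 plays the role of the "preset step value" in claim 3: the block ends a pattern prefix, so the window must be verified rather than skipped.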
3. The method according to claim 2, characterized by further comprising:
If the first character block to be matched has a corresponding step value in the preset SHIFT table, judging whether that step value is a preset step value;
If the step value corresponding to the character block to be matched in the preset SHIFT table is the preset step value, obtaining the Lmin-N bytes preceding the i-th byte;
Prepending the Lmin-N bytes to the first character block to be matched to form a second character block to be matched;
Querying a preset hash table according to the first character block to be matched to obtain a first harmful pattern string that includes the first character block to be matched;
Judging whether the second character block to be matched matches the first harmful pattern string;
Adding 1 to i and repeating the above steps until the judgment is complete;
Wherein N denotes the byte length of the character block to be matched and is a positive integer; Lmin denotes the minimum length of the harmful pattern strings and is a positive integer.
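The verification step of claim 3 can be sketched as follows. The names `build_hash_table` and `verify_candidate` are hypothetical, positions are 0-based, and the hash table is assumed to be keyed by the last N bytes of each pattern's Lmin-byte prefix. The sketch also compares the full pattern against the text, which goes slightly beyond the claim's comparison of the second character block and is needed only when a pattern is longer than Lmin.

```python
from collections import defaultdict


def build_hash_table(patterns, n):
    """Index patterns by the n-byte block that ends their Lmin-byte prefix."""
    lmin = min(len(p) for p in patterns)
    table = defaultdict(list)
    for p in patterns:
        table[p[lmin - n:lmin]].append(p)
    return table, lmin


def verify_candidate(text, i, table, lmin, n):
    """Verify the candidate block text[i:i+n]; return (start, pattern) matches."""
    start = i - (lmin - n)               # Lmin-N bytes before the block
    if start < 0:
        return []
    block = text[i:i + n]                # first character block to be matched
    window = text[start:i + n]           # second character block, Lmin bytes long
    found = []
    for p in table.get(block, []):
        # Compare the Lmin-byte window first, then the rest of the pattern.
        if p[:lmin] == window and text.startswith(p, start):
            found.append((start, p))
    return found


table, lmin = build_hash_table(["evil", "hacker"], 2)
matches = verify_candidate("xxevilxx", 4, table, lmin, 2)
```

Each verified `(start, pattern)` pair is what claim 5 stores in the interim stack as an index position.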
4. The method according to claim 2, characterized by further comprising:
If a skip distance parameter is set on a character in the first character block to be matched, adding Length-2(Lmin-1)+1 to i, and re-obtaining the first character block to be matched for matching.
5. The method according to claim 3, characterized by further comprising:
If the second character block to be matched matches the first harmful pattern string, storing the index position of the second character block to be matched in the interim stack.
6. The method according to claim 1, characterized in that judging whether the interim stack holds a first index position located within the reference character string pointed to by the pointer comprises:
Obtaining the index position, stored in the interim stack, of a character string matching a harmful pattern string, the index position comprising the index address of the character string and the length of the character string;
Judging whether the index position of the character string and the pointer satisfy the following formula:
(target position where the pointer is located) - (first distance in the pointer) ≤ (index position of the character string) ≤ (target position where the pointer is located) - (first distance in the pointer) + (length of the reference character string in the pointer);
If the index position of the character string and the pointer satisfy the above formula, determining that the interim stack holds a first index position located within the reference character string pointed to by the pointer.
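The range test of claim 6, together with the relocation of index positions described in claim 1, can be sketched as below. The interim stack is assumed to be a list of `(index_address, length)` tuples, and the function names are illustrative; the inclusive upper bound follows the formula as written in the claim.

```python
def indexes_inside_reference(stack, target, distance, length):
    """Return stack entries whose index address lies in the referenced range.

    stack:    list of (index_address, match_length) tuples (assumed layout).
    target:   target position where the pointer is located.
    distance: first distance carried by the pointer.
    length:   length of the reference character string carried by the pointer.
    """
    low = target - distance
    high = low + length                  # inclusive, as in the claimed formula
    return [(pos, n) for pos, n in stack if low <= pos <= high]


def relocate(entries, target, distance):
    """Map first index positions to second index positions at the target."""
    low = target - distance
    return [(target + (pos - low), n) for pos, n in entries]


stack = [(3, 4), (20, 4)]
inside = indexes_inside_reference(stack, target=30, distance=28, length=10)
stack.extend(relocate(inside, target=30, distance=28))
```

Relocating known matches this way is what lets the matcher skip the interior of a copied reference string without rescanning it.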
7. A harmful code detection device based on HTTP static compressed data streams, characterized by comprising:
An acquisition module, configured to obtain a gzip compressed data stream to be detected that an Internet server sends to a client, the gzip compressed data stream being the compressed data stream obtained after the Internet server applies LZ77 compression and Huffman coding to original plaintext data;
An enquiry module, configured to query a preset Huffman tree to obtain the binary Huffman code corresponding to each character of a harmful pattern string, and to generate the object binary string corresponding to the harmful pattern string;
The enquiry module being further configured to search the gzip compressed data stream according to the object binary string, to obtain the position at which the object binary string first occurs in the gzip compressed data stream;
A decompression module, configured to perform Huffman decoding on the compressed data stream following the occurrence position in the gzip compressed data stream, to obtain a first compressed data stream compressed by LZ77, the first compressed data stream comprising at least one TCP compression burst;
A trigger module, configured to trigger the multi-mode matching window to slide, so as to drive the LZ77 decompression sliding window to slide over the first compressed data stream;
A matching module, configured to match, using the multi-mode matching window, the original plaintext data obtained after decompression against the harmful pattern strings while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream; wherein, while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream, a preset interim stack is queried according to a pointer in the first compressed data stream, and it is judged whether the interim stack holds a first index position located within the reference character string pointed to by the pointer; if so, a second index position, within the replacement reference character string obtained at the target position where the pointer is located, is determined from the first index position within the reference character string pointed to by the pointer, and the second index position is stored in the interim stack; and it is judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1); when Length is greater than 2(Lmin-1), a skip distance parameter Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference character string at the target position where the pointer is located, so that, when the multi-mode matching window matches the original plaintext data obtained after decompression against the harmful pattern strings, matching jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference character string and resumes pattern matching at the (Length-(Lmin-1)+1)-th character; the distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-mode matching window is less than Lmin-1; Lmin denotes the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by the pointer during LZ77 decompression; the interim stack stores the index positions of the character strings, in the at least one TCP compression burst, that match a harmful pattern string; the pointer comprises a first distance, from the target position where the pointer is located to the position of the reference character string pointed to by the pointer, and the length of the reference character string;
A sending module, configured to send the gzip compressed data stream corresponding to the TCP compression burst to the client when, after the at least one TCP compression burst has been fully decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string.
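The enquiry module's search for the object binary string can be illustrated with a toy sketch. It assumes a small, hypothetical prefix-code table (`codes`) in place of the preset Huffman tree, and represents the compressed stream as a string of '0' and '1' characters; real gzip/DEFLATE streams pack bits into bytes and mix literal codes with LZ77 length/distance codes, which this sketch does not model.

```python
def pattern_to_bits(pattern, codes):
    """Concatenate the binary Huffman code of each character of the pattern."""
    return "".join(codes[ch] for ch in pattern)


def first_occurrence(bitstream, pattern, codes):
    """Return the bit offset where the object binary string first occurs, or -1."""
    return bitstream.find(pattern_to_bits(pattern, codes))


# Hypothetical prefix-free code table standing in for the preset Huffman tree.
codes = {"a": "0", "b": "10", "c": "11"}
bitstream = pattern_to_bits("abcab", codes)      # toy "compressed" stream
offset = first_occurrence(bitstream, "cab", codes)
```

In the claimed device, Huffman decoding then starts from this occurrence position, so the portion of the stream before the first possible match need not be decoded.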
8. The device according to claim 7, characterized in that the matching module comprises an acquisition submodule and a judging submodule;
When the matching module matches the original plaintext data obtained after decompression against the harmful pattern strings using the multi-mode matching window, the acquisition submodule is configured to obtain a first character block to be matched, consisting of the i-th byte through the (i+N-1)-th byte of the original plaintext data;
The judging submodule is configured to judge whether a skip distance parameter is set on the characters in the first character block to be matched;
The judging submodule is further configured to, when no skip distance parameter is set on the characters in the first character block to be matched, judge whether the first character block to be matched has a corresponding step value in a preset SHIFT table, and, if the first character block to be matched has no corresponding step value in the preset SHIFT table, add Lmin-1 to i and repeat the above steps until the judgment is complete;
Wherein i is a positive integer whose initial value is 1; N denotes the byte length of the character block to be matched and is a positive integer.
9. The device according to claim 8, characterized in that the matching module further comprises an addition submodule;
The judging submodule is further configured to, when the first character block to be matched has a corresponding step value in the preset SHIFT table, judge whether that step value is a preset step value;
The acquisition submodule is further configured to, when the step value corresponding to the character block to be matched in the preset SHIFT table is the preset step value, obtain the Lmin-N bytes preceding the i-th byte;
The addition submodule is configured to prepend the Lmin-N bytes to the first character block to be matched to form a second character block to be matched;
The acquisition submodule is further configured to query a preset hash table according to the first character block to be matched to obtain a first harmful pattern string that includes the first character block to be matched;
The judging submodule is further configured to judge whether the second character block to be matched matches the first harmful pattern string;
Adding 1 to i and repeating the above steps until the judgment is complete;
Wherein N denotes the byte length of the character block to be matched and is a positive integer; Lmin denotes the minimum length of the harmful pattern strings and is a positive integer.
10. The device according to claim 8, characterized in that the acquisition submodule is further configured to, when a skip distance parameter is set on a character in the first character block to be matched, add Length-2(Lmin-1)+1 to i, and re-obtain the first character block to be matched for matching.
CN201510884627.2A 2015-12-04 2015-12-04 Harmful code detection method and device based on HTTP static compress data flow Active CN106850504B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510884627.2A CN106850504B (en) 2015-12-04 2015-12-04 Harmful code detection method and device based on HTTP static compress data flow

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510884627.2A CN106850504B (en) 2015-12-04 2015-12-04 Harmful code detection method and device based on HTTP static compress data flow

Publications (2)

Publication Number Publication Date
CN106850504A true CN106850504A (en) 2017-06-13
CN106850504B CN106850504B (en) 2019-11-15

Family

ID=59150401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510884627.2A Active CN106850504B (en) 2015-12-04 2015-12-04 Harmful code detection method and device based on HTTP static compress data flow

Country Status (1)

Country Link
CN (1) CN106850504B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1359196A (en) * 2000-12-15 2002-07-17 国际商业机器公司 Quick joint image expert group Huffman coding and decoding method
CN104468044A (en) * 2014-12-05 2015-03-25 北京国双科技有限公司 Data compression method and device applied to network transmission
CN104811209A (en) * 2015-04-22 2015-07-29 北京理工大学 Compressed file data embedding method and device capable of resisting longest matching detection

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090115A (en) * 2017-11-03 2018-05-29 中国科学院信息工程研究所 A kind of filter method and system for Gzip compressed datas
CN108090115B (en) * 2017-11-03 2022-05-17 中国科学院信息工程研究所 Filtering method and system for Gzip compressed data
WO2023082156A1 (en) * 2021-11-10 2023-05-19 山东方寸微电子科技有限公司 Lz77 decoding circuit and operation method thereof

Also Published As

Publication number Publication date
CN106850504B (en) 2019-11-15

Similar Documents

Publication Publication Date Title
US9558241B2 (en) System and method for performing longest common prefix strings searches
US20190273510A1 (en) Classification of source data by neural network processing
US8458354B2 (en) Multi-pattern matching in compressed communication traffic
US9727574B2 (en) System and method for applying an efficient data compression scheme to URL parameters
Gueniche et al. Compact prediction tree: A lossless model for accurate sequence prediction
US20110219357A1 (en) Compressing source code written in a scripting language
WO2011007956A2 (en) Data compression method
JP2009542092A5 (en)
EP3195481B1 (en) Adaptive rate compression hash processing device
CN104868922A (en) Data compression method and device
US9264068B2 (en) Deflate compression algorithm
CN107947918A (en) A kind of carrier-free text steganography method based on character feature
CN104811209B (en) A kind of the compressed file data embedding method and device of anti-most long matching detection
US8909813B2 (en) Efficient processing of compressed communication traffic
CN103701470B (en) Stream intelligence prediction differencing and compression algorithm and corresponding control device
CN106850504A (en) Harmful code detection method and device based on HTTP static compress data flows
CN107623855A (en) A kind of embedded rate steganography device of height based on compressed encoding and steganography method
CN107277109B (en) Multi-string matching method for compressed flow
CN105391514B (en) Character code coding/decoding method and device
Bremler-Barr et al. Decompression-free inspection: Dpi for shared dictionary compression over http
CN106850507A (en) Harmful code detection method and device based on HTTP compressed data streams
CN108090115B (en) Filtering method and system for Gzip compressed data
CN108573069B (en) Twins method for accelerating matching of regular expressions of compressed flow
Beirami et al. Packet-level network compression: Realization and scaling of the network-wide benefits
US9923576B2 (en) Decoding techniques using a programmable priority encoder

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant