CN106850504A - Harmful code detection method and device based on HTTP static compress data flows - Google Patents
- Publication number
- CN106850504A; CN201510884627.2A
- Authority
- CN
- China
- Prior art keywords
- character
- string
- matched
- pointer
- compressed data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0245—Filtering by information in the payload
Landscapes
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The present invention provides a harmful code detection method and device based on HTTP static compressed data streams. The method includes: searching a gzip compressed data stream for the target binary string corresponding to a harmful pattern string to obtain the position where the target binary string first appears; performing Huffman decoding on the gzip compressed data stream after that position to obtain a first compressed data stream; and triggering the multi-pattern matching window to slide, which drives the LZ77 decompression sliding window along the first compressed data stream. For a long pointer in the first compressed data stream, only boundary matching is performed, and the temporary stack is checked for an index position that lies within the reference string pointed to by the pointer; if one exists, the corresponding index position at the target position where the pointer is located is determined and stored in the temporary stack. LZ77 decompression and skip-based multi-pattern matching thus proceed side by side, which shortens intrusion or harmful code detection time and speeds up the user's online experience.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a harmful code detection method and device based on HTTP static compressed data streams.
Background
At present, to improve the security of network transmission, a firewall or gateway server located between an Internet server and a client must perform intrusion detection and harmful code detection on each TCP compressed fragment of a gzip compressed data stream that the Internet server sends over the Hypertext Transfer Protocol (HTTP). A gzip compressed data stream is obtained by applying LZ77 compression and then Huffman coding to a data stream. Each TCP compressed fragment is forwarded to the client only after it passes detection.
In the prior art, when a firewall or gateway server inspects a compressed data stream, it must first perform Huffman decoding and then LZ77 decompression on the stream, and only afterwards run intrusion or harmful code detection on the decompressed data. Huffman decoding and LZ77 decompression of the whole stream consume a large amount of time and storage, lengthen the transmission time of the compressed data stream, and slow down the client user's online experience.
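As an illustration of the prior-art flow described above, a minimal Python sketch might look like the following. It assumes the whole gzip stream has already been reassembled in memory; the function name naive_scan and the sample payload are hypothetical and do not come from the source.

```python
import gzip


def naive_scan(gzip_bytes, patterns):
    """Prior-art style check: fully decompress first, then search the plaintext."""
    # gzip.decompress performs both Huffman decoding and LZ77 decompression.
    plaintext = gzip.decompress(gzip_bytes)
    return [p for p in patterns if p in plaintext]


# Hypothetical payload standing in for a server response body.
payload = gzip.compress(b"<html>hello there, world</html>")
hits = naive_scan(payload, [b"there", b"evil"])
```

Every byte is decompressed and buffered before any pattern is examined, which is the time and storage cost the background section refers to.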
Summary of the invention
The present invention provides a harmful code detection method and device based on HTTP static compressed data streams, to solve the problem that the existing detection process consumes a large amount of time and storage resources.
A first aspect of the present invention provides a harmful code detection method based on HTTP static compressed data streams, including:
obtaining the gzip compressed data stream that an Internet server sends and that a client is to receive, where the gzip compressed data stream is obtained by the Internet server by applying LZ77 compression and Huffman coding to original plaintext data;
querying a preset Huffman tree to obtain the binary Huffman code corresponding to each character in a harmful pattern string, and generating the target binary string corresponding to the harmful pattern string;
searching the gzip compressed data stream for the target binary string to obtain the position where the target binary string first appears in the gzip compressed data stream;
performing Huffman decoding on the compressed data after that position in the gzip compressed data stream to obtain a first compressed data stream that is LZ77-compressed, where the first compressed data stream includes at least one TCP compressed fragment;
triggering the multi-pattern matching window to slide, so as to drive the LZ77 decompression sliding window along the first compressed data stream, so that while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream, the multi-pattern matching window matches harmful pattern strings against the original plaintext data obtained by decompression. During the LZ77 decompression, for each pointer in the first compressed data stream, a preset temporary stack is queried to judge whether it holds a first index position that lies within the reference string pointed to by the pointer; if so, the second index position, within the copy of the reference string written at the target position where the pointer is located, is determined from the first index position and stored in the temporary stack. It is also judged whether the length Length of the reference string pointed to by the pointer is greater than 2(Lmin-1); when it is, a skip distance parameter Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference string copied at the target position, so that when the multi-pattern matching window matches harmful pattern strings against the decompressed plaintext, it jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference string to the (Length-(Lmin-1)+1)-th character and resumes pattern matching there. The distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-pattern matching window is less than Lmin-1. Lmin is the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference string pointed to by the pointer during LZ77 decompression. The pointer includes a first distance, from the target position where the pointer is located to the position of the reference string it points to, and the length of the reference string;
if, after the at least one TCP compressed fragment has been fully decompressed, the corresponding original plaintext data contains no character string that matches a harmful pattern string, sending the gzip compressed data stream corresponding to the TCP compressed fragment to the client.
Further, matching harmful pattern strings against the original plaintext data obtained by decompression using the multi-pattern matching window includes:
obtaining a first character block to be matched, consisting of the i-th to the (i+N-1)-th bytes of the original plaintext data;
judging whether a skip distance parameter is set on any character in the first character block to be matched;
if no skip distance parameter is set on any character in the first character block to be matched, judging whether the block has a corresponding step value in a preset SHIFT table; if it has none, adding Lmin-1 to i and repeating the above steps until the judgment is complete;
where i is a positive integer with an initial value of 1, and N is the byte length of the character block to be matched and is a positive integer.
Further, the method also includes:
if the first character block to be matched has a corresponding step value in the preset SHIFT table, judging whether that step value is a preset step value;
if the step value is the preset step value, obtaining the Lmin-N bytes before the i-th byte;
prepending the Lmin-N bytes to the first character block to be matched to form a second character block to be matched;
querying a preset hash table with the first character block to be matched to obtain a first harmful pattern string that includes the first character block to be matched;
judging whether the second character block to be matched matches the first harmful pattern string;
adding 1 to i and repeating the above steps until the judgment is complete;
where N is the byte length of the character block to be matched and is a positive integer, and Lmin is the minimum length of the harmful pattern strings and is a positive integer.
Further, the method also includes:
if a skip distance parameter is set on a character in the first character block to be matched, adding Length-2(Lmin-1)+1 to i, and obtaining the first character block to be matched again for matching.
Further, the method also includes:
if the second character block to be matched matches the first harmful pattern string, storing the index position of the second character block to be matched in the temporary stack.
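The block-based matching steps above resemble a Wu-Manber style search. The following Python sketch is one possible reading, not the patented implementation: the names build_tables and scan, the choice N = 2, a preset step value of 0, and advancing by the stored step value when it is non-zero are all assumptions. The sketch uses 0-based offsets, advances by Lmin-N+1 when a block has no SHIFT entry (which equals the Lmin-1 of the text when N = 2), and omits the skip distance parameter handling.

```python
def build_tables(patterns, n=2):
    """Build SHIFT and hash tables from the first Lmin characters of each pattern."""
    lmin = min(len(p) for p in patterns)
    shift, table = {}, {}
    for p in patterns:
        prefix = p[:lmin]
        for j in range(lmin - n + 1):
            block = prefix[j:j + n]
            dist = lmin - n - j  # distance from this block to the end of the prefix
            shift[block] = min(shift.get(block, dist), dist)
        # Hash table: last block of the prefix -> candidate patterns.
        table.setdefault(prefix[-n:], []).append(p)
    return lmin, shift, table


def scan(text, patterns, n=2):
    """Return (offset, pattern) pairs found in text, using 0-based offsets."""
    lmin, shift, table = build_tables(patterns, n)
    hits = []
    i = lmin - n  # start of the first block (last n bytes of the first window)
    while i + n <= len(text):
        block = text[i:i + n]
        step = shift.get(block)
        if step is None:
            i += lmin - n + 1  # block occurs in no pattern: largest safe advance
        elif step > 0:
            i += step          # assumption: advance by the stored step value
        else:
            start = i - (lmin - n)  # prepend the Lmin-N bytes before the block
            for p in table[block]:
                if text.startswith(p, start):
                    hits.append((start, p))
            i += 1
    return hits


hits = scan("say hello there", ["there", "hell"])
# hits == [(4, "hell"), (10, "there")]
```

Only windows whose final block has step value 0 are compared against candidate patterns; all other positions are skipped without a full comparison.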
Further, judging whether the temporary stack holds a first index position that lies within the reference string pointed to by the pointer includes:
obtaining the index positions, stored in the temporary stack, of character strings that match harmful pattern strings, where each index position includes the index address of the character string and the length of the character string;
judging whether the index position of the character string and the pointer satisfy the following formula:
(target position of the pointer) - (first distance in the pointer) <= (index position of the character string) <= (target position of the pointer) - (first distance in the pointer) + (length of the reference string in the pointer);
if the index position of the character string and the pointer satisfy the above formula, determining that the temporary stack holds a first index position that lies within the reference string pointed to by the pointer.
In the present invention, a gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained; a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string; the gzip compressed data stream is searched for the target binary string to obtain the position where it first appears; and Huffman decoding is performed on the compressed data after that position to obtain a first compressed data stream that is LZ77-compressed. The multi-pattern matching window is then triggered to slide, which drives the LZ77 decompression sliding window along the first compressed data stream, so that harmful pattern strings are matched against the decompressed original plaintext data while LZ77 decompression is still in progress. For a long pointer in the first compressed data stream, only boundary matching is performed, and the temporary stack is checked for an index position that lies within the reference string pointed to by the pointer; if one exists, the corresponding index position at the target position where the pointer is located is determined and stored in the temporary stack. The distance between the LZ77 decompression sliding window and the multi-pattern matching window is kept below a preset distance threshold. LZ77 decompression and skip-based multi-pattern matching therefore proceed side by side, which shortens intrusion or harmful code detection time for the compressed data stream and speeds up the client user's online experience.
Another aspect of the present invention provides a harmful code detection device based on HTTP static compressed data streams, including:
an acquisition module, configured to obtain the gzip compressed data stream that an Internet server sends and that a client is to receive, where the gzip compressed data stream is obtained by the Internet server by applying LZ77 compression and Huffman coding to original plaintext data;
a query module, configured to query a preset Huffman tree to obtain the binary Huffman code corresponding to each character in a harmful pattern string, and to generate the target binary string corresponding to the harmful pattern string;
the query module being further configured to search the gzip compressed data stream for the target binary string to obtain the position where the target binary string first appears in the gzip compressed data stream;
a decompression module, configured to perform Huffman decoding on the compressed data after that position in the gzip compressed data stream to obtain a first compressed data stream that is LZ77-compressed, where the first compressed data stream includes at least one TCP compressed fragment;
a trigger module, configured to trigger the multi-pattern matching window to slide, so as to drive the LZ77 decompression sliding window along the first compressed data stream;
a matching module, configured to match harmful pattern strings against the original plaintext data obtained by decompression using the multi-pattern matching window while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream. During the LZ77 decompression, for each pointer in the first compressed data stream, a preset temporary stack is queried to judge whether it holds a first index position that lies within the reference string pointed to by the pointer; if so, the second index position, within the copy of the reference string written at the target position where the pointer is located, is determined from the first index position and stored in the temporary stack. It is also judged whether the length Length of the reference string pointed to by the pointer is greater than 2(Lmin-1); when it is, a skip distance parameter Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference string copied at the target position, so that when the multi-pattern matching window matches harmful pattern strings against the decompressed plaintext, it jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference string to the (Length-(Lmin-1)+1)-th character and resumes pattern matching there. The distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-pattern matching window is less than Lmin-1. Lmin is the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference string pointed to by the pointer during LZ77 decompression. The temporary stack stores the index positions of the pattern strings in the at least one TCP compressed fragment that match harmful pattern strings. The pointer includes a first distance, from the target position where the pointer is located to the position of the reference string it points to, and the length of the reference string;
a sending module, configured to send the gzip compressed data stream corresponding to the TCP compressed fragment to the client when, after the at least one TCP compressed fragment has been fully decompressed, the corresponding original plaintext data contains no character string that matches a harmful pattern string.
Further, the matching module includes an acquisition submodule and a judging submodule.
When the matching module matches harmful pattern strings against the original plaintext data obtained by decompression using the multi-pattern matching window, the acquisition submodule is configured to obtain a first character block to be matched, consisting of the i-th to the (i+N-1)-th bytes of the original plaintext data;
the judging submodule is configured to judge whether a skip distance parameter is set on any character in the first character block to be matched;
the judging submodule is further configured to, when no skip distance parameter is set on any character in the first character block to be matched, judge whether the block has a corresponding step value in a preset SHIFT table, and, if it has none, add Lmin-1 to i and repeat the above steps until the judgment is complete;
where i is a positive integer with an initial value of 1, and N is the byte length of the character block to be matched and is a positive integer.
Further, the matching module also includes an adding submodule.
The judging submodule is further configured to, when the first character block to be matched has a corresponding step value in the preset SHIFT table, judge whether that step value is a preset step value;
the acquisition submodule is further configured to, when the step value is the preset step value, obtain the Lmin-N bytes before the i-th byte;
the adding submodule is configured to prepend the Lmin-N bytes to the first character block to be matched to form a second character block to be matched;
the acquisition submodule is further configured to query a preset hash table with the first character block to be matched to obtain a first harmful pattern string that includes the first character block to be matched;
the judging submodule is further configured to judge whether the second character block to be matched matches the first harmful pattern string;
1 is then added to i and the above steps are repeated until the judgment is complete;
where N is the byte length of the character block to be matched and is a positive integer, and Lmin is the minimum length of the harmful pattern strings and is a positive integer.
Further, the acquisition submodule is also configured to, when a skip distance parameter is set on a character in the first character block to be matched, add Length-2(Lmin-1)+1 to i and obtain the first character block to be matched again for matching.
In the present invention, a gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained; a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string; the gzip compressed data stream is searched for the target binary string to obtain the position where it first appears; and Huffman decoding is performed on the compressed data after that position to obtain a first compressed data stream that is LZ77-compressed. The multi-pattern matching window is then triggered to slide, which drives the LZ77 decompression sliding window along the first compressed data stream, so that harmful pattern strings are matched against the decompressed original plaintext data while LZ77 decompression is still in progress. For a long pointer in the first compressed data stream, only boundary matching is performed, and the temporary stack is checked for an index position that lies within the reference string pointed to by the pointer; if one exists, the corresponding index position at the target position where the pointer is located is determined and stored in the temporary stack. The distance between the LZ77 decompression sliding window and the multi-pattern matching window is kept below a preset distance threshold. LZ77 decompression and skip-based multi-pattern matching therefore proceed side by side, which shortens intrusion or harmful code detection time for the compressed data stream and speeds up the client user's online experience.
Brief description of the drawings
Fig. 1 is a flowchart of one embodiment of the harmful code detection method based on HTTP static compressed data streams provided by the present invention;
Fig. 2 is a schematic diagram of a pointer in the first compressed data stream querying the preset temporary stack to obtain the first index position and store the second index position;
Fig. 3 is a flowchart of another embodiment of the harmful code detection method based on HTTP static compressed data streams provided by the present invention;
Fig. 4 is a schematic structural diagram of one embodiment of the harmful code detection device based on HTTP static compressed data streams provided by the present invention;
Fig. 5 is a schematic structural diagram of another embodiment of the harmful code detection device based on HTTP static compressed data streams provided by the present invention.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flowchart of one embodiment of the harmful code detection method based on HTTP static compressed data streams provided by the present invention. As shown in Fig. 1, the method includes the following steps:
101. Obtain the gzip compressed data stream that an Internet server sends and that a client is to receive. The gzip compressed data stream is obtained by the Internet server by applying LZ77 compression and Huffman coding to original plaintext data.
The harmful code detection method based on HTTP static compressed data streams provided by the present invention is executed by a harmful code detection device based on HTTP static compressed data streams, which may specifically be a firewall or gateway server located between the Internet server and the client.
LZ77 is an adaptive compression algorithm based on backward-pointing pointers. Its core is to search the history characters in the LZ77 compression sliding window for repeated bytes; when repeated bytes are found, they are replaced by a short pointer (distance, length). In the pointer, distance is the distance back to the repeated bytes, from 1 to 32 KB, and length is a number between 3 and 258 that gives the length of the repeated bytes. For example, the string "applefapplt" can be compressed to "applef(6,4)t". The objects of LZ77 compression in the original plaintext data may specifically be HTML, JavaScript, CSS files, and so on.
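The pointer replacement described above can be illustrated with a small Python sketch that expands a token list back into text. The token representation (literal strings mixed with (distance, length) tuples) and the name lz77_decode are assumptions made for illustration only; real gzip streams encode these tokens with Huffman codes.

```python
def lz77_decode(tokens):
    """Expand a list of literal strings and (distance, length) pointers."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, length = tok
            start = len(out) - distance
            # Copy one character at a time so that a pointer may overlap
            # the text it is currently producing.
            for k in range(length):
                out.append(out[start + k])
        else:
            out.extend(tok)
    return "".join(out)


decoded = lz77_decode(["applef", (6, 4), "t"])
# decoded == "applefapplt"
```

The pointer (6,4) copies the four characters that begin six positions back, which reproduces the example in the text.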
After the Internet server performs LZ77 compression on the original plaintext data, it compresses the LZ77 output with Huffman coding as follows: the pointers and characters generated during LZ77 compression are compressed a second time with Huffman coding, yielding two sets of Huffman tree information and two kinds of Huffman-coded bit streams; finally, the data sequence information of the two Huffman trees is compressed with run-length coding and Huffman coding.
After the Internet server finishes compressing the original plaintext data into a gzip compressed data stream, it may split the stream into multiple consecutive gzip compressed data packets, assign an HTTP header to each gzip compressed data packet, and transmit them.
102. Query the preset Huffman tree to obtain the binary Huffman code corresponding to each character in the harmful pattern string, and generate the target binary string corresponding to the harmful pattern string.
In the Huffman tree, each character corresponds to one Huffman code value. For example, the Huffman code for the character "a" may be 0100; for "r", 10111; for "e", 001; for "o", 10110; for "t", 1000; and for "h", 10011. If a harmful pattern string is "there", the corresponding target binary string is "10001001100110111001".
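Step 102 amounts to concatenating per-character codes. A minimal Python sketch using the example code values from the text is shown below; the dictionary HUFFMAN_CODES and the function target_bit_string are hypothetical names, and a real code table would be read from the Huffman tree information of the stream being inspected.

```python
# Example code values taken from the text; a real table depends on the stream.
HUFFMAN_CODES = {
    "a": "0100",
    "r": "10111",
    "e": "001",
    "o": "10110",
    "t": "1000",
    "h": "10011",
}


def target_bit_string(pattern, codes):
    """Concatenate the Huffman code of every character in the pattern."""
    return "".join(codes[ch] for ch in pattern)


target = target_bit_string("there", HUFFMAN_CODES)
# target == "10001001100110111001"
```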
103. Search the gzip compressed data stream for the target binary string to obtain the position where the target binary string first appears in the gzip compressed data stream.
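One way to picture step 103 is a bit-level substring search. The Python sketch below is an assumption-laden illustration, not the method of the source: it assumes DEFLATE-style packing in which bits are read from the least significant bit of each byte, it ignores gzip headers and block boundaries, and it does not check that a hit is aligned to a Huffman code boundary. The names bytes_to_bits, first_occurrence, and pack_bits are hypothetical.

```python
def bytes_to_bits(data):
    """Render bytes as a bit string, least significant bit of each byte first."""
    return "".join(format(byte, "08b")[::-1] for byte in data)


def first_occurrence(data, target_bits):
    """Return the bit offset of the first hit, or -1 if there is none."""
    return bytes_to_bits(data).find(target_bits)


def pack_bits(bits):
    """Inverse helper for building test streams; pads with zero bits."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8][::-1], 2) for i in range(0, len(padded), 8))


# Hypothetical stream: code for "a", then the target for "there", then "e".
stream = pack_bits("0100" + "10001001100110111001" + "001")
position = first_occurrence(stream, "10001001100110111001")
# position == 4
```

Converting the whole stream to a text bit string is wasteful and is done here only for readability.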
104. Perform Huffman decoding on the compressed data after that position in the gzip compressed data stream to obtain a first compressed data stream that is LZ77-compressed. The first compressed data stream includes at least one TCP compressed fragment.
The gzip compressed data stream received by the firewall or gateway server may specifically be multiple consecutive gzip compressed data packets carrying HTTP headers, each of which contains one TCP compressed fragment. After receiving the gzip compressed data stream, the firewall or gateway server may remove the HTTP header of each compressed data packet and assemble the packets into a compressed data stream by sequence number. A compressed data packet is generally 64 to 1518 bytes in size. The Huffman tree information is stored in one gzip compressed data packet. In general, different gzip compressed data streams have different Huffman tree information.
105. Trigger the multi-pattern matching window to slide, so as to drive the LZ77 decompression sliding window along the first compressed data stream, so that while the LZ77 decompression sliding window performs LZ77 decompression on the first compressed data stream, the multi-pattern matching window matches harmful pattern strings against the original plaintext data obtained by decompression. During the LZ77 decompression, for each pointer in the first compressed data stream, the preset temporary stack is queried to judge whether it holds a first index position that lies within the reference string pointed to by the pointer; if so, the second index position, within the copy of the reference string written at the target position where the pointer is located, is determined from the first index position and stored in the temporary stack. It is also judged whether the length Length of the reference string pointed to by the pointer is greater than 2(Lmin-1); when it is, a skip distance parameter Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference string copied at the target position, so that when the multi-pattern matching window matches harmful pattern strings against the decompressed plaintext, it jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference string to the (Length-(Lmin-1)+1)-th character and resumes pattern matching there. The distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-pattern matching window is less than Lmin-1. Lmin is the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference string pointed to by the pointer during LZ77 decompression. The temporary stack stores the index positions of the pattern strings in the at least one TCP compressed fragment that match harmful pattern strings. The pointer includes a first distance, from the target position where the pointer is located to the position of the reference string it points to, and the length of the reference string.
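The skip rule in step 105 is plain arithmetic and can be sketched as follows in Python. The function name skip_distance and the sample values Length = 20 and Lmin = 5 are hypothetical.

```python
def skip_distance(length, lmin):
    """Return the skip distance parameter for a pointer, or None if too short.

    Only the first and last (lmin - 1) characters of the copied reference
    string are scanned; the middle part is jumped over.
    """
    border = lmin - 1
    if length > 2 * border:
        return length - 2 * border + 1
    return None


length, lmin = 20, 5
skip = skip_distance(length, lmin)        # 13
# Jumping from the (Lmin-1)-th character lands on character Length-(Lmin-1)+1.
landing = (lmin - 1) + skip               # 17
```

With these sample values the matcher jumps from the 4th character of the copied string to the 17th, so only the border regions are scanned.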
A plurality of harmful pattern strings is stored in advance on the firewall or gateway server. The LZ77 decompression sliding window may be 32 KB long, and the multi-mode matching window may likewise be 32 KB long. Fig. 2 is a schematic diagram of how, while the first compressed data stream is being LZ77-decompressed with the LZ77 decompression sliding window, a pointer in the first compressed data stream queries the preset interim stack to obtain the first index position, and how the second index position is stored. In Fig. 2, the first index position in the reference character string pointed to by the pointer is marked by the black vertical line in the reference character string. The second index position, in the reference character string substituted at the target position where the pointer is located, is marked by the black vertical line in the first repeated character string, that is, the position indicated by the dashed arrow of the interim stack. The middle region of the first repeated character string is the region to be skipped; the regions at its two ends are the regions that need to be matched.
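The arithmetic behind the skipped middle region can be sketched as follows. The helper name `skip_plan` and its return tuple are illustrative assumptions; only the threshold 2(Lmin-1) and the skip distance Length-2(Lmin-1)+1 come from the description above.

```python
def skip_plan(length, lmin):
    """Return (edge, skip, resume) for a copied reference string, or None.

    edge   : number of leading characters that are still matched (Lmin-1)
    skip   : skip distance parameter, Length-2(Lmin-1)+1
    resume : 1-based character of the copy at which matching resumes
    """
    edge = lmin - 1
    if length <= 2 * edge:
        # The copy is not longer than 2(Lmin-1): there is no middle region to skip.
        return None
    skip = length - 2 * edge + 1
    return edge, skip, edge + skip


print(skip_plan(20, 5))  # a 20-character copy with Lmin = 5
print(skip_plan(8, 5))   # too short, so nothing is skipped
```

For a 20-character copy and Lmin = 5, matching would cover the first 4 characters, skip 13 characters, and resume at character 17, which equals Length-(Lmin-1)+1.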
Before decompression, the interim stack may contain no index position information. Specifically, the process in step 103 of judging whether the interim stack holds a first index position located in the reference character string pointed to by the pointer may include: obtaining the index positions, stored in the interim stack, of the character strings that match a harmful pattern string, each index position including the index address of the character string and the length of the character string; and judging whether the index position of the character string and the pointer satisfy the following formula:
(target position where the pointer is located) - (first distance in the pointer) <= (index position of the character string) <= (target position where the pointer is located) - (first distance in the pointer) + (length of the reference character string in the pointer);
If the index position of the character string and the pointer satisfy the above formula, it is determined that the interim stack holds a first index position located in the reference character string pointed to by the pointer.
For example, suppose the interim stack stores the index positions (10, 3), (20, 4), (25, 1) and (40, 2), and the pointers (40, 20) and (50, 10) occur at current positions 70 and 100 respectively. In the interim stack, the index position (40, 2) lies inside the position interval (30, 50) of the reference character string pointed to by the pointer (40, 20), since 70-40 <= 40 <= 70-40+20. The copy produced by the pointer (40, 20) at position 70 therefore contains a harmful pattern string at index position (80, 2), where 80 = 70 + (40 - 30), and the index position (80, 2) is inserted into the interim stack. Likewise, the position interval of the reference character string pointed to by the pointer (50, 10) at position 100 contains no index position of a harmful pattern string.
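The interval test and the worked example above can be reproduced with a short sketch. The function name `propagate_matches` and the list-of-tuples representation of the interim stack are assumptions made for illustration; the inclusive bounds follow the formula as stated.

```python
def propagate_matches(stack, target_pos, distance, length):
    """Copy match index positions from a reference string to its new copy.

    stack      : list of (index_address, match_length) already recorded
    target_pos : position at which the pointer's copy is written
    distance   : first distance stored in the pointer
    length     : length of the reference string stored in the pointer
    """
    ref_start = target_pos - distance
    ref_end = ref_start + length
    added = []
    for address, match_len in stack:
        # Interval test from the formula: ref_start <= address <= ref_end.
        if ref_start <= address <= ref_end:
            # The same match reappears `distance` positions later, in the copy.
            added.append((address + distance, match_len))
    stack.extend(added)
    return added


stack = [(10, 3), (20, 4), (25, 1), (40, 2)]
print(propagate_matches(stack, 70, 40, 20))   # pointer (40, 20) at position 70
print(propagate_matches(stack, 100, 50, 10))  # pointer (50, 10) at position 100
```

The first call records (80, 2) in the stack; the second finds nothing in the interval from 50 to 60.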
106. If, after all of the at least one TCP compression burst has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string, the gzip compressed data stream corresponding to the TCP compression burst is sent to the client.
If the corresponding original plaintext data does contain a character string matching a harmful pattern string, the firewall or gateway server discards the gzip compressed data stream corresponding to the TCP compression burst.
In step 104, the original plaintext data corresponding to the at least one TCP compression burst is the original plaintext data corresponding to the first compressed data stream that includes the at least one TCP compression burst.
In this embodiment, the gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained, and a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string. The gzip compressed data stream is searched with the target binary string to find the position where the target binary string first occurs, and Huffman decoding is applied to the compressed data stream after that position to obtain the first compressed data stream, which is compressed by LZ77. Sliding of the multi-mode matching window is then triggered so as to drive the LZ77 decompression sliding window along the first compressed data stream; in this way, while the LZ77 decompression sliding window LZ77-decompresses the first compressed data stream, the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings. For a long pointer in the first compressed data stream, only boundary matching is performed, and it is judged whether the interim stack holds an index position located in the reference character string pointed to by the pointer; if so, the corresponding index position at the target position where the pointer is located is determined and stored in the interim stack. The distance between the LZ77 decompression sliding window and the multi-mode matching window is less than a preset distance threshold. LZ77 decompression and skip-based multi-mode matching are thus carried out side by side, which shortens the intrusion or harmful code detection time for the compressed data stream and speeds up the client user's browsing experience.
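As a reminder of what the LZ77 stage does with a pointer, the following sketch expands a token stream in which each token is either literal text or a (distance, length) pointer. The token representation and the name `lz77_expand` are assumptions for illustration only; a real gzip stream carries these tokens as Huffman-coded bits.

```python
def lz77_expand(tokens):
    """Expand LZ77 tokens: a str is literal text, a tuple is (distance, length)."""
    out = []
    for token in tokens:
        if isinstance(token, str):
            out.extend(token)          # literal characters are copied as-is
        else:
            distance, length = token
            start = len(out) - distance
            # Copy one character at a time so that overlapping copies work.
            for k in range(length):
                out.append(out[start + k])
    return "".join(out)


print(lz77_expand(["river ", (6, 5)]))
```

The pointer (6, 5) reuses the five characters that begin six positions back, which is why a match already found in the reference string does not need to be searched for again in the copy.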
Fig. 3 is a flow chart of another embodiment of the harmful code detection method based on HTTP static compressed data streams provided by the present invention. On the basis of the embodiment shown in Fig. 1, the process in step 105 of using the multi-mode matching window to match the decompressed original plaintext data against the harmful pattern strings may specifically include the following steps:
1051. Obtain the first character block to be matched, composed of the i-th byte to the (i+N-1)-th byte of the original plaintext data.
Here i is a positive integer whose initial value is 1, and N is the byte length of the character block to be matched and is a positive integer.
1052. Judge whether a skip distance parameter is set on any character in the first character block to be matched. If no skip distance parameter is set on any character in the first character block to be matched, perform step 1053; if a skip distance parameter is set on a character in the first character block to be matched, perform step 1055.
1053. Judge whether the first character block to be matched has a corresponding step value in the preset SHIFT table. If it does not, perform step 1054; if it does, perform step 1056.
Before judging whether the first character block to be matched has a corresponding step value in the preset SHIFT table, the firewall or gateway server may create the SHIFT table and the hash table from the harmful pattern strings.
The SHIFT table is created as follows: the minimum length Lmin of the harmful pattern strings is obtained; the first Lmin characters of each harmful pattern string are taken, starting from its beginning; and the SHIFT table is generated from each character block in those first Lmin characters and the step of that character block within the first Lmin characters. The hash table is then generated according to which harmful pattern strings include each character block; it stores each character block together with the harmful pattern strings that include it.
For example, suppose the harmful pattern strings are: rainbow, shine, river, version, brush. The truncated prefixes, the character blocks and the generated SHIFT table are shown in Table 1.
Table 1
Here a step of 0 indicates that the current character block coincides with the ending characters of a truncated harmful pattern string.
1054. Add Lmin-1 to i and repeat step 1051 until the judgment is complete.
1055. Add Length-2(Lmin-1)+1 to i and repeat step 1051 until the judgment is complete.
1056. Judge whether the step value corresponding to the character block to be matched in the preset SHIFT table is the preset step value. If it is, perform step 1057; if a corresponding step value exists but is not the preset step value, perform step 1051.
The preset step value here may be 0.
1057. Obtain the Lmin-N bytes preceding the i-th byte.
1058. Prepend the Lmin-N bytes to the first character block to be matched to form the second character block to be matched.
1059. Query the preset hash table with the first character block to be matched to obtain the first harmful pattern string that includes the first character block to be matched.
1060. Judge whether the second character block to be matched matches the first harmful pattern string, and then perform step 1061; if the second character block to be matched matches the first harmful pattern string, also perform step 1062.
Here N is the byte length of the character block to be matched and Lmin is the minimum length of the harmful pattern strings; both are positive integers.
1061. Add 1 to i and repeat step 1051 until the judgment is complete.
1062. Store the index position of the second character block to be matched in the interim stack.
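Steps 1051 to 1062 resemble a Wu-Manber style block-shift scan, and the sketch below follows that reading. Several choices are assumptions rather than statements of the method: the block length N = 2, 0-based indexing with the scan starting at the first position where Lmin-N preceding bytes exist, advancing by the step value when it is nonzero, verifying the whole pattern rather than only its first Lmin characters, and the hypothetical `skip_at` mapping that stands in for skip distance parameters set during decompression.

```python
def build_tables(patterns, n=2):
    """Build the SHIFT table and hash table from the first Lmin characters."""
    lmin = min(len(p) for p in patterns)
    shift, hashed = {}, {}
    for p in patterns:
        prefix = p[:lmin]
        for end in range(n, lmin + 1):
            block = prefix[end - n:end]
            shift[block] = min(shift.get(block, lmin), lmin - end)
        hashed.setdefault(prefix[-n:], []).append(p)
    return lmin, shift, hashed


def scan(text, patterns, n=2, skip_at=None):
    """Return (start, pattern) for every pattern occurrence found in text."""
    skip_at = skip_at or {}
    lmin, shift, hashed = build_tables(patterns, n)
    found = []
    i = lmin - n                         # 0-based start of the current block
    while i + n <= len(text):
        skips = [skip_at[k] for k in range(i, i + n) if k in skip_at]
        if skips:                        # step 1055: jump over a copied region
            i += max(1, skips[0])
            continue
        block = text[i:i + n]
        step = shift.get(block)
        if step is None:                 # step 1054: block occurs in no prefix
            i += lmin - n + 1
        elif step > 0:                   # nonzero step (assumed: advance by it)
            i += step
        else:                            # steps 1057-1062: verify candidates
            start = i + n - lmin         # Lmin-N bytes before the block
            for p in hashed.get(block, []):
                if text.startswith(p, start):
                    found.append((start, p))
            i += 1                       # step 1061
    return found


PATTERNS = ["rainbow", "shine", "river", "version", "brush"]
print(scan("the river version of sunshine", PATTERNS))
```

With N = 2 the default advance Lmin-N+1 equals Lmin-1, which agrees with step 1054. Passing, for example, `skip_at={7: 20}` makes the scan jump over the region in which "river" and "version" occur, leaving those matches to be supplied by the interim stack.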
In this embodiment, the gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained, and a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string. The gzip compressed data stream is searched with the target binary string to find the position where the target binary string first occurs, and Huffman decoding is applied to the compressed data stream after that position to obtain the first compressed data stream, which is compressed by LZ77. Sliding of the multi-mode matching window is then triggered so as to drive the LZ77 decompression sliding window along the first compressed data stream; in this way, while the LZ77 decompression sliding window LZ77-decompresses the first compressed data stream, the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings. For a long pointer in the first compressed data stream, only boundary matching is performed, and it is judged whether the interim stack holds an index position located in the reference character string pointed to by the pointer; if so, the corresponding index position at the target position where the pointer is located is determined and stored in the interim stack. When the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings, the first character block to be matched, composed of the i-th byte to the (i+N-1)-th byte of the original plaintext data, is obtained. It is first judged whether a skip distance parameter is set on any character in the first character block to be matched, and whether to skip is decided from the result. It is then judged whether the first character block to be matched has a corresponding step value in the preset SHIFT table, and from the result it is determined whether there is a harmful pattern string matching the second character block that corresponds to the first character block to be matched. The distance between the LZ77 decompression sliding window and the multi-mode matching window is less than a preset distance threshold. LZ77 decompression and skip-based multi-mode matching are thus carried out side by side, which shortens the intrusion or harmful code detection time for the compressed data stream and speeds up the client user's browsing experience.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk or an optical disc.
Fig. 4 is a structural diagram of an embodiment of the harmful code detection device based on HTTP static compressed data streams provided by the present invention. As shown in Fig. 4, the device includes:
an acquisition module 41, configured to obtain the gzip compressed data stream that an Internet server sends for a client to receive, the gzip compressed data stream being the compressed data stream obtained after the Internet server applies LZ77 compression and Huffman coding to original plaintext data;
an enquiry module 42, configured to query a preset Huffman tree, obtain the binary Huffman code corresponding to each character in a harmful pattern string, and generate the target binary string corresponding to the harmful pattern string;
the enquiry module 42 being further configured to search the gzip compressed data stream with the target binary string and obtain the position at which the target binary string first occurs in the gzip compressed data stream;
a decompression module 43, configured to apply Huffman decoding to the compressed data stream after that position in the gzip compressed data stream to obtain the first compressed data stream compressed by LZ77, the first compressed data stream including at least one TCP compression burst;
a trigger module 44, configured to trigger sliding of the multi-mode matching window so as to drive the LZ77 decompression sliding window along the first compressed data stream;
a matching module 45, configured to use the multi-mode matching window to match the decompressed original plaintext data against the harmful pattern strings while the LZ77 decompression sliding window is used to LZ77-decompress the first compressed data stream; wherein, during the LZ77 decompression of the first compressed data stream with the LZ77 decompression sliding window, a pointer in the first compressed data stream queries the preset interim stack, and it is judged whether the interim stack holds a first index position located in the reference character string pointed to by the pointer; if so, the second index position, in the reference character string substituted at the target position where the pointer is located, is determined from the first index position in the reference character string pointed to by the pointer and is stored in the interim stack; it is also judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1), and when it is, the skip distance parameter Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference character string at the target position where the pointer is located, so that, when the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings, matching jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference character string and resumes at character Length-(Lmin-1)+1; the distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-mode matching window is less than Lmin-1; Lmin is the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by the pointer during LZ77 decompression; the interim stack stores the index positions of the character strings in the at least one TCP compression burst that match a harmful pattern string; the pointer comprises a first distance, from the target position where the pointer is located to the position of the reference character string it points to, and the length of the reference character string;
a sending module 46, configured to send the gzip compressed data stream corresponding to the TCP compression burst to the client when, after all of the at least one TCP compression burst has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string.
The harmful code detection device based on HTTP static compressed data streams provided by the present invention may specifically be a firewall or gateway server located between the Internet server and the client.
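The work of the enquiry module 42 can be pictured with a toy example. Everything concrete here is an assumption for illustration: bits are modelled as strings of "0" and "1", the four-symbol prefix code is hypothetical, and the function names are not taken from the description. A plain bit-level search can also hit offsets that do not fall on a code boundary, which this sketch does not try to rule out.

```python
def target_bits(pattern, code_table):
    """Concatenate the Huffman code of each character of the pattern."""
    return "".join(code_table[ch] for ch in pattern)


def first_occurrence(stream_bits, pattern, code_table):
    """Return the bit offset of the first occurrence, or -1 if absent."""
    return stream_bits.find(target_bits(pattern, code_table))


# Hypothetical prefix code standing in for the preset Huffman tree.
CODES = {"a": "0", "b": "10", "c": "110", "d": "111"}
stream = target_bits("dabcab", CODES)   # encode a short literal sequence

print(target_bits("bc", CODES))
print(first_occurrence(stream, "bc", CODES))
```

Huffman decoding would then start from the offset that `first_occurrence` reports, rather than from the beginning of the stream.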
Specifically, the process by which the matching module 45 judges whether the interim stack holds a first index position located in the reference character string pointed to by the pointer may include: obtaining the index positions, stored in the interim stack, of the character strings that match a harmful pattern string, each index position including the index address of the character string and the length of the character string; and judging whether the index position of the character string and the pointer satisfy the following formula:
(target position where the pointer is located) - (first distance in the pointer) <= (index position of the character string) <= (target position where the pointer is located) - (first distance in the pointer) + (length of the reference character string in the pointer);
If the index position of the character string and the pointer satisfy the above formula, it is determined that the interim stack holds a first index position located in the reference character string pointed to by the pointer.
In this embodiment, the gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained, and a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string. The gzip compressed data stream is searched with the target binary string to find the position where the target binary string first occurs, and Huffman decoding is applied to the compressed data stream after that position to obtain the first compressed data stream, which is compressed by LZ77. Sliding of the multi-mode matching window is then triggered so as to drive the LZ77 decompression sliding window along the first compressed data stream; in this way, while the LZ77 decompression sliding window LZ77-decompresses the first compressed data stream, the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings. For a long pointer in the first compressed data stream, only boundary matching is performed, and it is judged whether the interim stack holds an index position located in the reference character string pointed to by the pointer; if so, the corresponding index position at the target position where the pointer is located is determined and stored in the interim stack. The distance between the LZ77 decompression sliding window and the multi-mode matching window is less than a preset distance threshold. LZ77 decompression and skip-based multi-mode matching are thus carried out side by side, which shortens the intrusion or harmful code detection time for the compressed data stream and speeds up the client user's browsing experience.
Further, referring to Fig. 5, on the basis of the embodiment shown in Fig. 4, the matching module 45 includes an acquisition submodule 451, a judging submodule 452 and an addition submodule 453;
when the matching module 45 uses the multi-mode matching window to match the decompressed original plaintext data against the harmful pattern strings, the acquisition submodule 451 is configured to obtain the first character block to be matched, composed of the i-th byte to the (i+N-1)-th byte of the original plaintext data;
the judging submodule 452 is configured to judge whether a skip distance parameter is set on any character in the first character block to be matched;
the judging submodule 452 is further configured, when no skip distance parameter is set on any character in the first character block to be matched, to judge whether the first character block to be matched has a corresponding step value in the preset SHIFT table, and, if it does not, to add Lmin-1 to i and repeat the above steps until the judgment is complete;
here i is a positive integer whose initial value is 1, and N is the byte length of the character block to be matched and is a positive integer;
the judging submodule 452 is further configured, when the first character block to be matched has a corresponding step value in the preset SHIFT table, to judge whether that step value is the preset step value;
the acquisition submodule 451 is further configured, when the step value corresponding to the character block to be matched in the preset SHIFT table is the preset step value, to obtain the Lmin-N bytes preceding the i-th byte;
the addition submodule 453 is configured to prepend the Lmin-N bytes to the first character block to be matched to form the second character block to be matched;
the acquisition submodule 451 is further configured to query the preset hash table with the first character block to be matched and obtain the first harmful pattern string that includes the first character block to be matched;
the judging submodule 452 is further configured to judge whether the second character block to be matched matches the first harmful pattern string;
i is then incremented by 1 and the above steps are repeated until the judgment is complete;
here N is the byte length of the character block to be matched and Lmin is the minimum length of the harmful pattern strings; both are positive integers;
before the judging submodule 452 judges whether the first character block to be matched has a corresponding step value in the preset SHIFT table, the firewall or gateway server may create the SHIFT table and the hash table from the harmful pattern strings.
The SHIFT table is created as follows: the minimum length Lmin of the harmful pattern strings is obtained; the first Lmin characters of each harmful pattern string are taken, starting from its beginning; and the SHIFT table is generated from each character block in those first Lmin characters and the step of that character block within the first Lmin characters. The hash table is then generated according to which harmful pattern strings include each character block; it stores each character block together with the harmful pattern strings that include it.
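One way to realise this table construction is sketched below, using the example pattern strings rainbow, shine, river, version and brush from the method embodiment. The block length of 2, the function name `build_tables`, and the choice to index each pattern in the hash table by the block that ends its truncated prefix are assumptions borrowed from Wu-Manber style matching, so the values produced may differ from Table 1 if another block length is used.

```python
def build_tables(patterns, block_len=2):
    """Build a SHIFT table and a hash table from the pattern prefixes."""
    lmin = min(len(p) for p in patterns)
    shift = {}
    hash_table = {}
    for pattern in patterns:
        prefix = pattern[:lmin]                  # first Lmin characters
        for end in range(block_len, lmin + 1):
            block = prefix[end - block_len:end]
            step = lmin - end                    # distance from block end to prefix end
            shift[block] = min(step, shift.get(block, step))
        # Index each pattern by the block that ends its prefix (step 0).
        hash_table.setdefault(prefix[-block_len:], []).append(pattern)
    return lmin, shift, hash_table


lmin, shift, hash_table = build_tables(
    ["rainbow", "shine", "river", "version", "brush"])
print(lmin)
print(shift["sh"], shift["ve"], shift["ra"])
print(hash_table["er"])
```

A block that occurs in several prefixes keeps its smallest step, so "sh" receives step 0 from "brush" even though it also begins "shine".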
Further, the acquisition submodule 451 is also configured, when a skip distance parameter is set on a character in the first character block to be matched, to add Length-2(Lmin-1)+1 to i and to obtain the first character block to be matched again for matching.
Further, the device may also include a memory module, configured to store the index position of the second character block to be matched in the interim stack when the second character block to be matched matches the first harmful pattern string.
Further, in judging whether the interim stack holds a first index position located in the reference character string pointed to by the pointer, the judging submodule 452 is specifically configured to obtain the index positions, stored in the interim stack, of the character strings that match a harmful pattern string, each index position including the index address of the character string and the length of the character string;
and to judge whether the index position of the character string and the pointer satisfy the following formula:
(target position where the pointer is located) - (first distance in the pointer) <= (index position of the character string) <= (target position where the pointer is located) - (first distance in the pointer) + (length of the reference character string in the pointer);
If the index position of the character string and the pointer satisfy the above formula, it is determined that the interim stack holds a first index position located in the reference character string pointed to by the pointer.
In this embodiment, the gzip compressed data stream produced by LZ77 compression and Huffman coding is obtained, and a preset Huffman tree is queried to obtain the target binary string corresponding to a harmful pattern string. The gzip compressed data stream is searched with the target binary string to find the position where the target binary string first occurs, and Huffman decoding is applied to the compressed data stream after that position to obtain the first compressed data stream, which is compressed by LZ77. Sliding of the multi-mode matching window is then triggered so as to drive the LZ77 decompression sliding window along the first compressed data stream; in this way, while the LZ77 decompression sliding window LZ77-decompresses the first compressed data stream, the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings. For a long pointer in the first compressed data stream, only boundary matching is performed, and it is judged whether the interim stack holds an index position located in the reference character string pointed to by the pointer; if so, the corresponding index position at the target position where the pointer is located is determined and stored in the interim stack. When the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings, the first character block to be matched, composed of the i-th byte to the (i+N-1)-th byte of the original plaintext data, is obtained. It is first judged whether a skip distance parameter is set on any character in the first character block to be matched, and whether to skip is decided from the result. It is then judged whether the first character block to be matched has a corresponding step value in the preset SHIFT table, and from the result it is determined whether there is a harmful pattern string matching the second character block that corresponds to the first character block to be matched. The distance between the LZ77 decompression sliding window and the multi-mode matching window is less than a preset distance threshold. LZ77 decompression and skip-based multi-mode matching are thus carried out side by side, which shortens the intrusion or harmful code detection time for the compressed data stream and speeds up the client user's browsing experience.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A harmful code detection method based on HTTP static compressed data streams, characterized by comprising:
obtaining a gzip compressed data stream that an Internet server sends for a client to receive, the gzip compressed data stream being the compressed data stream obtained after the Internet server applies LZ77 compression and Huffman coding to original plaintext data;
querying a preset Huffman tree to obtain the binary Huffman code corresponding to each character in a harmful pattern string, and generating the target binary string corresponding to the harmful pattern string;
searching the gzip compressed data stream with the target binary string to obtain the position at which the target binary string first occurs in the gzip compressed data stream;
applying Huffman decoding to the compressed data stream after said position in the gzip compressed data stream to obtain a first compressed data stream compressed by LZ77, the first compressed data stream including at least one TCP compression burst;
triggering sliding of a multi-mode matching window so as to drive an LZ77 decompression sliding window along the first compressed data stream, so that, while the LZ77 decompression sliding window is used to LZ77-decompress the first compressed data stream, the multi-mode matching window is used to match the decompressed original plaintext data against the harmful pattern strings; wherein, during the LZ77 decompression of the first compressed data stream with the LZ77 decompression sliding window, a pointer in the first compressed data stream queries a preset interim stack, and it is judged whether the interim stack holds a first index position located in the reference character string pointed to by the pointer; if so, a second index position, in the reference character string substituted at the target position where the pointer is located, is determined from the first index position in the reference character string pointed to by the pointer, and the second index position is stored in the interim stack; it is also judged whether the length Length of the reference character string pointed to by the pointer is greater than 2(Lmin-1), and when it is, the skip distance parameter Length-2(Lmin-1)+1 is set at the (Lmin-1)-th character of the reference character string at the target position where the pointer is located, so that, when the multi-mode matching window matches the decompressed original plaintext data against the harmful pattern strings, matching jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the reference character string and resumes at character Length-(Lmin-1)+1; the distance between the right boundary of the LZ77 decompression sliding window and the right boundary of the multi-mode matching window is less than Lmin-1; Lmin is the minimum length of the harmful pattern strings and is a positive integer; Length is the length of the reference character string pointed to by the pointer during LZ77 decompression; the pointer comprises a first distance, from the target position where the pointer is located to the position of the reference character string pointed to by the pointer, and the length of the reference character string;
if, after all of the at least one TCP compression burst has been decompressed, the corresponding original plaintext data contains no character string matching a harmful pattern string, sending the gzip compressed data stream corresponding to the TCP compression burst to the client.
2. The method according to claim 1, characterized in that using the multi-mode matching window to match the decompressed original plaintext data against the harmful pattern strings comprises:
obtaining a first character block to be matched, composed of the i-th byte to the (i+N-1)-th byte of the original plaintext data;
judging whether a skip distance parameter is set on any character in the first character block to be matched;
if no skip distance parameter is set on any character in the first character block to be matched, judging whether the first character block to be matched has a corresponding step value in a preset SHIFT table, and, if the first character block to be matched has no corresponding step value in the preset SHIFT table, adding Lmin-1 to i and repeating the above steps until the judgment is complete;
wherein i is a positive integer whose initial value is 1, and N is the byte length of the character block to be matched and is a positive integer.
3. The method according to claim 2, characterized by further comprising:
If the first character block to be matched has a corresponding step value in the preset
SHIFT table, judging whether that step value is a preset step value;
If the step value of the character block to be matched in the preset SHIFT table is the
preset step value, obtaining the Lmin-N bytes that precede the i-th byte;
Prepending the Lmin-N bytes to the first character block to be matched, to form a second
character block to be matched;
Querying a preset hash table according to the first character block to be matched, to
obtain a first harmful pattern string that includes the first character block to be
matched;
Judging whether the second character block to be matched matches the first harmful
pattern string;
Increasing i by 1 and repeating the above steps until the judging is complete;
Wherein N denotes the byte length of the character block to be matched and is a positive
integer; Lmin denotes the minimum length of the harmful pattern strings and is a positive
integer.
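The scan loop of claims 2 and 3 can be sketched in Python as follows. This is an illustrative reading, not the patented implementation: the names `build_tables` and `scan`, the choice N = 2, the treatment of step value 0 as the "preset step value", and the handling of non-zero step values (advance by that value) are all assumptions. With N = 2 the claimed jump of Lmin-1 coincides with the usual Wu-Manber default shift Lmin-N+1. Positions are 0-based here, whereas the claims count from 1.

```python
N = 2  # assumed block length in bytes


def build_tables(patterns):
    """Build the SHIFT table and hash buckets from the Lmin-byte pattern prefixes."""
    lmin = min(len(p) for p in patterns)
    shift = {}    # block -> step value
    buckets = {}  # block -> patterns whose Lmin-prefix ends with that block
    for p in patterns:
        prefix = p[:lmin]
        for j in range(lmin - N + 1):
            block = prefix[j:j + N]
            dist = lmin - N - j  # bytes between this block and the prefix's last block
            shift[block] = min(shift.get(block, dist), dist)
        buckets.setdefault(prefix[lmin - N:], []).append(p)
    return lmin, shift, buckets


def scan(text, patterns):
    """Return (start, pattern) for every pattern occurrence found in text."""
    lmin, shift, buckets = build_tables(patterns)
    hits = []
    i = 0
    while i + N <= len(text):
        block = text[i:i + N]          # first character block to be matched
        step = shift.get(block)
        if step is None:
            i += lmin - 1              # claim 2: block has no step value
        elif step == 0:                # assumed "preset step value"
            start = i - (lmin - N)     # claim 3: prepend the Lmin-N preceding bytes
            if start >= 0:
                for p in buckets.get(block, []):
                    if text.startswith(p, start):
                        hits.append((start, p))
            i += 1
        else:
            i += step                  # assumption: the claims leave this case open
    return hits
```

For example, `scan(b"xxevilyymalwarezz", [b"evil", b"malware"])` reports one hit for each pattern, at offsets 2 and 8.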
4. The method according to claim 2, characterized by further comprising:
If a skip distance parameter is set on a character in the first character block to be
matched, increasing i by Length-2(Lmin-1)+1, and obtaining the first character block to
be matched again for matching.
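The arithmetic behind the skip distance parameter can be checked with a short Python sketch. The function name `skip_mark` is hypothetical. The claims only state where the parameter is set and how large it is; the reading that the skipped interior was already inspected where the reference string first appeared is an interpretation based on the abstract's "boundary match" wording.

```python
def skip_mark(length, lmin):
    """Return (mark_at, skip) for a reference string of the given length.

    mark_at is the 1-based character (Lmin-1) where the parameter is set and
    skip is Length-2(Lmin-1)+1. Returns None when Length <= 2(Lmin-1), in
    which case no parameter is set and the whole string is scanned normally.
    """
    edge = lmin - 1
    if length <= 2 * edge:
        return None
    return edge, length - 2 * edge + 1


mark_at, skip = skip_mark(20, 4)
landing = mark_at + skip  # character at which matching resumes
```

With Length = 20 and Lmin = 4, the parameter 15 is set at character 3 and matching resumes at character 18, which equals Length-(Lmin-1)+1. Only the first and last Lmin-1 characters of the copied string are then examined.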
5. The method according to claim 3, characterized by further comprising:
If the second character block to be matched matches the first harmful pattern string,
storing the index position of the second character block to be matched in the interim
stack.
6. The method according to claim 1, characterized in that judging whether the interim
stack holds a first index position located within the reference character string pointed
to by the pointer comprises:
Obtaining the index position, stored in the interim stack, of a character string that
matches a harmful pattern, the index position comprising the index address of the
character string and the length of the character string;
Judging whether the index position of the character string and the pointer satisfy the
following formula:
(target position of the pointer) - (first distance in the pointer) <= (index position of
the character string) <= (target position of the pointer) - (first distance in the
pointer) + (length of the reference character string in the pointer);
If the index position of the character string and the pointer satisfy the above formula,
determining that the interim stack holds a first index position located within the
reference character string pointed to by the pointer.
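The range check of claim 6 can be sketched in Python as follows. The `Pointer` type and the function `carried_matches` are hypothetical names. The way the second index position is derived (same offset inside the copied string at the target position) is an assumption based on the description in claim 1, not a formula given in the text.

```python
from typing import NamedTuple


class Pointer(NamedTuple):
    target: int    # position in the plaintext where the pointer is expanded
    distance: int  # first distance back to the reference string
    length: int    # length of the reference string


def carried_matches(stack, ptr):
    """Apply the claim 6 formula to each (index, match_length) stack entry.

    For every entry whose index lies inside the reference string, return the
    assumed second index position at the pointer's target position.
    """
    ref_start = ptr.target - ptr.distance
    carried = []
    for index, match_len in stack:
        if ref_start <= index <= ref_start + ptr.length:
            carried.append((ptr.target + (index - ref_start), match_len))
    return carried


stack = [(10, 4), (40, 4)]
result = carried_matches(stack, Pointer(target=100, distance=95, length=20))
```

Here the reference string spans positions 5 to 25, so the entry at index 10 is carried over to position 105 while the entry at index 40 is ignored. As in the claim, the check uses only the index address; it does not verify that the whole matched string fits inside the reference string.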
7. A harmful code detection device based on HTTP static compressed data streams,
characterized by comprising:
An acquisition module, configured to obtain a gzip compressed data stream that an
Internet server is about to send to a client, the gzip compressed data stream being
obtained by the Internet server applying LZ77 compression and Huffman coding to original
plaintext data;
A query module, configured to query a preset Huffman tree to obtain the binary Huffman
code corresponding to each character in a harmful pattern string, and to generate the
object binary string corresponding to the harmful pattern string;
The query module is further configured to query the gzip compressed data stream according
to the object binary string, to obtain the position at which the object binary string
first appears in the gzip compressed data stream;
A decompression module, configured to perform Huffman decoding on the compressed data
stream after that position in the gzip compressed data stream, to obtain a first
compressed data stream compressed by LZ77, the first compressed data stream comprising at
least one TCP compression burst;
A trigger module, configured to trigger sliding of the multi-mode matching window, so as
to drive the LZ77 decompression sliding window to slide over the first compressed data
stream;
A matching module, configured to use the multi-mode matching window to match harmful
pattern strings against the original plaintext data obtained after decompression, while
the LZ77 decompression sliding window performs LZ77 decompression on the first compressed
data stream; wherein, during the LZ77 decompression of the first compressed data stream,
a preset interim stack is queried according to a pointer in the first compressed data
stream, to judge whether the interim stack holds a first index position located within
the reference character string pointed to by the pointer; if so, a second index position,
within the reference character string substituted at the target position where the
pointer is located, is determined from the first index position, and the second index
position is stored in the interim stack; it is also judged whether the length Length of
the reference character string pointed to by the pointer is greater than 2(Lmin-1); when
Length is greater than 2(Lmin-1), a skip distance parameter Length-2(Lmin-1)+1 is set at
the (Lmin-1)-th character of the reference character string at the target position where
the pointer is located, so that, when the multi-mode matching window matches harmful
pattern strings against the original plaintext data obtained after decompression,
matching jumps Length-2(Lmin-1)+1 characters from the (Lmin-1)-th character of the
reference character string to the (Length-(Lmin-1)+1)-th character, where pattern
matching is carried out; the distance between the right boundary of the LZ77
decompression sliding window and the right boundary of the multi-mode matching window is
less than Lmin-1; Lmin denotes the minimum length of the harmful pattern strings and is a
positive integer; Length is the length of the reference character string pointed to by
the pointer during LZ77 decompression; the interim stack stores the index positions of
the pattern strings in the at least one TCP compression burst that match a harmful
pattern; the pointer comprises a first distance, measured from the target position where
the pointer is located to the position of the reference character string that the pointer
points to, and the length of the reference character string;
A sending module, configured to send the gzip compressed data stream corresponding to the
TCP compression burst to the client when, after all of the at least one TCP compression
burst has been decompressed, the corresponding original plaintext data contains no
character string that matches a harmful pattern.
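The query module's construction of the object binary string can be illustrated with a toy prefix code. Everything concrete here is an assumption made for illustration: the code table `CODES`, the helper names, and the representation of the compressed stream as a Python string of '0' and '1' characters. A real gzip stream packs bits into bytes and may use per-block Huffman tables, which this sketch does not model.

```python
# Hypothetical preset Huffman code table (a valid prefix code).
CODES = {"a": "0", "b": "10", "c": "110", "d": "111"}


def object_binary_string(pattern, codes):
    """Concatenate the Huffman code of each character of the pattern."""
    return "".join(codes[ch] for ch in pattern)


def first_occurrence(bitstream, target):
    """Return the bit offset of the first occurrence of target, or -1."""
    return bitstream.find(target)


target = object_binary_string("bad", CODES)        # "10" + "0" + "111"
stream = object_binary_string("cabad", CODES)      # stands in for the compressed data
offset = first_occurrence(stream, target)
```

In this example the search returns bit offset 4, which is where the code for "b" begins in the encoded text. A plain bit-level search can in principle also hit a position that is not aligned to a code boundary, so a practical implementation would need to confirm candidate positions during decoding.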
8. The device according to claim 7, characterized in that the matching module comprises an
acquisition submodule and a judging submodule;
When the matching module uses the multi-mode matching window to match harmful pattern
strings against the original plaintext data obtained after decompression, the acquisition
submodule is configured to obtain a first character block to be matched, consisting of
the i-th byte to the (i+N-1)-th byte of the original plaintext data;
The judging submodule is configured to judge whether a skip distance parameter is set on
each character in the first character block to be matched;
The judging submodule is further configured to, when no skip distance parameter is set on
any character in the first character block to be matched, judge whether the first
character block to be matched has a corresponding step value in a preset SHIFT table,
and, if the first character block to be matched has no corresponding step value in the
preset SHIFT table, increase i by Lmin-1 and repeat the above steps until the judging is
complete;
Wherein i is a positive integer and is initially equal to 1; N denotes the byte length of
the character block to be matched and is a positive integer.
9. The device according to claim 8, characterized in that the matching module further
comprises an addition submodule;
The judging submodule is further configured to, when the first character block to be
matched has a corresponding step value in the preset SHIFT table, judge whether that step
value is a preset step value;
The acquisition submodule is further configured to, when the step value of the character
block to be matched in the preset SHIFT table is the preset step value, obtain the Lmin-N
bytes that precede the i-th byte;
The addition submodule is configured to prepend the Lmin-N bytes to the first character
block to be matched, to form a second character block to be matched;
The acquisition submodule is further configured to query a preset hash table according to
the first character block to be matched, to obtain a first harmful pattern string that
includes the first character block to be matched;
The judging submodule is further configured to judge whether the second character block
to be matched matches the first harmful pattern string;
i is then increased by 1 and the above steps are repeated until the judging is complete;
Wherein N denotes the byte length of the character block to be matched and is a positive
integer; Lmin denotes the minimum length of the harmful pattern strings and is a positive
integer.
10. The device according to claim 8, characterized in that the acquisition submodule is
further configured to, when a skip distance parameter is set on a character in the first
character block to be matched, increase i by Length-2(Lmin-1)+1 and obtain the first
character block to be matched again for matching.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510884627.2A CN106850504B (en) | 2015-12-04 | 2015-12-04 | Harmful code detection method and device based on HTTP static compress data flow |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510884627.2A CN106850504B (en) | 2015-12-04 | 2015-12-04 | Harmful code detection method and device based on HTTP static compress data flow |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106850504A (en) | 2017-06-13 |
CN106850504B (en) | 2019-11-15 |
Family
ID=59150401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510884627.2A (Active; published as CN106850504B) | Harmful code detection method and device based on HTTP static compress data flow | 2015-12-04 | 2015-12-04 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106850504B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1359196A (en) * | 2000-12-15 | 2002-07-17 | 国际商业机器公司 | Quick joint image expert group Huffman coding and decoding method |
CN104468044A (en) * | 2014-12-05 | 2015-03-25 | 北京国双科技有限公司 | Data compression method and device applied to network transmission |
CN104811209A (en) * | 2015-04-22 | 2015-07-29 | 北京理工大学 | Compressed file data embedding method and device capable of resisting longest matching detection |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108090115A (en) * | 2017-11-03 | 2018-05-29 | 中国科学院信息工程研究所 | A kind of filter method and system for Gzip compressed datas |
CN108090115B (en) * | 2017-11-03 | 2022-05-17 | 中国科学院信息工程研究所 | Filtering method and system for Gzip compressed data |
WO2023082156A1 (en) * | 2021-11-10 | 2023-05-19 | 山东方寸微电子科技有限公司 | Lz77 decoding circuit and operation method thereof |
Also Published As
Publication number | Publication date |
---|---|
CN106850504B (en) | 2019-11-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9558241B2 (en) | System and method for performing longest common prefix strings searches | |
US20190273510A1 (en) | Classification of source data by neural network processing | |
US8458354B2 (en) | Multi-pattern matching in compressed communication traffic | |
US9727574B2 (en) | System and method for applying an efficient data compression scheme to URL parameters | |
Gueniche et al. | Compact prediction tree: A lossless model for accurate sequence prediction | |
US20110219357A1 (en) | Compressing source code written in a scripting language | |
WO2011007956A2 (en) | Data compression method | |
JP2009542092A5 (en) | ||
EP3195481B1 (en) | Adaptive rate compression hash processing device | |
CN104868922A (en) | Data compression method and device | |
US9264068B2 (en) | Deflate compression algorithm | |
CN107947918A (en) | A kind of carrier-free text steganography method based on character feature | |
CN104811209B (en) | A kind of the compressed file data embedding method and device of anti-most long matching detection | |
US8909813B2 (en) | Efficient processing of compressed communication traffic | |
CN103701470B (en) | Stream intelligence prediction differencing and compression algorithm and corresponding control device | |
CN106850504A (en) | Harmful code detection method and device based on HTTP static compress data flows | |
CN107623855A (en) | A kind of embedded rate steganography device of height based on compressed encoding and steganography method | |
CN107277109B (en) | Multi-string matching method for compressed flow | |
CN105391514B (en) | Character code coding/decoding method and device | |
Bremler-Barr et al. | Decompression-free inspection: Dpi for shared dictionary compression over http | |
CN106850507A (en) | Harmful code detection method and device based on HTTP compressed data streams | |
CN108090115B (en) | Filtering method and system for Gzip compressed data | |
CN108573069B (en) | Twins method for accelerating matching of regular expressions of compressed flow | |
Beirami et al. | Packet-level network compression: Realization and scaling of the network-wide benefits | |
US9923576B2 (en) | Decoding techniques using a programmable priority encoder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||