CN100357846C - Intrusion detection accelerator - Google Patents

Publication number
CN100357846C
Authority
CN
China
Prior art keywords
state table
token
character
file
intrusion detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2003801061642A
Other languages
Chinese (zh)
Other versions
CN1735850A (en)
Inventor
Michael C. Dapp (迈克尔·C·达普)
Eric C. Lett (埃里克·C·莱特)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lockheed Martin Corp
Original Assignee
Lockheed Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lockheed Corp
Publication of CN1735850A
Application granted
Publication of CN100357846C

Abstract

A hardware accelerated validation parser is provided to remove a large portion if not all of the processing and overhead burden of validation parsing from a host processor by parallel access to both a state table and a data dictionary based on a token and merging and selective redirection of the respective outputs thereof; a portion of a transition control word (TCW) formed by the merged data being used to advance through the state table and a portion of the TCW being used to control formation of a tree structured data object (TSDO) corresponding to a text document in a language such as XML™ which supports interoperability and platform independence. A stack is provided to accommodate nesting of elements and aggregate elements. The formation of the TSDO can be and preferably is performed asynchronously and autonomously in parallel with the validation parsing.

Description

Intrusion detection accelerator
Explanation
Background of the invention
Field of the invention
The present invention relates generally to the parsing of documents such as XML™ documents, and more particularly to parsing network data packet files or other logical sequences in order to detect potential intrusions or attacks on a network node.
Description of the prior art
In recent years, the field of digital communication between computers, and the linking of computers into networks, has developed rapidly, in many ways resembling the proliferation of personal computers a few years earlier. This growth in interconnectivity and in the possibility of remote processing has greatly increased the effective capability and functionality of individual computers in such networked systems. Nevertheless, the variety of uses of individual computers and systems, the preferences of their users, and the state of the art at the time computers are placed into service have resulted in a substantial degree of variety in the capabilities and configurations of individual machines and their operating systems. Individual machines and their operating systems are collectively referred to as "platforms", and these platforms are generally incompatible with each other to some degree, particularly at the level of operating system and programming language.
This incompatibility of platform characteristics, together with the simultaneous requirement for communication and remote processing capability and for a degree of compatibility sufficient to support it, has led to the development of object-oriented programming (which accommodates the concept of assembling an application, as well as data, as a group of more or less generalized modules through a referencing system of entities, attributes and relationships) and of a number of programming languages that embody this concept. Extensible Markup Language™ (XML™) is one such language; it has come into widespread use and can be transmitted as a document over a network of arbitrary configuration and architecture.
In such a language, certain character strings correspond to certain commands or identifiers, including special characters and other important data (collectively referred to as control words). These allow data or operations to identify themselves, in effect, so that they can thereafter be treated as "objects". Associated data and commands can then be translated into the appropriate formats and commands of different applications in different languages, producing a degree of compatibility among the connected platforms that is sufficient to support the desired processing at a given machine. These character strings are detected by an operation known as parsing, which is similar to the more conventional use of the term: resolving an expression (for example, a sentence) into its component parts and describing them grammatically.
When an XML™ document is parsed, a large portion, and possibly the majority, of the central processing unit (CPU) execution time is spent traversing the document in search of control words, special characters and other important data as defined for the particular XML™ standard being processed. This is typically done by software that examines each character and determines whether it belongs to a predefined set of strings of interest, for example a set of character strings comprising "<command>", "<data type=dataword>", "</command>" and so on. If any of the target strings is detected, a token is saved together with a pointer to the location in the document where the token starts and the length of the token. These tokens are accumulated until the entire document has been parsed.
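The software approach described above can be illustrated with a minimal sketch. It is not taken from the patent: the names `Token` and `find_tokens` are hypothetical, and the brute-force comparison at every position is only an assumption about what "examining each character" might look like in the simplest case.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    start: int   # offset in the document where the matched string begins
    length: int  # number of characters in the matched string


def find_tokens(document: str, targets: List[str]) -> List[Token]:
    """Test every position of the document against every string of interest."""
    tokens = []
    for pos in range(len(document)):
        for target in targets:
            if document.startswith(target, pos):
                tokens.append(Token(pos, len(target)))
    return tokens


# Tokens accumulate until the whole document has been traversed.
doc = "<command>run</command>"
tokens = find_tokens(doc, ["<command>", "</command>"])
```

Each token only records where a string was found and how long it is; the cost of the nested loop over positions and targets is what motivates the table-driven approach discussed next.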
The conventional approach is to implement a table-based finite state machine (FSM) to search for the strings of interest. The state table resides in memory and is designed to search for specific patterns in the document. The current state is used as the base address into the state table, and the ASCII representation of the input character is the index into the table. For example, if the state machine is in state 0 (zero) and the first input character has the ASCII value 02, the absolute address of the state entry is the sum or concatenation of the base address (state 0) and the index/ASCII character (02). The FSM begins with the CPU fetching the first character of the input document from memory. The CPU then constructs the absolute address into the state table in memory corresponding to the initialized/current state and the input character, and fetches the state data from the state table. Based on the state data returned, the CPU updates the current state to the new value if it is different (indicating that the character corresponds to the first character of a string of interest) and performs any other operation indicated in the state data (for example, issuing a token or an interrupt if the single character is a special character or if, upon further repetition of the foregoing, the current character is found to be the last character of a string of interest).
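A minimal sketch of this addressing scheme follows, assuming a flat table in which each state owns a row of 256 entries, so that the absolute address is the row base plus the character code. The helper names `build_table` and `run_fsm` are hypothetical, the table recognizes only a single target string, and its restart handling is deliberately simplified; it is meant to show the address construction, not a complete parser.

```python
NUM_CODES = 256  # one column per 8-bit character code


def build_table(target: bytes):
    """Flat state table for one target string; entry = (next_state, emit_token).

    State k means "the first k characters of the target have matched".
    """
    table = [(0, False)] * (len(target) * NUM_CODES)  # default: go to state 0
    for state, ch in enumerate(target):
        base = state * NUM_CODES  # base address of this state's row
        if state > 0:
            # Simplification: a repeated first character restarts the match.
            table[base + target[0]] = (1, False)
        is_last = state == len(target) - 1
        table[base + ch] = (0, True) if is_last else (state + 1, False)
    return table


def run_fsm(table, document: bytes, target_len: int):
    """One table lookup per character: address = state base + character code."""
    tokens = []
    state = 0
    for pos, ch in enumerate(document):
        next_state, emit = table[state * NUM_CODES + ch]
        if emit:
            tokens.append((pos - target_len + 1, target_len))  # (start, length)
        state = next_state
    return tokens


tokens = run_fsm(build_table(b"<a>"), b"x<a>y<<a>", 3)
```

Even in this toy form, every character costs one fetch of the character, one address computation and one fetch from the table, which is the per-character overhead the description goes on to discuss.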
The above process is repeated, and the state is changed, as successive characters of a string of interest are found. That is, if the initial character is of interest because it is the initial character of a string of interest, the state of the FSM can be advanced to a new state (for example, from initial state 0 to state 1). If the character is not of interest, the state machine (generally) remains unchanged, because the state table entry returned from the state table address either specifies the same state (for example, state 0) or does not command a state update. Possible operations include, but are not limited to, setting interrupts, storing tokens and updating pointers. The process is then repeated with the following character. It should be noted that while a string of interest is being followed and the FSM is in a state other than state 0 (or another state indicating that a string of interest has not yet been found or is not currently being followed), a character may be found that is inconsistent with the current string but is the initial character of another string of interest. In such a case, state table entries indicate the appropriate operation to indicate and identify the string fragment or portion previously being followed, and to follow the possible new string of interest until it is either completely identified or found not to be a string of interest. In other words, strings of interest may be nested, and the state machine must be able to detect a string of interest within another string of interest, and so on. This may require the CPU to traverse portions of the XML™ document many times in order to parse it completely.
The entire XML™ document, or a document in another language, is parsed character by character in the manner described above. As potential target strings are recognized, the FSM steps through various states character by character until a string of interest is fully identified or a character inconsistent with a possible string of interest is encountered (that is, when the string is completely matched or a character deviates from a target string). In the latter case, generally no operation is performed other than returning to the initial state or to a state corresponding to detection of the initial character of another target string. In the former case, the token is stored in memory together with its starting address in the input document and its length. When parsing is complete, all objects will have been identified, and processing in accordance with the local or given platform can begin.
Because the search is generally conducted for multiple strings of interest, the state table can provide multiple transitions from any given state. This approach allows the current character to be analyzed against multiple target strings at the same time while conveniently accommodating nested strings.
It can be seen from the foregoing that parsing a document such as an XML™ document requires many repetitions and many memory accesses for each repetition. Processing time on a general-purpose CPU is therefore necessarily substantial. A further major complexity of handling multiple strings lies in generating the large state tables, which is handled off-line from real-time data packet processing. Even so, a large number of CPU cycles are needed to fetch the input character data, fetch the state data, and update the various pointers and state addresses for each character of the document. As a result, it is relatively common for the parsing of a document such as an XML™ document to fully pre-empt other processing on the CPU or platform and to substantially delay the processing that was requested.
It is recognized in the art that general-purpose hardware can be programmed to emulate the function of special-purpose hardware, and that special-purpose data processing hardware will often operate faster than programmed general-purpose hardware, even if the structure and the program correspond precisely to each other, because less overhead is needed to manage and control special-purpose hardware. Nevertheless, the hardware resources that special-purpose hardware requires for particular processing can be prohibitively large, particularly where the gain in processing speed may be marginal. Further, special-purpose hardware necessarily has functional limitations, and providing sufficient flexibility for particular applications, such as the ability to search for an arbitrary number of arbitrary combinations of characters, may also be prohibitive. To be practical, therefore, special-purpose hardware must provide a large gain in processing speed while also providing very substantial hardware economy; these requirements become increasingly difficult to accommodate simultaneously as more functional flexibility or programmability is needed in the required processing function.
In this regard, both interconnectivity and the amount of processing time required to parse a document such as an XML™ document also raise the issue of system security. On the one hand, any process that requires a very large amount of processing time at relatively high priority resembles, in some respects, a denial-of-service (DOS) attack on the system or a node thereof, or can be a tool used in such an attack.
DOS attacks frequently present frivolous or malformed service requests to a system in order to maliciously consume, and eventually overload, the available resources. Proper configuration of hardware accelerators can greatly reduce or eliminate the potential for overloading available resources. In addition, systems often fail or expose security weaknesses when overloaded. Eliminating overloads is therefore an important security consideration.
Further, because the state table must be able to contain CPU commands at basic levels, it is possible for some processing to begin and some commands to be executed before parsing is completed, and this is difficult or impossible to secure against without severely compromising system performance. In short, the potential for a compromise of security would necessarily be reduced by reducing the processing time for processes such as XML™ parsing, but no technique for significantly reducing the processing time for such parsing has been available.
Many security systems rely on detecting an attempted security breach at a very early stage; once a breach has begun, it may be difficult or impossible to interrupt it quickly or through programmed intervention. For example, a system of high security performance has been proposed and is disclosed in U.S. patent application Ser. Nos. 09/973,769 and 09/973,776, both of which are assigned to the assignee of the present application. Those applications disclose a system having two levels of internodal communication, one of them at very high speed, through which a node at which an attack or intrusion is detected can be compartmentalized and automatically repaired, if necessary, before being reconnected to the network. Acceleration of parsing therefore supports an early response to a potential attack, and would be particularly beneficial in systems such as the one described in the above-incorporated patent applications, because appropriate control of the network can be initiated as a parsing event, and initiated earlier if parsing is significantly accelerated. In addition to intrusion detection, responding to a detection alert and initiating the correct network control in a timely manner also provides intrusion protection.
Summary of the invention
The invention provides a hardware parser accelerator that can be used for potentially real-time intrusion detection and containment operations. It greatly accelerates the parsing of documents in order to detect signatures of possible intrusions, attacks or other security breaches of a networked computer system, at speeds that can keep up with the rate at which data packets are transmitted across a network.
To accomplish these and other objects of the invention, an intrusion detection system that can be implemented within a document parser is provided, comprising: a character buffer for a plurality of bytes of a document; a state table addressable in accordance with a byte of the document and a state, to access from the state table at least one of an interrupt or exception and next state data; a register for storing the next state data; an adder for combining the contents of the register with subsequent data of the document to form a further address into the state table; and a bus for communicating the interrupt or exception to a host CPU.
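The data path listed above (state table, next-state register, adder, and interrupt path to the host) can be modeled in software as a rough sketch. Everything here is an illustrative assumption rather than the patented hardware: the table stores row base addresses directly as next state data, a Python callback stands in for the bus to the host CPU, and the helper names `make_signature_table` and `scan` are hypothetical.

```python
ROW = 256  # entries per state row, one per possible byte value


def make_signature_table(signature: bytes):
    """Entries are (next_base, interrupt); next_base is a row base address."""
    table = [(0, False)] * (len(signature) * ROW)  # default: back to row 0
    for k, ch in enumerate(signature):
        if k > 0:
            # Simplified restart: the first signature byte re-enters row 1.
            table[k * ROW + signature[0]] = (ROW, False)
        is_last = k == len(signature) - 1
        table[k * ROW + ch] = (0, True) if is_last else ((k + 1) * ROW, False)
    return table


def scan(table, data: bytes, raise_interrupt):
    """Model of the loop: register + incoming byte forms the next table address."""
    register = 0  # holds the next state data
    for offset, byte in enumerate(data):
        address = register + byte               # the adder
        register, interrupt = table[address]    # state table access
        if interrupt:
            raise_interrupt(offset)             # stands in for the bus to the host


alerts = []
scan(make_signature_table(b"rm -rf"), b"ls; rm -rf /", alerts.append)
```

In this model the per-byte work is a single addition and a single table read, which is the property that lets a hardware version keep pace with packet transmission rates.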
Brief description of the drawings
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
Fig. 1 shows a portion of a state table used in parsing a document;
Fig. 2A is a high-level schematic diagram of the parser accelerator in accordance with a related, concurrently filed provisional patent application;
Fig. 2B is a high-level schematic diagram of a parser accelerator in accordance with the present invention;
Fig. 2C depicts an implementation of the invention with a host processor and main memory;
Fig. 3 depicts a preferred character palette format as shown in Fig. 2A;
Figs. 4A and 4B depict a state table format, and a state table control register used with the state table, in a preferred form of the invention as shown in Fig. 2A;
Fig. 5A depicts a preferred next state palette format as shown in Fig. 2A;
Fig. 5B depicts a preferred state table entry format for use with the invention as shown in Fig. 2B; and
Fig. 6 shows a preferred token format as depicted in Fig. 5.
Detailed description of a preferred embodiment of the invention
Referring now to the drawings, and more particularly to Fig. 1, there is shown a representation of a portion of a state table that is useful for understanding the invention. It should be understood that the state table shown in Fig. 1 is potentially a very small portion of a state table usable for parsing an XML™ document, and is intended to be exemplary in nature. Although the full state table does not physically exist in the invention (at least not in the form shown), and although Fig. 1 can also be used to facilitate an understanding of the operation of known software parsers, no portion of Fig. 1 is admitted to be prior art with respect to the present invention.
It should be understood that an XML™ document is used here as an example of one type of logical data sequence that can be processed using an accelerator in accordance with the invention. Other logical data sequences can also be constructed from the contents of network data packets, for example user terminal command strings intended for execution by shared server computers. Such command strings are frequently generated by malicious users and sent to shared servers as part of a prolonged intrusion attack attempt. An accelerator in accordance with the invention is suitable for processing many such logical data sequences.
It is also helpful to observe that many entries in the portion of the state table depicted in Fig. 1 are duplicative; it is important to an appreciation of the invention that hardware accommodating the entire state table represented by Fig. 1 is not required. Conversely, although the invention can be implemented in software (possibly using a dedicated processor), the hardware requirements in accordance with the invention are sufficiently limited that the penalty of increased processing time for parsing in software is not justified by any possible hardware economy.
Nevertheless, it should be appreciated that the intrusion detection system can be applied to any type of digital file and is not limited to text documents or to particular languages for expressing particular applications or data structures, and that it can operate at or above data packet transmission rates that can accommodate real-time data transfers on the network through which an attack is generally carried out. The invention can thus be implemented as an arrangement that provides only intrusion detection, in which case substantially optimal performance is to be expected at minimum cost. However, the goal of intrusion detection at communication signal speeds can also be achieved as a special mode of operation of a parser accelerator, in which some operations are omitted to provide further acceleration; that acceleration may be further augmented by an alternative state table memory structure, as described below, and this mode is considered preferred at the present time. Accordingly, for completeness and to convey a more thorough understanding of the scope of the benefits provided by the invention, the invention will be described in the context of a parser accelerator, even though that context is substantially more complex than what real-time, high-speed intrusion detection requires.
In Fig. 1, the state table is divided into an arbitrary number of rows, each having a base address corresponding to a state. The row at each base address is divided into a number of columns corresponding to the number of codes that can be used to represent a character in the document to be parsed; in this example there are two hundred fifty-six (256) columns, corresponding to a basic eight-bit byte for a character, which is used as the index into the state table.
It is helpful to note several aspects of the state table entries shown, particularly in conveying an understanding of how even the small portion of the exemplary state table depicted in Fig. 1 supports the detection of many words:
1. In the state table shown, only two entries in the row for state 0 contain anything other than "stay in state 0", which maintains the initial state when the character being tested does not match the initial character of any string of interest. The single entry that provides for advancing to state 1 corresponds to the special case in which all strings of interest begin with the same character. Any other character that provides for advancing to another state will generally, but not necessarily, advance to a state other than state 1; however, a further reference to the same state that can be reached through another character may be useful, for example, to detect nested strings. A command (for example, "special interrupt") combined with "stay in state 0", as shown at {state 0, FD}, can be used to detect and operate on special single characters.
2. In states above state 0, an entry of "stay in state n" provides for the state to be maintained through potentially long runs of one or more characters, such as are commonly encountered in, for example, the numerical arguments of commands. The invention provides special handling of this type of character string to give enhanced acceleration, as described in detail below.
3. In states above state 0, an entry of "go to state 0" signifies detection of a character that distinguishes the string from every string of interest, regardless of how many matching characters have previously been detected, and returns the parsing process to the initial/default state to begin searching for another string of interest. (For this reason, "go to state 0" will generally be by far the most frequent and numerous entry in the state table.) Returning to state 0 may require the parsing operation to go back to a character in the document following the character that began the string being followed when the distinguishing character was detected.
4. An entry that includes a command with "go to state 0" indicates that detection of a complete string of interest has finished. In general, the command will be to store a token (with the address and length of the token), which thereafter allows the string to be treated as an object. A command with "go to state n", however, provides for launching an operation at an intermediate point while continuing to follow a string that could potentially match a string of interest.
5. To avoid ambiguity at any point where the search branches between two strings of interest (for example, strings having n-1 identical initial characters but a different n-th character, or strings having different initial characters), it is generally necessary to advance to different (for example, non-consecutive) states, as shown at {state 1, 01} and {state 1, FD}. Complete identification of a string of arbitrary length n with a common initial character requires n-1 states, except in special circumstances such as strings of interest that include special characters. For this reason, the number of states, and hence of rows in the state table, must usually be very large, even for a relatively modest number of strings of interest.
6. Conversely to the preceding point, most states can be fully characterized by one or two unique entries and a default "go to state 0". This characteristic of the state table of Fig. 1 is exploited in the invention to produce a high degree of hardware economy and substantial acceleration of the parsing process for the general case of strings of interest.
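The observation that most states need only one or two unique entries plus a default can be illustrated with a small sketch. The states and transitions below are hypothetical and are not the contents of Fig. 1; the point is only the difference in storage between a dense 256-column row and a row that records its exceptions.

```python
NUM_CODES = 256

# Hypothetical states: only the non-default transitions are stored.
sparse_table = {
    0: {0x3C: 1},            # '<' advances from state 0 to state 1
    1: {0x61: 2, 0x62: 3},   # 'a' and 'b' branch to different states
    2: {0x3E: 0},            # '>' completes one string
    3: {0x3E: 0},            # '>' completes the other string
}


def next_state(state: int, code: int, default: int = 0) -> int:
    """Look up a transition, falling back to the default 'go to state 0'."""
    return sparse_table[state].get(code, default)


def dense_row(state: int, default: int = 0):
    """Expand one sparse row into the full 256-column row of a dense table."""
    return [sparse_table[state].get(code, default) for code in range(NUM_CODES)]


dense_entries = len(sparse_table) * NUM_CODES
sparse_entries = sum(len(row) for row in sparse_table.values())
```

Here four states occupy 1024 entries in dense form but only five stored transitions in sparse form, which gives a sense of the hardware economy the description refers to.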
As alluded to above, in a parsing process as conventionally implemented, the system begins in a given default/initial state, depicted in Fig. 1 as state 0, and then advances to higher-numbered states as matching characters are found in repeated operations. When a string of interest has been completely identified, or when a special operation is specified at an intermediate location in a potentially matching string, an operation such as storing a token or issuing an interrupt is performed. At each repetition for each character of the document, however, the character must be fetched from CPU memory, the state table entry must be fetched (again from CPU memory), and various pointers (for example, to the character of the document and to the base address in the state table) and registers (for example, holding the address of the initial matching character and the accumulated length of the string) must be updated in sequential operations. It is therefore easy to appreciate that the parsing process can consume a large amount of processing time.
A high-level schematic block diagram of the parser accelerator 100 in accordance with the invention is shown in Fig. 2A. As those of ordinary skill in the art will appreciate, Fig. 2A can also be understood as a flow diagram of the steps performed to carry out parsing in accordance with the invention. As discussed in greater detail below in connection with Figs. 3, 4A, 4B, 5A and 6, the invention exploits certain hardware economies in representing the state table, so that a plurality of hardware pipelines can be developed that operate essentially in parallel, although slightly skewed in time. The updating of pointers and registers can thus be performed substantially in parallel and concurrently with other operations, while the time required for memory accesses is greatly reduced, both through faster access to hardware operated in parallel and through prefetching of the state table and the document from CPU memory.
As a general overview, a document such as an XML™ document is stored externally in DRAM 120, which is indexed by registers 112, 114, and is transferred, preferably in 32-bit words, to an input buffer 130 that serves as a multiplexer for the pipelines. Each pipeline includes a copy of a character palette 140, a state table 160 and a next state palette 170, each of which accommodates part of the state table in compressed form. The output of the next state palette 170 contains both the next state address portion of the address of an entry in the state table 160 and the token value to be stored (if any). Operations in the character palette 140 and the next state palette 170 are simple memory accesses to high-speed internal SRAM, which can be performed in parallel with each other and in parallel with simple memory accesses to the high-speed external DRAM forming the state table 160 (which can also be implemented as a cache). Consequently, only relatively few clock cycles of the CPU that initially controls these hardware elements are required to evaluate each character of the document; once started, the elements can function autonomously, with only occasional CPU operation calls to refresh the document data and store tokens. The basic acceleration gain is the reduction from the sum of the durations of all CPU operations per character to the duration of a single, autonomously performed memory operation in high-speed SRAM or DRAM.
It should be understood that referring to memory structures here as "external" is intended to connote a configuration of memories 120, 140 that the inventors presently prefer in view of the amount of storage required and the access needed from the hardware parser accelerator and/or the host CPU. In other words, for the handling of tokens and some other operations, it may be advantageous to provide an architecture of the parser accelerator in accordance with the invention that facilitates sharing of the memory, or at least access to it by both the host CPU and the hardware accelerator. No other connotation is intended, and in view of this discussion those of ordinary skill in the art will recognize a wide variety of suitable hardware alternatives, such as synchronous DRAM (SDRAM).
Referring now to Figs. 3 to 6, the formats of the character palette 140, the state table 160, the next state palette 170, and the next state and token will be discussed as examples of the hardware economies that support the preferred implementation of Fig. 2A. Other techniques and formats can also be used, and the formats described should be understood as exemplary, although presently preferred.
Fig. 3 illustrates the preferred character palette format, which corresponds to the characters that are or may be included in the strings of interest. The format preferably provides entries numbered 0-255, corresponding to the number of columns in the state table of Fig. 1. (The term "palette" is used in much the same sense as the term "color palette", which contains data for each supported color, collectively referred to as a gamut. Use of a palette reduces the number of entries/columns in the state table.) For example, a character that cannot cause any change of state, referred to as a "null character", can be expressed in a single column of the state table rather than in many columns. It is desirable to test for a null character output at 144. This can substantially accelerate parsing, because it allows the next character to be processed immediately, without a further memory operation for state table access. The format can be accommodated by a single register, or by memory locations configured by, for example, data in base address register 142 pointing to a particular character palette (illustrated schematically by overlapping memory planes in Fig. 2). The current eight-bit character from the document (e.g. an XML™ document), which is one of four provided from input buffer 130 as received in a four-byte word from external DRAM 120, addresses the palette, and the palette then outputs an address as an index or partial pointer into the state table. Thus, by providing a palette in this format, a portion of the function of Fig. 1 can be provided in the form of a single register of relatively limited capacity. This allows a plurality of such registers to be formed and operated in parallel, while maintaining substantial hardware economy and supporting other functions in state table 160.
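By way of non-limiting illustration, the palette lookup described above can be modeled in software roughly as follows. This is only a sketch under stated assumptions: the names `build_palette`, `columns_for`, and `NULL_COLUMN`, and the choice of column 0 for null characters, are hypothetical and do not appear in the description.

```python
# Hypothetical software model of a character palette: a 256-entry lookup
# table that maps each input byte to a compact state-table column index.
NULL_COLUMN = 0  # assumed: every "null character" shares this one column


def build_palette(interesting):
    """Give each byte of interest its own column; all others share column 0."""
    palette = [NULL_COLUMN] * 256
    next_column = 1
    for byte in interesting:
        if palette[byte] == NULL_COLUMN:  # skip bytes already assigned
            palette[byte] = next_column
            next_column += 1
    return palette


def columns_for(data, palette):
    """Translate input bytes into state-table column indices."""
    return [palette[byte] for byte in data]


palette = build_palette(b"<>/")
print(columns_for(b"a<b>", palette))  # [0, 1, 0, 2]
```

In this model, a byte that maps to `NULL_COLUMN` could be skipped without consulting the state table at all, which corresponds to the null character test at 144.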
Fig. 4A shows the preferred state table format, which is constituted or configured similarly to the character palette (e.g. substantially as a register). The principal difference from the character palette of Fig. 3 is that the length of the register depends on the number of responses to characters that are desired and on the number and length of the strings of interest. Therefore, it is considered desirable to provide for the possibility of implementing this memory in CPU memory or other external DRAM (possibly with an internal or external cache) if the amount of internal memory that can be economically provided is insufficient in a particular case. Nevertheless, a substantial hardware economy is clearly provided, since highly duplicative entries in the state table of Fig. 1 can be reduced to a single entry, the address of which is accommodated by the data provided in accordance with the character palette of Fig. 3 as described above. The output of state table 160 is preferably one, two, or four bytes, but provision for as many as 32 bits can give increased flexibility, as discussed below with reference to Fig. 4B. In any case, the output of the state table provides an address or pointer into the next state palette 170.
Referring now to Fig. 4B, as a perfecting feature of the invention in this latter regard, a preferred implementation of the invention includes a state table control register 162, which allows a further substantial hardware economy, especially if 32-bit outputs of state table 160 are to be provided. Essentially, the state table control register provides compression of the state table information by allowing variable-length words to be stored in, and read out of, the state table.
More specifically, state table control register 162 stores and provides the length of each entry in the state table 160 of Fig. 4A. Some state table entries in Fig. 1 are highly duplicative (e.g. "go to state 0", "stay in state n"). Such entries can not only be represented by a single entry in state table 160, or at least by far fewer entries than in Fig. 1, but can also be represented by fewer bits, possibly as few as one. This yields substantial hardware economy even if most or all of the duplicative entries are included in the state table, as may be found convenient for some state tables. Those of ordinary skill in the art will recognize the principle of this reduction as similar to so-called entropy coding.
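The variable-length storage described above can be illustrated, purely as a sketch, by packing entries of differing bit widths into one word and using a separate list of widths to read them back. The list `widths` stands in for the contents of the state table control register; the function names and the bit layout are assumptions made for this example and are not taken from the description.

```python
def pack_entries(values, widths):
    """Pack values into one integer, with entry i occupying widths[i] bits."""
    packed, offset = 0, 0
    for value, width in zip(values, widths):
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its declared width")
        packed |= value << offset
        offset += width
    return packed


def read_entry(packed, widths, index):
    """Read entry `index`, using the width list to locate its bit offset."""
    offset = sum(widths[:index])
    return (packed >> offset) & ((1 << widths[index]) - 1)


widths = [1, 1, 4, 8]    # hypothetical per-entry lengths, in bits
values = [0, 1, 9, 200]  # common entries get short codes
packed = pack_entries(values, widths)
print([read_entry(packed, widths, i) for i in range(4)])  # [0, 1, 9, 200]
print(sum(widths), "bits instead of", 32 * len(values))   # 14 bits instead of 128
```

The saving shown here comes only from giving frequent entries short encodings, which is the sense in which the description compares the scheme to entropy coding.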
The preferred format of the next state palette 170 will now be discussed with reference to Fig. 5A. The next state palette 170 is preferably implemented in much the same manner as the character palette 140 discussed above. However, as with state table 160, the number of entries that may be required is not known in advance, and the individual entries are preferably much longer (e.g. two 32-bit words). On the other hand, because only a relatively small and predictable range of addresses needs to be held at any given time, the next state palette 170 can be operated as a cache (e.g. using next state palette base address register 172). Further, if 32-bit outputs of state table 160 are provided, some of that data can be used to supplement the data in the entries of the next state palette 170, possibly allowing shorter entries in the latter or possibly bypassing the next state palette altogether, as indicated by dashed line 175.
As shown in Fig. 5A, the lower-address 32-bit word output from the next state palette 170 is the token to be saved. The token is preferably formed of a 16-bit token value, 8 bits of token flags, and 8 bits of control flags. The token value and token flags are stored in token buffer 190 at an address given jointly by pointer 192, which points to the beginning of the string, and a length accumulated by counting successful character comparisons. The control flags set interrupts to the host CPU or control processing within the parser accelerator. One of these control flags is preferably used to set a skip-enable function for characters that do not cause a change of state at a state other than state 0, such as a string of identical or related characters of arbitrary length occurring within a string of interest, as alluded to above. In such a case, the next state table entry can be reused without fetching it from SRAM/SDRAM, and the input buffer address 112 can be incremented without additional processing, allowing substantial additional acceleration of parsing for certain strings of characters. The second 32-bit word is an address offset that is fed back to register 180 and adder 150, to be concatenated with the index output from the character palette to form the pointer into the state table for the next character. The initial address corresponding to state 0 is supplied by register 182.
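As a hedged illustration of the token word just described, the following sketch packs and unpacks a 32-bit word made up of a 16-bit token value, 8 bits of token flags, and 8 bits of control flags. The field widths come from the description; the ordering of the fields within the word and the function names are assumptions made for this example.

```python
def pack_token_word(token_value, token_flags, control_flags):
    """Build a 32-bit word: value in bits 31-16, token flags 15-8, control 7-0.

    The field order is an assumption; only the widths are from the description.
    """
    if not 0 <= token_value <= 0xFFFF:
        raise ValueError("token value must fit in 16 bits")
    if not 0 <= token_flags <= 0xFF or not 0 <= control_flags <= 0xFF:
        raise ValueError("flag fields must fit in 8 bits each")
    return (token_value << 16) | (token_flags << 8) | control_flags


def unpack_token_word(word):
    """Split a 32-bit word back into (token_value, token_flags, control_flags)."""
    return (word >> 16) & 0xFFFF, (word >> 8) & 0xFF, word & 0xFF


word = pack_token_word(0x1234, 0xAB, 0x01)
print(hex(word))                # 0x1234ab01
print(unpack_token_word(word))  # (4660, 171, 1)
```

In such a model, only the first two fields would be written to the token buffer, while the control flags would steer the accelerator itself, for example by enabling the skip function or raising an interrupt.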
It can thus be seen that the use of a character palette, a state table in abbreviated form, and a next state palette articulates the function of conventional state table operations into separate stages. Each stage can be performed extremely rapidly with relatively little high-speed memory, and can therefore be duplicated to form parallel pipelines that operate on respective characters of a document in turn, in parallel with other operations and with the storage of tokens. The parsing process can therefore be greatly accelerated, even relative to a dedicated processor that must perform all of these functions in sequence before processing of another character can begin.
In summary, the accelerator has access to the program memory of the host CPU, where the character data (sometimes referred to as packets, connoting transmission over a network) and the state table are located. The accelerator 100 is under the control of the host CPU via memory-mapped registers. The accelerator can interrupt the host CPU to indicate exceptions, alarms, and terminations which, in the context of intrusion detection, may generally be referred to as pattern-matching alerts, intrusion event alerts, and the like. When parsing is to begin, pointers (112, 114) are set to the beginning and end of the data in input buffer 130 to be parsed, and the state table to be used (as indicated by base address 182) and other control information (e.g. 142) are set up within the accelerator.
To initiate operation of the accelerator, the CPU issues a command to the accelerator. In response, the accelerator fetches a first 32-bit word of data from the CPU program memory (e.g. 120 or a cache) and places it in input buffer 130, from which the first byte/ASCII character is selected. The accelerator fetches the state information corresponding to the input character and the current state (that is, Fig. 4A corresponds to a single character, or a single column, of the full state table of Fig. 1). This state information includes the next state address and any special actions to be performed, such as interrupting the CPU or terminating.
The accelerator then selects the next byte to be analyzed from input buffer 130 and repeats the process with the new state information, which will already be available to adder 150. The operation, or the storage of token information, can be performed concurrently. This continues until all four characters of the input word have been analyzed. Then (or concurrently with the analysis of the fourth character, by prefetching), pointers 112 and 114 are compared to determine whether the end of document buffer 120 has been reached; if so, an interrupt is sent back to the CPU. Otherwise, a new word is fetched, pointer 112 is updated, and the process is repeated.
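The word-at-a-time loop described above can be sketched in software as follows. This is a simplified, sequential model under stated assumptions: the dictionary `transitions`, the fallback to state 0 for missing pairs, and the event names are all hypothetical, and the hardware concurrency described above is not modeled.

```python
WORD_BYTES = 4  # the description fetches one 32-bit word (four characters) at a time


def parse_buffer(data, transitions, start_state=0):
    """Return (final_state, events) after walking `data` word by word.

    `transitions` maps (state, byte) to (next_state, action). Pairs that are
    absent fall back to state 0 with no action (an assumption of this sketch).
    """
    state = start_state
    events = []
    read_ptr, end_ptr = 0, len(data)  # stand-ins for pointers 112 and 114
    while read_ptr < end_ptr:         # pointer comparison: end of buffer?
        word = data[read_ptr:read_ptr + WORD_BYTES]  # fetch the next word
        for i, byte in enumerate(word):
            state, action = transitions.get((state, byte), (0, None))
            if action is not None:
                events.append((action, read_ptr + i))
        read_ptr += len(word)         # update the read pointer
    events.append(("end_of_buffer", read_ptr))  # models the interrupt to the CPU
    return state, events


A, B = ord("a"), ord("b")
table = {(0, A): (1, None), (1, A): (1, None), (1, B): (0, "store_token")}
print(parse_buffer(b"xxabxab", table))
# (0, [('store_token', 3), ('store_token', 6), ('end_of_buffer', 7)])
```

The example table recognizes the two-character string "ab"; each recognition is reported together with the byte offset at which it completed.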
Because the pointers and counters are implemented in dedicated hardware, they can be updated in parallel, rather than serially as would be required in a software implementation. This reduces the time to analyze a byte of data to the time required to fetch the character from the local input buffer, generate the state table address from the high-speed local character palette memory, fetch the corresponding state table entry from memory, and fetch the next state information, again from local high-speed memory. Some of these operations can be performed concurrently in separate parallel pipelines, and other operations specified in the state table information (provided partly or entirely through the next state palette) can be carried out while analysis of further characters continues.
It is thus clear that the invention provides substantial acceleration of the parsing process with a small and economical amount of dedicated hardware. Although the parser accelerator can interrupt the CPU, the processing operation is removed entirely from the CPU after the initial command to the parser accelerator. However, the acceleration described above is not optimal for detecting possible intrusions or security breaches. Processing of tokens requires substantial time even when performed concurrently with other parsing operations, so that a command issued during parsing can initiate operations that are difficult or impossible to interrupt.
Referring now to Fig. 2B, an architecture for a hardware parser accelerator is shown that greatly increases the processing speed of parsing beyond that of the architecture of Fig. 2A described above, but for the limited purpose of detecting signatures of possible intrusions and security breaches, and that is fully compatible with the architecture of Fig. 2A. Those of ordinary skill in the art will recognize, from a comparison of Fig. 2A and Fig. 2B, that the architecture of Fig. 2B is essentially a subset of that of Fig. 2A. It provides the same ability to search for any of a plurality of strings that may be an intrusion signature (e.g. matching one or more expressions or portions of expressions, so that pattern-matching alerts can be sent to the CPU before a complete match of an expression encoded in the state table, thereby increasing the speed of response). However, since only an interrupt or exception to protect the system needs to be issued as a result of processing, it provides further acceleration by omitting the processing of tokens. Once possible security-breach signatures have been contained, full parsing of the document as described above can be performed after the document has been screened. Because token processing is omitted during the screening process, the number of memory accesses is reduced. That is, in the intrusion detection accelerator according to the invention (as compared with the hardware parser accelerator described above), the processing of tokens and the use of the character palette are omitted, which results in a reduced requirement for memory resources and some reduction in processing time. However, since many of these processes are performed in parallel, the increase in processing speed is generally about 25% or slightly less, depending in part on the particular devices used for the various resources, logic speed, and the like. Perhaps more important than speed, however, is the fact that any security-breach signature that may be present will be detected, and a remedial interrupt or exception issued, before the corresponding commands can be executed by the CPU as part of an attack.
All of the functional elements of the architecture of Fig. 2B are represented in Fig. 2A, and the same reference numerals are used for corresponding elements. It is therefore clear that the intrusion detection parser accelerator 200 according to the invention is fully compatible with the parser accelerator described above, and that the change in architecture can be accomplished largely by programming, so that intrusion detection processing is a special mode of operation of the parser accelerator of Fig. 2A.
Specifically, the input buffer 120 and input word buffer 130, as well as address registers 112, 114, adder 150, and state table base address register 182, are identical to the corresponding elements described above and access state table 160 in the same manner. The differences lie principally in the omission of the character palette and the next state palette memory, and in the data in the state table and the internal format of that data. The state table is still essentially 256 characters wide, as in the embodiment of Fig. 2A. It should be understood that the signatures to be searched for may be more complex than a single string of an arbitrary number of characters. Signatures are more generally described as "regular expressions", which are more complex than character strings, as discussed in "So What's A $#!%% Regular Expression, Anyway?!" by Vikram Vaswani et al., Developer Shed (copyright Melonfire, 2000-2002), the entire content of which is incorporated herein by reference. Thus, the state table corresponding to a regular expression can be much larger than the state table for a single character string. Further, several regular expressions can be searched for concurrently using the same state table, which would result in a very large state table. In practice, however, 64 states are generally sufficient. If they are not, the state table entries would need to be extended beyond 8 bits. Therefore, the extreme compression described above is generally unnecessary, and the hardware that provides a sufficient portion of state table 160, as described above for reducing the number of memory accesses required, need not hold the entire state table needed to detect attack signatures.
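To illustrate how several signatures can share one state table that is 256 characters wide, the following sketch builds a dense table for a set of literal strings. It uses the well-known Aho-Corasick construction, which is a technique chosen for this example and is not stated in the description; literal strings stand in for regular expressions, and all names and sample signatures are hypothetical.

```python
from collections import deque


def build_table(signatures):
    """Return (rows, alerts): rows[state][byte] is the next state, and
    alerts[state] is True when some signature ends on entering that state."""
    goto = [{}]       # trie edges for each state
    alerts = [False]
    for sig in signatures:
        state = 0
        for byte in sig:
            if byte not in goto[state]:
                goto.append({})
                alerts.append(False)
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        alerts[state] = True

    rows = [[0] * 256 for _ in goto]  # one 256-wide row per state
    fail = [0] * len(goto)            # longest proper suffix that is a prefix
    queue = deque()
    for byte, nxt in goto[0].items():
        rows[0][byte] = nxt
        queue.append(nxt)
    while queue:                      # breadth-first, so fail rows are complete
        state = queue.popleft()
        alerts[state] = alerts[state] or alerts[fail[state]]
        for byte in range(256):
            if byte in goto[state]:
                nxt = goto[state][byte]
                fail[nxt] = rows[fail[state]][byte]
                rows[state][byte] = nxt
                queue.append(nxt)
            else:
                rows[state][byte] = rows[fail[state]][byte]
    return rows, alerts


def first_alert(data, rows, alerts):
    """Return the offset of the byte that completes the first match, or None."""
    state = 0
    for offset, byte in enumerate(data):
        state = rows[state][byte]
        if alerts[state]:
            return offset
    return None


rows, alerts = build_table([b"cmd.exe", b"/etc/passwd", b"exec"])
print(len(rows))                                       # 23
print(first_alert(b"run cmd.exe now", rows, alerts))   # 10
```

With these three sample signatures the table needs 23 states, which is consistent with the observation above that a modest number of states is often sufficient in practice.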
In the embodiment of Fig. 2A, the format of the data in the state table preferably comprises n=256 entries, to accommodate the number of characters that can be expressed by a byte. In the embodiment of Fig. 2B, however, access to the state table is made directly from the bits of a character buffered in word buffer 130. The content of each entry in the state table need only contain the next state, or the row of the state table to be loaded, which defines the string of interest and allows the string to be followed, and/or a flag or other code for the interrupt or exception to be issued at the character that suffices to identify the string as a signature (which is not necessarily the last character of the string constituting the signature), as shown in Fig. 5B. The next state can generally be expressed in fewer than 8 bits, and the interrupt or exception to be generated on detection of a string of interest can be expressed in one bit.
Thus, characters are tested in turn, with no register other than registers 112, 114 being updated, until the first character of a string of interest is encountered. That is, until such a detection, the state does not change and the next state in register 180 is not even updated. The document can therefore be screened for such an initial character at extremely high speed. When an initial character of a string of interest is encountered, the next state data is read from the state table, register 180 is updated, new state table data is loaded into the state memory (if not already present), and the next character is processed in the same fashion. The state table memory is much smaller than that for the XML™ parser described above. This allows the state table memory to be implemented on board the chip, together with the other logic and elements of Fig. 2A or Fig. 2B, which reduces the processing cycle time to as little as 25% of the time required by a design using external memory. However, if the accelerator is designed simply as a special mode of operation of the XML™ accelerator, the state table would be implemented in external memory and this speed gain would not be realized. In such a case it is therefore preferred to provide both on-board and external memory, to be used alternatively depending on the size of the state table. Thus, only relatively few clock cycles per character are required to screen documents for attack signatures. When a sufficient number of characters has been processed to identify a string of interest, the data read from the state table will contain an interrupt or exception to be issued as a command to the host CPU for protection of the system.
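The screening loop and the compact entry format described above can be sketched as follows, for a single literal signature. Everything specific here is an assumption made for illustration: the 8-bit entry layout (top bit as interrupt flag, low seven bits as next state), the function names, and the table construction, which follows the standard automaton used in Knuth-Morris-Pratt-style string matching rather than anything stated in the description.

```python
INTERRUPT_BIT = 0x80  # assumed: top bit of an 8-bit entry requests an interrupt
STATE_MASK = 0x7F     # assumed: low seven bits hold the next state


def make_table(signature):
    """Build one 256-entry row of 8-bit entries per state for one signature."""
    if not 0 < len(signature) <= STATE_MASK:
        raise ValueError("signature length must be between 1 and 127 bytes")
    last = len(signature) - 1
    rows = [bytearray(256) for _ in signature]  # every entry defaults to state 0
    restart = 0  # state to fall back to when the current byte mismatches
    for state, byte in enumerate(signature):
        if state > 0:
            rows[state][:] = rows[restart]      # inherit mismatch transitions
        fallback = rows[restart][byte] & STATE_MASK if state > 0 else 0
        if state == last:
            rows[state][byte] = INTERRUPT_BIT | fallback  # signature complete
        else:
            rows[state][byte] = state + 1                 # advance along it
        restart = fallback
    return rows


def scan(data, rows):
    """Return the offset at which the interrupt flag is first seen, or None."""
    state = 0
    for offset, byte in enumerate(data):
        entry = rows[state][byte]
        state = entry & STATE_MASK
        if entry & INTERRUPT_BIT:
            return offset  # the hardware would interrupt the host CPU here
    return None


print(scan(b"GET /etc/passwd", make_table(b"/etc")))  # 7
print(scan(b"hello", make_table(b"/etc")))            # None
```

While the scan remains in state 0, each input byte costs a single table read and no state change, which mirrors the fast path described above for characters that precede any string of interest.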
Although the architecture of a system including the invention as shown in Fig. 2A or Fig. 2B is not critical to the practice of the invention, an architecture such as that shown in Fig. 2C is preferred for the intrusion detection accelerator of Fig. 2B. Specifically, host CPU 230 and its main memory 210 are connected by bus 220, and the hardware parser accelerator of the invention communicates with main memory 210 and host CPU 230 through bus 220. While CPU 230 can monitor the communications between main memory 210 and accelerator 100/200, the tokens have not yet been defined or created, and execution of code to carry out an attack is not possible. The invention can therefore issue a remedial interrupt or exception before any operation included in the attack can be executed.
In view of the foregoing, it can be seen that the invention provides very rapid screening of documents for signatures that may indicate an attempted attack, in the context and environment of a hardware accelerator that can greatly reduce the time for parsing, for example, an XML™ document to a fraction of the time required prior to the invention. The intrusion detection parser of the invention requires no elements or hardware beyond those of the parser accelerator according to the invention, and can issue interrupts and/or exceptions before any intrusion processing can be executed.
Although the invention has been described in terms of a single preferred embodiment, those of ordinary skill in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.

Claims (25)

1. An intrusion detection system, comprising:
a character buffer for storing a plurality of bytes of a document;
a state table addressable in accordance with a byte of the document in combination with a state, to access from the state table at least one of an interrupt, an exception, or a command to store a token, and next state data, wherein the command to store a token is accessed when a valid token is parsed in the state table;
a register for storing the next state data;
means for merging the content of the register with a subsequent byte of the document to form a further address into the state table;
a token buffer for storing a plurality of tokens, wherein the plurality of tokens are available to a host processor for further processing; and
a bus for communicating the interrupt or the exception to the host processor;
wherein the state table, the means for merging, and the token buffer are operable simultaneously in parallel.
2. The intrusion detection system as claimed in claim 1, wherein the intrusion detection system is provided by a parser.
3. The intrusion detection system as claimed in claim 1, wherein the state table is provided by memory on the same chip as at least one of the register and the means for merging.
4. The intrusion detection system as claimed in claim 2, wherein the state table is provided by external memory.
5. The intrusion detection system as claimed in claim 4, further comprising memory on the same chip as at least one of the register and the means for merging, for storing the state table when the state table need not be implemented in the external memory.
6. The intrusion detection system as claimed in claim 1, wherein the state table is accessed at a rate greater than the network packet transmission rate.
7. The intrusion detection system as claimed in claim 1, further comprising means for sending a pattern-matching alert to the host processor in response to detecting the occurrence of an input sequence that matches a signature of one or more sequences encoded in the state table.
8. The intrusion detection system as claimed in claim 7, wherein an intrusion alert responsive to the interrupt or the exception is communicated to the host processor to initiate intrusion prevention operations, to prevent or limit an intrusion attack.
9. The intrusion detection system as claimed in claim 1, wherein the state table is accessed at a rate equal to the network packet transmission rate.
10. An intrusion detection method, comprising:
accessing a state table addressable in accordance with a byte of a document and a current state;
if an interrupt or an exception is available, obtaining at least one of the interrupt or the exception from the state table;
if it is determined that no interrupt or exception is available and a valid token has been parsed, obtaining a token-store command from the state table;
storing a token in a token buffer in response to the token-store command;
obtaining next state data from the state table;
storing the next state data; and
merging the stored next state data with a subsequent byte of the document to form a further address into the state table.
11. The intrusion detection method as claimed in claim 10, wherein the intrusion detection method is provided by a parser.
12. The intrusion detection method as claimed in claim 11, wherein the state table is provided by external memory.
13. The intrusion detection method as claimed in claim 10, wherein the state table is accessed at a rate greater than the network packet transmission rate.
14. The intrusion detection method as claimed in claim 10, further comprising sending a pattern-matching alert to the host processor when an input sequence that matches a signature of one or more sequences encoded in the state table is detected.
15. The intrusion detection method as claimed in claim 14, wherein an intrusion alert corresponding to the interrupt or the exception is communicated to the host processor to initiate intrusion prevention operations, thereby preventing or limiting an intrusion attack.
16. The intrusion detection method as claimed in claim 10, wherein the state table is accessed at a rate equal to the network packet transmission rate.
17. A method of enabling a computer to accelerate intrusion detection, comprising the steps of:
accessing a state table addressable in accordance with a byte of a document and a previous state;
if an interrupt or an exception is available, obtaining at least one of the interrupt or the exception from the state table;
if a command is available and the token has been fully parsed, obtaining a token-store command from the state table, and storing the token in response to the token-store command;
obtaining next state data from the state table;
storing the next state data;
merging the stored next state data with a subsequent byte of the document to form a further address into the state table; and
after the token has been parsed and stored, making the token available for subsequent processing for a different purpose.
18. The method as claimed in claim 17, wherein the different purpose is contextual analysis to detect an intrusion at the document level.
19. The method as claimed in claim 17, wherein the different purpose is the end use of the document.
20. The method as claimed in claim 17, wherein the different purpose is unrelated to intrusion detection.
21. An intrusion detection method, comprising:
accessing a state table, the state table being addressable in accordance with a first portion of a document and a current state;
when a command to store a token is available, obtaining the command from the state table, and storing a token in response to the command;
obtaining next state data from the state table;
storing the next state data;
merging the stored next state data with a second portion of the document to form a further address into the state table;
performing the following functions simultaneously in parallel: accessing the state table, storing the token, and merging the stored next state data with the second portion of the document; and
communicating the interrupt or the exception to the host processor.
22. The intrusion detection method as claimed in claim 21, wherein the intrusion detection method is provided by a parser.
23. The intrusion detection method as claimed in claim 21, further comprising sending a pattern-matching alert to the host processor when an input sequence that matches a signature of one or more sequences encoded in the state table is detected, to increase the speed of response.
24. The intrusion detection method as claimed in claim 21, wherein an intrusion alert corresponding to the interrupt or the exception is communicated to the host processor to initiate intrusion prevention operations, thereby preventing or limiting an intrusion attack.
25. The intrusion detection method as claimed in claim 21, wherein the first portion and the second portion represent characters.
CNB2003801061642A 2002-10-29 2003-10-03 Intrusion detection accelerator Expired - Fee Related CN100357846C (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US42177302P 2002-10-29 2002-10-29
US60/421,774 2002-10-29
US60/421,775 2002-10-29
US60/421,773 2002-10-29
US10/331,879 2002-12-31

Publications (2)

Publication Number Publication Date
CN1735850A CN1735850A (en) 2006-02-15
CN100357846C true CN100357846C (en) 2007-12-26

Family

ID=35925173

Family Applications (3)

Application Number Title Priority Date Filing Date
CNB2003801061642A Expired - Fee Related CN100357846C (en) 2002-10-29 2003-10-03 Intrusion detection accelerator
CNB2003801061661A Expired - Fee Related CN100380322C (en) 2002-10-29 2003-10-03 Hardware accelerated validating parser
CNB2003801061657A Expired - Fee Related CN100430896C (en) 2002-10-29 2003-10-03 Hardware parser accelerator

Country Status (1)

Country Link
CN (3) CN100357846C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4945410B2 (en) * 2006-12-06 2012-06-06 株式会社東芝 Information processing apparatus and information processing method
US8117347B2 (en) * 2008-02-14 2012-02-14 International Business Machines Corporation Providing indirect data addressing for a control block at a channel subsystem of an I/O processing system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5319776A (en) * 1990-04-19 1994-06-07 Hilgraeve Corporation In transit detection of computer virus with safeguard
US5414833A (en) * 1993-10-27 1995-05-09 International Business Machines Corporation Network security system and method using a parallel finite state machine adaptive active monitor and responder
CN1310539A (en) * 2001-03-16 2001-08-29 巨龙信息技术有限责任公司 Telecom service developing method based on independent service module
CA2307529A1 (en) * 2000-03-29 2001-09-29 Pmc-Sierra, Inc. Method and apparatus for grammatical packet classifier

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3606387B2 (en) * 1994-09-13 2005-01-05 松下電器産業株式会社 Compilation device
US5799307A (en) * 1995-10-06 1998-08-25 Callware Technologies, Inc. Rapid storage and recall of computer storable messages by utilizing the file structure of a computer's native operating system for message database organization
US5995963A (en) * 1996-06-27 1999-11-30 Fujitsu Limited Apparatus and method of multi-string matching based on sparse state transition list
JP4153989B2 (en) * 1996-07-11 2008-09-24 株式会社日立製作所 Document retrieval and delivery method and apparatus
JP3958902B2 (en) * 1999-03-03 2007-08-15 富士通株式会社 Character string input device and method
US6427202B1 (en) * 1999-05-04 2002-07-30 Microchip Technology Incorporated Microcontroller with configurable instruction set
AUPQ849500A0 (en) * 2000-06-30 2000-07-27 Canon Kabushiki Kaisha Hash compact xml parser

Also Published As

Publication number Publication date
CN100430896C (en) 2008-11-05
CN100380322C (en) 2008-04-09
CN1726464A (en) 2006-01-25
CN1735850A (en) 2006-02-15
CN1726465A (en) 2006-01-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20071226

Termination date: 20101003