JPWO2014041783A1 - Character string detection circuit and character string detection method

Info

Publication number: JPWO2014041783A1
Application number: JP2014535372A
Authority: JP
Inventors: 浩明井上
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2012-09-11
Filing date: 2013-09-09
Publication date: 2016-08-12
Also published as: WO2014041783A1

Abstract

A character string detection circuit comprising: match detection means that includes leading-end detection means for detecting, from an input character string, position information and match information of a leading character string containing the leading end of a detected character string, terminal detection means for detecting, from the input character string, position information and match information of a terminal character string containing the terminal end of the detected character string, and intermediate detection means for detecting, from the input character string, match information of an intermediate character string sandwiched between the leading character string and the terminal character string in the detected character string; and range length detection means for detecting the detected character string in the input character string, with the match length taken into account, by using the position information and the match information.

Description

The present invention relates to a character string detection circuit and a character string detection method that detect matching character strings. In particular, it relates to a character string detection device and a character string detection method that take the match length of a character string into account.

Character string match detection has a variety of industrial applications, such as network intrusion detection and spam mail filtering. In recent years, match detection that takes the match length of a character string into account has attracted particular attention.

For example, consider detecting a match against arbitrary alphabetic characters with a maximum match length of two characters. In this example, the character strings A and AB are detected as matches, but the character string ABC is not detected because its match length is 3, which exceeds the maximum. Specifying the match length in this way enables more accurate character string match detection. However, it has been very difficult to perform such match-length-aware detection at high speed.

FIG. 9 shows a character string detection circuit 90 with a repetition count constraint, which performs match detection that takes the repetition of a general character string into account. An example of the scheme of FIG. 9 is disclosed in Non-Patent Document 1.

In this example, the minimum repetition count is 1 and the maximum repetition count is 40. In the character string detection circuit 90, match detection on the input character string is performed by a state transition machine 91 generated from, for example, a regular expression. When a match occurs, the state transition machine 91 sends the counter 92 a signal to increment its internal counter. That is, the repetition count is increased by 1 each time the state transition repeats. If the output of the internal counter is between 1 and 40 inclusive, the comparator 93 sends match information to the output circuit 95.

The mismatch detection circuit 94 then monitors the internal state of the state transition machine 91 and, if there is no possibility of a future match, sends a reset signal to the counter 92 and the output circuit 95. On receiving the reset signal, the counter 92 returns its internal counter value to 0. That is, the repetition count is set to 0 when the state transition does not repeat. When the output circuit 95 receives the reset signal, it judges the result to be a match if the output of the comparator 93 indicates a match, and outputs the match result.

The state transition machine 91 of the scheme shown in FIG. 9 can detect a character string with the match length taken into account when the string consists of repetitions of a single character. Such a string is expressed as the repetition of a specific character or of an arbitrary character, as in “AAAAA”. In this case, “A” appears five times and the repetition count is 5, so the repetition count equals the match length. Detection that takes the match length into account is therefore possible.
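
The counter-and-comparator behavior described for FIG. 9 can be illustrated with a small software model. This is only a sketch under stated assumptions: the function name `accepted_repetition_counts` is hypothetical, the state transition machine is reduced to a comparison against one fixed symbol, and the hardware timing of Non-Patent Document 1 is not modeled.

```python
def accepted_repetition_counts(text, symbol, min_rep=1, max_rep=40):
    """Toy model of the FIG. 9 scheme for a single repeated symbol.

    Returns the repetition count of every run of `symbol` whose length
    lies in [min_rep, max_rep]. The name and signature are illustrative
    assumptions, not taken from the source.
    """
    accepted = []
    counter = 0                              # plays the role of counter 92
    for ch in text:
        if ch == symbol:
            counter += 1                     # the state transition repeated
            continue
        # Mismatch: plays the role of the reset from circuit 94.
        if min_rep <= counter <= max_rep:    # comparator 93
            accepted.append(counter)         # output circuit 95
        counter = 0
    # Flush the last run when the input ends without a mismatch.
    if min_rep <= counter <= max_rep:
        accepted.append(counter)
    return accepted


print(accepted_repetition_counts("AAAAA", "A"))  # [5]
```

For a string made of one repeated character, the accepted count is the same number as the match length, which is why this scheme works for that case only.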

M. Faezipour, M. Nourani, “Constraint Repetition Inspection for Regular Expression on FPGA,” 16th IEEE Symposium on High Performance Interconnects, pp. 111-118 (2008)

With the scheme shown in FIG. 9, suppose one wants to output the match length of a character string that is not a repeated string, such as “ABBBC”. Even if “ABBBC” is supplied as the input character string, the different characters A and C are attached to the repeated part. As a result, this scheme outputs a repetition count of 1, and the actual match length of 5 is not output.

In other words, the scheme shown in FIG. 9 cannot detect a character string that has no repetition as a character string with the match length taken into account.

Furthermore, because the scheme shown in FIG. 9 processes input serially with a single state transition machine, multi-byte input is difficult to handle. Character strings therefore cannot be processed at high speed by means of multi-byte input.

The present invention has been made in view of the above problems, and its object is to provide a device and a method that perform match detection with the match length of a character string taken into account, at high speed and without such restrictions.

The character string detection circuit of the present invention comprises: match detection means that includes leading-end detection means for detecting, from an input character string, position information and match information of a leading character string containing the leading end of a detected character string, terminal detection means for detecting, from the input character string, position information and match information of a terminal character string containing the terminal end of the detected character string, and intermediate detection means for detecting, from the input character string, match information of an intermediate character string sandwiched between the leading character string and the terminal character string of the detected character string; and range length detection means for detecting the detected character string in the input character string, with the match length taken into account, by using the position information and the match information.

The character string detection method of the present invention performs the following in parallel: detection, from an input character string, of position information and match information of a leading character string containing the leading end of a detected character string; detection, from the input character string, of position information and match information of a terminal character string containing the terminal end of the detected character string; and detection, from the input character string, of match information of an intermediate character string sandwiched between the leading character string and the terminal character string in the detected character string. The method then detects the detected character string in the input character string, with the match length taken into account, by using the position information and the match information.

According to the present invention, match detection that takes the match length of a character string into account can be performed without being limited to repeated character strings. In addition, match detection can be processed at very high speed by making use of multi-byte input.

A configuration diagram of the character string detection circuit according to the first embodiment of the present invention.
A configuration diagram of the character string to be detected in the first embodiment of the present invention.
A configuration diagram of the input character string in the character string detection circuit according to the first embodiment of the present invention.
An example of a configuration diagram of the input character string in the character string detection circuit according to the first embodiment of the present invention.
A diagram showing an operation example of the character string detection circuit according to the first embodiment of the present invention.
A state transition diagram of the range length detection unit of the character string detection circuit according to the first embodiment of the present invention.
A configuration diagram of the character string detection circuit according to the second embodiment of the present invention.
A configuration diagram of the character string processing unit of the character string detection circuit according to the second embodiment of the present invention.
A configuration diagram of a character string detection circuit with a maximum length constraint having a general configuration.

Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings.

(First embodiment)
(Character string detection circuit)
FIG. 1 shows a configuration example of a character string detection circuit 1 according to the first embodiment of the present invention. The character string detection circuit 1 according to the embodiments of the present invention refers to a character string detection circuit with a match length constraint. The character string detection circuit 1 can be applied not only to characters of regular expressions but also to match detection of symbols, patterns, and the like that do not appear in regular expressions.

Referring to FIG. 1, the character string detection circuit 1 includes a leading-end detection unit 2, an intermediate detection unit 3, a terminal detection unit 4, and a range length detection unit 5.

The character string detection circuit 1 compares an input character string with the detected character string and determines whether they match, taking the match length into account.

The leading-end detection unit 2, the intermediate detection unit 3, and the terminal detection unit 4 detect, from the input character string, the leading character string, the intermediate character string, and the terminal character string of the detected character string, respectively. Although these three units are described as separate components, they need not all be individual components if their detection functions can be realized as a single match detection unit.

The range length detection unit 5 receives the results output from the leading-end detection unit 2, the intermediate detection unit 3, and the terminal detection unit 4, and determines from those results whether the input character string matches the detected character string.

The details of each of these components are described later.

(Detected character string)
FIG. 2 shows the configuration of a detected character string 20 that is detected by the character string detection circuit 1 according to the first embodiment of the present invention.

The detected character string 20 consists of a leading character string 21, an intermediate character string 22, and a terminal character string 23.

The leading character string 21 and the terminal character string 23 are character strings that contain the leading end and the terminal end of the detected character string 20, respectively. The leading character string 21 and the terminal character string 23 are generally assigned codes different from those of the intermediate character string 22. Their lengths are arbitrary and can be specified as, for example, 1 byte or more.

The intermediate character string 22 is the character string whose match length is taken into account. The match length is the number of characters in the intermediate character string 22 sandwiched between the leading character string 21 and the terminal character string 23. The match length is set as either a minimum length or a maximum length alone, or as a range that includes both. Because only the match length needs to be considered for the intermediate character string 22, only its number of characters needs to be defined. However, the intermediate character string 22 may contain a specific character string where necessary.

For example, let the detected character string 20 be “ABC...CYZ”, with the leading character string 21 defined as AB, the terminal character string 23 as YZ, and the intermediate character string 22 as a repetition of C. If the maximum match length is 5, the character strings “ABYZ”, “ABCYZ”, “ABCCYZ”, “ABCCCYZ”, “ABCCCCYZ”, and “ABCCCCCYZ” are detection targets. However, “ABCCCCCCYZ” is not a detection target because the match length of its intermediate character string 22 is 6. Although this example uses a repetition of C as the intermediate character string 22, the intermediate character string 22 may consist of arbitrary characters, in which case only the match length is examined.
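
The rule in this example can be checked with a few lines of code. The following is a minimal sketch, assuming the intermediate characters are unconstrained and only their count matters; the function name `is_detection_target` and its parameters are hypothetical and not part of the source.

```python
def is_detection_target(candidate, lead, term, min_len=0, max_len=None):
    """Return True if `candidate` is `lead` + middle + `term` and the
    length of the middle part lies in [min_len, max_len].

    Hypothetical helper: the middle characters are not inspected,
    only counted. max_len=None means no upper bound.
    """
    if not (candidate.startswith(lead) and candidate.endswith(term)):
        return False
    middle_len = len(candidate) - len(lead) - len(term)
    if middle_len < min_len:        # also rejects overlapping lead/term
        return False
    return max_len is None or middle_len <= max_len


for s in ["ABYZ", "ABCCCCCYZ", "ABCCCCCCYZ"]:
    print(s, is_detection_target(s, "AB", "YZ", max_len=5))
```

With a maximum match length of 5, the first two strings are accepted and the third, whose intermediate part has 6 characters, is rejected.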

The character string detection circuit 1 according to the first embodiment of the present invention does not restrict detection to character strings that exactly match the detected character string 20. Instead, it detects any character string that exactly matches the leading character string 21 and the terminal character string 23 and contains an intermediate character string 22 satisfying the specified match length condition.

Because a leading character string 21 and a terminal character string 23 can be defined for any character string, the definition of the detected character string according to the embodiments of the present invention is applicable to general character strings.

The detected character string 20 may be realized as a logic circuit or stored in a storage element, and in general any implementation is acceptable as long as it allows characters to be judged.

(Input character string)
FIG. 3 shows the configuration of an input character string 30 that is input to the character string detection circuit 1 according to the first embodiment of the present invention. A character string represented by a regular expression or the like can be handled as the input character string 30. A symbol string containing symbols, patterns, and the like that are not represented by regular expressions can also be handled as a character string.

The input character string 30 is composed of a plurality of partial character strings 31a, 31b, ..., 31n, each of arbitrary length. A character string that matches the detected character string 20, with the match length taken into account, is detected from the input character string 30. In the description of the embodiments, the partial character strings 31a, 31b, ..., 31n are referred to simply as the partial character string 31 when they need not be distinguished.

For example, suppose the input character string 30 is “ABCCCCCCCCYZ”. If the maximum character string length that the character string detection circuit 1 can process at one time is 4, the partial character strings 31 can be set to “ABCC”, “CCCC”, and “CCYZ”. Of course, if the maximum character string length is greater than the input character string length, the input character string 30 can be input to the character string detection circuit 1 all at once.
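
The split in this example can be reproduced as follows. This is a minimal sketch that assumes fixed-width partial character strings; the helper name `split_into_partial_strings` is hypothetical.

```python
def split_into_partial_strings(text, width):
    """Cut `text` into consecutive pieces of at most `width` characters.

    Hypothetical helper: `width` stands for the maximum string length
    the circuit can accept at one time.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return [text[i:i + width] for i in range(0, len(text), width)]


print(split_into_partial_strings("ABCCCCCCCCYZ", 4))
# ['ABCC', 'CCCC', 'CCYZ']
```

When `width` is at least the length of the input, the result is a single piece, which corresponds to feeding the whole input character string at once.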

The lengths of the partial character strings 31a, 31b, ..., 31n may all be set to the same value, or may be set according to, for example, the order in which the character strings appear.

Each partial character string 31 may contain a code part and a data part. Alternatively, if each partial character string 31 can be stored at a specified address, it may consist of a data part only.

(Operation)
The operation of the character string detection circuit 1 according to the first embodiment of the present invention is described below with reference to FIGS. 4 and 5.

FIG. 4 shows an input character string 40 composed of a partial character string 41 and a partial character string 42. The input character string 40 is an example of the input character string 30 of FIG. 3.

FIG. 5 shows an example of an input scheme for the character string detection circuit 1 according to the first embodiment of the present invention.

When the partial character string 41 and the partial character string 42 are present, the character string detection circuit 1 performs character string detection that takes the match length into account.

The partial character string 41 is input to the character string detection circuit 1 at time T0, and the partial character string 42 is input at time T1. That is, the partial character strings 41 and 42 are input to the character string detection circuit 1 in order, each at its own time. The input character string 40 is supplied to the leading-end detection unit 2, the intermediate detection unit 3, and the terminal detection unit 4 at the same time, so the leading character string 21, the intermediate character string 22, and the terminal character string 23 are detected in parallel. This enables high-speed character string detection.

The timing at which the partial character strings 41 and 42 are input to the character string detection circuit 1 is determined according to the processing capability of the range length detection unit 5. For example, if the range length detection unit 5 cannot process in parallel, the partial character string 42 should be input to the character string detection circuit 1 after the detection processing of the partial character string 41 has finished.

If the character string detection circuit 1 is capable of parallel processing, the partial character string 42 may be input to the character string detection circuit 1 even while the detection processing of the partial character string 41 is still in progress. Concurrent processing becomes possible by, for example, providing multiple leading-end detection units 2, intermediate detection units 3, and terminal detection units 4, by parallelizing the internal processing of each detection unit, or by providing a storage device inside each detection unit. If the detection processing can be executed in a pipelined manner, parallel processing can be carried out continuously, which enables even faster processing. Character string detection that takes the match length into account is carried out according to the input scheme of the partial character strings 41 and 42, but the detection scheme of the character string detection circuit 1 according to the first embodiment of the present invention is not limited to the one described here.

The configuration of the character string detection circuit 1 is described in detail below. As an example, the input character string 30 is “FFABCCCCYZFF”, the leading character string 21 is defined as “AB”, the terminal character string 23 as “YZ”, and the intermediate character string 22 as a repetition of C. The maximum match length is 5 and the length of each partial character string is 4.

(Leading-end detection unit)
The leading-end detection unit 2 takes the partial character string 31 as input and detects the leading character string 21 in that partial character string 31.

When the leading character string 21 is detected, the leading-end detection unit 2 sends a leading-end position 11 and leading-end match information 12 to the range length detection unit 5.

The leading-end position 11 indicates the position of the leading character string 21 within the partial character string 31. Here, the position means an offset within the partial character string 31. It can be defined, for example, as the position of the last character of the leading character string 21. An offset is generally an integer representing the distance from the head, but in this embodiment the position can be specified from either the left or the right of the partial character string 31. For example, when the position is specified with the left of the partial character string 31 as the reference and “FFAB” is input, the leading character string “AB” is located third counting from the head, so the leading-end position 11 is 3.

The leading-end match information 12 indicates whether some character string within the partial character string 31 matched the leading character string 21, did not match it, or is in the process of matching it. That is, the leading-end match information 12 carries one of three values: match, mismatch, or match in progress. Here, “match in progress” refers to the case where the leading character string 21 has a length of 1 or more and spans consecutive partial character strings 31.
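
The three-valued match information and the left-referenced position can be sketched in software as follows. This is an illustration under assumptions, not the circuit itself: the names `MatchInfo` and `detect_leading` are hypothetical, positions are 1-based from the left as in the “FFAB” example, and reporting a position for the “match in progress” case is a choice made here, not something the text specifies.

```python
from enum import Enum


class MatchInfo(Enum):
    MATCH = "match"
    MISMATCH = "mismatch"
    IN_PROGRESS = "match in progress"


def detect_leading(partial, lead):
    """Return (match info, 1-based position from the left, or None).

    Hypothetical sketch: IN_PROGRESS is reported when `partial` ends
    with a proper prefix of `lead`, so the rest may arrive in the
    next partial string.
    """
    index = partial.find(lead)
    if index != -1:
        return MatchInfo.MATCH, index + 1
    # Try the longest proper prefix of `lead` first.
    for k in range(len(lead) - 1, 0, -1):
        if partial.endswith(lead[:k]):
            return MatchInfo.IN_PROGRESS, len(partial) - k + 1
    return MatchInfo.MISMATCH, None


print(detect_leading("FFAB", "AB"))  # (<MatchInfo.MATCH: 'match'>, 3)
```

For the input “FFFA” the same call reports a match in progress, because the trailing “A” could be completed by a “B” at the start of the next partial character string.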

(Intermediate detection unit)
The intermediate detection unit 3 takes the partial character string 31 as input and detects the intermediate character string 22 in that partial character string 31.

When the intermediate character string 22 is detected, the intermediate detection unit 3 sends intermediate match information 13, which indicates whether a match occurred, to the range length detection unit 5. Here, the match information of the intermediate character string 22 indicates whether the match length is within the predetermined range.

(Terminal detection unit)
The terminal detection unit 4 takes the partial character string 31 as input and detects the terminal character string 23 in that partial character string 31.

When the terminal character string 23 is detected, the terminal detection unit 4 sends a terminal position 14 and terminal match information 15 to the range length detection unit 5.

The terminal position 14 indicates the position of the terminal character string 23 within the partial character string 31. It can be defined, for example, as the character position at which the terminal character string 23 begins. For example, when the position is specified with the left of the partial character string 31 as the reference and “YZFF” is input, the terminal character string “ZF” is located second counting from the head of that partial character string 31. The terminal position 14 is then the value obtained by adding 2 to the total length of the partial character strings input so far.

The terminal match information 15 indicates whether some character string within the partial character string 31 matched the terminal character string 23, did not match it, or is in the process of matching it. That is, the terminal match information 15 carries one of three values: match, mismatch, or match in progress.

(Range length detection unit)
The range length detection unit 5 forms a state transition machine whose inputs are the position information and match information described above: the leading-end position 11 and leading-end match information 12 output from the leading-end detection unit 2, the intermediate match information 13 output from the intermediate detection unit 3, and the terminal position 14 and terminal match information 15 output from the terminal detection unit 4. The range length detection unit 5 contains a counter (not shown). It can therefore take into account either the minimum match length or the maximum match length alone, or both.
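
The way positions from the leading-end and terminal detectors combine into a range length check can be sketched in software. The following is not the state transition machine of the embodiment; it is a simplified model under stated assumptions: the names `occurrences` and `range_length_matches` are hypothetical, offsets are 0-based over the whole input, the intermediate characters are only counted, and every leading/terminal pair whose gap lies in the allowed range is reported.

```python
def occurrences(chunks, pattern):
    """Yield 0-based global start offsets of `pattern`, including
    occurrences that straddle a boundary between partial strings."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    keep = len(pattern) - 1     # characters carried to the next chunk
    tail = ""
    consumed = 0                # characters in all earlier chunks
    for chunk in chunks:
        window = tail + chunk
        base = consumed - len(tail)
        start = window.find(pattern)
        while start != -1:
            yield base + start
            start = window.find(pattern, start + 1)
        consumed += len(chunk)
        tail = window[-keep:] if keep else ""


def range_length_matches(chunks, lead, term, min_len=0, max_len=5):
    """Return (start, end) pairs, end exclusive, for every lead...term
    span whose intermediate length lies in [min_len, max_len]."""
    chunks = list(chunks)
    lead_ends = [s + len(lead) for s in occurrences(chunks, lead)]
    term_starts = list(occurrences(chunks, term))
    return [
        (lead_end - len(lead), term_start + len(term))
        for lead_end in lead_ends
        for term_start in term_starts
        if min_len <= term_start - lead_end <= max_len
    ]


print(range_length_matches(["FFAB", "CCCC", "YZFF"], "AB", "YZ"))
# [(2, 10)]
```

The global offset of each hit is the number of characters already consumed plus the local offset, which mirrors how the terminal position is described as the local position added to the length of the partial character strings input so far.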

図６は、本発明の第１の実施形態に係る範囲長検出部５の構成図である。なお、カウンタは図示していない。 FIG. 6 is a configuration diagram of the range length detector 5 according to the first embodiment of the present invention. The counter is not shown.

図６を参照すると、範囲長検出部５は、未検出状態６１、検出中状態６２、半終端状態６３の３状態からなる状態遷移機械６０として構成される。 Referring to FIG. 6, the range length detection unit 5 is configured as a state transition machine 60 composed of three states of an undetected state 61, a detecting state 62, and a semi-terminal state 63.

未検出状態６１は、検出する文字列を検出していない状態であり、いわゆる初期状態に相当する。 The undetected state 61 is a state in which a character string to be detected is not detected, and corresponds to a so-called initial state.

検出中状態６２は、先端位置１１と先端一致情報１２、中間検出部３から出力された中間一致情報１３を元に、検証対象の文字列を検証している状態に対応する。 The detecting state 62 corresponds to a state in which the character string to be verified is verified based on the tip position 11 and the tip match information 12 and the intermediate match information 13 output from the intermediate detection unit 3.

半終端状態６３は、終端文字列２３が次に入力される部分文字列３１にまたがっている状態、すなわち、終端文字列２３である可能性はあるが次の時刻まで終端文字列２３であるか否かの判断がつかない場合に遷移する状態である。 The semi-terminal state 63 is a state where the terminal character string 23 straddles the partial character string 31 to be input next, that is, the terminal character string 23 may be the terminal character string 23 but is the terminal character string 23 until the next time. This is a transition state when it is not possible to determine whether or not.

However, the range length detection unit 5 according to the embodiment of the present invention may be any circuit that can calculate the range length from the information extracted by the tip detection unit 2, the intermediate detection unit 3, and the end detection unit 4, and is not limited to the configuration of FIG. 6.

The state transition conditions of the state transition machine 60 shown in FIG. 6 are now described in detail. In the following description, numerical ranges are expressed as “not less than ... and not more than ...”, but these may be replaced as appropriate with other common range expressions such as “greater than ...” or “less than ...”. In addition, in the following description of the state transition conditions, the tip position 11 and the end position 14 are defined as positions relative to the same reference, such as the left end or the right end of a character string.

The following description assumes that a general character string is input, and describes state transition conditions S1 to S9 as an embodiment.

The description is accompanied by an example in which the input character string 30 is “FFABCCCCYZFF” and the detected character string 20 is “ABC...CYZ”, consisting of the leading character string “AB”, the intermediate character string “C...C”, and the terminal character string “YZ”. This is only an example; the length of the partial character string 31 is chosen arbitrarily for each state transition condition and does not limit the embodiments of the present invention.

(State transition condition S1)
State transition condition S1 applies when, in the undetected state 61, the match information of the leading character string 21 indicates a match, the match information of the intermediate character string 22 indicates a match, and the match information of the terminal character string 23 indicates a match.

Under state transition condition S1, the machine transitions from the undetected state 61 back to the undetected state 61.

Here, since the difference between the end position 14 and the tip position 11 falls within the predetermined range, that is, not less than the specified minimum length and not more than the specified maximum length, the match length of the intermediate character string 22 is judged to match and a match result is output. The minimum length can be set to 0 or 1, and the maximum length to any value from 1 up to infinity.

For example, with “FFABCCCCYZFF” as the input, this condition corresponds to the case where “FFABCCCCYZFF” is input in its entirety as one partial character string.
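As a non-limiting illustration, the range check of condition S1 might be sketched in software as follows. The function names, the use of 0-based left-end indices for both positions, and the subtraction of the leading string length are assumptions made for this sketch; the embodiment only requires that both positions share the same reference. The sketch also does not verify the characters between the two strings, which is the role of the intermediate detection unit 3.

```python
def intermediate_length(tip_pos: int, end_pos: int, head_len: int) -> int:
    # Assumption: both positions are 0-based indices of the left end of the
    # leading and terminal strings within the same partial character string.
    return end_pos - (tip_pos + head_len)


def s1_match(chunk: str, head: str, tail: str, min_len: int, max_len: int) -> bool:
    """Return True when head and tail both occur in `chunk` and the span
    between them has a length within [min_len, max_len]."""
    tip_pos = chunk.find(head)
    if tip_pos < 0:
        return False  # leading string not present
    end_pos = chunk.find(tail, tip_pos + len(head))
    if end_pos < 0:
        return False  # terminal string not present after the leading string
    length = intermediate_length(tip_pos, end_pos, len(head))
    return min_len <= length <= max_len


# "AB" starts at index 2 and "YZ" at index 8, so the span length is 4.
result = s1_match("FFABCCCCYZFF", "AB", "YZ", 0, 5)
```

With a maximum length of 5 the example input satisfies the range; with a maximum length of 3 it would not.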

(State transition condition S2)
State transition condition S2 applies when, in the undetected state 61, the match information of the leading character string 21 indicates a match, the match information of the intermediate character string 22 indicates a match, and the match information of the terminal character string 23 indicates a mismatch.

Under state transition condition S2, the machine transitions from the undetected state 61 to the detecting state 62.

Here, the match length of the intermediate character string 22, calculated from the partial character string length and the tip position 11, is assigned to the counter. That is, the leading character string 21 has been detected and the intermediate character string is still within the predetermined range, so the machine waits for the terminal character string 23 to be detected.

For example, with “FFABCCCCYZFF” as the input, this condition corresponds to the case where “FFAB” is input. When “FFAB” is input, the match length of the intermediate character string is 0, so 0 is assigned to the counter.

(State transition condition S3)
State transition condition S3 applies when, in the detecting state 62, the match information of the leading character string 21 indicates a mismatch, the match information of the intermediate character string 22 indicates a match, and the match information of the terminal character string 23 indicates a mismatch.

Under state transition condition S3, the machine transitions from the detecting state 62 back to the detecting state 62.

Here, the partial character string length is added to the counter. Under this condition, the match with the leading character string 21 has already been detected and the terminal character string 23 has not yet been detected, so the detecting state 62 is maintained for as long as the intermediate character string 22 stays within the predetermined range.

For example, with “FFABCCCCYZFF” as the input, this condition corresponds to the case where “CCCC” is input. When “CCCC”, which belongs to the intermediate character string 22, is input, 4 is added to the counter.

(State transition condition S4)
State transition condition S4 applies when, in the detecting state 62, the match information of the leading character string 21 indicates a mismatch, the match information of the intermediate character string 22 indicates a match, and the match information of the terminal character string 23 indicates a match.

Under state transition condition S4, the machine transitions from the detecting state 62 to the undetected state 61.

Here, the match length of the intermediate character string 22, calculated from the partial character string length and the end position 14, is added to the counter. If the resulting match length of the intermediate character string is within the predetermined range, that is, not less than the specified minimum length and not more than the specified maximum length, then the leading character string 21, the match length of the intermediate character string 22, and the terminal character string 23 all match, and a match result is output. The minimum length can be set to 0 or 1, and the maximum length to any value from 1 up to infinity.

For example, with “FFABCCCCYZFF” as the input, this condition corresponds to the case where “YZFF” is input. When “YZFF” is input, the match length of the intermediate character string 22 within that partial character string is 0, so 0 is added to the counter. Combined with the result of state transition condition S3, the match length is 4. Thus, if the specified maximum length is 5, the input character string 30 is judged to match the detected character string 20, and the character string detection circuit 1 outputs a match result.
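The counter arithmetic across conditions S2, S3, and S4 can be traced with a short sketch for the partial character strings “FFAB”, “CCCC”, and “YZFF”. The helper names and the 0-based left-end position convention are assumptions introduced only for illustration, and both helpers assume the searched string is present in the chunk.

```python
def after_head(chunk: str, head: str) -> int:
    # S2: number of characters that follow the leading string in this chunk
    # (chunk length minus the position just past the leading string).
    tip_pos = chunk.find(head)
    return len(chunk) - (tip_pos + len(head))


def before_tail(chunk: str, tail: str) -> int:
    # S4: number of characters that precede the terminal string in this chunk.
    return chunk.find(tail)


MIN_LEN, MAX_LEN = 0, 5  # example range; the maximum of 5 follows the text

counter = after_head("FFAB", "AB")      # S2: assign 0
counter += len("CCCC")                  # S3: add the whole chunk length, 4
counter += before_tail("YZFF", "YZ")    # S4: add 0
matched = MIN_LEN <= counter <= MAX_LEN
```

The counter ends at 4, which lies within the range 0 to 5, so a match result would be output in this trace.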

(State transition condition S5)
State transition condition S5 applies when, in the detecting state 62, the match information of the leading character string 21 indicates a mismatch, the match information of the intermediate character string 22 indicates a match, and the match information of the terminal character string 23 indicates that a match is in progress.

Under state transition condition S5, the match length of the intermediate character string 22, calculated from the partial character string length and the end position 14, is added to the counter. If the resulting match length of the intermediate character string 22 is not less than the specified minimum length and not more than the specified maximum length, the machine transitions from the detecting state 62 to the half-terminal state 63. That is, the leading character string 21 has already been detected, the match length of the intermediate character string 22 has been judged to match, and the terminal character string 23 has been detected partway.

For example, with “FFABCCCCYZFF” as the input, this condition corresponds to the case where the leading character string “AB” has already been detected and “CCCY” is then input. When “CCCY” is input, the terminal character string “YZ” has been detected only partway.
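One way to picture the “match in progress” indication is a check for whether a partial character string ends with a proper prefix of the terminal character string. The following is a minimal software sketch under that assumption; the function name, the string labels, and the returned 0-based position are hypothetical and are not taken from the embodiment.

```python
from typing import Tuple


def tail_status(chunk: str, tail: str) -> Tuple[str, int]:
    """Classify how the terminal string relates to this chunk.

    Returns ("match", pos) if the whole terminal string occurs in the chunk,
    ("partial", pos) if the chunk ends with a non-empty proper prefix of it,
    and ("mismatch", -1) otherwise. `pos` is a 0-based index (assumption).
    """
    pos = chunk.find(tail)
    if pos >= 0:
        return "match", pos
    # Try the longest proper prefix first, e.g. "Y" for the tail "YZ".
    for k in range(len(tail) - 1, 0, -1):
        if chunk.endswith(tail[:k]):
            return "partial", len(chunk) - k
    return "mismatch", -1


status = tail_status("CCCY", "YZ")
```

For “CCCY” the terminal string “YZ” is only partly present, starting at index 3, so the three characters before it would count toward the intermediate match length.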

(State transition condition S6)
State transition condition S6 applies when, in the undetected state 61, the match information of the leading character string 21 indicates a match, the match information of the intermediate character string 22 indicates a match, and the match information of the terminal character string indicates that a match is in progress.

Under state transition condition S6, the match length of the intermediate character string 22, calculated from the partial character string length and the end position 14, is added to the counter. If the result is not less than the specified minimum length and not more than the specified maximum length, the machine transitions from the undetected state 61 to the half-terminal state 63.

For example, with “FFABCCCCYZFF” as the input, this condition corresponds to the case where “ABCCCY” is input. When “FFABCCCY” is input, the terminal character string 23 “YZ” has been detected only partway.

(State transition condition S7)
State transition condition S7 applies when, in the half-terminal state 63, the match information of the leading character string 21 indicates a mismatch, the match information of the intermediate character string 22 indicates a match, and the match information of the terminal character string 23 indicates a match.

Under state transition condition S7, the machine transitions from the half-terminal state 63 to the undetected state 61.

In this case, the leading character string 21 has already been detected, the match length of the intermediate character string 22 is within the predetermined range, and the terminal character string 23 has now matched as well, so a match result is output unconditionally.

For example, with “FFABCCCCYZFF” as the input, this condition corresponds to the case where the leading character string “AB” has already been detected and “CYZFF” is then input. When “CYZFF” is input, both the intermediate character string 22 and the terminal character string 23 are judged to match, so the detected character string 20 has been detected.

(State transition condition S8)
State transition condition S8 applies when, in the detecting state 62, none of state transition conditions S3, S4, and S5 is satisfied.

Under state transition condition S8, the machine transitions from the detecting state 62 to the undetected state 61.

That is, this is the condition for returning to the undetected state 61 when, although the leading character string 21 had been detected, the intermediate character string 22 fell outside the match length range or the terminal character string 23 failed to satisfy the match condition.

For example, when the detected character string 20 is “ABC...CYZ” and the match length of the intermediate character string 22 is set to 5, state transition condition S8 can arise when inputting “FFABCCCCCCYZFF”, “FFABCCCCYKFF”, or the like.

(State transition condition S9)
State transition condition S9 applies when, in the half-terminal state 63, state transition condition S7 is not satisfied.

Under state transition condition S9, the machine transitions from the half-terminal state 63 to the undetected state 61.

That is, this is the condition when the leading character string 21 has been detected, the match condition of the intermediate character string 22 has been satisfied, and part of the terminal character string 23 has matched, but a partial character string 31 is then input for which the match information of the intermediate character string 22 or of the terminal character string 23 indicates a mismatch.

For example, when the detected character string 20 is “ABC...CYZ”, state transition condition S9 can arise when the input character string 30 “FFABCCCCYKFF” is input as partial character strings 31 such as “FFABC”, “CCCY”, and “KFF”.

Thus, according to the first embodiment of the present invention, an intermediate character string 22 of the specified range length can be detected in the input character string from position information and match information, namely the tip position 11 and tip match information 12, the end position 14 and end match information 15, and the intermediate match information 13.

This concludes the description of the state transition conditions of the range length detection unit 5 according to the first embodiment of the present invention. The state transition conditions of the range length detection unit 5 according to the first embodiment are not limited to those described above, and various modifications may be made to each state transition condition.
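For readers who prefer code to prose, the nine transition conditions can be summarized by a small software model. This is only a sketch under stated assumptions, not the circuit of FIG. 6: the class and enum names are hypothetical, each call to `step` receives the already extracted match information for one partial character string, and `mid_len` (the intermediate match length contributed by that partial character string) is assumed to have been computed beforehand from the positions. Treating an exceeded maximum during S3 as a return to the undetected state is this sketch's reading of S8.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    UNDETECTED = 61
    DETECTING = 62
    HALF_TERMINAL = 63


class Tail(Enum):
    MISMATCH = "mismatch"
    PARTIAL = "partial"  # terminal string only partly seen so far
    MATCH = "match"


@dataclass
class RangeLengthMachine:
    min_len: int
    max_len: int
    state: State = State.UNDETECTED
    counter: int = 0

    def _in_range(self) -> bool:
        return self.min_len <= self.counter <= self.max_len

    def _reset(self) -> None:
        self.state, self.counter = State.UNDETECTED, 0

    def step(self, head: bool, mid: bool, tail: Tail, mid_len: int) -> bool:
        """Consume one chunk summary; return True when a match is output."""
        if self.state is State.UNDETECTED:
            if not (head and mid):
                return False                     # nothing has started
            self.counter = mid_len
            if tail is Tail.MATCH:               # S1: whole pattern in one chunk
                hit = self._in_range()
                self._reset()
                return hit
            if tail is Tail.PARTIAL:             # S6
                if self._in_range():
                    self.state = State.HALF_TERMINAL
                else:
                    self._reset()
                return False
            self.state = State.DETECTING         # S2
            return False

        if self.state is State.DETECTING:
            if head or not mid:                  # S8: none of S3, S4, S5 holds
                self._reset()
                return False
            self.counter += mid_len
            if tail is Tail.MISMATCH:            # S3: keep counting
                if self.counter > self.max_len:  # range exceeded, treated as S8
                    self._reset()
                return False
            if tail is Tail.PARTIAL and self._in_range():  # S5
                self.state = State.HALF_TERMINAL
                return False
            hit = tail is Tail.MATCH and self._in_range()  # S4, otherwise S8
            self._reset()
            return hit

        # HALF_TERMINAL: S7 outputs a match, anything else is S9.
        hit = (not head) and mid and tail is Tail.MATCH
        self._reset()
        return hit


# Trace for "FFAB", "CCCC", "YZFF" with a range of 0 to 5.
machine = RangeLengthMachine(min_len=0, max_len=5)
outputs = [
    machine.step(True, True, Tail.MISMATCH, 0),   # S2
    machine.step(False, True, Tail.MISMATCH, 4),  # S3
    machine.step(False, True, Tail.MATCH, 0),     # S4
]
```

In this trace only the last step reports a match, after which the model is back in the undetected state with its counter cleared.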

As described above, with the character string detection circuit 1 according to the first embodiment of the present invention, focusing on the leading character string 21 and the terminal character string 23 enables match-length-aware character string detection for a wide range of patterns, not only repeated character strings.

Moreover, the state transition machine 60 does not receive the partial character string 31 directly; it is driven by the information extracted by the tip detection unit 2, the intermediate detection unit 3, and the end detection unit 4. Because multi-byte input is exploited in this way, very high-speed processing is possible.

Thus, the character string detection circuit 1 according to the first embodiment of the present invention extracts the important information first and only then applies the state transition machine, which makes very high-speed processing possible.

(Second Embodiment)
FIG. 7 shows a character string detection circuit 70 according to the second embodiment of the present invention.

The character string detection circuit 70 according to the second embodiment includes a tip detection unit 72, an intermediate detection unit 73, an end detection unit 74, a range length detection unit 75, and a character string processing unit 71. That is, the character string detection circuit 70 is the character string detection circuit 1 of the first embodiment with the character string processing unit 71 added. Since the outputs of the respective detection units are the same as in the first embodiment, the flow of the outputs is indicated by arrows only.

The character string processing unit 71 divides the input character string 30 into partial character strings 31 of a predetermined length. The length of the divided character strings can be set arbitrarily.

The character string processing unit 71 can divide the input character string 30 into equal parts, into lengths based on the length of the leading character string 21 or the terminal character string 23, or into lengths based on the intermediate character string 22.

For example, when the input character string 30 is “ABCCCCCCCCYZ”, dividing it into three equal parts gives “ABCC”, “CCCC”, and “CCYZ”. In addition, when it is known that the leading character string 21 or the terminal character string 23 contains a code, the partial character strings may be set in consideration of the length of that code.

A value may also be set such that the partial character string 31 is longer than the detected character string 20, or, conversely, the length of the partial character string 31 may be set to vary.

For example, the length of the partial character string 31 can be made gradually shorter or gradually longer as successive partial character strings 31 are input. When the input character string 30 is “ABCCCCCCCCYZ”, the partial character strings 31 can be “ABCCC”, “CCCC”, and “CYZ”, or “ABC”, “CCCC”, and “CCCYZ”.

The length of the partial character string 31 may also be set at random, for example as “AB”, “CCCCCC”, and “YZ”. The methods of dividing the input character string 30 into partial character strings 31 are not limited to those given here, and can be set arbitrarily according to the processing capability of each detection unit and the performance and settings of the devices at the external input source and output destination.

However, the division of the input character string 30 by the character string processing unit 71 is not limited to the examples given here.
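The equal and variable-length divisions in the examples above can be reproduced with a short sketch. The function names are hypothetical, and the choice of ceiling division for the equal split (so that only the last partial character string can be shorter) is an assumption of this sketch.

```python
from typing import List


def split_equal(text: str, parts: int) -> List[str]:
    # Chunk size is the ceiling of len(text) / parts; the last chunk may be
    # shorter, and very short inputs may yield fewer than `parts` chunks.
    size = -(-len(text) // parts)
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_by_lengths(text: str, lengths: List[int]) -> List[str]:
    # Cut the text into consecutive chunks of the given lengths.
    if sum(lengths) != len(text):
        raise ValueError("lengths must add up to the length of the text")
    chunks, pos = [], 0
    for n in lengths:
        chunks.append(text[pos:pos + n])
        pos += n
    return chunks


thirds = split_equal("ABCCCCCCCCYZ", 3)
shrinking = split_by_lengths("ABCCCCCCCCYZ", [5, 4, 3])
growing = split_by_lengths("ABCCCCCCCCYZ", [3, 4, 5])
```

Here `thirds` corresponds to the three-way equal division, while `shrinking` and `growing` correspond to the gradually shorter and gradually longer divisions described above.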

FIG. 8 is a diagram illustrating an example of the configuration of the character string processing unit 71.

The character string processing unit 71 shown in FIG. 8 includes a character string dividing unit 81, a storage unit 82, a timing unit 83, a selection unit 84, and an output unit 85.

The character string dividing unit 81 divides the input character string 30 into predetermined lengths to create the partial character strings 31. The length into which the input character string 30 is divided can be set arbitrarily. For example, the character string dividing unit 81 may divide the input character string 30 into equal parts, into lengths set according to the length of the detected character string 20, or into random lengths. Furthermore, when the length of the code portion of the input character string 30 is known in advance, only the code portion may be extracted and the remaining data portion divided into arbitrary lengths.

The storage unit 82 stores the partial character strings 31 divided into the predetermined lengths.

The timing unit 83 sends a signal to the selection unit 84 at the timing at which a partial character string 31 is to be fetched from the storage unit 82. The timing unit 83 can send the signal at a preset timing. If the timing unit 83 is synchronized with the counter inside the range length detection unit 75, the signal can also be sent in step with the processing of the range length detection unit 75. When synchronizing with a counter provided in an external arithmetic processing device, processing may be performed in step with the timing of that counter.

The selection unit 84 selects a partial character string 31 from the storage unit 82 according to the timing of the signal from the timing unit 83, and sends it to the output unit 85.

The output unit 85 outputs the partial character string 31 sent from the selection unit 84. The selection unit 84 and the output unit 85 may be implemented as a single component.

This concludes the brief description of the configuration and operation of the character string processing unit 71. The character string processing unit 71 is not limited to the configuration and operation described above, provided that it divides the input character string 30.
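The cooperation of the dividing, storage, timing, selection, and output functions can be pictured with a minimal software analogue. Everything here is an assumption made for illustration: the class name, the fixed chunk length, the first-in first-out queue standing in for the storage unit 82, and the sequence of Boolean ticks standing in for the signal from the timing unit 83.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


class StringProcessor:
    """Hypothetical software analogue of the character string processing unit 71."""

    def __init__(self, chunk_len: int) -> None:
        self.chunk_len = chunk_len
        self.storage: Deque[str] = deque()  # plays the role of storage unit 82

    def load(self, text: str) -> None:
        # Role of dividing unit 81: cut the input into fixed-length chunks.
        for i in range(0, len(text), self.chunk_len):
            self.storage.append(text[i:i + self.chunk_len])

    def run(self, ticks: Iterable[bool]) -> Iterator[str]:
        # Each True tick models a timing signal (timing unit 83); the oldest
        # stored chunk is then selected and emitted (units 84 and 85).
        for tick in ticks:
            if tick and self.storage:
                yield self.storage.popleft()


processor = StringProcessor(chunk_len=4)
processor.load("ABCCCCCCCCYZ")
emitted = list(processor.run([True, False, True, True, True]))
```

A chunk is emitted only on a tick while chunks remain, so the idle tick and the final tick after the queue is empty produce nothing.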

As described above, with the character string detection circuit 70 according to the second embodiment, the input character string 30 can be input as it is, without first being divided into partial character strings 31 before it reaches the character string detection circuit 70. In addition, the partial character strings 31 can be sent to the range length detection unit 75 at the timing of the internal timing unit 83, without the input timing of the partial character strings 31 having to be set externally. Thus, in the character string detection circuit 70 of the second embodiment, the input character string 30 is divided into lengths that the range length detection unit 75 can process easily, which allows even faster processing.

The present invention has been described above with reference to the embodiments, but it is not limited to the configurations and operations of those embodiments, and naturally includes the various variations and modifications that a person skilled in the art could make within the scope of the present invention.

This application claims priority based on Japanese Patent Application No. 2012-199015, filed on September 11, 2012, the entire disclosure of which is incorporated herein.

DESCRIPTION OF SYMBOLS
1 Character string detection circuit
2 Tip detection unit
3 Intermediate detection unit
4 End detection unit
5 Range length detection unit
20 Detected character string
21 Leading character string
22 Intermediate character string
23 Terminal character string
30 Input character string
31 Partial character string
40 Input character string
41, 42 Partial character string
61 Undetected state
62 Detecting state
63 Half-terminal state
70 Character string detection circuit
71 Character string processing unit
72 Tip detection unit
73 Intermediate detection unit
74 End detection unit
75 Range length detection unit
81 Character string dividing unit
82 Storage unit
83 Timing unit
84 Selection unit
85 Output unit
90 Character string detection circuit
91 State transition machine
92 Counter
93 Comparator
94 Non-match detection circuit
95 Output circuit

Claims

1. A character string detection circuit comprising:
match detection means including leading edge detection means for detecting, from an input character string, position information and match information of a leading character string that includes the leading end of a detected character string, end detection means for detecting, from the input character string, position information and match information of a terminal character string that includes the terminal end of the detected character string, and intermediate detection means for detecting, from the input character string, match information of an intermediate character string sandwiched between the leading character string and the terminal character string in the detected character string; and
range length detection means for detecting the detected character string from the input character string, taking a match length into account, by using the position information and the match information.

2. The character string detection circuit according to claim 1, wherein the input character string is input to each of the leading edge detection means, the intermediate detection means, and the end detection means at the same timing.

3. The character string detection circuit according to claim 1, wherein
the input character string is divided into a plurality of partial character strings, and
the plurality of partial character strings are input to the match detection means at different timings.

4. The character string detection circuit according to claim 3, wherein the range length detection means processes the plurality of partial character strings in the order in which they are input.

5. The character string detection circuit according to claim 1, wherein
the range length detection means includes a state transition machine that detects a match taking into account the match length of the intermediate character string together with the matches of the leading character string and the terminal character string, and
the state transition machine includes a counter that receives the position information and the match information and accumulates the match length of the intermediate character string.

6. The character string detection circuit according to claim 5, wherein the state transition machine has:
an undetected state in which the leading character string has not been detected;
a detecting state in which the detected character string is being detected based on the position information and match information of the leading character string and the match information of the intermediate character string; and
a half-terminal state in which the terminal character string has been partially detected.

7. The character string detection circuit according to claim 5, wherein, in the detecting state, the range length detection means:
upon partially detecting the terminal character string, adds to the counter the match length of the intermediate character string calculated from the length of the partial character string and the end position;
transitions to the half-terminal state when the counter value of the counter is within a predetermined range; and
transitions to the undetected state when the counter value of the counter is not within the predetermined range.

8. The character string detection circuit according to claim 6, wherein, in the half-terminal state, the range length detection means:
when the terminal character string is detected and the counter value of the counter is within a predetermined range, outputs a match result and transitions to the undetected state; and
when the counter value of the counter is not within the predetermined range, or when the terminal character string is not detected, transitions to the undetected state without outputting a match result.

9. The character string detection circuit according to claim 1, further comprising character string processing means for dividing the input character string to create the partial character strings.

10. A character string detection method comprising:
detecting, from an input character string, position information and match information of a leading character string that includes the leading end of a detected character string, position information and match information of a terminal character string that includes the terminal end of the detected character string, and match information of an intermediate character string sandwiched between the leading character string and the terminal character string in the detected character string; and
detecting the detected character string from the input character string, taking a match length into account, by using the position information and the match information.