WO2016042762A1 - Information-generating device, information-generating method, and recording medium - Google Patents

Information-generating device, information-generating method, and recording medium

Info

Publication number
WO2016042762A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
information
words
concept
associative
Prior art date
Application number
PCT/JP2015/004707
Other languages
French (fr)
Japanese (ja)
Inventor
健太郎 園田
かや人 関谷
由也 木津
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to JP2016548560A (patent JP6436171B2)
Publication of WO2016042762A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present invention relates to an information generation device, an information generation method, and a recording medium.
  • Patent Document 1 describes a method of identifying the outflow source when personal information or the like leaks to the outside.
  • in this method, dummy search data is mixed into the search results of a database and associated with the identification information of the search requester, so that a search requester who leaks customer data including this dummy data can be identified.
  • Patent Document 2 describes an information processing apparatus that displays dummy data on a display unit when a condition for releasing the security lock is not satisfied.
  • Patent Document 3 discloses a method of dividing each of two words into two parts and connecting the first half of one word to the second half of the other word.
  • Patent Document 4 discloses a method of assigning priorities to a plurality of terms related to a certain term.
  • Patent Document 5 describes a method for increasing or decreasing the value of sequential number phrases indicating the order of numbers, English letters, symbols, and the like.
  • the dummy information is preferably information that is difficult for an attacker to identify as dummy information. That is, it is preferable that the dummy information is difficult to distinguish from regular information, does not look out of place, and is likely to be used by an attacker.
  • regular information indicates information that already exists (is used).
  • dummy data is generated by using a synonym or by defining a word considered as an address name or the like and combining it appropriately.
  • dummy data is generated by replacing a part of information in data usable as dummy data.
  • the generated character string becomes a simple character enumeration that does not make sense.
  • Such dummy data is also likely to be determined as a dummy by an attacker.
  • when the dummy information is a static list prepared in advance, it may be easily identified as dummy information, depending on the ICT (Information and Communication Technology) system to which it is applied. Further, if the regular information of the target ICT system is organized and dummy information suited to that system is generated manually, generating the dummy information takes an enormous amount of time as the amount of regular information and the number of target ICT systems increase.
  • the present invention has been made in view of the above-described problems, and an object of the present invention is to provide a technique for generating dummy information that is more difficult for an attacker to identify as dummy information.
  • An information generation apparatus includes: an analysis unit that decomposes a character string included in component information relating to a component of a system into words; an associative word determining means that determines, for a word that is included in concept information among the decomposed words, an associative word of that word based on the concept information; and a synthesizing unit that generates dummy information including a character string different from the character string included in the component information, by combining the associative word with a word that, in the character string before decomposition, precedes or follows the word used to determine the associative word, or with an associative word of that preceding or following word.
  • An information generation method includes: decomposing a character string included in component information relating to a component of a system into words; determining, for a word that is included in concept information among the decomposed words, an associative word of that word based on the concept information; and generating dummy information including a character string different from the character string included in the component information, by combining the associative word with a word that, in the character string before decomposition, precedes or follows the word used to determine the associative word, or with an associative word of that preceding or following word.
  • FIG. 1 is a functional block diagram showing an example of a functional configuration of the information generation apparatus 10 according to the first embodiment of the present invention.
  • the information generation apparatus 10 shown in FIG. 1 shows a configuration unique to the present invention; needless to say, it may include members that are not shown in FIG. 1.
  • the directions of the arrows in the drawings show an example and do not limit the direction of signals between the blocks.
  • the information generation apparatus 10 includes an analysis unit 101, an associative word determination unit 102, and a synthesis unit 103.
  • the analysis unit 101 analyzes the character string included in the component information related to the component such as a server included in the system, and decomposes it into one or more words.
  • the component information includes, for example, a host name indicating each component, a user account used to access each component, a file name indicating an information resource stored in each component, and a location of the information resource Information such as URI (Uniform Resource Identifier) is included, but the component information of the present embodiment is not limited to this.
  • the component information may include, for example, the email address of the user who uses each component. Since this component element information is information that is actually used in the system, it is also called regular information in this system.
  • the analysis unit 101 analyzes each character string and decomposes it into one or more words. Then, the analysis unit 101 outputs the decomposed words to the associative word determination unit 102 and the synthesis unit 103.
  • the associative word determination unit 102 receives the decomposed word from the analysis unit 101. Then, the associative word determination unit 102 confirms whether or not the received word is included in the concept information, and determines an associative word of the word based on the concept information for the word included in the concept information. The associative word determination unit 102 outputs the determined associative word to the synthesis unit 103.
  • the synthesizing unit 103 receives the decomposed word from the analyzing unit 101.
  • the synthesizing unit 103 receives an associative word from the associative word determining unit 102.
  • the synthesizing unit 103 combines the associative word with a word that, in the character string before decomposition, precedes or follows the word used to determine the associative word, or with an associative word of that preceding or following word.
  • the composition unit 103 generates a character string different from the character string included in the component element information.
  • for example, assume that the component information includes the character strings “spring01” and “fall02”. The analysis unit 101 decomposes “spring01” into “spring” and “01”. Also, the analysis unit 101 decomposes “fall02” into “fall” and “02”.
  • the associative word determination unit 102 checks whether or not “spring”, “fall”, “01”, and “02” are included in the concept information. Then, when “spring” and “fall” are included in the concept information, the associative word determination unit 102 determines associative words of “spring” and “fall” (for example, “winter” and “autumn”) based on the concept information.
  • the synthesizing unit 103 combines (1) “winter” and (2) a word that, in the character string before decomposition, follows the word used to determine “winter”.
  • the number of words used to determine this associative word may be one or plural. For example, when “spring” and “fall” share the superordinate concept “season”, the associative word determination unit 102 determines an associative word “winter” for the two words.
  • in (2) above, the character strings before decomposition that contain the words used to determine “winter” are “spring01” and “fall02”.
  • the words following the word used to determine “winter” are “01” and “02”. The same is true for “autumn”.
  • the synthesis unit 103 generates “winter01”, “winter02”, “autumn01”, and “autumn02”. These character strings are different from the character strings “spring01” and “fall02” included in the component information.
  • the synthesis unit 103 can generate dummy information including the generated character string.
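The first-embodiment example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `CONCEPTS` table, the regular-expression tokenizer standing in for language analysis, and all function names are hypothetical. The sketch returns every sibling under the shared concept rather than choosing “winter” and “autumn” specifically.

```python
import re
from itertools import product

# Hypothetical concept information: superordinate concept -> subordinate words.
CONCEPTS = {"season": ["spring", "summer", "fall", "autumn", "winter"]}

def decompose(s):
    """Split a string into alphabetic and numeric runs (stand-in for language analysis)."""
    return re.findall(r"[A-Za-z]+|\d+", s)

def associative_words(words, concepts=CONCEPTS):
    """Words sharing a superordinate concept with `words`, excluding `words` themselves."""
    result = []
    for members in concepts.values():
        if any(w in members for w in words):
            for m in members:
                if m not in words and m not in result:
                    result.append(m)
    return result

def generate_dummies(regular):
    heads, tails = [], []
    for s in regular:
        parts = decompose(s)
        if not parts:
            continue
        heads.append(parts[0])   # word used to determine associative words
        tails.extend(parts[1:])  # words that followed it in the original string
    return [a + t for a, t in product(associative_words(heads), tails)]

dummies = generate_dummies(["spring01", "fall02"])
print(dummies)
```

Because the associative words exclude the original words, none of the generated strings can coincide with “spring01” or “fall02” in this small example.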
  • the information generation apparatus 10 can generate dummy information that is more difficult for an attacker to identify as dummy information. That is, the information generation apparatus 10 according to the present embodiment can automatically generate dummy information that an attacker will use without noticing that it is a dummy.
  • a system that uses dummy information generated in this way can detect attackers more reliably.
  • FIG. 2 is a diagram illustrating an example of the configuration of the information generation system 1 according to the present embodiment.
  • the information generation system 1 according to the present embodiment includes an information generation device 100 and an in-company system 300.
  • the information generation apparatus 100 and the in-company system 300 are connected via a network 200 so that they can communicate with each other.
  • the in-company system 300 is a system using ICT (Information and Communication Technology).
  • in the present embodiment, a system built within a company is described as an example of an ICT system, but the ICT system of the present embodiment is not limited to this.
  • the ICT system only needs to be an environment where services can be used via a network or the like.
  • the in-company system 300 includes various devices such as a server, a client, and a storage. Hereinafter, these are referred to as components of the in-company system 300.
  • the information generation device 100 is a device that generates dummy information.
  • the configuration of the information generation apparatus 100 will be described with reference to a separate drawing.
  • FIG. 3 is a functional block diagram illustrating an example of a functional configuration of the information generation apparatus 100 according to the present embodiment.
  • the information generation apparatus 100 according to the present embodiment has a configuration in which the information generation apparatus 10 described in the first embodiment includes a sequential number generation unit 104, a collection unit 110, and a storage unit 120.
  • that is, the information generation apparatus 100 includes an analysis unit 101, an associative word determination unit 102, a synthesis unit 103, a sequential number generation unit 104, a collection unit 110, and a storage unit 120.
  • the collection unit 110 is means for collecting component information regarding each component of the in-company system 300 via the network 200.
  • the collection unit 110 collects component information from, for example, a directory service. Then, the collection unit 110 outputs the collected component information (hereinafter referred to as collected data) to the analysis unit 101. Note that the collection unit 110 may store the collected component information in, for example, the storage unit 120 described later or in the collection unit 110. The collection unit 110 may be configured to collect only data of a specific type (for example, host names) among the component information regarding each component of the in-company system 300.
  • the analysis unit 101 receives the collected data from the collection unit 110. Similarly to the analysis unit 101 in the first embodiment, the analysis unit 101 performs language analysis on one or more character strings included in the collected data (component information) and decomposes each character string into one or more words. In the present embodiment, the analysis unit 101 uses morphological analysis as language analysis, but the language analysis method of the present embodiment is not limited to this. The analysis unit 101 may perform language analysis using other language analysis methods.
  • the analysis unit 101 outputs the one or more words decomposed from each character string (referred to as decomposed data) to the associative word determination unit 102, the synthesis unit 103, and the sequential number generation unit 104. At this time, the analysis unit 101 assigns an attribute to each of the decomposed words.
  • This attribute includes information indicating the position of the word in the character string before decomposition (the character string included in the collected data). For example, when the analysis unit 101 decomposes the character string “AAA01” into “AAA” and “01”, the attribute of “AAA” contains information indicating that the word is the first part of the character string before decomposition. The information included in this attribute is not limited to this.
  • the attribute may include, for example, information that there is a space after the decomposed word.
  • the attribute may include information indicating a character string before decomposition, for example.
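One way to picture these word attributes is as a small record attached to each decomposed word. The following sketch is illustrative only: the `Word` dataclass, its field names, the position labels, and the regular-expression tokenizer (used here in place of morphological analysis) are all assumptions.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    text: str          # the decomposed word
    position: str      # "first", "middle", "last", or "whole" (not decomposed)
    space_after: bool  # True if a space followed the word in the source string
    source: str        # the character string before decomposition

def analyze(s):
    """Decompose `s` into Word records carrying positional attributes."""
    matches = list(re.finditer(r"[A-Za-z]+|\d+", s))
    words = []
    for i, m in enumerate(matches):
        if len(matches) == 1:
            position = "whole"
        elif i == 0:
            position = "first"
        elif i == len(matches) - 1:
            position = "last"
        else:
            position = "middle"
        space_after = s[m.end():m.end() + 1] == " "
        words.append(Word(m.group(), position, space_after, s))
    return words

for w in analyze("AAA01"):
    print(w)
```

Keeping the source string in each record makes it possible to look up, later on, which words originally appeared next to each other.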
  • the analysis unit 101 may perform analysis processing by extracting data of the analysis target type from the collected data.
  • the storage unit 120 stores conceptual information prepared in advance.
  • Concept information is information including a concept dictionary, which is a dictionary for defining the concept of words.
  • the concept information is not limited to this, and may include, for example, antonyms, synonyms, and the like of a certain word.
  • the conceptual information may be changed as appropriate according to the components in the in-company system 300.
  • the configuration in which the storage unit 120 is built in the information generating apparatus 100 will be described as an example, but the configuration related to the storage unit 120 is not limited to this.
  • the storage unit 120 may be realized by a storage device that is separate from the information generation device 100.
  • the associative word determination unit 102 determines each associative word of one or more words decomposed by the analysis unit 101 based on the concept information, similarly to the associative word determination unit 102 of the first embodiment. Specifically, the associative word determination unit 102 receives the decomposed data decomposed by the language analysis from the analysis unit 101. Then, the associative word determination unit 102 refers to the concept information stored in the storage unit 120 and confirms whether one or more words included in the decomposition data are included in the concept information. The associative word determination unit 102 determines another word having the same superordinate concept as that of the word included in the concept information as an associative word for the word.
  • the associative word determination unit 102 assigns to the determined associative word, as its attribute, the attribute of the word used to determine it (a word included in the decomposed data output from the analysis unit 101).
  • for example, when the attribute of the word used to determine the associative word (referred to as the original word) includes information indicating the position in the character string before decomposition, the associative word determining unit 102 includes that position information in the attribute of the associative word.
  • the attribute of the associative word may include information indicating the original word.
  • the associative word determination unit 102 outputs the determined associative word to the synthesis unit 103.
  • the sequential number generation unit 104 receives the decomposed data from the analysis unit 101. Then, the sequential number generation unit 104 identifies, among the received words, a word from which serial numbers or consecutive phrases (hereinafter referred to as sequential number phrases) can be generated.
  • a word from which a sequential number phrase can be generated is, for example, one of a series of consecutive numbers or letters of the alphabet.
  • such a word is not limited to numbers and letters; for example, it may belong to a series of phrases such as “(alpha), (beta), (gamma), ...”. That is, a word from which sequential number phrases can be generated may be any word included in a predetermined array.
  • the sequential number generation unit 104 generates sequential number phrases for the identified word. That is, the sequential number generation unit 104 extracts, as sequential number phrases, words that are included in the predetermined array containing the identified word and that differ from the identified word.
  • the sequential number generation unit 104 assigns to each generated sequential number phrase, as its attribute, the attribute of the word from which the phrase was generated.
  • the sequential number generation unit 104 includes information indicating the position in the character string before decomposition in the attribute of the sequential number phrase.
  • the attribute of the sequential number phrase may include information indicating the original word.
  • the sequential number generation unit 104 outputs the generated sequential number phrases to the synthesis unit 103.
  • the synthesizing unit 103 receives the decomposed data from the analyzing unit 101.
  • the synthesizing unit 103 receives an associative word from the associative word determining unit 102.
  • the synthesizing unit 103 also receives sequential number phrases from the sequential number generating unit 104.
  • the synthesizing unit 103 combines (synthesizes) the associative word with a word that, in the character string before decomposition, precedes or follows the word used to determine the associative word, or with an associative word of that preceding or following word.
  • a specific example of the combining method of the combining unit 103 will be described with reference to a separate drawing.
  • FIG. 4 is a flowchart illustrating an example of a processing flow of the information generation apparatus 100 according to the present embodiment.
  • FIGS. 5 to 7 are diagrams for explaining the operation of the information generating apparatus 100 according to the present embodiment. In the following, a case where the component information collected by the collection unit 110 is host names is described as an example.
  • the collection unit 110 collects component information (step S41). Then, the analysis unit 101 performs language analysis on the character strings included in the component information (collected data) collected by the collection unit 110 and decomposes them into one or more words (step S42).
  • FIG. 5 is a diagram for explaining the operation of the analysis unit 101 of the information generation apparatus 100 according to the present embodiment.
  • the collected data collected by the collection unit 110 includes the four host names “spring-a”, “fall”, “test01”, and “test02” shown on the left side of FIG. 5.
  • the analysis unit 101 performs language analysis on each of these host names and decomposes them into one or more words. That is, the analysis unit 101 generates decomposed data including the six words “spring”, “-a”, “fall”, “test”, “01”, and “02”.
  • the analysis unit 101 may include the number of appearances in the attribute of a word that occurs more than once (in this example, “test”).
  • the analysis unit 101 includes, in the attributes of “spring” and “test”, information indicating that the word is the first part of the character string before decomposition, and includes, in the attributes of “-a”, “01”, and “02”, information indicating that the word is the last part of the character string before decomposition. Further, the analysis unit 101 may include, in the attribute of “fall”, information indicating that the word has not been decomposed.
  • the analysis unit 101 outputs the decomposed data including these words to the associative word determination unit 102, the synthesis unit 103, and the sequence number generation unit 104.
  • the associative word determination unit 102 refers to the storage unit 120 and determines an associative word of the word included in the conceptual information among one or more words included in the decomposed data (step S43).
  • FIG. 6 is a diagram for explaining the configuration of the conceptual information stored in the storage unit 120.
  • the storage unit 120 stores tree-structured conceptual information. Note that the data structure of the concept information is not limited to this, and any structure may be used as long as the superordinate concept and / or subordinate concept of a word can be understood.
  • FIG. 6 includes “spring”, “summer”, “winter”, “fall”, “autumn”, and the like as examples of words in the concept information. “season”, which is the superordinate concept of these words, is associated with each of them, for example with “spring”. Similarly, “xxx” is associated with “season” as the superordinate concept of “season”.
  • in FIG. 6, one superordinate concept is associated with each word, but the present embodiment is not limited to this.
  • One word may be associated with a plurality of superordinate concepts.
  • for example, “spring” may also be associated with the superordinate concept “elastic body”.
  • the associative word determination unit 102 checks whether or not “spring” included in the decomposition data is included in the concept information, and if included, searches for a superordinate concept of the word. When the superordinate concept “season” of “spring” is searched, the associative word determination unit 102 determines an arbitrary word from the subordinate concepts of “season” as an associative word. In this example, it is assumed that the associative word determination unit 102 determines “winter” as an associative word of “spring”. Similarly, the associative word determination unit 102 determines “autumn” as an associative word of “fall”.
  • the associative word determination unit 102 also checks whether “test” is included in the concept information. In this example, it is assumed that “test” is not included in the concept information. Further, the associative word determination unit 102 checks whether “-a”, “01”, and “02” are included in the concept information. In this example, it is assumed that these words are not included in the concept information either.
  • in this way, the associative word determination unit 102 checks whether each of the words included in the decomposed data is included in the concept information and determines associative words for the words that are included. Then, the associative word determination unit 102 includes the attribute of the original word in the attribute of the determined word. That is, the associative word determination unit 102 includes, in the attribute of “winter”, information indicating that the word is the first part of the character string before decomposition. Further, the associative word determination unit 102 may include, in the attribute of “autumn”, information indicating that the word has not been decomposed. According to the configuration of FIG. 6, “autumn” can also be an associative word of “spring”, and “winter” can be an associative word of “fall”. Therefore, the associative word determination unit 102 may, for example, include the attribute of “fall” in the attribute of “winter”.
  • the associative word determining unit 102 uses these words to determine the superordinate concepts of these words. For example, when the superordinate concepts of “spring” are “season” and “elastic body”, the associative word determination unit 102 may determine a subordinate concept of “elastic body” as an associative word of “spring”.
  • the associative word determination unit 102 searches for a superordinate concept common to “spring” and “fall”. That is, the associative word determination unit 102 checks whether a superordinate concept of “spring” and a superordinate concept of “fall” are the same. When they are the same, the associative word determination unit 102 determines, as associative words of “spring” and “fall”, words that are subordinate concepts of the common superordinate concept “season” other than “spring” and “fall”. As a result, the information generation apparatus 100 can generate dummy information that is difficult for an attacker to identify as a dummy.
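The search for a common superordinate concept can be sketched as a set intersection over parent links. The `PARENTS` table below is a hypothetical stand-in for the concept information of FIG. 6; in particular, the entry “rubber” under “elastic body” is invented purely to show the ambiguity, and the function names are assumptions.

```python
# Hypothetical concept information: word -> list of superordinate concepts.
PARENTS = {
    "spring": ["season", "elastic body"],
    "summer": ["season"],
    "fall": ["season"],
    "autumn": ["season"],
    "winter": ["season"],
    "rubber": ["elastic body"],  # invented entry, for illustration only
}

def common_superordinates(words):
    """Superordinate concepts shared by every word in `words`."""
    sets = [set(PARENTS.get(w, [])) for w in words]
    return set.intersection(*sets) if sets else set()

def associative_words(words):
    """Other subordinate words of the concepts common to all of `words`."""
    result = []
    for concept in sorted(common_superordinates(words)):
        for child, parents in PARENTS.items():
            if concept in parents and child not in words and child not in result:
                result.append(child)
    return result

print(associative_words(["spring"]))          # ambiguous: both concepts apply
print(associative_words(["spring", "fall"]))  # only "season" is shared
```

Looking at “spring” alone lets a word from the “elastic body” branch through, whereas requiring the concept to be shared with “fall” restricts the candidates to seasons.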
  • the associative word determination unit 102 determines, as associative words, a number of words equal to or greater than the number of original words used to determine them. Thereby, the information generation apparatus 100 can generate at least as many pieces of dummy information as there are pieces of regular information. Note that the number of words determined as associative words is arbitrary and need not be the same as the number of original words used to determine them.
  • the associative word determination unit 102 includes attributes of “spring” and “fall” in the respective attributes of “winter” and “autumn”, which are words determined as associative words. That is, the associative word determination unit 102 includes, in each of the attributes “winter” and “autumn”, information indicating that the word is the first part of the character string before decomposition and that it is not decomposed.
  • the associative word determination unit 102 outputs “winter” and “autumn” to the synthesis unit 103.
  • the associative word determination unit 102 may give the same attributes to words having the same superordinate concept. For example, as shown in FIG. 6, since “spring” and “fall” have the same superordinate concept, the associative word determination unit 102 may include the attribute of “fall” in the attribute of “spring” and the attribute of “spring” in the attribute of “fall”. Then, the associative word determination unit 102 outputs, to the synthesis unit 103, information indicating that the attributes of words included in the decomposed data have been changed.
  • the sequential number generation unit 104 generates sequential number phrases for those words, among the one or more words included in the decomposed data, from which sequential number phrases can be generated (step S44).
  • step S44 may be performed after step S42, and may be performed simultaneously with step S43 or before step S43.
  • when it receives the decomposed data shown on the right side of FIG. 5, the sequential number generation unit 104 identifies “-a”, “01”, and “02” as words having continuity. The sequential number generation unit 104 generates “-b” and “-c” based on “-a”, and generates “03” and “04” based on “01” and “02”. Note that the number of sequential number phrases generated by the sequential number generation unit 104 is not particularly limited.
  • the sequential number generation unit 104 includes, in the attributes of “-b”, “-c”, “03”, and “04”, the information included in the attributes of “-a”, “01”, and “02”, namely information indicating that the word is the last part of the character string before decomposition. Then, the sequential number generation unit 104 outputs the generated sequential number phrases to the synthesis unit 103.
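Step S44 can be illustrated with a short sketch, assuming that the suffix of “spring-a” is tokenized as “-a”, that numeric words keep their zero-padded width, and that the only predetermined arrays are decimal numbers and lowercase ASCII letters. The function name and the grouping scheme are hypothetical.

```python
import re
import string

def sequential_phrases(words, count=2):
    """Return `count` new phrases continuing each sequence found in `words`."""
    groups = {}  # (prefix, kind, width) -> highest index seen so far
    for w in words:
        m = re.fullmatch(r"(\W*)(\d+|[a-z])", w)
        if not m:
            continue  # word has no continuity, e.g. "spring" or "test"
        prefix, body = m.groups()
        if body.isdigit():
            key, idx = (prefix, "num", len(body)), int(body)
        else:
            key, idx = (prefix, "alpha", 1), string.ascii_lowercase.index(body)
        groups[key] = max(groups.get(key, idx), idx)

    phrases = []
    for (prefix, kind, width), last in groups.items():
        for i in range(last + 1, last + 1 + count):
            if kind == "num":
                phrases.append(prefix + str(i).zfill(width))
            elif i < len(string.ascii_lowercase):
                phrases.append(prefix + string.ascii_lowercase[i])
    return phrases

print(sequential_phrases(["spring", "-a", "fall", "test", "01", "02"]))
```

Continuing from the highest value seen in each group is one simple way to avoid regenerating “02” from “01”; other predetermined arrays could be supported by adding further kinds.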
  • after step S43 and step S44 are completed, the synthesizing unit 103 generates dummy information by performing synthesis processing using the associative words “winter” and “autumn”, the sequential number phrases “-b”, “-c”, “03”, and “04”, and the words included in the decomposed data (step S45).
  • FIG. 7 is a diagram for explaining the operation of the combining unit 103 of the information generating apparatus 100 according to the present embodiment.
  • the synthesis unit 103 combines the following (A) and (B).
  • the combining unit 103 generates dummy information by treating the words in each of (A) and (B) that have the same attribute information as one array and combining the arrays.
  • the attributes of the associative words and decomposed data A included in (A) contain information indicating that the word is the first part of the character string before decomposition, and the attributes of the sequential number phrases and decomposed data B included in (B) contain information indicating that the word is the last part of the character string before decomposition. Therefore, the synthesizing unit 103 treats the associative words and decomposed data A included in (A) as one array, treats the sequential number phrases and decomposed data B included in (B) as another array, and combines the elements (words) of the two arrays.
  • “X” between the arrays indicates that the elements of the arrays are combined.
  • the synthesizing unit 103 refers to the component information stored in the storage unit 120 or the collection unit 110 and confirms that the combined character string is not included in it, so that the combined character string is never identical to an original character string (regular information).
  • when the combined character string is included in the component information, the synthesis unit 103 does not use that character string as dummy information.
  • the synthesis unit 103 generates dummy information using the associative words generated based on the concept information from the component information. Therefore, the information generating apparatus 100 can generate dummy information that does not look out of place even when used as component information. Further, the synthesizing unit 103 may include in the dummy information character strings that do not use an associative word, for example “test03” and “test-a”. Thereby, more character strings can be generated as dummy information.
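The synthesis step (step S45) amounts to a Cartesian product of the “first part” array and the “last part” array, followed by a check against the regular information. The sketch below is a hedged illustration: the function name is hypothetical, and the exact membership of the two arrays is an assumption based on the host-name example.

```python
from itertools import product

def synthesize(first_parts, last_parts, regular):
    """Combine every first part with every last part, skipping regular information."""
    regular_set = set(regular)
    seen = set()
    dummies = []
    for head, tail in product(first_parts, last_parts):
        candidate = head + tail
        if candidate in regular_set or candidate in seen:
            continue  # never reuse a string that already exists in the system
        seen.add(candidate)
        dummies.append(candidate)
    return dummies

# Words whose attribute says "first part": associative words plus decomposed data.
first_parts = ["winter", "autumn", "spring", "test"]
# Words whose attribute says "last part": sequential number phrases plus decomposed data.
last_parts = ["-b", "-c", "03", "04", "-a", "01", "02"]
regular = ["spring-a", "fall", "test01", "test02"]

dummies = synthesize(first_parts, last_parts, regular)
print(len(dummies), dummies[:4])
```

Strings such as “test03” and “test-a”, which use no associative word, fall out of the same product, while “spring-a”, “test01”, and “test02” are filtered out as regular information.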
  • the combining unit 103 may store the generated dummy information as information that can be browsed by an attacker, for example, in an external storage device or the like, or may transmit it to a predetermined device. Further, the composition unit 103 may be configured to transmit the generated dummy information to another device or system when an inquiry is made from another device or system.
  • the information generation apparatus 100 can generate dummy information for other types of component information as well. For example, even when the component information is a user account, the information generation apparatus 100 can generate dummy information for the user account in the same manner as the host name.
  • the component information may also be a file name, a URI, or a mail address, and the information generation apparatus 100 can generate dummy information for these pieces of information.
  • the following describes how the information generation apparatus 100 generates dummy information when the component information is a file name, a URI, or a mail address.
  • FIG. 8 is a diagram for explaining the operation of the information generating apparatus 100 when the component information (collected data) to be collected is a file name.
  • FIG. 8 illustrates a case where the file name is a character string including a space between words.
  • the analysis unit 101 decomposes the character string (file name) into “Japan”, “summer”, and “2014”. Then, the analysis unit 101 includes, in the attribute of “Japan”, information indicating that the word is the first part of the character string before decomposition and information indicating that a space follows the word. Further, the analysis unit 101 includes, in the attribute of “summer”, information indicating that the word is the second part of the character string before decomposition and information indicating that a space follows the word. Further, the analysis unit 101 includes, in the attribute of “2014”, information indicating that the word is the last part of the character string before decomposition.
  • the analysis unit 101 includes the information indicating the position of the space in the attribute of the word immediately before the space, but the analysis unit 101 of the present embodiment is not limited to this.
  • the analysis unit 101 may be configured to include information indicating the position of the space in the attribute of the word immediately after the space.
  • the associative word determination unit 102 confirms whether these words are included in the concept information.
  • the associative word determination unit 102 searches for a superordinate concept of “Japan” and “summer”. In this example, it is assumed that “Japan” and “summer” do not have the same superordinate concept.
  • the associative word determination unit 102 determines the associative word (for example, “American”) of “Japan” and the associative word (for example, “winter”) of “summer”.
  • the associative word determination unit 102 includes the attribute of “Japan” in the attribute of “American” and the attribute of “summer” in the attribute of “winter”.
  • the serial number generation unit 104 generates “2013” based on “2014”. Then, the serial number generation unit 104 includes the attribute of “2014” in the attribute of “2013”.
  • the synthesizing unit 103 then groups the words into (1) associative words, or words received from the analysis unit 101 for which no serial number phrase can be generated, and (2) serial number phrases, or words received from the analysis unit 101 for which a serial number phrase can be generated.
  • within each group, words having the same attribute information are arranged as one array, and dummy information is generated by combining these arrays.
  • the synthesis unit 103 combines the following (A) to (C).
  • (A) “American” and “Japan”, which are associative words, or words received from the analysis unit 101 for which no serial number phrase can be generated, and which have the attribute of being the first part of the character string before decomposition,
  • (B) “winter” and “summer”, which are associative words, or words received from the analysis unit 101 for which no serial number phrase can be generated, and which have the attribute of being the second part of the character string before decomposition,
  • (C) “2013” and “2014”, which are serial number phrases, or words received from the analysis unit 101 for which a serial number phrase can be generated, and which have the attribute of being the last part of the character string before decomposition.
  • the synthesis unit 103 inserts a space at a predetermined position according to information indicating the position of the space included in the attribute of each word. Thereby, the composition unit 103 can generate dummy information including character strings such as “American winter 2013” and “American summer 2014” as shown on the right side of FIG.
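  • The file name case can be illustrated with the following sketch, which combines the arrays (A) to (C) and restores the spaces recorded in the word attributes. The tuple layout (word list plus a "space follows" flag) and the function name are assumptions made for illustration; the description does not prescribe a data structure.

```python
from itertools import product

# Each entry groups the words sharing one position attribute. The flag records
# whether a space follows words at that position (hypothetical representation
# of the attribute set by the analysis unit 101).
arrays = [
    (["American", "Japan"], True),   # (A) first part, followed by a space
    (["winter", "summer"], True),    # (B) second part, followed by a space
    (["2013", "2014"], False),       # (C) last part
]
regular = {"Japan summer 2014"}      # the original file name

def synthesize_file_names(arrays, regular):
    """Combine one word per position, re-inserting spaces, and skip regular names."""
    word_lists = [words for words, _ in arrays]
    results = []
    for combo in product(*word_lists):
        name = "".join(
            word + (" " if space_after else "")
            for word, (_, space_after) in zip(combo, arrays)
        )
        if name not in regular:
            results.append(name)
    return results

names = synthesize_file_names(arrays, regular)
print(len(names))  # 7 of the 8 combinations remain
```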
  • next, a case where the component information collected by the collection unit 110 is a mail address will be described.
  • FIGS. 9 and 10 are diagrams for explaining the operation of the information generating apparatus 100 when the component information (collected data) to be collected is a mail address.
  • the analysis unit 101 decomposes each e-mail address into a local part and a domain. Then, the analysis unit 101 analyzes the character string included in the local part and breaks it down into words. An example of the decomposed word is shown on the right side of FIG. The local part “a-xxx” of the first mail address shown on the right side of FIG. 9 is decomposed into “a-” and “xxx” as shown in FIG. At this time, the analysis unit 101 assigns, to “a-”, information indicating that the word is the first part of the character string before decomposition as an attribute.
  • the analysis unit 101 assigns, to “xxx”, information indicating that the word is the second (last in the local part) word of the character string before decomposition. Similarly, for the other mail addresses, the analysis unit 101 decomposes the mail address into a local part and a domain and divides the local part into words. In this example, the at sign is included in the domain as the first character of the domain.
  • the associative word determination unit 102 determines “vvv” and “nnn” as the associative words of “xxx” and “kkk”, and determines “yy” as the associative word of “zz”.
  • the sequence number generation unit 104 generates “c-” and “d-” as sequence number phrases of “a-” and “b-”, and generates “02” as the sequence number phrase of “01”.
  • the synthesizing unit 103 groups the words into (1) associative words, or words received from the analysis unit 101 for which no serial number phrase can be generated, and (2) sequential number phrases, or words received from the analysis unit 101 for which a serial number phrase can be generated. Within each group, words having the same attribute information are made into one array, the domains are made into one array, and dummy information is generated by combining these arrays.
  • the synthesis unit 103 combines the following (A) to (C).
  • (A) “a-”, “b-”, “c-”, and “d-”, which are sequential number phrases, or words received from the analysis unit 101 for which a sequential number phrase can be generated, and which have the attribute of being the first part of the character string before decomposition,
  • (B) “xxx”, “kkk”, “vvv”, and “nnn”, which are associative words, or words received from the analysis unit 101 for which no sequential number phrase can be generated, and which have the attribute of being the second part of the character string before decomposition,
  • (C) Domain.
  • the synthesis unit 103 combines the following (D) to (F) as shown in FIG.
  • (D) “zz” and “yy”, which are associative words, or words received from the analysis unit 101 for which no sequential number phrase can be generated, and which have the attribute of being the first part of the character string before decomposition,
  • (E) “01” and “02”, which are sequential number phrases, or words received from the analysis unit 101 for which a sequential number phrase can be generated, and which have the attribute of being the second part of the character string before decomposition,
  • (F) Domain.
  • the synthesizing unit 103 can generate dummy information including a character string such as “a-vvv@yyy.ne.jp” as shown on the right side of FIG.
  • according to the information generation apparatus 100, even if the component information is a mail address, dummy information that is difficult for an attacker to identify as dummy information can be generated.
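  • The mail address case can be sketched as below: each address is split into a local part and a domain (with the at sign kept at the head of the domain, as in this example), and the position arrays are then combined with the domain array. The regular addresses and the helper name `split_address` are hypothetical; the word lists reuse the placeholder words from the description.

```python
from itertools import product

def split_address(address):
    """Split into local part and domain, keeping the at sign with the domain."""
    local, at, domain = address.partition("@")
    return local, at + domain

# Hypothetical regular (collected) addresses.
regular = ["a-xxx@yyy.ne.jp", "b-kkk@yyy.ne.jp"]
domains = sorted({split_address(a)[1] for a in regular})

first_parts = ["a-", "b-", "c-", "d-"]        # decomposed words plus serial number phrases
second_parts = ["xxx", "kkk", "vvv", "nnn"]   # decomposed words plus associative words

# Combine first part, second part, and domain; skip addresses that really exist.
dummy = []
for parts in product(first_parts, second_parts, domains):
    address = "".join(parts)
    if address not in regular:
        dummy.append(address)
```

  • With these inputs the result contains strings such as “a-vvv@yyy.ne.jp” while excluding the two regular addresses.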
  • FIG. 11 is a diagram for explaining the operation of the information generation apparatus 100 when the collected data is a URI.
  • the analysis unit 101 decomposes the character string described as the URI for each hierarchy. The analysis unit 101 then analyzes the character string of each hierarchy and decomposes it into words.
  • An example of the decomposed word is shown in FIG. As shown in FIG. 11, the character string “folder01” in the first hierarchy in the URI is broken down into “folder” and “01”.
  • the analysis unit 101 includes, as attributes of each decomposed word, information indicating the hierarchy that contains the character string from which the word was decomposed and information indicating the position of the word within that character string.
  • the associative word determination unit 102 determines an associative word for the decomposed words. Further, the serial number generation unit 104 generates a serial number phrase. Then, the synthesizing unit 103 synthesizes words for each hierarchy, and then synthesizes the character strings of the hierarchies. Since the synthesis method is the same as that described above, the description thereof is omitted. The synthesizing unit 103 generates, as dummy information, a character string in which delimiters that delimit the hierarchies are inserted between the hierarchies.
  • according to the information generation apparatus 100, even if the component information is a URI, dummy information that is difficult for an attacker to identify as dummy information can be generated.
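  • The two-stage synthesis for URIs (words within each hierarchy first, then the hierarchies themselves) can be sketched as follows. The decomposition rule (a word followed by optional trailing digits), the sample path, and the associative words “directory” and “summary” are assumptions for illustration only.

```python
import re
from itertools import product

def decompose(uri_path):
    """Split a path into hierarchies, then each hierarchy into word and number."""
    layers = []
    for segment in uri_path.strip("/").split("/"):
        match = re.fullmatch(r"(.*?)(\d*)", segment)
        layers.append([part for part in match.groups() if part])
    return layers

def synthesize(layer_arrays, regular):
    """layer_arrays holds, per hierarchy, one array per word position."""
    per_layer = [["".join(ws) for ws in product(*arrays)] for arrays in layer_arrays]
    paths = ["/".join(segments) for segments in product(*per_layer)]
    return [p for p in paths if p not in regular]

layers = decompose("folder01/report")     # [['folder', '01'], ['report']]

# Hypothetical arrays after associative words and serial number phrases are added.
layer_arrays = [
    [["folder", "directory"], ["01", "02"]],   # first hierarchy
    [["report", "summary"]],                   # second hierarchy
]
dummy = synthesize(layer_arrays, {"folder01/report"})
```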
  • according to the information generation apparatus 100 according to the present embodiment, the same effects as those of the information generation apparatus 10 according to the first embodiment can be obtained. Further, according to the information generation apparatus 100 according to the present embodiment, even if the component information is a host name, a file name, a user account, a mail address, a URI, or the like, it is possible to generate dummy information that is more difficult for an attacker to identify as dummy information.
  • FIG. 12 is a diagram for explaining a configuration of conceptual information stored in the storage unit 120 of the information generating apparatus 100 according to the present embodiment.
  • the concept information stored in the storage unit 120 indicates that the superordinate concept of words such as “spring” and “fall” is “season”, and that the superordinate concept of “season” is “xxx”.
  • FIG. 12 also shows that “yyy” exists as a superordinate concept several levels above “xxx”, that “zzz” is included in the subordinate concepts of “yyy”, that “fruit” is included in the subordinate concepts of “zzz”, and that “apple” and “orange” are included in the subordinate concepts of “fruit”.
  • each word is referred to as a node in the present embodiment.
  • the associative word determination unit 102 calculates the distance between the superordinate concepts when a plurality of superordinate concepts common to the plurality of words included in the concept information are retrieved. For example, it is assumed that the words decomposed by the analysis unit 101 include “spring”, “fall”, “apple”, and “orange”. At this time, the associative word determination unit 102 searches for the superordinate concept of every word. Since “spring” and “fall” have the same superordinate concept “season”, and “apple” and “orange” have the same superordinate concept “fruit”, the associative word determination unit 102 determines that two superordinate concepts have been retrieved.
  • the associative word determination unit 102 calculates the inter-node distance between the “season” node and the “fruit” node.
  • the distance between the node and the parent node of this node is 1.
  • This distance is also called the arrival hop count. That is, the distance (the number of hops reached) from the node to the parent node of this node is 1.
  • the associative word determination unit 102 determines, as an associative word, a word that is a subordinate concept of another superordinate concept located, from at least one of the retrieved superordinate concepts (“season”, “fruit”), at a distance of approximately half the calculated distance (also called the intermediate distance or the number of intermediate hops).
  • the associative word determination unit 102 determines, as an associative word, a word that is a child node (subordinate concept) of a node (superordinate concept) whose arrival hop count from the “season” node is four.
  • if the node whose arrival hop count from the “season” node is 4 is “city” and its subordinate concepts are “tokyo”, “paris”, “kyoto”, and so on, the associative word determination unit 102 determines a predetermined number of words from among “tokyo”, “paris”, “kyoto”, and so on as associative words. At this time, the number of words determined as associative words is arbitrary.
  • in the above description, a subordinate concept of a node whose arrival hop count from the “season” node equals the number of intermediate hops is used as an associative word; however, a subordinate concept of a node whose arrival hop count from the “fruit” node equals the number of intermediate hops may be used as an associative word instead.
  • the information generation apparatus 100 can generate dummy information using a keyword close to a word (keyword) included in regular information. Moreover, according to the information generation device 100 according to the present embodiment, not only words that share the superordinate concept of a word included in the regular information (referred to as the direct superordinate concept), but also words whose superordinate concept differs from that direct superordinate concept, can be determined as associative words. Thereby, according to the information generation device 100 according to the present embodiment, it is possible to generate dummy information that goes beyond the range assumed from the regular information and is more difficult for an attacker to identify as dummy information.
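  • The intermediate-hop selection can be sketched as a breadth-first search over the concept tree. The tree below is hypothetical: the intermediate nodes `n1` to `n4` and the placement of “city” are invented so that the distance between “season” and “fruit” is 8 and the number of intermediate hops is 4, matching the example above. Restricting the result to leaf nodes (words without subordinate concepts) is also an assumption.

```python
from collections import deque

# Hypothetical concept information as a child -> parent mapping.
PARENT = {
    "spring": "season", "fall": "season", "season": "xxx",
    "xxx": "n1", "n1": "n2", "n2": "n3", "n3": "n4", "n4": "yyy",
    "zzz": "yyy", "fruit": "zzz", "apple": "fruit", "orange": "fruit",
    "city": "n2", "tokyo": "city", "paris": "city", "kyoto": "city",
}

def children(node):
    return [c for c, p in PARENT.items() if p == node]

def hops_from(start):
    """Arrival hop count from start to every node (parent and child links)."""
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        neighbours = children(node) + ([PARENT[node]] if node in PARENT else [])
        for nxt in neighbours:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def intermediate_hop_words(concept_a, concept_b):
    """Words under nodes located half the a-b distance away from concept_a."""
    dist = hops_from(concept_a)
    middle = dist[concept_b] // 2
    words = []
    for node, d in dist.items():
        if d == middle:
            words += [c for c in children(node) if not children(c)]
    return middle, sorted(words)

print(intermediate_hop_words("season", "fruit"))
# (4, ['kyoto', 'paris', 'tokyo'])
```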
  • Modification 1 Next, a modification according to the present embodiment will be described. In this modification, another method of associative word generation by the associative word determination unit 102 will be described.
  • the associative word determination unit 102 searches for a superordinate concept of all words. Note that “spring” and “fall” have the same superordinate concept “season”, “apple” and “orange” have the same superordinate concept “fruit”, and “tokyo” and “paris” are the same. Suppose that it has a superordinate concept “city”. Therefore, the associative word determination unit 102 determines that three superordinate concepts have been searched.
  • the associative word determination unit 102 calculates the number of hops reached between the three superordinate concepts. That is, the associative word determination unit 102 calculates (1) the distance between “season” and “fruit”, (2) the distance between “season” and “city”, and (3) the distance between “fruit” and “city”.
  • the associative word determination unit 102 determines, as an associative word, a word that is a subordinate concept of another superordinate concept located, from at least one of the superordinate concepts (“season”, “fruit”, “city”), at the average of the calculated hop counts (also referred to as the average hop count).
  • the associative word determination unit 102 determines a word that is a child node (subordinate concept) of a node whose arrival hop count from the “fruit” node is 4 as an associative word. If the node whose arrival hop count from the “fruit” node is 4 is “restaurant” and its subordinate concepts are “cafeteria”, “teashop”, “pub”, and so on, the associative word determination unit 102 determines a predetermined number of words from among “cafeteria”, “teashop”, “pub”, and so on as associative words. At this time, the number of words determined as associative words is arbitrary.
  • in the above description, a subordinate concept of a node whose arrival hop count from the “fruit” node equals the average hop count is used as an associative word; however, a subordinate concept of a node whose arrival hop count from the “season” node or the “city” node equals the average hop count may be used as an associative word.
  • a subordinate concept of a node whose arrival hop count from the root equals the average hop count may also be used as an associative word.
  • the information generating apparatus 100 according to the present modification can obtain the same effects as the information generating apparatus 100 according to the third embodiment.
  • the associative word determination unit 102 may further determine, as associative words, words that are subordinate concepts of another superordinate concept located at the number of intermediate hops calculated from at least one of the superordinate concepts.
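  • A minimal sketch of the average hop count variant follows. The tree is hypothetical and is arranged so that every pairwise distance among “season”, “fruit”, and “city” is 4; rounding the average to an integer is an assumption, since the description does not say how a fractional average is handled.

```python
from collections import deque
from itertools import combinations

# Hypothetical concept information as a child -> parent mapping.
PARENT = {
    "season": "a", "fruit": "b", "city": "c", "restaurant": "d",
    "a": "root", "b": "root", "c": "root", "d": "root",
    "spring": "season", "fall": "season",
    "apple": "fruit", "orange": "fruit",
    "tokyo": "city", "paris": "city",
    "cafeteria": "restaurant", "teashop": "restaurant", "pub": "restaurant",
}

def children(node):
    return [c for c, p in PARENT.items() if p == node]

def hops_from(start):
    """Arrival hop count from start to every node in the tree."""
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        neighbours = children(node) + ([PARENT[node]] if node in PARENT else [])
        for nxt in neighbours:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def average_hop_words(concepts, origin):
    """Words under other nodes located the average hop count away from origin."""
    pairs = list(combinations(concepts, 2))
    average = round(sum(hops_from(a)[b] for a, b in pairs) / len(pairs))
    words = []
    for node, d in hops_from(origin).items():
        if d == average and node not in concepts:
            words += children(node)
    return average, sorted(words)

result = average_hop_words(["season", "fruit", "city"], "fruit")
print(result)  # (4, ['cafeteria', 'pub', 'teashop'])
```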
  • Modification 2 Next, another modification according to the present embodiment will be described. In this modification, another method of associative word generation by the associative word determination unit 102 will be described.
  • the associative word determination unit 102 may determine an associative word using an initial value given in advance.
  • This initial value is, for example, a value indicating how many levels go up from a certain word.
  • the associative word determination unit 102 identifies, for a word included in the concept information, the superordinate concept located the number of levels given by the initial value above that word (a superordinate concept at a predetermined distance). For example, when the word included in the concept information is “winter” and the initial value is 2, the associative word determination unit 102 identifies the superordinate concept two levels above “winter” (“xxx” in FIG. 12). Then, the associative word determination unit 102 determines a word of a subordinate concept of “xxx” as an associative word.
  • the associative word determination unit 102 may add the associative word determined in the present modification to the associative word determined in the third embodiment or the first modification.
  • the information generation device 100 can generate dummy information that goes beyond the range assumed from the regular information and is more difficult for an attacker to identify as dummy information.
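  • The initial-value method of this modification can be sketched as follows. The concept “month” and its words are hypothetical additions to the tree of FIG. 12, and treating every leaf word below the identified superordinate concept as a subordinate-concept word is an assumption.

```python
# Hypothetical concept information as a child -> parent mapping.
PARENT = {
    "spring": "season", "summer": "season", "fall": "season", "winter": "season",
    "season": "xxx", "month": "xxx",
    "january": "month", "february": "month",
}

def ancestor(word, levels):
    """Go up the given number of levels from word."""
    node = word
    for _ in range(levels):
        node = PARENT[node]
    return node

def leaf_words(concept):
    """All words (leaf nodes) below concept."""
    kids = [c for c, p in PARENT.items() if p == concept]
    if not kids:
        return [concept]
    return [w for k in kids for w in leaf_words(k)]

def associative_words(word, initial_value):
    concept = ancestor(word, initial_value)
    return [w for w in leaf_words(concept) if w != word]

print(associative_words("winter", 2))
# ['spring', 'summer', 'fall', 'january', 'february']
```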
  • the associative word determining unit 102 according to the present modification may determine an associative word using an initial value given in advance, similarly to the associative word determining unit 102 according to the second modification.
  • the initial value in this modification is a value indicating the number of required associative words.
  • the associative word determination unit 102 searches for a superordinate concept of a word included in the concept information, and determines a subordinate concept word for the superordinate concept as an associative word. At this time, when the number of words of the lower concept is smaller than the initial value, the associative word determination unit 102 determines the word of the lower concept for the higher concept of the higher concept as the associative word.
  • for example, when the word included in the concept information is “winter” and the initial value is 8, the associative word determination unit 102 identifies the superordinate concept one level above “winter” (“season” in FIG. 12). Then, the associative word determination unit 102 checks whether the number of subordinate-concept words of “season” other than “winter” is equal to or greater than the initial value (8). When the number of subordinate-concept words of “season” excluding “winter” is 6, for example, the associative word determination unit 102 identifies the superordinate concept of “season” (“xxx” in FIG. 12).
  • the associative word determination unit 102 confirms whether the number of words of the subordinate concept of “xxx” is equal to or larger than the initial value.
  • the associative word determination unit 102 goes up the hierarchy, searching the superordinate concepts, until a superordinate concept appears whose subordinate concepts contain at least as many words as the initial value. Then, when such a superordinate concept is found, the associative word determination unit 102 determines the subordinate-concept words of that superordinate concept as associative words.
  • the associative word determination unit 102 may add the associative word determined in the present modification to the associative word determined in the third embodiment, the first modification, or the second modification.
  • the information generation device 100 can generate dummy information that goes beyond the range assumed from the regular information and is more difficult for an attacker to identify as dummy information.
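  • The climb-until-enough-words method of this modification can be sketched as follows. The tree is hypothetical (the concept “month” is invented), counting all leaf words below a concept is an assumption, and returning nothing when the top of the tree is reached without enough words is likewise an assumption, since the description does not cover that case.

```python
# Hypothetical concept information as a child -> parent mapping.
PARENT = {
    "spring": "season", "summer": "season", "fall": "season", "winter": "season",
    "season": "xxx", "month": "xxx",
    "january": "month", "february": "month",
}

def leaf_words(concept):
    """All words (leaf nodes) below concept."""
    kids = [c for c, p in PARENT.items() if p == concept]
    if not kids:
        return [concept]
    return [w for k in kids for w in leaf_words(k)]

def associative_words(word, required):
    """Climb until a superordinate concept offers at least `required` other words."""
    concept = PARENT.get(word)
    while concept is not None:
        words = [w for w in leaf_words(concept) if w != word]
        if len(words) >= required:
            return concept, words
        concept = PARENT.get(concept)   # not enough words: go one level higher
    return None, []

print(associative_words("winter", 5))
# ('xxx', ['spring', 'summer', 'fall', 'january', 'february'])
```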
  • the composition unit 103 of the information generation apparatus 100 may assign a priority to the combined (synthesized) character string. For example, when an associative word and/or a sequential number phrase is included in the combined character string, the combining unit 103 may set the priority of the character string higher. For example, when the number of appearances is included in the attribute of a word included in the combined character string, the combining unit 103 may set a higher priority for a character string that includes a word with a larger number of appearances. Further, for example, when the associative word included in the combined character string was determined from a predetermined number of words or more, the combining unit 103 may set the priority of the character string higher. Thus, the priority setting method is not particularly limited. The priority may indicate a level or may be a rank.
  • the synthesis unit 103 generates, as dummy information, a character string whose priority is greater than a predetermined value.
  • when the priority given to the character string is a rank (priority order), the synthesis unit 103 generates, as dummy information, the character strings ranked higher than a predetermined value. For example, when the predetermined value is N (N is a natural number), the synthesis unit 103 generates the top N character strings as dummy information.
  • the character string generated by the synthesizing unit 103 is a character string that is more difficult for an attacker to identify as dummy information than the character string generated in the first to third embodiments described above. Therefore, the information generating apparatus 100 according to the present embodiment can generate only character strings that are more difficult to be identified as dummy information by an attacker as dummy information.
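  • The two selection modes (priority as a level and priority as a rank) can be sketched as follows. The scoring rule in `priority` is purely hypothetical, since the description leaves the priority setting method open; the field names and sample strings are also assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    uses_associative_or_serial: bool  # contains an associative word or serial number phrase
    appearance_count: int             # appearance count taken from the word attributes

def priority(candidate):
    # Hypothetical scoring: a bonus for associative/serial words plus the count.
    bonus = 10 if candidate.uses_associative_or_serial else 0
    return bonus + candidate.appearance_count

def select_by_level(candidates, threshold):
    """Priority as a level: keep strings whose priority exceeds the threshold."""
    return [c.text for c in candidates if priority(c) > threshold]

def select_top_n(candidates, n):
    """Priority as a rank: keep the top N strings."""
    ranked = sorted(candidates, key=priority, reverse=True)
    return [c.text for c in ranked[:n]]

candidates = [
    Candidate("web02", True, 3),
    Candidate("test03", False, 1),
    Candidate("mail01", True, 1),
]
print(select_by_level(candidates, 10))  # ['web02', 'mail01']
print(select_top_n(candidates, 1))      # ['web02']
```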
  • FIG. 13 is a functional block diagram showing an example of the functional configuration of the information generating apparatus according to the present embodiment.
  • the information generation apparatus 400 according to the present embodiment is configured by adding a storage unit 420 to the information generation apparatus 100 according to the second to fourth embodiments.
  • the information generation apparatus 400 includes an analysis unit 101, an associative word determination unit 102, a synthesis unit 103, a sequence number generation unit 104, a collection unit 110, a storage unit (second storage unit) 120, and a storage unit (first storage unit) 420.
  • the configuration of the information generation system including the information generation apparatus 400 according to the present embodiment is the same as the configuration described with reference to FIG.
  • the storage unit 420 may be realized by a storage device that is separate from the information generation device 400.
  • in the present embodiment, the storage unit 120 and the storage unit 420 are described as separate components, but the storage unit 120 and the storage unit 420 may be realized by a single storage unit.
  • the storage unit 420 stores material information.
  • the material information is information indicating a material that can be used as dummy information.
  • the material information is information including words that are listed in advance by the user and registered in the storage unit 420 as words that are difficult for a computer to generate automatically. Words that are difficult for a computer to generate automatically are, for example, plausible names that can be used as dummy information (for example, proxygate2, ip8800, dhcp01, etc.) and words composed of names that conform to a naming convention unique to the in-company system 300 to which the dummy information is applied.
  • the unique naming rule is, for example, a rule that the name of a server installed in Tokyo is “tk-svr”. That is, the words included in the material information are character strings that are highly likely to be used in the system, but are not conceptualized.
  • the associative word determination unit 102 checks whether the word received from the analysis unit 101 is included in the concept information. If the word is not included in the concept information, the associative word determination unit 102 checks whether a serial number phrase can be generated from the word. If the word not included in the concept information is not a word from which a serial number phrase can be generated, the associative word determination unit 102 may register the word as material information in the storage unit 420.
  • the material information includes not only words registered in advance by the user but also words registered by the associative word determination unit 102.
  • the word to be registered as material information by the user may be selected from words determined to be not included in the concept information by the associative word determination unit 102.
  • the associative word determination unit 102 may confirm whether a word that is not included in the concept information is itself actually used by making an inquiry to a DNS (Domain Name System) server, a DHCP (Dynamic Host Configuration Protocol) server, or the like in the in-company system 300. Then, as a result of the inquiry, if the word itself that is not included in the concept information is not actually used, the associative word determination unit 102 may register this word as material information.
  • the associative word determination unit 102 may attach an attribute to each word registered as material information. Information included in this attribute may be information arbitrarily registered by the user. When the word registered as the material information is a word supplied from the analysis unit 101, the attribute of the word registered as the material information may be an attribute given to this word by the analysis unit 101.
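  • The registration decision described above can be sketched as a simple filter. The rule used in `can_generate_serial` and the `in_use` callback are assumptions: the callback stands in for an inquiry to a DNS or DHCP server and performs no real network access in this sketch.

```python
import re

def can_generate_serial(word):
    """Assumed rule: numeric words ("01") or a letter plus hyphen ("a-") yield serial phrases."""
    return bool(re.fullmatch(r"\d+|[a-z]-", word))

def register_material(words, concept_words, in_use, material):
    """Register words that are not in the concept information, cannot yield a
    serial number phrase, and are not actually in use."""
    for word in words:
        if word in concept_words or can_generate_serial(word):
            continue
        if in_use(word):          # hypothetical stand-in for a DNS/DHCP inquiry
            continue
        material.add(word)
    return material

concept_words = {"summer"}
decomposed = ["summer", "01", "tk-svr", "proxygate2"]
material = register_material(
    decomposed, concept_words, in_use=lambda w: w == "tk-svr", material=set()
)
print(material)  # {'proxygate2'}
```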
  • the synthesizing unit 103 receives the decomposed word from the analyzing unit 101.
  • the synthesizing unit 103 receives an associative word from the associative word determining unit 102.
  • the synthesizing unit 103 also receives sequential number phrases from the sequential number generating unit 104. Further, the synthesis unit 103 acquires material information from the storage unit 420.
  • the synthesizing unit 103 synthesizes not only the associative word determined by the associative word determining unit 102 but also the word included in the material information acquired from the storage unit 420 as the associative word. Note that the synthesizing unit 103 may perform synthesis using the words themselves decomposed by the analyzing unit 101, as in the second and third embodiments described above. Further, as in the fourth embodiment described above, the synthesis unit 103 may assign a priority to the synthesized character string and use the higher priority as dummy information. Thus, since the synthesizing method of the synthesizing unit 103 according to the present embodiment uses the same synthesizing method as in each of the above-described embodiments, detailed description thereof is omitted in the present embodiment.
  • the information generation apparatus 400 according to the present embodiment generates dummy information using the material information.
  • the information generation device 400 according to the present embodiment can obtain the same effects as those of the information generation devices according to the first to fourth embodiments described above.
  • the information generation apparatus 400 according to the present embodiment can generate dummy information more similar to regular information.
  • <Example of hardware configuration> A configuration example of hardware capable of realizing the information generation apparatus (10, 100, 400) according to each embodiment described above will be described.
  • the information generation device (10, 100, 400) described above may be realized as a dedicated device, but may be realized using a computer (information processing device).
  • FIG. 14 is a diagram illustrating a hardware configuration of a computer (information processing apparatus) capable of realizing each embodiment of the present invention.
  • the hardware of the information processing apparatus (computer) 90 shown in FIG. 14 includes a CPU (Central Processing Unit) 11, a communication interface (I/F) 12, an input/output user interface 13, a ROM (Read Only Memory) 14, a RAM (Random Access Memory) 15, a storage device 17, and a drive device 18 of a computer-readable storage medium 19, which are connected via a bus 16.
  • the input / output user interface 13 is a man-machine interface such as a keyboard which is an example of an input device and a display as an output device.
  • the communication interface 12 is a general communication means for the devices according to the above-described embodiments (FIGS. 1, 3, and 13) to communicate with an external device via the communication network 80.
  • the CPU 11 controls the overall operation of the information processing apparatus 90 that realizes the information generation apparatuses (10, 100, 400) according to the embodiments.
  • a program (computer program) that can realize the processing described in each of the above-described embodiments is supplied to the information processing apparatus 90 illustrated in FIG. 14. Each embodiment is then realized by the CPU 11 reading out and executing the program.
  • the program may be a program capable of realizing the various processes described in the flowchart (FIG. 4) referred to in the description of the above embodiments, or each unit (each block) shown in the apparatus in the block diagrams (FIG. 1, FIG. 3, FIG. 13).
  • the program supplied to the information processing apparatus 90 may be stored in a readable/writable temporary storage memory (15) or a non-volatile storage device (17) such as a hard disk drive. That is, in the storage device 17, the program group 17A is a program that can realize the function of each unit shown in the information generation device (10, 100, 400) in each of the above-described embodiments.
  • the various kinds of stored information 17B are, for example, collected data, decomposed data, associative words, sequential number phrases, dummy information, conceptual information, material information, and the like in the above-described embodiments.
  • the constituent unit of each program module is not limited to the division of each block shown in the block diagrams (FIG. 1, FIG. 3, FIG. 13), and may be selected as appropriate at the time of implementation.
  • as a method of supplying the program into the apparatus, a currently common procedure can be adopted, such as installing it via various computer-readable recording media (19) such as a CD (Compact Disk)-ROM or a flash memory, or downloading it from the outside via a communication line (80) such as the Internet.
  • each embodiment can be considered to be configured by a code (program group 17A) constituting the computer program or a storage medium (19) in which the code is stored.
  • 10 information generation apparatus; 100 information generation apparatus; 101 analysis unit; 102 associative word determination unit; 103 synthesis unit; 104 sequential number generation unit; 110 collection unit; 120 storage unit; 200 network; 300 in-company system; 400 information generation apparatus; 420 storage unit; 80 communication network; 90 information processing apparatus; 11 CPU; 12 communication interface; 13 input/output user interface; 14 ROM; 15 RAM; 16 bus; 17 storage device; 18 drive device; 19 storage medium

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Provided is art for generating dummy information that is less readily distinguished by attackers as dummy information. This information-generating device comprises: an analysis unit that breaks down, into words, a character string included in constituent element information related to constituent elements of a system; an associated-word determination unit that, on the basis of concept information and in relation to words that are among the aforementioned words and are included in the concept information, determines associated words for said words; and a synthesis unit that generates dummy information comprising a character string that is different from the character string included in the constituent element information. Said dummy information is generated by combining: the associated words; and either words that come before and/or after the words in the character string prior to being broken down into the words used to determine the associated words, or associated words for said words.

Description

Information generation device, information generation method, and recording medium
The present invention relates to an information generation device, an information generation method, and a recording medium.
In recent years, defensive measures against cyber attacks on companies and social infrastructure have been studied. Such measures include monitoring, detecting, and blocking cyber attacks and virus intrusions.
However, given that cyber attacks show no sign of stopping, it is very difficult to completely prevent viruses from intruding into companies and social infrastructure, for reasons such as the evolution of attack methods and the technical difficulty of fully guaranteeing attack detection accuracy.
It is therefore necessary to consider defensive measures that assume a cyber attack will penetrate the network constituting a company or social infrastructure, or that a virus has already intruded into it.
Patent Document 1 describes a method of identifying the source of a leak when personal information or the like leaks to the outside. In the technique described in Patent Document 1, dummy data is mixed into database search results and associated with the identification information of the search requester, so that the search requester who leaked customer data including this dummy data can be identified.
Patent Document 2 describes an information processing apparatus that displays dummy data on a display unit when a condition for releasing a security lock is not satisfied.
As a data generation method, Patent Document 3 describes a method of dividing each of two words into two parts and concatenating the first half of one word with the second half of the other. Patent Document 4 describes a method of associating a term with a plurality of related terms, each of which is assigned a priority.
Patent Document 5 describes a method of increasing or decreasing the value of a sequential number phrase, which represents an order of numbers, letters, symbols, or the like.
JP 2005-222135 A; JP 2013-250776 A; JP 2009-271784 A; JP-A-10-214271; JP-A-8-171559
When dummy information such as dummy data is used to identify an attacker who has intruded into a network, the dummy information is preferably difficult for the attacker to recognize as dummy information. That is, the dummy information should be hard to distinguish from regular information, should not look out of place, and should readily lure the attacker. Here, regular information refers to information that already exists (is in use).
The technique described in Patent Document 1 generates dummy data by using synonyms, or by defining in advance words that are conceivable as, for example, a trade name in an address and combining them as appropriate.
The technique described in Patent Document 2 generates dummy data by swapping parts of the information within data usable as dummy data.
Dummy data generated in this way, by rewriting or combining the original information (data), has the problem that an attacker is likely to recognize it as a dummy.
In the method described in Patent Document 3, which concatenates character strings obtained by splitting words at arbitrary positions, the generated character string is a meaningless sequence of characters. Such dummy data is also likely to be recognized as a dummy by an attacker.
When the dummy information is a static list prepared in advance, it may be easily recognized as dummy information, depending on the ICT (Information and Communication Technology) system to which it is applied. Furthermore, when the regular information of the target ICT system is organized and dummy information suited to that system is generated manually, generating the dummy information takes an enormous amount of time as the amount of regular information and the number of target ICT systems increase.
The present invention has been made in view of the above problems, and an object thereof is to provide a technique for generating dummy information that is more difficult for an attacker to recognize as dummy information.
An information generation apparatus according to an aspect of the present invention includes: analysis means for decomposing, into words, a character string included in component information relating to components of a system; associative word determination means for determining, for a word that is among the decomposed words and is included in concept information, an associative word of the word based on the concept information; and synthesis means for generating dummy information consisting of a character string different from the character string included in the component information, by combining the associative word with either a word that precedes and/or follows, in the character string before decomposition, the word used to determine the associative word, or an associative word of that preceding or following word.
An information generation method according to an aspect of the present invention includes: decomposing, into words, a character string included in component information relating to components of a system; determining, for a word that is among the decomposed words and is included in concept information, an associative word of the word based on the concept information; and generating dummy information consisting of a character string different from the character string included in the component information, by combining the associative word with either a word that precedes and/or follows, in the character string before decomposition, the word used to determine the associative word, or an associative word of that preceding or following word.
A computer program that causes a computer to realize the above information generation apparatus or information generation method, and a computer-readable storage medium storing that computer program, also fall within the scope of the present invention.
According to the present invention, dummy information that is more difficult for an attacker to recognize as dummy information can be generated.
A functional block diagram showing an example of the functional configuration of the information generation apparatus according to the first embodiment of the present invention.
A diagram showing an example of the configuration of the information generation system according to the second embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the information generation apparatus according to the second embodiment of the present invention.
A flowchart showing an example of the processing flow of the information generation apparatus according to the second embodiment of the present invention.
A diagram for explaining the operation of the analysis unit of the information generation apparatus according to the second embodiment of the present invention.
A diagram for explaining the structure of the concept information stored in the storage unit of the information generation apparatus according to the second embodiment of the present invention.
A diagram for explaining the operation of the synthesis unit of the information generation apparatus according to the second embodiment of the present invention.
A diagram for explaining the operation of the information generation apparatus according to the second embodiment of the present invention when the collected component information is a file name.
A diagram for explaining the operation of the information generation apparatus according to the second embodiment of the present invention when the collected component information is a mail address.
A diagram for explaining the operation of the information generation apparatus according to the second embodiment of the present invention when the collected component information is a mail address.
A diagram for explaining the operation of the information generation apparatus according to the second embodiment of the present invention when the collected component information is a URI (Uniform Resource Identifier).
A diagram for explaining the structure of the concept information stored in the storage unit of the information generation apparatus according to the third embodiment of the present invention.
A functional block diagram showing an example of the functional configuration of the information generation apparatus according to the fifth embodiment of the present invention.
A diagram illustrating by way of example the hardware configuration of a computer (information processing apparatus) that can realize each embodiment of the present invention.
<First Embodiment>
A first embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 is a functional block diagram showing an example of the functional configuration of the information generation apparatus 10 according to the first embodiment of the present invention. FIG. 1 shows only the configuration specific to the present invention; the information generation apparatus 10 may of course include members not shown in FIG. 1. The directions of the arrows in the drawing are examples and do not limit the direction of signals between blocks. The same applies to the other block diagrams referred to hereinafter.
As shown in FIG. 1, the information generation apparatus 10 includes an analysis unit 101, an associative word determination unit 102, and a synthesis unit 103.
The analysis unit 101 analyzes a character string included in component information relating to components, such as servers, included in a system, and decomposes it into one or more words. The component information includes, for example, a host name identifying each component, a user account used to access each component, a file name identifying an information resource stored in each component, and a URI (Uniform Resource Identifier) indicating the location of that information resource; however, the component information of the present embodiment is not limited to these. The component information may also include, for example, the mail address of a user who uses each component. Because the component information is actually in use in the system, it is also referred to as the regular information of the system.
When the component information includes a plurality of character strings, the analysis unit 101 analyzes each character string and decomposes it into one or more words. The analysis unit 101 then outputs the decomposed words to the associative word determination unit 102 and the synthesis unit 103.
The associative word determination unit 102 receives the decomposed words from the analysis unit 101. It checks whether each received word is included in the concept information and, for each word that is included, determines an associative word of that word based on the concept information. The associative word determination unit 102 outputs the determined associative words to the synthesis unit 103.
The synthesis unit 103 receives the decomposed words from the analysis unit 101 and the associative words from the associative word determination unit 102. The synthesis unit 103 then combines an associative word with either a word that precedes and/or follows, in the character string before decomposition, the word used to determine the associative word, or an associative word of that preceding or following word. In this way, the synthesis unit 103 generates a character string different from the character strings included in the component information.
For example, when the character strings included in the component information are "spring01" and "fall02", the analysis unit 101 decomposes "spring01" into "spring" and "01", and decomposes "fall02" into "fall" and "02".
The associative word determination unit 102 then checks whether each of "spring", "fall", "01", and "02" is included in the concept information. When "spring" and "fall" are included in the concept information, the associative word determination unit 102 determines, based on the concept information, the associative words of "spring" and "fall" (here assumed to be "winter" and "autumn").
The synthesis unit 103 then combines (1) "winter" with (2) the word that follows, in the character string before decomposition, the word used to determine "winter".
One word or a plurality of words may be used to determine an associative word. For example, when "spring" and "fall" share the superordinate concept "season", the associative word determination unit 102 determines the associative word "winter" for these two words. When an associative word is determined from two words in this way, the character strings before decomposition of the words used to determine "winter" in (2) above are "spring01" and "fall02", and the words that follow the words used to determine "winter" are "01" and "02". The same applies to "autumn".
The synthesis unit 103 thus generates "winter01", "winter02", "autumn01", and "autumn02". These character strings differ from the character strings "spring01" and "fall02" included in the component information.
As described above, the synthesis unit 103 can generate dummy information consisting of the generated character strings.
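The "spring01" / "fall02" example above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the contents of `CONCEPT_INFO`, the regular-expression split used in place of real language analysis, and all function names are hypothetical and do not appear in the source. It also handles only the case where the concept word is the first word of the string.

```python
import re

# Hypothetical concept information: word -> superordinate concept.
CONCEPT_INFO = {
    "spring": "season",
    "fall": "season",
    "winter": "season",
    "autumn": "season",
}


def decompose(text):
    """Split a string into alphabetic and numeric runs (stand-in for the analysis unit 101)."""
    return re.findall(r"[A-Za-z]+|[0-9]+", text)


def associative_words(words):
    """Return words sharing a superordinate concept with the inputs, excluding the inputs."""
    parents = {CONCEPT_INFO[w] for w in words if w in CONCEPT_INFO}
    return sorted(w for w, p in CONCEPT_INFO.items() if p in parents and w not in words)


def generate_dummies(regular):
    """Combine associative words with the words that followed the original concept words."""
    decomposed = [decompose(s) for s in regular]
    # Keep only strings whose first word is found in the concept information.
    matched = [parts for parts in decomposed if parts and parts[0] in CONCEPT_INFO]
    heads = [parts[0] for parts in matched]
    tails = [parts[1] for parts in matched if len(parts) > 1]
    candidates = [a + t for a in associative_words(heads) for t in tails]
    # Dummy information must differ from the regular information.
    return [c for c in candidates if c not in regular]


print(generate_dummies(["spring01", "fall02"]))
# ['autumn01', 'autumn02', 'winter01', 'winter02']
```

The final filter reflects the requirement that the generated strings differ from the strings already present in the component information.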
(Effect)
When deciding component information such as a host name, users tend to name it according to heuristics such as classification, category, and sequential numbering. Because the dummy information generated by the information generation apparatus 10 according to the present embodiment is generated based on concept information, it is difficult to distinguish from component information named according to such heuristics. In addition, because this dummy information is generated based on the concept information of the component information, it does not look out of place even when used as component information.
The information generation apparatus 10 according to the present embodiment can thus generate dummy information that is more difficult for an attacker to recognize as dummy information. In other words, it can automatically generate dummy information that an attacker will use without noticing that it is a dummy.
A system that uses dummy information generated in this way can better detect attackers.
<Second Embodiment>
Next, a second embodiment based on the above-described first embodiment will be described. For convenience of explanation, members having the same functions as members included in the drawings described in the first embodiment are given the same reference numerals, and their description is omitted.
First, the configuration of the information generation system 1 according to the present embodiment will be described. FIG. 2 is a diagram showing an example of the configuration of the information generation system 1 according to the present embodiment. As shown in FIG. 2, the information generation system 1 includes an information generation apparatus 100 and an in-company system 300, which are communicably connected via a network 200.
The in-company system 300 is a system that uses ICT (Information and Communication Technology). In the present embodiment, a system built within a company is described as an example of an ICT system, but the ICT system of the present embodiment is not limited to this. The ICT system may be any environment in which services can be used via a network or the like.
The in-company system 300 includes various devices such as servers, clients, and storage. Hereinafter, these are referred to as the components of the in-company system 300.
The information generation apparatus 100 is an apparatus that generates dummy information. Its configuration is described with reference to a separate drawing.
(Configuration of information generation apparatus 100)
Next, the functional configuration of the information generation apparatus 100 according to the present embodiment will be described with reference to FIG. 3. FIG. 3 is a functional block diagram showing an example of the functional configuration of the information generation apparatus 100 according to the present embodiment. The information generation apparatus 100 is configured by adding a sequential number generation unit 104, a collection unit 110, and a storage unit 120 to the information generation apparatus 10 described in the first embodiment. Specifically, as shown in FIG. 3, the information generation apparatus 100 includes an analysis unit 101, an associative word determination unit 102, a synthesis unit 103, a sequential number generation unit 104, a collection unit 110, and a storage unit 120.
The collection unit 110 is means for collecting, via the network 200, component information relating to each component of the in-company system 300.
The collection unit 110 collects the component information from, for example, a directory service, and outputs the collected component information (hereinafter referred to as collected data) to the analysis unit 101. The collection unit 110 may also store the collected component information in, for example, the storage unit 120 described later or within the collection unit 110 itself. The collection unit 110 may be configured to collect only data of a specific type (for example, host names) from the component information relating to each component of the in-company system 300.
The analysis unit 101 receives the collected data from the collection unit 110. As with the analysis unit 101 in the first embodiment, it performs language analysis on the one or more character strings included in the collected data (component information) and decomposes each character string into one or more words. In the present embodiment, the analysis unit 101 uses morphological analysis as the language analysis, but the language analysis method of the present embodiment is not limited to this; other language analysis methods may be used.
The analysis unit 101 outputs the one or more words decomposed from each character string (referred to as decomposed data) to the associative word determination unit 102, the synthesis unit 103, and the sequential number generation unit 104. At this time, the analysis unit 101 assigns an attribute to each of the decomposed words. This attribute includes information indicating the position of the word in the character string before decomposition (the character string included in the collected data). For example, when the analysis unit 101 decomposes the character string "AAA01" into "AAA" and "01", the attribute of "AAA" includes information indicating that it is the word at the beginning of the character string before decomposition. The information included in the attribute is not limited to this. The attribute may include, for example, information that a space follows the decomposed word, or information indicating the character string before decomposition.
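One way to picture the decomposed data and its attributes is as a small record per word. The sketch below is an assumption-laden illustration: the `Token` class, its field names, and the regular-expression split are hypothetical, and the split stands in for the morphological analysis the embodiment actually describes.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    word: str                 # the decomposed word
    index: int                # position within the original string (0 = first word)
    followed_by_space: bool   # whether a space follows the word in the original string
    source: str               # the character string before decomposition


def decompose_with_attributes(text):
    """Split text into alphabetic/numeric runs and attach positional attributes."""
    tokens = []
    for i, m in enumerate(re.finditer(r"[A-Za-z]+|[0-9]+", text)):
        has_space = text[m.end():m.end() + 1] == " "
        tokens.append(Token(m.group(), i, has_space, text))
    return tokens


for token in decompose_with_attributes("AAA01"):
    print(token)
```

Keeping the original string and the word position on every token lets a later synthesis step put a replacement word back in the same place.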
When the collected data includes a plurality of types of component information, the analysis unit 101 may extract the data of the type to be analyzed from the collected data and perform the analysis processing on it.
The storage unit 120 stores concept information prepared in advance. Concept information is information including a concept dictionary, which is a dictionary for defining the concepts of words. The concept information is not limited to this and may include, for example, antonyms and synonyms of a word. The concept information may also be changed as appropriate according to the components in the in-company system 300.
In the present embodiment, a configuration in which the storage unit 120 is built into the information generation apparatus 100 is described as an example, but the configuration of the storage unit 120 is not limited to this. The storage unit 120 may be realized by a storage device separate from the information generation apparatus 100.
Like the associative word determination unit 102 of the first embodiment, the associative word determination unit 102 determines, based on the concept information, an associative word for each of the one or more words produced by the analysis unit 101. Specifically, the associative word determination unit 102 receives from the analysis unit 101 the decomposed data obtained by language analysis. It then refers to the concept information stored in the storage unit 120 and checks whether each of the one or more words in the decomposed data is included in the concept information. For each word that is included in the concept information, the associative word determination unit 102 determines another word having the same superordinate concept as that word to be an associative word for it.
The associative word determination unit 102 assigns to each determined associative word, as its attribute, the attribute of the word used to determine it (a word included in the decomposed data output from the analysis unit 101). For example, when the attribute of the word used to determine the associative word (referred to as the original word) includes information indicating its position in the character string before decomposition, the associative word determination unit 102 includes that position information in the attribute of the associative word. The attribute of the associative word may also include information indicating the original word.
The associative word determination unit 102 then outputs the determined associative word to the serial number generation unit 104.
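The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `CONCEPTS` dictionary, the `Word` record, and the function name `associative_words` are hypothetical and do not appear in the embodiment, and the attribute is reduced to a single position label.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical concept dictionary: word -> its superordinate concepts.
CONCEPTS: Dict[str, Set[str]] = {
    "spring": {"season"},
    "summer": {"season"},
    "fall": {"season"},
    "autumn": {"season"},
    "winter": {"season"},
}


@dataclass(frozen=True)
class Word:
    text: str
    position: str                 # e.g. "first", "last", "whole"
    origin: Optional[str] = None  # original word, set only for associative words


def associative_words(word: Word, concepts: Dict[str, Set[str]] = CONCEPTS) -> List[Word]:
    """Return words sharing a superordinate concept with `word`.

    Each returned word inherits the position attribute of the original
    word and records which word it was derived from.
    """
    parents = concepts.get(word.text)
    if not parents:
        return []  # the word is not in the concept information
    siblings = sorted(
        other
        for other, other_parents in concepts.items()
        if other != word.text and parents & other_parents
    )
    return [Word(text=s, position=word.position, origin=word.text) for s in siblings]
```

A word that is absent from the dictionary, such as "test", simply yields no associative words, which mirrors the check against the concept information described above.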
The serial number generation unit 104 receives the decomposed data from the analysis unit 101. It then identifies, among the received words, those from which serial numbers or consecutive phrases (hereinafter, serial number phrases) can be generated. Words from which serial number phrases can be generated are those with continuity, such as digits and letters of the alphabet. Such words are not limited to digits and letters; they may be, for example, terms such as “α, β, γ, ...”. In other words, any word that belongs to a predetermined array qualifies.
The serial number generation unit 104 generates serial number phrases for each identified word. That is, it extracts as serial number phrases words that belong to the predetermined array containing the identified word and that differ from the identified word. The serial number generation unit 104 assigns to each generated serial number phrase, as its attribute, the attribute of the word from which it was generated. For example, when the attribute of the original word includes information indicating its position in the character string before decomposition, the serial number generation unit 104 includes that position information in the attribute of the serial number phrase. The attribute of the serial number phrase may also include information indicating the original word.
The serial number generation unit 104 then outputs the generated serial number phrases to the synthesis unit 103.
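As a rough illustration of extracting other members of a predetermined array, the sketch below assumes two hypothetical arrays (lowercase letters and a few Greek letters) and returns the elements that follow the identified word. The names `SEQUENCES` and `sequential_phrases` are assumptions, and choosing the following elements rather than any other members is one possible policy, not something the embodiment prescribes.

```python
import string
from typing import List

# Hypothetical predetermined arrays; the description only requires that
# a word belong to some predetermined array.
SEQUENCES: List[List[str]] = [
    list(string.ascii_lowercase),
    ["α", "β", "γ", "δ", "ε"],
]


def sequential_phrases(word: str, count: int = 2) -> List[str]:
    """Return up to `count` elements that follow `word` in its array."""
    for seq in SEQUENCES:
        if word in seq:
            i = seq.index(word)
            return seq[i + 1 : i + 1 + count]
    return []  # the word belongs to no predetermined array
```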
The synthesis unit 103 receives the decomposed data from the analysis unit 101, the associative words from the associative word determination unit 102, and the serial number phrases from the serial number generation unit 104. The synthesis unit 103 then combines (synthesizes) each associative word with a word that, in the character string before decomposition, precedes and/or follows the word used to determine that associative word, or with an associative word of such a neighboring word. A specific example of the synthesis method of the synthesis unit 103 is described later with reference to a separate drawing.
(Processing flow of the information generation device 100)
Next, the processing of the information generation device 100 according to the present embodiment is described with reference to FIGS. 4 to 7. FIG. 4 is a flowchart showing an example of the processing flow of the information generation device 100 according to the present embodiment. FIGS. 5 to 7 are diagrams for explaining the operation of the information generation device 100 according to the present embodiment. The following description takes as an example the case where the component information collected by the collection unit 110 is a host name.
As shown in FIG. 4, the collection unit 110 collects component information (step S41). The analysis unit 101 then performs language analysis on the character strings included in the collected component information (collected data) and decomposes each into one or more words (step S42).
The processing of step S42 is now described with an example. FIG. 5 is a diagram for explaining the operation of the analysis unit 101 of the information generation device 100 according to the present embodiment.
Assume that the collected data collected by the collection unit 110 includes the four host names “spring-a”, “fall”, “test01”, and “test02” shown on the left side of FIG. 5. The analysis unit 101 performs language analysis on each of these host names and decomposes it into one or more words. That is, the analysis unit 101 generates decomposed data containing the six words “spring”, “-a”, “fall”, “test”, “01”, and “02”. The analysis unit 101 may include the number of occurrences in the attribute of a word that appears more than once (“test” in this example).
The analysis unit 101 also includes, in the attributes of “spring” and “test”, information indicating that the word is the first part of the character string before decomposition, and includes, in the attributes of “-a”, “01”, and “02”, information indicating that the word is the last part of the character string before decomposition. The analysis unit 101 may include, in the attribute of “fall”, information indicating that the word was not decomposed.
The analysis unit 101 outputs the decomposed data containing these words to the associative word determination unit 102, the synthesis unit 103, and the serial number generation unit 104.
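The decomposition in this example can be approximated as in the sketch below. It is not the language analysis of the embodiment: a simple regular expression stands in for it as an assumption, and the attribute labels ("first", "last", "whole", "middle") and the function name `decompose` are hypothetical.

```python
import re
from typing import Dict, Iterable

# Assumed stand-in for language analysis: runs of letters or digits,
# with a leading hyphen kept attached to the word that follows it.
TOKEN = re.compile(r"-?[A-Za-z]+|-?\d+")


def decompose(host_names: Iterable[str]) -> Dict[str, dict]:
    """Split each name into words; record position attributes and counts."""
    words: Dict[str, dict] = {}
    for name in host_names:
        tokens = TOKEN.findall(name)
        for i, tok in enumerate(tokens):
            if len(tokens) == 1:
                pos = "whole"   # the name was not decomposed
            elif i == 0:
                pos = "first"
            elif i == len(tokens) - 1:
                pos = "last"
            else:
                pos = "middle"
            entry = words.setdefault(tok, {"positions": set(), "count": 0})
            entry["positions"].add(pos)
            entry["count"] += 1
    return words
```

Applied to the four host names above, this yields the six words in the order given in the text, with "test" counted twice.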
Returning to FIG. 4, the description of the processing by the information generation device 100 continues. After step S42, the associative word determination unit 102 refers to the storage unit 120 and determines associative words for those of the one or more words in the decomposed data that are included in the concept information (step S43).
The processing of step S43 is further described with reference to FIG. 6. FIG. 6 is a diagram for explaining the structure of the concept information stored in the storage unit 120. As shown in FIG. 6, the storage unit 120 stores concept information in a tree structure. The data structure of the concept information is not limited to this; any structure from which the superordinate and/or subordinate concepts of a word can be determined may be used.
FIG. 6 includes “spring”, “summer”, “winter”, “fall”, “autumn”, and so on as an example of concept information. “season”, the superordinate concept of these words, is associated with each of them as its superordinate concept. Likewise, “xxx”, the superordinate concept of “season”, is associated with “season”.
In FIG. 6, each word is associated with a single immediate superordinate concept, but the present embodiment is not limited to this. One word may be associated with a plurality of superordinate concepts. For example, “spring” may also be associated with the superordinate concept “elastic body”.
The associative word determination unit 102 checks whether “spring”, which is included in the decomposed data, is included in the concept information and, if so, searches for the superordinate concept of that word. When the superordinate concept “season” of “spring” is found, the associative word determination unit 102 determines an arbitrary word among the subordinate concepts of “season” to be an associative word. In this example, assume that the associative word determination unit 102 determines “winter” to be the associative word of “spring”. Similarly, it determines “autumn” to be the associative word of “fall”.
The associative word determination unit 102 also checks whether “test” is included in the concept information. In this example, assume that “test” is not included. It likewise checks whether each of “-a”, “01”, and “02” is included in the concept information. In this example, assume that these words are not included either.
In this way, the associative word determination unit 102 checks whether each word in the decomposed data is included in the concept information and determines an associative word for each word that is. It then includes the attribute of the original word in the attribute of the determined word. That is, it includes in the attribute of “winter” information indicating that the word is the first part of the character string before decomposition. It may also include in the attribute of “autumn” information indicating that the word was not decomposed. With the structure of FIG. 6, “autumn” can also serve as an associative word of “spring”, and “winter” as an associative word of “fall”. The associative word determination unit 102 may therefore include, for example, the attribute of “fall” in the attribute of “winter”.
When the decomposed data contains a plurality of words that share a superordinate concept, the associative word determination unit 102 preferably uses these words together to determine their superordinate concept. For example, if the superordinate concepts of “spring” are “season” and “elastic body”, the associative word determination unit 102 might otherwise determine a subordinate concept of “elastic body” to be an associative word of “spring”.
The associative word determination unit 102 therefore searches for a superordinate concept common to “spring” and “fall”. That is, it checks whether a superordinate concept of “spring” and a superordinate concept of “fall” are the same. If they are, it retrieves that superordinate concept, “season”, and determines words that are subordinate concepts of “season” other than “spring” and “fall” to be associative words of “spring” and “fall”. This allows the information generation device 100 to generate dummy information that an attacker is unlikely to identify as dummy.
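One way to picture this disambiguation is to intersect the superordinate concepts of the words found in the decomposed data, as in the following sketch. The dictionary contents (including the entry "rubber") and the function name `associations_via_shared_concept` are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical concept dictionary in which "spring" is ambiguous.
CONCEPTS: Dict[str, Set[str]] = {
    "spring": {"season", "elastic body"},
    "fall": {"season"},
    "summer": {"season"},
    "winter": {"season"},
    "autumn": {"season"},
    "rubber": {"elastic body"},
}


def associations_via_shared_concept(
    words: Iterable[str], concepts: Dict[str, Set[str]] = CONCEPTS
) -> Tuple[Set[str], List[str]]:
    """Find the concepts shared by the known words and their other members."""
    known = [w for w in words if w in concepts]
    if len(known) < 2:
        return set(), []  # nothing to disambiguate against
    shared = set.intersection(*(concepts[w] for w in known))
    candidates = sorted(
        w for w, parents in concepts.items()
        if w not in known and parents & shared
    )
    return shared, candidates
```

With "spring" and "fall" both present, only "season" survives the intersection, so a word such as "rubber" is never offered as an associative word of "spring".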
At this time, the associative word determination unit 102 preferably determines at least as many associative words as there are original words used to determine them. This allows the information generation device 100 to generate at least as many pieces of dummy information as there are pieces of regular information. The number of words determined as associative words is nevertheless arbitrary and need not equal the number of original words.
The associative word determination unit 102 then includes the attributes of “spring” and “fall” in the attributes of each of “winter” and “autumn”, the words determined as associative words. That is, it includes in the attributes of each of “winter” and “autumn” information indicating that the word is the first part of the character string before decomposition and information indicating that the word was not decomposed.
The associative word determination unit 102 then outputs “winter” and “autumn” to the synthesis unit 103.
The associative word determination unit 102 may also give the same attributes to words that share a superordinate concept. For example, since “spring” and “fall” have the same superordinate concept as shown in FIG. 6, the associative word determination unit 102 may include the attribute of “fall” in the attribute of “spring” and the attribute of “spring” in the attribute of “fall”. In that case, it outputs to the synthesis unit 103 information indicating that the attributes of words in the decomposed data have been changed.
Returning to FIG. 4, the description of the processing by the information generation device 100 continues. The serial number generation unit 104 generates serial number phrases for those of the one or more words in the decomposed data from which serial number phrases can be generated (step S44). Step S44 need only follow step S42; it may be performed at the same time as step S43 or before step S43.
For example, on receiving decomposed data such as that shown on the right side of FIG. 5, the serial number generation unit 104 identifies “-a”, “01”, and “02” as words with continuity. It then generates “-b” and “-c” based on “-a”, and “03” and “04” based on “01” and “02”. The number of serial number phrases generated by the serial number generation unit 104 is not particularly limited.
The serial number generation unit 104 then includes, in the attributes of each of “-b”, “-c”, “03”, and “04”, the information contained in the attributes of “-a”, “01”, and “02”, namely that the word is the last part of the character string before decomposition. It then outputs the generated serial number phrases to the synthesis unit 103.
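The continuation of “-a” and of “01”, “02” in this example could be implemented along the following lines. The sketch assumes that a word consists of an optional non-alphanumeric prefix and suffix around either a digit run or a single lowercase letter, and that zero padding should be preserved; the pattern and the function name `continue_sequence` are hypothetical.

```python
import re
from typing import List

# Assumed word shape: optional punctuation, then digits or one letter,
# then optional punctuation (e.g. "-a", "01", "a-").
PATTERN = re.compile(
    r"(?P<prefix>[^0-9A-Za-z]*)(?P<body>\d+|[a-z])(?P<suffix>[^0-9A-Za-z]*)"
)


def continue_sequence(words: List[str], count: int = 2) -> List[str]:
    """Generate `count` phrases continuing a run of same-shaped words."""
    parsed = [PATTERN.fullmatch(w) for w in words]
    if not parsed or any(m is None for m in parsed):
        return []
    prefix, suffix = parsed[0]["prefix"], parsed[0]["suffix"]
    bodies = [m["body"] for m in parsed]
    if all(b.isdigit() for b in bodies):
        width = len(bodies[0])  # keep zero padding such as "01"
        start = max(int(b) for b in bodies) + 1
        return [f"{prefix}{n:0{width}d}{suffix}" for n in range(start, start + count)]
    if all(b.isalpha() for b in bodies):
        start = max(ord(b) for b in bodies) + 1
        return [
            f"{prefix}{chr(c)}{suffix}"
            for c in range(start, start + count)
            if c <= ord("z")  # stop at the end of the alphabet
        ]
    return []  # mixed digit and letter bodies are not continued
```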
After steps S43 and S44 are complete, the synthesis unit 103 generates dummy information by performing synthesis processing using the associative words “winter” and “autumn”, the serial number phrases “-b”, “-c”, “03”, and “04”, and the words included in the decomposed data (step S45).
Using the serial number phrases generated by the serial number generation unit 104 in the synthesis in this way makes it possible to generate, in a greater variety of patterns, dummy information that an attacker is unlikely to identify as dummy.
The processing of step S45 is further described with reference to FIG. 7. FIG. 7 is a diagram for explaining the operation of the synthesis unit 103 of the information generation device 100 according to the present embodiment.
As shown in FIG. 7, the synthesis unit 103 combines the following (A) and (B):
(A) associative words, or those words received from the analysis unit 101 from which no serial number phrase can be generated (shown as decomposed data A in FIG. 7),
(B) serial number phrases, or those words received from the analysis unit 101 from which a serial number phrase can be generated (shown as decomposed data B in FIG. 7).
At this time, within each of (A) and (B), the synthesis unit 103 groups the items having the same attribute information into one array and generates dummy information by combining (synthesizing) these arrays with one another. In this example, the attributes of the associative words and decomposed data A in (A) include information indicating that the word is the first part of the character string before decomposition, and the attributes of the serial number phrases and decomposed data B in (B) include information indicating that the word is the last part of the character string before decomposition. The synthesis unit 103 therefore treats the associative words and decomposed data A in (A) as one array and the serial number phrases and decomposed data B in (B) as another array, and combines the elements (words) of these arrays with one another. In FIG. 7, the “X” between the arrays indicates that the elements of the arrays are combined with one another.
At this time, to ensure that a combined character string is not identical to an original character string (regular information), the synthesis unit 103 checks that the combined character string is not included in the component information stored in the storage unit 120 or the collection unit 110. A combined character string that is included in the component information is an original character string, so the synthesis unit 103 does not use it as dummy information.
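The combination of the two arrays and the exclusion of regular information can be sketched as a Cartesian product followed by a filter. The function name `synthesize` is hypothetical, and the word lists in the usage note are one possible reading of the host name example.

```python
from itertools import product
from typing import Iterable, List


def synthesize(
    first_words: Iterable[str],
    last_words: Iterable[str],
    originals: Iterable[str],
) -> List[str]:
    """Combine every first-part word with every last-part word.

    Strings that coincide with collected (regular) component information
    are dropped so that only dummy information remains.
    """
    regular = set(originals)
    candidates = ("".join(pair) for pair in product(first_words, last_words))
    return [c for c in candidates if c not in regular]
```

For instance, calling `synthesize(["spring", "test", "winter", "autumn"], ["-a", "01", "02", "-b", "-c", "03", "04"], ["spring-a", "fall", "test01", "test02"])` keeps strings such as "winter-b" and "test03" while discarding "spring-a", "test01", and "test02".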
In this way, the synthesis unit 103 generates dummy information using associative words generated based on the concept information for the component information. The information generation device 100 can therefore generate dummy information that would not look out of place if used as component information. The synthesis unit 103 may also include in the dummy information character strings that use no associative word, such as “test03” and “test-a”. This makes it possible to generate a larger number of character strings as dummy information.
The synthesis unit 103 may store the generated dummy information, as information viewable by an attacker, in an external storage device or the like, for example, or may transmit it to a predetermined device. The synthesis unit 103 may also be configured to transmit the generated dummy information to another device or system in response to an inquiry from that device or system.
FIGS. 5 to 7 have been described taking the case where the component information is a host name as an example, but the information generation device 100 can generate dummy information for other types of component information in the same way. For example, when the component information is a user account, the information generation device 100 can generate dummy information for the user account by the same method as for a host name.
The information generation device 100 can also generate dummy information when the component information is a file name, a URI, or a mail address.
The following describes how the information generation device 100 generates dummy information, separately for the cases where the component information is a file name, a URI, and a mail address.
(When the component information is a file name)
The following description takes as an example the case where the component information collected by the collection unit 110 is a file name. FIG. 8 is a diagram for explaining the operation of the information generation device 100 when the component information to be collected (collected data) is a file name.
Even when the collected data is a file name, the information generation device 100 can generate dummy information by the same processing as for a host name. FIG. 8 illustrates the case where the file name is a character string containing spaces between words.
As shown in FIG. 8, assume that the file name included in the collected data is “Japanese summer 2014”. The analysis unit 101 decomposes this character string (file name) into “Japanese”, “summer”, and “2014”. It then includes in the attribute of “Japanese” information indicating that the word is the first part of the character string before decomposition and information indicating that a space follows the word. It includes in the attribute of “summer” information indicating that the word is the second part of the character string before decomposition and information indicating that a space follows the word. It includes in the attribute of “2014” information indicating that the word is the last part of the character string before decomposition.
As described above, the analysis unit 101 includes the information indicating the position of a space in the attribute of the word immediately before that space, but the analysis unit 101 of the present embodiment is not limited to this. The analysis unit 101 may instead include the information indicating the position of a space in the attribute of the word immediately after the space.
The associative word determination unit 102 then checks whether these words are included in the concept information. In this example, assume that “Japanese” and “summer” are included. The associative word determination unit 102 searches for the superordinate concepts of “Japanese” and “summer”. In this example, assume that “Japanese” and “summer” do not share a superordinate concept. The associative word determination unit 102 determines an associative word of “Japanese” (for example, “American”) and an associative word of “summer” (for example, “winter”).
The associative word determination unit 102 then includes the attribute of “Japanese” in the attribute of “American” and the attribute of “summer” in the attribute of “winter”.
The serial number generation unit 104 generates “2013” based on “2014” and includes the attribute of “2014” in the attribute of “2013”.
The synthesis unit 103 then generates dummy information as follows: for each of (1) the associative words and the words received from the analysis unit 101 from which no serial number phrase can be generated, and (2) the serial number phrases and the words received from the analysis unit 101 from which a serial number phrase can be generated, it groups the items having the same attribute information into one array and combines these arrays with one another.
That is, as shown in FIG. 8, the synthesis unit 103 combines the following (A) to (C):
(A) “American” and “Japanese”, which are the words, among the associative words and the words received from the analysis unit 101 from which no serial number phrase can be generated, that have the attribute of being the first part of the character string before decomposition,
(B) “winter” and “summer”, which are the words, among the associative words and the words received from the analysis unit 101 from which no serial number phrase can be generated, that have the attribute of being the second part of the character string before decomposition,
(C) “2013” and “2014”, which are the words, among the serial number phrases and the words received from the analysis unit 101 from which a serial number phrase can be generated, that have the attribute of being the last part of the character string before decomposition.
When combining (A) to (C), the synthesis unit 103 inserts spaces at the appropriate positions according to the space position information included in the attribute of each word. As shown on the right side of FIG. 8, the synthesis unit 103 can thereby generate dummy information consisting of character strings such as “American winter 2013” and “American summer 2014”.
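The file name case extends the same idea to three arrays and a per-word space attribute. In the sketch below each word is represented as a hypothetical `(text, position, space_after)` tuple; this representation and the function name `synthesize_file_names` are assumptions, not the attribute format of the embodiment.

```python
from itertools import product
from typing import Dict, Iterable, List, Tuple

# Hypothetical attribute tuple: (text, position index, space follows the word).
WordAttr = Tuple[str, int, bool]


def synthesize_file_names(
    words: Iterable[WordAttr], originals: Iterable[str]
) -> List[str]:
    """Group words by position, restore spaces, and combine the groups."""
    slots: Dict[int, List[str]] = {}
    for text, pos, space_after in words:
        slots.setdefault(pos, []).append(text + (" " if space_after else ""))
    arrays = [slots[p] for p in sorted(slots)]
    regular = set(originals)
    names = ("".join(parts) for parts in product(*arrays))
    return [n for n in names if n not in regular]
```

Given the words of the example ("Japanese" and "American" in the first position, "summer" and "winter" in the second, "2014" and "2013" in the last), the result contains "American winter 2013" and "American summer 2014" but not the regular file name "Japanese summer 2014".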
(When the component information is a mail address)
The following description takes as an example the case where the component information collected by the collection unit 110 is a mail address. FIGS. 9 and 10 are diagrams for explaining the operation of the information generation device 100 when the component information to be collected (collected data) is a mail address.
As shown in FIG. 9, when the collected data consists of mail addresses, the analysis unit 101 decomposes each mail address into a local part and a domain. It then analyzes the character string in the local part and decomposes it into words. An example of the decomposed words is shown on the right side of FIG. 9. The local part “a-xxx” of the first mail address shown on the right side of FIG. 9 is decomposed into “a-” and “xxx”. The analysis unit 101 assigns to “a-”, as an attribute, information indicating that the word is the first part of the character string before decomposition, and assigns to “xxx” information indicating that the word is the second part (the last part within the local part) of the character string before decomposition. The analysis unit 101 likewise decomposes the other mail addresses into a local part and a domain and divides each local part into words. In this example, the at sign is included in the domain as its first character.
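A minimal sketch of this decomposition is given below, assuming that a trailing hyphen stays attached to the preceding word (as in “a-”) and that a regular expression can stand in for the language analysis. The function name `decompose_address` and the address `a-xxx@example.com` used in the usage note are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed stand-in for language analysis of the local part: runs of
# letters or digits, each optionally followed by a hyphen.
LOCAL_TOKEN = re.compile(r"[A-Za-z]+-?|\d+-?")


def decompose_address(address: str) -> Tuple[List[str], str]:
    """Split a mail address into local-part words and a domain.

    Following the example in the text, the at sign is kept as the
    first character of the domain.
    """
    local, at, domain = address.partition("@")
    if not at:
        raise ValueError("not a mail address: " + address)
    return LOCAL_TOKEN.findall(local), at + domain
```

Under these assumptions, `decompose_address("a-xxx@example.com")` returns the words "a-" and "xxx" together with the domain "@example.com".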
Assume then that the associative word determination unit 102 determines “vvv” and “nnn” as associative words for “xxx” and “kkk”, and “yy” as an associative word for “zz”.
Assume also that the sequence number generation unit 104 generates “c-” and “d-” as sequential-number phrases for “a-” and “b-”, and “02” as a sequential-number phrase for “01”.
The synthesis unit 103 then sorts the words into two classes: (1) the associative words, together with the words received from the analysis unit 101 from which no sequential-number phrase can be generated, and (2) the sequential-number phrases, together with the words received from the analysis unit 101 from which a sequential-number phrase can be generated. Within each class, the words having the same attribute information form one array, and the domains form another array. The synthesis unit 103 generates dummy information by combining these arrays.
That is, as shown in FIG. 10, the synthesis unit 103 combines the following (A) to (C).
(A) “a-”, “b-”, “c-” and “d-”: among the sequential-number phrases and the words received from the analysis unit 101 from which a sequential-number phrase can be generated, those having the attribute of being the first word of the character string before decomposition.
(B) “xxx”, “kkk”, “vvv” and “nnn”: among the associative words and the words received from the analysis unit 101 from which no sequential-number phrase can be generated, those having the attribute of being the second word of the character string before decomposition.
(C) The domains.
As shown in FIG. 10, the synthesis unit 103 also combines the following (D) to (F).
(D) “zz” and “yy”: among the associative words and the words received from the analysis unit 101 from which no sequential-number phrase can be generated, those having the attribute of being the first word of the character string before decomposition.
(E) “01” and “02”: among the sequential-number phrases and the words received from the analysis unit 101 from which a sequential-number phrase can be generated, those having the attribute of being the second word of the character string before decomposition.
(F) The domains.
As a result, the synthesis unit 103 can generate dummy information consisting of character strings such as “a-vvv@yyy.ne.jp”, as shown on the right side of FIG. 10.
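Combining arrays (A) to (C) amounts to taking their Cartesian product. The sketch below assumes a single domain, “@yyy.ne.jp”, and uses hypothetical variable names; it shows only the combination step, not any filtering that the synthesis unit 103 may apply afterwards.

```python
from itertools import product

first_words = ["a-", "b-", "c-", "d-"]        # array (A): first-position words
second_words = ["xxx", "kkk", "vvv", "nnn"]   # array (B): second-position words
domains = ["@yyy.ne.jp"]                      # array (C): assumed single domain, '@' included

# Every choice of one element per array yields one candidate address.
candidates = ["".join(parts) for parts in product(first_words, second_words, domains)]
print(len(candidates), candidates[:3])
```

Because the words are grouped by position attribute, each candidate keeps the shape of a collected address: a first-position word, then a second-position word, then a domain.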
In this way, the information generation apparatus 100 according to the present embodiment can generate dummy information that an attacker is unlikely to recognize as dummy information, even when the component information is an email address.
(When the component information is a URI)
The following describes an example in which the component information collected by the collection unit 110 is a URI. FIG. 11 is a diagram for explaining the operation of the information generation apparatus 100 when the collected data are URIs.
As shown in FIG. 11, when the collected data are URIs, the analysis unit 101 splits the character string of each URI into hierarchy levels. The analysis unit 101 then analyzes the character string of each level and decomposes it into words. Examples of the decomposed words are shown in FIG. 11. As shown there, the first-level character string “folder01” of the URI is decomposed into “folder” and “01”.
At this time, the analysis unit 101 includes, as attributes of each decomposed word, information indicating the hierarchy level that contains the character string from which the word was decomposed and information indicating the position of the word within that character string.
The associative word determination unit 102 then determines associative words for the decomposed words, and the sequence number generation unit 104 generates sequential-number phrases. The synthesis unit 103 synthesizes the words level by level and then synthesizes the character strings of the levels. The synthesis method is the same as that described above, so its description is omitted. The synthesis unit 103 generates, as dummy information, a character string in which the delimiter that separates hierarchy levels is inserted between the levels.
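A minimal sketch of the URI case follows, assuming “/” as the level delimiter and a simple letter-run or digit-run tokenization. The second-level name “report” and the substitute words “dir” and “memo” are hypothetical; only “folder01” comes from the text.

```python
import re
from itertools import product

def decompose_path(path):
    """Tag each word with its hierarchy level and its position within that level."""
    tagged = []
    for level, segment in enumerate(path.strip("/").split("/"), start=1):
        tokens = re.findall(r"[A-Za-z]+|[0-9]+", segment)  # assumed tokenization
        for position, token in enumerate(tokens, start=1):
            tagged.append({"word": token, "level": level, "position": position})
    return tagged

def synthesize(levels):
    """levels[i][j] lists the candidate words for position j of hierarchy level i."""
    per_level = [["".join(words) for words in product(*positions)] for positions in levels]
    # Join one string per level with the delimiter that separates levels.
    return ["/".join(parts) for parts in product(*per_level)]

tagged = decompose_path("folder01/report")
paths = synthesize([[["folder", "dir"], ["01", "02"]], [["report", "memo"]]])
print(tagged[0], len(paths))
```

Words are first combined within a level and the level strings are combined afterwards, which mirrors the two-stage synthesis described above.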
In this way, the information generation apparatus 100 according to the present embodiment can generate dummy information that an attacker is unlikely to recognize as dummy information, even when the component information is a URI.
As described above, the information generation apparatus 100 according to the present embodiment provides the same effects as the information generation apparatus 10 according to the first embodiment. In addition, the information generation apparatus 100 according to the present embodiment can more suitably generate dummy information that an attacker is unlikely to recognize as dummy information, even when the component information is a host name, a file name, a user account, an email address, a URI, or the like.
<Third Embodiment>
Next, a third embodiment of the present invention will be described with reference to the drawings. For convenience of explanation, members having the same functions as those included in the drawings described in the first and second embodiments are given the same reference numerals, and their description is omitted. The present embodiment describes another method by which the associative word determination unit 102 generates associative words. The information generation apparatus 100 according to the present embodiment has the same configuration as that shown in FIG. 3, so its description is omitted.
FIG. 12 is a diagram for explaining the structure of the concept information stored in the storage unit 120 of the information generation apparatus 100 according to the present embodiment. As shown in FIG. 12, the concept information in the storage unit 120 indicates that the superordinate concept of “spring”, “fall” and the like is “season”, and that the superordinate concept of “season” and the like is “xxx”. FIG. 12 also shows that “yyy” is a superordinate concept several levels above “xxx”. Furthermore, “zzz” is a subordinate concept of “yyy”, “fruit” is a subordinate concept of “zzz”, and “apple” and “orange” are subordinate concepts of “fruit”.
As shown in FIG. 12, the concept information stored in the storage unit 120 according to the present embodiment has a tree data structure, so each word is called a node in the present embodiment.
When the search finds a plurality of superordinate concepts, each shared by a plurality of words included in the concept information, the associative word determination unit 102 according to the present embodiment calculates the distance between those superordinate concepts. For example, suppose the words decomposed by the analysis unit 101 include “spring”, “fall”, “apple” and “orange”. The associative word determination unit 102 searches for the superordinate concepts of all of these words. Because “spring” and “fall” share the superordinate concept “season”, and “apple” and “orange” share the superordinate concept “fruit”, the associative word determination unit 102 determines that two superordinate concepts have been found.
The associative word determination unit 102 then calculates the inter-node distance between the “season” node and the “fruit” node. In the present embodiment, the distance between a node and its parent node is 1. This distance is also called the hop count. That is, the distance (hop count) from a node to its parent node is 1.
The associative word determination unit 102 then determines, as associative words, the words that are subordinate concepts of another superordinate concept located at approximately half the calculated distance (also called the intermediate distance or intermediate hop count) from at least one of the superordinate concepts (“season”, “fruit”).
For example, when the hop count between the “season” node and the “fruit” node is 8, the intermediate hop count is 4. The associative word determination unit 102 therefore determines, as associative words, words that are child nodes (subordinate concepts) of the node (superordinate concept) whose hop count from the “season” node is 4. If that node is “city” and its subordinate concepts are “tokyo”, “paris”, “kyoto” and the like, the associative word determination unit 102 determines a predetermined number of words from among them as associative words. The number of words determined as associative words is arbitrary.
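The intermediate-hop search can be sketched with a breadth-first search over the concept tree. The tree below is hypothetical: the nodes “a” to “e” are invented so that “season” and “fruit” are 8 hops apart and “city” is 4 hops from “season”, matching the example. Keeping only leaf children as words is also an assumption.

```python
from collections import deque

# Hypothetical concept tree stored as child -> parent.
PARENT = {
    "spring": "season", "fall": "season", "season": "xxx",
    "xxx": "d", "d": "c", "c": "b", "b": "a", "a": "yyy",
    "e": "d", "city": "e", "tokyo": "city", "paris": "city", "kyoto": "city",
    "zzz": "yyy", "fruit": "zzz", "apple": "fruit", "orange": "fruit",
}
CHILDREN = {}
for child, parent in PARENT.items():
    CHILDREN.setdefault(parent, []).append(child)

def hops_from(start):
    """Breadth-first search; every parent/child edge counts as one hop."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        neighbours = CHILDREN.get(node, []) + ([PARENT[node]] if node in PARENT else [])
        for nxt in neighbours:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def associative_words(start, other):
    dist = hops_from(start)
    middle = dist[other] // 2  # intermediate hop count
    words = []
    for node, d in dist.items():
        if d == middle:
            # Assumption: only leaf children count as words.
            words += [w for w in CHILDREN.get(node, []) if w not in CHILDREN]
    return sorted(words)

print(associative_words("season", "fruit"))
```

A real implementation would also pick a predetermined number of the returned words; that selection is left out here because the text says the number is arbitrary.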
The above example determines associative words from the subordinate concepts of the node whose hop count from the “season” node equals the intermediate hop count. The associative words may instead be determined from the node whose hop count from the “fruit” node equals the intermediate hop count.
In this way, the information generation apparatus 100 according to the present embodiment can generate dummy information using keywords that are close to the words (keywords) included in the legitimate information. The associative words it can determine are not limited to words that share, with a word included in the legitimate information, the superordinate concept one level above that word (called the direct superordinate concept). Words whose direct superordinate concept differs from that of the words in the legitimate information can also be determined as associative words. The information generation apparatus 100 according to the present embodiment can therefore generate dummy information that goes beyond the range expected from the legitimate information and that an attacker is even less likely to recognize as dummy information.
(Modification 1)
Next, a modification of the present embodiment will be described. This modification describes yet another method by which the associative word determination unit 102 generates associative words.
This modification describes, for example, the case where the search finds three or more superordinate concepts, each shared by a plurality of words included in the concept information. Suppose the words decomposed by the analysis unit 101 include “spring”, “fall”, “apple”, “orange”, “tokyo” and “paris”. The associative word determination unit 102 searches for the superordinate concepts of all of these words. Suppose “spring” and “fall” share the superordinate concept “season”, “apple” and “orange” share the superordinate concept “fruit”, and “tokyo” and “paris” share the superordinate concept “city”. The associative word determination unit 102 then determines that three superordinate concepts have been found.
The associative word determination unit 102 then calculates the hop count between each pair of the three superordinate concepts, namely (1) the distance between “season” and “fruit”, (2) the distance between “season” and “city”, and (3) the distance between “fruit” and “city”.
The associative word determination unit 102 then determines, as associative words, the words that are subordinate concepts of another superordinate concept located at the average of these hop counts (also called the average hop count) from at least one of the superordinate concepts (“season”, “fruit”, “city”).
For example, when the average of the hop counts is 4, the associative word determination unit 102 determines, as associative words, words that are child nodes (subordinate concepts) of the node whose hop count from the “fruit” node is 4. If that node is “restaurant” and its subordinate concepts are “cafeteria”, “teashop”, “pub” and the like, the associative word determination unit 102 determines a predetermined number of words from among them as associative words. The number of words determined as associative words is arbitrary.
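The average hop count over all pairs of superordinate concepts can be computed as in the sketch below. The tree is hypothetical (the intermediate nodes “quarter”, “time”, “food” and “root” are invented) and is chosen so that the three pairwise distances are 5, 4 and 3, giving an average of 4 as in the example.

```python
from itertools import combinations

# Hypothetical concept tree stored as child -> parent.
PARENT = {
    "season": "quarter", "quarter": "time", "time": "root",
    "fruit": "food", "food": "root",
    "city": "root",
}

def path_to_root(node):
    """List the node followed by all of its ancestors."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def hop_count(a, b):
    """Hops from a to b through their lowest common ancestor."""
    up_a, up_b = path_to_root(a), path_to_root(b)
    common = next(node for node in up_a if node in up_b)
    return up_a.index(common) + up_b.index(common)

def average_hop_count(concepts):
    pairs = list(combinations(concepts, 2))
    return sum(hop_count(a, b) for a, b in pairs) / len(pairs)

print(average_hop_count(["season", "fruit", "city"]))
```

The resulting value would then be used in place of the intermediate hop count when looking up the node whose subordinate concepts become associative words.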
The above example determines associative words from the subordinate concepts of the node whose hop count from the “fruit” node equals the average hop count. The associative words may instead be determined from the node whose hop count from the “season” node or the “city” node equals the average hop count. The subordinate concepts of the node whose hop count from the root equals the average hop count may also be used as associative words.
This modification thus allows the information generation apparatus 100 to obtain the same effects as the information generation apparatus 100 according to the third embodiment.
As with the associative word determination unit 102 according to the third embodiment, the associative word determination unit 102 according to this modification may additionally determine, as associative words, the words that are subordinate concepts of another superordinate concept located at the intermediate hop count of a calculated hop count from at least one of the superordinate concepts.
(Modification 2)
Next, another modification of the present embodiment will be described. This modification describes yet another method by which the associative word determination unit 102 generates associative words.
The associative word determination unit 102 according to this modification may determine associative words using an initial value given in advance. The initial value is, for example, a value indicating how many levels to go up from a given word. The associative word determination unit 102 identifies the word that is that many levels above a word included in the concept information (the superordinate concept at the predetermined distance). For example, when the word included in the concept information is “winter” and the initial value is 2, the associative word determination unit 102 identifies the superordinate concept two levels above “winter” (“xxx” in FIG. 12). The associative word determination unit 102 then determines the words that are subordinate concepts of “xxx” as associative words. The associative word determination unit 102 may add the associative words determined in this modification to those determined in the third embodiment or Modification 1.
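A minimal sketch of this lookup follows. The tree is hypothetical (“summer”, “month”, “january” and “july” are invented), and two readings are assumed rather than stated in the text: the words of a concept are taken to be its leaf descendants, and the starting word itself is left out of the result.

```python
# Hypothetical concept tree stored as child -> parent.
PARENT = {
    "spring": "season", "summer": "season", "fall": "season", "winter": "season",
    "january": "month", "july": "month",
    "season": "xxx", "month": "xxx",
}
CHILDREN = {}
for child, parent in PARENT.items():
    CHILDREN.setdefault(parent, []).append(child)

def ancestor(word, levels):
    """Go up `levels` parents from `word` (the initial value)."""
    node = word
    for _ in range(levels):
        node = PARENT[node]
    return node

def leaf_words(concept):
    """All leaf descendants of a concept (assumption: leaves are the words)."""
    kids = CHILDREN.get(concept, [])
    if not kids:
        return [concept]
    return [word for kid in kids for word in leaf_words(kid)]

def associative_words(word, levels):
    concept = ancestor(word, levels)
    # Assumption: the starting word is not its own associative word.
    return [w for w in leaf_words(concept) if w != word]

print(associative_words("winter", 2))
```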
Even when the associative word determination unit 102 determines associative words in this way, the information generation apparatus 100 according to this modification can generate dummy information that goes beyond the range expected from the legitimate information and that an attacker is even less likely to recognize as dummy information.
(Modification 3)
Next, another modification of the present embodiment will be described. This modification describes yet another method by which the associative word determination unit 102 generates associative words.
As in Modification 2, the associative word determination unit 102 according to this modification may determine associative words using an initial value given in advance. In this modification, the initial value indicates the number of associative words required.
The associative word determination unit 102 searches for the superordinate concept of a word included in the concept information and determines the words that are subordinate concepts of that superordinate concept as associative words. If the number of such words is smaller than the initial value, the associative word determination unit 102 instead determines, as associative words, the words that are subordinate concepts of the concept one level further up.
For example, when the word included in the concept information is “winter” and the initial value is 8, the associative word determination unit 102 identifies the superordinate concept one level above “winter” (“season” in FIG. 12). It then checks whether the number of words that are subordinate concepts of “season”, excluding “winter”, is equal to or greater than the initial value (8). If that number is, for example, 6, the associative word determination unit 102 identifies the superordinate concept of “season” (“xxx” in FIG. 12) and checks whether the number of words that are subordinate concepts of “xxx” is equal to or greater than the initial value. In this way, the associative word determination unit 102 goes up the hierarchy, searching superordinate concepts, until it finds one whose subordinate concepts include at least as many words as the initial value. When such a superordinate concept is found, the associative word determination unit 102 determines the words that are its subordinate concepts as associative words. The associative word determination unit 102 may add the associative words determined in this modification to those determined in the third embodiment, Modification 1, or Modification 2.
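The climb-until-enough search can be sketched as a loop over ancestors. The tree is hypothetical (“summer”, “month”, “january” and “july” are invented), the words of a concept are assumed to be its leaf descendants, and returning whatever is available at the root is an assumed fallback that the text does not describe.

```python
# Hypothetical concept tree stored as child -> parent.
PARENT = {
    "spring": "season", "summer": "season", "fall": "season", "winter": "season",
    "january": "month", "july": "month",
    "season": "xxx", "month": "xxx",
}
CHILDREN = {}
for child, parent in PARENT.items():
    CHILDREN.setdefault(parent, []).append(child)

def leaf_words(concept):
    """All leaf descendants of a concept (assumption: leaves are the words)."""
    kids = CHILDREN.get(concept, [])
    if not kids:
        return [concept]
    return [word for kid in kids for word in leaf_words(kid)]

def associative_words(word, required):
    """Climb the hierarchy until a concept offers at least `required` other words."""
    concept = PARENT[word]
    while True:
        words = [w for w in leaf_words(concept) if w != word]
        # Stop when enough words are found, or at the root (assumed fallback).
        if len(words) >= required or concept not in PARENT:
            return words
        concept = PARENT[concept]

print(associative_words("winter", 4))
```

With a required count of 4, “season” offers only three other words, so the search moves up to “xxx”, just as the example in the text moves from “season” to “xxx” when 6 is less than 8.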
Even when the associative word determination unit 102 determines associative words in this way, the information generation apparatus 100 according to this modification can generate dummy information that goes beyond the range expected from the legitimate information and that an attacker is even less likely to recognize as dummy information.
<Fourth Embodiment>
Next, a fourth embodiment of the present invention will be described. For convenience of explanation, members having the same functions as those included in the drawings described in the first to third embodiments are given the same reference numerals, and their description is omitted. The present embodiment describes another method by which the synthesis unit 103 generates dummy information. The information generation apparatus 100 according to the present embodiment has the same configuration as that shown in FIG. 3, so its description is omitted.
The embodiments described above treat all the character strings combined by the synthesis unit 103 as dummy information, but the configuration of the synthesis unit 103 is not limited to this.
The synthesis unit 103 of the information generation apparatus 100 according to the present embodiment may assign a priority to each combined (synthesized) character string. For example, when a synthesized character string contains an associative word and/or a sequential-number phrase, the synthesis unit 103 may set a higher priority for that character string. When the attributes of the words in a synthesized character string include the number of occurrences, the synthesis unit 103 may set a higher priority for a character string containing a word with a larger number of occurrences. When an associative word in a synthesized character string was determined from at least a predetermined number of words, the synthesis unit 103 may set a higher priority for that character string. The method of setting the priority is thus not particularly limited. The priority may be expressed as a level or as a rank.
When the priority assigned to a character string is a level, and a larger value indicates a higher level, the synthesis unit 103 generates, as dummy information, the character strings whose priority is greater than a predetermined value.
When the priority assigned to a character string is a rank (priority order), the synthesis unit 103 generates, as dummy information, the character strings whose rank is higher than a predetermined value. For example, when the predetermined value is N (N being a natural number), the synthesis unit 103 generates the top N character strings as dummy information.
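Both selection modes can be sketched as follows. The scoring rule is purely hypothetical, since the text leaves the priority-setting method open: here a string earns one point if it contains an associative word and one point if it contains a sequential-number phrase. The word sets reuse the email example.

```python
ASSOCIATIVE = {"vvv", "nnn"}   # associative words from the email example
SEQUENTIAL = {"c-", "d-"}      # sequential-number phrases from the email example

def priority(parts):
    """Hypothetical level: one point per kind of generated word the string contains."""
    score = 0
    if any(part in ASSOCIATIVE for part in parts):
        score += 1
    if any(part in SEQUENTIAL for part in parts):
        score += 1
    return score

def select_by_level(candidates, threshold):
    """Keep the strings whose level is greater than the predetermined value."""
    return ["".join(c) for c in candidates if priority(c) > threshold]

def select_top_n(candidates, n):
    """Rank by priority (stable for ties) and keep the top N strings."""
    ranked = sorted(candidates, key=priority, reverse=True)
    return ["".join(c) for c in ranked[:n]]

candidates = [("a-", "xxx"), ("a-", "vvv"), ("c-", "vvv"), ("c-", "xxx")]
print(select_by_level(candidates, 0), select_top_n(candidates, 2))
```

Occurrence counts or the number of source words behind an associative word could be added to the score in the same way, if those attributes are available.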
For example, character strings that occur frequently, and sequential numbers, can be said to be character strings that an attacker is unlikely to recognize as dummy information. The character strings generated by the synthesis unit 103 are therefore even less likely to be recognized by an attacker as dummy information than those generated in the first to third embodiments. Accordingly, the information generation apparatus 100 according to the present embodiment can generate, as dummy information, only those character strings that an attacker is less likely to recognize as dummy information.
<Fifth Embodiment>
Next, a fifth embodiment of the present invention will be described in detail with reference to the drawings.
For convenience of explanation, members having the same functions as those described in the first to fourth embodiments are given the same reference numerals, and their description is omitted.
FIG. 13 is a functional block diagram showing an example of the functional configuration of the information generation apparatus according to the present embodiment. The information generation apparatus 400 according to the present embodiment is the information generation apparatus 100 according to the second to fourth embodiments with a storage unit 420 added. Specifically, as shown in FIG. 13, the information generation apparatus 400 includes the analysis unit 101, the associative word determination unit 102, the synthesis unit 103, the sequence number generation unit 104, the collection unit 110, a storage unit (second storage unit) 120, and a storage unit (first storage unit) 420. The configuration of the information generation system that includes the information generation apparatus 400 according to the present embodiment is the same as that described with reference to FIG. 2, so its description is omitted.
The present embodiment is described using an example in which the storage unit 420 is built into the information generation apparatus 400, but the configuration of the storage unit 420 is not limited to this. The storage unit 420 may be realized by a storage device separate from the information generation apparatus 400. Likewise, although the present embodiment is described using an example in which the storage unit 120 and the storage unit 420 are separate, they may be realized by a single storage unit.
 記憶部420には、素材情報が格納されている。素材情報とは、ダミー情報として利用可能な素材を示す情報である。具体的には、素材情報は、コンピュータが自動で生成することが困難な単語として、利用者が予めリストアップし、記憶部420に登録した単語からなる情報である。コンピュータが自動で生成することが困難な単語とは、ダミー情報として利用できそうなダミーっぽい名前(例えば、proxygate2、ip8800、dhcp01等)や、ダミー情報を適用する対象の企業内システム300における独自の命名規則に則った名前等、によって構成される単語である。独自の命名規則とは、例えば、東京に設置されたサーバの名前を「tk-svr」にする、という規則である。つまり、素材情報に含まれる単語は、システムで使用される可能性が高い文字列であるが、概念化されない文字列である。 The storage unit 420 stores material information. The material information is information indicating a material that can be used as dummy information. Specifically, the material information is information including words that are listed in advance by the user and registered in the storage unit 420 as words that are difficult for a computer to automatically generate. Words that are difficult for a computer to generate automatically are dummy-like names that can be used as dummy information (for example, proxygate2, ip8800, dhcp01, etc.) and unique in the corporate system 300 to which the dummy information is applied. This is a word composed of a name that conforms to the naming convention. The unique naming rule is, for example, a rule that the name of a server installed in Tokyo is “tk-svr”. That is, the words included in the material information are character strings that are highly likely to be used in the system, but are not conceptualized.
The associative word determination unit 102 checks whether a word received from the analysis unit 101 is included in the concept information. If the word is not included in the concept information, the unit checks whether it is a word for which a serial number phrase can be generated. If the word is not included in the concept information and no serial number phrase can be generated for it, the associative word determination unit 102 may register the word in the storage unit 420 as material information.
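The registration decision described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the use of plain Python sets for the concept information and the material information, and the rule that a word supports serial number phrases when it is a run of digits or a single letter are all assumptions made for the example.

```python
import re


def can_generate_serial(word: str) -> bool:
    """Hypothetical test for 'a serial number phrase can be generated'.

    Assumption: a word belongs to a predetermined sequence if it is a run
    of digits (e.g. "01") or a single ASCII letter (e.g. "b").
    """
    return bool(re.fullmatch(r"\d+|[A-Za-z]", word))


def register_material_candidates(words, concept_words, material):
    """Add to `material` every word that is neither in the concept
    information nor usable for serial number generation."""
    for word in words:
        if word in concept_words:
            continue  # handled through the concept information instead
        if can_generate_serial(word):
            continue  # handled by serial number generation instead
        material.add(word)
    return material


# Usage with made-up data: only "tk-svr" falls through both checks.
material = register_material_candidates(
    ["mail", "01", "tk-svr", "b"], concept_words={"mail"}, material=set()
)
print(material)  # {'tk-svr'}
```

In practice the user could still review the resulting set before it is stored, as the next paragraph allows.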
Thus, the material information includes not only words registered in advance by the user but also words registered by the associative word determination unit 102. The user may also select, from among the words that the associative word determination unit 102 has determined not to be included in the concept information, which words to register as material information.
For a word that the user has listed in advance for registration in the storage unit 420 as material information, the associative word determination unit 102 may check whether the word itself is actually in use by querying, via the analysis unit 101 and the network 200, a DNS (Domain Name System) server, a DHCP (Dynamic Host Configuration Protocol) server, or the like in the in-company system 300. If the query shows that the word, which is not included in the concept information, is not actually in use, the associative word determination unit 102 may register the word as material information.
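As one possible illustration of the in-use check, the sketch below keeps only candidate names that do not resolve through DNS. It covers DNS only (no DHCP query), and the function names, the domain `corp.example`, and the injectable `resolve` parameter are assumptions introduced for the example; `socket.gethostbyname` is the standard-library resolver used by default.

```python
import socket
from typing import Callable, Iterable, List


def is_name_in_use(
    name: str,
    domain: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if "<name>.<domain>" resolves to an address."""
    try:
        resolve(f"{name}.{domain}")
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


def filter_unused(
    candidates: Iterable[str],
    domain: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> List[str]:
    """Keep only the candidates that are not currently in use."""
    return [c for c in candidates if not is_name_in_use(c, domain, resolve)]


# Usage with a stub resolver, so the example runs without network access.
def fake_resolve(host: str) -> str:
    if host == "dhcp01.corp.example":
        return "192.0.2.10"
    raise socket.gaierror("name not found")


print(filter_unused(["dhcp01", "proxygate2"], "corp.example", fake_resolve))
# ['proxygate2']
```

Passing the resolver as a parameter is a design choice for the sketch: it lets the same check be pointed at whatever name service the target system actually uses.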
The associative word determination unit 102 may attach an attribute to each word registered as material information. The information contained in this attribute may be information registered arbitrarily by the user. When the word registered as material information was supplied by the analysis unit 101, its attribute may be the attribute that the analysis unit 101 assigned to the word.
The synthesis unit 103 receives the decomposed words from the analysis unit 101, the associative words from the associative word determination unit 102, and the serial number phrases from the serial number generation unit 104. It also acquires the material information from the storage unit 420.
The synthesis unit 103 then performs synthesis, treating as associative words not only the associative words determined by the associative word determination unit 102 but also the words included in the material information acquired from the storage unit 420. As in the second and third embodiments described above, the synthesis unit 103 may also perform synthesis using the words decomposed by the analysis unit 101 as they are. As in the fourth embodiment described above, the synthesis unit 103 may assign priorities to the synthesized character strings and use those with higher priority as dummy information. Because the synthesis unit 103 of this embodiment uses the same synthesis methods as the embodiments described above, a detailed description is omitted here.
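The following sketch shows one way material words could be pooled with associative words during synthesis. It is illustrative only: the function name, the token list representation (with separators kept as their own tokens), and the sample words are assumptions, and the priority handling of the fourth embodiment is left out.

```python
from typing import List, Sequence


def synthesize(
    tokens: Sequence[str],
    index: int,
    associative_words: Sequence[str],
    material_words: Sequence[str],
) -> List[str]:
    """Replace tokens[index] with each candidate word while keeping the
    neighbouring tokens, and return strings that differ from the original."""
    original = "".join(tokens)
    # Material words are treated exactly like associative words;
    # dict.fromkeys removes duplicates while preserving order.
    candidates = list(dict.fromkeys([*associative_words, *material_words]))
    results: List[str] = []
    for candidate in candidates:
        combined = "".join([*tokens[:index], candidate, *tokens[index + 1:]])
        if combined != original and combined not in results:
            results.append(combined)
    return results


# Usage with made-up data: "mail-01" was decomposed into three tokens.
print(synthesize(["mail", "-", "01"], 0, ["web", "ftp"], ["proxygate"]))
# ['web-01', 'ftp-01', 'proxygate-01']
```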
As described above, the information generation device 400 according to this embodiment generates dummy information using the material information. The information generation device 400 can therefore obtain the same effects as the information generation devices according to the first to fourth embodiments described above. In addition, it can generate dummy information that more closely resembles the genuine information.
<Example of hardware configuration>
An example of a hardware configuration capable of implementing the information generation devices (10, 100, 400) according to the embodiments described above is described here. These information generation devices may be implemented as dedicated devices, or they may be implemented using a computer (information processing device).
FIG. 14 is a diagram illustrating the hardware configuration of a computer (information processing device) capable of implementing each embodiment of the present invention.
The hardware of the information processing device (computer) 90 shown in FIG. 14 comprises a CPU (Central Processing Unit) 11, a communication interface (I/F) 12, an input/output user interface 13, a ROM (Read Only Memory) 14, a RAM (Random Access Memory) 15, a storage device 17, and a drive device 18 for a computer-readable storage medium 19, all connected via a bus 16. The input/output user interface 13 is a man-machine interface such as a keyboard, which is an example of an input device, or a display, which serves as an output device. The communication interface 12 is a general-purpose communication means by which the devices according to the embodiments described above (FIGS. 1, 3, and 13) communicate with external devices via the communication network 80. In this hardware configuration, the CPU 11 controls the overall operation of the information processing device 90 that implements the information generation devices (10, 100, 400) according to the embodiments.
Each of the embodiments described above is realized, for example, by supplying a program (computer program) capable of implementing the processing described in those embodiments to the information processing device 90 shown in FIG. 14, and then having the CPU 11 read and execute that program. The program may be, for example, one capable of implementing the various processes described in the flowchart (FIG. 4) referred to in the description of the embodiments, or the units (blocks) shown inside the devices in the block diagrams of FIGS. 1, 3, and 13.
The program supplied to the information processing device 90 may be stored in a readable and writable temporary memory (15) or in a non-volatile storage device (17) such as a hard disk drive. That is, in the storage device 17, the program group 17A is, for example, a set of programs capable of implementing the functions of the units shown inside the information generation devices (10, 100, 400) of the embodiments described above. The various stored information 17B is, for example, the collected data, decomposed data, associative words, serial number phrases, dummy information, concept information, and material information of the embodiments described above. When the programs are implemented on the information processing device 90, however, the units into which individual program modules are divided are not limited to the division into blocks shown in the block diagrams (FIGS. 1, 3, and 13), and may be chosen as appropriate by a person skilled in the art at implementation time.
In the above case, the program can be supplied to the device by procedures that are now common, such as installing it via any of various computer-readable recording media (19), for example a CD (Compact Disk)-ROM or flash memory, or downloading it from outside via a communication line (80) such as the Internet. In such a case, each embodiment can be regarded as being constituted by the code making up the computer program (program group 17A) or by the storage medium (19) on which that code is stored.
The present invention has been described above using examples in which it is applied to the exemplary embodiments. However, the technical scope of the present invention is not limited to the scope described in those embodiments. It is apparent to those skilled in the art that various changes and improvements can be made to the embodiments. New embodiments incorporating such changes or improvements can also fall within the technical scope of the present invention, as is clear from the matters recited in the claims.
This application claims priority based on Japanese Patent Application No. 2014-190805, filed on September 19, 2014, the entire disclosure of which is incorporated herein.
DESCRIPTION OF SYMBOLS
 1  Information generation system
 10  Information generation device
 100  Information generation device
 101  Analysis unit
 102  Associative word determination unit
 103  Synthesis unit
 104  Serial number generation unit
 110  Collection unit
 120  Storage unit
 200  Network
 300  In-company system
 400  Information generation device
 420  Storage unit
 80  Communication network
 90  Information processing device
 11  CPU
 12  Communication interface
 13  Input/output user interface
 14  ROM
 15  RAM
 16  Bus
 17  Storage device
 18  Drive device
 19  Storage medium

Claims (15)

  1.  An information generation device comprising:
     analysis means for decomposing a character string included in component information relating to a component of a system into words;
     associative word determination means for determining, for a word that is among the decomposed words and is included in concept information, an associative word of the word based on the concept information; and
     synthesis means for generating dummy information consisting of a character string different from the character string included in the component information, by combining the associative word with a word that, in the character string from which the word used to determine the associative word was decomposed, is located on at least one of the preceding and following sides of said word, or with an associative word of said word.
  2.  The information generation device according to claim 1, wherein the associative word determination means searches, based on the concept information, for a superordinate concept common to a plurality of words that are among the decomposed words and are included in the concept information, and, when a common superordinate concept is found, determines, as associative words for the plurality of words included in the concept information, words that are subordinate concepts of the superordinate concept and that are other than the plurality of words.
  3.  The information generation device according to claim 2, wherein the associative word determination means determines, as associative words, a number of the subordinate-concept words equal to or greater than the number of the plurality of words included in the concept information.
  4.  The information generation device according to claim 2 or 3, wherein, when a plurality of superordinate concepts common to the plurality of words included in the concept information are found, the associative word determination means calculates a distance in the data structure between the superordinate concepts, and determines, as associative words, words that are subordinate concepts of another superordinate concept located within approximately half of that distance from at least one of the superordinate concepts.
  5.  The information generation device according to claim 2 or 3, wherein, when three or more superordinate concepts common to the plurality of words included in the concept information are found, the associative word determination means calculates each of the distances in the data structure between the superordinate concepts, and determines, as associative words, words that are subordinate concepts of another superordinate concept located at the average of those distances from at least one of the superordinate concepts.
  6.  The information generation device according to any one of claims 1 to 5, further comprising serial number generation means for generating, for a word that is among the decomposed words and is included in a predetermined sequence, another word in the predetermined sequence as a serial number phrase for that word,
     wherein the synthesis means generates the dummy information including a character string in which the associative word is combined with the serial number phrase for a word that, in the character string from which the word used to determine the associative word was decomposed, is located on at least one of the preceding and following sides of said word and is included in the predetermined sequence.
  7.  The information generation device according to any one of claims 1 to 6, wherein the associative word determination means identifies a superordinate concept whose distance in the data structure from a word included in the concept information is a predetermined distance, and determines words that are subordinate concepts of that superordinate concept as associative words.
  8.  The information generation device according to any one of claims 1 to 7, wherein, when searching, based on the concept information, for a superordinate concept of a plurality of words that are among the decomposed words and are included in the concept information and determining words that are subordinate concepts of that superordinate concept as associative words, the associative word determination means searches for a superordinate concept for which the number of associative words is equal to or greater than a predetermined value, and determines words that are subordinate concepts of that superordinate concept as associative words.
  9.  The information generation device according to any one of claims 1 to 8, wherein the synthesis means assigns a priority to each combined character string and generates the dummy information based on whether the priority is equal to or greater than a predetermined value.
  10.  The information generation device according to any one of claims 1 to 9, comprising first storage means for storing material information indicating word material that is likely to be used in the system,
     wherein the synthesis means generates the dummy information using a word included in the material information as the associative word.
  11.  The information generation device according to any one of claims 1 to 10, wherein the component information includes an e-mail address,
     the analysis means decomposes the e-mail address into a local part and a domain, and decomposes the character string included in the local part into words, and
     the synthesis means, for the local part, combines the associative word of a word of the local part with a word that, in the character string from which the word used to determine the associative word was decomposed, is located on at least one of the preceding and following sides of said word, or with an associative word of said word, and
     generates dummy information consisting of a character string different from the character string given as the e-mail address by further combining the combined character string with the domain.
  12.  The information generation device according to any one of claims 1 to 11, wherein the component information includes a URI (Uniform Resource Identifier),
     the analysis means decomposes the character string given as the URI into hierarchy levels, and decomposes the character string of each level into words, and
     the synthesis means, in at least one of the levels, combines the associative word with a word that, in the character string from which the word used to determine the associative word was decomposed, is located on at least one of the preceding and following sides of said word, or with an associative word of said word, and
     combines the character strings of the levels so that the combined character string is included in at least one level, thereby generating dummy information consisting of a character string different from the character string given as the URI.
  13.  The information generation device according to any one of claims 1 to 12, further comprising second storage means for storing the concept information.
  14.  An information generation method comprising:
     decomposing a character string included in component information relating to a component of a system into words;
     determining, for a word that is among the decomposed words and is included in concept information, an associative word of the word based on the concept information; and
     generating dummy information consisting of a character string different from the character string included in the component information, by combining the associative word with a word that, in the character string from which the word used to determine the associative word was decomposed, is located on at least one of the preceding and following sides of said word, or with an associative word of said word.
  15.  A computer-readable recording medium storing a program that causes a computer to execute:
     a process of decomposing a character string included in component information relating to a component of a system into words;
     a process of determining, for a word that is among the decomposed words and is included in concept information, an associative word of the word based on the concept information; and
     a process of generating dummy information consisting of a character string different from the character string included in the component information, by combining the associative word with a word that, in the character string from which the word used to determine the associative word was decomposed, is located on at least one of the preceding and following sides of said word, or with an associative word of said word.
PCT/JP2015/004707 2014-09-19 2015-09-16 Information-generating device, information-generating method, and recording medium WO2016042762A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2016548560A JP6436171B2 (en) 2014-09-19 2015-09-16 Information generating apparatus, information generating method and program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014190805 2014-09-19
JP2014-190805 2014-09-19

Publications (1)

Publication Number Publication Date
WO2016042762A1 true WO2016042762A1 (en) 2016-03-24

Family

ID=55532817

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/004707 WO2016042762A1 (en) 2014-09-19 2015-09-16 Information-generating device, information-generating method, and recording medium

Country Status (2)

Country Link
JP (1) JP6436171B2 (en)
WO (1) WO2016042762A1 (en)

Also Published As

Publication number Publication date
JPWO2016042762A1 (en) 2017-07-20
JP6436171B2 (en) 2018-12-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15842886

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2016548560

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15842886

Country of ref document: EP

Kind code of ref document: A1