WO2016139918A1 - Log data processing method, log data processing program, and log data processing apparatus - Google Patents

Log data processing method, log data processing program, and log data processing apparatus

Info

Publication number
WO2016139918A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
log data
pieces
data
log
Prior art date
Application number
PCT/JP2016/000978
Other languages
French (fr)
Inventor
Hiroaki Fujiwara
Original Assignee
Canon Kabushiki Kaisha
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Kabushiki Kaisha
Publication of WO2016139918A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/40 Data acquisition and logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Definitions

  • the present invention relates to a log data processing method, a log data processing program, and a log data processing apparatus.
  • log information: information in which histories or the like related to operations and control of various devices configuring the production apparatuses are recorded
  • log data: data in which the multiple pieces of log information are integrated
  • the log information (message) read out from the log data is clustered into pieces of information to identify variable portions in the log information and confidential information (confidential attribute) is identified using the predetermined replacement condition (rule). If the confidential information is not identified according to the condition (rule), the confidential information is capable of being identified from the positional relationship of the confidential information in the clustered log information.
  • the log format or the information to be replaced may be added or modified with update of the control programs for such a production apparatus.
  • the log information for which the confidential information is not capable of being identified according to the rule may be added and may appear only once.
  • If the log information for which the confidential information is not capable of being identified according to the rule appears only once and no similar information exists, the log information may not be clustered into pieces of information and the confidential information may not be identified. Accordingly, it is necessary for a user to add a rule to identify the confidential information, thereby imposing the burden of the replacement on the user.
  • the present invention provides a log data processing method, a log data processing program, and a log data processing apparatus capable of eliminating or reducing the burden of replacement.
  • a log data processing method of replacing information included in log data is performed in an information processing apparatus.
  • the log data processing method includes acquiring first information identified from first definition data that defines a format of the log data; editing second definition data used to identify the first information included in the log data by adding the acquired first information to the second definition data; and replacing the first information that coincides with the first information in the second definition data and that is included in the log data with other information different from the first information.
  • Fig. 1 is a block diagram illustrating an exemplary hardware configuration of an information processing apparatus.
  • Fig. 2 is a block diagram illustrating an exemplary configuration of a CPU.
  • Fig. 3 is a flowchart illustrating an exemplary process performed by the CPU according to a first embodiment.
  • Fig. 4 illustrates exemplary first definition data according to the first embodiment.
  • Fig. 5 illustrates exemplary conversion target log data according to the first embodiment.
  • Fig. 6 illustrates exemplary second definition data according to the first embodiment.
  • Fig. 7 illustrates exemplary conversion result log data according to the first embodiment.
  • Fig. 8 illustrates exemplary definition data when the method of the first embodiment is not applied.
  • Fig. 9 illustrates exemplary conversion result log data when the method of the first embodiment is not applied.
  • Fig. 10 is a flowchart illustrating an exemplary process performed by the CPU according to a second embodiment.
  • Fig. 11 illustrates exemplary second definition data according to the second embodiment.
  • Fig. 12 illustrates exemplary conversion result log data according to the second embodiment.
  • Fig. 13 illustrates exemplary conversion target log data according to a third embodiment.
  • Fig. 14 is a flowchart illustrating an exemplary process performed by the CPU according to the third embodiment.
  • Fig. 15 is a diagram for describing an exemplary rule for generating third information according to the third embodiment.
  • Fig. 16 illustrates exemplary second definition data according to the third embodiment.
  • Fig. 17 illustrates exemplary conversion result log data according to the third embodiment.
  • the target log data may be data generated in a production apparatus (manufacturing apparatus), such as a semiconductor manufacturing apparatus, or data resulting from processing of the above data.
  • the target log data may be data including log information related to the production apparatus.
  • the log information includes a history related to operations and control of machines, electronic devices, and electrical devices composing the production apparatus, a history of communication between the production apparatus and an external system, a history of operations by an operator, and a history of authentication by a user.
  • information used to distinguish products (goods), information used to distinguish jigs or members used in the production, information concerning the content of processing of the products, and so on are prepared in advance and these pieces of information are input before a production process is started.
  • a history of input operations is recorded in the log data.
  • the pieces of information that are input may also be recorded.
  • the pieces of information that are input may be confidential information that should be kept secret from a third party for the user of the production apparatus. As described above, the confidential information may be included in the log data. When the log data is taken out of the production site, it is necessary to replace the portion where the confidential information is output with information that is not the confidential information.
  • Fig. 1 is a block diagram illustrating an exemplary hardware configuration of an information processing apparatus (information processing unit) capable of performing a log data processing method.
  • the log data processing method is performed by a processing unit (a central processing unit (CPU) or a micro-processing unit (MPU)) in a computer, which reads out programs to execute the programs.
  • the software and the programs realizing the functions of the information processing apparatus are supplied to an information processing apparatus composed of one or more computers via a network or various recording media.
  • the processing unit in the information processing apparatus reads out the programs recorded in a recording medium or stored in a storage medium to execute the programs. Multiple computers that are apart from each other may transmit and receive data through wired communication or wireless communication to perform various processes in the programs.
  • the information processing apparatus may be provided in a server connected to the production apparatus or in the production apparatus.
  • a CPU 101 is a processing unit that performs arithmetic operations for a variety of data processing related to the log data processing and controls each component connected to a bus 108.
  • a read only memory (ROM) 102 is a memory exclusively used for reading of data and stores a basic control program.
  • a random access memory (RAM) 103 is a memory used for reading and writing of data and is used to temporarily store data and the results of a variety of arithmetic processing by the CPU 101.
  • An external storage unit 104 is used to store conversion target log data 204, first definition data 205, second definition data 206, and conversion result log data 207, which are described below.
  • the external storage unit 104 is also used for a temporary storage area of a system program in an operating system (OS) in the information processing apparatus and a temporary storage area during processing of programs and data. Although the input and output speed of data of the external storage unit 104 is slower than that of the RAM 103, the external storage unit 104 is capable of storing a large amount of data.
  • the external storage unit 104 is desirably a non-volatile storage unit capable of permanently storing data so that the stored data can be referred to over a long time.
  • Although the external storage unit 104 is mainly composed of a magnetic storage unit (hard disk drive (HDD)), the external storage unit 104 may be a unit that reads and writes data using an external medium that is loaded, such as a compact disc (CD), a digital versatile disk (DVD), or a memory card.
  • An input unit 105 is used to input characters and data into the information processing apparatus and corresponds to various keyboards and mice.
  • a display unit 106 is used to display the result of processing by the information processing apparatus and corresponds to a cathode ray tube (CRT) or a liquid crystal monitor.
  • a communication unit 107 is used to establish data communication with another information processing apparatus via a local area network (LAN) using a communication protocol, such as Transmission Control Protocol/Internet Protocol (TCP/IP).
  • the communication unit 107 may acquire the conversion target log data 204, the first definition data 205, the second definition data 206, and the conversion result log data 207 from another information processing apparatus through the data communication and may store the conversion target log data 204, the first definition data 205, the second definition data 206, and the conversion result log data 207 in the other information processing apparatus.
  • Fig. 2 is a block diagram illustrating an exemplary configuration of the CPU 101.
  • the CPU 101 includes an acquirer 201, an editor 202, and a replacer 203.
  • the acquirer 201 acquires first information, which is information to be replaced, using the conversion target log data 204 and the first definition data 205 acquired from the external storage unit 104.
  • the editor 202 acquires the first information from the acquirer 201, performs editing in which the acquired first information is added to the second definition data 206, and stores the second definition data 206 in the external storage unit 104.
  • the replacer 203 generates the conversion result log data 207 using the conversion target log data 204 and the second definition data 206 acquired from the external storage unit 104 and stores the generated conversion result log data 207 in the external storage unit 104.
  • Fig. 3 is a flowchart illustrating an exemplary process performed by the CPU 101 according to the first embodiment.
  • the CPU 101 starts the log data processing.
  • the acquirer 201 acquires the conversion target log data 204.
  • the acquirer 201 acquires the first information included in the log data on the basis of the first definition data 205.
  • the editor 202 performs the editing in which the acquired first information is added to the second definition data 206.
  • the replacer 203 acquires the conversion target log data 204 and the second definition data 206 and replaces the first information included in the conversion target log data 204 with second information, which is different from the first information, on the basis of the second definition data 206.
  • In Step 305, the replacer 203 outputs the conversion result log data 207 as the replaced data. Then, the CPU 101 terminates the log data processing.
  • the above process is the batch processing in which Steps 301 to 305 in the flowchart illustrated in Fig. 3 are performed for the entire conversion target log data 204. Alternatively, part (for example, a range corresponding to one line) of the conversion target log data 204 may be acquired and Steps 301 to 303, 301 to 304, or 301 to 305 may be repeatedly performed in the acquired range.
  • the acquirer 201 illustrated in Fig. 2 acquires the first information included in the log data on the basis of the first definition data 205, as described above.
  • a situation in which the first information is output in the log may be identified, as in "the case in which the information is input before the production process is started."
  • the log format in the output into the log data can be determined in advance and the position where the first information is acquired in the log data can be determined on the basis of the log format.
  • Fig. 4 illustrates an example of the first definition data 205 in which the log format is defined.
  • the log format is used as a replacement condition (rule) and is used to determine the position of the first information in the log data.
  • the first definition data 205 in Fig. 4 includes pattern information 401 (401-A to 401-C), position information 402 (402-A to 402-C), and range information 403 (403-A to 403-C) as the log formats.
  • the pattern information 401 is used to determine the position where the first information is output.
  • the position information 402 represents the relationship between the position of a character string coinciding with the pattern information 401 and the position of the first information.
  • the range information 403 defines a range acquired as the first information.
  • the pattern information 401, the position information 402, and the range information 403 are listed in this order in a comma-delimited manner.
  • One definition is registered in one line, that is, a total of three definitions are registered in the first definition data 205 in Fig. 4.
  • the pattern information 401 and the position information 402 are registered as fixed character strings and the range information 403 is registered as a regular expression or a combination of a fixed character string and a regular expression for the log information of a text format.
  • the first definition data 205 may be stored in a binary format or another format, instead of the text format.
  • the first definition data 205 may be stored in a file format on a file system or may be stored in a database.
  • the content of the first definition data 205 may be reversibly encrypted.
  • the pattern information 401 is defined on the basis of log information whose log format specifies that the first information is output together with a predetermined pattern, for example, a certain character string or data set.
  • a fixed character string, a regular expression, or a combination of a fixed character string and a regular expression may be used for the log information of the text format.
  • a series of data sets of a certain order may be used for the log information of the binary format or another format.
  • the position information 402 defines information based on a predetermined rule that represents the positional relationship between the pattern identified by the pattern information 401 and the position where the acquisition of the first information is started.
  • the position information 402 is registered for the log information of the text format.
  • the first information is acquired from a position shifted from a character next to the final character of the character string coinciding with the pattern information 401 to the end-of-line direction by the number of characters of the position information 402. "0" is registered for the position information 402 in the first and second definitions and "5" is registered for the position information 402 in the third definition.
  • the range information 403 defines information based on a predetermined method of defining the range acquired as the first information.
  • As the range, for example, the number of characters to be acquired or a regular expression coinciding with the first information as a character string pattern may be defined for the log information of the text format.
  • regular expressions with which the character strings of the pieces of first information coincide are defined and registered, as in the pieces of range information 403-A to 403-C.
  • the range information 403-A represents an arbitrary character string of one or more characters.
  • the range information 403-B represents a character string resulting from combination of an arbitrary character string of one or more characters and a fixed character string ".job."
  • the range information 403-C represents a character string resulting from combination of an arbitrary character string of one or more characters and a fixed character string ".msk."
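  • The comma-delimited layout of the first definition data described above can be sketched as follows. This is a minimal illustration, not the format of Fig. 4 itself: the sample definitions, the names `LogFormat` and `parse_first_definition`, and the reading of the fixed strings as the suffixes ".job" and ".msk" are all assumptions made for the example.

```python
import re
from typing import List, NamedTuple


class LogFormat(NamedTuple):
    pattern: str          # fixed character string (pattern information 401)
    offset: int           # characters to skip after the pattern (position information 402)
    value_re: re.Pattern  # regular expression for the acquired range (range information 403)


def parse_first_definition(text: str) -> List[LogFormat]:
    """Parse one definition per line: pattern,position,range."""
    formats = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumption: pattern and position contain no commas, so only the
        # first two commas are delimiters; the range expression may contain more.
        pattern, position, range_expr = line.split(",", 2)
        formats.append(LogFormat(pattern, int(position), re.compile(range_expr)))
    return formats


# Hypothetical definitions; positions 0, 0, and 5 follow the description above.
SAMPLE = r"""LotID=,0,.+
Job file: ,0,.+\.job
Mask ,5,.+\.msk
"""

formats = parse_first_definition(SAMPLE)
for f in formats:
    print(repr(f.pattern), f.offset, f.value_re.pattern)
```

  • Splitting on only the first two commas keeps a range expression such as `.{1,8}` intact, at the cost of forbidding commas in the pattern information.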
  • Fig. 5 illustrates an example of the conversion target log data 204.
  • the conversion target log data 204 may be stored in the binary format or another format, instead of the text format.
  • the conversion target log data 204 may be stored in a file format on a file system or may be stored in a database.
  • the content of the conversion target log data 204 may be reversibly encrypted.
  • the conversion target log data 204 is the log data of the text format in which a time stamp is added and a new line is started for each piece of log information.
  • the time stamp is simply represented by a character string "YYYY-MM-DD HH:MM:SS.SSS."
  • the lines in which the pieces of first information to be replaced are output are extracted in the example illustrated in Fig. 5.
  • the acquirer 201 acquires the conversion target log data 204 in Fig. 5 and determines whether a portion coinciding with any character string in the pattern information 401 defined in the first definition data 205 in Fig. 4 is included in the conversion target log data 204. In the determination, a method of comparing each character in the log information with each character in the pattern information 401 from the beginning of the log information may be used or a method using a Rabin-Karp string search algorithm, a Boyer-Moore string search algorithm, or another string search algorithm may be used.
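  • As an illustration of one of the string search algorithms named above, the following is a minimal sketch of a Rabin-Karp search using a rolling hash. The function name `rabin_karp_find` and the parameters `base` and `mod` are choices made for this example; the source does not specify an implementation.

```python
def rabin_karp_find(text: str, pattern: str,
                    base: int = 256, mod: int = 1_000_000_007) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the leading character in a window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        # Compare the characters only when the hashes agree, to rule out collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return -1


line = "YYYY-MM-DD HH:MM:SS.SSS LotID=LOT-0001"
print(rabin_karp_find(line, "LotID="))
```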
  • the acquirer 201 sets a position shifted from a character next to the final character of a character string 501 coinciding with the pattern information 401 to the end-of-line direction by the number of characters of the position information 402-A as the position where the first information is acquired.
  • the acquirer 201 acquires a character string 504 coinciding with the range information 403-A from the position as the first information.
  • a character string 505 and a character string 506 are acquired from the eighth line and the twenty-first line, respectively, in Fig. 5.
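  • The acquisition described above (locate the pattern, shift by the position information, then match the range information) can be sketched as follows. The function name `acquire_first_information` and the sample log lines are hypothetical; the actual content of Fig. 5 is not reproduced.

```python
import re
from typing import Optional


def acquire_first_information(line: str, pattern: str,
                              offset: int, range_expr: str) -> Optional[str]:
    """Return the first information found in line, or None if the format does not apply."""
    hit = line.find(pattern)
    if hit < 0:
        return None
    # Start at the character next to the end of the pattern, shifted toward
    # the end of the line by `offset` characters.
    start = hit + len(pattern) + offset
    match = re.compile(range_expr).match(line, start)
    return match.group(0) if match else None


# Hypothetical log lines in the "time stamp + message" layout.
print(acquire_first_information(
    "2016-03-01 10:15:30.123 LotID=LOT-0001", "LotID=", 0, r".+"))
print(acquire_first_information(
    "2016-03-01 10:15:31.456 Mask ID = resist_A.msk", "Mask ", 5, r".+\.msk"))
```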
  • Fig. 6 illustrates an example of the second definition data 206 in which the pieces of first information are stored in the text format.
  • each character string in the pattern information 401 is added in a bracket ([]) and the corresponding piece(s) of first information acquired from the conversion target log data 204 is (are) added to the subsequent line (lines).
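  • A minimal sketch of this text layout, with each pattern in brackets followed by its acquired values on the subsequent lines, is shown below. The function names and the sample entries are assumptions; the sketch also assumes that no acquired value itself begins with "[" and ends with "]".

```python
from typing import Dict, List


def dump_second_definition(entries: Dict[str, List[str]]) -> str:
    """Serialize {pattern: [first information, ...]} into the bracketed text layout."""
    lines = []
    for pattern, values in entries.items():
        lines.append(f"[{pattern}]")
        lines.extend(values)
    return "\n".join(lines) + "\n"


def load_second_definition(text: str) -> Dict[str, List[str]]:
    """Parse the bracketed text layout back into {pattern: [first information, ...]}."""
    entries: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            entries.setdefault(current, [])
        elif line and current is not None:
            entries[current].append(line)
    return entries


# Hypothetical content; Fig. 6 itself is not reproduced here.
entries = {"LotID=": ["LOT-0001", "LOT-0002"], "Job file: ": ["expose.job"]}
text = dump_second_definition(entries)
print(text)
```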
  • the editor 202 does not redundantly add the first information existing in the second definition data 206. Accordingly, the first information in the second definition data 206 may be checked when the first information is added, and only the pieces of first information which do not exist in the second definition data 206 may be added. Alternatively, after all the pieces of first information included in the conversion target log data 204 are added to the second definition data 206, the redundant pieces of first information may be deleted. When the first information including the piece of first information to be added exists, the existing first information may be divided. When the first information including the existing piece of first information is to be added, the first information to be added may be divided for addition. When the size of the second definition data 206 is increased, an unnecessary piece of first information may be determined and the determined unnecessary piece of first information may be deleted.
  • a method of determining the piece of first information that does not appear in the conversion target log data 204 for a certain period to be the unnecessary piece of first information may be used or a method of determining the piece of first information that has been added to the second definition data 206 before the certain period to be the unnecessary piece of first information may be used.
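  • The non-redundant addition and the age-based deletion described above can be sketched as follows. The class `SecondDefinition`, its methods, and the use of a last-seen time stamp per value are assumptions for illustration; the sketch implements only the variant that deletes values which have not appeared for a certain period.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable


class SecondDefinition:
    """In-memory stand-in for the second definition data (hypothetical)."""

    def __init__(self) -> None:
        # Maps each piece of first information to the last time it was seen.
        self.last_seen: Dict[str, datetime] = {}

    def add(self, values: Iterable[str], now: datetime) -> None:
        for value in values:
            # A dict key cannot repeat, so an existing value is never stored
            # twice; only its last-seen time is refreshed.
            self.last_seen[value] = now

    def prune(self, now: datetime, max_age: timedelta) -> None:
        """Delete values that have not appeared within max_age."""
        self.last_seen = {v: t for v, t in self.last_seen.items()
                          if now - t <= max_age}


t0 = datetime(2016, 3, 1)
store = SecondDefinition()
store.add(["LOT-0001"], t0)
store.add(["LOT-0001", "LOT-0002"], t0 + timedelta(days=1))
count_after_add = len(store.last_seen)

store.add(["LOT-0003"], t0 + timedelta(days=8))
store.prune(now=t0 + timedelta(days=10), max_age=timedelta(days=5))
print(sorted(store.last_seen))
```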
  • the second definition data 206 may be stored in the binary format or another format, instead of the text format.
  • the second definition data 206 may be stored in a file format on a file system or may be stored in a database.
  • the content of second definition data 206 may be reversibly encrypted.
  • the first definition data 205 and the second definition data 206 may be in the same file when they are stored in the file format and may be in the same table when they are stored in the database.
  • the replacer 203 acquires the second definition data 206 and the conversion target log data 204.
  • the replacer 203 searches the conversion target log data 204 for the piece of first information existing in the second definition data 206 and replaces the character string coinciding with the piece of first information existing in the second definition data 206 with the second information.
  • a method of comparing each character in the character string of the log information with each character in the character string of the first information from the beginning of the log information may be used or a method using the Rabin-Karp string search algorithm, the Boyer-Moore string search algorithm, or another string search algorithm may be used.
  • the second information may be a certain fixed character string defined by the replacer 203 in advance.
  • the second information may be defined as an arbitrary fixed character string in the first definition data 205, the second definition data 206, or dedicated definition data (not illustrated) and the replacer 203 may acquire the defined second information.
  • the second information is not necessarily included in the second definition data 206, as described above.
  • the pieces of pattern information in brackets
  • the replacer 203 acquires the pieces of second information 601 to 603 simultaneously with the acquisition of the second definition data 206 to determine the second information for the first information.
  • the replacer 203 outputs the conversion result log data 207 in which the first information included in the conversion target log data 204 is replaced with the second information.
  • Fig. 7 illustrates an example in which the conversion result log data 207 is output in the same format as that of the conversion target log data 204.
  • the conversion result log data 207 may be stored in the binary format or another format, instead of the text format.
  • the conversion result log data 207 may have the same format as that of the conversion target log data 204 or may have a format different from that of the conversion target log data 204.
  • the conversion result log data 207 may be stored in a file format on a file system or may be stored in a database.
  • the content of the conversion result log data 207 may be reversibly encrypted.
  • a character string 701 is replaced with the second information 601 in Fig. 6 and character strings 702 to 706 are replaced with the second information 602 in Fig. 6.
  • the pieces of first information in log data for which no log format is defined in the first definition data 205, such as those in the twelfth line, the thirteenth line, and the seventeenth line in Fig. 5, can also be replaced using the pieces of first information added to the second definition data 206, without defining the log format in the first definition data 205 in advance.
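  • The effect described above can be sketched as follows: a value learned from one line with a known log format is replaced wherever it appears, including in a line whose format was never defined. The function name, the fixed second information "*****", and the sample lines are hypothetical. Replacing longer values first is a design choice of this sketch, so that a value containing another value is replaced as a whole.

```python
from typing import Iterable, List


def replace_first_information(lines: Iterable[str], known_values: Iterable[str],
                              second_information: str = "*****") -> List[str]:
    """Replace every occurrence of each known value with the second information."""
    ordered = sorted(known_values, key=len, reverse=True)  # longest first
    converted = []
    for line in lines:
        for value in ordered:
            line = line.replace(value, second_information)
        converted.append(line)
    return converted


log = [
    "YYYY-MM-DD HH:MM:SS.SSS LotID=LOT-0001",                  # format is defined
    "YYYY-MM-DD HH:MM:SS.SSS Exposure finished for LOT-0001",  # no format defined
    "YYYY-MM-DD HH:MM:SS.SSS Stage initialized",               # nothing to replace
]
known = {"LOT-0001"}  # acquired from the first line via its log format
converted = replace_first_information(log, known)
for line in converted:
    print(line)
```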
  • the replacer 203 may add the log format to the first definition data 205.
  • the replacement is available without adding the log format to the first definition data 205 if the first information that is capable of being identified from the first information acquired from other log information is output when the control program of the production apparatus is next updated.
  • Among the lines included in the conversion target log data 204, the replacer 203 may replace part or all of each line that is not subjected to the replacement of the first information with the second information with a certain fixed character string.
  • Part of the line corresponds to, for example, the character string from after the time stamp to the end of the line. This ensures that no character string to be replaced is omitted.
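  • A minimal sketch of this masking is shown below, assuming time stamps of the form "YYYY-MM-DD HH:MM:SS.SSS" with digits in place of the letters. The function name `mask_unreplaced_line` and the filler string "<masked>" are hypothetical.

```python
import re

# Matches a leading time stamp such as 2016-03-01 10:15:30.123.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")


def mask_unreplaced_line(original: str, converted: str,
                         filler: str = "<masked>") -> str:
    """Keep lines in which first information was replaced; mask all others."""
    if converted != original:
        return converted  # a replacement happened, so the line is kept
    match = TIMESTAMP.match(original)
    if match is None:
        return filler  # no time stamp: replace the whole line
    return f"{match.group(0)} {filler}"  # keep only the time stamp


untouched = "2016-03-01 10:15:30.123 Stage moved to position 12"
print(mask_unreplaced_line(untouched, untouched))
```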
  • Fig. 8 illustrates exemplary definition data when the method of the first embodiment is not applied.
  • the definition data includes pattern information 801 (801-A to 801-C), position information 802 (802-A to 802-C), and range information 803 (803-A to 803-C).
  • the definition data also includes second information 804 (804-A to 804-C).
  • the pattern information 801, the position information 802, the range information 803, and the second information 804 are listed in this order in a comma-delimited manner.
  • One definition is registered in one line, that is, a total of three definitions are registered in the definition data in Fig. 8.
  • the character string coinciding with the range information 803 is replaced with the second information 804 registered in the definition data from a position shifted from a character next to the final character of the character string coinciding with the pattern information 801 registered in the definition data in Fig. 8 to the end-of-line direction by the number of characters of the position information 802.
  • Fig. 9 illustrates an example of the conversion result log data 207 resulting from the conversion of the conversion target log data 204 in Fig. 5 without applying the method of the first embodiment before the definition data in Fig. 8 is edited upon update of the control program of the production apparatus and addition of new log formats.
  • the twelfth line, the thirteenth line, and the seventeenth line indicate the log formats that are newly added and correspond to the pieces of log information appearing once.
  • Character strings 903 to 905 output in the twelfth line, the thirteenth line, and the seventeenth line, respectively, should be replaced as the pieces of first information with the pieces of second information, as in the eighth line in which similar information is output.
  • the character strings 903 to 905 are not replaced.
  • Without the method of the first embodiment, it is necessary to edit the definition data synchronously with any new addition or modification of the log format upon update of the control program of the production apparatus. Since the number of definitions in the definition data increases as the kinds of log formats increase, the amount of log information for which the information to be replaced is not determined also increases.
  • the first information is capable of being replaced without adding the log format to the first definition data 205 if the first information is the same as the first information added to the second definition data 206.
  • With the log data processing method described above, it is possible to eliminate or reduce the burden of the replacement.
  • the replacement is available when the first information that is capable of being identified from the first information acquired from other log information is output even if the log information for which the information to be replaced is not determined appears only once (not multiple times).
  • When the first information is the confidential information and the second information is not the confidential information, it is possible to prevent the confidential information from leaking outside.
  • a second embodiment will now be described with reference to Figs. 10 to 12, in which pieces of third information corresponding to the multiple different pieces of first information are generated with numbers in the second definition data 206. Points that are not described in the second embodiment may be similar to those in the first embodiment.
  • pieces of log information including the same first information in the conversion target log data 204 may be determined to be in the same group.
  • Fig. 10 is a flowchart illustrating an exemplary process performed by the CPU 101 according to the second embodiment.
  • the process illustrated in Fig. 10 differs from the process illustrated in Fig. 3 in Step 1003 and Step 1004.
  • the third information is generated to edit the second definition data 206.
  • the first information included in the conversion target log data 204 is replaced with the third information.
  • the editor 202 performs editing in which the third information corresponding to the first information is generated before the first information is added to the second definition data 206 and the generated third information is added to the second definition data 206 with the first information. If the first information to be added exists in the second definition data 206, the third information corresponding to the first information is not generated and added.
  • In Step 1004, the replacer 203 acquires the conversion target log data 204 and the second definition data 206 and replaces the first information included in the conversion target log data 204 with the third information on the basis of the second definition data 206. Accordingly, the same pieces of information are capable of being determined to be in the same group also in the multiple pieces of third information after the replacement.
  • a predetermined generation rule is set in the editor 202 in advance so that one-to-one correspondence is established between the first information and the third information.
  • the editor 202 may generate a number character string so that the one-to-one correspondence is established between the first information and the third information or may combine an arbitrary prefix or suffix character string with a number to generate a character string so that the one-to-one correspondence is established between the first information and the third information.
  • the editor 202 may define an arbitrary format in the definition data, such as the first definition data 205, to generate the third information according to the format.
  • Fig. 11 illustrates an example of the second definition data 206.
  • The third information corresponding to the first information is generated with the first information and the generated third information is added to the second definition data 206.
  • The pieces of first information and pieces of third information 1101 to 1103 corresponding to the pieces of first information are listed in a comma-delimited manner.
  • The replacer 203 acquires the pieces of third information 1101 to 1103 simultaneously with the acquisition of the second definition data 206 to determine the piece of third information for each piece of first information.
  • Fig. 12 illustrates an example of the conversion result log data 207 generated by performing the replacement on the conversion target log data 204 illustrated in Fig. 5 on the basis of the second definition data 206 illustrated in Fig. 11.
  • Using pieces of third information having the one-to-one correspondence with the pieces of first information in the above manner allows the replacement to be performed so that the pieces of third information 1201, 1202 to 1205, and 1206 are determined to be pieces of log information in different groups, as in the log data before the conversion.
  • Even if log information for which the information to be replaced is not determined appears only once (not multiple times), the replacement is available when the first information output in that log information can be identified from the first information acquired from other log information.
  • When the first information is confidential information and the third information is not, the confidential information can be prevented from leaking outside.
  • When pieces of log information determined to be in the same group on the basis of the same first information exist, these pieces of log information are determined to be in the corresponding group on the basis of the same third information also in the conversion result log data 207.
  • A third embodiment will now be described with reference to Fig. 13 to Fig. 17, in which each piece of first information is divided into multiple elements and pieces of information resulting from conversion for every element are combined with each other to generate the pieces of third information corresponding to the multiple pieces of first information in the second definition data 206. Points that are not described in the third embodiment may be similar to those in the first and second embodiments.
  • The replacement of the first information with the third information different from the first information is suitable for conversion of the log data into a mode readable for human beings, that is, a mode in which the readability of the log data is improved.
  • Fig. 13 illustrates an example of the conversion target log data 204.
  • The conversion target log data 204 is log data of the text format in which a time stamp is added and a new line is started for each piece of log information.
  • The lines in which the pieces of first information to be replaced are output are extracted in the example illustrated in Fig. 13.
  • Character strings 1301 to 1303 coinciding with the pieces of pattern information and pieces of first information 1304 to 1306 to be replaced are output in the example illustrated in Fig. 13.
  • The log formats coinciding with the tenth line and the nineteenth line in Fig. 13 are not defined in the first definition data 205 and correspond to the log information appearing once.
  • The pieces of first information 1304 to 1306 to be replaced are unit IDs for identifying processing units in the production apparatus. Characters and figures that are difficult for human beings to distinguish are listed in the pieces of first information 1304 to 1306.
  • The same pieces of information mean the same units.
  • It is desirable that the unit ID of the same processing unit be replaced with the same piece of third information and that a character string "U001010001" representing a unit ID be replaced with a character string "Station01-R-C01."
  • Fig. 14 is a flowchart illustrating an exemplary process performed by the CPU 101 according to the third embodiment.
  • The process illustrated in Fig. 14 differs from the processes illustrated in Fig. 3 and Fig. 10 in Step 1403, in which the first information is converted for every element to generate the third information and the second definition data 206 is edited.
  • The editor 202 generates the third information corresponding to the first information before adding the first information to the second definition data 206, and adds the third information to the second definition data 206 together with the first information.
  • Fig. 15 is a diagram for describing an exemplary rule for generating the third information corresponding to the first information from the first information.
  • The character string of the first information is divided into three characters (1501) following the first character "U", two characters (1502) following the three characters, and four characters (1503) following the two characters, and correspondence tables that give the information after the replacement for each of the divided pieces of information are set as the rule.
  • Each piece of information after the replacement to be defined in the correspondence tables should be the character string readable for human beings and should be unique in the correspondence table for each element. Accordingly, the third information generated according to the above rule is readable for human beings and the one-to-one correspondence is established between the first information and the third information.
  • The data in the correspondence tables may be certain data defined by the editor 202 in advance.
  • The data in the correspondence tables may be defined as arbitrary data in the first definition data 205, the second definition data 206, or dedicated definition data (not illustrated) and the editor 202 may acquire the defined data.
  • Fig. 16 illustrates an example of the second definition data 206.
  • The editor 202 generates the third information from the first information according to the rule illustrated in Fig. 15.
  • Pieces of third information 1601 to 1603 corresponding to the pieces of first information are added after the pieces of first information and commas.
  • Each piece of third information is generated as one character string in which the pieces of information converted according to the rule in the correspondence tables in Fig. 15 are connected with hyphens (-).
  • The editor 202 generates the third information in the above manner when the first information is added and edits the second definition data 206.
  • The replacer 203 acquires the third information simultaneously with the acquisition of the second definition data 206 to determine the piece of third information for each piece of first information.
  • Fig. 17 illustrates an example of the conversion result log data 207 resulting from replacement of the conversion target log data 204 in Fig. 13 on the basis of the second definition data 206 in Fig. 16.
  • The use of the third information corresponding to the first information in the above manner allows the pieces of first information to be converted into character strings readable for human beings, like pieces of third information 1701 to 1707, and allows pieces of log information in the same group to be determined after the replacement, as in the second embodiment.
  • Even if log information for which the information to be replaced is not determined appears only once (not multiple times), the replacement is available when the first information output in that log information can be identified from the first information acquired from other log information.
  • When the first information is confidential information and the third information is not, the confidential information can be prevented from leaking outside.
  • When pieces of log information determined to be in the same group on the basis of the same first information exist, these pieces of log information are determined to be in the corresponding group on the basis of the same third information also in the conversion result log data 207.
  • The log data processing methods according to the first to third embodiments may be realized in a combination of the first embodiment and the second embodiment, a combination of the first embodiment and the third embodiment, a combination of the second embodiment and the third embodiment, or a combination of all of the first to third embodiments, in addition to the realization of each embodiment.
  • Embodiment(s) of the present invention can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a 'non-transitory computer-readable storage medium') to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s).
  • The computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions.
  • The computer executable instructions may be provided to the computer, for example, from a network or the storage medium.
  • The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.

Abstract

A log data processing method of replacing information included in log data includes acquiring first information identified from first definition data that defines a format of the log data; editing second definition data used to identify the first information included in the log data by adding the acquired first information to the second definition data; and replacing the first information that coincides with the first information in the second definition data and that is included in the log data with other information different from the first information.

Description

LOG DATA PROCESSING METHOD, LOG DATA PROCESSING PROGRAM, AND LOG DATA PROCESSING APPARATUS
The present invention relates to a log data processing method, a log data processing program, and a log data processing apparatus.
In production apparatuses, such as semiconductor manufacturing apparatuses, programs that control the production apparatuses collect information in which histories or the like related to operations and control of various devices configuring the production apparatuses are recorded (such information is hereinafter referred to as log information) and store data in which the multiple pieces of log information are integrated (such data is hereinafter referred to as log data) in information processing apparatuses.
As a method of replacing part of information in log data with other information, there has been, for example, a method of determining information to be replaced according to a replacement condition (rule) for log information in which the format thereof (hereinafter referred to as log format) and the information to be replaced have been grasped in advance (refer to PTL 1).
In the processing method proposed in PTL 1, the log information (message) read out from the log data is clustered into pieces of information to identify variable portions in the log information and confidential information (confidential attribute) is identified using the predetermined replacement condition (rule). If the confidential information is not identified according to the condition (rule), the confidential information is capable of being identified from the positional relationship of the confidential information in the clustered log information.
PTL 1: Japanese Patent Laid-Open No. 2013-137740
The log format or the information to be replaced may be added or modified with update of the control programs for such a production apparatus. In this case, the log information for which the confidential information is not capable of being identified according to the rule may be added and may appear only once. However, with the method proposed in PTL 1, when the log information for which the confidential information is not capable of being identified according to the rule appears only once and no similar information exists, the log information may not be clustered into pieces of information and the confidential information may not be identified. Accordingly, it is necessary for a user to add a rule to identify the confidential information, thereby imposing the burden of the replacement on the user.
The present invention provides a log data processing method, a log data processing program, and a log data processing apparatus capable of eliminating or reducing the burden of replacement.
According to an embodiment of the present invention, a log data processing method of replacing information included in log data is performed in an information processing apparatus. The log data processing method includes acquiring first information identified from first definition data that defines a format of the log data; editing second definition data used to identify the first information included in the log data by adding the acquired first information to the second definition data; and replacing the first information that coincides with the first information in the second definition data and that is included in the log data with other information different from the first information.
Further features of the present invention will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
Fig. 1 is a block diagram illustrating an exemplary hardware configuration of an information processing apparatus.
Fig. 2 is a block diagram illustrating an exemplary configuration of a CPU.
Fig. 3 is a flowchart illustrating an exemplary process performed by the CPU according to a first embodiment.
Fig. 4 illustrates exemplary first definition data according to the first embodiment.
Fig. 5 illustrates exemplary conversion target log data according to the first embodiment.
Fig. 6 illustrates exemplary second definition data according to the first embodiment.
Fig. 7 illustrates exemplary conversion result log data according to the first embodiment.
Fig. 8 illustrates exemplary definition data when the method of the first embodiment is not applied.
Fig. 9 illustrates exemplary conversion result log data when the method of the first embodiment is not applied.
Fig. 10 is a flowchart illustrating an exemplary process performed by the CPU according to a second embodiment.
Fig. 11 illustrates exemplary second definition data according to the second embodiment.
Fig. 12 illustrates exemplary conversion result log data according to the second embodiment.
Fig. 13 illustrates exemplary conversion target log data according to a third embodiment.
Fig. 14 is a flowchart illustrating an exemplary process performed by the CPU according to the third embodiment.
Fig. 15 is a diagram for describing an exemplary rule for generating third information according to the third embodiment.
Fig. 16 illustrates exemplary second definition data according to the third embodiment.
Fig. 17 illustrates exemplary conversion result log data according to the third embodiment.
Embodiments of the present invention will herein be described with reference to the attached drawings.
First Embodiment
A first embodiment will now be described with reference to Fig. 1 to Fig. 7, in which information in log data is replaced with other information and the log data subjected to the conversion is output. The target log data may be data generated in a production apparatus (manufacturing apparatus), such as a semiconductor manufacturing apparatus, or data resulting from processing of the above data. Specifically, the target log data may be data including log information related to the production apparatus. The log information includes a history related to operations and control of machines, electronic devices, and electrical devices composing the production apparatus, a history of communication between the production apparatus and an external system, a history of operations by an operator, and a history of authentication by a user.
For example, in such a production apparatus, information used to distinguish products (goods), information used to distinguish jigs or members used in the production, information concerning the content of processing of the products, and so on are prepared in advance and these pieces of information are input before a production process is started. Upon input of these pieces of information, a history of input operations is recorded in the log data. At this time, the pieces of information that are input may also be recorded. The pieces of information that are input may be confidential information that the user of the production apparatus should keep secret from third parties. As described above, the confidential information may be included in the log data. When the log data is taken out of the production site, it is necessary to replace the portion where the confidential information is output with information that is not confidential.
Fig. 1 is a block diagram illustrating an exemplary hardware configuration of an information processing apparatus (information processing unit) capable of performing a log data processing method. The log data processing method is performed by a processing unit (a central processing unit (CPU) or a micro-processing unit (MPU)) in a computer, which reads out programs to execute the programs. The software and the programs realizing the functions of the information processing apparatus are supplied to an information processing apparatus composed of one or more computers via a network or various recording media. The processing unit in the information processing apparatus reads out the programs recorded in a recording medium or stored in a storage medium to execute the programs. Multiple computers that are apart from each other may transmit and receive data through wired communication or wireless communication to perform various processes in the programs. The information processing apparatus may be provided in a server connected to the production apparatus or in the production apparatus.
In the example illustrated in Fig. 1, a CPU 101 is a processing unit that performs arithmetic operations for a variety of data processing related to the log data processing and controls each component connected to a bus 108. A read only memory (ROM) 102 is a memory exclusively used for reading of data and stores a basic control program. A random access memory (RAM) 103 is a memory used for reading and writing of data and is used to temporarily store data and the results of a variety of arithmetic processing by the CPU 101. An external storage unit 104 is used to store conversion target log data 204, first definition data 205, second definition data 206, and conversion result log data 207, which are described below. The external storage unit 104 is also used as a temporary storage area for a system program of an operating system (OS) in the information processing apparatus and as a temporary storage area during processing of programs and data. Although the data input and output speed of the external storage unit 104 is lower than that of the RAM 103, the external storage unit 104 is capable of storing a large amount of data. The external storage unit 104 is desirably a non-volatile storage unit capable of permanently storing data so that the stored data can be referred to over a long time. Although the external storage unit 104 is mainly composed of a magnetic storage unit (hard disk drive (HDD)), the external storage unit 104 may be a unit that reads and writes data using a loaded external medium, such as a compact disc (CD), a digital versatile disc (DVD), or a memory card.
An input unit 105 is used to input characters and data into the information processing apparatus and corresponds to various keyboards and mice. A display unit 106 is used to display the result of processing by the information processing apparatus and corresponds to a cathode ray tube (CRT) or a liquid crystal monitor. A communication unit 107 is used to establish data communication with another information processing apparatus via a local area network (LAN) using a communication protocol, such as Transmission Control Protocol/Internet Protocol (TCP/IP). The communication unit 107 may acquire the conversion target log data 204, the first definition data 205, the second definition data 206, and the conversion result log data 207 from another information processing apparatus through the data communication and may store the conversion target log data 204, the first definition data 205, the second definition data 206, and the conversion result log data 207 in the other information processing apparatus.
Fig. 2 is a block diagram illustrating an exemplary configuration of the CPU 101. Referring to Fig. 2, the CPU 101 includes an acquirer 201, an editor 202, and a replacer 203. The acquirer 201 acquires first information, which is information to be replaced, using the conversion target log data 204 and the first definition data 205 acquired from the external storage unit 104. The editor 202 acquires the first information from the acquirer 201, performs editing in which the acquired first information is added to the second definition data 206, and stores the second definition data 206 in the external storage unit 104. The replacer 203 generates the conversion result log data 207 using the conversion target log data 204 and the second definition data 206 acquired from the external storage unit 104 and stores the generated conversion result log data 207 in the external storage unit 104.
Fig. 3 is a flowchart illustrating an exemplary process performed by the CPU 101 according to the first embodiment. Referring to Fig. 3, the CPU 101 starts the log data processing. In Step 301, the acquirer 201 acquires the conversion target log data 204. In Step 302, the acquirer 201 acquires the first information included in the log data on the basis of the first definition data 205. In Step 303, the editor 202 performs the editing in which the acquired first information is added to the second definition data 206. In Step 304, the replacer 203 acquires the conversion target log data 204 and the second definition data 206 and replaces the first information included in the conversion target log data 204 with second information, which is different from the first information, on the basis of the second definition data 206. In Step 305, the replacer 203 outputs the conversion result log data 207 as the replaced data. Then, the CPU 101 terminates the log data processing. The above process is the batch processing in which Steps 301 to 305 in the flowchart illustrated in Fig. 3 are performed for the entire conversion target log data 204. Alternatively, part (for example, a range corresponding to one line) of the conversion target log data 204 may be acquired and Steps 301 to 303, 301 to 304, or 301 to 305 may be repeatedly performed in the acquired range.
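The flow of Steps 301 to 305 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `LogFormat`, `acquire`, `edit`, `replace`, and `process`, the placeholder "****" used as the second information, and the choice to replace longer character strings first are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class LogFormat:
    pattern: str      # fixed string marking where first information is output
    position: int     # characters to skip after the matched pattern
    range_regex: str  # regular expression the first information must match


def acquire(log_lines, formats):
    """Step 302: collect first information located by the log formats."""
    found = []
    for line in log_lines:
        for fmt in formats:
            idx = line.find(fmt.pattern)
            if idx < 0:
                continue
            start = idx + len(fmt.pattern) + fmt.position
            match = re.compile(fmt.range_regex).match(line, start)
            if match:
                found.append(match.group(0))
    return found


def edit(second_definition, first_infos):
    """Step 303: add first information that is not registered yet."""
    for info in first_infos:
        if info not in second_definition:
            second_definition.append(info)
    return second_definition


def replace(log_lines, second_definition, second_info="****"):
    """Step 304: replace every registered first information in every line.

    Longer strings are replaced first (an assumption of this sketch) so that
    a shorter entry contained in a longer one does not break it up.
    """
    ordered = sorted(second_definition, key=len, reverse=True)
    result = []
    for line in log_lines:
        for info in ordered:
            line = line.replace(info, second_info)
        result.append(line)
    return result


def process(log_lines, formats, second_definition):
    """Steps 301 to 305 performed as batch processing on the whole log."""
    first_infos = acquire(log_lines, formats)
    edit(second_definition, first_infos)
    return replace(log_lines, second_definition)
```

Because the replacement in Step 304 is driven by the second definition data rather than by the log format, a line whose format is not defined is still converted when it contains first information acquired from another line.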
The acquirer 201 illustrated in Fig. 2 acquires the first information included in the log data on the basis of the first definition data 205, as described above. The situation in which the first information is output in the log can be identified in some cases, as in "the case in which the information is input before the production process is started." For log information for which this situation is determined, the log format used in the output to the log data can be determined in advance, and the position where the first information is acquired in the log data can be determined on the basis of the log format.
Fig. 4 illustrates an example of the first definition data 205 in which the log format is defined. The log format is used as a replacement condition (rule) and is used to determine the position of the first information in the log data. The first definition data 205 in Fig. 4 includes pattern information 401 (401-A to 401-C), position information 402 (402-A to 402-C), and range information 403 (403-A to 403-C) as the log formats. The pattern information 401 is used to determine the position where the first information is output. The position information 402 represents the relationship between the position of a character string coinciding with the pattern information 401 and the position of the first information. The range information 403 defines a range acquired as the first information.
In the example illustrated in Fig. 4, the pattern information 401, the position information 402, and the range information 403 are listed in this order in a comma-delimited manner. One definition is registered in one line, that is, a total of three definitions are registered in the first definition data 205 in Fig. 4. In all of the three definitions, the pattern information 401 and the position information 402 are registered as fixed character strings and the range information 403 is registered as a regular expression or a combination of a fixed character string and a regular expression for the log information of a text format. The first definition data 205 may be stored in a binary format or another format, instead of the text format. The first definition data 205 may be stored in a file format on a file system or may be stored in a database. The content of the first definition data 205 may be reversibly encrypted.
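Reading such comma-delimited definitions might look like the sketch below. The content of Fig. 4 is not reproduced here, so the sample lines, the `LogFormat` type, and the function name are hypothetical; the sketch also assumes that the pattern information itself contains no comma.

```python
from typing import NamedTuple


class LogFormat(NamedTuple):
    pattern: str      # pattern information 401 (fixed character string)
    position: int     # position information 402 (number of characters)
    range_regex: str  # range information 403 (regular expression)


def parse_first_definition(text):
    """Parse one definition per line: pattern,position,range."""
    formats = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split on the first two commas only, so that a regular expression
        # containing a comma (for example "{1,3}") stays intact.
        pattern, position, range_regex = line.split(",", 2)
        formats.append(LogFormat(pattern, int(position), range_regex))
    return formats


# Hypothetical sample with three definitions, one per line.
SAMPLE = "\n".join([
    r"LotID=,0,.+",
    r"Recipe=,0,.+\.job",
    r"Mask=,5,.+\.msk",
])
```

Calling `parse_first_definition(SAMPLE)` yields three `LogFormat` entries in the order in which they are registered.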
The pattern information 401 is defined on the basis of the log information for which the output of the first information with a predetermined pattern, for example, a certain character string or data set is set as the log format. A fixed character string, a regular expression, or a combination of a fixed character string and a regular expression may be used for the log information of the text format. A series of data sets of a certain order may be used for the log information of the binary format or another format.
The position information 402 defines information based on a predetermined rule that represents the positional relationship between the pattern identified by the pattern information 401 and the position where the acquisition of the first information is started. In the example illustrated in Fig. 4, the position information 402 is registered for the log information of the text format. The first information is acquired from a position shifted from a character next to the final character of the character string coinciding with the pattern information 401 to the end-of-line direction by the number of characters of the position information 402. "0" is registered for the position information 402 in the first and second definitions and "5" is registered for the position information 402 in the third definition.
The range information 403 defines information based on a predetermined method of defining the range acquired as the first information. In the definition of the range, for example, the number of characters to be acquired or a regular expression coinciding with the first information as a character string pattern may be defined for the log information of the text format. In the example illustrated in Fig. 4, regular expressions with which the character strings of the pieces of first information coincide are defined and registered, as in the pieces of range information 403-A to 403-C. The range information 403-A represents an arbitrary character string of one or more characters. The range information 403-B represents a character string resulting from combination of an arbitrary character string of one or more characters and a fixed character string ".job." The range information 403-C represents a character string resulting from combination of an arbitrary character string of one or more characters and a fixed character string ".msk."
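How the pattern information, the position information, and the range information work together can be illustrated with a short sketch. The function name, the sample log lines, and the sample patterns are assumptions made for illustration; only the three range expressions (an arbitrary string, a string ending in ".job", and a string ending in ".msk") follow the description above.

```python
import re


def extract_first_information(line, pattern, position, range_regex):
    """Return the first information in `line`, or None if the format does not apply."""
    idx = line.find(pattern)
    if idx < 0:
        return None  # the pattern information does not coincide with this line
    # Start at the character next to the final character of the matched
    # pattern, shifted toward the end of the line by `position` characters.
    start = idx + len(pattern) + position
    match = re.compile(range_regex).match(line, start)
    return match.group(0) if match else None


# Position 0: the first information directly follows the pattern.
job = extract_first_information("ts Recipe=etch_a.job", "Recipe=", 0, r".+\.job")

# Position 5: the five characters "[01] " after the pattern are skipped.
mask = extract_first_information("ts Mask=[01] wafer7.msk", "Mask=", 5, r".+\.msk")
```

Here `job` is "etch_a.job" and `mask` is "wafer7.msk". Note that a greedy expression such as `.+` runs to the end of the line, so the range information has to be chosen with the actual log format in mind.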
Fig. 5 illustrates an example of the conversion target log data 204. The conversion target log data 204 may be stored in the binary format or another format, instead of the text format. The conversion target log data 204 may be stored in a file format on a file system or may be stored in a database. The content of the conversion target log data 204 may be reversibly encrypted.
In the example illustrated in Fig. 5, the conversion target log data 204 is the log data of the text format in which a time stamp is added and a new line is started for each piece of log information. In the first embodiment, the time stamp is simply represented by a character string "YYYY-MM-DD HH:MM:SS.SSS." In particular, the lines in which the pieces of first information to be replaced are output are extracted in the example illustrated in Fig. 5. An example will now be described, in which the first information is acquired from the conversion target log data 204 in Fig. 5 on the basis of the first definition data 205 in Fig. 4, the second definition data 206 is edited, and the conversion result log data 207 is output.
The acquirer 201 acquires the conversion target log data 204 in Fig. 5 and determines whether a portion coinciding with any character string in the pattern information 401 defined in the first definition data 205 in Fig. 4 is included in the conversion target log data 204. In the determination, a method of comparing each character in the log information with each character in the pattern information 401 from the beginning of the log information may be used or a method using a Rabin-Karp string search algorithm, a Boyer-Moore string search algorithm, or another string search algorithm may be used.
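As one possible form of the Rabin-Karp search mentioned above, the following sketch compares rolling hash values and confirms each candidate by a character comparison. The function name and the hash parameters `base` and `mod` are assumptions of this illustration; in practice a built-in search such as `str.find` would serve the same purpose.

```python
def rabin_karp_find(text, pattern, base=256, mod=1_000_000_007):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the leading character of a window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        # Compare characters only when the hashes agree, to rule out collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return -1
```

The returned index plays the role of the position of the character string coinciding with the pattern information 401, from which the acquisition position of the first information is then derived.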
In the example illustrated in Fig. 5, it is determined that the pattern information 401-A registered in the first definition in the first definition data 205 in Fig. 4 is included in the fourth line in the log data. When the coincidence of the pattern information is determined, the acquirer 201 sets a position shifted from a character next to the final character of a character string 501 coinciding with the pattern information 401 to the end-of-line direction by the number of characters of the position information 402-A as the position where the first information is acquired. The acquirer 201 acquires a character string 504 coinciding with the range information 403-A from the position as the first information. Similarly, a character string 505 and a character string 506 are acquired from the eighth line and the twenty-first line, respectively, in Fig. 5 as the pieces of first information on the basis of the pattern information 401-B, the position information 402-B, and the range information 403-B registered in the second definition of the first definition data 205. The log formats coinciding with the twelfth line, the thirteenth line, and the seventeenth line in Fig. 5 are not defined in the first definition data 205 in Fig. 4 and correspond to the pieces of log information appearing once. No first information is acquired in the example illustrated in Fig. 5 for the third definition in the first definition data 205 in Fig. 4. In the first embodiment, the character strings 504 to 506 are simply represented for convenience.
Next, the editor 202 performs editing in which the character strings 504 to 506 acquired as the pieces of first information are added to the second definition data 206. Fig. 6 illustrates an example of the second definition data 206 in which the pieces of first information are stored in the text format. In the example illustrated in Fig. 6, each character string in the pattern information 401 is added in a bracket ([]) and the corresponding piece(s) of first information acquired from the conversion target log data 204 is (are) added to the subsequent line (lines).
The editor 202 does not redundantly add first information that already exists in the second definition data 206. Accordingly, the first information in the second definition data 206 may be checked when first information is added, and only the pieces of first information which do not exist in the second definition data 206 may be added. Alternatively, after all the pieces of first information included in the conversion target log data 204 are added to the second definition data 206, the redundant pieces of first information may be deleted. When first information that includes the piece of first information to be added already exists, the existing first information may be divided. When first information that includes an existing piece of first information is to be added, the first information to be added may be divided for addition. When the size of the second definition data 206 is increased, an unnecessary piece of first information may be determined and deleted. In order to determine the unnecessary piece of first information, for example, a piece of first information that has not appeared in the conversion target log data 204 for a certain period may be determined to be unnecessary, or a piece of first information that was added to the second definition data 206 more than a certain period ago may be determined to be unnecessary.
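The non-redundant addition performed by the editor 202 may be sketched as below. The in-memory layout (a dictionary from pattern information to a list of pieces of first information) and the names `add_first_info` and `to_text` are assumptions made for illustration; only the bracketed text layout follows the description of Fig. 6.

```python
from typing import Dict, List

# Hypothetical in-memory form of the second definition data:
# pattern information -> pieces of first information acquired for it.
SecondDefinition = Dict[str, List[str]]


def add_first_info(second_def: SecondDefinition, pattern: str,
                   values: List[str]) -> List[str]:
    """Add only the values not yet registered; return the values actually added."""
    bucket = second_def.setdefault(pattern, [])
    added = []
    for value in values:
        if value not in bucket:  # skip first information that already exists
            bucket.append(value)
            added.append(value)
    return added


def to_text(second_def: SecondDefinition) -> str:
    """Serialize in a text layout: the pattern in brackets, then one value per line."""
    lines = []
    for pattern, values in second_def.items():
        lines.append(f"[{pattern}]")
        lines.extend(values)
    return "\n".join(lines)
```

This sketch checks for existing entries at the time of addition; the alternative of adding everything and deleting duplicates afterwards would give the same stored content.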
The second definition data 206 may be stored in the binary format or another format, instead of the text format. The second definition data 206 may be stored in a file format on a file system or may be stored in a database. The content of second definition data 206 may be reversibly encrypted. The first definition data 205 and the second definition data 206 may be in the same file when they are stored in the file format and may be in the same table when they are stored in the database.
The replacer 203 acquires the second definition data 206 and the conversion target log data 204. The replacer 203 searches the conversion target log data 204 for the piece of first information existing in the second definition data 206 and replaces the character string coinciding with the piece of first information existing in the second definition data 206 with the second information. In order to determine the coincidence with the first information in the conversion target log data 204, a method of comparing each character in the character string of the log information with each character in the character string of the first information from the beginning of the log information may be used or a method using the Rabin-Karp string search algorithm, the Boyer-Moore string search algorithm, or another string search algorithm may be used.
All the portions coinciding with the first information in the conversion target log data 204 are replaced with the second information. The second information may be a certain fixed character string defined by the replacer 203 in advance. The second information may be defined as an arbitrary fixed character string in the first definition data 205, the second definition data 206, or dedicated definition data (not illustrated) and the replacer 203 may acquire the defined second information. Although the example in which the second information is defined in the second definition data 206 is illustrated in Fig. 6, the second information is not necessarily included in the second definition data 206, as described above. In the example illustrated in Fig. 6, the pieces of pattern information (in brackets) are followed by commas and the corresponding pieces of second information 601 to 603. The replacer 203 acquires the pieces of second information 601 to 603 simultaneously with the acquisition of the second definition data 206 to determine the second information for the first information.
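A minimal sketch of the replacement by the replacer 203 follows. The mapping from first information to second information, the function name `replace_first_info`, and the sample log text are hypothetical. Trying longer pieces of first information before shorter ones is a design choice of this sketch, not something stated in the embodiment.

```python
import re
from typing import Dict


def replace_first_info(log_text: str, mapping: Dict[str, str]) -> str:
    """Replace every occurrence of each key of mapping with its value."""
    if not mapping:
        return log_text
    # Longer keys first, so that a key contained in another key does not
    # cause a partial replacement of the longer one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], log_text)
```

Because the search uses only the stored first information, a line whose log format is not registered is also converted when it contains a piece of first information that was acquired from another line.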
The replacer 203 outputs the conversion result log data 207 in which the first information included in the conversion target log data 204 is replaced with the second information. Fig. 7 illustrates an example in which the conversion result log data 207 is output in the same format as that of the conversion target log data 204. The conversion result log data 207 may be stored in the binary format or another format, instead of the text format. The conversion result log data 207 may have the same format as that of the conversion target log data 204 or may have a format different from that of the conversion target log data 204. The conversion result log data 207 may be stored in a file format on a file system or may be stored in a database. The content of the conversion result log data 207 may be reversibly encrypted.
Referring to Fig. 7, a character string 701 is replaced with the second information 601 in Fig. 6 and character strings 702 to 706 are replaced with the second information 602 in Fig. 6. As described above, the pieces of first information in lines of the log data whose log formats are not defined in the first definition data 205, such as the twelfth line, the thirteenth line, and the seventeenth line in Fig. 5, are also capable of being replaced on the basis of the pieces of first information added to the second definition data 206, without defining the log formats in the first definition data 205 in advance. When the log format of the first information that is replaced is not defined in the first definition data 205, the replacer 203 may add the log format to the first definition data 205. As described above, even when the control program of the production apparatus is next updated, the replacement is available without adding the log format to the first definition data 205 if the first information that is output is capable of being identified from the first information acquired from other log information. In addition, among the lines included in the conversion target log data 204, the replacer 203 may replace, with a certain fixed character string, part or all of each line other than the lines in which the first information is replaced with the second information. Part of the line corresponds to, for example, the character string from the end of the time stamp to the end of the line. This allows the character strings to be replaced without omission.
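The optional replacement of the remaining lines with a fixed character string may be sketched as follows. The time stamp format, the fixed strings, and the name `mask_other_lines` are all assumptions introduced for this sketch; the actual time stamp format of Fig. 5 is not reproduced here.

```python
import re
from typing import Iterable, List

# Assumed time stamp format at the beginning of each line.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def mask_other_lines(lines: Iterable[str], first_infos: List[str],
                     second_info: str = "****",
                     filler: str = "(masked)") -> List[str]:
    """Replace first information where it appears; mask the rest of the other lines."""
    out = []
    for line in lines:
        if any(v in line for v in first_infos):
            # Line subjected to the replacement of the first information.
            for v in first_infos:
                line = line.replace(v, second_info)
            out.append(line)
        else:
            # Any other line: keep the time stamp, replace what follows it.
            m = TIMESTAMP.match(line)
            out.append(f"{m.group(0)} {filler}" if m else filler)
    return out
```

Masking the other lines trades away detail in the output for the assurance that a character string missed by the second definition data is not carried into the conversion result.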
(Comparative Examples)
Examples to which the method of the first embodiment is not applied will now be described with reference to Fig. 8 and Fig. 9 for comparison. Fig. 8 illustrates exemplary definition data when the method of the first embodiment is not applied. As in the example of the first definition data 205 illustrated in Fig. 4, the definition data includes pattern information 801 (801-A to 801-C), position information 802 (802-A to 802-C), and range information 803 (803-A to 803-C). In addition, the definition data also includes second information 804 (804-A to 804-C). In the example illustrated in Fig. 8, the pattern information 801, the position information 802, the range information 803, and the second information 804 are listed in this order in a comma-delimited manner. One definition is registered in one line, that is, a total of three definitions are registered in the definition data in Fig. 8. The character string coinciding with the range information 803 is replaced with the second information 804 registered in the definition data from a position shifted from a character next to the final character of the character string coinciding with the pattern information 801 registered in the definition data in Fig. 8 to the end-of-line direction by the number of characters of the position information 802. When the method of the first embodiment is not applied, it is necessary to perform the definition for all the pieces of pattern information in the log formats in which the pieces of first information to be replaced are output.
Fig. 9 illustrates an example of the conversion result log data 207 obtained by converting the conversion target log data 204 in Fig. 5 without applying the method of the first embodiment, in a state where the control program of the production apparatus has been updated and new log formats have been added but the definition data in Fig. 8 has not yet been edited. In the example illustrated in Fig. 9, the twelfth line, the thirteenth line, and the seventeenth line indicate the log formats that are newly added and correspond to the pieces of log information appearing once. Character strings 903 to 905 output in the twelfth line, the thirteenth line, and the seventeenth line, respectively, should be replaced as the pieces of first information with the pieces of second information, as in the eighth line in which similar information is output. However, since the editing in which the log formats are added to the definition data is not performed, the character strings 903 to 905 are not replaced. As described above, when the method of the first embodiment is not applied, it is necessary to edit the definition data synchronously with any new addition or modification of the log format upon update of the control program of the production apparatus. Since the number of definitions in the definition data increases as the kinds of log formats increase, the amount of log information for which the information to be replaced is not determined also increases.
In contrast, with the log data processing method according to the first embodiment, even in the log format that is not defined in the first definition data 205, the first information is capable of being replaced without adding the log format to the first definition data 205 if the first information is the same as the first information added to the second definition data 206.
Accordingly, with the log data processing method according to the first embodiment, it is possible to eliminate or reduce the burden of the replacement. In addition, the replacement is available when the first information that is capable of being identified from the first information acquired from other log information is output even if the log information for which the information to be replaced is not determined appears only once (not multiple times). When the first information is the confidential information and the second information is not the confidential information, it is possible to prevent the confidential information from leaking outside.
Second Embodiment
A second embodiment will now be described with reference to Fig. 10 to Fig. 12, in which pieces of third information corresponding to the multiple different pieces of first information are generated with numbers in the second definition data 206. Points that are not described in the second embodiment may be similar to those in the first embodiment. In the replacement of the first information with the third information different from the first information, pieces of log information including the same first information in the conversion target log data 204 may be determined to be in the same group. In this case, it may be desirable to replace the same first information with the same third information so that the corresponding pieces of log information are determined to be in the same group also in the conversion result log data 207. For example, when the pieces of first information are lot identifiers (IDs) used to distinguish production lots from each other, it is desirable that the pieces of log information in which the same lot ID is replaced be determined to be in the same group in the conversion result log data 207.
Fig. 10 is a flowchart illustrating an exemplary process performed by the CPU 101 according to the second embodiment. The process illustrated in Fig. 10 differs from the process illustrated in Fig. 3 in Step 1003 and Step 1004. In Step 1003, the third information is generated to edit the second definition data 206. In Step 1004, the first information included in the conversion target log data 204 is replaced with the third information. Specifically, in Step 1003, the editor 202 performs editing in which the third information corresponding to the first information is generated before the first information is added to the second definition data 206 and the generated third information is added to the second definition data 206 with the first information. If the first information to be added exists in the second definition data 206, the third information corresponding to the first information is not generated and added. In Step 1004, the replacer 203 acquires the conversion target log data 204 and the second definition data 206 and replaces the first information included in the conversion target log data 204 with the third information on the basis of the second definition data 206. Accordingly, the same pieces of information are capable of being determined to be in the same group also in the multiple pieces of third information after the replacement. In the generation of the third information, a predetermined generation rule is set in the editor 202 in advance so that one-to-one correspondence is established between the first information and the third information. For example, the editor 202 may generate a number character string so that the one-to-one correspondence is established between the first information and the third information or may combine an arbitrary prefix or suffix character string with a number to generate a character string so that the one-to-one correspondence is established between the first information and the third information. 
Alternatively, the editor 202 may define an arbitrary format in the definition data, such as the first definition data 205, to generate the third information according to the format.
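One possible generation rule that keeps the one-to-one correspondence is a running number combined with a prefix, sketched below. The class name `ThirdInfoRegistry`, the default prefix, and the four-digit zero padding are assumptions for illustration and are not taken from Fig. 11.

```python
from typing import Dict


class ThirdInfoRegistry:
    """Hypothetical holder of first information -> third information pairs."""

    def __init__(self, prefix: str = "ID") -> None:
        self.prefix = prefix
        self.mapping: Dict[str, str] = {}
        self._next = 1  # running number; never reused, so values stay unique

    def get(self, first_info: str) -> str:
        """Return the third information, generating it only on first appearance."""
        if first_info not in self.mapping:
            self.mapping[first_info] = f"{self.prefix}{self._next:04d}"
            self._next += 1
        return self.mapping[first_info]
```

Because the same first information always yields the same third information and different first information never shares a number, pieces of log information that belonged to one group before the conversion remain identifiable as one group after it.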
Fig. 11 illustrates an example of the second definition data 206. In the example illustrated in Fig. 11, the third information corresponding to the first information is generated with the first information and the generated third information is added to the second definition data 206. Specifically, the pieces of first information and pieces of third information 1101 to 1103 corresponding to the pieces of first information are listed in a comma-delimited manner. The replacer 203 acquires the pieces of third information 1101 to 1103 simultaneously with the acquisition of the second definition data 206 to determine the piece of third information for each piece of first information.
Fig. 12 illustrates an example of the conversion result log data 207 generated by performing the replacement on the conversion target log data 204 illustrated in Fig. 5 on the basis of the second definition data 206 illustrated in Fig. 11. The use of the pieces of third information having the one-to-one correspondence with the pieces of first information in the above manner allows the replacement to be performed so that the pieces of log information including the pieces of third information 1201, 1202 to 1205, and 1206 are determined to be in different groups, as in the log data before the conversion.
Accordingly, with the log data processing method according to the second embodiment, it is possible to eliminate or reduce the burden of the replacement. In addition, the replacement is available when the first information that is capable of being identified from the first information acquired from other log information is output even if the log information for which the information to be replaced is not determined appears only once (not multiple times). When the first information is the confidential information and the third information is not the confidential information, it is possible to prevent the confidential information from leaking outside. When the pieces of log information determined to be in the same group on the basis of the same first information exist, the pieces of log information are determined to be in the corresponding group on the basis of the same third information also in the conversion result log data 207.
Third Embodiment
A third embodiment will now be described with reference to Fig. 13 to Fig. 17, in which each piece of first information is divided into multiple elements and pieces of information resulting from conversion for every element are combined with each other to generate the pieces of third information corresponding to the multiple pieces of first information in the second definition data 206. Points that are not described in the third embodiment may be similar to those in the first and second embodiments. The replacement of the first information with the third information different from the first information is suitable for conversion of the log data into a mode readable for human beings, that is, a mode in which the readability of the log data is improved.
Fig. 13 illustrates an example of the conversion target log data 204. The conversion target log data 204 is the log data of the text format in which a time stamp is added and a new line is started for each piece of log information. In particular, the lines in which the pieces of first information to be replaced are output are extracted in the example illustrated in Fig. 13. Character strings 1301 to 1303 coinciding with the pieces of pattern information and pieces of first information 1304 to 1306 to be replaced are output in the example illustrated in Fig. 13. The log formats coinciding with the tenth line and the nineteenth line in Fig. 13 are not defined in the first definition data 205 and correspond to the log information appearing once.
The pieces of first information 1304 to 1306 to be replaced are unit IDs for identifying processing units in the production apparatus. Characters and figures that are difficult for human beings to distinguish are listed in the pieces of first information 1304 to 1306. When "the same pieces of information means the same units" for the pieces of first information in the replacement of the first information with the third information, it is desirable to determine the same units from the pieces of information after the replacement and to replace the first information with the information readable for human beings. For example, it is desirable that the unit ID of the same processing unit be replaced with the same pieces of third information and a character string "U001010001" representing a unit ID be replaced with a character string "Station01-R-C01."
Fig. 14 is a flowchart illustrating an exemplary process performed by the CPU 101 according to the third embodiment. The process illustrated in Fig. 14 differs from the processes illustrated in Fig. 3 and Fig. 10 in Step 1403, in which the first information is converted for every element to generate the third information and the second definition data 206 is edited. In Step 1403, the editor 202 performs editing in which the third information corresponding to the first information is generated before the first information is added to the second definition data 206 and the third information is added to the second definition data 206 with the first information.
Fig. 15 is a diagram for describing an exemplary rule for generating the third information corresponding to the first information from the first information. In the example illustrated in Fig. 15, the character string of the first information is divided into three characters (1501) following the first character "U", two characters (1502) following the three characters, and four characters (1503) following the two characters and correspondence tables of the pieces of information after the replacement, which correspond to the divided pieces of information, are set as the rule. Each piece of information after the replacement to be defined in the correspondence tables should be the character string readable for human beings and should be unique in the correspondence table for each element. Accordingly, the third information generated according to the above rule is readable for human beings and the one-to-one correspondence is established between the first information and the third information. The data in the correspondence tables may be certain data defined by the editor 202 in advance. Alternatively, the data in the correspondence tables may be defined as arbitrary data in the first definition data 205, the second definition data 206, or dedicated definition data (not illustrated) and the editor 202 may acquire the defined data.
Fig. 16 illustrates an example of the second definition data 206. The editor 202 generates the third information from the first information according to the rule illustrated in Fig. 15. In the example illustrated in Fig. 16, in the addition of the pieces of first information, pieces of third information 1601 to 1603 corresponding to the pieces of first information are added after the pieces of first information and commas. Each piece of third information is generated as one character string in which the pieces of information converted according to the rule in the correspondence tables in Fig. 15 are connected with hyphen (-). The editor 202 generates the third information in the above manner in the addition of the first information and edits the second definition data 206. The replacer 203 acquires the third information simultaneously with the acquisition of the second definition data 206 to determine the piece of third information for each piece of first information.
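The element-wise conversion may be sketched as follows. The division into three, two, and four characters after the leading "U" and the hyphen-joined result follow the description above. The table names and all table entries other than those implied by the example "U001010001" and "Station01-R-C01" are hypothetical, since the content of Fig. 15 is not reproduced here.

```python
# Correspondence tables, one per element. Only the first entry of each table
# is implied by the example in the text; the second entries are hypothetical.
STATION = {"001": "Station01", "002": "Station02"}  # three characters after "U"
KIND = {"01": "R", "02": "L"}                       # next two characters
CHAMBER = {"0001": "C01", "0002": "C02"}            # last four characters


def to_third_info(first_info: str) -> str:
    """Convert a unit ID such as 'U001010001' into a readable character string."""
    if len(first_info) != 10 or not first_info.startswith("U"):
        raise ValueError(f"unexpected unit ID: {first_info!r}")
    station = first_info[1:4]
    kind = first_info[4:6]
    chamber = first_info[6:10]
    # Each element is converted through its own table and joined with hyphens.
    # An element missing from its table raises KeyError in this sketch.
    return "-".join((STATION[station], KIND[kind], CHAMBER[chamber]))
```

As long as each value is unique within its own table, two different unit IDs cannot produce the same joined character string, which is what preserves the one-to-one correspondence.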
Fig. 17 illustrates an example of the conversion result log data 207 resulting from replacement of the conversion target log data 204 in Fig. 13 on the basis of the second definition data 206 in Fig. 16. The use of the third information corresponding to the first information in the above manner allows the pieces of first information to be converted into the character strings readable for human beings, like pieces of third information 1701 to 1707, and allows the pieces of log information replaced from the pieces of first information in the same group to be determined, as in the second embodiment.
Accordingly, with the log data processing method according to the third embodiment, it is possible to eliminate or reduce the burden of the replacement. In addition, the replacement is available when the first information that is capable of being identified from the first information acquired from other log information is output even if the log information for which the information to be replaced is not determined appears only once (not multiple times). When the first information is the confidential information and the third information is not the confidential information, it is possible to prevent the confidential information from leaking outside. When the pieces of log information determined to be in the same group on the basis of the same first information exist, the pieces of log information are determined to be in the corresponding group on the basis of the same third information also in the conversion result log data 207. In addition, it is possible to convert the first information into a mode in which the readability of the log data is improved.
The log data processing methods according to the first to third embodiments may be realized in a combination of the first embodiment and the second embodiment, a combination of the first embodiment and the third embodiment, a combination of the second embodiment and the third embodiment, or a combination of all of the first to third embodiments, in addition to the realization of each embodiment.
According to the present invention, it is possible to provide a log data processing method, a log data processing program, and a log data processing apparatus capable of eliminating or reducing the burden of the replacement.
Other Embodiments
Embodiment(s) of the present invention can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a 'non-transitory computer-readable storage medium') to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2015-042754, filed March 04, 2015, which is hereby incorporated by reference herein in its entirety.

Claims (13)

  1. A log data processing method of replacing information included in log data, which is performed in an information processing apparatus, the log data processing method comprising:
    acquiring first information identified from first definition data that defines a format of the log data;
    editing second definition data used to identify the first information included in the log data by adding the acquired first information to the second definition data; and
    replacing the first information that coincides with the first information in the second definition data and that is included in the log data with other information different from the first information.
  2. The log data processing method according to Claim 1,
    wherein the acquiring acquires a plurality of pieces of first information that are different from each other,
    wherein the editing generates pieces of third information corresponding to the plurality of pieces of first information in the second definition data and edits the second definition data using the acquired pieces of first information and the pieces of third information, and
    wherein the replacing replaces the pieces of first information that coincide with the pieces of first information in the second definition data and that are included in the log data with the pieces of third information identified from the second definition data.
  3. The log data processing method according to Claim 2,
    wherein the pieces of third information establish one-to-one correspondence with the pieces of first information.
  4. The log data processing method according to Claim 3,
    wherein the editing generates the pieces of third information which are added with numbers corresponding to the pieces of first information.
  5. The log data processing method according to Claim 2,
    wherein the pieces of third information have readability higher than that of the pieces of first information.
  6. The log data processing method according to Claim 5,
    wherein the editing divides the pieces of first information into a plurality of elements and generates the pieces of third information by combining pieces of information resulting from conversion for every element.
  7. The log data processing method according to Claim 1,
    wherein the replacing replaces confidential information included in the log data with information that is not the confidential information.
  8. The log data processing method according to Claim 1,
    wherein the first definition data and the second definition data are in a same file or in a same table.
  9. The log data processing method according to Claim 1,
    wherein, if an unnecessary piece of first information exists in the second definition data, the editing deletes the unnecessary piece of first information from the second definition data.
  10. The log data processing method according to Claim 1,
    wherein, among lines included in the log data, the replacing replaces lines other than a line in which the first information is replaced with other information different from the first information with fixed character strings.
  11. A program causing an information processing apparatus to perform the log data processing method according to Claim 1.
  12. A log data processing apparatus that replaces information included in log data, the log data processing apparatus comprising:
    an acquiring unit configured to acquire first information identified from first definition data that defines a format of the log data;
    an editing unit configured to edit second definition data used to identify the first information included in the log data by adding the acquired first information to the second definition data; and
    a replacing unit configured to replace the first information that coincides with the first information in the second definition data and that is included in the log data with other information different from the first information.
  13. A manufacturing apparatus that manufactures a good, the manufacturing apparatus comprising:
    an information processing unit configured to process log data related to the manufacturing apparatus,
    wherein the information processing unit acquires first information identified from first definition data that defines a format of the log data; edits second definition data used to identify the first information included in the log data by adding the acquired first information to the second definition data; and replaces the first information that coincides with the first information in the second definition data and that is included in the log data with other information different from the first information.
PCT/JP2016/000978 2015-03-04 2016-02-24 Log data processing method, log data processing program, and log data processing apparatus WO2016139918A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2015042754A JP2016162366A (en) 2015-03-04 2015-03-04 Log data processing method, log data processing program, and log data processor
JP2015-042754 2015-03-04

Publications (1)

Publication Number Publication Date
WO2016139918A1 (en) 2016-09-09

Family

ID=56845157

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/000978 WO2016139918A1 (en) 2015-03-04 2016-02-24 Log data processing method, log data processing program, and log data processing apparatus

Country Status (2)

Country Link
JP (1) JP2016162366A (en)
WO (1) WO2016139918A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110063672A1 (en) * 2009-09-16 2011-03-17 Konica Minolta Business Technologies, Inc. Apparatus and method for log management, and computer-readable storage medium for computer program
US20110276541A1 (en) * 2010-05-10 2011-11-10 Ricoh Company, Ltd. Information processing system
JP2014235568A (en) * 2013-06-03 2014-12-15 富士通株式会社 Data processing apparatus and data processing apparatus program for use in failure analysis, data processing method for use in failure analysis, and data processing method for use in failure analysis

Also Published As

Publication number Publication date
JP2016162366A (en) 2016-09-05

Similar Documents

Publication Publication Date Title
CN110781231A (en) Batch import method, device, equipment and storage medium based on database
JP2021518021A (en) Data processing methods, equipment and computer readable storage media
US20160019266A1 (en) Query generating method and query generating device
US8359359B2 (en) Device, method, and computer program product for supporting creation of reply mail
WO2016139918A1 (en) Log data processing method, log data processing program, and log data processing apparatus
JP2015019372A (en) Data analysis system and method
JP2018022433A (en) Control program, apparatus, and method
CN110636042B (en) Method, device and equipment for updating verified block height of server
JPWO2013031129A1 (en) Information processing apparatus, information processing method, and program
JP2017130159A (en) Communication control device, communication control method, program, and communication system
JP6900265B2 (en) Data analysis system and data analysis method
JP6728840B2 (en) Image processing server, distribution device and program
JP2015022356A (en) Test scenario variation creation device, method, and program
JP5718256B2 (en) System performance analysis apparatus, system performance analysis method, and system performance analysis program
US11853969B2 (en) Managing artifact information, especially comparing and merging artifact information, method and system
JP2014099004A (en) Master file difference automatic output device
CN110781194B (en) Application program table data processing method and device, electronic equipment and storage medium
WO2017145357A1 (en) Information processing device, information processing method, and information processing program
JP6708086B2 (en) Operation information recording program, operation information recording device, and operation information recording method
WO2020170401A1 (en) Information processing device, information processing method, and information processing program
CN104778244A (en) Method and device for searching data
JPWO2013168375A1 (en) Security design apparatus and security design method
JP6606876B2 (en) Information processing apparatus and information processing method
US20150309975A1 (en) Non-transitory computer readable medium, information processing apparatus, and information processing method
JP2015011595A (en) Information processor and information processing program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 16758617; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 16758617; Country of ref document: EP; Kind code of ref document: A1