US20210056071A1 - Method for generating a coherent representation for at least two log files - Google Patents

Publication number
US20210056071A1
Authority
US
United States
Prior art keywords
log
log files
files
file
entry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/547,782
Inventor
Dmitriy Fradkin
André Scholz
Matthias Loskyll
Georgia Olympia Brikis
Rakebul Hasan
Vladimir Lavrik
Alexander STORL
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens AG
Siemens Corp
Original Assignee
Siemens AG
Siemens Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens AG, Siemens Corp
Priority to US16/547,782 (US20210056071A1)
Assigned to SIEMENS CORPORATION. Assignment of assignors interest (see document for details). Assignors: FRADKIN, DMITRIY
Assigned to SIEMENS AKTIENGESELLSCHAFT. Assignment of assignors interest (see document for details). Assignors: Brikis, Georgia Olympia, Hasan, Rakebul, Storl, Alexander, Scholz, André, LAVRIK, Vladimir, LOSKYLL, Matthias
Priority to PCT/EP2020/073289 (WO2021032820A1)
Priority to CN202080059319.5A (CN114245895A)
Priority to EP20771206.8A (EP3991054A1)
Priority to US17/635,203 (US20220292053A1)
Publication of US20210056071A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/116 Details of conversion of file system types or formats
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G06F16/1794 Details of file format conversion

Abstract

Provided is a computer-implemented method for generating a coherent representation for at least two log files, comprising: receiving the at least two log files, wherein each log file of the at least two log files includes at least one log entry with at least one time stamp and at least one message, and wherein the at least two log files differ from one another with respect to at least one distinctive criterion; extracting at least one piece of additional information from each log file of the at least two log files; and combining each log file of the at least two log files with the extracted additional information into at least two processed log files, wherein the at least two processed log files comply with a coherent representation. Further, the invention relates to a corresponding computer program product and generating unit.

Description

    FIELD OF TECHNOLOGY
  • The following relates to a computer-implemented method for generating a coherent representation for at least two log files. Further, the invention relates to a corresponding computer program product and generating unit.
  • BACKGROUND
  • The amount of data, or data volume, continues to increase. The data can include human- and machine-generated data. Such large or voluminous data is known as “big data” or “large-scale data”. In particular, the volume of digital data is expected to grow substantially in the coming years in view of the digital transformation and Industry 4.0.
  • Thus, automated large-scale data analysis or data processing will gain in importance, since manual analysis becomes infeasible for experts. This analysis or processing paradigm encompasses a series of different methods and systems for processing big data. Big data challenges include, in particular, capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data source.
  • Considering complex industrial plants, the industrial plants usually comprise distinct parts, modules or units with a multiplicity of individual functions. Exemplary units include sensors and actuators. The units and functions have to be controlled and regulated in an interacting manner. They are often monitored, controlled and regulated by automation systems, for example the Simatic S7 system of Siemens AG. The units can either exchange data directly with one another or communicate via a bus system with one another and with a master control unit, if the plant has such a unit. The units are connected to the bus system via parallel or, more often, serial interfaces.
  • A large number of log files is generated during operation of such industrial plants. Each log file comprises one or more log entries and has a different structure or format depending on the computing unit, program or process that generated it. Log mining tasks struggle with the variety of log file structures, formats and types found in heterogeneous computer systems, such as the aforementioned industrial plants. Exemplary tasks include the identification of anomalies in the log entries, comparison of the log files from one industrial plant over time, extraction of log files and/or extraction of relevant information from the log files of different industrial plants.
  • According to the prior art, users or experts have to manually analyze the huge number of log files and extract the relevant information from them. However, such manual approaches rely on expert knowledge and require considerable manual effort. Thus, they are error-prone, time-consuming and expensive.
  • According to the prior art, besides the manual approaches, the information extraction can be accomplished automatically with regular expressions. However, the patterns have to be defined and tested by an expert based on expert knowledge. A disadvantage is that the definition, testing and pattern matching are error-prone and time-consuming.
  • An aspect relates to providing a computer-implemented method for generating a coherent representation for at least two log files in an efficient and reliable manner.
  • SUMMARY
  • This problem is solved, according to one aspect of the invention, by a computer-implemented method for generating a coherent representation for at least two log files, comprising the steps:
      • a. Receiving the at least two log files; wherein
      • b. each log file of the at least two log files comprises at least one log entry with at least one time stamp and at least one message; wherein
      • c. the at least two log files differ from one another with respect to at least one distinctive criterion;
      • d. Extracting at least one additional information of each log file of the at least two log files; and
      • e. Combining each log file of the at least two log files with the extracted additional information into at least two processed log files; wherein
      • f. the at least two processed log files comply with a coherent representation.
  • Accordingly, the invention is directed to a computer-implemented method for generating a coherent representation for at least two log files. In other words, the log files comply with a coherent representation, which can be directly used as input for further method steps or applications, e.g. log mining tasks. Log mining tasks are directed to the aforementioned analysis of log files.
  • In a first step, the log files are provided as input. During operation, a computing unit or technical system generates a huge number of log files, as described above. In most cases, the log files are of different formats or types. In other words, according to this example, the distinctive criterion is the format or the type. For example, the log entry structure can vary between different types of log files, i.e. those generated by different programs or computing units.
  • Each log file of the plurality of log files comprises at least a timestamp and a message. Furthermore, each log file can comprise additional elements or information, including an internal structure, a message code and indicators of the computing unit, technical system, subsystem or component where it was generated. In this example, the additional element or information thus gives an indication of the origin of the log file.
  • In further steps, this additional information is extracted from the diverse log files and incorporated into processed log files. The term extracting can equally be referred to as parsing. In other words, the log files are extended with the additional information. The incorporation or extension allows the log files to be understood not only in terms of their content, but also in terms of their origin and other important data.
  • The processed log files are in accordance with a coherent representation. The coherent representation allows the consideration of diverse types of log files from different origins and varying structural characteristics.
  • In one aspect of the invention, the at least one distinctive criterion is selected from the group comprising type, format and structure. Accordingly, a log file can have one or more log entries. In some types of log files, a log entry is exactly one line; in other types, a log entry comprises multiple lines. Moreover, separators between log entries or between different parts of a log message of a log entry can differ from program to program. Time stamps can have different formats in different log files. Part of the timestamp, e.g. the date, can be included in the log file name or in one of the header lines, while the remainder, e.g. the time, is recorded for each log entry. The advantage is that the parsing or extracting step can be flexibly applied to diverse log files irrespective of any such differences.
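The split-timestamp case (date in the file name, time in each entry) could be handled along the following lines. This is a minimal sketch, not the implementation described in the application: the file-name pattern `app_YYYY-MM-DD.log`, the `HH:MM:SS` prefix of each entry and the helper name `full_timestamp` are assumptions chosen purely for illustration.

```python
import re
from datetime import datetime

# Hypothetical conventions: the date is in the file name, the time starts each entry.
DATE_IN_NAME = re.compile(r"(\d{4}-\d{2}-\d{2})")
TIME_IN_ENTRY = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(.*)$")


def full_timestamp(file_name, entry_line):
    """Combine the date from the file name with the time of a single entry."""
    date_match = DATE_IN_NAME.search(file_name)
    time_match = TIME_IN_ENTRY.match(entry_line)
    if not date_match or not time_match:
        return None  # unknown format: leave the decision to the caller
    ts = datetime.strptime(
        f"{date_match.group(1)} {time_match.group(1)}", "%Y-%m-%d %H:%M:%S"
    )
    # Return the complete timestamp together with the remaining message text.
    return ts, time_match.group(2)


print(full_timestamp("app_2019-08-22.log", "13:45:10 Unable to open file /tmp/x"))
```

Returning `None` for unrecognized formats keeps the sketch honest about its limits: other log types would need their own patterns.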
  • In one aspect of the invention the additional information is an information selected from the group comprising: a computing unit which generated the log file, a program which generated the log file, configuration information of the computing unit which generated the log file, a log entry template and a connection between a log entry and the computing unit the log entry references. Accordingly, any additional auxiliary information can be incorporated.
  • Log Entry Template:
  • Usually, log entries are instances of a log entry template. This means that the message of the log entry consists of two parts: a fixed text and dynamically generated values. For example, the log entry template can be expressed as “Unable to open file %s”, where “Unable to open file” is the fixed part and “%s” is the variable part. The actual instances have specific file paths in the message text.
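The relationship between a template and its instances can be illustrated by turning the template into a regular expression. The following is a minimal sketch under the assumption that `%s` is the only placeholder; `template_to_regex` is a hypothetical helper and not part of the described method.

```python
import re


def template_to_regex(template):
    """Turn a template with %s placeholders into a compiled regular expression."""
    fixed_parts = [re.escape(part) for part in template.split("%s")]
    # Each %s becomes a capture group for the dynamically generated value.
    return re.compile("^" + "(.+?)".join(fixed_parts) + "$")


pattern = template_to_regex("Unable to open file %s")
match = pattern.match("Unable to open file /var/log/app/data.csv")
print(match.group(1))  # /var/log/app/data.csv
```

A message that does not match the fixed part yields no match, which is how an instance can be told apart from entries of other templates.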
  • The advantage of this additional or auxiliary information is that the information content of the log files is significantly increased.
  • In another aspect of the invention the coherent representation is an input for log mining or any other analysis.
  • In a further aspect of the invention the method comprises the further step of loading the coherent representation into a knowledge graph.
  • Accordingly, the output of the method, i.e. the coherent representation, can be used for distinct tasks. The knowledge graph is important for the diagnosis and repair of problems in an industrial environment, e.g. in industrial plants. In other words, the method allows the transformation of a set or collection of diverse log files from computing units or systems into a knowledge graph. Thus, problems such as defects or failures of industrial plants can be handled in an efficient and timely manner.
  • A further aspect of the invention is a computer program product directly loadable into an internal memory of a computer, comprising software code portions for performing the steps according to the aforementioned method when said computer program product is running on a computer.
  • A further aspect of the invention is a generating unit for performing the aforementioned method.
  • The unit may be realized as any device, or any means, for computing, in particular for executing a software, an app, or an algorithm. For example, the generating unit may consist of or comprise a central processing unit (CPU) and/or a memory operatively connected to the CPU. The unit may also comprise an array of CPUs, an array of graphical processing units (GPUs), at least one application-specific integrated circuit (ASIC), at least one field-programmable gate array, or any combination of the foregoing. The unit may comprise at least one module which in turn may comprise software and/or hardware. Some, or even all, modules of the units may be implemented by a cloud computing platform.
  • BRIEF DESCRIPTION
  • Some of the embodiments will be described in detail, with reference to the following figures, wherein like designations denote like members, wherein:
  • FIG. 1 illustrates a flowchart of the method according to the invention;
  • FIG. 2 illustrates an exemplary knowledge graph according to an embodiment of the invention;
  • FIG. 3 illustrates distinct log files according to an embodiment of the invention;
  • FIG. 4 illustrates distinct configuration files according to an embodiment of the invention; and
  • FIG. 5 illustrates an exemplary use case of the method according to the invention.
  • DETAILED DESCRIPTION
  • FIG. 1 illustrates a flowchart of the method according to the invention with the method steps S1 to S3. The method steps S1 to S3 will be explained in the following in more detail.
  • In a first step, the at least two log files are received S1, wherein each log file of the at least two log files comprises at least one log entry 10 with at least one time stamp 12 and at least one message 14, wherein the at least two log files differ from one another with respect to at least one distinctive criterion. These log files are depicted in FIG. 3.
  • In a second step at least one additional information of each log file of the at least two log files is extracted S2.
  • In a third step, each log file of the at least two log files is combined with the extracted additional information into at least two processed log files S3, wherein the at least two processed log files comply with a coherent representation.
  • The method according to the invention results in the coherent representation, which can be directly loaded into a knowledge graph. The method can be performed by the generating unit. The generating unit can equally be referred to as a universal parser or universal parsing unit.
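Steps S1 to S3 can be pictured as a small pipeline. The sketch below is an illustration under several assumptions and not the claimed implementation: log files are modelled as in-memory dictionaries, the additional information is reduced to a computing unit and a program name taken from a hypothetical path layout, and the record fields (`timestamp`, `message`, `computing_unit`, `program`) are invented names for one possible coherent representation.

```python
def receive(log_files):
    """S1: accept at least two log files (here: dicts with a path and entries)."""
    if len(log_files) < 2:
        raise ValueError("at least two log files are expected")
    return log_files


def extract_additional_info(log_file):
    """S2: derive auxiliary information; here from a hypothetical path layout."""
    computing_unit, program = log_file["path"].split("/")[:2]
    return {"computing_unit": computing_unit, "program": program}


def combine(log_file, info):
    """S3: merge every entry with the extracted information into uniform records."""
    return [
        {"timestamp": ts, "message": msg, **info}
        for ts, msg in log_file["entries"]
    ]


log_files = [
    {"path": "ComputerA/programX/app.log",
     "entries": [("2019-08-22 13:45:10", "Unable to open file /tmp/x")]},
    {"path": "ComputerB/programY/trace.txt",
     "entries": [("22.08.2019 13:45:11", "Connection lost")]},
]
processed = [combine(f, extract_additional_info(f)) for f in receive(log_files)]
print(processed[0][0]["computing_unit"], processed[1][0]["program"])
```

Both processed files end up with records that share the same keys, which is the property the text calls a coherent representation. Normalizing the differing timestamp formats is left out of this sketch.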
  • Additional or Auxiliary Information
      • a computing unit which generated the log file
        • The information about the computing unit the log file was generated by can be collected.
      • a program which generated the log file
        • The information about the program the log file was generated by can be collected, in particular the name of the program that generated the log file can be extracted.
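        • Log files generated by different computing units, programs or processes can end up in different locations, i.e. along different file paths. The file paths can contain the additional information about which computing units, programs or processes generated which log files. The algorithm is represented by the following exemplary code (Python; the timestamp patterns are illustrative assumptions):
    import re

    def parse_log_file(filepath):
        # Collect (message, timestamp) pairs; an entry may span several lines
        log_entries = []
        buffer, ts = "", ""
        with open(filepath) as log_file:
            for line in log_file:
                found = find_timestamp(line)
                if found is None:
                    # No timestamp: continuation line of the current entry
                    buffer += line
                else:
                    # A new entry starts: store the previous one, if any
                    if ts:
                        log_entries.append((buffer, ts))
                    ts = found
                    buffer = split_string(line, ts)
        if ts:
            log_entries.append((buffer, ts))  # store the last entry
        return log_entries

    def split_string(line, ts):
        # Return the part of the line that follows the timestamp
        pos = line.find(ts) + len(ts)
        return line[pos:]

    # Set of regular expressions specifying different formats of timestamps
    # (the two patterns below are examples only; extend as needed)
    TS_REGEX_LIST = [
        re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}"),
        re.compile(r"\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}"),
    ]

    def find_timestamp(line):
        # Return the first timestamp found in the line, or None
        for regex in TS_REGEX_LIST:
            match = regex.search(line)
            if match:
                return match.group(0)
        return None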
      • Accordingly, the paths of the log files can be extracted to identify the computing unit, program or process that generated the respective log file. Different programs tend to write their log files into separate locations, and data from different computing units is likely to be dumped separately. Thus, the specific log entries can be associated with the respective computing unit, program or process.
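A path-based origin lookup could look like the following sketch. It assumes a dump layout of the form `Plant/Computer/<program path>/logs/<file>`; the function name `origin_from_path` and the returned keys are hypothetical.

```python
from pathlib import PurePosixPath


def origin_from_path(filepath):
    """Derive plant, computing unit and program from a log file path.

    Assumes (hypothetically) that the first component names the plant, the
    second the computing unit, and everything up to a 'logs' directory the
    program location.
    """
    parts = PurePosixPath(filepath).parts
    if len(parts) < 4 or "logs" not in parts[2:]:
        return None  # path does not follow the assumed layout
    logs_index = parts.index("logs", 2)
    return {
        "plant": parts[0],
        "computing_unit": parts[1],
        "program": "/".join(parts[2:logs_index]),
    }


print(origin_from_path("PlantX/ComputerY/opt/programZ/logs/app.log"))
```

Paths that do not follow the assumed layout return `None` rather than a guess, so that such files can be routed to a different extraction rule.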
      • configuration information of the computing unit which generated the log file
        • The device configuration information can be collected, e.g. values of configuration settings in the log entries. Further, certain log files can be linked to the computing units, program or process that generated them.
  • For example, the configuration information or file of a program might specify where the log files will be written or set flags for certain behaviors. These configuration files are depicted in FIG. 4.
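As an illustration of how configuration information might be collected and linked to log files, the sketch below reads an INI-style configuration and lists its values. The INI format, the section name `logging` and the keys `log_dir` and `verbose` are assumptions; real configuration files may use entirely different formats.

```python
import configparser

# Hypothetical configuration file content of a program.
CONFIG_TEXT = """
[logging]
log_dir = /opt/programZ/logs
verbose = true
"""


def read_config_values(text):
    """Return all configuration values as (section, key, value) tuples."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [
        (section, key, value)
        for section in parser.sections()
        for key, value in parser.items(section)
    ]


values = read_config_values(CONFIG_TEXT)
log_dir = {key: value for _, key, value in values}["log_dir"]
print(log_dir)  # /opt/programZ/logs
```

The recovered `log_dir` value is what would let log files found under that directory be linked back to the program whose configuration names it.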
      • a log entry template
        • The templates of the underlying structure that log entry messages have can be collected.
        • Accordingly, log files from large distributed systems can reflect the system structure:
        • There can be multiple computing units of different types or fulfilling different roles, e.g. servers and clients or embedded systems, but running the same or similar software programs. Thus, the log file dumps from each such computing unit contain the same or similar types of log files. Further, computing units generating different types of log entries likely have different functions.
        • Moreover, the log entry messages can comprise information about network organization, e.g. by mentioning names or IP addresses of different computers.
        • An exemplary log file dump or snapshot can be expressed as follows:
        • PlantX/ComputerY/file_path_for_programZ/logs (or settings/config files)
        • The log entry template can be determined by clustering or grouping the message texts and identifying the invariant parts. The variable parts are the template parameters, and messages with the same fixed parts are generated from the same template.
        • Having identified the log entry templates, the multi-language versions of the same template can be identified as well, since they are generated by the same computing unit, program or process and thus have the same number of parameters. This semantic verification can be performed manually or automatically with automated translators.
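Template identification by grouping messages and finding invariant parts can be sketched as follows. This is a deliberately simple stand-in for the clustering mentioned in the text: messages are grouped by token count and first token, and every token position whose value varies within a group is treated as a parameter. The grouping key and the `<*>` placeholder are assumptions, not details from the source.

```python
from collections import defaultdict


def extract_templates(messages):
    """Group messages and mark token positions that vary as parameters."""
    groups = defaultdict(list)
    for message in messages:
        tokens = message.split()
        if tokens:
            # Naive grouping key (assumption): same length and same first token.
            groups[(len(tokens), tokens[0])].append(tokens)

    templates = []
    for token_lists in groups.values():
        template = [
            column[0] if len(set(column)) == 1 else "<*>"
            for column in zip(*token_lists)
        ]
        templates.append(" ".join(template))
    return templates


messages = [
    "Unable to open file /tmp/a.txt",
    "Unable to open file /var/data/b.csv",
    "Connection to 10.0.0.5 lost",
    "Connection to 10.0.0.7 lost",
]
print(extract_templates(messages))
```

A group that contains only one message yields that message unchanged, so parameters only become visible once several instances of the same template have been observed.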
      • a connection between a log entry and the computing unit the log entry references
  • The interconnections between log entries and computing units or devices they reference can be collected as well. Accordingly, the log entry messages can be used to identify cross-reference computer names and IP addresses.
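Cross-references to other machines can be collected with simple pattern matching. The sketch below assumes IPv4 addresses in dotted notation and a known list of computer names; both the regular expression and the helper `find_references` are illustrative assumptions.

```python
import re

# Simple dotted-quad pattern; it does not validate that each octet is <= 255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_references(message, known_computers):
    """Return IP addresses and known computer names mentioned in a message."""
    ips = IPV4.findall(message)
    names = [name for name in known_computers if name in message]
    return {"ip_addresses": ips, "computers": names}


refs = find_references(
    "Connection from ComputerY (192.168.0.12) refused",
    known_computers=["ComputerX", "ComputerY"],
)
print(refs)
```

Each hit can then be stored as a link between the log entry and the referenced computing unit.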
  • Knowledge Graph
  • The output can be loaded into a knowledge graph, as explained further above. An exemplary knowledge graph is shown in FIG. 2, comprising the following entities and relations:
      • A Plant consists of multiple devices
      • Some devices have computers in order to perform computations
      • A process is an instance of a program running on a computer
      • A program can have multiple General Log Templates (GLT)
      • Each GLT has a message template with several parameters
      • A log template is a language-specific version of a GLT
      • A log entry 10 is an instantiation of a log template
      • A log entry 10 has a timestamp 12 (TS)
      • A log entry 10 has a message text 14—template with parameters filled
      • A log entry is contained in a log file (LF) and produced by a Process and is therefore linked to a computing unit
      • A computing unit or computer is referenced by a log entry in a message
      • A configuration file (CF) on a computing unit can have multiple configuration values (CV) affecting the whole computing unit or specific processes
      • A configuration value can be directly referenced by a log entry message or can have indirect relevance
      • A plant can have multiple Snapshots generated at different points in time
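The entities and relations listed above can be represented, for instance, as subject-predicate-object triples. The sketch below uses plain Python tuples rather than any particular graph database; the relation names (`produced_by`, `runs_on` and so on) and all identifiers are hypothetical.

```python
def entry_to_triples(entry_id, record):
    """Translate one processed log entry into knowledge graph triples."""
    return [
        (entry_id, "has_timestamp", record["timestamp"]),
        (entry_id, "has_message", record["message"]),
        (entry_id, "contained_in", record["log_file"]),
        (entry_id, "produced_by", record["process"]),
        (record["process"], "instance_of", record["program"]),
        (record["process"], "runs_on", record["computer"]),
    ]


def subjects_linked_to(triples, predicate, obj):
    """Find all subjects connected to an object via the given relation."""
    return [s for s, p, o in triples if p == predicate and o == obj]


record = {
    "timestamp": "2019-08-22 13:45:10",
    "message": "Unable to open file /tmp/x",
    "log_file": "app.log",
    "process": "process_17",
    "program": "programZ",
    "computer": "ComputerY",
}
triples = entry_to_triples("entry_1", record)
print(subjects_linked_to(triples, "runs_on", "ComputerY"))  # ['process_17']
```

Following `produced_by` and then `runs_on` from a log entry reaches the computing unit, which mirrors the link between entry, process and computer described in the list.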
  • Exemplary Applications
  • At present, most of the operation and control of industrial equipment is managed by standard or special control software. Humans may be frequently engaged in a monitoring capacity, but only get involved in problem situations. However, when such situations arise, it may be nontrivial to identify causes and potential solutions. The main way to gain insight into the operation of such computer-controlled systems is by examining information from relevant log files. This task is performed manually by experienced service technicians, making it time-consuming and not always as accurate as needed.
  • The knowledge graph provides the users e.g. experts and service technicians an organized view of the log file data.
  • An exemplary use case is shown in FIG. 5. The log files can be collected from different customer plants with SIMATIC systems. The knowledge extraction process is described by the bottom part of the figure.
  • In a first step the log files are clustered. Log messages and time stamps are extracted by generic parsers. The messages can be used to extract templates. Further, the content of messages can be extracted. All information is inserted into a knowledge graph for further analysis according to the right part of the figure, like anomaly detection, failure prediction and root cause understanding by a combination of statistical and knowledge graph analytics.
  • Considering industrial applications and environments, the data can refer to
      • Power plants. The power plants can have multiple turbines and other pieces of equipment.
      • Modern factories. The factories can have multiple interacting automated tools.
      • Trains. The trains can have multiple semi-autonomous systems, for example for door control, climate control and for movement.
      • Medical equipment. The equipment can have separate controllers for operating different movable parts, e.g. the patient bed or the scanning tools, and for the devices, e.g. MRI scanners, used for imaging and data collection.
  • Reference Signs
  • S1 to S3 Method steps 1 to 3
  • 10 log entry
  • 12 time stamp (TS) of log entry
  • 14 message of log entry
  • Although the present invention has been disclosed in the form of preferred embodiments and variations thereon, it will be understood that numerous additional modifications and variations could be made thereto without departing from the scope of the invention.
  • For the sake of clarity, it is to be understood that the use of ‘a’ or ‘an’ throughout this application does not exclude a plurality, and ‘comprising’ does not exclude other steps or elements.

Claims (7)

1. A computer-implemented method for generating a coherent representation for at least two log files, comprising the steps:
a. Receiving the at least two log files; wherein
b. each log file of the at least two log files comprises at least one log entry with at least one time stamp and at least one message; wherein
c. the at least two log files differ from one another with respect to at least one distinctive criteria;
d. Extracting at least one additional information of each log file of the at least two log files; and
e. Combining each log file of the at least two log files with the extracted additional information into at least two processed log files; wherein
f. the at least two processed log files comply with a coherent representation.
2. The method according to claim 1, wherein the at least one distinctive criteria is selected from the group comprising:
type, format and structure.
3. The method according to claim 1, wherein the additional information is an information selected from the group comprising:
a computing unit which generated the log file, a program which generated the log file, configuration information of the computing unit which generated the log file, a log entry template and a connection between a log entry and the computing unit the log entry references.
4. The method according to claim 1, wherein the coherent representation is an input for log mining or any other further analysis.
5. The method according to claim 1, wherein the method comprises the further step of loading the coherent representation into a knowledge graph.
6. A computer program product directly loadable into an internal memory of a computer, comprising software code portions for performing the steps according to claim 1 when said computer program product is running on a computer.
7. The generating unit for performing the steps according to claim 1.
US16/547,782 2019-08-22 2019-08-22 Method for generating a coherent representation for at least two log files Abandoned US20210056071A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US16/547,782 US20210056071A1 (en) 2019-08-22 2019-08-22 Method for generating a coherent representation for at least two log files
PCT/EP2020/073289 WO2021032820A1 (en) 2019-08-22 2020-08-20 Method for generating a coherent representation for at least two log files
CN202080059319.5A CN114245895A (en) 2019-08-22 2020-08-20 Method for generating consistent representation for at least two log files
EP20771206.8A EP3991054A1 (en) 2019-08-22 2020-08-20 Method for generating a coherent representation for at least two log files
US17/635,203 US20220292053A1 (en) 2019-08-22 2020-08-20 Method for generating a coherent representation for at least two log files

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US16/547,782 US20210056071A1 (en) 2019-08-22 2019-08-22 Method for generating a coherent representation for at least two log files

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/635,203 Continuation US20220292053A1 (en) 2019-08-22 2020-08-20 Method for generating a coherent representation for at least two log files

Publications (1)

Publication Number Publication Date
US20210056071A1 (en) 2021-02-25

Family

ID=72470318

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/547,782 Abandoned US20210056071A1 (en) 2019-08-22 2019-08-22 Method for generating a coherent representation for at least two log files
US17/635,203 Pending US20220292053A1 (en) 2019-08-22 2020-08-20 Method for generating a coherent representation for at least two log files

Family Applications After (1)

Application Number Title Priority Date Filing Date
US17/635,203 Pending US20220292053A1 (en) 2019-08-22 2020-08-20 Method for generating a coherent representation for at least two log files

Country Status (4)

Country Link
US (2) US20210056071A1 (en)
EP (1) EP3991054A1 (en)
CN (1) CN114245895A (en)
WO (1) WO2021032820A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4099225A1 (en) 2021-05-31 2022-12-07 Siemens Aktiengesellschaft Method for training a classifier and system for classifying blocks

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5890154A (en) * 1997-06-06 1999-03-30 International Business Machines Corp. Merging database log files through log transformations
US6792458B1 (en) * 1999-10-04 2004-09-14 Urchin Software Corporation System and method for monitoring and analyzing internet traffic
US8214899B2 (en) * 2006-03-15 2012-07-03 Daniel Chien Identifying unauthorized access to a network resource
US20120054675A1 (en) * 2010-08-26 2012-03-01 Unisys Corporation Graphical user interface system for a log analyzer
ES2755780T3 (en) * 2011-09-16 2020-04-23 Veracode Inc Automated behavior and static analysis using an instrumented sandbox and machine learning classification for mobile security
US9697100B2 (en) * 2014-03-10 2017-07-04 Accenture Global Services Limited Event correlation
EP3291120B1 (en) * 2016-09-06 2021-04-21 Accenture Global Solutions Limited Graph database analysis for network anomaly detection systems
US10528454B1 (en) * 2018-10-23 2020-01-07 Fmr Llc Intelligent automation of computer software testing log aggregation, analysis, and error remediation

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4099225A1 (en) 2021-05-31 2022-12-07 Siemens Aktiengesellschaft Method for training a classifier and system for classifying blocks
WO2022253636A1 (en) 2021-05-31 2022-12-08 Siemens Aktiengesellschaft Method for training a classifier and system for classifying blocks

Also Published As

Publication number Publication date
CN114245895A (en) 2022-03-25
US20220292053A1 (en) 2022-09-15
EP3991054A1 (en) 2022-05-04
WO2021032820A1 (en) 2021-02-25

Similar Documents

Publication Publication Date Title
JP6643211B2 (en) Anomaly detection system and anomaly detection method
CN113614666A (en) System and method for detecting and predicting faults in an industrial process automation system
CN116209963A (en) Fault diagnosis and solution recommendation method, device, system and storage medium
CN103761173A (en) Log based computer system fault diagnosis method and device
US9043651B2 (en) Systematic failure remediation
CN112287009A (en) Interface calling and interface data warehousing method, device, equipment and storage medium
DE102017220140A1 (en) Polling device, polling method and polling program
US9489379B1 (en) Predicting data unavailability and data loss events in large database systems
US20220292053A1 (en) Method for generating a coherent representation for at least two log files
CN117501275A (en) Method, computer program product and computer system for analyzing data consisting of a large number of individual messages
US20230098474A1 (en) Processing continuous integration failures
US11822578B2 (en) Matching machine generated data entries to pattern clusters
US20220035359A1 (en) System and method for determining manufacturing plant topology and fault propagation information
CN112988444B (en) Processing method, processing device and processing equipment for server cluster fault diagnosis, method for server fault diagnosis and computer-readable storage medium
CN114064387A (en) Log monitoring method, system, device and computer readable storage medium
CN112416896A (en) Data abnormity warning method and device, storage medium and electronic device
US20210110284A1 (en) Method and system for automatic error diagnosis in a test environment
US20230004591A1 (en) Method for generating triples from log entries
JP2022002029A (en) Data analysis system, data analysis method, and data analysis program
CN112699005A (en) Server hardware fault monitoring method, electronic equipment and storage medium
CN113392000B (en) Test case execution result analysis method, device, equipment and storage medium
US20240004747A1 (en) Processor System and Failure Diagnosis Method
US20220261332A1 (en) Computer-implemented method for determining at least one quality attribute for at least one defect of interest
CN110399293B (en) System test method, device, computer equipment and storage medium
JP2020057078A (en) Inspection control device, inspection control method, and inspection control program

Legal Events

Date Code Title Description
AS Assignment

Owner name: SIEMENS AKTIENGESELLSCHAFT, GERMANY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SCHOLZ, ANDRE;LOSKYLL, MATTHIAS;BRIKIS, GEORGIA OLYMPIA;AND OTHERS;SIGNING DATES FROM 20190924 TO 20191014;REEL/FRAME:051261/0641

Owner name: SIEMENS CORPORATION, FLORIDA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:FRADKIN, DMITRIY;REEL/FRAME:051261/0749

Effective date: 20190920

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION