CN111130877A - NLP-based weblog processing system and method - Google Patents

NLP-based weblog processing system and method

Info

Publication number
CN111130877A
Authority
CN
China
Prior art keywords
equipment
information
bank
language processing
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911334997.3A
Other languages
Chinese (zh)
Other versions
CN111130877B (en)
Inventor
冒佳明
赵俊峰
曹晶
夏飞
夏元轶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority to CN201911334997.3A
Publication of CN111130877A
Application granted
Publication of CN111130877B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications

Abstract

The invention discloses an NLP-based network log processing system comprising a natural language processing component and a database. A classified word bank, a preset word sense library and a language processing model are established in the database. The classified word bank holds a plurality of classified words that correspond specifically to equipment types, or keywords selected by taking the high-frequency words derived from word segmentation as the standard. The classified word bank is associated with the preset word sense library by mapping, and the preset word sense library is associated with the language processing model. The natural language processing component inductively classifies and analyzes the syslog source data and log files of the equipment and determines the meanings contained in the natural language sentences. The invention overcomes the inability of traditional template-based methods to analyze undefined logs, and improves the usability of the system and its ease of use for users.

Description

NLP-based weblog processing system and method
Technical Field
The invention relates to the technical field of network security, in particular to a network log processing system and method based on NLP.
Background
In a modern society where networks are ever more widespread, network monitoring and management are important guarantees for the reasonable use of network resources and information. To learn and control the running state of the whole network conveniently, rapidly and comprehensively, and to respond in time to problems and threats in the network, network managers now commonly collect and analyze network logs centrally through a log management component, which provides the manager with the real-time running state of the equipment in the network and enables effective control of risks.
In recent years, with the development of artificial intelligence, Natural Language Processing (NLP) has stood out among the many fields of artificial intelligence and become an important direction. Compared with traditional template-based language generation techniques, NLP minimizes human involvement and can automatically learn input-to-output mappings from data. Syslog, for its part, aims to convey the device developer's description and explanation of the device's state. It is essentially a discrete, symbolic, absolute signal system with all the features of natural language, and in most cases it can be understood independently of context and is less ambiguous than human language. Therefore, analyzing Syslog after machine learning with NLP methods can be expected to give highly accurate results.
Currently, it is common practice in the industry to use the syslog protocol, a standard for passing log messages over Internet Protocol (TCP/IP) networks. The protocol is supported by many equipment manufacturers and system platforms, and syslog is used for network information management and network security auditing. The syslog message format is structured to a certain degree, so network management systems or log servers can parse the content of received syslog messages and make simple judgments about event level and event characteristics. The basic idea of the protocol is simple and efficient: the sending end and the receiving end do not need joint interface debugging for log forwarding to work.
Unlike SNMP, the message body of syslog has no strict format control: developers cannot obtain the length of the whole message body, or the data type and length of its parameters, from the message structure, so log definitions differ greatly between manufacturers with different standards. In practice, the context in which a vendor defined its syslog may also be inconsistent with the user's service environment. In summary, current industry syslog components mainly have the following defects:
1. Since syslog is written based on the knowledge of the device vendor's developers, the same meaning is expressed somewhat differently on devices of different vendors or models.
2. Syslog messages are hard to read: because of the many technical terms, a manager needs extensive background knowledge to understand a message.
3. Log events are not unified, so alarm levels and events cannot be classified effectively, which hinders association analysis.
4. The traditional translation technology based on the template has poor flexibility and applicability.
Therefore, network managers urgently need a weblog processing component that can express the intention of the syslog writer without requiring much expertise.
Disclosure of Invention
The invention aims to provide an NLP-based weblog processing system and method that can effectively process logs of unknown types and formats, overcome the inability of conventional template-based methods to analyze undefined logs, and improve the usability of the system and its ease of use for users.
In order to achieve the purpose, the invention provides the following technical scheme:
An NLP-based weblog processing system comprises a natural language processing component and a database. A classified word bank, a preset word sense library and a language processing model are established in the database, wherein the classified word bank holds a plurality of classified words that correspond specifically to equipment types, or keywords selected by taking the high-frequency words derived from word segmentation as the standard; the classified word bank is associated with the preset word sense library by mapping, and the preset word sense library is associated with the language processing model;
the natural language processing component is used for inductively classifying and analyzing the syslog source data and log files of the equipment and determining the meanings contained in the natural language sentences. A certain number of sentences are acquired to train the language processing model in the database: the natural language processing component trains on the effective fields obtained from the sentences according to the preset word sense library, generates a plurality of training words as keywords, generates analysis information for the keywords, and generates the language processing model from the keywords and their analysis information. The language processing model adopts a neural network architecture.
Preferably, the natural language processing component further comprises an acquisition module, a segmentation module and an analysis module; the acquisition module is used for receiving basic information or training sentences of the equipment source, caching the basic information or the training sentences into a database, classifying according to predefined rules, and caching the basic information or the training sentences into a classified word bank;
the segmentation module is used for matching and segmenting the training sentences based on a preset word sense library, and segmenting the training sentences into at least one training word as a keyword;
the analysis module is used for analyzing the obtained keywords and generating corresponding analysis information, which comprises part-of-speech labels and word sense annotations; a part-of-speech label is the part of speech of a keyword in the training sentence, and the part-of-speech labels are defined by an Oxford English-Chinese dictionary and/or the English-Chinese bilingual Microsoft Computer Dictionary.
Preferably, the acquisition module further comprises a device determination module, a content acquisition unit and an association analysis unit,
the equipment determining module acquires equipment information in a network environment by adopting an equipment discovery technology and stores basic information of the equipment into a classified word bank of a database;
the content acquisition unit acquires a network log file from network equipment monitored by the syslog server as a data source for language processing model training and acquires statements to be analyzed.
The association analysis unit is used for constructing the attribute association relationship between the acquired equipment information and the syslog file.
Preferably, the system further includes a training module, and the training module is configured to update the language processing model with the parsing information obtained from the parsing module, so as to update the language processing model of the corresponding device.
Preferably, the devices include, but are not limited to, switches, servers, gateways, routers, and network security devices; the device network discovery modes include but are not limited to SNMP, ARP and ICMP protocols.
Preferably, the basic information of the device includes, but is not limited to, a device name, a device type, a device IP, and a device manufacturer.
The invention further provides an NLP-based network log processing method, characterized by the following specific steps: the acquisition module acquires equipment information in the network environment, records the basic information of the equipment as basic statements for statement acquisition, and acquires the network logs of the equipment from the syslog server; the segmentation module segments each acquired basic statement into at least one keyword, the keywords corresponding one-to-one to attributes in the classified word bank; the analysis module derives the corresponding analysis information for each keyword from the entries in the preset word sense library; and the display information is exported to a web page.
In the method, a plurality of classified words are set in the classified word bank, and keywords corresponding to the characteristics of different types of equipment are arranged in the classified word bank.
In the method, the preset word sense library is provided with the corresponding analysis information of the keywords after deep neural network training.
In the method, when Syslog is reported, the natural language processing component extracts the log content of the network equipment to be analyzed and decomposes it into a plurality of words. It analyzes the decomposed words with the trained language processing model, extracts features and classifies them, and thereby determines the meaning contained in the Syslog content. Based on the determined meaning, the natural language processing component translates the original sentence accurately and quickly into information described in Chinese that the user can easily understand, and determines the category of the log from its content, providing a basis for log classification.
The invention has the technical effects and advantages that:
(1) Because the training sentences are strongly related to the individual characteristics of the equipment manufacturer's developers, the segmented training words are closer to the developers' language habits and can reflect their individual characteristics. Using the segmented words as keywords lets the natural language processing model learn them, so that the model comes closer to the language habits and vendor style of the equipment.
(2) The user uses the system to obtain the manufacturer and device type of each device. Normally, the syslog content of devices from the same manufacturer and of the same type is very similar, so classifying along this dimension gives a classification closer to the habits of the device developers and thereby captures their language habits.
(3) The system can simplify the learning process of the language processing model, shorten the learning period, and improve the accuracy of syslog processing and analysis for equipment of the same manufacturer, of the same type, of different types, or of unknown types in an unknown network environment.
(4) The network equipment whose logs need to be collected is determined through discovery of the network equipment, which is closer to the actual situation of the network environment; finally the whole log is translated and classified by the natural language processing component, so that the logs of various types of network equipment are easier for users to understand after natural language processing.
(5) The method of the invention can effectively process logs of unknown types and formats, overcomes the inability of the original template-based method to analyze undefined logs, and improves the usability of the system and its ease of use for users.
Drawings
Fig. 1 is a schematic view of the overall structure of the present invention.
Fig. 2 is a flow chart of the operation of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
As shown in fig. 1-2, the present embodiment provides an NLP-based weblog processing system, which includes a natural language processing component and a database. A classified word bank, a preset word sense library and a language processing model are established in the database, wherein the classified word bank holds a plurality of classified words that correspond specifically to equipment types, or keywords selected by taking the high-frequency words derived from word segmentation as the standard.
Specifically:
In the first case, a plurality of classified words are set, and the classified words cover the keywords corresponding to the different characteristics of different types of equipment.
For example, "router" extends to BGP, OSPF and CPU; "server" extends to CPU and memory.
In the second case, matching word segmentation is performed on the syslog based on the preset word sense library; if a word matches a preset word in the library, that word is extracted as the type under the classification.
For another example: if the syslog of the target device is "DHCPD-4-PING_CONFLICT: DHCP address conflict: server pinged 172.18.44.170." and the preset keyword library includes "DHCP", then the keyword "DHCP" can be extracted from the target record information by matching.
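The matching step in the second case can be sketched as follows. This is a minimal illustration, not the patented implementation: the library contents other than "DHCP" and the function name `extract_keywords` are assumptions, and plain case-insensitive substring matching stands in for whatever word segmentation the system actually uses.

```python
# Hypothetical preset keyword library; the text only names "DHCP" as an entry.
PRESET_KEYWORDS = {"DHCP", "OSPF", "BGP", "CPU"}


def extract_keywords(message, preset=PRESET_KEYWORDS):
    """Return the preset keywords found in the message, ordered by first appearance."""
    upper = message.upper()
    hits = [(upper.find(word), word) for word in preset if word in upper]
    return [word for _, word in sorted(hits)]


msg = "DHCPD-4-PING_CONFLICT: DHCP address conflict: server pinged 172.18.44.170."
print(extract_keywords(msg))  # ['DHCP']
```

Substring matching is deliberately loose here, so "DHCP" is also found inside "DHCPD"; a stricter tokenizer would be needed if that is not desired.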
In the third case, keywords are selected by taking the high-frequency words derived from word segmentation as the standard.
First, word segmentation is performed on the target record information and entity words with substantive meaning are extracted; the keywords are then determined according to how often the extracted entity words occur.
If a keyword cannot be matched to any classification, its part of speech and meaning are translated, the recurring words (pre-classified words) are ranked, and a preset number of the top-ranked ones are selected as target classified words.
For example: the pre-classified words extracted from the syslog of a router of a certain model include OSPF, PING_CONFLICT and LINK;
the pre-classified words extracted from a router of a certain model are BGP, PING_CONFLICT and UPDOWN;
the pre-classified words extracted from a server of a certain model are PING_CONFLICT, UPDOWN and IPACESS. Here PING_CONFLICT occurs across the different types and models of equipment with the highest frequency.
Thus, the pre-classified words obtained in training from all models of equipment are ranked. The words shared by the largest number of devices are extracted as classified words and added to the classified word bank for classification.
For PING_CONFLICT above, the classified word obtained after processing by the language processing model is "PING CONFLICT".
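The ranking of pre-classified words across devices in the example above can be sketched as a simple count. The device labels and the function name `select_classified_words` are hypothetical; the word lists are taken from the example, and counting each word once per device is an assumed reading of "the words shared by the most devices".

```python
from collections import Counter

# Pre-classified words per device, from the example; device labels are made up.
pre_classified = {
    "router model A": ["OSPF", "PING_CONFLICT", "LINK"],
    "router model B": ["BGP", "PING_CONFLICT", "UPDOWN"],
    "server model C": ["PING_CONFLICT", "UPDOWN", "IPACESS"],
}


def select_classified_words(words_by_device, top_n=1):
    """Count how many devices each word appears on and keep the top_n words."""
    counts = Counter()
    for words in words_by_device.values():
        counts.update(set(words))  # count each word once per device
    return [word for word, _ in counts.most_common(top_n)]


print(select_classified_words(pre_classified))  # ['PING_CONFLICT']
```

Here `top_n` plays the role of the "preset number" of target classified words.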
The classified word bank is associated with a preset word sense bank in a mapping mode, and the preset word sense bank is associated with a language processing model;
In the system, the natural language processing component is used for inductively classifying and analyzing the syslog source data and log files of the equipment and determining the meanings contained in the natural language sentences. A certain number of sentences are acquired to train the language processing model in the database: the natural language processing component obtains effective fields from the sentences according to the preset word sense library for training, generates a plurality of training words as keywords, generates analysis information for the keywords, and generates the language processing model from the keywords and their analysis information. The language processing model adopts a neural network architecture.
The natural language processing component also comprises an acquisition module, a segmentation module and an analysis module;
the acquisition module is used for receiving basic information or training sentences of the equipment source, caching the basic information or the training sentences into a database, classifying according to predefined rules, and caching the basic information or the training sentences into a classified word bank;
the segmentation module is used for matching and segmenting the training sentences based on a preset word sense library, and segmenting the training sentences into at least one training word as a keyword;
the analysis module is used for analyzing the obtained keywords and generating corresponding analysis information, which comprises part-of-speech labels and word sense annotations; a part-of-speech label is the part of speech of a keyword in the training sentence, and the part-of-speech labels are defined by an Oxford English-Chinese dictionary and/or the English-Chinese bilingual Microsoft Computer Dictionary.
In the system, the acquisition module further comprises an equipment determination module, a content acquisition unit and an association analysis unit,
the equipment determining module acquires equipment information in a network environment by adopting an equipment discovery technology and stores basic information of the equipment into a classified word bank of a database;
the content acquisition unit acquires a network log file from network equipment monitored by the syslog server as a data source for language processing model training and acquires statements to be analyzed.
Specifically, a log server is set up and raw Syslog data messages are acquired using techniques such as network listening. Each message is parsed to obtain the IP address, log level and log content, and the log content is stored in the database.
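Parsing a raw message into IP address, log level and content could look like the sketch below. It assumes BSD-style syslog framing, where the message starts with a `<PRI>` field and PRI = facility × 8 + severity; the patent does not state which framing the devices use. The names `ParsedLog` and `parse_syslog` are hypothetical, and the source IP is assumed to come from the UDP sender address.

```python
import re
from typing import NamedTuple, Optional


class ParsedLog(NamedTuple):
    source_ip: str
    facility: int
    severity: int  # the "log level": 0 (emergency) to 7 (debug)
    content: str


_PRI = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_syslog(raw: bytes, source_ip: str) -> Optional[ParsedLog]:
    """Split a raw syslog datagram into facility, severity and content."""
    text = raw.decode("utf-8", errors="replace").strip()
    match = _PRI.match(text)
    if match is None:
        return None  # no priority field: treat as invalid
    pri = int(match.group(1))
    return ParsedLog(source_ip, pri // 8, pri % 8, match.group(2).strip())


log = parse_syslog(b"<188>DHCPD-4-PING_CONFLICT: DHCP address conflict", "192.168.9.1")
print(log.severity, log.content)
```

A PRI of 188 decodes to facility 23 and severity 4 (warning).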
The association analysis unit is used for constructing the attribute association relationship between the acquired equipment information and the syslog file.
Specifically, after data acquisition is completed, the data are cleaned and filtered by associating them with attributes such as model, equipment and IP. Cleaning removes erroneous and invalid data; filtering selects logs according to the equipment range, providing the data source for the subsequent NLP training.
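A minimal sketch of the cleaning and filtering step, under stated assumptions: records are `(ip, content)` pairs, "invalid" means an unparsable address or empty content, and the equipment range is a set of management IPs. The function name `clean_and_filter` and these criteria are illustrative, not taken from the patent.

```python
import ipaddress


def clean_and_filter(records, device_ips):
    """Drop invalid records, then keep only logs from devices in scope.

    records: iterable of (ip, content) string pairs.
    device_ips: set of management IP strings that define the equipment range.
    """
    kept = []
    for ip, content in records:
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            continue  # cleaning: invalid source address
        if not content or not content.strip():
            continue  # cleaning: empty log content
        if ip in device_ips:  # filtering: equipment range
            kept.append((ip, content.strip()))
    return kept


records = [
    ("192.168.9.1", "LINK-3-UPDOWN: Interface up "),
    ("not-an-ip", "garbage"),
    ("192.168.9.1", "   "),
    ("10.0.0.5", "out of scope"),
]
print(clean_and_filter(records, {"192.168.9.1"}))
```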
The system also comprises a training module, wherein the training module is used for updating the language processing model by the analysis information acquired from the analysis module so as to update the language processing model of the corresponding equipment.
The devices include, but are not limited to, switches, servers, gateways, routers, and network security devices; the device network discovery modes include but are not limited to SNMP, ARP and ICMP protocols.
The basic information of the device includes, but is not limited to, device name, device type, device IP, and device vendor.
In the process of discovering the network equipment, manufacturers, equipment types and equipment models of all the equipment are obtained, and the manufacturers, the equipment types and the equipment models need to be mapped. Specifically, since the management address of the device in the network is unique, a mapping relationship between the manufacturer, the device type, the model and the IP can be constructed.
Such as: Huawei-switch-S12700-192.168.9.1.
S9700-192.168.9.1.
Huawei-router-NE40-192.168.9.1.
NE80-192.168.9.1.
These records are stored in the database as data sources.
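The vendor, type, model and IP mapping can be sketched as a lookup keyed by management address. The record format "vendor-type-model-ip" follows the examples above; the names `DeviceRecord`, `parse_device_record` and `build_ip_index`, the vendor label and the second IP address are hypothetical.

```python
from typing import NamedTuple


class DeviceRecord(NamedTuple):
    vendor: str
    device_type: str
    model: str
    ip: str


def parse_device_record(record: str) -> DeviceRecord:
    """Parse 'vendor-type-model-ip' (an optional trailing period is ignored)."""
    vendor, device_type, model, ip = record.rstrip(".").split("-", 3)
    return DeviceRecord(vendor, device_type, model, ip)


def build_ip_index(records):
    """Map each management IP to its device record."""
    return {rec.ip: rec for rec in map(parse_device_record, records)}


index = build_ip_index([
    "VendorA-switch-S12700-192.168.9.1.",
    "VendorA-router-NE40-192.168.9.2.",
])
print(index["192.168.9.2"].model)  # NE40
```

Keying on the IP relies on the statement that each device's management address is unique in the network.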
The invention further provides an NLP-based network log processing method, which comprises the following steps: the acquisition module acquires equipment information in the network environment, records the basic information of the equipment as basic statements for statement acquisition, and acquires the equipment-based network logs from the syslog server; the segmentation module segments each acquired basic statement into at least one keyword, the keywords corresponding one-to-one to attributes in the classified word bank; the analysis module derives the corresponding analysis information for each keyword from the entries in the preset word sense library; and the display information is exported to a web page.
The classified word bank is provided with a plurality of classified words, and keywords corresponding to the characteristics of different types of equipment are arranged in the classified word bank.
And the preset word meaning library is provided with the corresponding analysis information of the keywords after deep neural network training.
When Syslog is reported, the natural language processing component extracts from the log the content of the network equipment that needs to be analyzed and decomposes it into a plurality of words. It analyzes the decomposed words with the trained language processing model, extracts features and classifies them, and thereby determines the meaning contained in the Syslog content. Based on the determined meaning, the natural language processing component translates the original statement accurately and quickly into information described in Chinese that the user can easily understand, and determines the category of the log from its content, providing a basis for log classification.
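The report-time flow (decompose, look up meaning, classify, explain) can be sketched as below. This is only an outline of the data flow: a hand-written dictionary `MEANINGS` stands in for the trained neural language processing model, the categories and explanations are invented for illustration, and the explanations are in English here whereas the system described produces Chinese.

```python
import re
from collections import Counter

# Hypothetical stand-in for the trained language processing model:
# keyword -> (category, plain-language explanation).
MEANINGS = {
    "DHCP": ("address", "DHCP service"),
    "PING_CONFLICT": ("address", "a ping check found the address already in use"),
    "UPDOWN": ("link", "an interface changed state"),
}


def decompose(content):
    """Split log content into word-like tokens (letters and underscores)."""
    return re.findall(r"[A-Za-z][A-Za-z_]*", content)


def interpret(content):
    """Return the majority category and a combined explanation for one log."""
    hits = [MEANINGS[w.upper()] for w in decompose(content) if w.upper() in MEANINGS]
    if not hits:
        return {"category": "unknown", "explanation": content}
    category = Counter(cat for cat, _ in hits).most_common(1)[0][0]
    explanation = "; ".join(dict.fromkeys(text for _, text in hits))
    return {"category": category, "explanation": explanation}


result = interpret("DHCPD-4-PING_CONFLICT: DHCP address conflict: server pinged 172.18.44.170.")
print(result["category"], "|", result["explanation"])
```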
Example 2
To illustrate more clearly how the NLP-based weblog processing method extracts the relevant syslog by equipment model and manufacturer, an example is described below:
the network device can log via the syslog protocol and transmits its log information to the remote server module over the User Datagram Protocol (UDP); the remote log-receiving server module must listen on UDP port 514 via syslog and process the received logs locally according to its syslog configuration.
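A bare-bones UDP receiver for the listening step might look like this. The function names `open_syslog_socket` and `receive_one` are hypothetical, and a production log server would normally be an existing syslog daemon rather than hand-written code. Binding to port 514 usually requires elevated privileges, so the demonstration binds to an ephemeral port on the loopback interface.

```python
import socket


def open_syslog_socket(host="0.0.0.0", port=514):
    """Create a UDP socket bound to the syslog port (514 by default)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one(sock, bufsize=8192):
    """Block until one datagram arrives; return (sender_ip, raw_bytes)."""
    data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
    return sender_ip, data


if __name__ == "__main__":
    # Loopback demonstration on an ephemeral port instead of 514.
    server = open_syslog_socket("127.0.0.1", 0)
    server.settimeout(2)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(b"<188>test message", server.getsockname())
    print(receive_one(server))
    client.close()
    server.close()
```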
The method comprises the following concrete steps:
1) collecting original log information.
Data source: a device or system that provides log data in syslog format; the device may be a firewall, switch, router, server, or other host running a Linux-like operating system.
syslog log server module: collecting syslog format log data from a device or system and further saving its raw data.
Log file: the processed syslog format log data from the syslog log server is saved, and each line represents one piece of log information.
2) Analyzing and collecting useful log data.
Monitoring the log file: monitors the log data collection work by detecting whether data have been written to the log file; a write means that data have been collected and triggers filtering of the log data.
Filtering: log data that meet the set conditions are accepted and passed to the related event call; log data that do not meet the conditions are discarded, and control returns to monitoring the log file. The event filtering condition is mainly based on the management addresses extracted above. This data cleaning prepares for the subsequent natural language training and reduces interference from invalid data.
Database: stores the filtered original log data and the useful information extracted through analysis.
Analyzing and collecting data: triggered by the monitoring program, before the new log data is written into the database in the execution time sequence, the original data is extracted from the database, and the work of extracting useful information is completed, wherein the extracted information is as follows: the time when the event occurred, the protocol used when the event occurred, the origin of the event, the destination of the event, device information, and the like. And after the processing is finished, writing the result into a database for further analysis.
Result display: the preliminary analysis and collection results are displayed on the Web or in another user-friendly way.
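The monitor-then-filter loop in step 2) can be sketched with a byte offset into the log file. Two assumptions are made that the text does not specify: each log line begins with the device's management IP followed by a space, and polling by offset stands in for whatever file-change notification the monitoring program uses. The names `read_new_lines` and `filter_lines` are hypothetical.

```python
def read_new_lines(path, offset):
    """Return (complete new lines, new offset) for data written after offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    end = chunk.rfind(b"\n") + 1  # consume only complete lines
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def filter_lines(lines, device_ips):
    """Keep (ip, content) for lines whose leading IP is a monitored device."""
    kept = []
    for line in lines:
        ip, _, content = line.partition(" ")
        if ip in device_ips and content:
            kept.append((ip, content))
    return kept
```

A caller would invoke `read_new_lines` periodically, carry the returned offset forward, and pass the filtered pairs on to the database and analysis steps.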
In practice, the above provides a log data acquisition approach: original log information is collected with the log server as the center, and log data analysis is carried out with the log file and the database as the respective centers to collect useful log information.
This acquisition approach ensures that each part of the acquisition work can be completed independently: the syslog server collects the original log data without being affected by the analysis program or by database reads and writes, and a slowdown in the subsequent analysis and collection part does not affect the collection work of the syslog server.
On one hand, the log data of each network device is ensured not to be lost, and the safety information is completely ensured;
on the other hand, the time spent by the network management system from data acquisition to action reaction is short, and real-time log information can be obtained.
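The decoupling of collection from analysis described above can be illustrated with a producer and a consumer joined by a buffer. Note the substitution: the patent decouples the two stages through a log file and a database, whereas this sketch uses an in-memory `queue.Queue` purely to show the idea; `run_pipeline` and its parameters are hypothetical names.

```python
import queue
import threading


def run_pipeline(raw_messages, analyze):
    """Collect messages in one thread and analyze them in another."""
    buffer = queue.Queue()  # stands in for the log file between the stages
    results = []

    def collector():
        for message in raw_messages:
            buffer.put(message)  # never waits for the analyzer
        buffer.put(None)  # sentinel: collection finished

    def analyzer():
        while True:
            item = buffer.get()
            if item is None:
                break
            results.append(analyze(item))

    threads = [threading.Thread(target=collector), threading.Thread(target=analyzer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline(["link up", "link down"], str.upper))
```

Because the buffer is unbounded, a slow analyzer delays results but does not block or drop collection, which mirrors the two benefits listed above.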
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments or portions thereof without departing from the spirit and scope of the invention.

Claims (11)

1. An NLP-based weblog processing system, characterized in that: the system comprises a natural language processing component and a database; a classified word bank, a preset word sense library and a language processing model are established in the database, wherein the classified word bank holds a plurality of classified words that correspond specifically to equipment types, or keywords selected by taking the high-frequency words derived from word segmentation as the standard; the classified word bank is associated with the preset word sense library by mapping, and the preset word sense library is associated with the language processing model;
the natural language processing component is used for carrying out inductive classification and analysis on syslog source data and log files of the equipment and determining meanings contained in natural language sentences,
and acquiring a certain number of sentences to train a language processing model of the database, wherein the natural language processing component acquires effective fields from the sentences according to a preset word meaning library to train and learn so as to generate a plurality of training words as keywords, generates analytic information for the keywords, and generates the language processing model according to the keywords and the corresponding analytic information, and the language processing model adopts a neural network architecture.
2. The NLP-based weblog processing system of claim 1, wherein: the natural language processing component also comprises an acquisition module, a segmentation module and an analysis module;
the acquisition module is used for receiving basic information or training sentences of the equipment source, caching the basic information or the training sentences into a database, classifying according to predefined rules, and caching the basic information or the training sentences into a classified word bank;
the segmentation module is used for matching and segmenting the training sentences based on a preset word sense library, and segmenting the training sentences into at least one training word as a keyword;
the analysis module is used for analyzing the obtained keywords and generating corresponding analysis information, which comprises part-of-speech labels and word sense annotations; a part-of-speech label is the part of speech of a keyword in the training sentence, and the part-of-speech labels are defined by an Oxford English-Chinese dictionary and/or the English-Chinese bilingual Microsoft Computer Dictionary.
3. The NLP-based weblog processing system of claim 2, wherein: the acquisition module further comprises an equipment determination module, a content acquisition unit and an association analysis unit;
the equipment determination module acquires equipment information in the network environment using an equipment discovery technique and stores the basic information of the equipment in the classified word bank of the database;
the content acquisition unit acquires network log files from the network equipment monitored by the syslog server as the data source for training the language processing model, and acquires the statements to be analyzed.
4. The association analysis unit is used for constructing the attribute association relationship between the acquired equipment information and the syslog file.
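The attribute association between discovered equipment and syslog entries could be sketched as follows. The example assumes BSD-style (RFC 3164) messages, in which the leading `<PRI>` value encodes facility × 8 + severity, and it joins each line to a device record by source address. The inventory contents, the regular expression, and the output field names are illustrative assumptions rather than details taken from the patent.

```python
import re

# <PRI>, "Mmm dd hh:mm:ss" timestamp, host, then the free-text message.
SYSLOG_RE = re.compile(
    r"^<(\d{1,3})>(\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s(\S+)\s(.*)$"
)

# Hypothetical result of device discovery, keyed by device IP.
INVENTORY = {
    "10.0.0.1": {"name": "core-sw-01", "type": "switch", "vendor": "ExampleVendor"},
}


def associate(line, inventory=INVENTORY):
    """Parse one syslog line and attach the attributes of its source device."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    pri, timestamp, host, message = match.groups()
    facility, severity = divmod(int(pri), 8)  # PRI = facility * 8 + severity
    device = inventory.get(host, {})
    return {
        "device_ip": host,
        "device_name": device.get("name"),
        "device_type": device.get("type"),
        "vendor": device.get("vendor"),
        "facility": facility,
        "severity": severity,
        "timestamp": timestamp,
        "message": message,
    }


record = associate(
    "<189>Dec 23 10:15:02 10.0.0.1 %LINK-3-UPDOWN: Interface Gi0/1, changed state to down"
)
print(record)
```

Lines from hosts that are missing from the inventory still parse; their device attributes are left as `None` so that they can be reconciled after the next discovery pass.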
5. The NLP-based weblog processing system of claim 1, wherein: the system further comprises a training module, wherein the training module uses the analysis information obtained from the analysis module to update the language processing model of the corresponding equipment.
6. The NLP-based weblog processing system of claim 1, wherein: the equipment includes, but is not limited to, switches, servers, gateways, routers, and network security devices; the network discovery methods for the equipment include, but are not limited to, the SNMP, ARP and ICMP protocols.
7. The NLP-based weblog processing system of claim 2, wherein: the basic information of the device includes, but is not limited to, device name, device type, device IP, and device vendor.
8. An NLP-based network log processing method, characterized by comprising the following steps: an acquisition module acquires equipment information in the network environment, records the basic information of the equipment as basic sentences for sentence acquisition, and acquires the per-equipment network logs from a syslog server; a segmentation module segments each acquired basic sentence into at least one keyword, the keywords corresponding one-to-one to attributes in the classified word bank; an analysis module resolves the corresponding analysis information for each keyword according to the entries in the preset word sense bank; and the resulting display information is exported to a web page.
9. The NLP-based weblog processing method according to claim 5, wherein: the classified word bank contains a plurality of classified words, and keywords corresponding to the characteristics of different types of equipment are stored in the classified word bank.
10. The NLP-based weblog processing method according to claim 5, wherein: the preset word sense bank holds the analysis information corresponding to the keywords, obtained after deep neural network training.
11. The NLP-based weblog processing method according to claim 5, wherein: when a syslog message is reported, the natural language processing component extracts from the log the network equipment log content that needs to be analyzed, decomposes it into a plurality of words, analyzes those words with the trained language processing model, extracts features and classifies them, and thereby determines the meaning of the syslog content; based on the determined meaning, the natural language processing component accurately and quickly translates the original statement into information that is described in Chinese and easy for the user to understand, and determines the category of the log from its content, providing a basis for log classification.
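The reporting-time flow of the last claim (decompose, classify, render in Chinese) can be illustrated with a deliberately simplified sketch. Note that it substitutes a keyword glossary and majority vote for the trained neural language processing model named in the claims; the glossary entries, category labels, and function name are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical glossary: keyword -> (Chinese gloss, log category).
GLOSSARY = {
    "interface": ("接口", "link"),
    "down": ("中断", "link"),
    "up": ("恢复", "link"),
    "login": ("登录", "security"),
    "failed": ("失败", "security"),
}


def translate_and_classify(message):
    """Decompose a log message, gloss known words, and vote on a category."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", message)]
    hits = [w for w in words if w in GLOSSARY]
    if not hits:
        return {"summary": "", "category": "unknown"}
    # Each recognized keyword casts one vote for its category.
    votes = Counter(GLOSSARY[w][1] for w in hits)
    category = votes.most_common(1)[0][0]
    summary = " ".join(GLOSSARY[w][0] for w in hits)
    return {"summary": summary, "category": category}


print(translate_and_classify("Login failed for user admin"))
print(translate_and_classify("Interface Gi0/1 changed state to down"))
```

A lookup table like this cannot resolve word order or context, which is presumably why the claims rely on a trained model for the actual translation step.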
CN201911334997.3A 2019-12-23 2019-12-23 NLP-based weblog processing system and method Active CN111130877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911334997.3A CN111130877B (en) 2019-12-23 2019-12-23 NLP-based weblog processing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911334997.3A CN111130877B (en) 2019-12-23 2019-12-23 NLP-based weblog processing system and method

Publications (2)

Publication Number Publication Date
CN111130877A (en) 2020-05-08
CN111130877B (en) 2022-10-04

Family

ID=70501624

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN201911334997.3A Active CN111130877B (en) 2019-12-23 2019-12-23 NLP-based weblog processing system and method

Country Status (1)

Country Link
CN (1) CN111130877B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111884858A (en) * 2020-07-29 2020-11-03 中国工商银行股份有限公司 Equipment asset information verification method, device, system and medium
CN113407505A (en) * 2021-07-01 2021-09-17 中孚安全技术有限公司 Method and system for processing security log elements
CN114385890A (en) * 2022-03-22 2022-04-22 深圳市世纪联想广告有限公司 Internet public opinion monitoring system

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1726488A (en) * 2002-05-07 2006-01-25 国际商业机器公司 Integrated development tool for building a natural language understanding application
CN104375989A (en) * 2014-12-01 2015-02-25 国家电网公司 Natural language text keyword association network construction system
CN104391881A (en) * 2014-10-30 2015-03-04 杭州安恒信息技术有限公司 Word segmentation algorithm-based log parsing method and word segmentation algorithm-based log parsing system
US20150242396A1 (en) * 2014-02-21 2015-08-27 Jun-Huai Su Translating method for translating a natural-language description into a computer-language description
CN107506349A (en) * 2017-08-04 2017-12-22 卓智网络科技有限公司 User negative emotion prediction method and system based on network logs
CN107656997A (en) * 2017-09-20 2018-02-02 广东欧珀移动通信有限公司 Natural language processing method, apparatus, storage medium and terminal device
CN107660283A (en) * 2015-04-03 2018-02-02 甲骨文国际公司 Method and system for implementing a log parser in a log analysis system
CN109033135A (en) * 2018-06-06 2018-12-18 北京大学 Natural language query method and system for software project knowledge graphs
US20190005018A1 (en) * 2017-06-30 2019-01-03 Open Text Corporation Systems and methods for diagnosing problems from error logs using natural language processing

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1726488A (en) * 2002-05-07 2006-01-25 国际商业机器公司 Integrated development tool for building a natural language understanding application
US20150242396A1 (en) * 2014-02-21 2015-08-27 Jun-Huai Su Translating method for translating a natural-language description into a computer-language description
CN104391881A (en) * 2014-10-30 2015-03-04 杭州安恒信息技术有限公司 Word segmentation algorithm-based log parsing method and word segmentation algorithm-based log parsing system
CN104375989A (en) * 2014-12-01 2015-02-25 国家电网公司 Natural language text keyword association network construction system
CN107660283A (en) * 2015-04-03 2018-02-02 甲骨文国际公司 Method and system for implementing a log parser in a log analysis system
US20190005018A1 (en) * 2017-06-30 2019-01-03 Open Text Corporation Systems and methods for diagnosing problems from error logs using natural language processing
CN107506349A (en) * 2017-08-04 2017-12-22 卓智网络科技有限公司 User negative emotion prediction method and system based on network logs
CN107656997A (en) * 2017-09-20 2018-02-02 广东欧珀移动通信有限公司 Natural language processing method, apparatus, storage medium and terminal device
CN109033135A (en) * 2018-06-06 2018-12-18 北京大学 Natural language query method and system for software project knowledge graphs

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111884858A (en) * 2020-07-29 2020-11-03 中国工商银行股份有限公司 Equipment asset information verification method, device, system and medium
CN111884858B (en) * 2020-07-29 2023-01-03 中国工商银行股份有限公司 Equipment asset information verification method, device, system and medium
CN113407505A (en) * 2021-07-01 2021-09-17 中孚安全技术有限公司 Method and system for processing security log elements
CN114385890A (en) * 2022-03-22 2022-04-22 深圳市世纪联想广告有限公司 Internet public opinion monitoring system
CN114385890B (en) * 2022-03-22 2022-05-20 深圳市世纪联想广告有限公司 Internet public opinion monitoring system

Also Published As

Publication number Publication date
CN111130877B (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN111130877B (en) NLP-based weblog processing system and method
US10795753B2 (en) Log-based computer failure diagnosis
WO2018036239A1 (en) Method, apparatus and system for monitoring internet media events based on industry knowledge mapping database
US7774290B2 (en) Pattern abstraction engine
WO2020108063A1 (en) Feature word determining method, apparatus, and server
US10467256B2 (en) Automatic query pattern generation
CN111930547A (en) Fault positioning method and device and storage medium
CN108108288A (en) A kind of daily record data analytic method, device and equipment
WO2021174812A1 (en) Data cleaning method and apparatus for profile, and medium and electronic device
CN110035087B (en) Method, device, equipment and storage medium for recovering account information from traffic
WO2020019490A1 (en) Interface testing method, electronic device and storage medium
CN104573024A (en) Self-adaptive extracting method and system for heterogeneous security log information under complex network system
CN110765483A (en) Configured log desensitization method and device and electronic equipment
WO2020140624A1 (en) Method for extracting data from log, and related device
CN113971205A (en) Threat report attack behavior extraction method, device, equipment and storage medium
CN110688558A (en) Method and device for searching web page, electronic equipment and storage medium
CN115296892A (en) Data information service system
JP5154132B2 (en) Name conversion recognition device and method
US11475222B2 (en) Automatically extending a domain taxonomy to the level of granularity present in glossaries in documents
CN107678916A (en) A kind of analysis and diagnosis method and system based on CPU register informations
WO2021128721A1 (en) Method and device for text classification
CN109558418B (en) Method for automatically identifying information
CN115587364B (en) Firmware vulnerability input point positioning method and device based on front-end and back-end correlation analysis
EP4235407A1 (en) Method and system for mapping intermediate representation objects for facilitating incremental analysis
WO2024045128A1 (en) Artificial intelligence model display method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant