CN111177098A - Method and system for checking system log context - Google Patents


Info

Publication number
CN111177098A
CN111177098A (application CN201911376364.9A)
Authority
CN
China
Prior art keywords
log
data
context
fid
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911376364.9A
Other languages
Chinese (zh)
Other versions
CN111177098B (en)
Inventor
周杰 (Zhou Jie)
马楠 (Ma Nan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CITIC Aibank Corp Ltd
Original Assignee
CITIC Aibank Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CITIC Aibank Corp Ltd filed Critical CITIC Aibank Corp Ltd
Priority to CN201911376364.9A priority Critical patent/CN111177098B/en
Publication of CN111177098A publication Critical patent/CN111177098A/en
Application granted granted Critical
Publication of CN111177098B publication Critical patent/CN111177098B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44521 Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
    • G06F9/44526 Plug-ins; Add-ons

Abstract

The invention discloses a method and a system for viewing system log context, in particular for viewing context after logs have been collected centrally at large scale and searched. The method comprises four steps. Step one: collect log data centrally at large scale with a log collection program, initialize the FID, set the FID that locates the original position of the target log, and generate an FID auto-increment sequence. Step two: the data receiving end receives the log data and forwards it in batches to Kafka. Step three: a distributed data collection engine consumes and parses the raw data, stores it in the ES, and the ES search engine performs retrieval to locate the FID of the target log. Step four: with the FID of the target log's original position located, the context of the log to be queried is determined from conditions across multiple dimensions, and the context of the result is viewed.

Description

Method and system for checking system log context
Technical Field
The invention relates to the technical field of monitoring and analysis of computer system and application logs, and in particular to a method for viewing the context of a system log.
Background
Logs are usually spread across many physical machines and virtual machines, and development and operations staff traditionally log in to each machine to inspect them. With distributed and micro-service architectures now widely deployed, this approach no longer works: technicians often have to log in to N machines to find the log they need.
In the prior art, a centralized log collection scheme similar to the ELK architecture can markedly improve the efficiency with which technicians view and analyze system and service logs; with good search skills it even enables fast report statistics.
However, this solution has a notable drawback: when a technician needs to view the logs associated with a given entry's context, the ES engine cannot determine which documents form that context. The inverted index and full-text search techniques of the ES are designed for fast retrieval and statistics, and the ES has no auto-increment ID mechanism like MySQL's.
In existing open-source architectures, no line number is recorded at collection time. During a search a log can be located by keywords, but because its surrounding lines do not contain those keywords, the context cannot be queried directly. This is very unfriendly for production troubleshooting, especially for continuous business logs.
Current log search therefore cannot show the context of a selected log entry. The text indexing of the ES segments each document into terms and supports fuzzy and exact queries.
Although producers write data to Kafka in order, Kafka spreads the data across different partitions, so consumers cannot guarantee that log data enters the ES engine in collection order; and because the consumers are distributed and horizontally scalable, an auto-increment document ID cannot be maintained as a unique counter in memory.
As a result, although the log entry containing a desired keyword can be found quickly, the context of the selected entry is invisible, even though the contexts of related business logs are correlated and need to be viewed together to clarify a problem.
These problems leave technicians with inaccurate log analysis and an inability to locate faults.
The main terms used in this application are defined as follows:
source: log file path recorded in log document.
FID: the log line number recorded in the log document is generated when the log is collected according to the method of the invention.
timeFrom: the start time in the log search criteria needs to be shifted forward by one minute in order to account for the fact that the time to select a log entry may be at the earliest in range and that a date switch may be encountered.
timeTill: the deadline in the log search criteria needs to be offset one minute backward in order to account for the time at which the log entry is selected being likely to be at the latest in the range and likely to encounter a date switch.
Golang: the method is designed by google, is a new development language for fully utilizing hardware performance and simultaneously considering development efficiency, and has cross-platform/natural high concurrency/garbage recovery support and rapid compiling.
Apprname: in the management of the business, each business system is required to have three system codes, the operation and maintenance management is unified to maintain, and when the log is collected, the business system to which the log belongs is defined.
Tag: in the aspect of management, each service system is required to be named and divided according to different deployment units, and unique service system logs can be distinguished through appname + tag during collection.
Word segmentation: in brief, the text to be put in storage is split and divided into a plurality of phrases, each phrase can be used as a search keyword, and meanwhile, the whole text is also the search keyword, so that the fuzzy matching capability of the search is ensured.
Inverted indexing: the normal index is to generate an ID for the document, if the ID is known, the corresponding document can be quickly searched, but the index mode cannot deal with the condition that a user searches the documents in which certain specific phrases appear, an inverted index mode is used here, the phrases are used as key indexes, the ID is used as corresponding content, and document searching and statistics according to the key words are realized.
Disclosure of Invention
The invention aims to provide a method for viewing system log context, specifically solving the problem of viewing context after logs have been centrally collected at large scale and searched. Through a unified, centralized UI, logs can be retrieved quickly, and the context of any result can be viewed with a single click, helping the user locate the cause behind a query result rapidly.
In existing open-source architectures the line number is not recorded at collection time, so neither log positioning nor context viewing can be achieved.
As shown in the logic diagram of FIG. 1, the invention is built on the open-source Elasticsearch search engine. The overall architecture follows the ELK open-source log analysis platform, but a Kafka component is added between the collection receiving end and the ES engine; Kafka's high-performance I/O throughput ensures that consumed data fully enters the ES and provides a buffering function.
The inventors developed the system in the golang language. The search front end, built on the search functions of the ES engine, is developed in JS, and the search back-end web service is developed in Python.
The key point of the invention is that an FID (line number) attribute is assigned at collection time; after the logs are stored, the context of the log to be queried can be uniquely determined from conditions across several dimensions.
The technical effect achieved is that the ordered context of a selected log entry can be uniquely determined from five fields in total, ip/source/FID/timeFrom/timeTill, and returned for display through a customized UI. Meanwhile, the user need not care about the background processing: selecting a log in the interface and clicking "view context" is enough.
Open-source Elasticsearch is used for back-end storage and search. The data flow of the scheme is: the client collects and preprocesses data and uploads it to the receiving end in batches; the data is forwarded to Kafka in batches, then consumed and parsed, dumped into the ES, and retrieved by the ES search engine; by locating the retrieval result, the context of the log to be queried is uniquely determined and viewed.
As shown in the logic diagram of fig. 1, the technical scheme of the invention is implemented as follows:
Step 1. Collect logs with a log collection program. The collection program sets appname and tag at collection time to distinguish which business system the collected logs belong to; IP information, hostname data and source data are also uploaded to the data receiving end at collection time. The FID is initialized and an FID auto-increment sequence is generated.
Step 2. The data receiving end receives and forwards the data. Receiving means the collected and preprocessed data is sent to the data receiving end in batches; the receiving end accepts data over an HTTP interface and forwards it to Kafka in batches for buffering.
Step 3. Data parsing is performed by a distributed data collection engine developed in golang, which consumes data from Kafka and parses the relevant fields according to established rules; after consumption and parsing, the data is dumped into the ES search engine, which then retrieves it to locate the FID of the target log's original position.
Step 4. By locating the search result, the context of the log to be queried is determined and the context of the result is viewed.
Optionally or preferably, in step four the number of context entries viewed is set to 50;
when the FID reaches the upper limit of 2^31-1, it is reset to 0 and rotated. With the default of 50 context entries, the query conditions assembled by the web server automatically perform the FID conversion when the selected log lies in the boundary range, guaranteeing that the queried log's context stays within the 50-entry window.
Preferably, when log collection starts, a uint32 integer is opened in memory for maintenance.
Optionally, the number of context entries viewed can be extended to any value greater than 50 and less than 10000.
Preferably, the search engine is Elasticsearch.
Preferably, the distributed publish-subscribe messaging system is Kafka.
A system for viewing the context of a system log is characterized by comprising a log collection unit, a data receiving unit, a data buffering unit, a data parsing unit, a data storage unit and a log retrieval unit;
the log collection unit collects log file information centrally at large scale;
the data receiving unit receives the log data collected by the log collection unit;
the data buffering unit receives the log files forwarded by the data receiving unit and adds the received log file information to the distributed message queue;
the data parsing unit consumes data from the distributed message queue and parses the relevant fields according to the established rules; after consumption and parsing, the data is dumped into the data storage unit, and the search engine retrieves and locates the target log;
the data storage unit stores the consumed and parsed data;
the log retrieval unit locates the FID of the target log's original position according to the user's log query conditions and retrieves the corresponding target log file.
The system for viewing system log context may be stored in a computer-readable storage medium, including: ROM, RAM, magnetic disk, optical disk, and the like. The archiving analysis platform applied to the logs is a Storm distributed log analysis platform.
Drawings
FIG. 1 is a logic diagram of an embodiment of the method of the present invention;
FIG. 2 is a schematic diagram of a log collection and FID generation provided by an embodiment of the method of the present invention;
FIG. 3 is a flow chart illustrating the context of log search in the method of the present invention;
FIG. 4 is a schematic diagram of a component system of an embodiment of the system of the present invention;
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more clear, the following describes embodiments of the present invention in detail with reference to fig. 2 and 3. It is to be understood that the described embodiments are merely illustrative or exemplary in nature and are in no way intended to limit the invention, its application, or uses. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some or all of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by presenting examples of the invention. The present invention is in no way limited to any specific configuration and algorithm set forth below, but rather covers any modification, replacement or improvement of elements, components or algorithms without departing from the spirit of the invention.
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Detailed description of the preferred embodiments
As shown in fig. 2 and fig. 3, the method for viewing system log context is implemented according to the following scheme:
Step S101. The log collection program reads the configuration file and determines which logs to collect from the configuration information. It converts the configuration information into the required data and registers it in a global variable. After registration, the log collection program allocates enough threads to collect the logs according to the global variable.
The logs are collected centrally at large scale. Each collection thread first initializes the FID and seek, then reads the file path, file name and iNode information.
The FID locating the target log's original position is set and an FID auto-increment sequence is generated. The specific generation rules for the FID are detailed below.
The generation rules for the FID are specifically:
1. When log collection starts, a uint32 integer (range up to 2^32-1) is opened in memory for maintenance. The uint32 type was chosen for the following reasons:
First, the type ultimately stored in Elasticsearch: an integer type is selected in Elasticsearch, whose maximum is 2^31-1 (int32; this positive range is already contained within uint32);
uint64 could also be chosen for a larger value range, but for a count of log lines uint32 is already sufficient, and a larger type consumes more memory.
Second, web servers implemented in different languages and on different operating systems all support the int type's length, ensuring compatibility;
Furthermore, since a single log file is usually of limited size, 2^31-1 lines already covers a very long log with low wrap-around frequency, so no storage space is wasted on a larger FID.
2. When the FID reaches the upper limit of 2^31-1, it is reset to 0 and rotated. With the default of 50 context entries, the query conditions assembled by the web server automatically perform the FID conversion when the selected log lies in the boundary range, guaranteeing that the queried log's context stays within the 50-entry window.
3. The log collection program allocates an independent collection thread for each collected log file and maintains a thread-safe, self-incrementing FID variable within that thread, which guarantees no FID conflicts. Meanwhile, the seek position and FID at collection time are registered in the global variable, and a timer thread flushes this data to disk to prevent loss of collection progress if the program exits unexpectedly.
The collection program sets appname and tag at collection time to distinguish which business system the collected logs belong to.
According to fig. 2, the recorded iNode is compared with the file's current iNode. If they are equal, the recorded seek information is read; if not, seek is initialized to 0. In the read loop, each line of the log advances seek++ and FID++, and the FID is compared with 2147483647:
if it is less than or equal to 2147483647, the FID is kept unchanged; if it is greater, the FID is reset to 0.
A document is the structure in which data is stored in the ES: each piece of log data, together with the values of all its attribute fields, constitutes one document.
At collection time, besides the log line, seek, FID and other information, the IP information, hostname data and source data are also placed into the document uploaded to the data receiving end, and stored in the ES after consumption. The data is inserted into a channel; a reader drains the channel and submits the data to the unified upload interface in batches once a sufficient amount has accumulated or more than N seconds have passed since the last submission.
Step S102. The data receiving end receives the log data and, after receiving it, forwards the data to Kafka in batches.
Specifically, the data receiving end receives and forwards the data: the collected and preprocessed data is sent to the receiving end in batches over an HTTP interface, and the receiving end forwards it to Kafka in batches.
Kafka provides a good buffering mechanism, turning back-end data parsing and ES storage into an asynchronous pipeline and raising data-processing throughput. Data parsing is very CPU-intensive, and ES storage consumes CPU and disk I/O, with I/O efficiency lower than Kafka's. At the same time, because data receiving is decoupled from data parsing and consumption, the parsing module can be scaled out independently, while the receiving module needs few resources, only validating and forwarding.
The Kafka component added between the collection receiving end and the ES engine exploits Kafka's high-performance I/O throughput, ensures that consumed data fully enters the ES, and provides a buffering function.
Step S103. Data parsing is performed by a distributed data collection engine developed in golang, which consumes data from Kafka and parses the relevant fields according to established rules; after consumption and parsing, the data is dumped into the ES and retrieved by the ES search engine to locate the FID of the target log's original position.
Step S104. By locating the FID of the target log, the context of the log to be queried is determined from conditions across multiple dimensions, and the context of the result is viewed.
The multiple dimensions include, but are not limited to: time range [@timestamp] + ip + absolute log file path [source] + log line number [FID]. The final search condition executed in the search engine is @timestamp: {gte: ${timeFrom}, lte: ${timeTill}} AND ip:${ip} AND source:${source} AND the FID range, and the query result set is sorted by FID.
The number of context entries viewed is configurable; the current environment defaults to 50, but the value can be chosen from the front end, up to at most 10000, which is the default maximum result set the ES cluster returns per query; raising that maximum requires modifying the ES configuration file and restarting the cluster.
Explaining the technical scheme from the user's perspective, as shown in fig. 3: the user selects a time range with timeFrom and timeTill and enters a search statement; the back end parses it into the Lucene grammar required by ES search and retrieves the logs. When the user sees an interesting log entry, the user selects "view context" on it. The log's basic attributes contain the IP information, which locates the unique host; source locates the unique log file; and FID locates the log's original position. The program automatically assembles a new search statement to query the ES for the context; the applied query statement is:
ip:${ip} AND source:${source} AND FID >= ${FID}-25 AND FID <= ${FID}+25, and the query result set is sorted by FID, completing the context query.
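Assembling the query statement above is a simple string template. This sketch builds only the Lucene-style condition quoted in the text; the real system additionally applies the @timestamp range (timeFrom/timeTill), sorts the result set by FID, and handles the FID wrap at the 2^31-1 boundary, all omitted here for brevity.

```go
package main

import "fmt"

// contextQuery builds the "view context" condition for a selected log
// entry: the 25 lines before and after it on the same host and file.
func contextQuery(ip, source string, fid int64) string {
	return fmt.Sprintf("ip:%s AND source:%s AND FID>=%d AND FID<=%d",
		ip, source, fid-25, fid+25)
}

func main() {
	q := contextQuery("172.20.101.131", "/var/log/app/app.log", 100)
	fmt.Println(q)
	// ip:172.20.101.131 AND source:/var/log/app/app.log AND FID>=75 AND FID<=125
}
```

Because ip, source, and the FID window together pick out one file on one host, sorting the matches by FID reproduces the original line order around the selected entry.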
The invention has been applied to the log collection agents of production and test environments; a message sent by the log collection agent is shown in the appendix of the specification.
The invention solves the problem that, after large-scale log collection and centralized storage, only search was possible and context-dependent queries were not; it improves technicians' fault-locating capability and efficiency before incident handling, and provides a basic guarantee for the stable operation and rapid handling of the production environment.
In the system embodiment, the log collection unit reads the configuration file, determines which logs to collect from the configuration information, initializes the FID, sets the FID locating the target log's original position, and generates an FID auto-increment sequence.
The log files collected by the log collection unit 201 are sent to the data receiving unit 202; after the data receiving unit 202 receives them, it adds the received log file information to the data buffering unit 203, which buffers it in the distributed message queue;
the log files carrying the FID that locates the target log are obtained and parsed according to the preset parsing strategy to produce the corresponding parse result. After the distributed data collection engine of the data parsing unit 204 consumes and parses the raw data, the parsed data is stored in the data storage unit 205 for subsequent user queries.
When a user needs to view a context, the log retrieval unit 206 receives the user's log query request, parses it into the corresponding query conditions, retrieves using those conditions and the context-entry range selected by the user, locates the FID of the target log's original position, obtains the corresponding log file, and displays the target log's context to the user.
The storage device comprises a log collection unit, a log buffering unit, a log storage unit and a log query unit.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs. The use of "first," "second," and similar terms in the description and claims of this patent does not denote any order, quantity, or importance, but rather the terms are used to distinguish one element from another. Also, the use of the terms "a" or "an" and the like do not denote a limitation of quantity, but rather denote the presence of at least one.
The above description is only exemplary embodiments of the present invention and should not be taken as limiting the invention, and any modifications, equivalents, improvements and the like that are within the spirit and principle of the present invention should be included in the scope of the present invention.
Appendix of the description:
{ "recv _ time _ map": 1566810054606 "," log _ ID ": AWsgWD2vQcDD _ c1GxIpc", "raw _ message": "[ 2019/08/2617: 00:54CST ] [ INFO ] (rpc. StateHttpServereesponse: 212) SG _10003HttpResponse: SG:172.20.101.131:8881, provider:184002100, service: ifiCoreApi, version:20180509, method: apply, src:174001100, transfer-encoding: chunked, date: Mon, Aug 201917: 00:54GMT, msoa. request: 1619415, msoa. transit: 4f87c2e3a 4a access f 4a 3a 4a access c 4a 5a c 8 b" { (R _ ID "{ -STP _ I" { -NO _ I "{ (R _ I": W _ O ": No. 5": 54) (A "{" COsW # 1": WO 25" { "COsW # 12": WO 25 ": A": WO 25 "-: A" -: WO 25 ",",300A ", "MESSAGE _ ID _ IN": ac 1464490000201908261954387 "," sourceSererId ": bzaqyjsserap 1001", "STEP _ NUM": 1"," TRANS _ CODE ": WMT0010012", "TRANS _ SERIAL": 567 bzIQYQYJ 2324395519 "," PREV _ TRANS _ CODE ": 2", "wsId": QYJ "," brandchyD ": 800001", "priority": 1"," BOLEN ": 602", "version": 1.620.03RSP _ ENqTRY "{" App _ Id ": AQY _001", "Format": QdWoFcJQfJdWdWoJ 435427 "," JNfJQfJfJfJQfJfJQfQ 435427 ": TfJfJfJfJfQ": 27 ": WoJfQ": 27 ": 5": WoJNfJQfQ ": W # JfJfJQfJfJfQ": 27 ": W # JQfQ": 27 ": W # JQfJNfJfJfJQfJfJfJfJfJfQ", - "WfJQfJNfJQfJfJfJfJfJfJQfJQfJQfJNfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJNO": 435459 "," JSfJfJfJfJSfJfJfJSfJSfJfJfJfJSfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJSfJSfJSfJSfJSfJSfJfJfJfJSfJSfJfJfJSfJSfJSfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJfJf, "destBranchNo" "," MSG _ CHARSET "", "UTF-8", "PROCESSS _ INF \ n400\ nO" ", and" trade is successful.
"," sourceType ": 9999", "_ SERVER _ TYPE": http "," sysDate ": 20190826", "subconsceq": 11000000000000000000"," _ TIMERFLOW _ NAME ": STEP _ STATE": 2"," REQUEST _ ID ": WMT _001", "_ HIRER _ CODEC _ KEY": null "," result ": 1", "_ WORKFLOW _ NAME": TRANSNSNEW "," ENTRY "{" SysDate ": 20190826", "wsId": QYJ "," brandCHId ": 800001", "subsnsq": 5"," TRANSId ": 190826 bIQYQYQYJ 232519": 43133431335 "," sourcesID "" "servo 00100" "SR": 3600005 "," weather ": 3600005": 3600008 ",": 250 "", "servo Id" "WM III": 250 ",",00005 ", "RetMessage": success "," remark ": 11890", "sourceAppId": P00044A001"," tradeDate ": 20220402", "actName": "," checkDate ": 20220402", "retCode": 00000000"," sourcePlatformId ": P00044", "tradeAmount":1.0 "," tradeTime ": 141002", "actNo": 10000000: "10000000": mark ": P0007", "tradeStadesStatus": 1"," messageCode ": FIT 7", "TRANS _ DAT \ n202\ nE": 3587145 "," sourceId ": AQY _ 001": SysTime ": 36001": SMsTime ": 36001", "MESSAGE": 4354001 ": SMtQUE # 8", "WM _ WM # wo 300": # and # found ": 8", "MESSAND": 36001 ": 360000": 36001 "," MES # 8"," MESSAND ": 360000 # 12", "MES # wo 001": 360000 # 12"," MES # 3 ": 8": SMtQUEST # WM # 3"," MES # 12 ": 8": W # SMITsID "," MES # 001 ": W # WM # 3": 8 ": W # SMITsSAND": W # 1"," MES # and "" MESSAND ": 8": W # found ": W # 1", "MESSAND": W # found ": 8" "," SMATC ": W # found": 8 ": W # found" "," SMATC, "_ FILTER _ CHAIN": outWithSignForers "," async ": false", "PROCESS _ CODE": 000000"," MSG _ FMT ": JSON", "OUT _ MAP _ ID":895 "," TRANS _ TIME ": 170053" }, "ID": ac 14600009020182619587 "} \ n0", "ip": 172.20.101.131"," source _ update _ timestamp ": 1566810054526", "source": v "/var/log/fsg/service-gateway/logs/' log": bzafsg001 "," appname ": fsg", "domain": fiD ": 268435455", "sttag": 1559615781000: "sgs" } 1559615781000: "mapping"
Enter the search conditions on the user interface:
"msoa.traceid:4f87c2e3a3fb4a05accaf656f4dfe8f0"AND"msoa.requestid:1619415"
An exact log hit is returned; clicking "view context" then displays the surrounding log lines, achieving the goal of viewing the context.
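The UI search above can be sketched as an Elasticsearch bool query. This is a minimal illustration only: the index layout is an assumption, and the real platform mapping may index msoa.requestid and msoa.traceid as separate fields rather than matching inside raw_message.

```python
# Minimal sketch of the combined search condition as an Elasticsearch bool query.
# Treating both conditions as phrase matches on raw_message is an assumption.
query = {
    "query": {
        "bool": {
            "must": [
                {"match_phrase": {"raw_message": "msoa.traceid:4f87c2e3a3fb4a05accaf656f4dfe8f0"}},
                {"match_phrase": {"raw_message": "msoa.requestid:1619415"}},
            ]
        }
    }
}
```

Both clauses sit under "must", reproducing the AND in the UI search string.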

Claims (9)

1. A method for viewing system log context, characterized by comprising the following steps:
step one, a log collection program collects the log data, generates a self-incrementing FID sequence, and uploads the data to a data receiving end in batches;
step two, the data receiving end receives the log data collected in step one and forwards it in batches to a distributed message queue for buffering;
step three, a distributed data collection engine consumes and parses the raw data, then stores the parsed data in a search engine, which is used to locate the FID of the target log;
step four, the context of the target log is determined from the located FID, thereby enabling the context to be viewed.
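Steps one and two can be sketched as follows. This is a hypothetical illustration, not the patented implementation: the class and field names are invented, and the list standing in for the data receiving end replaces an actual network upload.

```python
from itertools import count

class LogCollector:
    """Sketch of the log collection program: tag each collected line with a
    self-incrementing FID and upload records to the receiving end in batches."""

    FID_MODULUS = 2 ** 31  # the FID rolls over to 0 after reaching 2^31 - 1

    def __init__(self, batch_size=100):
        self._next_fid = count(0)   # monotonically increasing sequence
        self._batch = []
        self.batch_size = batch_size
        self.uploaded = []          # stands in for the data receiving end

    def collect(self, line):
        record = {"fid": next(self._next_fid) % self.FID_MODULUS, "message": line}
        self._batch.append(record)
        if len(self._batch) >= self.batch_size:
            self.flush()

    def flush(self):
        if self._batch:
            self.uploaded.append(list(self._batch))  # batch upload
            self._batch.clear()
```

With batch_size=2, collecting two lines triggers one upload carrying FIDs 0 and 1; the FID order preserves the original file order, which is what later makes context lookup possible.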
2. The method for viewing system log context according to claim 1, wherein the context view in step four is set to 50 entries by default;
when the FID reaches its upper limit of 2^31-1, it is reset to 0 and rolls over; since a context view selects 50 logs by default, the combined query condition on the web server automatically performs FID conversion when the selected log lies in the boundary range, ensuring that the queried log's 50-entry context is fully covered.
3. The method for viewing system log context according to claim 2, wherein the number of entries viewed can be set to any value greater than 50 and less than 10000.
4. The method for viewing system log context according to claim 1, wherein a uint32 integer is allocated in memory and maintained when log collection starts.
5. The method for viewing system log context according to claim 1, wherein the search engine of claim 1 is Elasticsearch.
6. The method for viewing system log context according to claim 1, wherein the distributed message queue of claim 1 is Kafka.
7. A system for viewing system log context, characterized by comprising a log acquisition unit, a data receiving unit, a data buffering unit, a data analysis unit, a data storage unit, and a log retrieval unit;
the log acquisition unit is used for large-scale, centralized collection of log file information, the collected logs ultimately being stored in the data storage unit;
the data receiving unit is used for receiving the log data collected by the log acquisition unit;
the data buffering unit is used for receiving the log files sent by the log acquisition unit and adding the received log file information to the distributed message queue;
the data analysis unit consumes data from the distributed message queue and parses the corresponding fields according to set rules; after consumption and parsing, the data is dumped into the data storage unit, and the search engine retrieves and locates the target log;
the data storage unit is used for storing the consumed and parsed data;
and the log retrieval unit is used for locating the FID at the original position of the target log according to the user's log query condition, and retrieving the corresponding target log file.
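Once the target log's FID is located, the retrieval unit's context fetch could look like the query builder below. This is a sketch under stated assumptions: the field names ("fid", "hostname") and the sort/size options are invented for illustration and do not come from the patent.

```python
def context_query(target_fid, hostname, window=50):
    """Build a search-engine query fetching the context of a located log:
    the same log source, restricted to an FID range around the target."""
    half = window // 2
    return {
        "query": {"bool": {"filter": [
            {"term": {"hostname": hostname}},  # stay within the same log source
            {"range": {"fid": {"gte": target_fid - half,
                               "lte": target_fid + half}}},
        ]}},
        "sort": [{"fid": "asc"}],   # restore original file order
        "size": window + 1,         # 50 context lines plus the target
    }
```

Sorting by FID ascending reconstructs the original file order of the surrounding lines, which is exactly what "viewing the context" requires.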
8. The system for viewing system log context according to claim 7, wherein the system can be stored in a computer-readable storage medium, the storage medium comprising: ROM, RAM, and hard disk devices.
9. The system for viewing system log context according to claim 7, wherein the system is applied to a log archive analysis platform, the platform being a Storm distributed log analysis platform.
CN201911376364.9A 2019-12-27 2019-12-27 Method and system for checking system log context Active CN111177098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911376364.9A CN111177098B (en) 2019-12-27 2019-12-27 Method and system for checking system log context


Publications (2)

Publication Number Publication Date
CN111177098A true CN111177098A (en) 2020-05-19
CN111177098B CN111177098B (en) 2023-09-22

Family

ID=70650390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911376364.9A Active CN111177098B (en) 2019-12-27 2019-12-27 Method and system for checking system log context

Country Status (1)

Country Link
CN (1) CN111177098B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111858475A (en) * 2020-07-14 2020-10-30 深圳前海移联科技有限公司 Universal distributed log context retrieval system and method
CN112579394A (en) * 2020-12-24 2021-03-30 罗婷 Log processing system and method applied to internet finance and computer equipment

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5623679A (en) * 1993-11-19 1997-04-22 Waverley Holdings, Inc. System and method for creating and manipulating notes each containing multiple sub-notes, and linking the sub-notes to portions of data objects
US5799325A (en) * 1993-11-19 1998-08-25 Smartpatents, Inc. System, method, and computer program product for generating equivalent text files
CN101192231A (en) * 2006-11-27 2008-06-04 国际商业机器公司 Bookmark based on context
JP2010146346A (en) * 2008-12-19 2010-07-01 Kddi Corp Context retrieval method and device
JP2010271989A (en) * 2009-05-22 2010-12-02 Nippon Telegr & Teleph Corp <Ntt> Content retrieval method, content retrieval system, and content retrieval program
JP2011095841A (en) * 2009-10-27 2011-05-12 Sdl Plc In-context exact (ice) match
CN102902768A (en) * 2012-09-24 2013-01-30 广东威创视讯科技股份有限公司 Method and system for searching and displaying file content
CN104679885A (en) * 2015-03-17 2015-06-03 北京理工大学 User search string organization name recognition method based on semantic feature model
CN106250424A (en) * 2016-07-22 2016-12-21 杭州朗和科技有限公司 The searching method of a kind of daily record context, Apparatus and system
CN108920364A (en) * 2018-06-21 2018-11-30 深圳壹账通智能科技有限公司 Software defect positioning method, device, terminal and computer readable storage medium
US20190073406A1 (en) * 2017-09-05 2019-03-07 Nec Laboratories America, Inc. Processing of computer log messages for visualization and retrieval
CN109542750A (en) * 2018-11-26 2019-03-29 深圳天源迪科信息技术股份有限公司 Distributed information log system
CN109684351A (en) * 2018-12-18 2019-04-26 上海达梦数据库有限公司 A kind of executive plan inspection method, device, server and storage medium
US20190171633A1 (en) * 2017-11-13 2019-06-06 Lendingclub Corporation Multi-system operation audit log
CN110288004A (en) * 2019-05-30 2019-09-27 武汉大学 A kind of diagnosis method for system fault and device excavated based on log semanteme


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
蒲志明: "Research and Implementation of a Log Management Module in a Cloud Platform", pages 1 - 94 *
路洁: "Research and Implementation of an Anomaly Detection Platform for Large-Scale Software Systems Based on Massive Logs", pages 1 - 77 *
饶翔: "Research on Log-Based Trustworthiness Assurance Technology for Large-Scale Distributed Software Systems", pages 1 - 150 *


Also Published As

Publication number Publication date
CN111177098B (en) 2023-09-22

Similar Documents

Publication Publication Date Title
US20230144450A1 (en) Multi-partitioning data for combination operations
US11726892B2 (en) Realtime data stream cluster summarization and labeling system
US11366859B2 (en) Hierarchical, parallel models for extracting in real time high-value information from data streams and system and method for creation of same
US20230237094A1 (en) Processing ingested data to identify anomalies
US11182098B2 (en) Optimization for real-time, parallel execution of models for extracting high-value information from data streams
US11151137B2 (en) Multi-partition operation in combination operations
US11663176B2 (en) Data field extraction model training for a data intake and query system
US7720845B2 (en) Systems and methods for updating query results based on query deltas
US11704490B2 (en) Log sourcetype inference model training for a data intake and query system
US20220036177A1 (en) Data field extraction by a data intake and query system
CN100428244C (en) Apparatus, system, and method for synchronizing change histories in enterprise applications
US20130124548A1 (en) System and Method for Presenting A Plurality of Email Threads for Review
US20210279265A1 (en) Optimization for Real-Time, Parallel Execution of Models for Extracting High-Value Information from Data Streams
CN109388637A (en) Data warehouse information processing method, device, system, medium
WO2014145092A2 (en) Hierarchical, parallel models for extracting in real time high-value information from data streams and system and method for creation of same
CN112269816B (en) Government affair appointment correlation retrieval method
CN109710767B (en) Multilingual big data service platform
US10901811B2 (en) Creating alerts associated with a data storage system based on natural language requests
CN103353901A (en) Orderly table data management method and system based on Hadoop distributed file system (HDFS)
CN111177098A (en) Method and system for checking system log context
WO2022026984A1 (en) Data field extraction model training for a data intake and query system
CN110895538A (en) Data retrieval method, device, storage medium and processor
EP3380906A1 (en) Optimization for real-time, parallel execution of models for extracting high-value information from data streams
Bordino et al. Advancing NLP via a distributed-messaging approach
US20190034555A1 (en) Translating a natural language request to a domain specific language request based on multiple interpretation algorithms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant