CN111966641A - Universal log normalization model configuration method and device - Google Patents

Info

Publication number
CN111966641A
CN111966641A CN202010828346.6A CN202010828346A CN111966641A CN 111966641 A CN111966641 A CN 111966641A CN 202010828346 A CN202010828346 A CN 202010828346A CN 111966641 A CN111966641 A CN 111966641A
Authority
CN
China
Prior art keywords
normalization
preset
normalized
log
strategy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010828346.6A
Other languages
Chinese (zh)
Other versions
CN111966641B (en)
Inventor
杨佳宁
郭娴
陈柯宇
杨立宝
李莹
樊佳讯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Industrial Control Systems Cyber Emergency Response Team
Original Assignee
China Industrial Control Systems Cyber Emergency Response Team
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Industrial Control Systems Cyber Emergency Response Team
Priority to CN202010828346.6A
Publication of CN111966641A
Application granted
Publication of CN111966641B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques

Abstract

The invention discloses a general log normalization model configuration method and device. The method comprises the following steps: acquiring log information to be normalized; selecting a preset normalization strategy from a preset normalization strategy library according to a preset sequence, and taking it as the current preset normalization strategy; performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result; and sequentially selecting preset normalization strategies as the current preset normalization strategy according to the preset sequence, repeating the normalization step for each to obtain a current normalization result. The invention provides a normalization strategy that is universal and usable: different normalization strategies and assignment modes are automatically set according to user needs, meeting the requirements of different users.

Description

Universal log normalization model configuration method and device
Technical Field
The invention relates to the technical field of information, in particular to a general log normalization model configuration method and a general log normalization model configuration device.
Background
At present, big data is a technical hotspot in the industry. In particular, with the deployment of cloud computing services, big data is regarded as their most important service application, and its development prospects are promising.
Big data services often involve multiple product projects. Different product projects generate different raw logs, and the log fields and expressions in those raw logs also differ. In the prior art, a log access system can only identify and process the raw logs of existing product projects; when raw logs of a new product project are accessed, a set of processing links capable of identifying the log fields and expressions of those raw logs has to be redeveloped, so the development cost is high. Log normalization is the process of mapping log records with different formats, different fields, and different meanings into the uniform field values of a product. A new log management method is therefore needed to solve the problems in the prior art.
Disclosure of Invention
In view of the foregoing problems in the prior art, an object of the present invention is to provide a general log normalization model configuration method and apparatus, which can improve the processing capability for different types of logs.
In order to solve the technical problems, the specific technical scheme of the invention is as follows:
In one aspect, the present invention provides a general log normalization model configuration method, comprising the following steps:
acquiring log information to be normalized;
selecting a preset normalization strategy from a preset normalization strategy library according to a preset sequence, and taking the preset normalization strategy as a current preset normalization strategy;
based on the current preset normalization strategy, performing normalization processing on the log information to be normalized to obtain a current normalization result;
sequentially selecting preset normalization strategies as the current preset normalization strategy according to the preset sequence, and repeating the step of performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result.
Further, before selecting a preset normalization strategy from the preset normalization strategy library according to the preset sequence and taking it as the current preset normalization strategy, the method further includes:
establishing a preset normalization strategy library, wherein the preset normalization strategy library comprises a plurality of groups of preset normalization strategies.
Further, the establishing a preset normalization policy library includes:
determining the type of the log to be normalized based on the log information to be normalized;
acquiring a regular expression matched with the log type to be normalized based on the log type to be normalized;
performing normalization processing on the log information to be normalized through the regular expression to obtain a field extraction value;
and assigning the field extracted value based on a preset field assignment mode to obtain a normalized result.
Further, the performing normalization processing on the log information to be normalized through the regular expression to obtain a field extraction value includes:
acquiring a field index based on the regular expression;
and performing normalization processing on the log information to be normalized through the regular expression based on the field index to obtain a field extraction value, wherein the field extraction value corresponds to the field index.
Further, the preset field assignment mode includes one or more of the following: direct assignment, mapping table assignment, formatted assignment, function assignment, and regular expression.
Further, the preset normalized policy library comprises at least one predefined policy management group and at least one custom policy management group.
Further, the method further comprises:
and establishing a data dictionary.
In a second aspect, the present invention further provides a general log normalization model configuration apparatus, including:
the log information acquisition module is used for acquiring log information to be normalized;
the current preset normalization strategy determining module is used for selecting preset normalization strategies from a preset normalization strategy library according to a preset sequence and taking the preset normalization strategies as current preset normalization strategies;
the current normalization result generation module is used for performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result;
and the normalization result generation module is used for sequentially selecting preset normalization strategies as the current preset normalization strategy according to the preset sequence, and repeating the step of performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result.
Further, the apparatus further comprises:
the preset normalization strategy library establishing module is used for establishing a preset normalization strategy library, wherein the preset normalization strategy library comprises a plurality of groups of preset normalization strategies;
and the data dictionary establishing module is used for establishing a data dictionary.
Further, the preset normalization policy library establishing module includes:
the log type determining unit is used for determining the log type to be normalized based on the log information to be normalized;
the regular expression determining unit is used for acquiring a regular expression matched with the log type to be normalized based on the log type to be normalized;
the field extraction unit is used for performing normalization processing on the log information to be normalized through the regular expression to obtain a field extraction value;
and the normalized result acquisition unit is used for assigning the field extracted value based on a preset field assignment mode to obtain a normalized result.
By adopting the technical scheme, the general log normalization model configuration method and device have the following beneficial effects:
1. The general log normalization model configuration method and device have strong applicability, can process logs of different types and from different sources, and improve processing efficiency.
2. The general log normalization model configuration method and device have high accuracy: a suitable regular expression is selected according to the normalization strategy, which improves the accuracy of field extraction and allows a large amount of data to be processed efficiently.
3. The general log normalization model configuration method and device provide a normalization strategy that is universal and usable; different normalization strategies and assignment modes are automatically set according to user needs, meeting the requirements of different users.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings used in the description of the embodiment or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram of an application environment provided by the present invention;
FIG. 2 is a flowchart illustrating a general log normalization model configuration method according to the present invention;
FIG. 3 is a flow diagram of a general log normalization model configuration method in some embodiments of the invention;
FIG. 4 is a schematic flow chart of step S102 in FIG. 3;
FIG. 5 is a schematic structural diagram of a general log normalization model configuration apparatus provided in the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to the present invention;
FIG. 7 is a schematic diagram of a storage medium according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or device.
Referring to FIG. 1, FIG. 1 is a schematic diagram of an application environment according to an embodiment of the present invention, which may include a client 01 and a server 02. The client and the server may be directly or indirectly connected through wired or wireless communication. The user experiences the business service through the client. When the service is updated, the client can report the log to the server. It should be noted that FIG. 1 is only an example. Specifically, the client 01 may include a smart phone, a desktop computer, a tablet computer, a notebook computer, an Augmented Reality (AR)/Virtual Reality (VR) device, a digital assistant, a smart speaker, a smart wearable device, and other types of physical devices, and may also include software running in the physical devices, such as a computer program. The operating system running on the client 01 may include, but is not limited to, Android, iOS (a mobile operating system developed by Apple Inc.), Linux, Microsoft Windows, and the like. Specifically, the server 02 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like. The server 02 may comprise a network communication unit, a processor, a memory, and the like. The server 02 may provide background services for the client.
In practical applications, a fog computing mode may be employed for the communication. In this mode, data, data processing, and applications are concentrated in devices at the edge of the network rather than being kept almost entirely in the cloud. Fog computing can be regarded as an extension of cloud computing. The service experienced by the user may be an audio/video service, a game service, and the like, and may specifically be directed to an update download application therein. The service experienced by the user can be implemented on a PCDN (P2P content distribution network) based on fog computing, which can guarantee high quality data transmission. Further, the normal service program (the service main logic, for example, watching a video online) of a product (for example, a video application) providing a relevant service for a user may use fog computing technology, while the log reporting logic of the product (triggered when the service main logic is abnormal) may use a conventional data communication technology with a central point.
In a specific embodiment, when the client corresponds to an entity device, a computer program provided by a service provider and pointing to a certain product is run in the entity device. When the client corresponds to a computer program running in the physical device, the computer program is provided by the service provider and directed to a product.
The computer program comprises a normal service program corresponding to the service main logic and a program corresponding to the log reporting logic, and correspondingly, when the normal service program is abnormal, the client executes the log reporting logic to report the log to a server (belonging to a service provider) corresponding to a certain product. The server receiving the reported log may be a specific log server. In practical application, the client can also point to a program corresponding to the log reporting logic, and a log reporting system can be constructed based on the program corresponding to the log reporting logic.
Directly reported log data cannot be readily identified. Log normalization converts it into uniform log data that is easy to identify; log normalization is the process of mapping log records with different formats, different fields, and different meanings into the uniform field values of a product.
In order to better implement the log processing process, a specific embodiment of the general log normalization model configuration method according to the present invention is described below. FIG. 2 is a flowchart of a general log normalization model configuration method according to an embodiment of the present invention. The present specification provides the method operation steps as described in the embodiment or the flowchart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. In practice, a system or server product may execute the steps sequentially or in parallel (e.g., in a parallel processor or multi-threaded environment) according to the embodiments or the methods shown in the figures. Specifically, as shown in FIG. 2, the method may include:
S101: acquiring log information to be normalized;
In this embodiment, the client obtains server state information based on the actual service state. The server state information represents the current configuration of the log server and may include at least one of: the service operation type, and the log format, fields, meanings, and other information corresponding to the service operation type. Of course, the server state information may also include error code reporting data and the like.
The log information may be acquired by a log collector. This log information is the initial log information, that is, the log information to be normalized. The log collector may collect server state information at regular intervals; in some other embodiments, the server state information may instead be reported by a client. The information collected by the log collector or uploaded by the client may be specific information, such as log data for a specific program running state.
S103: selecting a preset normalization strategy from a preset normalization strategy library according to a preset sequence, and taking the preset normalization strategy as a current preset normalization strategy;
the preset normalization policy library is a plurality of preset normalization policy groups set by a user as required, and each preset normalization policy group can be set according to different log source types, so that log normalization rules of the preset normalization policy groups are different, and obtained results are also different. When log normalization is performed, a current preset normalization strategy group needs to be selected in advance, so that the log processing can be realized.
Further, from another perspective, the preset normalization policy library may be divided into at least one predefined policy management group and at least one custom policy management group, where the predefined policy management group is a non-editable normalization policy, that is, does not allow a user to edit and delete, and may include data such as system basic configuration information, and the custom policy management group is an editable management group, and may include deletable information such as application software parameter information.
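The split between predefined and custom policy management groups can be illustrated with a minimal Python sketch. The class names, the field names, and the use of `PermissionError` are assumptions chosen for illustration; the source only states that predefined groups cannot be edited or deleted while custom groups can.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NormalizationPolicy:
    """One preset normalization policy (hypothetical structure)."""
    name: str
    log_source_type: str
    pattern: str  # regular expression used for field extraction


@dataclass
class PolicyGroup:
    """A policy management group; predefined groups are read-only."""
    name: str
    predefined: bool
    policies: List[NormalizationPolicy] = field(default_factory=list)

    def _require_editable(self) -> None:
        # Predefined groups may not be edited or deleted by the user.
        if self.predefined:
            raise PermissionError(f"group {self.name!r} is predefined")

    def add_policy(self, policy: NormalizationPolicy) -> None:
        self._require_editable()
        self.policies.append(policy)

    def remove_policy(self, policy_name: str) -> None:
        self._require_editable()
        self.policies = [p for p in self.policies if p.name != policy_name]
```

In this sketch a predefined group receives its policies at construction time, and every later edit attempt is rejected.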
In some possible embodiments, the following step may be performed before step S103:
S102: establishing a preset normalization strategy library, wherein the preset normalization strategy library comprises a plurality of groups of preset normalization strategies.
In actual work, a normalized policy name can be set for each policy. Optionally, a policy name does not exceed 64 characters and may not be repeated within the same tree node. Each normalized policy name corresponds to one normalized policy management group, which can be set according to the type of server or according to different service types, as the user requires. When the selected current preset normalization strategy belongs to a predefined policy management group, its editing state defaults to disabled, indicating that the normalization policy group cannot be edited. To help an operator quickly understand the content of each normalized policy management group, a policy description can also be set. Specifically, the policy description may be limited to 128 characters, which avoids redundant content that would hinder a concise page display.
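The naming constraints above (a name of at most 64 characters, unique within the same tree node, and a description of at most 128 characters) could be checked as in the following sketch. The function name, the error messages, and the representation of a tree node as a set of sibling names are assumptions.

```python
from typing import List, Set

MAX_NAME_LEN = 64          # limit stated for policy names
MAX_DESCRIPTION_LEN = 128  # limit stated for policy descriptions


def validate_policy(name: str, description: str,
                    sibling_names: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the input is valid."""
    errors = []
    if len(name) > MAX_NAME_LEN:
        errors.append(f"name exceeds {MAX_NAME_LEN} characters")
    if name in sibling_names:
        errors.append("name already used in the same tree node")
    if len(description) > MAX_DESCRIPTION_LEN:
        errors.append(f"description exceeds {MAX_DESCRIPTION_LEN} characters")
    return errors
```

Returning all problems at once, rather than raising on the first, suits a configuration page that reports every invalid field to the operator.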
Specifically, as shown in fig. 4, the step S102 may further include:
S201: determining the type of the log to be normalized based on the log information to be normalized;
It can be understood that the normalization policy is determined by the type of the log to be normalized, and that log types may differ by log source. Log sources may include, for example, servers, firewalls, switches, Active Directory (AD), Intrusion Detection Systems (IDS), terminal tools, and the like.
S203: acquiring a regular expression matched with the log type to be normalized based on the log type to be normalized;
The regular expression serves as a character processing tool. A matching regular expression can be preset for each log source type, so that it can extract the log information reported by the corresponding log source.
S205: performing normalization processing on the log information to be normalized through the regular expression to obtain a field extraction value;
After the log source type and its matching regular expression are determined, a sample of the log to be normalized can be matched against the regular expression to obtain the corresponding field extraction values.
In some possible embodiments, the obtaining of the field extracted value may be obtained in the form of a field index, and specifically, may include the following steps:
S2051: acquiring a field index based on the regular expression;
S2052: performing normalization processing on the log information to be normalized through the regular expression based on the field index to obtain a field extraction value, wherein the field extraction value corresponds to the field index.
The field extraction value can be obtained through an operator's action and is filled into the corresponding extraction value position according to the set index. Illustratively, the index can be an index number starting from 1; by default the index column is filled in sequence by sequence number, and an index is named with $ followed by a number, such as $1, $2, $3, and so on. It should be noted that indexes may be repeated, and the extracted field values are filled in sequence according to the set index numbers. A field extraction value may be formed by splicing multiple characters, and field extraction values may of course be repeated.
Specifically, after the operator clicks an "extract field value" button on the client, the values extracted from the sample by the regular expression are displayed in index order.
In the embodiment of the present specification, different fields can be obtained in sequence according to the index number, and in order to facilitate a user to quickly understand and identify the meaning of each extracted field, a field name can be set for each index, where the field name is a field name supported by a system, such as a generation time, a process, a device name, an event name, and the like.
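The index-based extraction and field naming described above can be sketched in Python as follows. Treating each capture group of the regular expression as one index ($1, $2, ...) is an assumption about how indexes map to the expression, and the sample log line, the pattern, and the field names are invented for illustration.

```python
import re
from typing import Dict


def extract_field_values(pattern: str, sample: str) -> Dict[str, str]:
    """Map "$1", "$2", ... to the capture groups matched in the sample."""
    match = re.search(pattern, sample)
    if match is None:
        return {}
    return {f"${i}": value for i, value in enumerate(match.groups(), start=1)}


def name_fields(extracted: Dict[str, str],
                index_to_name: Dict[str, str]) -> Dict[str, str]:
    """Attach a system-supported field name to each configured index."""
    return {index_to_name[index]: value
            for index, value in extracted.items()
            if index in index_to_name}


# Hypothetical log sample and pattern.
sample = "2020-08-18 10:15:02 fw01 sshd: login failed"
values = extract_field_values(r"(\S+ \S+) (\S+) (\S+): (.*)", sample)
named = name_fields(values, {"$1": "generation_time", "$2": "device_name",
                             "$3": "process", "$4": "event_name"})
```

Here `values` holds the raw extraction values keyed by index, and `named` is the same data keyed by field name.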
It should be noted that the fields extracted by the regular expression can be divided into two types, "general fields" and "non-general fields", which together make up "all fields" of the system. A general field is a field that every log should have, whether or not the log matches a normalization policy, so it is also present in logs that have not been normalized; examples are fields that exist in all kinds of logs. General fields cannot be edited or deleted by the user, and in general they are also obtained through regular expression normalization. Non-general fields are editable fields that the operator can add and delete.
S207: and assigning the field extracted value based on a preset field assignment mode to obtain a normalized result.
It can be understood that, from the obtained field extraction value and the corresponding field name, the operator already knows the meaning of the field. However, the field extraction values obtained from different log sources (different manufacturers) and different index numbers are expressed in different forms. When multiple field extraction values are combined, identifying their meaning or quickly locating a particular field extraction value involves a heavy workload, so the field extraction values can be unified internally.
Therefore, in the embodiment of the specification, the field assignment mode can be preset, log information of different manufacturers can be mapped into an internal uniform value, and the universality and the usability of the normalization function are improved.
Specifically, the preset field assignment mode includes one or more of the following: direct assignment, mapping table assignment, formatted assignment, function assignment, and regular expression assignment. Direct assignment assigns the value corresponding to the index directly to the field. Mapping table assignment takes the value corresponding to the index as the original value of a mapping and assigns the mapped value to the designated field. Formatted assignment assigns the value corresponding to the index to the designated field in a specified format. Function assignment assigns the value corresponding to the index to the designated field after a function operation. Regular expression assignment assigns the value corresponding to the index to the designated field after a regular expression operation. An operator can therefore set an assignment mode for each log source type, or for each index number under the same log source type, to assign the field extraction values. For example, when assignment is performed through a preset field value mapping table, the field value mapping table can be created and edited.
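The five assignment modes can be sketched as small Python functions. The function names are hypothetical, the fallback to the original value when a mapping table has no entry is an assumption, and a Python format string stands in for the unspecified "specified format".

```python
import re
from typing import Callable, Dict, Optional


def assign_direct(value: str) -> str:
    """Direct assignment: the extracted value is used unchanged."""
    return value


def assign_mapping(value: str, table: Dict[str, str]) -> str:
    """Mapping table assignment; unmapped values pass through (assumption)."""
    return table.get(value, value)


def assign_formatted(value: str, fmt: str) -> str:
    """Formatted assignment, using a Python format string as the format."""
    return fmt.format(value)


def assign_function(value: str, func: Callable[[str], str]) -> str:
    """Function assignment: the value is passed through a function."""
    return func(value)


def assign_regex(value: str, pattern: str) -> Optional[str]:
    """Regular expression assignment: first group, or the whole match."""
    match = re.search(pattern, value)
    if match is None:
        return None
    return match.group(1) if match.re.groups else match.group(0)
```

For example, `assign_mapping("warn", {"warn": "medium"})` maps a vendor-specific severity to an internal uniform value, which is the purpose the text gives for presetting assignment modes.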
And a proper regular expression is selected according to different normalization strategies, so that the accuracy of field extraction is improved, and a large amount of data can be efficiently processed.
In actual work, the field data types and the assignment supporting modes can also be set correspondingly, as shown in table 1, the correspondence between different field data types and assignment supporting modes is shown:
Table 1 Correspondence between field data types and assignment modes
Field data type | Direct assignment | Mapping table assignment | Formatted assignment | Function assignment | Regular expression
string - -
double - -
int - -
time - -
ip -
When the time, double, int types are directly assigned, the processing can be performed according to the format specified by the corresponding field by default.
When field assignment is performed, the field assignment method generally includes a field extraction value list and a field assignment list, and the field assignment list has different display contents according to different assignment modes, which is specifically as follows:
when the field assignment mode is direct assignment, the field assignment list does not display assignment content, so that the field extraction value list can represent field value meaning.
And when the field assignment mode is the assignment of the mapping table, filling the assigned value in a corresponding list according to the associated mapping table.
When the field assignment mode is a formatting assignment, different formatting lists may be set according to corresponding field types, for example, when the field data type is double, a digital formatting list is listed, as shown in table 2, which is one representation of the digital formatting list:
Table 2 Digital formatting list
Format | Example
*#,### | 1,234
*#,###.* | 1,234.78
*#### | 1234
*####.* | 1234.67
Note: * indicates any number of digits
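One possible reading of the four formats in Table 2 is sketched below in Python: a comma in the pattern requests thousands separators, and a decimal point requests a fractional part. The function name, the fixed default of two fractional digits, and the rounding behavior of Python's `format` are assumptions; the table itself does not specify them.

```python
def format_number(value: float, pattern: str, decimals: int = 2) -> str:
    """Render a double according to one of the Table 2 style patterns."""
    grouped = "," in pattern        # "#,###" requests thousands separators
    with_fraction = "." in pattern  # ".*" requests a fractional part
    digits = decimals if with_fraction else 0
    spec = f",.{digits}f" if grouped else f".{digits}f"
    return format(value, spec)


# The four patterns from Table 2 applied to sample values.
examples = [
    format_number(1234.0, "*#,###"),
    format_number(1234.78, "*#,###.*"),
    format_number(1234.0, "*####"),
    format_number(1234.67, "*####.*"),
]
```

With these inputs the sketch yields the same strings as the example column of the table.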
When the field data type is time, a time formatting list is listed, as shown in table 3, which is a representation of the time formatting list:
TABLE 3 time formatted List
[Table 3: time formatting list, provided as images in the original publication]
When the field assignment mode is function assignment, the field assignment list mode displays a function list allowing selection of the selected field, for example, when the data type is string, the list content is assignment realized based on base64 decoding; when the data type is ip, the list content is assigned based on decimal to IPv4, hexadecimal to IPv4, binary to IPv4, and the like.
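The selectable functions named above can be sketched with the Python standard library. The function names are hypothetical, and it is assumed that the extracted value arrives as a string and that base64 payloads decode to UTF-8 text.

```python
import base64
import ipaddress


def decode_base64(value: str) -> str:
    """String function: base64-decode the value (UTF-8 text assumed)."""
    return base64.b64decode(value).decode("utf-8")


def decimal_to_ipv4(value: str) -> str:
    """IP function: decimal integer string to dotted-quad IPv4."""
    return str(ipaddress.IPv4Address(int(value, 10)))


def hex_to_ipv4(value: str) -> str:
    """IP function: hexadecimal string to dotted-quad IPv4."""
    return str(ipaddress.IPv4Address(int(value, 16)))


def binary_to_ipv4(value: str) -> str:
    """IP function: 32-bit binary string to dotted-quad IPv4."""
    return str(ipaddress.IPv4Address(int(value, 2)))
```

For instance, `decimal_to_ipv4("3232235777")` and `hex_to_ipv4("C0A80101")` both yield `192.168.1.1`.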
When the field assignment mode is regular expression assignment, an editable text box is displayed for the user to enter the regular expression, together with a button for testing it. Clicking the test button shows the test result in a dialog box: if the test passes, a pass prompt and the extracted value are shown; if the test fails, an error prompt and the reason are shown, and the operator can then refine the regular expression according to the reason for the failure.
It should be noted that the above is only one way of establishing a preset normalization strategy. Multiple sets of preset normalization strategies can be established by repeating the above procedure to form the preset normalization strategy library, and other ways of establishing a corresponding normalization strategy also fall within the scope of the present application.
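The test step for a user-entered regular expression might look like the sketch below. It is an illustration under assumptions: the function name `check_regex`, the result dictionary, and the rule of returning the first capture group (or the whole match when there is none) are not taken from the specification.

```python
import re


def check_regex(pattern: str, sample: str) -> dict:
    """Test a user-supplied pattern against a sample log line."""
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        # The pattern itself is malformed: report the reason to the operator.
        return {"passed": False, "reason": f"invalid regular expression: {exc}"}
    match = compiled.search(sample)
    if match is None:
        return {"passed": False, "reason": "pattern did not match the sample"}
    # Assumption: the first capture group, if any, is the value to extract.
    value = match.group(1) if compiled.groups else match.group(0)
    return {"passed": True, "value": value}


print(check_regex(r"level=(\d+)", "level=3 msg=link down"))
print(check_regex("(", "level=3"))
```

The failure reason returned here plays the role of the error prompt, giving the operator something concrete to act on when refining the expression.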
S105: Based on the current preset normalization strategy, perform normalization processing on the log information to be normalized to obtain a current normalization result.
Selecting the current preset normalization strategy also determines the corresponding regular expression. The log information to be normalized is matched against the regular expression of the current preset normalization strategy to obtain the field extraction values, and the normalization result, that is, the internal uniform value of each field, is then obtained through the assignment mode.
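Step S105 can be sketched as a regular expression that extracts field values followed by a per-field assignment. The sketch below is a minimal illustration; the `Strategy` class, the use of named groups as field names, the sample log format, and the level mapping are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Strategy:
    """Hypothetical strategy: one regular expression plus optional per-field assigners."""
    name: str
    pattern: str
    assigners: Dict[str, Callable[[str], object]] = field(default_factory=dict)


def normalize(log_line: str, strategy: Strategy) -> Optional[dict]:
    """Extract field values with the strategy's regex, then assign internal uniform values."""
    match = re.search(strategy.pattern, log_line)
    if match is None:
        return None
    result = {}
    for field_name, raw in match.groupdict().items():
        assign = strategy.assigners.get(field_name)
        # No assigner corresponds to direct assignment: the extracted value is kept.
        result[field_name] = assign(raw) if assign else raw
    return result


LEVEL_MAP = {"fatal": 0, "high": 1, "medium": 2, "low": 3}  # mapping table assignment
firewall = Strategy(
    name="firewall",
    pattern=r"src=(?P<src_ip>\S+) level=(?P<level>\w+)",
    assigners={"level": LEVEL_MAP.get},
)
print(normalize("src=10.0.0.5 level=high", firewall))
```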
S107: Select the preset normalization strategies in turn, according to the preset order, as the current preset normalization strategy, and repeat the step of performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result.
The current preset normalization strategy yields one set of normalization results. To obtain all normalization results, or the normalization results of a specific log source as required, the preset normalization strategies can be selected from the preset normalization strategy library one after another as the current preset normalization strategy. The selection may follow the preset order or, in some other embodiments, may be random. Each preset normalization strategy normalizes the log sample to be normalized in the same way as described above, which is not repeated here. The embodiments of this specification thus provide a general and easy-to-use normalization scheme in which different normalization strategies and assignment modes can be set automatically according to user needs, meeting the requirements of different users.
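The repetition in step S107 amounts to a loop over the strategy library in its preset order. The sketch below assumes, for illustration only, that the library is an ordered list of (name, regular expression) pairs and that strategies which do not match the log line contribute no result; neither detail is specified in the text.

```python
import re
from typing import Dict, List, Tuple


def run_all(log_line: str, library: List[Tuple[str, str]]) -> Dict[str, dict]:
    """Apply every strategy in the library, in list order, to one log line."""
    results = {}
    for name, pattern in library:  # list order stands in for the preset sequence
        match = re.search(pattern, log_line)
        if match is not None:
            results[name] = match.groupdict()
    return results


library = [
    ("firewall", r"src=(?P<src_ip>\S+)"),
    ("auth", r"user=(?P<user>\w+)"),
]
print(run_all("src=10.0.0.5 action=deny", library))
```

Random selection, mentioned as an alternative embodiment, could be obtained by shuffling a copy of the library before the loop.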
On the basis of the foregoing embodiments, in some possible embodiments, the general log-normalized model configuration method may further include the following steps:
establishing a data dictionary. The data dictionary records the relationship between the values stored by the system and their descriptions; its function is to convert internally stored values into descriptive language that users can understand.
For example, Table 1 below is a dictionary for alarm levels:
TABLE 1 Alarm dictionary
Stored value Description
0 Fatal
1 High
2 Medium
3 Low
It should be noted that the storage value in the data dictionary may be a field extraction value, or may be an internal uniform value obtained by assigning the field extraction value.
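A data dictionary of this kind is essentially a lookup from stored value to description. The following minimal sketch uses the alarm levels of Table 1 with English labels chosen for this illustration; the name `describe` and the fallback of returning the stored value as text when no entry exists are assumptions.

```python
ALARM_LEVELS = {0: "Fatal", 1: "High", 2: "Medium", 3: "Low"}


def describe(stored_value: int, dictionary: dict) -> str:
    """Convert an internally stored value into its user-facing description."""
    # Assumption: values without a dictionary entry are shown unchanged.
    return dictionary.get(stored_value, str(stored_value))


print(describe(1, ALARM_LEVELS))
print(describe(7, ALARM_LEVELS))
```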
Specifically, different controls may be provided depending on whether the field value is associated with a data dictionary. When an association exists, the dictionary table value may be selected from a drop-down list. When no associated data dictionary has been set, the association may be established by supplementing the data dictionary, or a text input box may be displayed so that a data dictionary entry can be entered directly.
An embodiment of the present invention further provides a general log normalization model configuration device, as shown in fig. 5, where the device includes:
the log information acquisition module is used for acquiring log information to be normalized;
the current preset normalization strategy determining module is used for selecting preset normalization strategies from a preset normalization strategy library according to a preset sequence and taking the preset normalization strategies as current preset normalization strategies;
the current normalization result generation module is used for performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result;
and the normalization result generation module is used for sequentially selecting preset normalization strategies as the current preset normalization strategy according to a preset sequence, and repeating the step of performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result.
Further, the apparatus further comprises:
the preset normalization strategy library establishing module is used for establishing a preset normalization strategy library, wherein the preset normalization strategy library comprises a plurality of groups of preset normalization strategies;
and the data dictionary establishing module is used for establishing a data dictionary.
Further, the preset normalization policy library establishing module includes:
the log type determining unit is used for determining the log type to be normalized based on the log information to be normalized;
the regular expression determining unit is used for acquiring a regular expression matched with the log type to be normalized based on the log type to be normalized;
the field extraction unit is used for performing normalization processing on the log information to be normalized through the regular expression to obtain a field extraction value;
and the normalized result acquisition unit is used for assigning the field extracted value based on a preset field assignment mode to obtain a normalized result.
It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
In a specific embodiment, fig. 6 shows a schematic structural diagram of an electronic device provided in an embodiment of the present invention. The electronic device 800 may include components such as a memory 810 comprising one or more computer-readable storage media, a processor 820 comprising one or more processing cores, an input unit 830, a display unit 840, Radio Frequency (RF) circuitry 850, a wireless fidelity (WiFi) module 860, and a power supply 870. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 6 does not constitute a limitation of the electronic device 800, which may include more or fewer components than shown, combine some components, or arrange the components differently. Wherein:
The memory 810 may be used to store software programs and modules, and the processor 820 executes various functional applications and data processing by running the software programs and modules stored in the memory 810 and calling data stored in the memory 810. The memory 810 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like; the data storage area may store data created according to use of the electronic device, and the like. Further, the memory 810 may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device. Accordingly, memory 810 may also include a memory controller to provide processor 820 with access to memory 810.
The processor 820 is the control center of the electronic device 800. It connects the various parts of the whole electronic device by using various interfaces and lines, and performs the various functions of the electronic device 800 and processes data by running or executing the software programs and/or modules stored in the memory 810 and calling data stored in the memory 810, thereby performing overall monitoring of the electronic device 800. The processor 820 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The input unit 830 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. Specifically, the input unit 830 may include an image input device 831 and other input devices 832. The image input device 831 may be a camera or a photoelectric scanning device. The input unit 830 may include other input devices 832 in addition to the image input device 831. In particular, other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 840 may be used to display information input by or provided to a user and various graphical user interfaces of an electronic device, which may be made up of graphics, text, icons, video, and any combination thereof. The Display unit 840 may include a Display panel 841, and the Display panel 841 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like, as an option.
The RF circuit 850 may be used for receiving and transmitting signals during a message transmission or communication process, and in particular, for receiving downlink messages from a base station and then processing the received downlink messages by the one or more processors 820; in addition, data relating to uplink is transmitted to the base station. In general, the RF circuitry 850 includes, but is not limited to, an antenna, at least one Amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 850 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
WiFi belongs to short-range wireless transmission technology, and the electronic device 800 can help the user send and receive e-mails, browse web pages, access streaming media, etc. through the WiFi module 860, and it provides the user with wireless broadband internet access. Although fig. 6 shows WiFi module 860, it is understood that it does not belong to the essential components of electronic device 800, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The electronic device 800 also includes a power supply 870 (e.g., a battery) for powering the various components, which may be logically coupled to the processor 820 via a power management system to manage charging, discharging, and power consumption via the power management system. The power source 870 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
It should be noted that, although not shown, the electronic device 800 may further include a bluetooth module, and the like, which is not described herein again.
An embodiment of the present invention further provides a storage medium, as shown in fig. 7, where at least one instruction, at least one program, a code set, or an instruction set is stored in the storage medium, and the at least one instruction, the at least one program, the code set, or the instruction set is executable by a processor of an electronic device to implement any one of the above-mentioned methods for configuring a generic log-normalized model.
Optionally, in an embodiment of the present invention, the storage medium may include, but is not limited to, various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus, the electronic device and the storage medium embodiment, since they are substantially similar to the method embodiment, the description is relatively simple, and the relevant points can be referred to the partial description of the method embodiment.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A general log normalization model configuration method is characterized by comprising the following steps:
acquiring log information to be normalized;
selecting a preset normalization strategy from a preset normalization strategy library according to a preset sequence, and taking the preset normalization strategy as a current preset normalization strategy;
based on the current preset normalization strategy, performing normalization processing on the log information to be normalized to obtain a current normalization result;
sequentially selecting preset normalization strategies as the current preset normalization strategy according to a preset sequence, and repeating the step of performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result.
2. The method for configuring a universal log normalization model according to claim 1, wherein, before the selecting a preset normalization strategy from a preset normalization strategy library according to a preset sequence and taking the preset normalization strategy as a current preset normalization strategy, the method further comprises:
establishing a preset normalization strategy library, wherein the preset normalization strategy library comprises a plurality of groups of preset normalization strategies.
3. The method for configuring a universal log normalization model according to claim 2, wherein the establishing a preset normalization strategy library comprises:
determining the type of the log to be normalized based on the log information to be normalized;
acquiring a regular expression matched with the log type to be normalized based on the log type to be normalized;
performing normalization processing on the log information to be normalized through the regular expression to obtain a field extraction value;
and assigning the field extracted value based on a preset field assignment mode to obtain a normalized result.
4. The method for configuring the general log normalization model according to claim 3, wherein the normalizing the log information to be normalized by the regular expression to obtain a field extraction value comprises:
acquiring a field index based on the regular expression;
and performing normalization processing on the log information to be normalized through the regular expression based on the field index to obtain a field extraction value, wherein the field extraction value corresponds to the field index.
5. The method as claimed in claim 3, wherein the preset field assignment manner includes one or more of the following: direct assignment, mapping table assignment, formatted assignment, function assignment, and regular expression.
6. The method of claim 1, wherein the preset normalization strategy library includes at least one predefined strategy management group and at least one custom strategy management group.
7. The method of claim 1, wherein the method further comprises:
and establishing a data dictionary.
8. A generic log normalization model configuration apparatus, the apparatus comprising:
the log information acquisition module is used for acquiring log information to be normalized;
the current preset normalization strategy determining module is used for selecting preset normalization strategies from a preset normalization strategy library according to a preset sequence and taking the preset normalization strategies as current preset normalization strategies;
the current normalization result generation module is used for performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result;
and the normalization result generation module is used for sequentially selecting preset normalization strategies as the current preset normalization strategy according to a preset sequence, and repeating the step of performing normalization processing on the log information to be normalized based on the current preset normalization strategy to obtain a current normalization result.
9. The apparatus of claim 8, wherein the apparatus further comprises:
the preset normalization strategy library establishing module is used for establishing a preset normalization strategy library, wherein the preset normalization strategy library comprises a plurality of groups of preset normalization strategies;
and the data dictionary establishing module is used for establishing a data dictionary.
10. The apparatus of claim 8, wherein the preset normalization strategy library establishing module comprises:
the log type determining unit is used for determining the log type to be normalized based on the log information to be normalized;
the regular expression determining unit is used for acquiring a regular expression matched with the log type to be normalized based on the log type to be normalized;
the field extraction unit is used for performing normalization processing on the log information to be normalized through the regular expression to obtain a field extraction value;
and the normalized result acquisition unit is used for assigning the field extracted value based on a preset field assignment mode to obtain a normalized result.
CN202010828346.6A 2020-08-18 2020-08-18 Universal log normalization model configuration method and device Active CN111966641B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010828346.6A CN111966641B (en) 2020-08-18 2020-08-18 Universal log normalization model configuration method and device

Publications (2)

Publication Number Publication Date
CN111966641A true CN111966641A (en) 2020-11-20
CN111966641B CN111966641B (en) 2022-12-06

Family

ID=73389208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010828346.6A Active CN111966641B (en) 2020-08-18 2020-08-18 Universal log normalization model configuration method and device

Country Status (1)

Country Link
CN (1) CN111966641B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant