CN113760849A - Log processing method, system, electronic device and computer readable storage medium - Google Patents

Info

Publication number
CN113760849A
Authority
CN
China
Prior art keywords
log
log data
data
query
clickhouse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111323495.8A
Other languages
Chinese (zh)
Other versions
CN113760849B (en)
Inventor
林存练
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Mingyuan Cloud Technology Co Ltd
Original Assignee
Shenzhen Mingyuan Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Mingyuan Cloud Technology Co Ltd
Priority to CN202111323495.8A
Publication of CN113760849A
Application granted
Publication of CN113760849B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a log processing method, a system, an electronic device and a computer-readable storage medium. The log processing method comprises: obtaining log data, and writing the log data into a clickhouse cluster according to a table blood relationship preset in the clickhouse cluster, to obtain a log list table; and, when a query request sent by a query end is received, querying the clickhouse cluster for the log list table corresponding to the query request. The invention addresses the technical problem of slow log queries in the prior art.

Description

Log processing method, system, electronic device and computer readable storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a log processing method and system, an electronic device, and a computer-readable storage medium.
Background
Log data is data generated by a computer during operation. It describes items such as network devices, service programs, dates, times, users and actions. Logs can be collected and analyzed, and the state of a system at a given moment can be determined by inspecting them, which provides a basis for optimizing the service system, preventing network security problems, and so on.
In the big data era, as services develop, the log data generated by each service is growing explosively, so how to store and quickly query huge amounts of log data has become a technical problem to be solved.
Disclosure of Invention
The main object of the invention is to provide a log processing method, a log processing system, an electronic device and a computer-readable storage medium, so as to solve the technical problem of how to improve log query speed.
In order to achieve the above object, the present invention provides a log processing method, which includes:
obtaining log data, and writing the log data into a clickhouse cluster according to a tenant identification of the log data to obtain a log list table;
and if a query request sent by a query end is received, querying a log list table corresponding to the query request in the clickhouse cluster.
Optionally, the clickhouse cluster includes a plurality of log libraries, and the step of writing the log data into the clickhouse cluster according to the tenant identifier of the log data to obtain a log list table includes:
if the log data carries a plurality of tenant identifications, storing the log data with the same tenant identification in the same clickhouse cluster;
determining the product categories corresponding to all log data in the same clickhouse cluster;
and if log data of the same product category exists, storing the log data corresponding to the same product category in the same log library to obtain a log list table.
Optionally, the step of storing the log data corresponding to the same product category in the same log library to obtain a log list table includes:
establishing a table blood relationship structure in the log library;
and according to the table blood relationship structure, inputting the log data corresponding to the same product category into a log library to form a log list table.
Optionally, the table blood relationship structure includes a business base table structure and a materialized view table structure, and the step of inputting the log data corresponding to the same product category into a log library to form a log list table according to the table blood relationship structure includes:
inputting the log data corresponding to the same product category into the business base table structure in the log library to obtain a service list table;
and inputting the log data in the service list table that meets preset scene conditions into the materialized view table structure to obtain a materialized view table, wherein the log list table comprises the service list table and the materialized view table.
Optionally, the step of obtaining log data includes:
acquiring initial log data, and performing structure conversion on the initial log data according to the table blood relationship structure through a pulsar cluster;
and taking the initial log data after the structure conversion as the log data.
Optionally, the step of querying, in the clickhouse cluster, a log list table corresponding to the query request includes:
acquiring tenant information in the query request, and determining a matched clickhouse cluster matched with the tenant information;
and according to the scene information in the query request, obtaining a log list table corresponding to the scene information in the matched clickhouse cluster.
Optionally, the step of obtaining, according to the scene information in the query request, a log list table corresponding to the scene information in the matching clickhouse cluster includes:
obtaining product information in the query request, and determining a matching log library matched with the product information in the matching clickhouse cluster;
and acquiring a log list table corresponding to the scene information in the matching log library according to the scene information in the query request.
In addition, to achieve the above object, the present invention further provides a log processing system, which includes a log storage end and a log query end, wherein,
the log storage end is used for acquiring log data, and writing the log data into a clickhouse cluster according to the tenant identification of the log data to acquire a log list table;
the log storage end is used for querying a log list table corresponding to a query request in the clickhouse cluster if the query request sent by the query end is received;
and the log query end is used for sending a query request to the log storage end and receiving a log list table which is returned by the log storage end and corresponds to the query request.
In addition, to achieve the above object, the present invention also provides an electronic device including a memory, a processor, and a program that is stored on the memory, is executable on the processor, and implements the log processing method; when executed by the processor, the program implements the steps of the log processing method described above.
Further, to achieve the above object, the present application also provides a computer-readable storage medium having stored thereon a program for implementing a log processing method, which when executed by a processor implements the steps of the log processing method as described above.
The invention provides a log processing method, a log processing system, an electronic device and a computer-readable storage medium. Log data is obtained and written into a clickhouse cluster according to the tenant identification of the log data to obtain a log list table, and if a query request sent by a query end is received, the log list table corresponding to the query request is queried in the clickhouse cluster. In this method, the clickhouse cluster serves as the storage database for logs, and the log data is stored in isolation in the clickhouse cluster according to its tenant identification to form log list tables; the single-table query performance of clickhouse greatly improves the query speed for log data.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of an apparatus architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a log processing method according to a first embodiment of the present invention;
FIG. 3 is a schematic diagram of an application scenario in the log processing method of the present invention;
FIG. 4 is a schematic diagram of the clickhouse data table blood relationship design in the log processing method of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when technical solutions contradict each other or cannot be realized, such a combination should be considered not to exist and is not within the protection scope of the present invention.
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the device may be an electronic device, which may include a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 enables connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration of the apparatus shown in fig. 1 is not intended to be limiting of the apparatus and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a computer program.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to invoke the computer program stored in the memory 1005 and perform the following operations:
obtaining log data, and writing the log data into a clickhouse cluster according to a tenant identification of the log data to obtain a log list table;
and if a query request sent by a query end is received, querying a log list table corresponding to the query request in the clickhouse cluster.
Based on the hardware structure, the invention provides various embodiments of the log processing method.
The invention provides a log processing method.
Referring to fig. 2, fig. 2 is a flowchart illustrating a log processing method according to a first embodiment of the present invention.
In this embodiment, the log processing method includes the steps of:
step S10, obtaining log data, writing the log data into a clickhouse cluster according to the tenant identification of the log data, and obtaining a log list table;
In this embodiment, a clickhouse cluster is used as the storage database for logs, and log data is stored in the clickhouse cluster. clickhouse is a column-oriented database management system (DBMS) for online analytical processing (OLAP). First, log data is obtained and, according to the tenant identification of the log data, written into the corresponding clickhouse cluster to obtain a log list table. The log list table comprises business base tables and materialized view tables. A business base table is a base table that is defined according to specific business characteristics and holds the associated log data. A materialized view table is a materialized view built on a business base table according to a business scene; a materialized view is a database object that contains the result of a query and can be used to generate summary tables that aggregate the data of a table, and clickhouse supports building materialized views. The preset table blood relationship (that is, the lineage between data tables) is a clickhouse data table blood relationship designed in advance according to the specific business scenes. In this embodiment it covers all log list tables under the same product of the same tenant; that is, all log list tables corresponding to the same product of the same tenant are stored in one log library. The clickhouse data table blood relationship is shown in fig. 4.
In this embodiment, because clickhouse queries wide tables quickly, the log data is ultimately stored in clickhouse as wide tables organized by the table blood relationship structure, which greatly improves log query speed.
Step S20, if receiving a query request sent by a query end, querying a log list table corresponding to the query request in the clickhouse cluster.
In this embodiment, if a query request sent by a query end is received, the log list table corresponding to the query request is queried in the clickhouse cluster; specifically, a business base table and/or a materialized view table in clickhouse can be queried according to the query request. Because the log data is stored in wide-table form according to the preset table blood relationship, the query end can query the log list table within the wide table according to the query request, which improves log query efficiency. The query end in this embodiment may be a service query analysis system. The clickhouse cluster supports SQL (Structured Query Language) queries: when a log query is needed, the service query analysis system initiates a query request, and a log query service (logquery) converts the request into SQL, which is used to query the log data stored in the clickhouse cluster and retrieve the log list table corresponding to the query request. In this embodiment, query requests may also be classified at a BFF layer (Backend for Frontend, also called an aggregation layer or adaptation layer); specifically, the BFF aggregation layer classifies query requests by service type, for example into service types such as user, behavior anomaly alarm, and the like.
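The conversion of a query request into SQL by the log query service can be illustrated with a minimal Python sketch. The patent does not describe the request format or the generated SQL, so the request fields (`table`, `filters`), the whitelists, and the `%(name)s` placeholder style are all assumptions made for illustration; a real client library may expect a different placeholder syntax.

```python
# Hypothetical whitelists: only known tables and columns may appear in the SQL text.
ALLOWED_TABLES = {"front_end_report_detail", "front_end_error_detail"}
ALLOWED_COLUMNS = {"level", "page", "user_id"}


def build_query(request, limit=100):
    """Turn a query request (a dict) into an SQL string plus named parameters."""
    table = request["table"]
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table}")

    clauses = []
    params = {}
    # Sort the filters so that the generated SQL is deterministic.
    for column, value in sorted(request.get("filters", {}).items()):
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
        clauses.append(f"{column} = %({column})s")  # value is passed separately
        params[column] = value

    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = f"SELECT * FROM {table}{where} LIMIT {int(limit)}"
    return sql, params


sql, params = build_query(
    {"table": "front_end_error_detail", "filters": {"page": "/pay", "level": "error"}}
)
```

Keeping the values in `params` rather than in the SQL text, and checking table and column names against whitelists, is one way to avoid SQL injection when requests originate from an external query end.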
Further, in step S10, the step of writing the log data into a clickhouse cluster according to the tenant identifier of the log data includes:
step A, if a plurality of tenant identifications of the log data exist, storing the log data with the same tenant identification to the same clickhouse cluster;
In this embodiment, log data is stored with tenant isolation based on clickhouse. A tenant identification is identification information that distinguishes tenants, and each tenant has a unique ID. If the log data carries a plurality of tenant identifications, log data with the same tenant identification, that is, log data of the same tenant, is stored in the same clickhouse cluster. Log data of different tenants is therefore stored separately, so that physical resources are isolated from one another and the services of different tenants do not affect one another.
In this embodiment, referring to fig. 3, log data is collected from each client through a collection interface (application sdk) and then reported to a pulsar cluster through an agent. When reporting, the agent reads the tenant identification of the log data to determine the tenant to which the log data belongs, and writes log data with the same tenant identification into the execution stream of that tenant, so that log data of the same tenant is written into the same clickhouse cluster. For example, as shown in fig. 3, log data whose tenant identification is tenant 1 is stored in the tenant 1 clickhouse cluster, and log data whose tenant identification is tenant 2 is stored in the tenant 2 clickhouse cluster. The collection interface may be an application sdk (software development kit) for collecting log data. In this embodiment, log data is stored in isolation in the corresponding clickhouse cluster according to the tenant identification, with one tenant per clickhouse cluster, so that physical resources are isolated from one another and services do not affect one another.
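The per-tenant routing described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the record field `tenant_id`, the mapping `TENANT_CLUSTERS` and the cluster names are hypothetical and do not come from the patent.

```python
from collections import defaultdict

# Hypothetical mapping from tenant identification to that tenant's clickhouse cluster.
TENANT_CLUSTERS = {
    "tenant1": "clickhouse-tenant1",
    "tenant2": "clickhouse-tenant2",
}


def route_by_tenant(records):
    """Group log records by the clickhouse cluster of the tenant they belong to."""
    batches = defaultdict(list)
    for record in records:
        cluster = TENANT_CLUSTERS[record["tenant_id"]]  # KeyError for an unknown tenant
        batches[cluster].append(record)
    return dict(batches)


logs = [
    {"tenant_id": "tenant1", "msg": "page loaded"},
    {"tenant_id": "tenant2", "msg": "login failed"},
    {"tenant_id": "tenant1", "msg": "order created"},
]
batches = route_by_tenant(logs)
```

Each value in `batches` would then be written to its own cluster, so that no record of one tenant ever lands in the cluster of another.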
Step B, determining product categories corresponding to all log data in the same clickhouse cluster;
and step C, if log data of the same product category exists, storing the log data corresponding to the same product category in the same log library to obtain a log list table.
In this embodiment, one clickhouse cluster includes a plurality of log libraries. All log data in a given clickhouse cluster belongs to the same tenant, and a tenant may have several product categories. The product category of each piece of log data of the tenant is determined, and log data of the same product category is stored in the same log library to obtain a log list table. That is, the log data of different products under the same tenant is stored in different log libraries, so that the products are logically isolated from one another by log library. A log library is the unit of collection, storage and query of log data in the log service.
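Within one tenant's cluster, the split into log libraries by product category might look like the following Python sketch. The `product` field and the `logs_<product>` naming scheme are assumptions chosen for illustration.

```python
from collections import defaultdict


def split_into_libraries(cluster_records):
    """Assign each record of one tenant's cluster to the log library of its product."""
    libraries = defaultdict(list)
    for record in cluster_records:
        library_name = f"logs_{record['product']}"  # hypothetical naming scheme
        libraries[library_name].append(record)
    return dict(libraries)


tenant1_records = [
    {"product": "crm", "msg": "contact saved"},
    {"product": "erp", "msg": "invoice posted"},
    {"product": "crm", "msg": "contact deleted"},
]
libraries = split_into_libraries(tenant1_records)
```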
Further, in the step C, the log data corresponding to the same product category is stored in the same log library, and the step of obtaining the log list table includes:
step c1, establishing a table blood relationship structure in the log library;
and step c2, inputting, according to the table blood relationship structure, the log data corresponding to the same product category into the log library to form a log list table.
In this embodiment, a table blood relationship structure is established in the clickhouse cluster according to specific service characteristics. One log library corresponds to one set of table blood relationship structures; that is, one log library holds all the log data of the same product of the same tenant, and all of that log data is stored as log list tables within the table blood relationship structure.
In this embodiment, the log data is written into the clickhouse cluster through the output layer according to the table blood relationship preset in the clickhouse cluster, yielding the log list table. The same product under the same tenant corresponds to one set of table blood relationships, and one set of table blood relationships corresponds to one log library. The log library contains all the business base tables and materialized view tables related to the product. Because the log data is stored in the manner "tenant → log library → table blood relationship structure", wide tables as shown in fig. 4 can be formed; the log wide table corresponding to a given service scene is located first, and a specific log list table is then queried from that wide table. Log data in the same clickhouse cluster is classified by product category and stored in different log libraries of that cluster, and the classified log data is written into the preset table blood relationship template. The log list table comprises a service list table and a materialized view table.
Further, in the step c2, the step of inputting the log data corresponding to the same product category into a log library to form a log list table according to the table blood relationship structure includes:
step c21, inputting the log data corresponding to the same product category into the business base table structure in the log library to obtain a service list table;
and step c22, inputting the log data in the service list table that meets the preset scene conditions into the materialized view table structure to obtain a materialized view table, wherein the log list table comprises the service list table and the materialized view table.
In this embodiment, the table blood relationship structure includes a business base table structure and a materialized view table structure, and the log list table includes a service list table and a materialized view table. Log data corresponding to the same product category is input into the business base table structure in the log library to obtain a service list table, and log data in the service list table that satisfies a preset scene condition is input into the materialized view table structure to obtain a materialized view table. The preset scene condition is a condition set in advance for a specific service scene. The preset table blood relationship may be established in the clickhouse cluster as follows: a business base table is established according to service characteristics, and a materialized view table is established on the business base table according to a specific service scene; that is, a preset condition is defined on the base table in advance, and log data in the business base table that meets the preset condition automatically flows into the materialized view table. As shown in fig. 4, the front-end report detail table, the user table, the back-end report detail table, the front-end page resource loading and the link relation table are business base tables; the full-link session I page list, the front-end statistical table, the front-end error detail table, the front-end error statistical table and the front-end user table are materialized view tables generated from the front-end report detail table; and materialized view tables 1 to 6 are materialized view tables generated from the back-end report detail table. Referring to fig. 4, when log data in the back-end report detail table satisfies the preset scene condition of materialized view table 1, that log data automatically flows into materialized view table 1.
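The "automatic flow" from a business base table into its materialized view tables can be mimicked in memory with the Python sketch below. It is a toy model, not clickhouse itself: the class `BaseTable`, the table names and the `level` field are hypothetical, and each preset scene condition is represented by a plain predicate function.

```python
class BaseTable:
    """In-memory stand-in for a business base table with attached view tables."""

    def __init__(self, name):
        self.name = name
        self.rows = []
        self.views = []  # list of (view_name, predicate, view_rows)

    def attach_view(self, view_name, predicate):
        """Register a view table that receives rows satisfying the predicate."""
        view_rows = []
        self.views.append((view_name, predicate, view_rows))
        return view_rows

    def insert(self, row):
        """Store the row, then copy it into every view whose condition it meets."""
        self.rows.append(row)
        for _view_name, predicate, view_rows in self.views:
            if predicate(row):
                view_rows.append(row)


front_end = BaseTable("front_end_report_detail")
error_detail = front_end.attach_view(
    "front_end_error_detail", lambda row: row["level"] == "error"
)
front_end.insert({"level": "info", "page": "/home"})
front_end.insert({"level": "error", "page": "/pay"})
```

As in this model, a materialized view in clickhouse is populated at insert time, so rows inserted into the base table before the view was attached are not picked up unless they are loaded separately.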
In this embodiment, a plurality of materialized view tables may be generated from one base table, or one materialized view table may be generated from a plurality of base tables. Because clickhouse supports materialized views natively, a large amount of secondary development work is avoided, and several materialized view tables suited to the service scenes can be built on a single base table. The table blood relationship structure is established in clickhouse without redevelopment, and the log data is classified, which improves log query efficiency.
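For illustration, the Python sketch below renders the kind of DDL statements with which a base table and a materialized view on it could be declared in clickhouse. The patent gives no schema, so the database, table and column names, the `MergeTree` engine choice and the `ORDER BY` keys are assumptions, and the generated statements have not been run against a server. Identifiers are interpolated directly, so the helpers are only suitable for trusted, hard-coded names.

```python
def base_table_ddl(database, table, columns, order_by):
    """Render a CREATE TABLE statement for a business base table."""
    column_sql = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE TABLE {database}.{table} ({column_sql}) "
        f"ENGINE = MergeTree ORDER BY ({', '.join(order_by)})"
    )


def materialized_view_ddl(database, view, source, condition, order_by):
    """Render a CREATE MATERIALIZED VIEW statement that filters the base table."""
    return (
        f"CREATE MATERIALIZED VIEW {database}.{view} "
        f"ENGINE = MergeTree ORDER BY ({', '.join(order_by)}) "
        f"AS SELECT * FROM {database}.{source} WHERE {condition}"
    )


# Hypothetical log library "logs_product_a" with one base table and one view.
columns = [("event_time", "DateTime"), ("level", "String"), ("page", "String")]
base_sql = base_table_ddl(
    "logs_product_a", "front_end_report_detail", columns, ["event_time"]
)
view_sql = materialized_view_ddl(
    "logs_product_a",
    "front_end_error_detail",
    "front_end_report_detail",
    "level = 'error'",
    ["event_time"],
)
```

Calling `materialized_view_ddl` several times with different conditions on the same `source` corresponds to the one-base-table, many-views case described above.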
Further, in the step S10, the step of acquiring the log data includes:
step c23, collecting initial log data, and performing structure conversion on the initial log data according to the table blood relationship structure through a pulsar cluster;
and step c24, taking the initial log data after the structure conversion as the log data.
In this embodiment, the initial log data may be collected through the collection interface and reported to the pulsar cluster through the agent. The pulsar cluster performs structure conversion on the collected log data, converting it into the data structure corresponding to the preset table blood relationship, and the converted log data is then written into the clickhouse cluster through the output layer (output) to obtain the log list table. Referring to fig. 3, the specific process of obtaining log data in this embodiment is: the collection interface (application sdk) collects the original log data; the agent identifies the tenant identification of the original log data and writes the original log data into the pulsar cluster; and the structure-converted log data is then exported from the pulsar cluster through the output end. The collection interface may be an sdk (software development kit) or an API (Application Programming Interface), or the original log data may be collected directly by the agent; the original log data is the raw log data collected from the client. pulsar is a distributed publish/subscribe messaging platform. In this embodiment, the original log data is written into the pulsar cluster and converted by pulsar into the table blood relationship structure, and the output layer then writes the converted log data into the clickhouse cluster, which facilitates subsequent fast queries. In addition, temporarily storing the original log data in the pulsar cluster in this way reduces the computation pressure on the downstream clickhouse. Thanks to pulsar's message persistence, log data is not lost when any region fails or goes down.
In this embodiment, original log data is collected through an sdk, an API or an agent; this range of collection methods makes it convenient for users to gather log data into clickhouse for centralized management. The efficient data import capability of clickhouse solves the problem of quickly storing high-volume logs: retrieval over hundreds of millions of log entries returns results within seconds, and log analysis can aggregate hundreds of millions of log entries per second, so that users can conveniently and quickly analyze and process massive log data.
As another embodiment, log analysis may also be implemented with Presto. Presto is a distributed SQL (Structured Query Language) query engine for querying large data sets distributed across one or more heterogeneous data sources. Presto executes queries in a distributed manner, but because it computes in memory, joining several large tables can easily cause memory overflow.
The invention provides a log processing method, a log processing system, an electronic device and a computer-readable storage medium. Log data is obtained and written into a clickhouse cluster according to the tenant identification of the log data to obtain a log list table, and if a query request sent by a query end is received, the log list table corresponding to the query request is queried in the clickhouse cluster. In this method, the clickhouse cluster serves as the storage database for logs, and the log data is stored in isolation in the clickhouse cluster according to its tenant identification to form log list tables; the single-table query performance of clickhouse greatly improves the query speed for log data. The method also enables fast import of log data into clickhouse, and storage benefits from clickhouse's high compression ratio, which saves storage cost and improves the query speed for log data. Furthermore, clickhouse's column-oriented storage allows massive log data to be queried quickly and efficiently.
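The structure conversion step can be pictured as mapping each raw log message onto the fixed columns of a base table. The Python sketch below assumes, purely for illustration, that raw logs arrive as JSON objects and that the base table has the five columns listed in `BASE_TABLE_COLUMNS`; neither assumption comes from the patent, and the code does not use any pulsar API.

```python
import json

# Hypothetical column layout of a business base table.
BASE_TABLE_COLUMNS = ("tenant_id", "product", "event_time", "level", "message")


def to_row(raw_line):
    """Convert one raw JSON log line into a row with exactly the base table columns."""
    payload = json.loads(raw_line)
    # Missing fields become None; fields the table does not define are dropped.
    return {column: payload.get(column) for column in BASE_TABLE_COLUMNS}


row = to_row(
    '{"tenant_id": "tenant1", "product": "crm", "level": "error", '
    '"message": "timeout", "extra": 1}'
)
```

In a deployment along the lines described above, a function of this shape would run inside the messaging layer, so that clickhouse only receives rows that already match the table structure.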
Further, based on the first embodiment of the present invention, a second embodiment of the log processing method is provided. In this embodiment, the step in step S20 of querying the log list table corresponding to the query request in the clickhouse cluster includes:
step D, acquiring tenant information in the query request, and determining the clickhouse cluster that matches the tenant information;
and step E, acquiring, in the matched clickhouse cluster, the log list table corresponding to the scene information in the query request.
In this embodiment, a query request received from the log query end includes tenant information, scene information, and product information. The tenant information may be a tenant identifier. Acquiring the tenant information from the query request determines which tenant the request needs to query, so the clickhouse cluster that matches the tenant information can be identified. For example, if the tenant information in the query request is tenant 1, the clickhouse cluster of tenant 1 is determined, and the log list table corresponding to the scene information in the query request is then acquired in that cluster.
Further, in step E, the step of acquiring, in the matched clickhouse cluster, the log list table corresponding to the scene information in the query request includes:
step e1, acquiring the product information in the query request, and determining the log library in the matched clickhouse cluster that matches the product information;
and step e2, acquiring, in the matched log library, the log list table corresponding to the scene information in the query request.
In this embodiment, after the clickhouse cluster matching the tenant information is determined, the product information in the query request is acquired, and the log library in that cluster that matches the product information is determined. For example, if the product information in the query request corresponds to log library 1, the log list table corresponding to the scene information is acquired in log library 1. In this embodiment, logs are stored separately per tenant, so that the huge volume of log data is stored in isolation, the search range is narrowed, and the query speed is further improved.
In this embodiment, a clickhouse cluster is used as the storage database for logs, and the log data is stored according to a preset blood relationship (data lineage) of the form tenant → log library → table, forming wide log tables. The strong wide-table and single-table query performance of clickhouse greatly improves log query efficiency and enables fast log query and analysis.
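The tenant → log library → table lookup of steps D, e1, and e2 can be sketched in Python as three nested dictionary lookups. The catalog contents, the class names, and the cluster, database, and table names below are all hypothetical; the patent does not specify how the mapping is stored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class QueryRequest:
    tenant: str   # tenant information, e.g. a tenant identifier
    product: str  # product information, selects the log library
    scene: str    # scene information, selects the table


@dataclass(frozen=True)
class TableLocation:
    cluster: str
    database: str
    table: str


# Hypothetical catalog mirroring the tenant -> log library -> table structure.
CATALOG: Dict[str, dict] = {
    "tenant1": {
        "cluster": "clickhouse-tenant1",
        "libraries": {
            "product_a": {
                "database": "log_library_1",
                "tables": {"base": "service_base", "errors": "error_view"},
            }
        },
    }
}


def resolve_table(request: QueryRequest, catalog: Dict[str, dict] = CATALOG) -> TableLocation:
    """Resolve a query request to the cluster, log library, and table to query."""
    try:
        tenant_entry = catalog[request.tenant]                # step D
        library = tenant_entry["libraries"][request.product]  # step e1
        table = library["tables"][request.scene]              # step e2
    except KeyError as missing:
        raise LookupError(f"no log table registered for {missing}") from None
    return TableLocation(tenant_entry["cluster"], library["database"], table)


if __name__ == "__main__":
    print(resolve_table(QueryRequest("tenant1", "product_a", "errors")))
```

Because each lookup narrows the search to one tenant and then one log library, a query never has to scan data belonging to other tenants or products, which is the isolation effect the embodiment describes.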
Corresponding to the first embodiment and the second embodiment, an embodiment of the present application further provides a log processing system, including: a log storage end and a log query end, wherein,
the log storage end is used for acquiring log data, and writing the log data into a clickhouse cluster according to the tenant identification of the log data to acquire a log list table;
the log storage end is used for querying a log list table corresponding to a query request in the clickhouse cluster if the query request sent by the query end is received;
and the log query end is used for sending a query request to the log storage end and receiving a log list table which is returned by the log storage end and corresponds to the query request.
Specifically, as shown in fig. 3, raw log data is collected from a client through an SDK application interface. The SDK sends the collected raw log data to an agent, which may be an agent process. The agent uploads the raw log data to an input interface of the pulsar cluster service; the input interface may be an input collector, which writes the raw log data into the pulsar cluster. The pulsar cluster receives the raw log data and performs structure conversion on it, and the converted log data is exported from the pulsar cluster to the clickhouse cluster through an output layer; that is, the clickhouse cluster obtains the log data from the pulsar cluster.
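The structure conversion step can be pictured with a small Python sketch that turns one raw log line into a row with a fixed set of columns. It assumes, purely for illustration, that raw logs arrive as JSON objects and that the target table has the columns listed in `COLUMNS` plus an `extra` column; neither the input format nor the schema is given in the patent, and the code does not use any pulsar API.

```python
import json
from datetime import datetime, timezone

# Hypothetical target columns of the log table.
COLUMNS = ("tenant_id", "product", "scene", "level", "timestamp", "message")


def convert_raw_log(raw: str) -> dict:
    """Convert one raw JSON log line into a row matching the assumed table schema."""
    event = json.loads(raw)

    # Known columns are copied; missing ones default to an empty string.
    row = {column: event.get(column, "") for column in COLUMNS}

    # Numeric timestamps are treated as epoch seconds and rendered in UTC.
    ts = event.get("timestamp")
    if isinstance(ts, (int, float)):
        row["timestamp"] = datetime.fromtimestamp(ts, tz=timezone.utc).strftime(
            "%Y-%m-%d %H:%M:%S"
        )

    # Fields outside the schema are preserved as a JSON string.
    extra = {key: value for key, value in event.items() if key not in COLUMNS}
    row["extra"] = json.dumps(extra, sort_keys=True)
    return row


if __name__ == "__main__":
    line = '{"tenant_id": "tenant1", "timestamp": 0, "message": "boot", "host": "h1"}'
    print(convert_raw_log(line))
```

Normalizing rows before they reach the database keeps the import step a plain bulk write, which fits the source's emphasis on fast import.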
In an embodiment, before the log data is imported into the clickhouse cluster, the log data may further undergo ETL processing, that is, extraction (extract), transformation (transform), and loading (load). ETL processing of a large amount of log data integrates scattered, disordered, and non-uniform data, providing an analysis basis for project decision making. Taking application analysis as an example, when an application or a server is abnormal, a log query can be performed in a service query analysis system: the system initiates a query request, the request passes through a log query service (that is, a log query website), the log query service converts the request into SQL, and the log query is executed in clickhouse.
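How a log query service might turn a request into SQL can be sketched as follows. The column names, the keyword filter, and the function name are hypothetical. The `{pattern:String}` placeholder follows ClickHouse's query-parameter syntax; how the value is bound depends on the client in use, so the sketch only returns the SQL text and the parameter dictionary.

```python
import re
from typing import Dict, Tuple

# Database and table names are interpolated into the SQL text, so they are
# restricted to plain identifiers to avoid injection.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_log_query(
    database: str, table: str, keyword: str, limit: int = 100
) -> Tuple[str, Dict[str, str]]:
    """Build a SELECT statement and its parameters for a keyword search."""
    for name in (database, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")

    sql = (
        f"SELECT timestamp, level, message FROM {database}.{table}"
        + " WHERE message LIKE {pattern:String}"
        + f" ORDER BY timestamp DESC LIMIT {int(limit)}"
    )
    # The keyword travels as a bound parameter, never as SQL text.
    # Note that % and _ inside the keyword still act as LIKE wildcards.
    params = {"pattern": f"%{keyword}%"}
    return sql, params


if __name__ == "__main__":
    statement, parameters = build_log_query("log_library_1", "service_base", "timeout", 10)
    print(statement)
    print(parameters)
```

Keeping user-supplied text out of the SQL string matters here because the query service sits between an end-user website and the database.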
In this embodiment, when a service or an application is abnormal, the log query service can be used to query clickhouse for the complete link of the erroneous behavior. For log base tables with a data volume above one billion, log data is collected, reported, and finally imported into and stored in a clickhouse cluster. Based on clickhouse's high compression ratio, columnar storage, efficient data import capability, and materialized view structure, multiple materialized views can be established over different base tables for different service scenes. The materialized views greatly reduce the ETL process and the storage cost, enable rapid storage, query, and analysis of high-volume log data, and help users solve problems in scenes such as service operation and maintenance, service monitoring, and log audit through logs. The query end rapidly queries the log data in clickhouse according to the tenant information, the product information, and the scene information in the query request. For example, if the tenant information in the query request is tenant 1 and the product information corresponds to log library 1, the log data in log library 1 of tenant 1 needs to be queried: the clickhouse cluster of tenant 1 is determined according to the tenant information, log library 1 is determined according to the product information, and the log list table in log library 1 is then obtained according to the scene information in the query request. Because clickhouse supports building materialized views, much development work is saved, and many service scenes do not require regenerating a base table.
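A materialized view over a base table for one service scene could be declared roughly as in the Python sketch below, which only renders the DDL text. The database, table, and view names, the `timestamp` ordering column, and the filter condition are hypothetical, and the choice of the MergeTree engine is an assumption; the patent does not give a concrete definition.

```python
def render_materialized_view(
    database: str, base_table: str, view_name: str, condition: str
) -> str:
    """Render ClickHouse DDL for a materialized view filtered by a scene condition.

    All arguments are trusted, developer-supplied text in this sketch;
    they are not validated.
    """
    return (
        f"CREATE MATERIALIZED VIEW IF NOT EXISTS {database}.{view_name}\n"
        "ENGINE = MergeTree\n"
        "ORDER BY timestamp\n"
        f"AS SELECT * FROM {database}.{base_table}\n"
        f"WHERE {condition}"
    )


if __name__ == "__main__":
    print(
        render_materialized_view(
            "log_library_1", "service_base", "error_view", "level = 'ERROR'"
        )
    )
```

Once such a view exists, rows inserted into the base table that satisfy the condition are also written to the view, so a scene-specific query can read the smaller view instead of rebuilding a separate base table.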
In the embodiments of the present application, the same or corresponding contents as those in the first embodiment or the second embodiment are referred to the above description, and are not repeated herein.
The log processing system provided by the embodiment of the invention adopts the log processing method provided by the above embodiments, and thereby solves the technical problem of low log data query speed. Other technical features of the log processing system are the same as those disclosed in the log processing method of the first embodiment or the second embodiment, and are not described herein again. Compared with the prior art, the log processing system provided by the embodiment of the invention uses clickhouse's columnar storage database as the storage database for logs. Storing with clickhouse's high compression ratio greatly reduces storage cost; clickhouse's efficient import performance allows log data to be imported quickly, so that high-volume logs can be stored rapidly; and large-table query performance is improved, providing fast query and analysis.
The present embodiment provides a computer-readable storage medium having stored thereon computer-readable program instructions for executing the log processing method of one of the above-described embodiments.
The computer readable storage medium provided by the embodiments of the present invention may be, for example, a USB flash drive, or may be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present embodiment, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer-readable storage medium provided by the invention stores the computer-readable program instruction for executing the log processing method, and solves the technical problem of low log data query speed. Compared with the prior art, the beneficial effects of the computer-readable storage medium provided by the embodiment of the present invention are the same as the beneficial effects of the log processing method provided by the first embodiment or the second embodiment, and are not described herein again.
The present application further provides an electronic device, the device comprising: at least one processor, and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the log processing method as described above.
The electronic equipment provided by the application solves the technical problem of low log data query speed. Compared with the prior art, the electronic device provided by the embodiment of the present invention has the same beneficial effects as the log processing method provided by the first embodiment or the second embodiment, and details are not repeated herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A log processing method, characterized by comprising the steps of:
obtaining log data, and writing the log data into a clickhouse cluster according to a tenant identification of the log data to obtain a log list table;
and if a query request sent by a query end is received, querying a log list table corresponding to the query request in the clickhouse cluster.
2. The log processing method according to claim 1, wherein the clickhouse cluster includes a plurality of log libraries, and the step of writing the log data into the clickhouse cluster according to the tenant identifier of the log data to obtain the log list table includes:
if the tenant identifications of the log data exist, storing the log data with the same tenant identification to the same clickhouse cluster;
determining product categories corresponding to all log data in the same clickhouse cluster;
and if the same product type exists in the product types, storing the log data corresponding to the same product type into the same log library to obtain a log list table.
3. The log processing method according to claim 2, wherein the step of storing the log data corresponding to the same product category in the same log library to obtain a log list table comprises:
establishing a table blood relationship structure in the log library;
and according to the table blood relationship structure, inputting the log data corresponding to the same product category into a log library to form a log list table.
4. The log processing method according to claim 3, wherein the table blood relationship structure includes a business base table structure and a materialized view table structure, and the step of inputting the log data corresponding to the same product category into a log library to form a log list table according to the table blood relationship structure comprises:
inputting the log data corresponding to the same product category into the business base table structure in the log library to obtain a service list table;
and inputting the log data in the service list table that meets preset scene conditions into the materialized view table structure to obtain a materialized view table, wherein the log list table comprises the service list table and the materialized view table.
5. The log processing method of claim 3, wherein the step of obtaining log data comprises:
acquiring initial log data, and performing structure conversion on the initial log data according to the table blood relationship structure through a pulsar cluster;
and taking the initial data after the structure conversion as log data.
6. The log processing method of claim 1, wherein the step of querying a log list table corresponding to the query request in the clickhouse cluster comprises:
acquiring tenant information in the query request, and determining a matched clickhouse cluster matched with the tenant information;
and according to the scene information in the query request, obtaining a log list table corresponding to the scene information in the matched clickhouse cluster.
7. The log processing method according to claim 6, wherein the step of obtaining the log list table corresponding to the scenario information in the matching clickhouse cluster according to the scenario information in the query request includes:
obtaining product information in the query request, and determining a matching log library matched with the product information in the matching clickhouse cluster;
and acquiring a log list table corresponding to the scene information in the matching log library according to the scene information in the query request.
8. A log processing system comprises a log storage end and a log query end, wherein,
the log storage end is used for acquiring log data, and writing the log data into a clickhouse cluster according to the tenant identification of the log data to acquire a log list table;
the log storage end is used for querying a log list table corresponding to a query request in the clickhouse cluster if the query request sent by the query end is received;
and the log query end is used for sending a query request to the log storage end and receiving a log list table which is returned by the log storage end and corresponds to the query request.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the log processing method as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the readable storage medium has stored thereon a program implementing a log processing method, the program implementing the log processing method being executed by a processor to implement the steps of the log processing method according to any one of claims 1 to 7.
CN202111323495.8A 2021-11-10 2021-11-10 Log processing method, system, electronic device and computer readable storage medium Active CN113760849B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111323495.8A CN113760849B (en) 2021-11-10 2021-11-10 Log processing method, system, electronic device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111323495.8A CN113760849B (en) 2021-11-10 2021-11-10 Log processing method, system, electronic device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN113760849A true CN113760849A (en) 2021-12-07
CN113760849B CN113760849B (en) 2022-04-08

Family

ID=78784912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111323495.8A Active CN113760849B (en) 2021-11-10 2021-11-10 Log processing method, system, electronic device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113760849B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090037489A1 (en) * 2007-08-02 2009-02-05 International Business Machines Corporation Method And System For Response Time Optimization
US20110258242A1 (en) * 2010-04-16 2011-10-20 Salesforce.Com, Inc. Methods and systems for appending data to large data volumes in a multi-tenant store
US20170293697A1 (en) * 2016-04-11 2017-10-12 Oracle International Corporation Graph processing system that can define a graph view from multiple relational database tables
CN110232098A (en) * 2019-04-22 2019-09-13 汇通达网络股份有限公司 A kind of data warehouse administered based on data and genetic connection designs
CN110795509A (en) * 2019-09-29 2020-02-14 北京淇瑀信息科技有限公司 Method and device for constructing index blood relationship graph of data warehouse and electronic equipment
CN112835917A (en) * 2021-01-28 2021-05-25 山东浪潮通软信息科技有限公司 Data caching method and system based on blood relationship distribution
CN112905708A (en) * 2021-03-31 2021-06-04 浙江太美医疗科技股份有限公司 Database operation method and system based on software as a service (SaaS) system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115102830A (en) * 2022-06-24 2022-09-23 平安银行股份有限公司 Log reduction method, apparatus, computer device and computer-readable storage medium
CN115102830B (en) * 2022-06-24 2023-07-14 平安银行股份有限公司 Log reduction method, device, computer equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN113760849B (en) 2022-04-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant