CN110311817B - Container log processing system for Kubernetes cluster - Google Patents
- Publication number
- CN110311817B
- Authority
- CN
- China
- Prior art keywords
- log
- module
- container
- processing system
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/069—Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0893—Assignment of logical groups to network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/06—Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to the technical field of containers and discloses a container log processing system for a Kubernetes cluster, which solves the problem of how to collect, search, and archive container logs that are randomly distributed across a Kubernetes cluster. Building on Kubernetes dynamic service deployment, the invention attaches specific labels to each deployed service and configures the log collection component to collect, search, and archive logs according to those labels. A message queue serves as buffer storage for log collection and archived logs are stored compressed, which addresses the impact on service performance and the high storage cost of log collection. Filebeat's drop-event condition is configured against a specific label of the deployed service, so that log collection can be switched on and off dynamically. The invention is suitable for data center transmission control.
Description
Technical Field
The invention relates to the technical field of containers, in particular to a container log processing system for a Kubernetes cluster.
Background
With the popularization of microservice architecture, more and more companies build their service platforms from microservices, manage those microservices on container platforms represented by Kubernetes, and use Kubernetes for container orchestration operations such as resource scheduling and dynamic scaling. Logs record the running state of containers and are key data for diagnosing and locating problems in daily production, so their importance is increasingly recognized. In a large-scale container cluster in particular, a single microservice usually runs as multiple replicas that are randomly distributed across different host nodes. How to collect this randomly distributed container log data, and how to provide archiving and search over it, are challenges that must be faced during containerized deployment.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a container log processing system for a Kubernetes cluster that solves the problem of how to collect, search, and archive container logs randomly distributed in the Kubernetes cluster.
In order to solve the above problem, the invention adopts the following technical scheme: the container log processing system for the Kubernetes cluster comprises a log acquisition module, a log collection module, a log consumption module, a log archiving program, a log buffer storage module, a search analysis service module, and two specific tags;
the two specific tags are attached to each application deployed in the Kubernetes cluster, wherein the value of tag A is the same as the application name, and tag B determines whether the application's logs need to be collected and archived;
the log acquisition module is used for acquiring application log data;
the log collection module is used for writing the logs acquired by the log acquisition module into the log buffer storage module and for configuring the log drop-event condition, wherein the log data written into the log buffer storage module must contain the two specific tags;
the log consumption module is used for consuming the log data in the log buffer storage module and writing the consumed log data into the search analysis service module, wherein the log data written into the search analysis service module must contain the two specific tags;
the log archiving program is used for archiving the log data held by the search analysis service module; before archiving, the log archiving program uses tag B to determine whether the application's logs need to be archived, and during archiving it uses tag A to query the search analysis service module and retrieve the data to be archived.
Furthermore, the parameters of the log acquisition module need to be set before it acquires logs, so that the log of a single container is rotated and is prevented from growing too large.
Further, the log acquisition module may be Docker.
Further, the log buffer storage module may be Kafka.
Further, the log collection module may be Filebeat.
Furthermore, Filebeat is deployed in the Kubernetes cluster as a DaemonSet, which ensures that each host node in the cluster runs one pod replica; when a new node joins the cluster or an old node is removed, a pod is automatically scheduled to the new node or the redundant replica is deleted, so that the logs of every node are collected correctly.
Further, the log consumption module may be Logstash.
Further, the log archiving program may also have a retry mechanism to ensure that archived data is not lost.
Further, the search service module may be Elasticsearch.
The invention has the following beneficial effects: container logs in the Kubernetes cluster are collected and archived asynchronously, that is, the logs gathered by the log collection module are first written into the log buffer storage module, which reduces the performance impact on services during cluster log collection and archiving. The system provides both real-time distributed log search and historical archiving, so the full range of log data is covered and no data is lost; archived logs are stored compressed, which greatly reduces log storage cost.
Drawings
FIG. 1 is a schematic flow chart of an embodiment.
Fig. 2 is a schematic structural diagram of the embodiment.
Detailed Description
In order to solve the problem of how to collect, search, and archive the logs of containers randomly distributed in a Kubernetes cluster, two specific labels are attached when an application is deployed in the cluster. The value of label A is the same as the application name, which makes it convenient to archive and search the distributed logs; the value of label B determines whether the application's logs need to be collected and archived. Label B makes collection easy to control, so that applications whose logs do not need to be collected do not waste system resources.
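The two-label scheme can be illustrated with a minimal Python sketch. The label keys `matrix-application` and `matrix-logger` and the values `on`/`off` are taken from the embodiment described later; the helper names `make_labels` and `should_collect` are hypothetical and exist only for illustration.

```python
from typing import Dict

# Label keys as named in the embodiment; helper names below are hypothetical.
APP_LABEL = "matrix-application"   # label A: value equals the application name
LOGGER_LABEL = "matrix-logger"     # label B: "on" enables collection and archiving


def make_labels(app_name: str, collect_logs: bool) -> Dict[str, str]:
    """Build the two labels attached to an application at deployment time."""
    return {APP_LABEL: app_name, LOGGER_LABEL: "on" if collect_logs else "off"}


def should_collect(labels: Dict[str, str]) -> bool:
    """Label B decides whether logs are collected; a missing label means no."""
    return labels.get(LOGGER_LABEL) == "on"


if __name__ == "__main__":
    labels = make_labels("app", collect_logs=True)
    print(labels, should_collect(labels))
```

Treating a missing label B as "do not collect" is a design assumption of this sketch; the source only states that the label's value decides.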
The application writes its logs to standard output, and every node in the Kubernetes cluster writes the application logs to the host node's file system through the log acquisition module, Docker. Docker handles application logs through its logging driver (LogDriver), the module Docker uses to process a container's standard output. Docker supports several logging drivers; the invention uses Docker's default, the JSON File driver. In a large-scale container cluster, log files grow very quickly and would soon exhaust the disk space of the host node, so a rolling (rotation) size must be set for container logs.
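As a hedged illustration of the rotation setting, the sketch below builds the logging section of a Docker daemon configuration (`daemon.json`) in Python. The `json-file` driver and its `max-size` option are standard Docker settings, and `100m` is the value used in step two of the embodiment; the `max-file` value of 3 is an assumption added here and does not appear in the source.

```python
import json


def docker_daemon_log_config(max_size: str = "100m", max_file: int = 3) -> dict:
    """Return the logging section of a Docker daemon.json.

    max_size follows the embodiment (100m); max_file is an assumed value.
    Docker expects log-opts values to be strings.
    """
    return {
        "log-driver": "json-file",
        "log-opts": {"max-size": max_size, "max-file": str(max_file)},
    }


if __name__ == "__main__":
    # The printed JSON could be merged into /etc/docker/daemon.json.
    print(json.dumps(docker_daemon_log_config(), indent=2))
```

With these options Docker rotates a container's log once it reaches the configured size and keeps at most the configured number of files, which bounds disk usage per container.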
The invention deploys the log collection module, Filebeat, in the Kubernetes cluster to upload the application logs that Docker writes on each host node to centralized log storage. Filebeat is deployed as a DaemonSet, which ensures that each host node in the cluster runs one pod replica; when a new node joins the cluster or an old node is removed, a pod is automatically scheduled to the new node or the redundant replica is deleted, so that the logs of every node are collected correctly. Because application logs are generated in real time and in large volume, having every Filebeat instance upload directly to the centralized log storage at the same time would put heavy I/O pressure on that storage and could even lose log data. Filebeat is therefore configured to send log data to the partitions of a specified topic in the log buffer storage module, a Kafka cluster, which temporarily caches the application logs by taking advantage of Kafka's support for highly concurrent writes. In addition, Filebeat must be configured to add the labels attached at deployment time to the log data it sends and to enable a drop-event rule: whether a log is sent to Kafka depends on whether log collection was enabled when the application was deployed.
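A rough sketch of such a Filebeat configuration, expressed as a Python dictionary, is shown here. The topic name `matrix` and the label `matrix-logger` come from the embodiment; the input type and path, the `NODE_NAME` environment variable, and the Kafka address are assumptions. The processor and output keys follow Filebeat's documented `add_kubernetes_metadata`, `drop_event`, and `output.kafka` settings, but the details should be checked against the Filebeat version in use.

```python
import json
from typing import Iterable


def filebeat_config(kafka_hosts: Iterable[str],
                    topic: str = "matrix",
                    logger_label: str = "matrix-logger") -> dict:
    """Build a Filebeat-style configuration as a plain dictionary."""
    label_field = "kubernetes.labels." + logger_label
    return {
        # Assumed input: container log files on the host node.
        "filebeat.inputs": [
            {"type": "container", "paths": ["/var/log/containers/*.log"]}
        ],
        "processors": [
            # Enrich events with pod metadata (including labels) first ...
            {"add_kubernetes_metadata": {
                "host": "${NODE_NAME}",
                "matchers": [
                    {"logs_path": {"logs_path": "/var/log/containers/"}}
                ],
            }},
            # ... then drop every event whose collection label is not "on".
            {"drop_event": {"when": {"not": {"equals": {label_field: "on"}}}}},
        ],
        # Surviving events are buffered in the Kafka topic.
        "output.kafka": {"hosts": list(kafka_hosts), "topic": topic},
    }


if __name__ == "__main__":
    # Serialized as JSON only for inspection.
    print(json.dumps(filebeat_config(["kafka-0:9092"]), indent=2))
```

The processor order matters in this sketch: the metadata processor has to run before the drop rule, otherwise the label field would not yet exist when the condition is evaluated.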
The invention deploys the log consumption module, Logstash, in the Kubernetes cluster. Logstash consumes the messages that Filebeat sent to the specified Kafka topic and writes the application logs into the centralized log storage, the search service module Elasticsearch. Logstash is configured to write to Elasticsearch with an index template that creates one index per day, which prevents any single index from growing so large that it degrades log search performance; the log data written to Elasticsearch contains the specific labels attached when the application was deployed. Because all logs in the cluster are written centrally into Elasticsearch, storage consumption is very high, so a scheduled task ensures that Elasticsearch keeps only the log data of the last 6 months, which also preserves the performance of the log search interface.
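The daily-index naming and the 6-month retention rule can be sketched in Python as follows. The index prefix `matrix-logs`, the date pattern, and the approximation of 6 months as 183 days are assumptions; the source only specifies one index per day and a 6-month retention window.

```python
from datetime import date, timedelta
from typing import Iterable, List

INDEX_PREFIX = "matrix-logs"   # hypothetical index prefix
RETENTION_DAYS = 183           # assumed approximation of "6 months"


def index_for(day: date) -> str:
    """Name of the daily index that holds logs for the given day."""
    return f"{INDEX_PREFIX}-{day:%Y.%m.%d}"


def parse_index_day(name: str) -> date:
    """Recover the day from an index name produced by index_for."""
    year, month, day = name.rsplit("-", 1)[1].split(".")
    return date(int(year), int(month), int(day))


def expired_indices(names: Iterable[str], today: date,
                    retention_days: int = RETENTION_DAYS) -> List[str]:
    """Indices older than the retention window, which a scheduled task would delete."""
    cutoff = today - timedelta(days=retention_days)
    return [n for n in names if parse_index_day(n) < cutoff]
```

Because each day has its own index, the scheduled task can enforce retention by deleting whole indices rather than individual documents.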
The invention deploys a log archiving program in the Kubernetes cluster to archive the logs collected in Elasticsearch, gathering the log data scattered across multiple replicas and organizing it by cluster, namespace, and application. Archiving runs once per hour; during archiving, label A is used to query the search service module for the data to be archived, and the archived log data is uploaded to Ceph object storage. The log archiving program uses the Kubernetes watch mechanism to track resource changes in the cluster and archives logs only for applications that carry the log collection label. Before uploading to Ceph, the program compresses the application logs; compression greatly reduces the storage space required, makes permanent log storage practical, and overcomes the 6-month limit of log search.
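One way to picture the hourly archive job is the sketch below, which builds an Elasticsearch query body for one application and one hour and derives an object-storage key organized by cluster, namespace, and application. The field name `kubernetes.labels.matrix-application`, the `@timestamp` field, and the key layout are assumptions; the query uses the standard `bool`, `term`, and `range` clauses of the Elasticsearch Query DSL.

```python
from datetime import datetime, timedelta

# Hypothetical field under which label A is indexed; a real mapping may
# require a keyword sub-field for exact matching.
APP_FIELD = "kubernetes.labels.matrix-application"


def hourly_query(app: str, hour_start: datetime) -> dict:
    """Query body selecting one application's logs for a single hour."""
    hour_end = hour_start + timedelta(hours=1)
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {APP_FIELD: app}},
                    {"range": {"@timestamp": {
                        "gte": hour_start.isoformat(),
                        "lt": hour_end.isoformat(),
                    }}},
                ]
            }
        },
        # Chronological order, so the archive file is time-ordered.
        "sort": [{"@timestamp": "asc"}],
    }


def archive_key(cluster: str, namespace: str, app: str,
                hour_start: datetime) -> str:
    """Object key grouping archives by cluster, namespace, and application."""
    return f"{cluster}/{namespace}/{app}/{hour_start:%Y/%m/%d/%H}.log.gz"
```

Using a half-open time range (`gte` start, `lt` end) keeps consecutive hourly runs from archiving the same record twice.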
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following examples.
Examples
With reference to FIG. 1 and FIG. 2, the embodiment provides a method for processing container logs in a Kubernetes cluster, which mainly includes the following steps:
Step one: deploy the application app in the system with log collection enabled; the system attaches two specific labels to the application, matrix-application: app and matrix-logger: on.
Step two: modify the Docker log-related parameters so that the log of a single container is rotated, setting log-opt max-size=100m to prevent a single container log from growing too large.
Step three: deploy Filebeat in the cluster and configure it to write log data asynchronously to the Kafka topic named matrix. Add drop_event to the Filebeat configuration template so that when matrix-logger is off the event is dropped and the log data is not collected. Add add_kubernetes_metadata under processors, and add the deployment label matrix-application to include_fields so that the distributed logs can later be searched and archived.
Step four: deploy Logstash in the system to consume the messages of the Kafka topic named matrix and write the consumed data into Elasticsearch; configure the Logstash output plug-in as Elasticsearch and specify an index creation format of one index per day. The system provides a retrieval interface for searching, in real time, the logs that the application app has written to Elasticsearch.
Step five: deploy a log archiving program in the system to archive the logs collected in Elasticsearch, gathering the log data scattered across multiple replicas by cluster, namespace, and application. The log archiving program uses the Kubernetes watch mechanism to detect that matrix-logger is on, meaning the application app has log collection enabled. It then retrieves the log data stored in Elasticsearch whose matrix-application value is app, writes the application's logs to local storage in chronological order, compresses the local logs with gz, uploads them to Ceph once compression finishes, and writes the result to a database so that services can download and view historical archived logs. If the archiving process is interrupted at any point, the program retries and archives the data again, so that no log data is lost.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (9)
1. A container log processing system for a Kubernetes cluster, characterized by comprising a log acquisition module, a log collection module, a log consumption module, a log archiving program, a log buffer storage module, a search service module, and two specific tags;
the two specific tags are attached to each application deployed in the Kubernetes cluster, wherein the value of tag A is the same as the application name, and tag B determines whether the application's logs need to be collected and archived;
the log acquisition module is used for acquiring application log data;
the log collection module is used for writing the logs acquired by the log acquisition module into the log buffer storage module and for configuring a log drop-event condition, wherein the log data written into the log buffer storage module must contain the two specific tags;
the log consumption module is used for consuming the log data in the log buffer storage module and writing the consumed log data into the search service module, wherein the log data written into the search service module must contain the two specific tags;
the log archiving program is used for archiving the log data held by the search service module; before archiving, the log archiving program uses tag B to determine whether the application's logs need to be archived, and during archiving it uses tag A to query the search service module and retrieve the data to be archived.
2. The container log processing system for a Kubernetes cluster of claim 1, wherein parameters of the log acquisition module are set before the log acquisition module acquires logs, to ensure that the log of a single container is rotated and prevented from growing too large.
3. The container log processing system for a Kubernetes cluster of claim 1, wherein the log acquisition module is Docker.
4. The container log processing system for a Kubernetes cluster of claim 1, wherein the log buffer storage module is Kafka.
5. The container log processing system for a Kubernetes cluster of claim 1, wherein the log collection module is Filebeat.
6. The container log processing system for a Kubernetes cluster of claim 5, wherein Filebeat is deployed in the Kubernetes cluster as a DaemonSet to ensure that each host node in the Kubernetes cluster runs one pod replica, and when a new node is added to the Kubernetes cluster or an old node is removed, a pod is automatically scheduled to the new node or the redundant replica is deleted, to ensure that the logs of each node are collected correctly.
7. The container log processing system for a Kubernetes cluster of claim 1, wherein the log consumption module is Logstash.
8. The container log processing system for a Kubernetes cluster of claim 1, wherein the log archiving program is further provided with a retry mechanism.
9. The container log processing system for a Kubernetes cluster of claim 1, wherein the search service module is Elasticsearch.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910578033.7A CN110311817B (en) | 2019-06-28 | 2019-06-28 | Container log processing system for Kubernetes cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910578033.7A CN110311817B (en) | 2019-06-28 | 2019-06-28 | Container log processing system for Kubernetes cluster |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110311817A (en) | 2019-10-08 |
CN110311817B (en) | 2021-09-28 |
Family
ID=68079456
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910578033.7A Active CN110311817B (en) | 2019-06-28 | 2019-06-28 | Container log processing system for Kubernetes cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110311817B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111190875A (en) * | 2019-12-27 | 2020-05-22 | 航天信息股份有限公司 | Log aggregation method and device based on container platform |
CN113127526A (en) * | 2019-12-30 | 2021-07-16 | 中科星图股份有限公司 | Distributed data storage and retrieval system based on Kubernetes |
CN113760638A (en) * | 2020-10-15 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Log service method and device based on Kubernetes cluster |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107180051B (en) * | 2016-03-11 | 2021-02-12 | 华为技术有限公司 | Log management method and server |
US10705880B2 (en) * | 2017-09-22 | 2020-07-07 | Vmware, Inc. | Cluster updating using temporary update-monitor pod |
KR102016238B1 (en) * | 2017-12-05 | 2019-08-29 | 숭실대학교산학협력단 | System and method for supervising doker container, computer readable medium for performing the method |
CN108363802B (en) * | 2018-02-28 | 2021-10-29 | 深圳市华云中盛科技股份有限公司 | Container-based text collection method and system |
CN109347814A (en) * | 2018-10-05 | 2019-02-15 | 李斌 | A kind of container cloud security means of defence and system based on Kubernetes building |
CN109491859B (en) * | 2018-10-16 | 2021-10-26 | 华南理工大学 | Collection method for container logs in Kubernetes cluster |
CN109739825B (en) * | 2018-12-29 | 2021-04-30 | 优刻得科技股份有限公司 | Method, apparatus and storage medium for managing log |
Also Published As
Publication number | Publication date |
---|---|
CN110311817A (en) | 2019-10-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110311817B (en) | Container log processing system for Kubernetes cluster | |
CN104040481B (en) | Method and system for merging, storing and retrieving incremental backup data | |
CN111339103B (en) | Data exchange method and system based on full-quantity fragmentation and incremental log analysis | |
CN104317800A (en) | Hybrid storage system and method for mass intelligent power utilization data | |
CN104978361B (en) | Method and device for storing real-time monitoring data of power environment | |
CN101673192B (en) | Method for time-sequence data processing, device and system therefor | |
CN112286941B (en) | Big data synchronization method and device based on Binlog + HBase + Hive | |
CN101916290B (en) | Managing method of internal memory database and device | |
CN102779138B (en) | The hard disk access method of real time data | |
CN110609813B (en) | Data storage system and method | |
CN110727406A (en) | Data storage scheduling method and device | |
US20140156603A1 (en) | Method and an apparatus for splitting and recovering data in a power system | |
CN106648442A (en) | Metadata node internal memory mirroring method and device | |
CN110750372A (en) | Log system based on shared memory and log management method | |
US10642530B2 (en) | Global occupancy aggregator for global garbage collection scheduling | |
CN103761262A (en) | Repetition log control method based on syslogd | |
CN117093367B (en) | Service data processing method, device and storage medium | |
CN103488564A (en) | Multichannel test data compressing and merging method for distributed real-time test system | |
CN116401324A (en) | Real-time bin counting method and system for lithium battery manufacturing industry | |
CN109189724B (en) | Method and device for improving audio and video data storage efficiency of video monitoring system | |
CN108647278B (en) | File management method and system | |
CN102937956A (en) | Method and device for storing real-time messages in intelligent substation | |
KR101666440B1 (en) | Data processing method in In-memory Database System based on Circle-Queue | |
CN113986824A (en) | Method for efficiently storing and retrieving time sequence data | |
CN105095502A (en) | Log collection method of cluster storage system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||