CN106959866B - Log collection client and upgrading method thereof - Google Patents

Log collection client and upgrading method thereof

Info

Publication number
CN106959866B
Authority
CN
China
Prior art keywords
daemon
upgrading
version
upgrade
daemon process
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610011466.0A
Other languages
Chinese (zh)
Other versions
CN106959866A (en)
Inventor
唐恺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201610011466.0A priority Critical patent/CN106959866B/en
Priority to PCT/CN2016/112854 priority patent/WO2017118334A1/en
Publication of CN106959866A publication Critical patent/CN106959866A/en
Application granted granted Critical
Publication of CN106959866B publication Critical patent/CN106959866B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/71 Version control; Configuration management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/61 Installation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Stored Programmes (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a log collection client and an upgrading method thereof. The method first sends a heartbeat request to a configuration server and receives the heartbeat request response returned by the configuration server. Then, according to an upgrade instruction carried in the heartbeat request response, it downloads an upgrade file, suspends heartbeat requests, stops collecting new log data, writes the collected but unsent log data to a local file, records the current progress point, and upgrades using the downloaded upgrade file. It then checks whether the upgrade succeeded: if so, the log data written to the local file are sent to a data server, log collection resumes from the recorded progress point, and the client works with the upgraded version; otherwise, the client returns to the pre-upgrade version. The log collection client comprises a heartbeat request module, an upgrade response module and an upgrade check module. With the invention, no collected data is lost during the upgrade, and the client rolls back automatically when the new-version program is abnormal.

Description

Log collection client and upgrading method thereof
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a log collection client and an upgrading method thereof.
Background
With the development of electronic information technology, the era of big data has arrived. Logs are a widely distributed and important data resource on which system monitoring, operation auditing, data analysis and similar work can be based. The log collection client is a program running on the device operating system; it reads the content of designated log files according to the collection configuration, processes the content, and sends the processed content to the log server.
To avoid the risks of known program bugs and to provide better functionality, client programs often need to be upgraded to higher versions. However, in an actual service scenario logs are generated continuously, and the client program must replace its executable file and restart its processes during an upgrade, so the log collection progress is easily lost.
In the prior art, two kinds of schemes are mainly used in the industry to upgrade a log collection client. The first is cold upgrade, used for example by open-source log collection software such as Logstash (version 1.5.4) and Fluentd (version 2.2.1). The program version upgrade is divided into three steps:
executing a control script on the device to stop the running old-version process;
installing the new-version program files on the device by means of yum, a tar package or the like;
executing the control script on the device to start the new-version process and finish the upgrade.
The second scheme is dual-program-file hot upgrade. Such client software runs two program files on the device, corresponding to two processes: one is a log collection process, which installs a SIGTERM signal handler and performs the preparation for program exit in the signal handling function; the other is a daemon process, which is responsible for downloading new program files and completing the switch from the old version to the new one. The upgrade comprises four steps:
the daemon process detects, during one of its polls, that a new client program installation package is available and downloads it to the local machine;
the daemon process sends a SIGTERM signal to the log collection process;
normally, after receiving the SIGTERM signal, the log collection process completes its exit preparation, records the log collection progress locally, and then exits on its own. If the log collection process times out (for example, it has not completed the exit preparation one minute after receiving SIGTERM), the daemon process sends SIGKILL to forcibly end the log collection process;
the daemon process detects that the old-version log collection process has exited, starts the new-version program, and finishes the upgrade.
However, the existing cold upgrade scheme requires manual participation in the upgrade, so the operation and maintenance cost is high; the old process may be forcibly killed during the upgrade, the log collection progress is then lost, and the version upgrade affects the integrity of data collection; and if the new-version program file is unusable (for example, it crashes after startup), there is no automatic version rollback mechanism. The existing dual-program hot upgrade scheme combines the log collection program with a daemon program to support automatic operation, but during the upgrade the daemon process communicates with the log collection process only one way, through signals. After receiving the SIGTERM signal, if the log collection process cannot exit normally within a short time (for example, persistence of the log collection progress has not finished), the daemon process sends a SIGKILL signal after the timeout and the log collection process is forced to stop. The new-version program then cannot obtain the pre-upgrade log collection progress after it starts, and collected data is lost. Moreover, if the old-version log collection process exits normally after the daemon process sends SIGTERM but the new collection program started afterwards cannot start normally, log collection is interrupted and manual operation and maintenance intervention is needed.
Disclosure of Invention
The invention aims to provide a log collection client and an upgrading method thereof, which finish self-upgrading of a program in a single-program file and double-process running mode, and solve the problems of data loss and version rollback during upgrading failure.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a log collection client upgrading method is applied to a log collection client, and comprises the following steps:
sending a heartbeat request to a configuration server, and receiving a heartbeat request response returned by the configuration server;
according to an upgrading instruction carried in the heartbeat request response, downloading an upgrading file, suspending the heartbeat request from being sent, stopping collecting new log data, writing the collected log data which are not sent into a local file, recording a current progress point, and upgrading by adopting the downloaded upgrading file;
and checking whether the upgrading is successful, if so, sending the log data written in the local file to the data server, starting to collect the log data from the recorded progress point, starting to work with the upgraded version, and otherwise, returning to the version before upgrading for working.
After the log collection client is started, a daemon process and a working process are created, and then a heartbeat request is sent to a configuration server, wherein the heartbeat request comprises the following steps:
the working process sends heartbeat requests to the configuration server periodically, the heartbeat requests carry the version number of the current log collection client and the IP address of the host machine, so that the configuration server sends empty heartbeat request responses under the condition that the API upgrading requests do not exist, and sends heartbeat request responses carrying upgrading instructions under the condition that the API upgrading requests exist, and the upgrading instructions comprise the version number of the log collection client to be upgraded and the downloading address of the log collection client.
Further, before the log collection client upgrading method adopts the downloaded upgrade file to upgrade, the log collection client upgrading method further includes:
the worker process sends a signal SIGUSR1 to the daemon process informing about the upgrade operation.
Further, the daemon has the following global states:
A. DAEMON_INIT: the daemon process prepares to perform initialization work;
B. DAEMON_INIT_FAIL: daemon process initialization fails;
C. DAEMON_NORMAL: the daemon process has initialized successfully and starts its daemon work;
D. DAEMON_UPDATE: the daemon process prepares to perform the program upgrade;
E. DAEMON_UPDATE_FAIL: the daemon process fails to perform the program upgrade.
Further, the upgrading by using the downloaded upgrade file includes:
after receiving the SIGUSR1 signal, the daemon process sets the global state to DAEMON_UPDATE;
when the daemon process detects in the daemon loop that the current global state is DAEMON_UPDATE, upgrading by adopting the downloaded upgrade file;
and the daemon process sends a SIGKILL signal to the working process, and the working process exits.
Further, the upgrading by using the downloaded upgrade file further comprises the steps of:
executing the upgraded log collection client program, and creating a daemon process and a working process under a new version;
periodically and circularly detecting the global state by the daemon process under the new version;
if the working process under the new version abnormally exits after being started, so that the global state is changed into DAEMON_UPDATE_FAIL, sending a notification signal SIGUSR2 to the daemon process under the original version, and attaching a start failure message;
and if the state is maintained as DAEMON_INIT and no exception occurs in the working process under the new version in the cycle period, sending a notification signal SIGUSR2 and a start success message to the daemon process under the original version.
Further, the checking whether the upgrade is successful includes:
periodically and circularly checking a notification signal SIGUSR2 from the daemon process under the new version;
if the SIGUSR2 from the daemon process under the new version does not exist in the cycle period, the daemon process under the original version considers that the new version is started overtime and sends an SIGKILL command to the process group where the daemon process under the new version is located, the operation of the new program is finished, and then the daemon process under the original version restarts the working process and returns to the state before upgrading;
if a SIGUSR2 signal from the daemon process under the new version is received and a start failure message is obtained in the cycle period, the daemon process under the original version sends a SIGKILL command to the process group where the daemon process under the new version is located, the operation of the new program is finished, and then the daemon process under the original version restarts the working process and returns to the state before upgrading;
if the SIGUSR2 signal from the daemon process under the new version is received in the cycle period and the start success message is obtained, the daemon process under the original version quits and the upgrade is successful.
The invention also provides a log collection client, which comprises:
the heartbeat request module is used for sending a heartbeat request to the configuration server and receiving a heartbeat request response returned by the configuration server;
the upgrading response module is used for downloading an upgrading file according to an upgrading instruction carried in the heartbeat request response, suspending sending of the heartbeat request and stopping collection of new log data, writing the collected log data which are not sent into a local file, recording a current progress point, and upgrading by adopting the downloaded upgrading file;
and the upgrading checking module is used for checking whether the upgrading is successful or not, if the upgrading is successful, the log data written in the local file is sent to the data server, the log data is collected from the recorded progress point, the log data starts to work with the upgraded version, and otherwise, the log data returns to the version before the upgrading to work.
Further, after the log collection client is started, a daemon process and a work process are created, and when the heartbeat request module sends a heartbeat request to the configuration server, the following operations are executed:
the working process sends heartbeat requests to the configuration server periodically, the heartbeat requests carry the version number of the current log collection client and the IP address of the host machine, so that the configuration server sends empty heartbeat request responses under the condition that the API upgrading requests do not exist, and sends heartbeat request responses carrying upgrading instructions under the condition that the API upgrading requests exist, and the upgrading instructions comprise the version number of the log collection client to be upgraded and the downloading address of the log collection client.
Further, before the upgrade is performed by using the downloaded upgrade file, the upgrade response module further performs the following operations:
the worker process sends a signal SIGUSR1 to the daemon process informing about the upgrade operation.
Further, the daemon has the following global states:
A. DAEMON_INIT: the daemon process prepares to perform initialization work;
B. DAEMON_INIT_FAIL: daemon process initialization fails;
C. DAEMON_NORMAL: the daemon process has initialized successfully and starts its daemon work;
D. DAEMON_UPDATE: the daemon process prepares to perform the program upgrade;
E. DAEMON_UPDATE_FAIL: the daemon process fails to perform the program upgrade.
Further, when the upgrade response module uses the downloaded upgrade file for upgrading, the following operations are performed:
after receiving the SIGUSR1 signal, the daemon process sets the global state to DAEMON_UPDATE;
when the daemon process detects in the daemon loop that the current global state is DAEMON_UPDATE, upgrading by adopting the downloaded upgrade file;
and the daemon process sends a SIGKILL signal to the working process, and the working process exits.
Further, when the upgrade response module uses the downloaded upgrade file for upgrading, the following operations are also executed:
executing the upgraded log collection client program, and creating a daemon process and a working process under a new version;
periodically and circularly detecting the global state by the daemon process under the new version;
if the working process under the new version abnormally exits after being started, so that the global state is changed into DAEMON_UPDATE_FAIL, sending a notification signal SIGUSR2 to the daemon process under the original version, and attaching a start failure message;
and if the state is maintained as DAEMON_INIT and no exception occurs in the working process under the new version in the cycle period, sending a notification signal SIGUSR2 and a start success message to the daemon process under the original version.
Further, the upgrade checking module, when checking whether the upgrade is successful, performs the following operations:
periodically and circularly checking a notification signal SIGUSR2 from the daemon process under the new version;
if the SIGUSR2 from the daemon process under the new version does not exist in the cycle period, the daemon process under the original version considers that the new version is started overtime and sends an SIGKILL command to the process group where the daemon process under the new version is located, the operation of the new program is finished, and then the daemon process under the original version restarts the working process and returns to the state before upgrading;
if a SIGUSR2 signal from the daemon process under the new version is received and a start failure message is obtained in the cycle period, the daemon process under the original version sends a SIGKILL command to the process group where the daemon process under the new version is located, the operation of the new program is finished, and then the daemon process under the original version restarts the working process and returns to the state before upgrading;
if the SIGUSR2 signal from the daemon process under the new version is received in the cycle period and the start success message is obtained, the daemon process under the original version quits and the upgrade is successful.
According to the log collection client and the upgrading method thereof, manual operation and maintenance intervention is not needed in the upgrading process, the parent process and the child process in the upgrading process are in two-way communication, the upgrading operation is executed after negotiation is consistent, and data are not lost before and after upgrading; if the new program is abnormally started, the daemon process can quickly discover and automatically execute version rollback operation.
Drawings
FIG. 1 is a flowchart of a log collection client upgrade method of the present invention;
FIG. 2 is a flowchart illustrating operation of the original version client of the present invention;
FIG. 3 is a flow chart of the operation of the new version client of the present invention;
FIG. 4 is a schematic diagram of a log collection client structure according to the present invention.
Detailed Description
The technical solutions of the present invention are further described in detail below with reference to the drawings and examples, which should not be construed as limiting the present invention.
The log system generally comprises a log collection Client installed on a host, a configuration server ConfigServer for managing log collection clients running on all hosts, and a data server DataServer for receiving log data collected by the log collection Client. The host machines are devices for recording logs of the log system, and each host machine is provided with a log collection client.
After the log collection client on a host is started, two processes start to run: a daemon process, DaemonProcess, and a working process, WorkerProcess. When the log collection client starts, the parent process DaemonProcess is created first; it then calls the fork system call to create the child process WorkerProcess. The child process WorkerProcess collects the content of the specified log files according to the user's collection configuration and sends it to the data server over the network; it also sends a heartbeat request to the configuration server at a regular interval (for example, every 1 minute) and receives instructions from the configuration server through the heartbeat response. The parent process DaemonProcess is a daemon process: it restarts the child process if WorkerProcess exits unexpectedly, and it triggers the upgrade process when a Client version upgrade instruction is found.
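The parent/child arrangement described above can be sketched in Python as follows. This is only an illustrative sketch for POSIX systems: the function name `run_supervised`, the `max_restarts` bound and the convention that exit code 0 means a clean exit are assumptions made here, not details taken from the embodiment.

```python
import os


def run_supervised(worker, max_restarts):
    """Fork `worker` as a child process and restart it after an abnormal exit.

    `worker` is a hypothetical callable returning an exit code (0 = clean).
    Returns how many times the worker was started.
    """
    starts = 0
    while True:
        pid = os.fork()
        if pid == 0:
            # Child: plays the role of WorkerProcess.
            code = 1  # reported if the worker raises
            try:
                code = int(worker() or 0)
            finally:
                os._exit(code)  # never fall back into the parent's code path
        # Parent: plays the role of DaemonProcess.
        starts += 1
        _, status = os.waitpid(pid, 0)
        clean_exit = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
        if clean_exit or starts > max_restarts:
            return starts
```

With `max_restarts=2`, a worker that always fails is started three times in total (the initial start plus two restarts) before the supervisor gives up. The restart bound exists only to keep the sketch finite.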
As shown in fig. 1, the method for upgrading a log collection client according to this embodiment is applied to a log collection client, and includes:
step S1, sending a heartbeat request to the configuration server, and receiving a heartbeat request response returned by the configuration server.
After the Client on a host is started, WorkerProcess sends a heartbeat request to the ConfigServer every 1 minute; the request content includes the version number v_1 of the current program file and the IP address ip_1 of the host. When there is no upgrade operation, the ConfigServer returns empty content to WorkerProcess in the heartbeat response.
Assuming that a new Client version v_2 exists for the host ip_1, an operation and maintenance engineer sends a request to the ConfigServer through the upgrade API, and on receiving the upgrade API request the ConfigServer sets the state of host ip_1: current version v_1, version to be upgraded v_2.
The ConfigServer then returns an upgrade instruction in the response to the heartbeat request. The upgrade instruction includes the v_2 version number, the v_2 program upgrade file (HTTP download address), and the md5sum of the executable file.
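The heartbeat exchange of step S1 might be modeled as in the following Python sketch. The class and field names, the in-memory `pending` table keyed by host IP, and the rule that a host already running the target version gets an empty response are all assumptions made for illustration; the embodiment does not specify the wire format or the server's data structures.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class HeartbeatRequest:
    version: str   # version number of the running client, e.g. "v_1"
    host_ip: str   # IP address of the host, e.g. "ip_1"


@dataclass(frozen=True)
class UpgradeInstruction:
    target_version: str  # version to upgrade to, e.g. "v_2"
    download_url: str    # HTTP download address of the upgrade file
    md5sum: str          # expected MD5 of the executable file


def build_response(
    request: HeartbeatRequest,
    pending: Dict[str, UpgradeInstruction],
) -> Optional[UpgradeInstruction]:
    """Return the upgrade instruction for this host, or None for an empty response."""
    instruction = pending.get(request.host_ip)
    # Assumption: a host that already runs the target version needs no upgrade.
    if instruction is None or instruction.target_version == request.version:
        return None
    return instruction
```

Here `pending` stands in for the per-host state that the ConfigServer sets when it receives an upgrade API request.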
And step S2, according to the upgrading instruction carried in the heartbeat request response, suspending sending the heartbeat request and stopping collecting new log data, writing the collected log data into a local file, recording the current progress point, downloading the upgrading file and starting upgrading.
For convenience of description, this embodiment refers to the log collection client of version v_1 as ClientV1, and its two processes as WorkerProcessV1 and DaemonProcessV1. The log collection client upgraded to version v_2 is called ClientV2, and its two processes are WorkerProcessV2 and DaemonProcessV2.
After ClientV1 is running, when WorkerProcessV1 finds an upgrade instruction in the heartbeat response, it starts the upgrade preparation:
The program upgrade file is downloaded to the local machine and decompressed, and the md5sum of the executable file is verified.
The reading of new log data is stopped.
The log data already read into memory are parsed and written to a local buffer file. The log data written to the buffer file have been collected but not sent; after the upgrade is completed, ClientV2 sends the buffer file to the DataServer. Writing the buffer file greatly shortens the upgrade, which would otherwise be prolonged by network sending delay.
The progress point CheckPoint is recorded. CheckPoint saves the state of the log collection in progress and persists it to a file. Its content includes the log directory, the log file name, the log file signature, and the current collection position in the log file.
WorkerProcessV1 sends SIGUSR1 to DaemonProcessV1 to notify it of the upgrade operation.
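Two of the preparation steps above, verifying the md5sum of the downloaded executable and persisting the CheckPoint, can be sketched in Python as follows. The JSON layout, the field names and the write-then-rename strategy are assumptions chosen for illustration; the embodiment only lists which items the CheckPoint contains.

```python
import hashlib
import json
import os


def md5_matches(path, expected_md5):
    """Compare the MD5 digest of the file at `path` with the expected hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()


def save_checkpoint(path, log_dir, file_name, signature, offset):
    """Persist the collection progress: write a temporary file, then rename it."""
    record = {
        "log_dir": log_dir,      # log directory
        "file_name": file_name,  # log file name
        "signature": signature,  # log file signature
        "offset": offset,        # current collection position in the file
    }
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())     # make sure the data reaches the disk
    os.replace(tmp_path, path)   # atomic replacement of the old checkpoint


def load_checkpoint(path):
    """Read back a checkpoint written by save_checkpoint."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Renaming a fully written temporary file means a reader never sees a half-written checkpoint, which matters here because the old working process is ended shortly after the checkpoint is recorded.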
In this embodiment, five global states are defined for the daemon process DaemonProcess to represent its process state:
1. DAEMON_INIT
DaemonProcess prepares to perform initialization work.
2. DAEMON_INIT_FAIL
DaemonProcess initialization fails.
3. DAEMON_NORMAL
DaemonProcess has initialized successfully and starts its daemon work.
4. DAEMON_UPDATE
DaemonProcess prepares to perform the program upgrade.
5. DAEMON_UPDATE_FAIL
DaemonProcess fails to perform the program upgrade.
Meanwhile, DaemonProcess has the following signal handling functions:
1) SIGCHLD signal handling function of DaemonProcess.
The SIGCHLD signal indicates that the child process WorkerProcess has exited abnormally; if the global state is DAEMON_INIT, the state changes to DAEMON_INIT_FAIL.
2) SIGUSR1 signal handling function of DaemonProcess.
In this embodiment, the user-defined signal SIGUSR1 is sent by WorkerProcess to DaemonProcess to notify it of the upgrade operation; DaemonProcess sets the global state to DAEMON_UPDATE after receiving the signal.
3) SIGUSR2 signal handling function of DaemonProcess.
The user-defined signal SIGUSR2 is sent by the new-version DaemonProcess started by the upgrade to the old-version DaemonProcess. If the signal is accompanied by the message DaemonStartSuccess (the new-version DaemonProcess and WorkerProcess started successfully), the old-version DaemonProcess exits on its own; if the accompanying message is DaemonStartFail (the new-version DaemonProcess or WorkerProcess failed to start), the global state of the old-version DaemonProcess is set to DAEMON_UPDATE_FAIL.
4) SIGKILL: when SIGKILL is sent to a process, the process receiving the signal stops running.
Thus, after WorkerProcessV1 sends SIGUSR1 to DaemonProcessV1, DaemonProcessV1 is interrupted to handle the SIGUSR1 signal, and the signal handling function sets the global state to DAEMON_UPDATE.
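The five global states and the effect of each signal on them can be summarized as a small state machine. The Python sketch below models the handlers as pure functions so that the transitions are easy to check. The function names and the `EXIT` marker are hypothetical, and the message is passed as a plain argument because the embodiment does not state how the message accompanying SIGUSR2 is transported.

```python
from enum import Enum, auto


class DaemonState(Enum):
    DAEMON_INIT = auto()         # preparing to perform initialization work
    DAEMON_INIT_FAIL = auto()    # initialization failed
    DAEMON_NORMAL = auto()       # initialized successfully, daemon work started
    DAEMON_UPDATE = auto()       # preparing to perform the program upgrade
    DAEMON_UPDATE_FAIL = auto()  # program upgrade failed


EXIT = "exit"  # hypothetical marker meaning "the old daemon should exit"


def on_sigchld(state):
    """The worker exited abnormally; only an initializing daemon records a failure."""
    if state is DaemonState.DAEMON_INIT:
        return DaemonState.DAEMON_INIT_FAIL
    return state


def on_sigusr1(state):
    """The worker has finished its preparation and requests the upgrade."""
    return DaemonState.DAEMON_UPDATE


def on_sigusr2(state, message):
    """The new-version daemon reports whether it started successfully."""
    if message == "DaemonStartSuccess":
        return EXIT
    if message == "DaemonStartFail":
        return DaemonState.DAEMON_UPDATE_FAIL
    return state  # assumption: unknown messages leave the state unchanged
```

In a real process these functions would be called from handlers registered for the corresponding signals, which then store the result in the daemon's global state.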
As shown in fig. 2, after ClientV1 starts, DaemonProcessV1 is set to the DAEMON_INIT state and installs the SIGCHLD signal handler; it then forks WorkerProcessV1, which enters the log collection loop, while DaemonProcessV1 installs the SIGUSR1 signal handler and sets the DAEMON_NORMAL state. When the ConfigServer carries an upgrade instruction in the returned heartbeat response, WorkerProcessV1 sends SIGUSR1 to DaemonProcessV1; DaemonProcessV1 detects in the daemon loop that the current global state is DAEMON_UPDATE and starts the upgrade.
DaemonProcessV1 sends SIGKILL to WorkerProcessV1. At this time the memory queue of WorkerProcessV1 is empty, so WorkerProcessV1 exits without data loss.
DaemonProcessV1 installs the SIGUSR2 signal handler. If the installation fails, it sets the current state to DAEMON_NORMAL and performs a rollback: it forks WorkerProcessV1 again, ends the upgrade operation, and returns to the pre-upgrade state. If SIGUSR2 is installed successfully, it forks a child process, executes the new-version program file ClientV2 in that child's process space, and starts to check in a loop whether the upgrade has succeeded.
Step S3, checking whether the upgrade is successful, if the upgrade is successful, sending the log data written in the local file to the data server, and starting to collect the log data from the recorded progress point, and starting to work with the upgraded version, otherwise, returning to the version before the upgrade.
As shown in fig. 3, after the new-version program file ClientV2 is executed, DaemonProcessV2 performs its initialization work:
The current state is set to DAEMON_INIT.
The SIGUSR2 and SIGCHLD signal handlers are installed.
DaemonProcessV2 forks WorkerProcessV2 and enters a 5-second wait loop, during which DaemonProcessV2 checks the global state:
If the state is found to have changed to DAEMON_INIT_FAIL (WorkerProcessV2 exited abnormally after starting, and the SIGCHLD signal interrupt changed the global state), the signal SIGUSR2 is sent to DaemonProcessV1 with the message DaemonStartFail attached.
If the state is found to remain DAEMON_INIT and no exception has occurred in WorkerProcessV2 within 5 seconds, the signal SIGUSR2 and the message DaemonStartSuccess are sent to DaemonProcessV1.
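The 5-second check performed by DaemonProcessV2 might look like the following Python sketch. The function name `startup_report`, the polling interval and the injectable `sleep` parameter are assumptions introduced so that the logic can be exercised without real waiting; states are represented as plain strings to keep the sketch self-contained.

```python
import time


def startup_report(get_state, wait_seconds=5.0, poll_interval=0.1, sleep=time.sleep):
    """Poll the global state for about `wait_seconds` and return the message
    that the new daemon would send to the old daemon together with SIGUSR2."""
    waited = 0.0
    while True:
        if get_state() == "DAEMON_INIT_FAIL":
            # The worker exited abnormally during startup.
            return "DaemonStartFail"
        if waited >= wait_seconds:
            # The state stayed DAEMON_INIT for the whole window.
            return "DaemonStartSuccess"
        sleep(poll_interval)
        waited += poll_interval
```

A caller would pass a function that reads the daemon's current global state, for example the value last written by the SIGCHLD handler.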
Returning to fig. 2, DaemonProcessV1 meanwhile waits up to 15 seconds, checking for a signal from DaemonProcessV2. There are three cases:
If no SIGUSR2 from DaemonProcessV2 arrives within 15 seconds, DaemonProcessV1 considers that starting the new version has timed out and sends a SIGKILL signal to the process group of DaemonProcessV2, ending the new program; DaemonProcessV1 then restarts WorkerProcessV1 and returns to the pre-upgrade state.
If the SIGUSR2 signal is received within 15 seconds with the message DaemonStartFail, DaemonProcessV1 sends a SIGKILL signal to the process group of DaemonProcessV2, ending the new program; DaemonProcessV1 then restarts WorkerProcessV1 and returns to the pre-upgrade state. That is, DaemonProcessV1 cleans up the process group of ClientV2 and rolls back to work with version V1.
If the SIGUSR2 signal is received within 15 seconds with the message DaemonStartSuccess, DaemonProcessV1 exits on its own. DaemonProcessV2 and WorkerProcessV2 then take over completely: only the two processes of version V2 run on the machine, and the upgrade has finished successfully.
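The three cases handled by DaemonProcessV1 can be sketched in Python for POSIX systems as below. The names `wait_for_report` and `finish_upgrade`, the polling interval, and the injectable `sleep`, `killpg` and `restart_worker` parameters are hypothetical. Treating any message other than DaemonStartSuccess as a failure is also an assumption of this sketch.

```python
import os
import signal
import time


def wait_for_report(poll_message, timeout=15.0, poll_interval=0.1, sleep=time.sleep):
    """Wait up to `timeout` seconds for the new daemon's report.

    `poll_message` returns the received message, or None if nothing has arrived.
    Returns the message, or None when the wait times out.
    """
    waited = 0.0
    while waited < timeout:
        message = poll_message()
        if message is not None:
            return message
        sleep(poll_interval)
        waited += poll_interval
    return None


def finish_upgrade(message, new_pgid, restart_worker, killpg=os.killpg):
    """Complete the upgrade on success; otherwise clean up and roll back."""
    if message == "DaemonStartSuccess":
        return "upgraded"                 # the old daemon may now exit
    killpg(new_pgid, signal.SIGKILL)      # end the new program's process group
    restart_worker()                      # fork the old worker again
    return "rolled_back"
```

Both the timeout case (`message` is None) and the DaemonStartFail case lead to the same rollback path, which mirrors the first two cases described above.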
The log collection client is deployed on hundreds of thousands of servers, and the client version on all machines can be upgraded within 10 minutes through the upgrade API. The client upgrade on a single machine generally completes within 5 seconds, no collected data is lost in the process, and the client rolls back automatically when the new-version program is abnormal.
As shown in fig. 4, the log collection client according to this embodiment includes a heartbeat request module, an upgrade response module, and an upgrade check module. The log collection client of the embodiment is installed on a host machine and used for collecting log data and finishing program upgrading by interacting with a configuration server.
The heartbeat request module is used for sending a heartbeat request to the configuration server and receiving a heartbeat request response returned by the configuration server; the upgrading response module is used for downloading an upgrading file according to an upgrading instruction carried in the heartbeat request response, suspending sending of the heartbeat request and stopping collection of new log data, writing the collected log data which are not sent into a local file, recording a current progress point, and upgrading by adopting the downloaded upgrading file; and the upgrading checking module is used for checking whether the upgrading is successful or not, if the upgrading is successful, the log data written in the local file is sent to the data server, the log data is collected from the recorded progress point, the log data starts to work with the upgraded version, and otherwise, the log data returns to the version before the upgrading to work.
After the log collection client is started, a daemon process and a working process are created, and operations performed by each module in the upgrading process are described below.
When sending a heartbeat request to the configuration server, the heartbeat request module performs the following operations:
the working process periodically sends heartbeat requests to the configuration server; each heartbeat request carries the version number of the current log collection client and the IP address of the host machine, so that the configuration server returns an empty heartbeat request response when there is no API upgrade request, and returns a heartbeat request response carrying an upgrade instruction when there is an API upgrade request; the upgrade instruction includes the version number of the log collection client to be upgraded to and its download address.
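The server-side choice between an empty response and a response carrying an upgrade instruction might look like the following Python sketch. The class names, the field names, and the `pending_upgrades` table keyed by host IP are hypothetical; the rule that a client already on the target version receives an empty response is also an assumption, not something the description states.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HeartbeatRequest:
    client_version: str  # version number of the running client
    host_ip: str         # IP address of the host machine


@dataclass
class UpgradeInstruction:
    target_version: str  # version number to upgrade to
    download_url: str    # download address of the new client


def build_heartbeat_response(
    request: HeartbeatRequest,
    pending_upgrades: Dict[str, UpgradeInstruction],
) -> Optional[UpgradeInstruction]:
    """Return None (an empty response) unless an upgrade is pending for the host."""
    instruction = pending_upgrades.get(request.host_ip)
    if instruction is None:
        return None
    # Assumption: a client already on the target version needs no instruction.
    if instruction.target_version == request.client_version:
        return None
    return instruction
```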
In this embodiment, before the upgrade response module performs upgrade using the downloaded upgrade file, the following operations are further performed:
the working process sends the signal SIGUSR1 to the daemon process to notify it of the upgrade operation.
In this embodiment, when the upgrade response module performs upgrade using the downloaded upgrade file, the following operations are performed:
after receiving the SIGUSR1 signal, the daemon process sets the global state to DAEMON_UPDATE;
when the daemon process detects, in its daemon loop, that the current global state is DAEMON_UPDATE, it upgrades using the downloaded upgrade file;
and the daemon process sends a SIGKILL signal to the working process, and the working process exits.
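The three steps above can be sketched in Python as a signal handler that only records the request and a daemon loop that acts on it. The `Daemon` class, its callbacks, and the `upgrade_started` flag are assumptions introduced for illustration; only the state names and the signals come from the description.

```python
import enum
import signal
from typing import Callable


class DaemonState(enum.Enum):
    DAEMON_INIT = "DAEMON_INIT"
    DAEMON_INIT_FAIL = "DAEMON_INIT_FAIL"
    DAEMON_NORMAL = "DAEMON_NORMAL"
    DAEMON_UPDATE = "DAEMON_UPDATE"
    DAEMON_UPDATE_FAIL = "DAEMON_UPDATE_FAIL"


class Daemon:
    def __init__(self, start_upgrade: Callable[[], None],
                 stop_worker: Callable[[], None]) -> None:
        self.state = DaemonState.DAEMON_NORMAL
        self.upgrade_started = False  # assumption: avoids launching twice
        self._start_upgrade = start_upgrade
        self._stop_worker = stop_worker

    def on_sigusr1(self, signum, frame) -> None:
        # Keep the handler minimal: only record that an upgrade was requested.
        self.state = DaemonState.DAEMON_UPDATE

    def loop_once(self) -> bool:
        """One pass of the daemon loop; True if the upgrade was started now."""
        if self.state is not DaemonState.DAEMON_UPDATE or self.upgrade_started:
            return False
        self._start_upgrade()  # upgrade using the downloaded file
        self._stop_worker()    # e.g. send SIGKILL to the working process
        self.upgrade_started = True
        return True


def install_handlers(daemon: Daemon) -> None:
    # Must be called from the main thread of the daemon process (POSIX only).
    signal.signal(signal.SIGUSR1, daemon.on_sigusr1)
```

Doing the real work in the loop rather than in the handler keeps the handler safe to run at any point, which is the usual reason for routing a signal through a state variable.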
In this embodiment, when the upgrade response module performs upgrade using the downloaded upgrade file, the following operations are further performed:
executing the upgraded log collection client program, and creating a daemon process and a working process under a new version;
the daemon process under the new version periodically checks the global state in a loop;
if the working process under the new version exits abnormally after being started, so that the global state changes to DAEMON_UPDATE_FAIL, sending a notification signal SIGUSR2 to the daemon process under the original version, with a start failure message attached;
and if the state remains DAEMON_INIT and no exception occurs in the working process under the new version within the cycle period, sending a notification signal SIGUSR2 and a start success message to the daemon process under the original version.
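A minimal Python sketch of how the new daemon might decide which message to report is shown below. `DaemonStartSuccess` is the message name used in the description; `DaemonStartFail`, the function name `watch_worker`, and the polling interval are hypothetical. How the message is attached to SIGUSR2 is not modeled here.

```python
import time
from typing import Callable

START_SUCCESS = "DaemonStartSuccess"  # message name used in the description
START_FAILURE = "DaemonStartFail"     # hypothetical name for the failure message


def watch_worker(
    worker_failed: Callable[[], bool],
    period_s: float,
    poll_s: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the new working process for one cycle period and pick the report."""
    deadline = clock() + period_s
    while clock() < deadline:
        if worker_failed():
            # The worker exited abnormally: report a start failure at once.
            return START_FAILURE
        sleep(poll_s)
    # No abnormal exit was seen during the whole period.
    return START_SUCCESS
```

The clock and sleep functions are injectable only so the loop can be exercised without real waiting.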
In this embodiment, when the upgrade check module checks whether the upgrade is successful, the following operations are performed:
the daemon process under the original version periodically checks, in a loop, for a notification signal SIGUSR2 from the daemon process under the new version;
if no SIGUSR2 signal from the daemon process under the new version arrives within the cycle period, the daemon process under the original version considers that the start of the new version has timed out and sends a SIGKILL signal to the process group where the daemon process under the new version is located, ending the operation of the new program; the daemon process under the original version then restarts the working process and returns to the state before upgrading;
if a SIGUSR2 signal from the daemon process under the new version is received within the cycle period and a start failure message is obtained, the daemon process under the original version sends a SIGKILL signal to the process group where the daemon process under the new version is located, ending the operation of the new program; the daemon process under the original version then restarts the working process and returns to the state before upgrading;
if the SIGUSR2 signal from the daemon process under the new version is received within the cycle period and the start success message is obtained, the daemon process under the original version exits and the upgrade succeeds.
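The three outcomes above reduce to one decision in the original daemon: exit on a success message, roll back on a failure message or a timeout. The Python sketch below assumes that a timeout is represented by `None`; the names `Outcome`, `decide`, and `roll_back` are illustrative and not taken from the patent.

```python
import enum
import os
import signal
from typing import Callable, Optional

START_SUCCESS = "DaemonStartSuccess"  # message name used in the description


class Outcome(enum.Enum):
    OLD_DAEMON_EXITS = "old_daemon_exits"  # upgrade succeeded
    ROLL_BACK = "roll_back"                # timeout or start failure


def decide(message: Optional[str]) -> Outcome:
    """message is None when no SIGUSR2 arrived within the cycle period."""
    if message == START_SUCCESS:
        return Outcome.OLD_DAEMON_EXITS
    return Outcome.ROLL_BACK


def roll_back(new_pgid: int, restart_worker: Callable[[], None],
              killpg: Optional[Callable[[int, int], None]] = None) -> None:
    """Kill the new version's process group, then restart the old worker."""
    killpg = killpg or os.killpg  # POSIX only; injectable for testing
    killpg(new_pgid, signal.SIGKILL)
    restart_worker()
```

Killing the whole process group rather than a single PID ensures that both the new daemon and the new working process are stopped before the original working process is restarted.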
The above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and those skilled in the art can make various corresponding changes and modifications according to the present invention without departing from the spirit and the essence of the present invention, but these corresponding changes and modifications should fall within the protection scope of the appended claims.

Claims (4)

1. A log collection client upgrading method is applied to a log collection client, and is characterized by comprising the following steps:
after the log collection client is started, a first daemon process and a first working process are created, wherein the first daemon process has the following global states:
A. DAEMON_INIT, the daemon process prepares to perform initialization work;
B. DAEMON_INIT_FAIL, the daemon process initialization fails;
C. DAEMON_NORMAL, the daemon process initialization succeeds and daemon work starts;
D. DAEMON_UPDATE, the daemon process prepares to execute program upgrade work;
E. DAEMON_UPDATE_FAIL, the daemon process fails to execute the program upgrade work;
the first working process periodically sends heartbeat requests to the configuration server, receives heartbeat request responses returned by the configuration server, downloads upgrade files and suspends sending heartbeat requests according to upgrade instructions carried in the heartbeat request responses, stops collecting new log data, writes the collected log data which are not sent into a local file, records the current progress point, and downloads the upgrade files to the local;
the first working process sends a signal SIGUSR1 for notifying the upgrading operation to the first daemon process, and the first daemon process sets the global state to DAEMON_UPDATE after receiving the SIGUSR1 signal;
when the first daemon process detects, in the daemon loop, that the current global state is DAEMON_UPDATE, upgrading by adopting the downloaded upgrade file;
the first daemon process sends a SIGKILL signal to the first working process, and the first working process exits;
executing the upgraded log collection client program, and creating a second daemon process and a second working process under a new version;
the second daemon process periodically and circularly detects the global state;
if the second working process exits abnormally after being started, so that the global state is changed into DAEMON_UPDATE_FAIL, sending a notification signal SIGUSR2 to the first daemon process under the original version, and attaching a start failure message;
if the state is maintained as DAEMON_INIT and no abnormality occurs in the second working process under the new version within the cycle period, sending a notification signal SIGUSR2 and a start success message to the first daemon process under the original version;
the first daemon process under the original version periodically checks, in a loop, for a notification signal SIGUSR2 from the second daemon process under the new version;
if no SIGUSR2 signal from the second daemon process under the new version arrives within the cycle period, the first daemon process under the original version considers that the start of the new version has timed out and sends a SIGKILL command to the process group where the second daemon process under the new version is located, ending the operation of the new program; the first daemon process under the original version then restarts the first working process and returns to the state before upgrading;
if a SIGUSR2 signal from the second daemon process under the new version is received within the cycle period and a start failure message is obtained, the first daemon process under the original version sends a SIGKILL command to the process group where the second daemon process under the new version is located, ending the operation of the new program; the first daemon process under the original version then restarts the first working process and returns to the state before upgrading;
and if the SIGUSR2 signal from the second daemon process under the new version is received within the cycle period and the start success message is obtained, the first daemon process under the original version exits and the upgrade succeeds.
2. The log collection client upgrading method according to claim 1, wherein the heartbeat request carries the version number of the current log collection client and the IP address of the host machine, so that the configuration server sends an empty heartbeat request response when there is no API upgrade request, and sends a heartbeat request response carrying an upgrade instruction when there is an API upgrade request, wherein the upgrade instruction includes the version number of the log collection client to be upgraded to and its download address.
3. A log collection client, wherein the log collection client performs the following operations:
after the log collection client is started, a first daemon process and a first working process are created, wherein the first daemon process has the following global states:
A. DAEMON_INIT, the daemon process prepares to perform initialization work;
B. DAEMON_INIT_FAIL, the daemon process initialization fails;
C. DAEMON_NORMAL, the daemon process initialization succeeds and daemon work starts;
D. DAEMON_UPDATE, the daemon process prepares to execute program upgrade work;
E. DAEMON_UPDATE_FAIL, the daemon process fails to execute the program upgrade work;
the first working process periodically sends heartbeat requests to the configuration server, receives heartbeat request responses returned by the configuration server, downloads upgrade files and suspends sending heartbeat requests according to upgrade instructions carried in the heartbeat request responses, stops collecting new log data, writes the collected log data which are not sent into a local file, records the current progress point, and downloads the upgrade files to the local;
the first working process sends a signal SIGUSR1 for notifying the upgrading operation to the first daemon process, and the first daemon process sets the global state to DAEMON_UPDATE after receiving the SIGUSR1 signal;
when the first daemon process detects, in the daemon loop, that the current global state is DAEMON_UPDATE, upgrading by adopting the downloaded upgrade file;
the first daemon process sends a SIGKILL signal to the first working process, and the first working process exits;
executing the upgraded log collection client program, and creating a second daemon process and a second working process under a new version;
the second daemon process periodically and circularly detects the global state;
if the second working process exits abnormally after being started, so that the global state is changed into DAEMON_UPDATE_FAIL, sending a notification signal SIGUSR2 to the first daemon process under the original version, and attaching a start failure message;
if the state is maintained as DAEMON_INIT and no abnormality occurs in the second working process under the new version within the cycle period, sending a notification signal SIGUSR2 and a start success message to the first daemon process under the original version;
the first daemon process under the original version periodically checks, in a loop, for a notification signal SIGUSR2 from the second daemon process under the new version;
if no SIGUSR2 signal from the second daemon process under the new version arrives within the cycle period, the first daemon process under the original version considers that the start of the new version has timed out and sends a SIGKILL command to the process group where the second daemon process under the new version is located, ending the operation of the new program; the first daemon process under the original version then restarts the first working process and returns to the state before upgrading;
if a SIGUSR2 signal from the second daemon process under the new version is received within the cycle period and a start failure message is obtained, the first daemon process under the original version sends a SIGKILL command to the process group where the second daemon process under the new version is located, ending the operation of the new program; the first daemon process under the original version then restarts the first working process and returns to the state before upgrading;
and if the SIGUSR2 signal from the second daemon process under the new version is received within the cycle period and the start success message is obtained, the first daemon process under the original version exits and the upgrade succeeds.
4. The log collection client according to claim 3, wherein the heartbeat request carries the version number of the current log collection client and the IP address of the host machine, so that the configuration server sends an empty heartbeat request response when there is no API upgrade request, and sends a heartbeat request response carrying an upgrade instruction when there is an API upgrade request, wherein the upgrade instruction includes the version number of the log collection client to be upgraded to and its download address.
CN201610011466.0A 2016-01-08 2016-01-08 Log collection client and upgrading method thereof Active CN106959866B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610011466.0A CN106959866B (en) 2016-01-08 2016-01-08 Log collection client and upgrading method thereof
PCT/CN2016/112854 WO2017118334A1 (en) 2016-01-08 2016-12-29 Log collection client and updating method therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610011466.0A CN106959866B (en) 2016-01-08 2016-01-08 Log collection client and upgrading method thereof

Publications (2)

Publication Number Publication Date
CN106959866A CN106959866A (en) 2017-07-18
CN106959866B true CN106959866B (en) 2020-12-01

Family

ID=59274159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610011466.0A Active CN106959866B (en) 2016-01-08 2016-01-08 Log collection client and upgrading method thereof

Country Status (2)

Country Link
CN (1) CN106959866B (en)
WO (1) WO2017118334A1 (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110045971B (en) * 2018-01-16 2023-03-24 浙江宇视科技有限公司 System upgrade recovery method and device
CN108363610A (en) * 2018-02-09 2018-08-03 华为技术有限公司 A kind of control method and equipment of virtual machine monitoring plug-in unit
CN110879713B (en) * 2018-09-06 2023-06-20 山东华软金盾软件股份有限公司 Android terminal strong encryption plug-in thermal update management method
CN109257218B (en) * 2018-09-19 2021-08-06 上海电子信息职业技术学院 Island self-healing method of network system based on SNMP protocol
CN109361542B (en) * 2018-10-29 2021-10-15 北京奇艺世纪科技有限公司 Client fault processing method, device, system, terminal and server
CN109542750A (en) * 2018-11-26 2019-03-29 深圳天源迪科信息技术股份有限公司 Distributed information log system
CN112181443B (en) * 2019-07-01 2023-04-07 中国移动通信集团浙江有限公司 Automatic service deployment method and device and electronic equipment
CN111124465B (en) * 2019-11-28 2023-06-20 武汉虹信技术服务有限责任公司 Cross-network C/S program remote upgrading method and system
CN111061499B (en) * 2019-12-31 2023-06-13 上海赫千电子科技有限公司 ECU updating method and system based on file system
CN113329046A (en) * 2020-02-28 2021-08-31 珠海格力电器股份有限公司 Data transmission method, system and storage medium
CN113329044A (en) * 2020-02-28 2021-08-31 北京京东振世信息技术有限公司 Monitoring agent program upgrading method and upgrading device
CN111385296B (en) * 2020-03-04 2022-06-21 深信服科技股份有限公司 Business process restarting method, device, storage medium and system
CN111596940B (en) * 2020-05-19 2023-04-07 杭州视联动力技术有限公司 Version upgrading method and device, electronic equipment and storage medium
CN112596941B (en) * 2020-12-28 2023-10-03 凌云光技术股份有限公司 Tool result judging method and device of industrial image processing software
CN112905230A (en) * 2021-03-16 2021-06-04 深圳市麦谷科技有限公司 Application program management method and device, terminal equipment and storage medium
CN114584464A (en) * 2022-03-07 2022-06-03 浪潮云信息技术股份公司 Cloud platform full-automatic management log collection method and terminal
CN115509559B (en) * 2022-09-30 2023-09-01 广州朗桥维视通信技术有限公司 Zero-contact deployment system and method
CN115576792A (en) * 2022-11-24 2023-01-06 北京宝兰德软件股份有限公司 Log collection system and method
CN117056288A (en) * 2023-08-17 2023-11-14 齐鲁空天信息研究院 Method and system for searching and downloading server file

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101145973A (en) * 2007-10-23 2008-03-19 华为技术有限公司 Software upgrade method and device
CN103064860A (en) * 2011-10-21 2013-04-24 阿里巴巴集团控股有限公司 Database high availability implementation method and device
CN103677870A (en) * 2012-09-10 2014-03-26 腾讯科技(深圳)有限公司 System upgrading method and system upgraded by means of method
CN105187262A (en) * 2015-10-27 2015-12-23 上海斐讯数据通信技术有限公司 Router upgrading method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1276348C (en) * 2003-01-15 2006-09-20 联想(北京)有限公司 Automatic upgrading method for diskfree working station
CN101719165B (en) * 2010-01-12 2014-12-17 浪潮电子信息产业股份有限公司 Method for realizing high-efficiency rapid backup of database
US8935689B2 (en) * 2012-08-13 2015-01-13 International Business Machines Corporation Concurrent embedded application update and migration

Also Published As

Publication number Publication date
CN106959866A (en) 2017-07-18
WO2017118334A1 (en) 2017-07-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant