TWI740886B - Log collection client terminal and its upgrading method - Google Patents

Info

Publication number
TWI740886B
Authority
TW
Taiwan
Prior art keywords
upgrade
guard
version
program
trip
Prior art date
Application number
TW106102497A
Other languages
Chinese (zh)
Other versions
TW201828057A (en)
Inventor
唐愷
Original Assignee
香港商阿里巴巴集團服務有限公司
Application filed by 香港商阿里巴巴集團服務有限公司
Priority to TW106102497A
Publication of TW201828057A
Application granted
Publication of TWI740886B

Landscapes

  • Information Transfer Between Computers (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a log collection client and an upgrade method thereof. The method first sends a heartbeat request to a configuration server and receives the heartbeat request response returned by the configuration server. Then, according to the upgrade instruction carried in the heartbeat request response, it downloads the upgrade file, suspends sending heartbeat requests, stops collecting new log data, writes the log data that has been collected but not yet sent to a local file, records the current progress point, and upgrades using the downloaded upgrade file. It then checks whether the upgrade succeeded. If it did, the log data written to the local file is sent to the data server, log collection resumes from the recorded progress point, and the upgraded version starts working; otherwise the client rolls back to the pre-upgrade version. The log collection client of the invention includes a heartbeat request module, an upgrade response module, and an upgrade check module. No collected data is lost during the upgrade, and the client rolls back automatically when the new version of the program is abnormal.

Description

Log collection client and its upgrade method

The invention belongs to the field of computer technology and relates in particular to a log collection client and an upgrade method thereof.

With the development of electronic information technology, the era of big data has arrived. Logs are a widely distributed and important data resource; system monitoring, operations auditing, data analysis, and similar work can be built on them. A log collection client is a program that runs on the operating system of a device. It reads the contents of specified log files according to the collection configuration, processes them, and sends them to the log server.

To avoid the risks of known program bugs and to offer better functionality, a client program often needs to be upgraded to a higher version. In real business scenarios, however, logs are generated continuously, and upgrading the client program inevitably requires replacing the executable file and restarting the process. The log collection progress is therefore easily lost during an upgrade.

Existing solutions to the problem of upgrading a log collection client fall into two main categories. The first is cold upgrade, used for example by open-source log collection software such as Logstash (version 1.5.4) and fluentd (version 2.2.1). The program version upgrade is divided into three steps: run a control script on the device to stop the running old-version process; install the new-version program files on the device, for example through yum or a tar package; run a control script on the device to start the new-version process and complete the upgrade.

The second is hot upgrade with two program files. Client software of this kind runs two program files on the device, corresponding to two processes. One is the log collection process, which installs a handler for the SIGTERM signal and performs the preparation for program exit in that signal handler. The other is the daemon process, which is responsible for downloading new program files and switching from the old version to the new one. The upgrade consists of four steps. In one of its polls, the daemon process detects that a new client program installation package is available and downloads it to the local machine. The daemon process sends the SIGTERM signal to the log collection process. In the normal case, after receiving SIGTERM the log collection process completes its exit preparation, records the log collection progress locally, and then exits on its own. If the exit of the log collection process times out (for example, the exit preparation has not completed one minute after the process received SIGTERM), the daemon process sends SIGKILL to force the log collection process to stop.

The daemon process detects that the old-version log collection process has exited, starts the new-version program, and completes the upgrade.

The existing cold upgrade solution, however, requires manual participation in the upgrade, so its operation and maintenance cost is high. The old process is forcibly killed during the upgrade, which loses the log collection progress, so a program version upgrade affects the completeness of data collection. If the new-version program file is unusable (for example, it crashes after startup), there is no automatic version rollback mechanism either. In the existing two-program hot upgrade solution, the log collection program is combined with a daemon program and automated operation is supported. During the upgrade, however, the daemon process communicates with the log collection process in one direction only, through signals. If the log collection process cannot exit normally within a short time after receiving SIGTERM (for example, it has not finished persisting the log collection progress), the daemon process sends SIGKILL after the timeout and the log collection process is forcibly terminated. The new-version program then cannot obtain the pre-upgrade log collection progress after it starts, and collected data is lost. Moreover, if the old-version log collection process exits normally after the daemon process sends SIGTERM but the new collection program then fails to start, log collection is interrupted and manual operation and maintenance intervention is required.

The purpose of the invention is to provide a log collection client and an upgrade method thereof in which the program upgrades itself by running a single program file as two processes. This solves the problem of data loss that may occur during an upgrade and the problem of version rollback when an upgrade fails.

To achieve the above purpose, the technical solution of the invention is as follows. A log collection client upgrade method, applied to a log collection client, includes: sending a heartbeat request to a configuration server and receiving the heartbeat request response returned by the configuration server; according to the upgrade instruction carried in the heartbeat request response, downloading the upgrade file, suspending the sending of heartbeat requests, stopping the collection of new log data, writing the log data that has been collected but not yet sent to a local file, recording the current progress point, and upgrading using the downloaded upgrade file; and checking whether the upgrade succeeded. If the upgrade succeeded, the log data written to the local file is sent to the data server, log collection resumes from the recorded progress point, and the upgraded version starts working; otherwise the client rolls back to the pre-upgrade version.

After the log collection client starts, it creates a daemon process and a worker process. Sending a heartbeat request to the configuration server then includes: the worker process periodically sends a heartbeat request to the configuration server, and the heartbeat request carries the version number of the current log collection client and the IP address of the host. The configuration server sends an empty heartbeat request response when there is no upgrade API request, and sends a heartbeat request response carrying an upgrade instruction when there is an upgrade API request. The upgrade instruction includes the version number of the log collection client to upgrade to and its download address.

Further, before upgrading using the downloaded upgrade file, the log collection client upgrade method also includes: the worker process sends the signal SIGUSR1 to the daemon process to notify it of the upgrade operation.

Further, the daemon process has the following global states: A. DAEMON_INIT, the daemon process is about to perform initialization; B. DAEMON_INIT_FAIL, the daemon process failed to initialize; C. DAEMON_NORMAL, the daemon process initialized successfully and has started its daemon work; D. DAEMON_UPDATE, the daemon process is about to perform the program upgrade; E. DAEMON_UPDATE_FAIL, the daemon process failed to perform the program upgrade.

Further, upgrading using the downloaded upgrade file includes: after receiving the SIGUSR1 signal, the daemon process sets the global state to DAEMON_UPDATE; when the daemon process detects in its daemon loop that the current global state is DAEMON_UPDATE, it upgrades using the downloaded upgrade file; the daemon process sends the SIGKILL signal to the worker process, and the worker process exits.

Further, upgrading using the downloaded upgrade file also includes the steps of: executing the upgraded log collection client program, which creates the new-version daemon process and worker process; and the new-version daemon process periodically checking the global state in a loop. If the new-version worker process exits abnormally after starting, so that the global state becomes DAEMON_UPDATE_FAIL, the new-version daemon process sends the notification signal SIGUSR2 to the original-version daemon process together with a startup-failure message. If the state is found to remain DAEMON_INIT and no abnormality has occurred in the new-version worker process within the loop period, it sends the notification signal SIGUSR2 and a startup-success message to the original-version daemon process.

Further, checking whether the upgrade succeeded includes: the original-version daemon process periodically checks, in a loop, for the notification signal SIGUSR2 from the new-version daemon process. If no SIGUSR2 arrives from the new-version daemon process within the loop period, the original-version daemon process considers the startup of the new version to have timed out and sends SIGKILL to the process group of the new-version daemon process to stop the new program; the original-version daemon process then restarts the worker process and rolls back to the pre-upgrade state. If SIGUSR2 is received from the new-version daemon process within the loop period together with a startup-failure message, the original-version daemon process likewise sends SIGKILL to the process group of the new-version daemon process to stop the new program, then restarts the worker process and rolls back to the pre-upgrade state. If SIGUSR2 is received from the new-version daemon process within the loop period together with a startup-success message, the original-version daemon process exits and the upgrade has succeeded.

The invention also proposes a log collection client, which includes: a heartbeat request module, used to send a heartbeat request to the configuration server and receive the heartbeat request response returned by the configuration server; an upgrade response module, used to, according to the upgrade instruction carried in the heartbeat request response, download the upgrade file, suspend the sending of heartbeat requests, stop collecting new log data, write the log data that has been collected but not yet sent to a local file, record the current progress point, and upgrade using the downloaded upgrade file; and an upgrade check module, used to check whether the upgrade succeeded. If the upgrade succeeded, the log data written to the local file is sent to the data server, log collection resumes from the recorded progress point, and the upgraded version starts working; otherwise the client rolls back to the pre-upgrade version.

Further, after the log collection client starts, it creates a daemon process and a worker process. When sending a heartbeat request to the configuration server, the heartbeat request module performs the following operation: the worker process periodically sends a heartbeat request to the configuration server, and the heartbeat request carries the version number of the current log collection client and the IP address of the host. The configuration server sends an empty heartbeat request response when there is no upgrade API request, and sends a heartbeat request response carrying an upgrade instruction when there is an upgrade API request. The upgrade instruction includes the version number of the log collection client to upgrade to and its download address.

Further, before upgrading using the downloaded upgrade file, the upgrade response module also performs the following operation: the worker process sends the signal SIGUSR1 to the daemon process to notify it of the upgrade operation.

Further, the daemon process has the following global states: A. DAEMON_INIT, the daemon process is about to perform initialization; B. DAEMON_INIT_FAIL, the daemon process failed to initialize; C. DAEMON_NORMAL, the daemon process initialized successfully and has started its daemon work; D. DAEMON_UPDATE, the daemon process is about to perform the program upgrade; E. DAEMON_UPDATE_FAIL, the daemon process failed to perform the program upgrade.

Further, when upgrading using the downloaded upgrade file, the upgrade response module performs the following operations: after receiving the SIGUSR1 signal, the daemon process sets the global state to DAEMON_UPDATE; when the daemon process detects in its daemon loop that the current global state is DAEMON_UPDATE, it upgrades using the downloaded upgrade file; the daemon process sends the SIGKILL signal to the worker process, and the worker process exits.

Further, when upgrading using the downloaded upgrade file, the upgrade response module also performs the following operations: it executes the upgraded log collection client program, which creates the new-version daemon process and worker process; and the new-version daemon process periodically checks the global state in a loop. If the new-version worker process exits abnormally after starting, so that the global state becomes DAEMON_UPDATE_FAIL, the new-version daemon process sends the notification signal SIGUSR2 to the original-version daemon process together with a startup-failure message. If the state is found to remain DAEMON_INIT and no abnormality has occurred in the new-version worker process within the loop period, it sends the notification signal SIGUSR2 and a startup-success message to the original-version daemon process.

Further, when checking whether the upgrade succeeded, the upgrade check module performs the following operations: the original-version daemon process periodically checks, in a loop, for the notification signal SIGUSR2 from the new-version daemon process. If no SIGUSR2 arrives from the new-version daemon process within the loop period, the original-version daemon process considers the startup of the new version to have timed out and sends SIGKILL to the process group of the new-version daemon process to stop the new program; the original-version daemon process then restarts the worker process and rolls back to the pre-upgrade state. If SIGUSR2 is received from the new-version daemon process within the loop period together with a startup-failure message, the original-version daemon process likewise sends SIGKILL to the process group of the new-version daemon process to stop the new program, then restarts the worker process and rolls back to the pre-upgrade state. If SIGUSR2 is received from the new-version daemon process within the loop period together with a startup-success message, the original-version daemon process exits and the upgrade has succeeded.

With the log collection client and upgrade method proposed by the invention, the upgrade requires no manual operation and maintenance intervention. The parent and child processes communicate in both directions during the upgrade and perform the upgrade only after reaching agreement, so no data is lost across the upgrade. If the new program starts abnormally, the daemon process detects this quickly and performs the version rollback automatically.

S1‧‧‧Step

S2‧‧‧Step

S3‧‧‧Step

Fig. 1 is a flowchart of the log collection client upgrade method of the invention; Fig. 2 is a flowchart of the operation of the original-version client; Fig. 3 is a flowchart of the operation of the new-version client; Fig. 4 is a schematic diagram of the structure of the log collection client.

The technical solution of the invention is described in further detail below with reference to the drawings and embodiments. The following embodiments do not limit the invention.

A log system generally includes the log collection client (Client) installed on each host, a configuration server (ConfigServer) that manages the log collection clients running on all hosts, and a data server (DataServer) that receives the log data collected by the log collection clients. A host is a device whose logs the log system records, and a log collection client is installed on every host.

After the log collection client on a host starts, two processes run: the daemon process DaemonProcess and the worker process WorkerProcess. When the log collection client is started, the parent process DaemonProcess is created first; it then calls the fork system call to create the child process WorkerProcess. The child process WorkerProcess collects the contents of the specified log files according to the user's collection configuration and sends them over the network to the data server. WorkerProcess also sends a heartbeat request to the configuration server at regular intervals (for example, every minute) and receives the configuration server's instructions through the content of the heartbeat response. The parent process DaemonProcess is the daemon: it restarts the child process when it finds that WorkerProcess has exited unexpectedly, and it triggers the upgrade flow when a Client version upgrade instruction is found.

As shown in Fig. 1, the log collection client upgrade method of this embodiment is applied to a log collection client and includes:

Step S1: send a heartbeat request to the configuration server and receive the heartbeat request response returned by the configuration server.

After the Client on the host starts, WorkerProcess sends a heartbeat request to ConfigServer once every minute. The request contains the current program file version number v_1 and the host address ip_1. When there is no upgrade operation, ConfigServer returns empty content to WorkerProcess in the response to the heartbeat request.

Suppose a new Client version v_2 is available for host ip_1. Operation and maintenance staff send a request to ConfigServer through the upgrade API. After receiving the upgrade API request, ConfigServer sets the state of host ip_1: current version v_1, version to upgrade to v_2.

ConfigServer then returns an upgrade instruction in the response to the heartbeat request. The upgrade instruction includes the v_2 version number, the v_2 program upgrade file (as an HTTP download address), and the md5sum of the executable file.
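The server side of this exchange can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `UpgradeInstruction`, `pending_upgrades`, `request_upgrade`, and `handle_heartbeat` are hypothetical, the per-host table is kept in memory, and the rule that a host already running the target version receives an empty response is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UpgradeInstruction:
    version: str       # version to upgrade to, e.g. "v_2"
    download_url: str  # HTTP download address of the upgrade file
    md5sum: str        # md5sum of the executable file


# Hypothetical per-host table kept by the configuration server:
# host IP -> upgrade instruction set through the upgrade API.
pending_upgrades: Dict[str, UpgradeInstruction] = {}


def request_upgrade(host_ip: str, instruction: UpgradeInstruction) -> None:
    """Stand-in for the upgrade API: mark a host as awaiting an upgrade."""
    pending_upgrades[host_ip] = instruction


def handle_heartbeat(host_ip: str, current_version: str) -> Optional[UpgradeInstruction]:
    """Return None (an empty response) or the pending upgrade instruction."""
    pending = pending_upgrades.get(host_ip)
    # Assumption: a host already on the target version needs no instruction.
    if pending is None or pending.version == current_version:
        return None
    return pending
```

A heartbeat from `ip_1` at version `v_1` returns `None` until `request_upgrade` has been called for that host, after which it returns the stored instruction.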

Step S2: according to the upgrade instruction carried in the heartbeat request response, suspend the sending of heartbeat requests, stop collecting new log data, write the log data already collected to a local file, record the current progress point, and download the upgrade file to begin the upgrade.

For ease of description, this embodiment refers to the log collection client of version v_1 as ClientV1; its two processes are WorkerProcessV1 and DaemonProcessV1. The log collection client upgraded to version v_2 is referred to as ClientV2; its two processes are WorkerProcessV2 and DaemonProcessV2.

While ClientV1 is running, when WorkerProcessV1 finds an upgrade instruction in the response to a heartbeat request, it begins preparing for the upgrade: it downloads the program upgrade file to the local machine, decompresses it, and verifies the md5sum of the executable file.
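The md5sum check at the end of this step might look like the sketch below. The function names `md5_of_file` and `verify_executable` are hypothetical, and the download and decompression are left out; only the standard-library `hashlib` module is used.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_executable(path: str, expected_md5: str) -> bool:
    """Compare the file's digest with the md5sum from the upgrade instruction."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

If `verify_executable` returns `False`, the downloaded file does not match the instruction; what the client does in that case is not stated in the text above.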

It stops reading new log data.

The log data already read into memory is parsed and then written to the local file BufferFile. The log data written to BufferFile is data that has been collected but not yet sent; after the upgrade completes, ClientV2 sends BufferFile to DataServer. Writing to BufferFile instead of sending over the network greatly reduces the upgrade time that network transmission latency would otherwise add.
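One way to picture the BufferFile hand-off is the sketch below. The on-disk layout (one JSON-encoded record per line) and the function names `write_buffer_file` and `drain_buffer_file` are assumptions made for illustration; the text does not specify a file format.

```python
import json
import os
from typing import Callable, Iterable


def write_buffer_file(path: str, records: Iterable[str]) -> int:
    """Old version: persist parsed but unsent records, one JSON string per line."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            # JSON encoding keeps embedded newlines from splitting a record.
            f.write(json.dumps(record) + "\n")
            count += 1
        f.flush()
        os.fsync(f.fileno())  # make sure the data reaches disk before exiting
    return count


def drain_buffer_file(path: str, send: Callable[[str], None]) -> int:
    """New version: send every buffered record, then delete the file."""
    sent = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            send(json.loads(line))
            sent += 1
    os.remove(path)
    return sent
```

Here `send` stands in for whatever routine delivers a record to DataServer.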

It records the progress point CheckPoint. Log collection has a progress state; CheckPoint holds that state and is persisted to a file. Its content includes the log directory, the log file name, the log file signature, and the position in the log file up to which data has been collected.
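The four CheckPoint fields listed above map naturally onto a small record type. The sketch below is illustrative only: the field names, the JSON encoding, and the write-then-rename persistence are assumptions, not details given in the text.

```python
import json
import os
from dataclasses import asdict, dataclass


@dataclass
class CheckPoint:
    log_dir: str         # log directory
    file_name: str       # log file name
    file_signature: str  # log file signature
    offset: int          # position collected so far in the log file


def save_checkpoint(cp: CheckPoint, path: str) -> None:
    """Persist the checkpoint; write a temp file first, then rename over the target."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(asdict(cp), f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path: str) -> CheckPoint:
    """Read the checkpoint back so collection can resume from the saved offset."""
    with open(path, "r", encoding="utf-8") as f:
        return CheckPoint(**json.load(f))
```

Writing to a temporary file and renaming it means a crash during the save leaves the previous checkpoint intact.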

WorkerProcessV1 sends SIGUSR1 to DaemonProcessV1 to notify it of the upgrade operation.

For the DaemonProcess process, this embodiment defines five global states that represent the state of DaemonProcess:

1. DAEMON_INIT

DaemonProcess is about to perform initialization.

2. DAEMON_INIT_FAIL

DaemonProcess failed to initialize.

3. DAEMON_NORMAL

DaemonProcess initialized successfully and has started its daemon work.

4. DAEMON_UPDATE

DaemonProcess is about to perform the program upgrade.

5. DAEMON_UPDATE_FAIL

DaemonProcess failed to perform the program upgrade.

The DaemonProcess process also handles the following signals:

1) The SIGCHLD signal handler of DaemonProcess.

The SIGCHLD signal indicates that the child process WorkerProcess has exited abnormally. If the global state is DAEMON_INIT, the state changes to DAEMON_INIT_FAIL.

2) The SIGUSR1 signal handler of DaemonProcess.

In this embodiment, SIGUSR1 is defined as the signal that WorkerProcess sends to DaemonProcess to notify it of the upgrade operation. After receiving it, DaemonProcess sets the global state to DAEMON_UPDATE.

3) The SIGUSR2 signal handler of DaemonProcess.

In this embodiment, SIGUSR2 is defined as the signal that the new-version DaemonProcess started by the upgrade sends to the old-version DaemonProcess. If the signal carries the message DaemonStartSuccess (the new-version DaemonProcess and WorkerProcess started successfully), DaemonProcess exits on its own; if the message is DaemonStartFail (the new-version DaemonProcess or WorkerProcess failed to start), the global state of DaemonProcess is set to DAEMON_UPDATE_FAIL.

4) SIGKILL. After SIGKILL is sent to a process, the process that receives it terminates.
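The five states and the three handlers above can be summarized as a small state machine. The sketch below models only the state transitions as pure functions; it does not install real signal handlers, and how the DaemonStartSuccess or DaemonStartFail message accompanies SIGUSR2 is not specified in the text, so the message is passed here as a plain string. The function names are hypothetical.

```python
from enum import Enum
from typing import Tuple


class DaemonState(Enum):
    DAEMON_INIT = 1         # about to perform initialization
    DAEMON_INIT_FAIL = 2    # initialization failed
    DAEMON_NORMAL = 3       # initialized, daemon work started
    DAEMON_UPDATE = 4       # about to perform the program upgrade
    DAEMON_UPDATE_FAIL = 5  # program upgrade failed


def on_sigchld(state: DaemonState) -> DaemonState:
    """Worker exited abnormally: only DAEMON_INIT turns into DAEMON_INIT_FAIL."""
    if state is DaemonState.DAEMON_INIT:
        return DaemonState.DAEMON_INIT_FAIL
    return state


def on_sigusr1(state: DaemonState) -> DaemonState:
    """Worker announced an upgrade: enter DAEMON_UPDATE."""
    return DaemonState.DAEMON_UPDATE


def on_sigusr2(state: DaemonState, message: str) -> Tuple[DaemonState, bool]:
    """New-version daemon reported its result; returns (new_state, should_exit)."""
    if message == "DaemonStartSuccess":
        return state, True
    if message == "DaemonStartFail":
        return DaemonState.DAEMON_UPDATE_FAIL, False
    return state, False  # unknown message: leave the state unchanged (assumption)
```

In a real daemon the handlers would do no more than record the new state, and the daemon loop would act on it, which matches the loop-based checks described in this embodiment.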

Thus, after WorkerProcessV1 sends SIGUSR1 to DaemonProcessV1, DaemonProcessV1 handles the SIGUSR1 signal: its execution is interrupted and the signal handler sets the global state to DAEMON_UPDATE.

As shown in Figure 2, after ClientV1 starts, it sets DaemonProcessV1 to the DAEMON_INIT state and installs the SIGCHLD signal handler. It then forks WorkerProcessV1, which runs the log collection loop, while DaemonProcessV1 installs the SIGUSR1 signal handler and sets the state to DAEMON_NORMAL. When ConfigServer includes an upgrade instruction in a heartbeat response, WorkerProcessV1 sends SIGUSR1 to DaemonProcessV1; DaemonProcessV1 detects in its daemon loop that the global state is now DAEMON_UPDATE and begins the upgrade.

DaemonProcessV1 sends SIGKILL to WorkerProcessV1. At this point the in-memory queue of WorkerProcessV1 is empty, so WorkerProcessV1 exits without losing data.

DaemonProcessV1 installs the SIGUSR2 signal handler. If installation fails, it sets the current state to DAEMON_NORMAL and rolls back: it forks WorkerProcessV1 again, ends the upgrade operation, and returns to the pre-upgrade state. If installation succeeds, it forks a child process, executes the new-version program file ClientV2 in that child's process space, and begins looping to check whether the upgrade succeeded.
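The branch on whether the SIGUSR2 handler could be installed can be sketched as follows. This is an illustration under stated assumptions: the function name `begin_upgrade` and its three callable parameters are hypothetical, and the actual fork and exec of ClientV2 are injected rather than performed, so the decision logic stays testable.

```python
def begin_upgrade(install_sigusr2, restart_worker_v1, launch_client_v2):
    """Return the daemon's next state name after trying to start ClientV2.

    All three arguments are hypothetical callables supplied by the caller.
    """
    try:
        # In a real daemon this might wrap signal.signal(signal.SIGUSR2, handler).
        install_sigusr2()
    except (OSError, ValueError):
        # Roll back: fork WorkerProcessV1 again and resume normal operation.
        restart_worker_v1()
        return "DAEMON_NORMAL"
    # Handler installed: fork a child and execute ClientV2 in it
    # (for example os.fork() followed by os.execv()).
    launch_client_v2()
    return "DAEMON_UPDATE"
```

After a successful launch the daemon stays in DAEMON_UPDATE and waits for the new version's verdict, which is the loop the text describes next.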

Step S3: Check whether the upgrade succeeded. If it did, send the log data written to the local file to the data server, resume collecting log data from the recorded progress point, and start working with the upgraded version; otherwise, roll back to the pre-upgrade version.

As shown in Figure 3, after the new-version program file ClientV2 is executed, DaemonProcessV2 performs its initialization.

Set the current state to DAEMON_INIT.

Install the SIGUSR2 and SIGCHLD signal handlers.

DaemonProcessV2 forks WorkerProcessV2 and enters a 5-second waiting loop in which it checks the global state. If the state changes to DAEMON_INIT_FAIL (WorkerProcessV2 exited abnormally after starting, and the SIGCHLD handler changed the global state), it sends SIGUSR2 to DaemonProcessV1 with the message DaemonStartFail.

If the state remains DAEMON_INIT and WorkerProcessV2 shows no abnormality within the 5 seconds, it sends SIGUSR2 with the message DaemonStartSuccess to DaemonProcessV1.
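The 5-second start-up check performed by DaemonProcessV2 could look roughly like the sketch below. It is an assumption-laden illustration: `startup_verdict`, `read_state`, and the polling interval are hypothetical names and values, and the clock and sleep functions are injectable so the loop can be exercised without waiting in real time.

```python
import time


def startup_verdict(read_state, window=5.0, interval=0.1,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll the global state for `window` seconds and pick the message for SIGUSR2."""
    deadline = clock() + window
    while clock() < deadline:
        if read_state() == "DAEMON_INIT_FAIL":
            # The worker died during start-up (SIGCHLD changed the state).
            return "DaemonStartFail"
        sleep(interval)
    # One last look, in case the failure landed right at the deadline.
    if read_state() == "DAEMON_INIT_FAIL":
        return "DaemonStartFail"
    return "DaemonStartSuccess"
```

The returned string is the message the new daemon would attach when signalling the old one.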

Returning to Figure 2, DaemonProcessV1 meanwhile waits 15 seconds, checking for a signal from DaemonProcessV2. There are three cases. If no SIGUSR2 arrives from DaemonProcessV2 within 15 seconds, DaemonProcessV1 treats the start of the new version as timed out and sends SIGKILL to the process group of DaemonProcessV2 to stop the new program; DaemonProcessV1 then restarts WorkerProcessV1 and rolls back to the pre-upgrade state.

If SIGUSR2 is received within 15 seconds with the message DaemonStartFail, DaemonProcessV1 sends SIGKILL to the process group of DaemonProcessV2 to stop the new program, then restarts WorkerProcessV1 and rolls back to the pre-upgrade state. In other words, DaemonProcessV1 cleans up the ClientV2 process group and falls back to working with version V1.

If SIGUSR2 is received within 15 seconds with the message DaemonStartSuccess, DaemonProcessV1 calls exit, that is, it exits voluntarily on receiving the signal. Once DaemonProcessV1 has exited, DaemonProcessV2 and WorkerProcessV2 take over completely; from then on only the two V2 processes run on the machine, and the upgrade has completed successfully.
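The three cases handled by DaemonProcessV1 reduce to two outcomes: exit, or kill the new process group and roll back. The sketch below models this with a `queue.Queue` standing in for SIGUSR2 delivery, which is purely an assumption for illustration; the function name and the returned action strings are hypothetical as well.

```python
import queue


def await_upgrade_result(inbox, timeout=15.0):
    """Wait for the new daemon's verdict and return the old daemon's action."""
    try:
        message = inbox.get(timeout=timeout)
    except queue.Empty:
        message = None  # Case 1: no SIGUSR2 arrived in time.
    if message == "DaemonStartSuccess":
        # Case 3: the new version took over, so the old daemon exits.
        return "exit_v1"
    # Cases 1 and 2: SIGKILL the ClientV2 process group (os.killpg in a real
    # daemon), restart WorkerProcessV1, and return to the pre-upgrade state.
    return "kill_v2_group_and_roll_back"
```

Treating a timeout and an explicit failure identically mirrors the text, where both paths end in the same cleanup and rollback.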

The log collection client of the present invention has been deployed on more than one hundred thousand servers; through the upgrade API, the client version on all machines can be upgraded within 10 minutes. The client upgrade on a single machine generally completes within 5 seconds, no collected data is lost during the process, and the client rolls back automatically if the new version of the program is abnormal.

As shown in Figure 4, the log collection client of this embodiment includes a heartbeat request module, an upgrade response module, and an upgrade check module. The client is installed on a host, collects log data, and interacts with the configuration server to upgrade the program.

The heartbeat request module sends heartbeat requests to the configuration server and receives the heartbeat responses it returns. The upgrade response module, according to the upgrade instruction carried in a heartbeat response, downloads the upgrade file, suspends sending heartbeat requests, stops collecting new log data, writes log data that has been collected but not yet sent to a local file, records the current progress point, and upgrades using the downloaded upgrade file. The upgrade check module checks whether the upgrade succeeded: if it did, the log data written to the local file is sent to the data server, log collection resumes from the recorded progress point, and work starts with the upgraded version; otherwise the client rolls back to the pre-upgrade version.
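Writing unsent log data to a local file and recording a progress point, then picking both up after the upgrade, might be sketched as below. The file layout is an assumption made for illustration only: one record per line in a spool file, and a JSON file holding a byte offset as the progress point. The function names `checkpoint` and `resume` are hypothetical.

```python
import json


def checkpoint(pending_records, read_offset, spool_path, progress_path):
    """Persist collected-but-unsent records and the current progress point."""
    with open(spool_path, "a", encoding="utf-8") as spool:
        for record in pending_records:
            spool.write(record + "\n")
    with open(progress_path, "w", encoding="utf-8") as f:
        json.dump({"offset": read_offset}, f)


def resume(spool_path, progress_path):
    """Load the spooled records to send and the offset to resume collecting from."""
    with open(spool_path, encoding="utf-8") as spool:
        unsent = [line.rstrip("\n") for line in spool]
    with open(progress_path, encoding="utf-8") as f:
        offset = json.load(f)["offset"]
    return unsent, offset
```

Whichever version ends up running after the upgrade, new or rolled back, can call `resume` to send the spooled records and continue from the saved offset.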

After the log collection client of this embodiment starts, it creates a daemon process and a worker process. The operations each module performs during the upgrade are described below.

When sending a heartbeat request to the configuration server, the heartbeat request module performs the following operation: the worker process periodically sends a heartbeat request to the configuration server. The request carries the version number of the current log collection client and the IP address of the host, so that the configuration server sends an empty heartbeat response when there is no upgrade API request and a heartbeat response carrying an upgrade instruction when there is one. The upgrade instruction includes the version number of the log collection client to be upgraded and its download address.
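The heartbeat exchange can be illustrated with plain dictionaries. The text does not specify a wire format, so everything here is a hypothetical sketch: the field names `version`, `ip`, `upgrade`, and `download_url`, the per-host lookup of pending upgrades, and both function names are assumptions.

```python
def build_heartbeat_response(request, pending_upgrades):
    """Server side: empty response unless an upgrade was requested for this host."""
    target = pending_upgrades.get(request["ip"])
    if target is None:
        return {}
    return {"upgrade": {"version": target["version"],
                        "download_url": target["download_url"]}}


def parse_upgrade_instruction(response):
    """Client side: return (version, download_url), or None for an empty response."""
    upgrade = response.get("upgrade")
    if upgrade is None:
        return None
    return upgrade["version"], upgrade["download_url"]
```

For example, a worker that gets `None` back simply keeps collecting, while a tuple would trigger the download and the SIGUSR1 notification to the daemon.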

In this embodiment, before upgrading with the downloaded upgrade file, the upgrade response module also performs the following operation: the worker process sends the daemon process the signal SIGUSR1 to notify it of the upgrade operation.

In this embodiment, when upgrading with the downloaded upgrade file, the upgrade response module performs the following operations: after receiving the SIGUSR1 signal, the daemon process sets the global state to DAEMON_UPDATE; when the daemon process detects in its daemon loop that the current global state is DAEMON_UPDATE, it upgrades using the downloaded upgrade file; the daemon process sends the SIGKILL signal to the worker process, and the worker process exits.

In this embodiment, when upgrading with the downloaded upgrade file, the upgrade response module also performs the following operations: it executes the upgraded log collection client program, creating the new-version daemon process and worker process; the new-version daemon process periodically checks the global state in a loop; if the new-version worker process exits abnormally after starting, causing the global state to change to DAEMON_UPDATE_FAIL, the new-version daemon process sends the notification signal SIGUSR2 to the original-version daemon process with a start-failure message; if the state remains DAEMON_INIT and no abnormality occurs in the new-version worker process within the loop period, it sends the notification signal SIGUSR2 and a start-success message to the original-version daemon process.

In this embodiment, when checking whether the upgrade succeeded, the upgrade check module performs the following operations: the original-version daemon process periodically checks in a loop for the notification signal SIGUSR2 from the new-version daemon process; if no SIGUSR2 arrives from the new-version daemon process within the loop period, the original-version daemon process treats the start of the new version as timed out and sends SIGKILL to the process group of the new-version daemon process to stop the new program, then restarts the worker process and rolls back to the pre-upgrade state; if SIGUSR2 is received from the new-version daemon process within the loop period with a start-failure message, the original-version daemon process sends SIGKILL to the process group of the new-version daemon process to stop the new program, then restarts the worker process and rolls back to the pre-upgrade state; if SIGUSR2 is received from the new-version daemon process within the loop period with a start-success message, the original-version daemon process exits and the upgrade has succeeded.

The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Without departing from the spirit and essence of the present invention, those skilled in the art may make various corresponding changes and modifications according to the present invention, and all such changes and modifications shall fall within the scope of protection of the claims appended to the present invention.

Claims (4)

1. A method for upgrading a log collection client, applied to a log collection client, characterized in that the method comprises:

after the log collection client starts, a first daemon process and a first worker process are created, the first daemon process having the following global states: A. DAEMON_INIT, the daemon process is preparing to perform initialization; B. DAEMON_INIT_FAIL, initialization of the daemon process failed; C. DAEMON_NORMAL, initialization of the daemon process succeeded and the daemon work has begun; D. DAEMON_UPDATE, the daemon process is preparing to perform a program upgrade; E. DAEMON_UPDATE_FAIL, the program upgrade performed by the daemon process failed;

the first worker process periodically sends a heartbeat request to a configuration server, receives the heartbeat response returned by the configuration server and, according to an upgrade instruction carried in the heartbeat response, downloads the upgrade file, suspends sending heartbeat requests, stops collecting new log data, writes log data that has been collected but not yet sent to a local file, records the current progress point, and downloads the upgrade file locally;

the first worker process sends the first daemon process the signal SIGUSR1 to notify it of the upgrade operation, and the first daemon process, after receiving the SIGUSR1 signal, sets the global state to DAEMON_UPDATE;

when the first daemon process detects in its daemon loop that the current global state is DAEMON_UPDATE, it upgrades using the downloaded upgrade file;

the first daemon process sends the SIGKILL signal to the first worker process, and the first worker process exits;

the upgraded log collection client program is executed, creating a new-version second daemon process and second worker process;

the second daemon process periodically checks the global state in a loop;

if the second worker process exits abnormally after starting, causing the global state to change to DAEMON_UPDATE_FAIL, the notification signal SIGUSR2 is sent to the original-version first daemon process with a start-failure message;

if the state remains DAEMON_INIT and no abnormality occurs in the new-version second worker process within the loop period, the notification signal SIGUSR2 and a start-success message are sent to the original-version first daemon process;

the original-version first daemon process periodically checks in a loop for the notification signal SIGUSR2 from the new-version second daemon process;

if no SIGUSR2 arrives from the new-version second daemon process within the loop period, the original-version first daemon process treats the start of the new version as timed out and sends SIGKILL to the process group of the new-version second daemon process to stop the new program, after which the original-version first daemon process restarts the first worker process and rolls back to the pre-upgrade state;

if SIGUSR2 is received from the new-version second daemon process within the loop period with a start-failure message, the original-version first daemon process sends SIGKILL to the process group of the new-version second daemon process to stop the new program, after which the original-version first daemon process restarts the first worker process and rolls back to the pre-upgrade state;

if SIGUSR2 is received from the new-version second daemon process within the loop period with a start-success message, the original-version first daemon process exits and the upgrade has succeeded.

2. The method for upgrading a log collection client according to claim 1, wherein the heartbeat request carries the version number of the current log collection client and the IP address of the host, so that the configuration server sends an empty heartbeat response when there is no upgrade API request and sends a heartbeat response carrying an upgrade instruction when there is an upgrade API request, the upgrade instruction including the version number of the log collection client to be upgraded and its download address.

3. A log collection client, characterized in that the log collection client performs the following operations:

after the log collection client starts, a first daemon process and a first worker process are created, the first daemon process having the following global states: A. DAEMON_INIT, the daemon process is preparing to perform initialization; B. DAEMON_INIT_FAIL, initialization of the daemon process failed; C. DAEMON_NORMAL, initialization of the daemon process succeeded and the daemon work has begun; D. DAEMON_UPDATE, the daemon process is preparing to perform a program upgrade; E. DAEMON_UPDATE_FAIL, the program upgrade performed by the daemon process failed;

the first worker process periodically sends a heartbeat request to a configuration server, receives the heartbeat response returned by the configuration server and, according to an upgrade instruction carried in the heartbeat response, downloads the upgrade file, suspends sending heartbeat requests, stops collecting new log data, writes log data that has been collected but not yet sent to a local file, records the current progress point, and downloads the upgrade file locally;

the first worker process sends the first daemon process the signal SIGUSR1 to notify it of the upgrade operation, and the first daemon process, after receiving the SIGUSR1 signal, sets the global state to DAEMON_UPDATE;

when the first daemon process detects in its daemon loop that the current global state is DAEMON_UPDATE, it upgrades using the downloaded upgrade file;

the first daemon process sends the SIGKILL signal to the first worker process, and the first worker process exits;

the upgraded log collection client program is executed, creating a new-version second daemon process and second worker process;

the second daemon process periodically checks the global state in a loop;

if the second worker process exits abnormally after starting, causing the global state to change to DAEMON_UPDATE_FAIL, the notification signal SIGUSR2 is sent to the original-version first daemon process with a start-failure message;

if the state remains DAEMON_INIT and no abnormality occurs in the new-version second worker process within the loop period, the notification signal SIGUSR2 and a start-success message are sent to the original-version first daemon process;

the original-version first daemon process periodically checks in a loop for the notification signal SIGUSR2 from the new-version second daemon process;

if no SIGUSR2 arrives from the new-version second daemon process within the loop period, the original-version first daemon process treats the start of the new version as timed out and sends SIGKILL to the process group of the new-version second daemon process to stop the new program, after which the original-version first daemon process restarts the first worker process and rolls back to the pre-upgrade state;

if SIGUSR2 is received from the new-version second daemon process within the loop period with a start-failure message, the original-version first daemon process sends SIGKILL to the process group of the new-version second daemon process to stop the new program, after which the original-version first daemon process restarts the first worker process and rolls back to the pre-upgrade state;

if SIGUSR2 is received from the new-version second daemon process within the loop period with a start-success message, the original-version first daemon process exits and the upgrade has succeeded.

4. The log collection client according to claim 3, wherein the heartbeat request carries the version number of the current log collection client and the IP address of the host, so that the configuration server sends an empty heartbeat response when there is no upgrade API request and sends a heartbeat response carrying an upgrade instruction when there is an upgrade API request, the upgrade instruction including the version number of the log collection client to be upgraded and its download address.

TW106102497A 2017-01-23 2017-01-23 Log collection client terminal and its upgrading method TWI740886B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW106102497A TWI740886B (en) 2017-01-23 2017-01-23 Log collection client terminal and its upgrading method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW106102497A TWI740886B (en) 2017-01-23 2017-01-23 Log collection client terminal and its upgrading method

Publications (2)

Publication Number Publication Date
TW201828057A TW201828057A (en) 2018-08-01
TWI740886B true TWI740886B (en) 2021-10-01

Family

ID=63960538

Family Applications (1)

Application Number Title Priority Date Filing Date
TW106102497A TWI740886B (en) 2017-01-23 2017-01-23 Log collection client terminal and its upgrading method

Country Status (1)

Country Link
TW (1) TWI740886B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113141263B (en) * 2020-01-02 2022-09-27 广东博智林机器人有限公司 Upgrading method, device, system and storage medium
CN111796842A (en) * 2020-06-10 2020-10-20 云南电网有限责任公司 Remote upgrading method and device for log client software

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW200405202A (en) * 2002-09-20 2004-04-01 Ibm Method and apparatus for automatic updating and testing of software
TW201108115A (en) * 2009-08-28 2011-03-01 Hon Hai Prec Ind Co Ltd A method for upgrading software of gateways
US20120174085A1 (en) * 2010-12-30 2012-07-05 Volker Driesen Tenant Move Upgrade
CN103677870A (en) * 2012-09-10 2014-03-26 腾讯科技(深圳)有限公司 System upgrading method and system upgraded by means of method
CN105187262A (en) * 2015-10-27 2015-12-23 上海斐讯数据通信技术有限公司 Router upgrading method and system
US20160019043A1 (en) * 2014-07-15 2016-01-21 Oracle International Corporation Automatic generation and execution of server update processes

Also Published As

Publication number Publication date
TW201828057A (en) 2018-08-01

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees