CN112667472A - Data source connection state monitoring device and method - Google Patents

Publication number
CN112667472A
Authority
CN
China
Prior art keywords
state
data source
refreshing
connection
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011576061.4A
Other languages
Chinese (zh)
Other versions
CN112667472B (en
Inventor
黄海明
高东升
胡高坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dream Database Co Ltd
Wuhan Dameng Database Co Ltd
Original Assignee
Wuhan Dream Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dream Database Co Ltd filed Critical Wuhan Dream Database Co Ltd
Priority to CN202011576061.4A priority Critical patent/CN112667472B/en
Publication of CN112667472A publication Critical patent/CN112667472A/en
Application granted granted Critical
Publication of CN112667472B publication Critical patent/CN112667472B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a device and a method for monitoring a data source connection state, wherein the device comprises a data source state monitor and a data source state manager, and the data source state manager comprises a data source state warehouse, a start-stop device and a refreshing module; the data source state warehouse is provided with a plurality of state storages, and each state storage stores the connection state of one type of data source; the method comprises the following steps that a state refreshing task is established for each type of data source by a start-stop device and is put into a delay queue; the refreshing module controls each state refreshing task in the queue to periodically refresh the data source connection state and update the data source connection state in the corresponding state memory; and when the monitor monitors the refresh request of the data source state monitoring client, forwarding the data source connection state returned by the data source state warehouse to the data source state monitoring client. The scheme realizes monitoring of the connection state of multiple data sources on the premise of reducing the resource consumption of the system as much as possible, and realizes the balance of real-time performance and resource consumption.

Description

Data source connection state monitoring device and method
[ technical field ]
The invention relates to the technical field of data monitoring, in particular to a device and a method for monitoring a connection state of a data source.
[ background of the invention ]
A data exchange system (hereinafter abbreviated as ETL) is a data processing system that extracts (Extract), cleans and converts (Transform), and loads (Load) data. A data source is an external data store that the data exchange system connects to when reading or writing data, and the connection information of each data source is stored as metadata so that the system can access it at any time. In such a system, a user needs a convenient way to check whether some or all data sources can be connected normally, or to see the cause of the fault that makes a data source unconnectable so that the fault can be removed. The system therefore needs a functional module that records and maintains the connection state of each data source object, storing and refreshing information such as whether the data source is connectable in the current environment and, if not, the cause of the fault.
The monitoring of the connection state of the data source is a process of sending a connection request to the data source server and judging whether the data source is connectable or resolving a fault cause causing the data source to be not connectable according to a returned result, and is a connection test operation. The connection test modes of different types of data sources are different, and although the connection test modes of the same type of data sources are the same, the consumed time and resources are different; in addition, performing connection testing is a very time consuming and resource consuming operation, and in the actual use of a data exchange system, there are often a large number of different types of data sources. Under the condition of monitoring the connection state of large-batch data sources of different types, if all the data sources are unconditionally and uninterruptedly tested, great pressure is applied to the server of the system.
The traditional monitoring technology generally adopts the following two mechanisms to monitor the connection state of the data source:
1) Linear polling mode: a connection test is performed on each existing data source in turn. This mode consumes few resources but suffers from insufficient real-time performance: when the connection test of one data source takes a long time, the connection tests of all subsequent data sources are delayed;
2) Concurrent mode: a thread is started for each data source to perform its connection test. This mode refreshes efficiently but puts great pressure on the system.
Therefore, the balance between real-time performance and resource consumption is difficult to realize in both the linear polling mode and the concurrent mode; in addition, the conventional monitoring technology needs to continuously monitor the connection states of a large number of different types of data sources, and cannot be adjusted according to actual conditions, so that unnecessary system resource consumption is caused, and waste of system memory and CPU resources is caused.
In view of the above, it is an urgent problem in the art to overcome the above-mentioned drawbacks of the prior art.
[ summary of the invention ]
The technical problems to be solved by the invention are as follows:
When a data exchange system is in actual use, a large number of data sources of different types exist. The traditional monitoring technology has to perform connection tests on all data sources without interruption and cannot be adjusted according to actual conditions, which wastes the memory and CPU (central processing unit) resources of the system; moreover, neither the linear polling mode nor the concurrent mode achieves a balance between real-time performance and resource consumption.
The invention achieves the above purpose by the following technical scheme:
in a first aspect, the present invention provides a data source connection status monitoring apparatus, including a data source status listener and a data source status manager, where the data source status listener is configured to monitor a connection request and a refresh request of a data source status monitoring client, and schedule a work of the data source status manager according to the request;
the data source state manager comprises a data source state warehouse, a start-stop device and a refreshing module;
the data source state warehouse is provided with a plurality of state memories, and each state memory is used for storing the connection state of one type of data source;
when started for the first time, the start-stop device creates, for each type of data source in the ETL system, a state refreshing task responsible for refreshing the connection states of that type of data source, and puts the state refreshing task into a delay queue of the refreshing module;
the refreshing module controls each state refreshing task in the delay queue to periodically refresh the connection state of the data source according to a preset interval and update the connection state in the corresponding state memory;
when the data source state monitor monitors a refresh request of the data source state monitoring client, the data source state monitor forwards the refresh request to the data source state manager, and forwards a data source connection state returned by the data source state warehouse to the data source state monitoring client sending the refresh request.
Preferably, the refresh module further includes a delay queue manager and a thread pool;
the delay queue manager takes out the state refreshing task with the expired delay from the delay queue according to a preset interval, packages the state refreshing task into a state refreshing thread and puts the state refreshing thread into the thread pool;
the thread pool is used for allocating resources for the state refreshing thread to run so as to refresh the connection state of the data source in the corresponding state memory;
if the ETL system still has the data source and the start-stop device is in the starting state, the corresponding state refreshing task is resubmitted to the delay queue after the state refreshing thread is finished running.
Preferably, a counter is arranged in the data source state monitor, and when a connection request sent by the data source state monitoring client is monitored, the counter is increased by 1; every time when the disconnection of the data source state monitoring client is monitored, the counter is decreased by 1;
when the counter is larger than 0, the data source state listener sends a starting instruction to the data source state manager; when the counter equals 0, the data source status listener sends a "pause" instruction to the data source status manager.
Preferably, the start-stop device is configured to receive "start" and "pause" instructions sent by the data source status listener, and maintain a flag of a "start-stop state machine" according to the instructions;
when the start-stop device receives a start instruction and the start-stop state machine is in a stop state, the start-stop device creates a state refreshing task for each type of data source in the ETL system, puts the state refreshing task into the delay queue for waiting execution, and sets the state of the start-stop state machine as start;
and when the start-stop device receives a pause instruction, setting the state of the start-stop state machine to stop, and enabling the refreshing module to be in a standby state at the moment.
Preferably, the data source state manager further includes a broadcast module, where the broadcast module is configured to periodically obtain connection states of various data sources from the data source state warehouse according to a preset interval, and send the connection states to the data source state monitor, so that the data source state monitor periodically sends the data source connection states to each data source state monitoring client.
Preferably, the data source state manager further comprises a task manager, and the task manager is in communication connection with the data source management client;
when the data source adding operation is executed in the data source management client, the task manager is used for adding a state refreshing task which is responsible for refreshing the connection state of the data source to the delay queue;
and when the data source management client executes the operation of deleting the data source, the task manager is used for removing the state refreshing task which is responsible for refreshing the connection state of the data source from the delay queue.
In a second aspect, the present invention provides a method for monitoring a connection status of a data source, where the method includes:
the data source state manager creates a state refreshing task for refreshing the connection state of each type of data source in the ETL system, and puts the state refreshing task into a delay queue for waiting execution;
each state refreshing task in the delay queue periodically refreshes the connection state of the data source according to a preset interval and completes state updating in a corresponding state memory of a data source state warehouse;
when the data source state monitor monitors the refresh request of the data source state monitoring client, the refresh request is forwarded to the data source state manager, and the connection state of various data sources returned by the data source state warehouse is forwarded to the data source state monitoring client sending the refresh request.
Preferably, the method further comprises:
and a broadcasting module in the data source state manager acquires the connection state of each type of data source from the data source state warehouse according to a preset period, sends the connection state to the data source state monitor, and periodically sends the data source connection state to each data source state monitoring client by the data source state monitor.
Preferably, the method further comprises:
when the data source management client side executes the operation of adding or deleting the data source, the data source management server sends the data source type of the data source to the task manager;
the task manager acquires a data source metadata list of the type of the data source from the data source management server and judges whether the acquired list is empty;
when the obtained list is not empty and a state refreshing task corresponding to the data source does not exist in the delay queue, the task manager adds a state refreshing task in charge of refreshing the connection state of the data source to the delay queue;
and when the acquired list is empty and the state refreshing task corresponding to the data source exists in the delay queue, the task manager removes the state refreshing task which is responsible for refreshing the connection state of the data source from the delay queue.
Preferably, each state refresh task in the delay queue periodically refreshes the connection state of the data source according to a preset interval, and completes state update in a corresponding state memory of the data source state warehouse, specifically:
the delay queue manager takes out the state refreshing task with the expired delay from the delay queue according to a preset interval, packages the state refreshing task into a state refreshing thread and puts the state refreshing thread into a thread pool;
the thread pool allocates resources for the state refresh thread to run so as to refresh the connection state of the data source in the corresponding state memory;
and when each state refreshing thread runs, judging whether to submit the state refreshing task of the data source to the delay queue, if the data source still exists in the ETL system and the start-stop device is in the starting state, resubmitting the corresponding state refreshing task to the delay queue after the execution is finished.
Compared with the prior art, the invention has the beneficial effects that:
the invention adopts a classification monitoring mechanism and a delay queue mechanism, the connection state information of a large quantity of various data sources is divided and stored in a warehouse according to the types of the data sources, a state refreshing task which is responsible for refreshing the state of each data source is established for each type of the data sources and added into the delay queue, each state refreshing task can periodically refresh and store the connection state of one type of the data sources, and the stored data source connection state is fed back when the refreshing request of a monitoring client is monitored, so that the connection test of all the data sources is not required to be carried out uninterruptedly, the unnecessary waste of system resources is avoided, and the feedback can be carried out in time when the client needs to access the data source connection state, thereby realizing the balance of real-time and resource consumption.
Moreover, the connection test work and the connection state broadcast work are decoupled, the result of the latest connection test of a certain type of data source is obtained from the warehouse and is sent out, and the data source is not sent together after all refreshing results are finished one round, so that the balance between real-time performance and resource consumption can be ensured, the problem that the real-time performance of a linear polling mode is insufficient is solved, and the problem that the resources are consumed excessively by a concurrent mode is also solved.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a structural diagram of a data source connection status monitoring apparatus according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data source connection status monitoring method according to an embodiment of the present invention.
[ detailed description of the embodiments ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other. The invention will be described in detail below with reference to the figures and examples.
Example 1:
The conventional monitoring technology needs to perform connection tests on all data sources without interruption, cannot be adjusted according to actual conditions, and has difficulty balancing real-time performance and resource consumption. To solve these problems, an embodiment of the present invention provides a data source connection state monitoring apparatus which, as shown in fig. 1, mainly includes two components: a data source state monitor and a data source state manager.
The data source state monitor is used for monitoring connection requests and refresh requests of a plurality of data source state monitoring clients, and scheduling the work of the data source state manager according to the requests, and is equivalent to a transfer structure between the data source state manager and the data source state monitoring clients.
The data source state manager mainly comprises a data source state warehouse, a start-stop device and a refreshing module. The data source state warehouse is provided with a plurality of state memories, and each state memory is used for storing the connection state of one type of data source. When the start-stop device is started for the first time, a state refreshing task which is responsible for refreshing the connection state of each type of data source in the ETL system can be created for each type of data source, and the state refreshing task is placed into a delay queue of the refreshing module to wait for execution. The refreshing module can control each state refreshing task in the delay queue to periodically refresh the connection state of the data source according to a preset interval and update the connection state in the corresponding state memory.
When the data source state monitor monitors a refresh request of the data source state monitoring client, the data source state monitor forwards the refresh request to the data source state manager, and forwards the connection state of various data sources returned by the data source state warehouse to the data source state monitoring client sending the refresh request.
The data source status listener and the data source status manager are specifically described below with reference to the accompanying drawings.
1. Data source state monitor
In connection with fig. 1, the data source status listener includes a counter, a repeater, and a broadcaster. The registration of the data source status listener and the functions of each structure are described below in conjunction with the associated code.
1) Registration of data source status listener: when the ETL server is started, the SocketIO service starts a data source status monitor, which is responsible for monitoring the connection and refresh requests of the data source status monitor client, and the related codes are implemented as follows:
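// Register the "/datasource-status" namespace on the SocketIO server.
SocketIONamespace datasourceStatusNamespace = server.addNamespace("/datasource-status");
// Obtain the singleton listener and bind it to the namespace.
DataSourceStatsListener datasourceStatusListener = DataSourceStatsListener.getInstance();
datasourceStatusListener.setDataSourceStatusNamespace(datasourceStatusNamespace);
// Register the listener's handler methods with the namespace.
datasourceStatusNamespace.addListeners(datasourceStatusListener);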
2) data source status listener functionality
a. Monitoring the connection request of the data source state monitoring client: when the data source state monitor monitors that a data source state monitoring client side sends a connection request, the counter is increased by 1; whenever it is monitored that the active data source status monitoring client is disconnected, the counter is decremented by 1.
b. Scheduling the work of the data source state manager: when the counter is larger than 0, the data source state listener sends a starting instruction to the data source state manager; when the counter equals 0, the data source status listener sends a "pause" instruction to the data source status manager.
c. Monitoring a refresh request of a data source state monitoring client: when the data source state monitor monitors a refresh request of a data source state monitoring client, the repeater forwards the refresh request to the data source state manager, and forwards a data source connection state returned by the data source state manager to the data source state monitoring client sending the refresh request for the data source state monitoring client to use.
d. Broadcast data source connection status: and the broadcaster sends the connection state related information of various data sources to all the data source state monitoring clients according to the data periodically sent by the data source state manager, and the connection state related information is used by each data source state monitoring client.
3) Relevant code for realizing data source state monitor function
[Code listing reproduced as images in the original publication: Figure BDA0002863293440000091, Figure BDA0002863293440000101]
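The counter and scheduling behaviour of the data source state monitor described in items a and b above might be sketched as follows. This is only an illustrative sketch, not the listing from the publication: the names ClientCounterSketch, StartStopCommands and RecordingCommands are hypothetical, and the choices to clamp the counter at 0 and to send nothing when a client disconnects while others remain connected are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the instructions accepted by the data source state manager. */
interface StartStopCommands {
    void start();
    void pause();
}

/** Counts connected monitoring clients and schedules the manager accordingly. */
class ClientCounterSketch {
    private final StartStopCommands manager;
    private int counter = 0;

    ClientCounterSketch(StartStopCommands manager) {
        this.manager = manager;
    }

    /** A monitoring client connected: counter + 1, then send "start" while counter > 0. */
    synchronized void onClientConnected() {
        counter++;
        if (counter > 0) {
            manager.start();
        }
    }

    /** A monitoring client disconnected: counter - 1, then send "pause" once it reaches 0. */
    synchronized void onClientDisconnected() {
        counter = Math.max(0, counter - 1); // assumption: never drop below 0
        if (counter == 0) {
            manager.pause();
        }
    }

    synchronized int connectedClients() {
        return counter;
    }
}

/** Records the instructions it receives, for demonstration only. */
class RecordingCommands implements StartStopCommands {
    final List<String> log = new ArrayList<>();

    @Override
    public void start() {
        log.add("start");
    }

    @Override
    public void pause() {
        log.add("pause");
    }
}
```

Because the start-stop device ignores a "start" instruction when it is already started, sending "start" on every connection is harmless in this sketch.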
2. Data source state manager
Referring to fig. 1, the data source state manager includes a data source state repository, a start/stop device, a refresh module, a broadcast module, a reader, a task manager, and a parameter updater, where the data source state repository includes a plurality of state memories, and the refresh module includes a delay queue, a delay queue manager, and a thread pool. The initialization of the data source state manager and the specific functions of each structure are described below in conjunction with related code.
1) Initialization of data source state manager
Initialization is performed when a data source state manager instance is created, and mainly comprises the following three aspects:
a. Initialization of parameters: this mainly initializes the refresh interval (CHECK_INTERVAL), that is, the preset interval mentioned above, which corresponds to the sleep time of each state refresh task after it has executed once. The value generally lies in the range of 1 second to 8 hours and can be defined as required; for example, it may be set to 15 seconds.
b. Initialization of the data source state warehouse: a plurality of state memories (StatusBean) are configured in the data source state warehouse, so that each state memory stores the connection states of one type of data source; a connection state is one of unknown, normal connection, and abnormal connection. In addition to the connection state, the state memory stores the refresh state of each data source, which is one of waiting for refresh, refreshing, and refreshed.
Specifically, the data structure of the data source state repository is as follows:
[Code listing reproduced as images in the original publication: Figure BDA0002863293440000111, Figure BDA0002863293440000121]
The types of data sources typically include database data sources (i.e., DB data sources), Hadoop data sources, Redis data sources, ES data sources, FTP data sources, and the like. In fig. 1, DS1, DS2, etc. in each state memory represent data source IDs, and the small circle following each ID represents the connection state of that data source; as an example, a black circle and a white circle represent normal connection and abnormal connection, respectively. The state memory corresponding to the ES data source is empty (EMPTY), which indicates that no data source of this type exists in the ETL system at this time, that is, there is no data source whose datasourceType is ES.
c. Starting the daemon thread: the daemon thread mainly refers to a broadcast module and a delay queue manager in the refreshing module, namely the delay queue manager and the broadcast module need to be started when the data source state manager is initialized.
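The state warehouse set up in step b might look roughly like the sketch below. The names StatusBean, hasDataSource and datasourceType come from the description; the class StatusWarehouseSketch, the two enums, the map-based layout and the method names are assumptions made for illustration and are not taken from the publication's listing.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Connection states named in the description. */
enum ConnectionState { UNKNOWN, NORMAL, ABNORMAL }

/** Refresh states named in the description. */
enum RefreshState { WAITING, REFRESHING, REFRESHED }

/** One state memory: the states of every data source of a single type, keyed by data source ID. */
class StatusBean {
    final String datasourceType;
    final Map<String, ConnectionState> connection = new ConcurrentHashMap<>();
    final Map<String, RefreshState> refresh = new ConcurrentHashMap<>();

    StatusBean(String datasourceType) {
        this.datasourceType = datasourceType;
    }

    /** True when at least one data source of this type is registered. */
    boolean hasDataSource() {
        return !connection.isEmpty();
    }

    /** A newly known data source starts as "unknown" and "waiting for refresh". */
    void register(String dataSourceId) {
        connection.put(dataSourceId, ConnectionState.UNKNOWN);
        refresh.put(dataSourceId, RefreshState.WAITING);
    }

    /** Stores the result of a connection test. */
    void update(String dataSourceId, ConnectionState state) {
        connection.put(dataSourceId, state);
        refresh.put(dataSourceId, RefreshState.REFRESHED);
    }
}

/** The warehouse: one state memory per data source type. */
class StatusWarehouseSketch {
    private final Map<String, StatusBean> memories = new ConcurrentHashMap<>();

    StatusBean memoryFor(String datasourceType) {
        return memories.computeIfAbsent(datasourceType, StatusBean::new);
    }
}
```

In this layout an empty state memory, such as the ES one in fig. 1, simply reports hasDataSource() as false, so no refresh task would be created for it.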
2) Start-stop device
The start-stop device is responsible for receiving the "start" and "pause" instructions sent by the data source state listener and for maintaining the flag of the "start-stop state machine" according to these instructions. After the ETL server is started, each time a data source state monitoring client sends a connection request, the start-stop device receives a "start" instruction from the data source state listener; when all data source state monitoring clients have disconnected, the start-stop device receives a "pause" instruction from the data source state listener.
When the start-stop device receives a "start" instruction, it checks the state of the "start-stop state machine". If the state is "start", nothing is done. If the state is "stop", which indicates that the start-stop device is being started for the first time, the start-stop device creates a state refreshing task for each type of data source existing in the ETL system, that is, for each state memory in the data source state warehouse whose hasDataSource value is true, and puts the tasks into the delay queue to wait for execution; at the same time, it sets the state of the "start-stop state machine" to "start".
When the start-stop device receives a "pause" instruction, it sets the state of the "start-stop state machine" to "stop"; at this point no data source state monitoring client is connected. When a state refreshing task in the refreshing module runs and finds that the state of the "start-stop state machine" is "stop", it does not execute the ping operation (that is, the connection test of the data source) and does not add a new state refreshing task to the delay queue. The refreshing module is thus in a standby state, which avoids unnecessary resource consumption and reduces the working pressure on the server.
The relevant codes for realizing the function of the start-stop device are as follows:
[Code listing reproduced as images in the original publication: Figure BDA0002863293440000131, Figure BDA0002863293440000141]
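The "start-stop state machine" described above can be illustrated with a small sketch. It is written under assumptions and is not the publication's listing: the class name StartStopDeviceSketch, the method names, and the use of a callback to stand in for "put a state refreshing task into the delay queue" are all hypothetical.

```java
import java.util.List;
import java.util.function.Consumer;

/** Two-state machine driven by the "start" and "pause" instructions of the listener. */
class StartStopDeviceSketch {
    enum State { START, STOP }

    private State state = State.STOP;
    /** Types whose state memory has hasDataSource == true (assumed to be supplied by the warehouse). */
    private final List<String> typesWithDataSources;
    /** Hypothetical hook that puts one state refreshing task into the delay queue. */
    private final Consumer<String> submitRefreshTask;

    StartStopDeviceSketch(List<String> typesWithDataSources, Consumer<String> submitRefreshTask) {
        this.typesWithDataSources = typesWithDataSources;
        this.submitRefreshTask = submitRefreshTask;
    }

    /** "start": create one refresh task per existing type, but only when currently stopped. */
    synchronized void onStart() {
        if (state == State.START) {
            return; // already started: nothing is done
        }
        for (String type : typesWithDataSources) {
            submitRefreshTask.accept(type);
        }
        state = State.START;
    }

    /** "pause": only flip the flag; running refresh tasks observe it and stop resubmitting. */
    synchronized void onPause() {
        state = State.STOP;
    }

    synchronized boolean isStarted() {
        return state == State.START;
    }
}
```

Because a "pause" only flips the flag, tasks drain out of the delay queue on their own, and a later "start" creates a fresh set of tasks.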
3) Refresh module
The delay queue manager is a daemon thread and is responsible for taking out the state refreshing tasks (represented by small hexagons in the figure) with expired delay from the delay queue according to preset intervals, packaging the state refreshing tasks into state refreshing threads (represented by adding T in the small hexagons in the figure) and putting the state refreshing threads into the thread pool. And the thread pool is used for allocating resources for the state refreshing thread to run so as to refresh the connection state of the data source in the corresponding state memory.
Each state refreshing thread corresponds to one type of data source. When a state refreshing thread runs, it refreshes the data source connection states in the state memory of that type in the data source state warehouse, and then decides whether the state refreshing task of that type needs to be submitted to the delay queue again after the thread finishes. The decision depends on whether data sources of that type still exist in the ETL system and on the state of the start-stop device. If the ETL system still has data sources of that type (namely, the list of data sources of that type is not empty) and the start-stop device is in the started state (namely, the state of the "start-stop state machine" is "start"), the corresponding state refreshing task is resubmitted to the delay queue after the state refreshing thread finishes, so that it can be taken out again after sleeping for the preset interval. If the ETL system no longer has data sources of that type, or the start-stop device is in the paused state, no further state refreshing is needed, and the corresponding state refreshing task is not resubmitted to the delay queue.
The relevant codes for realizing the functions of the refresh module are as follows:
[Code listing reproduced as images in the original publication: Figure BDA0002863293440000151, Figure BDA0002863293440000161]
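The interplay of delay queue, delay queue manager and thread pool described above might be sketched with the standard java.util.concurrent.DelayQueue. This is an illustration under assumptions, not the publication's listing: the class names, the pool size of 4, and the callbacks started, typeStillExists and ping are hypothetical, and the manager here blocks on take() until a delay expires rather than polling at a preset interval.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** A state refreshing task for one data source type; it becomes available after delayMillis. */
class RefreshTaskSketch implements Delayed {
    final String datasourceType;
    private final long dueAtNanos;

    RefreshTaskSketch(String datasourceType, long delayMillis) {
        this.datasourceType = datasourceType;
        this.dueAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}

class RefreshModuleSketch {
    final DelayQueue<RefreshTaskSketch> delayQueue = new DelayQueue<>();
    // Assumed pool size; daemon threads so the sketch never keeps the JVM alive.
    private final ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r, "state-refresh");
        t.setDaemon(true);
        return t;
    });
    private final long checkIntervalMillis;
    private final BooleanSupplier started;            // state of the start-stop state machine
    private final Predicate<String> typeStillExists;  // does the ETL system still have this type?
    private final Consumer<String> ping;              // connection test for one type (hypothetical hook)

    RefreshModuleSketch(long checkIntervalMillis, BooleanSupplier started,
                        Predicate<String> typeStillExists, Consumer<String> ping) {
        this.checkIntervalMillis = checkIntervalMillis;
        this.started = started;
        this.typeStillExists = typeStillExists;
        this.ping = ping;
    }

    /** Delay queue manager: a daemon thread that hands expired tasks to the thread pool. */
    void startDelayQueueManager() {
        Thread manager = new Thread(() -> {
            try {
                while (true) {
                    RefreshTaskSketch task = delayQueue.take(); // blocks until a delay has expired
                    pool.execute(() -> runOnce(task.datasourceType));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "delay-queue-manager");
        manager.setDaemon(true);
        manager.start();
    }

    /** Body of one state refreshing thread: ping, then decide whether to resubmit the task. */
    void runOnce(String datasourceType) {
        if (!started.getAsBoolean()) {
            return; // paused: no ping and no resubmission
        }
        ping.accept(datasourceType);
        if (typeStillExists.test(datasourceType) && started.getAsBoolean()) {
            delayQueue.put(new RefreshTaskSketch(datasourceType, checkIntervalMillis));
        }
    }
}
```

Resubmitting the task only after the ping has finished means the interval acts as a sleep time between two refreshes of the same type, which matches the description of CHECK_INTERVAL.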
4) Broadcast module
The broadcast module is a daemon thread that periodically checks the state of the "start-stop state machine" at the preset interval CHECK_INTERVAL. If the state is "start", it reads the connection states of all existing data sources from the data source state warehouse through the reader and sends them to the data source state listener, so that the data source state listener can periodically send the data source connection states to all connected data source state monitoring clients through the broadcaster. If the state of the "start-stop state machine" is "stop", nothing is done.
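A periodic broadcast of this kind could be sketched with a single-threaded scheduled executor, as below. The sketch rests on assumptions: the class name BroadcastModuleSketch, the three callbacks (state machine flag, reader, broadcaster) and the representation of connection states as a map from data source ID to a state string are hypothetical and not taken from the publication.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;
import java.util.function.Supplier;

class BroadcastModuleSketch {
    private final BooleanSupplier started;                    // state of the start-stop state machine
    private final Supplier<Map<String, String>> reader;       // reads states from the warehouse
    private final Consumer<Map<String, String>> broadcaster;  // hands states to the listener's broadcaster

    BroadcastModuleSketch(BooleanSupplier started,
                          Supplier<Map<String, String>> reader,
                          Consumer<Map<String, String>> broadcaster) {
        this.started = started;
        this.reader = reader;
        this.broadcaster = broadcaster;
    }

    /** One broadcast round; returns false when the state machine is stopped and nothing is sent. */
    boolean broadcastOnce() {
        if (!started.getAsBoolean()) {
            return false;
        }
        broadcaster.accept(reader.get());
        return true;
    }

    /** Runs broadcastOnce on a daemon thread every checkIntervalSeconds. */
    ScheduledExecutorService schedule(long checkIntervalSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "status-broadcast");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleWithFixedDelay(this::broadcastOnce,
                checkIntervalSeconds, checkIntervalSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

Since the broadcast only reads what is already stored in the warehouse, it never waits for a connection test to finish, which is the decoupling of testing and broadcasting that the description emphasizes.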
5) Reading device
The reader is used for reading the data source connection state from the data source state warehouse, and serves as a transmission bridge between the data source state warehouse and the broadcast module, and between the data source state warehouse and the broadcaster in the data source state listener. When the broadcast function is executed, the broadcast module reads the data source connection state from the data source state warehouse through the reader and sends it to the broadcaster in the data source state listener. When a refresh request of a data source state monitoring client is monitored, the repeater forwards the refresh request to the data source state manager, and the reader reads the data source connection state from the data source state warehouse and returns it to the repeater.
Further, the reader is also configured to provide an interface for reading the connection status of the data source from the data source status repository for external access. As shown in fig. 1, the reader may be connected to an external APP device, and an external application module may obtain a data source connection state required by the external application module through the reader.
6) Task manager
The task manager is in communication connection with the plurality of data source management clients. When the data source adding operation is executed in the data source management client, the task manager adds a state refreshing task which is responsible for refreshing the connection state of the data source to the delay queue; and when the data source deleting operation is executed in the data source management client, the task manager removes the state refreshing task which is responsible for refreshing the connection state of the data source from the delay queue.
Specifically, referring to fig. 1, the task manager is communicatively connected to a data source management server, which in turn is communicatively connected to the data source management clients. When a data source management client adds or deletes a data source, the data source management server sends the type of that data source to the task manager, and the task manager obtains from the data source management server the metadata list of all data sources of that type. At this point:
a. If the obtained list is not empty and no state refreshing task for that type of data source exists in the delay queue, the task manager adds a state refreshing task responsible for refreshing the connection state of that type of data source to the delay queue.
b. If the obtained list is empty and a state refreshing task for that type of data source exists in the delay queue, the task manager removes that state refreshing task from the delay queue.
The above mechanism ensures that the connection state of a newly added data source type is maintained and the connection state of an emptied data source type is no longer refreshed. As a result, when data sources are dynamically added or removed, no refresh work is missed and no resources are wasted on unnecessary work.
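As a non-limiting illustration, the two rules applied by the task manager can be expressed as a single function. In this Python sketch the function name `sync_refresh_task` is hypothetical, and the delay queue is represented only by the set of data source types that currently have a queued state refreshing task.

```python
def sync_refresh_task(queued_types, source_type, metadata_list):
    """Apply rules a and b after a data source was added or deleted.

    queued_types: set of source types that have a task in the delay queue
                  (a stand-in for the real delay queue).
    metadata_list: metadata of all data sources of source_type, as returned
                   by the data source management server.
    Returns "added", "removed" or "unchanged".
    """
    has_sources = len(metadata_list) > 0
    has_task = source_type in queued_types
    if has_sources and not has_task:
        queued_types.add(source_type)       # rule a: new type, start refreshing
        return "added"
    if not has_sources and has_task:
        queued_types.discard(source_type)   # rule b: emptied type, stop refreshing
        return "removed"
    return "unchanged"                      # task already matches the metadata
```

Because the decision depends only on whether the list is empty and whether a task exists, adding a second data source of an already monitored type leaves the queue unchanged.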
7) Parameter updater
The parameter updater is mainly used for updating the refresh interval. When the refresh interval needs to be adjusted in the system, the parameter updater updates the original CHECK_INTERVAL value, so that the refresh module periodically executes the state refresh threads according to the new refresh interval.
By providing the data source state manager and the data source state monitor, the device provided by the embodiment of the invention adopts a classification monitoring mechanism and a delay queue mechanism. The connection states of a large number of data sources of various types are stored in the warehouse by data source type; a state refreshing task responsible for refreshing the connection state of each type of data source is created and added to the delay queue; each task periodically refreshes and stores the connection state of one type of data source; and the stored connection states are fed back when a refresh request from a monitoring client is detected. Connection tests therefore need not be performed on all data sources without interruption, which avoids unnecessary waste of system resources, while the connection states can still be fed back promptly when a client needs them, thereby balancing real-time performance against resource consumption.
In addition, the connection test work and the connection state broadcast work are decoupled: broadcasting simply obtains the result of the most recent connection test for each type of data source from the warehouse and sends it out. This better ensures the balance between real-time performance and resource consumption, and avoids both the insufficient real-time performance of linear polling and the excessive resource consumption of a fully concurrent approach.
Embodiment 2:
On the basis of embodiment 1, an embodiment of the present invention further provides a data source connection state monitoring method, which is carried out using the data source connection state monitoring device described in embodiment 1. As shown in fig. 2, the method mainly includes the following steps:
Step 10: the data source state manager creates a state refreshing task for each type of data source in the ETL system and puts the tasks into a delay queue to await execution.
With reference to the device shown in fig. 1, the data source state listener monitors the connection requests of the data source state monitoring clients in real time. Each time a connection request from a data source state monitoring client is detected, the counter is incremented by 1 and a "start" instruction is sent to the start-stop device in the data source state manager. Each time a data source state monitoring client is detected to have disconnected, the counter is decremented by 1; when the counter equals 0, that is, when all data source state monitoring clients have disconnected, a "pause" instruction is sent to the start-stop device.
When the start-stop device receives a "start" instruction, it checks the state of the "start-stop state machine". If the state is "start", nothing is done. If the state is "stop", which indicates that the start-stop device is being started for the first time, the start-stop device creates a state refreshing task for each type of data source existing in the ETL system, that is, for each state memory in the data source state warehouse whose hasDataSource value is true, and puts the tasks into the delay queue to await execution; at the same time, it sets the state of the "start-stop state machine" to "start". For the relevant code, refer to embodiment 1; it is not repeated here.
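As a non-limiting illustration, the interplay between the client counter and the "start-stop state machine" could be sketched in Python as follows. The class names `StartStopDevice` and `ConnectionCounter` and their method names are hypothetical; the delay queue is represented by a set of data source types, and the hasDataSource flags are assumed to be available as a dictionary.

```python
import threading


class StartStopDevice:
    """Keeps the "start-stop state machine" flag and seeds the delay queue."""

    def __init__(self, has_data_source, queued_types):
        self.state = "stop"
        self._has_data_source = has_data_source  # source type -> hasDataSource flag
        self._queued_types = queued_types        # set standing in for the delay queue

    def on_start(self):
        if self.state == "start":
            return  # already started: nothing is done
        # Create one state refreshing task per type whose hasDataSource is true.
        for source_type, flag in self._has_data_source.items():
            if flag:
                self._queued_types.add(source_type)
        self.state = "start"

    def on_pause(self):
        self.state = "stop"


class ConnectionCounter:
    """Counts connected monitoring clients and drives the start-stop device."""

    def __init__(self, device):
        self._lock = threading.Lock()
        self._count = 0
        self._device = device

    def client_connected(self):
        with self._lock:
            self._count += 1
            self._device.on_start()      # "start" instruction on every connection

    def client_disconnected(self):
        with self._lock:
            self._count = max(0, self._count - 1)
            if self._count == 0:
                self._device.on_pause()  # "pause" once all clients are gone
```

Sending "start" on every connection is harmless in this sketch because `on_start` returns immediately when the state is already "start".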
Step 20: each state refreshing task in the delay queue periodically refreshes the connection state of its data sources at a preset interval and updates the state in the corresponding state memory of the data source state warehouse.
With reference to the device shown in fig. 1, the specific process is as follows. The delay queue manager takes a state refreshing task whose delay has expired out of the delay queue according to the preset interval CHECK_INTERVAL, wraps it into a state refresh thread, and puts the thread into the thread pool. The thread pool then allocates resources for the state refresh thread to run, so as to refresh the connection state of the data sources in the corresponding state memory. While running, each state refresh thread decides whether to resubmit its state refreshing task to the delay queue: if data sources of that type still exist in the ETL system and the start-stop device is in the "start" state, the task is resubmitted to the delay queue after execution finishes; otherwise, it is not resubmitted. The state refresh thread refreshes the connection state of the data sources of its type by executing a ping operation, that is, by running a connection test on those data sources; if no error occurs, the data source is considered normally connected.
The preset interval CHECK_INTERVAL can be set according to actual requirements when the data source state manager is initialized, and generally takes a value in the range of 1 second to 8 hours. For example, it can be set to 15 seconds, in which case each state refresh thread performs a state refresh every 15 seconds; that is, after finishing execution, each state refresh thread sleeps for 15 seconds before running again, rather than executing without interruption. In addition, when the refresh interval needs to be changed during monitoring, the parameter updater can update the original CHECK_INTERVAL value directly, and each subsequent state refresh thread then performs state refreshes according to the new refresh interval.
For the relevant code, refer to embodiment 1; it is not repeated here.
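As a non-limiting illustration, one pass of the delay queue manager could be sketched in Python as follows. This is not the code from the original figures: the `DelayQueue` class, `safe_ping` and `run_expired_tasks` are hypothetical names, the delay queue is approximated with a heap ordered by due time, the current time is passed in explicitly, and each data source type is simplified to a single boolean connection state.

```python
import heapq
import itertools
from concurrent.futures import ThreadPoolExecutor


class DelayQueue:
    """Minimal delay queue: tasks become available once their due time is reached."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker so tasks are never compared

    def put(self, task, due_time):
        heapq.heappush(self._heap, (due_time, next(self._seq), task))

    def pop_expired(self, now):
        expired = []
        while self._heap and self._heap[0][0] <= now:
            expired.append(heapq.heappop(self._heap)[2])
        return expired

    def __len__(self):
        return len(self._heap)


def safe_ping(ping, source_type):
    """Connection test: no error means the data source type is connected."""
    try:
        ping(source_type)
        return True
    except Exception:
        return False


def run_expired_tasks(queue, pool, now, check_interval, ping,
                      warehouse, has_sources, is_started):
    """Run every expired task in the thread pool and resubmit it if still needed."""
    expired = queue.pop_expired(now)
    futures = {t: pool.submit(safe_ping, ping, t) for t in expired}
    for source_type, future in futures.items():
        warehouse[source_type] = future.result()  # update the state memory
        if has_sources(source_type) and is_started():
            queue.put(source_type, now + check_interval)  # run again after the interval
    return expired
```

Because `check_interval` is read on every pass, a value changed by a parameter updater would apply to all tasks resubmitted afterwards. A caller might create the pool with `ThreadPoolExecutor(max_workers=4)` and invoke `run_expired_tasks` from a timer loop.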
Step 30: when the data source state monitor detects a refresh request from a data source state monitoring client, it forwards the refresh request to the data source state manager and forwards the connection states of the various data sources returned by the data source state warehouse to the data source state monitoring client that sent the refresh request.
With reference to the device shown in fig. 1, when the data source state listener detects a refresh request from any data source state monitoring client, it forwards the refresh request to the data source state manager through the repeater. The reader then reads the connection states of the various data sources from the data source state warehouse and returns them to the repeater, and the repeater forwards them to the data source state monitoring client that sent the refresh request, so that the client's needs can be met promptly. For the relevant code, refer to embodiment 1; it is not repeated here.
Further, in addition to feeding back data source connection states on request, the connection states of the various types of data sources may be periodically broadcast to every data source state monitoring client. In that case the method further includes:
The broadcast module in the data source state manager obtains the connection states of the various types of data sources from the data source state warehouse at a preset period and sends them to the data source state monitor, and the data source state monitor periodically sends them to each data source state monitoring client. More specifically, the broadcast module periodically checks the state of the "start-stop state machine" at the preset interval CHECK_INTERVAL. If the state is "start", it reads the connection states of all existing data sources from the data source state warehouse through the reader and sends them to the data source state listener, so that the data source state listener periodically sends them to all connected data source state monitoring clients through the broadcaster; if the state of the "start-stop state machine" is "stop", it does nothing.
Further, to ensure that the connection states of data sources of a newly added type in the ETL system are maintained and that the connection states of an emptied data source type are no longer refreshed, so that no refresh work is missed and no resources are wasted on unnecessary work when data sources are dynamically added or removed, the method may further include:
When a data source management client adds or deletes a data source, the data source management server sends the type of that data source to the task manager; the task manager obtains the data source metadata list for that type from the data source management server and determines whether the list is empty; when the list is not empty and no state refreshing task for that type exists in the delay queue, the task manager adds a state refreshing task responsible for refreshing the connection state of that type of data source to the delay queue; and when the list is empty and a state refreshing task for that type exists in the delay queue, the task manager removes that task from the delay queue.
In summary, the method provided by the embodiment of the invention has the following beneficial effects:
A classification monitoring mechanism is adopted: one state refreshing task is created per type of data source, so a thread need not be started for every individual data source to run connection tests, which reduces system pressure and resource waste;
A delay queue mechanism is adopted: each state refreshing task refreshes the state periodically, that is, after finishing a run each task sleeps for a set time before running again, so connection tests need not be performed without interruption, which avoids unnecessary waste of system resources;
A socketIO mechanism is adopted: the stored data source connection states are fed back only when a refresh request from a monitoring client is detected, that is, only when the connection states actually need to be accessed, which avoids unnecessary performance consumption;
The connection test work and the connection state broadcast work are decoupled, balancing real-time performance against resource consumption; this avoids both the insufficient real-time performance of linear polling and the excessive resource consumption of a fully concurrent approach;
The refresh interval can be dynamically adjusted according to actual needs to better meet user requirements.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A data source connection state monitoring device is characterized by comprising a data source state monitor and a data source state manager, wherein the data source state monitor is used for monitoring a connection request and a refresh request of a data source state monitoring client and scheduling the work of the data source state manager according to the request;
the data source state manager comprises a data source state warehouse, a start-stop device and a refreshing module;
the data source state warehouse is provided with a plurality of state memories, and each state memory is used for storing the connection state of one type of data source;
when started for the first time, the start-stop device creates, for each type of data source in the ETL system, a state refreshing task responsible for refreshing the connection state of that type of data source, and puts the state refreshing task into a delay queue of the refreshing module;
the refreshing module controls each state refreshing task in the delay queue to periodically refresh the connection state of the data source according to a preset interval and update the connection state in the corresponding state memory;
when the data source state monitor monitors a refresh request of the data source state monitoring client, the data source state monitor forwards the refresh request to the data source state manager, and forwards a data source connection state returned by the data source state warehouse to the data source state monitoring client sending the refresh request.
2. The data source connection status monitoring apparatus according to claim 1, wherein the refresh module further comprises a delay queue manager and a thread pool;
the delay queue manager takes out the state refreshing task with the expired delay from the delay queue according to a preset interval, packages the state refreshing task into a state refreshing thread and puts the state refreshing thread into the thread pool;
the thread pool is used for allocating resources for the state refreshing thread to run so as to refresh the connection state of the data source in the corresponding state memory;
if the ETL system still has the data source and the start-stop device is in the starting state, the corresponding state refreshing task is resubmitted to the delay queue after the state refreshing thread is finished running.
3. The data source connection status monitoring device according to claim 1, wherein a counter is provided in the data source status listener, and the counter is incremented by 1 each time a connection request from the data source status monitoring client is monitored; every time when the disconnection of the data source state monitoring client is monitored, the counter is decreased by 1;
when the counter is larger than 0, the data source state listener sends a starting instruction to the data source state manager; when the counter equals 0, the data source status listener sends a "pause" instruction to the data source status manager.
4. The data source connection state monitoring device according to claim 3, wherein the start-stop device is configured to receive "start" and "pause" instructions sent by the data source state monitor, and maintain a "start-stop state machine" flag according to the instructions;
when the start-stop device receives a start instruction and the start-stop state machine is in a stop state, the start-stop device creates a state refreshing task for each type of data source in the ETL system, puts the state refreshing task into the delay queue for waiting execution, and sets the state of the start-stop state machine as start;
and when the start-stop device receives a pause instruction, setting the state of the start-stop state machine to stop, and enabling the refreshing module to be in a standby state at the moment.
5. The data source connection state monitoring device according to claim 1, wherein the data source state manager further includes a broadcast module, and the broadcast module is configured to periodically obtain connection states of various data sources from the data source state repository at preset intervals, and send the connection states to the data source state listener, so that the data source state listener periodically sends the connection states of the data sources to the data source state monitoring clients.
6. The data source connection status monitoring device according to claim 1, wherein the data source status manager further comprises a task manager, and the task manager is in communication connection with the data source management client;
when the data source adding operation is executed in the data source management client, the task manager is used for adding a state refreshing task which is responsible for refreshing the connection state of the data source to the delay queue;
and when the data source management client executes the operation of deleting the data source, the task manager is used for removing the state refreshing task which is responsible for refreshing the connection state of the data source from the delay queue.
7. A data source connection status monitoring method, wherein the data source connection status monitoring apparatus of any one of claims 1 to 6 is adopted, and the method comprises:
the data source state manager creates a state refreshing task for refreshing the connection state of each type of data source in the ETL system, and puts the state refreshing task into a delay queue for waiting execution;
each state refreshing task in the delay queue periodically refreshes the connection state of the data source according to a preset interval and completes state updating in a corresponding state memory of a data source state warehouse;
when the data source state monitor monitors the refresh request of the data source state monitoring client, the refresh request is forwarded to the data source state manager, and the connection state of various data sources returned by the data source state warehouse is forwarded to the data source state monitoring client sending the refresh request.
8. The data source connection status monitoring method according to claim 7, wherein the method further comprises:
and a broadcasting module in the data source state manager acquires the connection state of each type of data source from the data source state warehouse according to a preset period, sends the connection state to the data source state monitor, and periodically sends the data source connection state to each data source state monitoring client by the data source state monitor.
9. The data source connection status monitoring method according to claim 7, wherein the method further comprises:
when the data source management client side executes the operation of adding or deleting the data source, the data source management server sends the data source type of the data source to the task manager;
the task manager acquires a data source metadata list of the type of the data source from the data source management server and judges whether the acquired list is empty;
when the obtained list is not empty and a state refreshing task corresponding to the data source does not exist in the delay queue, the task manager adds a state refreshing task in charge of refreshing the connection state of the data source to the delay queue;
and when the acquired list is empty and the state refreshing task corresponding to the data source exists in the delay queue, the task manager removes the state refreshing task which is responsible for refreshing the connection state of the data source from the delay queue.
10. The method for monitoring the connection state of the data source according to claim 7, wherein each state refreshing task in the delay queue periodically refreshes the connection state of the data source according to a preset interval, and completes state updating in a corresponding state memory of a data source state repository, specifically:
the delay queue manager takes out the state refreshing task with the expired delay from the delay queue according to a preset interval, packages the state refreshing task into a state refreshing thread and puts the state refreshing thread into a thread pool;
the thread pool allocates resources for the state refresh thread to run so as to refresh the connection state of the data source in the corresponding state memory;
and when each state refreshing thread runs, judging whether to resubmit the state refreshing task of the data source to the delay queue: if the data source still exists in the ETL system and the start-stop device is in the starting state, the corresponding state refreshing task is resubmitted to the delay queue after the execution is finished.
CN202011576061.4A 2020-12-28 2020-12-28 Data source connection state monitoring device and method Active CN112667472B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011576061.4A CN112667472B (en) 2020-12-28 2020-12-28 Data source connection state monitoring device and method

Publications (2)

Publication Number Publication Date
CN112667472A true CN112667472A (en) 2021-04-16
CN112667472B CN112667472B (en) 2022-04-08

Family

ID=75410309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011576061.4A Active CN112667472B (en) 2020-12-28 2020-12-28 Data source connection state monitoring device and method

Country Status (1)

Country Link
CN (1) CN112667472B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7200666B1 (en) * 2000-07-07 2007-04-03 International Business Machines Corporation Live connection enhancement for data source interface
US20110208822A1 (en) * 2010-02-22 2011-08-25 Yogesh Chunilal Rathod Method and system for customized, contextual, dynamic and unified communication, zero click advertisement and prospective customers search engine
CN103036736A (en) * 2012-11-30 2013-04-10 航天恒星科技有限公司 Configuration equipment monitoring system and monitoring method based on data sources
CN106202346A (en) * 2016-06-29 2016-12-07 浙江理工大学 A kind of data load and clean engine, dispatch and storage system
CN111061715A (en) * 2019-12-16 2020-04-24 北京邮电大学 Web and Kafka-based distributed data integration system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
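吴春成-ZJU: "Overall architecture of the Dianping ETL data transfer platform" (大众点评ETL数据传输平台整体架构), 《HTTPS://BLOG.CSDN.NET/LVZHUYIYI/ARTICLE/DETAILS/51842923》 *
朱国强 et al.: "Data extraction based on publish/subscribe technology" (基于发布/订阅技术的数据抽取), 《微计算机信息》 (Microcomputer Information) *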

Also Published As

Publication number Publication date
CN112667472B (en) 2022-04-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant