Background
With the development of big data technology, the volume of data that enterprises can store and process has reached the terabyte (TB) or even petabyte (PB) scale. At present, enterprises mainly run offline analysis workloads on this mass of data, and such workloads generally need a T+1 or longer cycle from data generation to result. For many industries with strict real-time requirements, this cannot meet business needs. How to process data faster and return results closer to real time is a problem the big data field urgently needs to solve.
The emergence of real-time processing engines makes it possible for enterprises to process big data in real time. With a real-time processing engine, an enterprise can perform ETL, real-time report analysis, and even real-time machine learning. The mainstream distributed real-time processing engines currently on the market include Apache Flink and Spark Streaming, among others; users implement their real-time business requirements through the API interfaces these engines provide.
Real-time processing differs greatly from traditional batch processing. Most importantly, a real-time processing job handles a continuous stream of data, that is, data without boundaries, and customers usually require the job to run uninterrupted, 7 × 24. However, a distributed system can stop serving for many reasons, such as network or hardware faults. In that case the fault must be detected promptly, no data may be lost, and recovery must complete in the shortest possible time. Mainstream real-time processing engines such as Apache Flink and Spark Streaming all provide mechanisms that ensure data reliability, but none provides a complete automatic fault recovery service.
Summary
One purpose of this application is to provide an automatic fault recovery technique for a real-time processing engine.
To achieve this purpose, this application provides a fault recovery method for a real-time processing engine, the method including:
becoming the master server when the synchronization lock is acquired;
when executing a real-time processing application, recording application information about the currently executing real-time processing application, so that a standby server, on becoming the master server, continues executing the corresponding real-time processing application by obtaining the application information;
when a fault occurs, releasing the synchronization lock, to trigger the standby server to apply for the synchronization lock.
Further, the method also includes:
becoming a standby server when the synchronization lock is not acquired;
applying for the synchronization lock when the master server releases it;
becoming the new master server when the synchronization lock released by the master server is obtained through that application;
obtaining the application information, and continuing to execute the corresponding real-time processing application according to the application information.
Further, in the real-time processing application, the processing operations of the real-time processing application are defined by SQL statements.
Further, the processing operations of the real-time processing application include:
an operation that creates the real-time processing application;
an operation that performs real-time processing on data.
Further, the method also includes:
when a creation request for a real-time processing application is obtained, persisting the real-time processing application to a database through the metadata store.
Further, executing the real-time processing application includes:
obtaining the SQL statements of the real-time processing application;
obtaining, from the SQL statements, the operators of the operation that performs real-time processing on data;
submitting the operators to a computing cluster, which executes them to carry out the real-time processing of the data.
Further, obtaining the SQL statements of the real-time processing application includes:
obtaining the real-time processing application from the database through the metadata store;
obtaining, from the real-time processing application, the SQL statements of the operation that performs real-time processing on data.
Further, recording the application information about the currently executing real-time processing application includes:
creating a record node in a coordination service system, and writing the application information about the currently executing real-time processing application into the record node.
Further, the method also includes:
when execution of the real-time processing application stops, deleting the application information and the record node that correspond to the real-time processing application in the coordination service system.
Further, obtaining the application information includes:
reading the application information about the currently executing real-time processing application from the record node of the coordination service system.
According to another aspect of this application, a fault recovery server for a real-time processing engine is also provided, the server including:
a switching device, configured to make this server the master server when the synchronization lock is acquired, and to release the synchronization lock when a fault occurs, so as to trigger a standby server to apply for the synchronization lock;
a real-time processing device, configured to record, when executing a real-time processing application, application information about the currently executing real-time processing application, so that a standby server, on becoming the master server, continues executing the corresponding real-time processing application by obtaining the application information.
Further, the switching device is also configured to make this server a standby server when the synchronization lock is not acquired; to apply for the synchronization lock when the master server releases it; and to make this server the new master server when the synchronization lock released by the master server is obtained through that application;
the real-time processing device is also configured to obtain the application information and to continue executing the corresponding real-time processing application according to the application information.
Further, in the real-time processing application, the processing operations of the real-time processing application are defined by SQL statements.
Further, the processing operations of the real-time processing application include:
an operation that creates the real-time processing application;
an operation that performs real-time processing on data.
Further, the real-time processing device is also configured to persist the real-time processing application to a database through the metadata store when a creation request for the real-time processing application is obtained.
Further, the real-time processing device is configured to obtain the SQL statements of the real-time processing application; to obtain, from the SQL statements, the operators of the operation that performs real-time processing on data; and to submit the operators to a computing cluster, which executes them to carry out the real-time processing of the data.
Further, the real-time processing device is configured to obtain the real-time processing application from the database through the metadata store, and to obtain, from the real-time processing application, the SQL statements of the operation that performs real-time processing on data.
Further, the real-time processing device is configured to create a record node in a coordination service system and to write the application information about the currently executing real-time processing application into the record node.
Further, the real-time processing device is also configured to delete, when execution of the real-time processing application stops, the application information and the record node that correspond to the real-time processing application in the coordination service system.
Further, the real-time processing device is configured to read the application information about the currently executing real-time processing application from the record node of the coordination service system.
Compared with the prior art, in the solution provided by this application, any server that acquires the synchronization lock at startup becomes the master server and provides service externally. If a fault occurs while the master server is providing service, the master server releases the synchronization lock, which triggers the standby server to apply for it, so that the standby server can acquire the synchronization lock and become the new master server. In addition, when executing a real-time processing application, the master server records application information about the currently executing real-time processing application, so that the standby server, on becoming the master server, continues executing the corresponding real-time processing application by obtaining the application information, thereby achieving automatic fault recovery.
Detailed description
This application is described in further detail below with reference to the accompanying drawings.
In a typical configuration of this application, the terminal and the devices of the service network each include one or more processors (CPU), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, in the form of random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
An embodiment of this application provides a fault recovery method for a real-time processing engine. The real-time processing engine to which the method applies contains multiple servers; when the real-time processing engine starts, the servers start at the same time and each applies for the synchronization lock (Lock).
Any server (Server) that acquires the synchronization lock becomes the master server (Active Server). The master server obtains the right to use the resources of the computing cluster and provides service externally; in other words, the data processing function of the real-time processing engine is carried out by the master server calling on the resources of the computing cluster. Correspondingly, a server that does not acquire the synchronization lock becomes a standby server (Standby Server); a standby server does not obtain the right to use the resources of the computing cluster and does not provide service externally.
In a practical scenario, the servers may apply to the coordination service system Zookeeper for the synchronization lock, and the computing cluster may be a Spark Cluster. A server that holds the right to use the computing cluster's resources can complete the corresponding real-time processing tasks through the executors (Executor) that the Spark Cluster provides, as shown in Fig. 1. When the real-time processing engine first starts, two servers start at the same time and each applies to Zookeeper for the single synchronization lock. The server that obtains the lock becomes the Active Server, starts providing service, and obtains the right to use the Spark Cluster's resources; the other server, which did not obtain the lock, becomes the Standby Server. Those skilled in the art will understand that the number of each kind of element shown in the figure may be smaller than in a practical scenario (for example, there may be more than two servers, and every server that does not obtain the synchronization lock becomes a Standby Server); this omission is made on the premise that it does not affect a clear and sufficient disclosure of the invention.
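The startup election described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: `InMemoryLockService` is a hypothetical in-process stand-in for the lock that the servers would request from Zookeeper, and the server names are invented for the example.

```python
import threading


class InMemoryLockService:
    """Hypothetical stand-in for the coordination service (e.g. Zookeeper)."""

    def __init__(self):
        self._guard = threading.Lock()
        self.holder = None  # id of the server that currently holds the lock

    def try_acquire(self, server_id):
        # Only one server can hold the synchronization lock at a time.
        with self._guard:
            if self.holder is None:
                self.holder = server_id
            return self.holder == server_id

    def release(self, server_id):
        with self._guard:
            if self.holder == server_id:
                self.holder = None


def elect(lock_service, server_ids):
    """Each server applies for the lock; the holder is Active, the rest Standby."""
    roles = {}
    for server_id in server_ids:
        acquired = lock_service.try_acquire(server_id)
        roles[server_id] = "Active" if acquired else "Standby"
    return roles


lock_service = InMemoryLockService()
roles = elect(lock_service, ["server-1", "server-2", "server-3"])
print(roles)
```

Whichever server reaches the lock first becomes the Active Server; with more than two servers, all the others end up as Standby Servers, as the paragraph above notes.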
When providing service externally, the master server uses the resources of the computing cluster to execute a real-time processing application (application) and so completes the real-time processing task on the data. A basic data processing task includes at least the following: defining a real-time data source, analyzing and processing the data from the data source, and writing the results to the specified storage. Fig. 2 is a data flow diagram of the real-time processing task in one embodiment of this application. The distributed publish-subscribe messaging system Kafka serves as the data source, real-time data is read from a partition (Partition) of the Kafka data source, and the analysis mainly consists of writing the error log entries (log entries that contain ERROR) into the result table of a database. Specifically, the master server starts a receiver (Receiver) to read data from Kafka, a filter (Filter) then removes the log entries that do not contain ERROR, and a Sink operation writes the filtered data into the result table of the database (Database).
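The Receiver, Filter, and Sink stages of this data flow can be sketched in plain Python. The sketch is only an illustration under simplifying assumptions: a list of strings stands in for a Kafka partition, another list stands in for the result table, and the function names are hypothetical.

```python
from typing import Iterable, Iterator, List


def receiver(partition: Iterable[str]) -> Iterator[str]:
    """Read log lines one by one from the (stand-in) data source partition."""
    for line in partition:
        yield line


def error_filter(lines: Iterable[str]) -> Iterator[str]:
    """Keep only the log lines that contain ERROR."""
    return (line for line in lines if "ERROR" in line)


def sink(lines: Iterable[str], result_table: List[str]) -> None:
    """Write the filtered lines into the (stand-in) result table."""
    result_table.extend(lines)


partition = ["INFO start", "ERROR disk full", "WARN slow", "ERROR timeout"]
result_table: List[str] = []
sink(error_filter(receiver(partition)), result_table)
print(result_table)
```

Because each stage is a generator, lines flow through one at a time, which mirrors the unbounded nature of the stream even though the example input is finite.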
In a practical scenario, when the master server fails, it releases the synchronization lock, which triggers the standby server to apply for it. The standby server applies for the synchronization lock when the master server releases it, and becomes the new master server once that application obtains the released lock. Automatic switchover between master and standby servers is achieved in this way.
To ensure that real-time processing tasks continue smoothly across a master/standby switchover, in the solution provided by the embodiment of this application the master server, when executing a real-time processing application, records application information about the currently executing real-time processing application, so that the standby server, on becoming the master server, continues executing the corresponding real-time processing application by obtaining the application information. That is, when the standby server becomes the new master server, it obtains the application information and continues executing the corresponding real-time processing application according to it.
In one embodiment of this application, the master server can use a coordination service system (such as Zookeeper) to record the application information about the currently executing real-time processing application. The processing is as follows: create a record node in the coordination service system, then write the application information about the currently executing real-time processing application into the record node. Fig. 3 is a schematic diagram of tracking the state of a real-time processing application with Zookeeper. After the real-time processing application starts, the master server tracks its state and records the application information of the currently running real-time processing application in memory; it then creates a record node [/Running/app] in Zookeeper to record the application information.
Further, when execution of the real-time processing application stops, the master server deletes the application information and the record node that correspond to the real-time processing application in the coordination service system. In the example of Fig. 3, when the real-time processing application stops executing, the master server deletes the application information about that application from memory and deletes the corresponding record node [/Running/app] from Zookeeper.
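The create-and-delete life cycle of the record node can be sketched as below. This is a hedged illustration only: `InMemoryCoordinationService` is a hypothetical dictionary-backed stand-in for Zookeeper, the JSON encoding of the application information is an assumption, and building the node path as `/Running/` plus the application name is inferred from the [/Running/app] example rather than stated by the source.

```python
import json


class InMemoryCoordinationService:
    """Hypothetical stand-in for Zookeeper: maps node paths to byte payloads."""

    def __init__(self):
        self.nodes = {}

    def create(self, path, data):
        if path in self.nodes:
            raise KeyError("node already exists: " + path)
        self.nodes[path] = data

    def get(self, path):
        return self.nodes[path]

    def delete(self, path):
        del self.nodes[path]

    def exists(self, path):
        return path in self.nodes


def record_path(app_name):
    # Assumed layout: one record node per running application.
    return "/Running/" + app_name


class ApplicationTracker:
    """Tracks running applications in memory and in the coordination service."""

    def __init__(self, coord):
        self.coord = coord
        self.running = {}  # in-memory copy of the application information

    def on_started(self, app_info):
        self.running[app_info["name"]] = app_info
        payload = json.dumps(app_info).encode("utf-8")
        self.coord.create(record_path(app_info["name"]), payload)

    def on_stopped(self, app_name):
        self.running.pop(app_name, None)
        self.coord.delete(record_path(app_name))


coord = InMemoryCoordinationService()
tracker = ApplicationTracker(coord)
tracker.on_started({"name": "app", "command": "insert into result select * from errorlogs"})
saved = json.loads(coord.get("/Running/app"))
tracker.on_stopped("app")
```

Keeping the in-memory copy and the record node in step means that the record node exists exactly while the application is running, which is what a new master server relies on after a switchover.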
Correspondingly, when the standby server becomes the new master server, it reads the application information about the currently executing real-time processing application from the record node of the coordination service system. Because the master server records the application information of the currently executing real-time processing application, and the standby server reads that information when it becomes the new master server and continues executing the application, a complete automatic fault recovery mechanism is achieved. Fig. 4 shows the principle by which the solution of the embodiment of this application recovers from a fault. When the master server (Active Server) fails, the standby server (Standby Server) obtains the synchronization lock from the coordination service system Zookeeper and thereby becomes the new Active Server. It then obtains the right to use the resources of the computing cluster Spark Cluster and reads from Zookeeper the application information in the record node [/Running/app], that is, the information about the real-time processing application that was still running when the original master server failed. The new Active Server then resubmits the real-time processing application to the Spark Cluster for execution, to finish the real-time processing task that was left incomplete, achieving automatic fault recovery.
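The recovery sequence of Fig. 4 can be walked through end to end with a small simulation. Everything here is a simplifying assumption for illustration: `Coordination` is a hypothetical in-process object that combines the synchronization lock and the record node, the computing cluster is a plain list of submissions, and the release of the lock notifies waiting servers through a direct method call.

```python
import json

RECORD_NODE = "/Running/app"


class Coordination:
    """Hypothetical stand-in for Zookeeper: one lock plus record nodes."""

    def __init__(self):
        self.lock_holder = None
        self.nodes = {}
        self.waiters = []  # standby servers to notify when the lock is released

    def acquire(self, server):
        if self.lock_holder is None:
            self.lock_holder = server
            return True
        if server not in self.waiters:
            self.waiters.append(server)
        return False

    def release(self, server):
        if self.lock_holder is not server:
            return
        self.lock_holder = None
        # Releasing the lock triggers the standby servers to apply for it.
        for waiter in list(self.waiters):
            if waiter.on_lock_released():
                self.waiters.remove(waiter)
                break


class Server:
    def __init__(self, name, coord, cluster):
        self.name, self.coord, self.cluster = name, coord, cluster
        self.role = "Standby"

    def start(self):
        self.role = "Active" if self.coord.acquire(self) else "Standby"

    def run_app(self, app_info):
        # Record the application information, then submit to the cluster.
        self.coord.nodes[RECORD_NODE] = json.dumps(app_info)
        self.cluster.append((self.name, app_info["name"]))

    def fail(self):
        self.role = "Failed"
        self.coord.release(self)

    def on_lock_released(self):
        if not self.coord.acquire(self):
            return False
        self.role = "Active"
        record = self.coord.nodes.get(RECORD_NODE)
        if record is not None:
            # Resubmit the application the failed master was still running.
            app_info = json.loads(record)
            self.cluster.append((self.name, app_info["name"]))
        return True


cluster = []
coord = Coordination()
a, b = Server("A", coord, cluster), Server("B", coord, cluster)
a.start()
b.start()
a.run_app({"name": "app", "command": "insert into result select * from errorlogs"})
a.fail()
print(b.role, cluster)
```

After `a.fail()`, server B holds the lock, has read the record node, and has resubmitted the application, so the cluster list shows one submission from each server.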
Further, in the embodiment of this application, the real-time processing application that carries out the data processing defines its processing operations with SQL statements. Specifically, these processing operations may include an operation that creates the real-time processing application, an operation that performs real-time processing on data, and so on. By contrast, prior-art real-time processing engines such as Apache Flink and Spark Streaming require these operations to be defined by writing application code in Java, Scala, or a similar language. To define the processing operations of a real-time processing application, the user has to build a programming environment from scratch, obtain the SDKs it depends on, package the application, and deploy it to the cluster for testing and use, and also has to be familiar with various APIs and with the logic of distributed systems, which is complicated and inefficient. Defining the processing operations with SQL statements requires no programming environment and no SDK dependency, simplifies configuration and modification, and makes management convenient.
In one embodiment of this application, the processing operations of the real-time processing application can be defined with the following SQL statements. For example, the operation that creates the real-time processing application may be [create application app [properties ("parallelism"="2")]]. The operations that perform real-time processing on data may include: an operation that defines the real-time data source on Kafka, [create stream source (id int, name string, message string) streamproperties ("source"="kafka", "kafka.zookeeper"="broker:2181", "topic"="source")]; a filter operation on the real-time data, [create stream errorLogs as select * from source where message like "%ERROR%"]; and an operation that writes the filtered data into the result table, [insert into result select * from errorlogs]; and so on.
Because the processing operations of a real-time processing application are defined by SQL statements, in the method provided by the embodiment of this application a real-time processing application is created as follows: when the server obtains a creation request for a real-time processing application, it persists the real-time processing application to a database through the metadata store.
The creation request may come from a client device: the user creates a real-time processing application with a specific function through the client device in order to complete the corresponding real-time processing task. When the server (Server) receives the creation request, it saves the relevant information of the real-time processing application to the database through the metadata store (MetaStore). The database may be one that uses the SQL language, such as MySQL. Fig. 5 shows the flow for creating a real-time processing application in one embodiment of this application. When a creation request is received, the creation request (create request) is sent to the MetaStore, and the MetaStore then sends the corresponding write request (write request) to MySQL, so that the application is persisted.
Specifically, the relevant information of a real-time processing application may include the following fields: the identifier of the real-time processing application (ID), its name (Name), its creation time (CreateTime), its most recent modification time (LastModifyTime), the SQL statements of the task it executes (Command), and so on. Its table structure in MySQL is shown in Table 1:
Field          | Type         | Primary key
ID             | bigint(20)   | Yes
Name           | varchar(128) |
CreateTime     | int(11)      |
LastModifyTime | int(11)      |
Command        | mediumtext   |
Table 1
In addition, when creating a real-time processing application, the user can also specify some configuration that applies when the application executes. Table 2 shows a table structure in MySQL for the configuration of a real-time processing application, which may include the following fields: the identifier of the real-time processing application (ID, stored as APP_ID), the parameter key (PARAM_KEY), the parameter value (PARAM_VALUE), and so on.
Field       | Type          | Primary key
APP_ID      | bigint(20)    | Yes
PARAM_KEY   | varchar(128)  | Yes
PARAM_VALUE | varchar(4000) |
Table 2
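The two table structures can be tried out with a short script. This is a sketch under stated assumptions: Python's built-in `sqlite3` with an in-memory database stands in for MySQL, the table names `application` and `application_param` and the helper functions are hypothetical (the source gives only the columns), and the column types are adapted to what SQLite accepts.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE application (          -- columns follow Table 1
        ID             INTEGER PRIMARY KEY,
        Name           VARCHAR(128),
        CreateTime     INT,
        LastModifyTime INT,
        Command        TEXT
    );
    CREATE TABLE application_param (    -- columns follow Table 2
        APP_ID      BIGINT,
        PARAM_KEY   VARCHAR(128),
        PARAM_VALUE VARCHAR(4000),
        PRIMARY KEY (APP_ID, PARAM_KEY)
    );
    """
)


def create_application(conn, name, command, params):
    """Persist an application and its configuration; return the new ID."""
    now = int(time.time())
    cursor = conn.execute(
        "INSERT INTO application (Name, CreateTime, LastModifyTime, Command) "
        "VALUES (?, ?, ?, ?)",
        (name, now, now, command),
    )
    app_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO application_param VALUES (?, ?, ?)",
        [(app_id, key, value) for key, value in params.items()],
    )
    conn.commit()
    return app_id


def load_command(conn, name):
    """Return the stored SQL statement of an application, or None."""
    row = conn.execute(
        "SELECT Command FROM application WHERE Name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


app_id = create_application(
    conn, "app", "insert into result select * from errorlogs", {"parallelism": "2"}
)
print(app_id, load_command(conn, "app"))
```

The composite primary key on (APP_ID, PARAM_KEY) reflects the two "Yes" entries in Table 2: each application can hold one value per parameter key.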
In actual processing, a user can execute a real-time processing application from the client device (Client) with an SQL statement, for example [start application app]. After receiving the command to execute the real-time processing application, the master server executes it through the following processing steps:
First, the master server obtains the SQL statements of the real-time processing application. In the scenario of the preceding example, the relevant information of the real-time processing application is persisted in MySQL through the MetaStore. In obtaining the SQL statements, the master server therefore first obtains the real-time processing application from the database through the metadata store: after receiving the command to execute the real-time processing application, it sends a request to the MetaStore, and the MetaStore retrieves the relevant information of the application from the MySQL data table and returns it to the master server. Because the relevant information contains the SQL statements, the master server can obtain the SQL statements of the operation that performs real-time processing on data. For example, the master server may finally obtain the SQL statement of the operation that writes the filtered data into the result table: [insert into result select * from errorlogs].
Next, the master server obtains, from the SQL statements, the operators of the operation that performs real-time processing on data. In one embodiment of this application, the master server can use an SQL compiler (Compiler) to parse the SQL statements of the real-time processing application into an execution plan (Execution Plan). The execution plan contains several operators: ROp, the operator that reads data from Kafka; FOp, the operator that filters the data in the middle; and SOp, the operator that outputs the final result.
Finally, the master server submits the operators to the computing cluster, and the computing cluster executes them to carry out the real-time processing of the data. In the scenario of the embodiment of this application, for example, the master server submits the execution plan containing the operators to the Spark Cluster, where the executors (Executor) of the Spark Cluster execute it. The execution flow is shown in Fig. 6.
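The step from SQL statement to execution plan can be illustrated with a toy compiler. This is not the embodiment's SQL compiler: the sketch handles only the `like "%...%"` filter from the example statement, the `Operator` type and function names are hypothetical, and the plan runs locally in a loop rather than on a Spark Cluster.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Operator:
    name: str
    apply: Callable[[List[dict]], List[dict]]


def compile_plan(filter_sql, source_rows, result_table):
    """Build a three-operator plan (ROp, FOp, SOp) from one filter statement."""
    match = re.search(r'like\s+"%(.+?)%"', filter_sql, re.IGNORECASE)
    if match is None:
        raise ValueError("unsupported statement: " + filter_sql)
    keyword = match.group(1)

    def write(rows):
        result_table.extend(rows)
        return rows

    return [
        # ROp: read rows from the (stand-in) data source.
        Operator("ROp", lambda _rows: list(source_rows)),
        # FOp: keep rows whose message contains the keyword.
        Operator("FOp", lambda rows: [r for r in rows if keyword in r["message"]]),
        # SOp: write the surviving rows to the (stand-in) result table.
        Operator("SOp", write),
    ]


def execute(plan):
    """Run the operators in order, feeding each one the previous output."""
    rows: List[dict] = []
    for operator in plan:
        rows = operator.apply(rows)
    return rows


source_rows = [
    {"id": 1, "name": "svc-a", "message": "ERROR disk full"},
    {"id": 2, "name": "svc-b", "message": "INFO ok"},
]
result_table: List[dict] = []
plan = compile_plan(
    'create stream errorLogs as select * from source where message like "%ERROR%"',
    source_rows,
    result_table,
)
execute(plan)
print([operator.name for operator in plan], result_table)
```

Representing the plan as an ordered list of operators keeps the compile step separate from the execute step, which is the separation that lets the master server hand the plan to the computing cluster.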
Based on the same inventive concept, an embodiment of this application also provides a fault recovery server for a real-time processing engine. The method corresponding to the fault recovery server is the real-time processing engine fault recovery method of the foregoing embodiment, and the principle by which it solves the problem is similar to that of the method.
Fig. 7 shows a fault recovery server for a real-time processing engine provided by an embodiment of this application, which includes a switching device 710 and a real-time processing device 720. The real-time processing engine contains multiple such servers; when the real-time processing engine starts, the servers start at the same time and each applies for the synchronization lock.
For any server (Server), its switching device 710 makes this server the master server (Active Server) when the synchronization lock is acquired. The master server obtains the right to use the resources of the computing cluster and provides service externally; in other words, the data processing function of the real-time processing engine is carried out by the master server calling on the resources of the computing cluster. Correspondingly, the switching device 710 makes this server a standby server (Standby Server) when the synchronization lock is not acquired; a standby server does not obtain the right to use the resources of the computing cluster and does not provide service externally.
In a practical scenario, the servers may apply to the coordination service system Zookeeper for the synchronization lock, and the computing cluster may be a Spark Cluster. A server that holds the right to use the computing cluster's resources can complete the corresponding real-time processing tasks through the executors (Executor) that the Spark Cluster provides, as shown in Fig. 1. When the real-time processing engine first starts, two servers start at the same time and each applies to Zookeeper for the single synchronization lock. The server that obtains the lock becomes the Active Server, starts providing service, and obtains the right to use the Spark Cluster's resources; the other server, which did not obtain the lock, becomes the Standby Server. Those skilled in the art will understand that the number of each kind of element shown in the figure may be smaller than in a practical scenario (for example, there may be more than two servers, and every server that does not obtain the synchronization lock becomes a Standby Server); this omission is made on the premise that it does not affect a clear and sufficient disclosure of the invention.
When providing service externally, the master server uses the resources of the computing cluster to execute a real-time processing application and so completes the real-time processing task on the data. A basic data processing task includes at least the following: defining a real-time data source, analyzing and processing the data from the data source, and writing the results to the specified storage. Fig. 2 is a data flow diagram of the real-time processing task in one embodiment of this application. The distributed publish-subscribe messaging system Kafka serves as the data source, real-time data is read from a Kafka message queue, and the analysis mainly consists of writing the error log entries (log entries that contain ERROR) into the result table of a database. Specifically, the master server starts a receiver (Receiver) to read data from Kafka, a filter (Filter) then removes the log entries that do not contain ERROR, and a Sink operation writes the filtered data into the result table of the database (Database).
In a practical scenario, when the master server fails, its switching device 710 releases the synchronization lock, which triggers the standby server to apply for it. The switching device 710 of the standby server applies for the synchronization lock when the master server releases it, and makes this server the new master server once that application obtains the released lock. Automatic switchover between master and standby servers is achieved in this way.
To ensure that real-time processing tasks continue smoothly across a master/standby switchover, in the solution provided by the embodiment of this application the real-time processing device 720 of the master server, when executing a real-time processing application, records application information about the currently executing real-time processing application, so that the standby server, on becoming the master server, continues executing the corresponding real-time processing application by obtaining the application information. That is, when the standby server becomes the new master server, its real-time processing device 720 obtains the application information and continues executing the corresponding real-time processing application according to it.
In one embodiment of this application, the master server can use a coordination service system (such as Zookeeper) to record the application information about the currently executing real-time processing application. The processing performed by its real-time processing device 720 is as follows: create a record node in the coordination service system, then write the application information about the currently executing real-time processing application into the record node. Fig. 3 is a schematic diagram of tracking the state of a real-time processing application with Zookeeper. After the real-time processing application starts, the master server tracks its state and records the application information of the currently running real-time processing application in memory; it then creates (create) a record node [/Running/app] in Zookeeper to record the application information.
Further, when execution of the real-time processing application stops, the real-time processing device 720 of the master server deletes the application information and the record node that correspond to the real-time processing application in the coordination service system. In the example of Fig. 3, when the real-time processing application stops executing, the master server deletes the application information about that application from memory and deletes (Remove) the corresponding record node [/Running/app] from Zookeeper.
Correspondingly, when a standby server becomes the new master server, its real-time processing device 720
reads the application information about the currently executing real-time processing application from the
record node of the coordination service system. Because the master server records the application
information of the currently executing application, and the standby server reads that information on
becoming the new master server and continues executing the application, a complete automatic fault recovery
mechanism can be realized. Fig. 4 shows the principle by which the scheme of the embodiment of the
application realizes fault recovery. When the master server (Active Server) fails, the standby server
(Standby Server) can acquire the synchronization lock from the coordination service system Zookeeper and
thereby become the new Active Server. It then obtains the right to use the resources of the computing
cluster (Spark Cluster) and reads from Zookeeper the application information in the record node
[/Running/app], that is, the information about the real-time processing application that was still running
when the original master server failed. The new Active Server then resubmits that real-time processing
application to the Spark Cluster for execution, so as to complete the real-time processing task left
unfinished and realize automatic fault recovery.
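The recovery sequence of Fig. 4 can be sketched as follows, under simplifying assumptions. Coordinator, Cluster and Server are hypothetical in-memory classes written for this example; they model the lock and the record nodes only and are not the Zookeeper or Spark APIs. The failing server here releases the lock explicitly, which stands in for whatever release mechanism a real deployment uses.

```python
class Coordinator:
    """Hypothetical stand-in for the coordination service: one lock plus record nodes."""

    def __init__(self):
        self.lock_holder = None
        self.records = {}  # record node path -> application information

    def try_acquire(self, server_id):
        if self.lock_holder is None:
            self.lock_holder = server_id
        return self.lock_holder == server_id

    def release(self, server_id):
        if self.lock_holder == server_id:
            self.lock_holder = None


class Cluster:
    """Hypothetical stand-in for the computing cluster; it only logs submissions."""

    def __init__(self):
        self.submitted = []

    def submit(self, app_name, info):
        self.submitted.append(app_name)


class Server:
    def __init__(self, server_id, coordinator, cluster):
        self.server_id = server_id
        self.coordinator = coordinator
        self.cluster = cluster
        self.active = False

    def try_become_active(self):
        if self.active:
            return True
        if not self.coordinator.try_acquire(self.server_id):
            return False  # another server holds the lock; remain standby
        self.active = True
        # Recovery: resubmit every application still recorded as running.
        for path, info in sorted(self.coordinator.records.items()):
            self.cluster.submit(path.rsplit("/", 1)[-1], info)
        return True

    def start_application(self, name, info):
        self.coordinator.records[f"/Running/{name}"] = info
        self.cluster.submit(name, info)

    def fail(self):
        # Simulated failure: the server stops serving and the lock is released.
        self.active = False
        self.coordinator.release(self.server_id)
```

For example, if server A becomes active, starts "app" and then fails, server B acquires the lock on its next attempt and submits "app" to the cluster again.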
Further, in the embodiment of the application, the processing operations of the real-time processing
application that implements the data processing are defined by SQL statements. Specifically, these
processing operations may include the operation of creating the real-time processing application, the
operations of processing data in real time, and so on. By contrast, prior-art real-time processing engines
such as Apache Flink and Spark Streaming require these operations to be defined by writing application code
in Java, Scala or the like. To define the processing operations of a real-time processing application, the
user has to build a programming environment from scratch, obtain the dependent SDK, package the code, deploy
it to the cluster for testing and use, and be familiar with the various APIs and with the logic of
distributed systems, which is complicated and inefficient. Defining the processing operations of a
real-time processing application with SQL statements requires no programming environment and no SDK
dependency, simplifies configuration and modification, and makes management convenient.
In one embodiment of the application, the processing operations of a real-time processing application may be
defined by SQL statements such as the following. The operation of creating the real-time processing
application may be [create application app [properties ("parallelism" = "2")]]. The operations of processing
data in real time may include: an operation defining a real-time data source on Kafka,
[create stream source (id int, name string, message string) streamproperties ("source" = "kafka",
"kafka.zookeeper" = "broker:2181", "topic" = "source")]; a filter operation on the real-time data,
[create stream errorLogs as select * from source where message like "%ERROR%"]; an operation writing the
filtered data into a result table, [insert into result select * from errorlogs]; and so on.
Because the processing operations of a real-time processing application are defined by SQL statements, the
server provided by the embodiment of the application creates a real-time processing application as follows:
on obtaining a creation request for a real-time processing application, the server persists the real-time
processing application into a database through the metadata storage.
The creation request may come from a client device: a user creates a real-time processing application with
a specific function through the client device in order to complete a corresponding real-time processing
task. On receiving the creation request, the server (Server) therefore saves the relevant information of the
real-time processing application into a database through the metadata storage (MetaStore). Specifically,
the database may be a database that uses the SQL language, such as MySQL. Fig. 5 shows the flow of creating
a real-time processing application in one embodiment of the application. On receiving the creation request,
the server sends the creation request (create request) to the MetaStore, and the MetaStore in turn sends a
corresponding write request (write request) to MySQL, thereby realizing persistent storage.
Specifically, the relevant information of a real-time processing application may include the following
fields: the identification information of the real-time processing application (ID), its name (Name), its
creation time (CreateTime), its most recent modification time (LastModifyTime), the SQL statement of the
corresponding task to be executed (Command), and so on. Its specific table structure in MySQL is shown in Table 1:
In addition, when creating a real-time processing application, the user may also specify some configuration
that applies when the application is executed. Table 2 shows a table structure for the configuration of a
real-time processing application in MySQL, which may include the following fields: the identification
information of the real-time processing application (ID), the parameter key (PARAM_KEY), the parameter value (PARAM_VALUE), and so on.
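The persistence path described above (server, MetaStore, database) can be sketched as follows. This is an illustration under assumptions: the standard-library sqlite3 module stands in for MySQL so that the sketch is self-contained, the table names application and application_config and the column types are hypothetical, and only the field names (ID, Name, CreateTime, LastModifyTime, Command, PARAM_KEY, PARAM_VALUE) come from the description.

```python
import sqlite3
import time


class MetaStore:
    """Hypothetical metadata store; sqlite3 stands in for the MySQL database."""

    def __init__(self, conn):
        self.conn = conn
        # Fields follow the description of Table 1; names and types are assumed.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS application ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT UNIQUE NOT NULL, "
            "create_time INTEGER NOT NULL, last_modify_time INTEGER NOT NULL, "
            "command TEXT NOT NULL)"
        )
        # Fields follow the description of Table 2; names and types are assumed.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS application_config ("
            "id INTEGER NOT NULL, param_key TEXT NOT NULL, "
            "param_value TEXT NOT NULL, PRIMARY KEY (id, param_key))"
        )

    def create_application(self, name, command, config=None):
        """Persist one application and its optional configuration; return its ID."""
        now = int(time.time())
        with self.conn:  # commit both inserts together, or roll back on error
            cur = self.conn.execute(
                "INSERT INTO application (name, create_time, last_modify_time, command) "
                "VALUES (?, ?, ?, ?)",
                (name, now, now, command),
            )
            app_id = cur.lastrowid
            for key, value in (config or {}).items():
                self.conn.execute(
                    "INSERT INTO application_config VALUES (?, ?, ?)",
                    (app_id, key, str(value)),
                )
        return app_id

    def get_command(self, name):
        row = self.conn.execute(
            "SELECT command FROM application WHERE name = ?", (name,)
        ).fetchone()
        return None if row is None else row[0]

    def get_config(self, name):
        rows = self.conn.execute(
            "SELECT c.param_key, c.param_value FROM application_config c "
            "JOIN application a ON a.id = c.id WHERE a.name = ?",
            (name,),
        ).fetchall()
        return dict(rows)
```

A server holding such a store could call create_application on each creation request and get_command later, when the application is started.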
In actual processing, the user may execute a real-time processing application from the client device
(Client) using an SQL statement, for example [start application app]. After the master server receives the
command to execute the real-time processing application, its real-time processing device 720 executes the
real-time processing application, which specifically includes the following steps:
First, the real-time processing device 720 of the master server obtains the SQL statement of the real-time
processing application. In the scenario of the preceding example, the relevant information of the real-time
processing application has been persisted in MySQL through the MetaStore. In this case, when the master
server obtains the SQL statement, its real-time processing device 720 first obtains the real-time
processing application from the database through the metadata storage. That is, after receiving the
command to execute the real-time processing application, it sends a request to the MetaStore, the MetaStore
obtains the relevant information about the real-time processing application from the MySQL data table, and
that information is then returned to the master server. Because the relevant information contains the SQL
statement, the master server thereby obtains the SQL statement of the real-time processing application that
defines the operations for processing data in real time. For example, the master server may finally obtain
the SQL statement of the real-time processing application for the operation of writing the filtered data
into the result table: [insert into result select * from errorlogs].
Then, the real-time processing device 720 of the master server obtains, from the SQL statement, the operators
that carry out the real-time processing operations on the data. In one embodiment of the application, the
master server may use an SQL compiler (Compiler) to parse the SQL statement of the real-time processing
application and generate an execution plan (Execution Plan). The execution plan includes several operators:
ROp is the operator that reads data from Kafka, FOp is the operator that filters the intermediate data, and
SOp is the operator that outputs the final result.
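The mapping from SQL statements to operators can be illustrated with a toy compiler. This is a sketch only: it recognizes just the three statement shapes used as examples in this description, by regular expression, and is not the compiler of the application. The operator fields, the regular expressions and the case-insensitive handling of stream names are assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ROp:  # reads data from a source such as Kafka
    source: str
    topic: str


@dataclass(frozen=True)
class FOp:  # keeps rows whose column contains a substring
    column: str
    substring: str


@dataclass(frozen=True)
class SOp:  # writes the final result to a table
    table: str


SOURCE_RE = re.compile(
    r'create stream (\w+) \((.*?)\) streamproperties \((.*)\)$', re.I)
FILTER_RE = re.compile(
    r'create stream (\w+) as select\s*\*\s*from (\w+) where (\w+) like "%(.*)%"$', re.I)
SINK_RE = re.compile(r'insert into (\w+) select\s*\*\s*from (\w+)$', re.I)
PROP_RE = re.compile(r'"([^"]+)"\s*=\s*"([^"]+)"')


def compile_plan(statements):
    """Return the operator chain [ROp, FOp..., SOp] for the insert statement."""
    streams = {}  # lower-cased stream name -> operators that produce it
    for stmt in statements:
        stmt = stmt.strip()
        if m := SOURCE_RE.match(stmt):
            props = dict(PROP_RE.findall(m.group(3)))
            streams[m.group(1).lower()] = [ROp(props["source"], props["topic"])]
        elif m := FILTER_RE.match(stmt):
            upstream = streams[m.group(2).lower()]
            streams[m.group(1).lower()] = upstream + [FOp(m.group(3), m.group(4))]
        elif m := SINK_RE.match(stmt):
            return streams[m.group(2).lower()] + [SOp(m.group(1))]
        else:
            raise ValueError(f"unsupported statement: {stmt}")
    raise ValueError("no insert statement found")
```

Applied to the three example statements (stream definition, filter, insert), the sketch yields a plan of one ROp, one FOp and one SOp, in that order.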
Finally, the real-time processing device 720 of the master server submits the operators to the computing
cluster, and the computing cluster executes the operators to realize the real-time processing operations on
the data. Taking the scenario of this embodiment as an example, the master server submits the execution
plan containing the operators to the Spark Cluster, where it is executed by the executors (Executor) of the
Spark Cluster. The specific execution flow is shown in Fig. 6.
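How an executor might apply such an operator chain can be sketched as follows. LocalExecutor is a hypothetical single-process stand-in for an executor of the computing cluster, the tuple encoding of operators is an assumption, and the sketch processes one finite batch of records, whereas a real stream is unbounded.

```python
class LocalExecutor:
    """Hypothetical stand-in for a cluster executor; runs one finite batch."""

    def __init__(self, topics):
        self.topics = topics  # topic name -> list of row dicts (stand-in for Kafka)
        self.tables = {}      # table name -> list of written rows

    def execute(self, plan):
        rows = []
        for op, arg in plan:
            if op == "ROp":      # read the batch from the named topic
                rows = list(self.topics[arg])
            elif op == "FOp":    # keep rows whose column contains the substring
                column, substring = arg
                rows = [row for row in rows if substring in row[column]]
            elif op == "SOp":    # append the surviving rows to the result table
                self.tables.setdefault(arg, []).extend(rows)
            else:
                raise ValueError(f"unknown operator: {op}")
        return rows
```

With a topic holding one row whose message contains "ERROR" and one that does not, the plan ROp, FOp, SOp writes only the first row to the result table.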
In summary, in the scheme provided by the application, any server that acquires the synchronization lock at
startup becomes the master server and provides service externally. If the master server fails while
providing service, it releases the synchronization lock, which triggers the standby server to apply for the
lock, so that the standby server can acquire the synchronization lock and become the new master server. In
addition, when executing a real-time processing application, the master server records the application
information about the currently executing application, so that the standby server, on becoming the master
server, obtains that application information and continues executing the corresponding real-time
processing application, thereby realizing automatic fault recovery.
Additionally, the scheme of the application defines the processing operations of a real-time processing
application using SQL statements, which requires no programming environment and no SDK dependency,
simplifies configuration and modification, and makes management convenient.
It should be noted that the application may be implemented in software and/or in a combination of software
and hardware, for example by means of an application-specific integrated circuit (ASIC), a general-purpose
computer or any other similar hardware device. In one embodiment, the software program of the application
may be executed by a processor to realize the steps or functions described above. Likewise, the software
program of the application (including related data structures) may be stored in a computer-readable
recording medium, for example a RAM memory, a magnetic or optical drive, a floppy disk or a similar device.
In addition, some steps or functions of the application may be implemented in hardware, for example as
circuitry that cooperates with a processor to perform each step or function.
In addition, part of the application may be applied as a computer program product, for example computer
program instructions which, when executed by a computer, can invoke or provide the method and/or technical
solution according to the application through the operation of that computer. The program instructions
that invoke the method of the application may be stored in a fixed or removable recording medium, and/or
transmitted via a data stream in a broadcast or other signal-bearing medium, and/or stored in the working
memory of a computer device that runs according to the program instructions. Here, one embodiment of the
application includes a device comprising a memory for storing computer program instructions and a
processor for executing the program instructions, wherein, when the computer program instructions are
executed by the processor, the device is triggered to run the methods and/or technical solutions based on
the foregoing embodiments of the application.
It is obvious to a person skilled in the art that the application is not limited to the details of the
exemplary embodiments above, and that the application can be realized in other specific forms without
departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all
respects as exemplary and non-restrictive. The scope of the application is defined by the appended claims
rather than by the description above, and all changes falling within the meaning and range of equivalency
of the claims are intended to be included in the application. No reference sign in the claims should be
construed as limiting the claim concerned. Furthermore, it is clear that the word "comprising" does not
exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices
recited in a device claim may also be implemented by a single unit or device through software or hardware.