CN105335362B - The processing method and system of real time data, instant disposal system for treating - Google Patents
The processing method and system of real time data, instant disposal system for treating Download PDFInfo
- Publication number
- CN105335362B CN105335362B CN201410229319.1A CN201410229319A CN105335362B CN 105335362 B CN105335362 B CN 105335362B CN 201410229319 A CN201410229319 A CN 201410229319A CN 105335362 B CN105335362 B CN 105335362B
- Authority
- CN
- China
- Prior art keywords
- node
- real
- processing system
- time
- time data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000003672 processing method Methods 0.000 title abstract description 4
- 238000012545 processing Methods 0.000 claims abstract description 296
- 238000009826 distribution Methods 0.000 claims abstract description 19
- 238000003860 storage Methods 0.000 claims description 29
- 238000000034 method Methods 0.000 claims description 25
- 238000004364 calculation method Methods 0.000 claims description 20
- 238000012544 monitoring process Methods 0.000 claims description 6
- 238000011161 development Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 238000004891 communication Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Landscapes
- Processing Of Solid Wastes (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application provides the processing method and system, instant disposal system for treating of a kind of real time data.The embodiment of the present application obtains the real time data that real time processing system is sent by merge node, and then according to distribution policy, in at least one described local node, or in merge node and at least one described local node, determine a node, to enable processing node that the real time data is written in the instant disposal system for treating as processing node, so that realizing instant disposal system for treating flexibly can be calculated or be inquired for real time data.
Description
[ technical field ] A method for producing a semiconductor device
The present application relates to data processing technologies, and in particular, to a method and a system for processing real-time data, and an instant processing system.
[ background of the invention ]
The instant processing system can perform flexible calculation or inquiry aiming at offline data, namely calculation of unpredictable rules. For example, the proportion of men and women of the honest and non-honest members in the members is calculated, or, for example, the proportion of transaction medals of the honest and non-honest members in the members is calculated, or, for example, the proportion of field authentication of the honest and non-honest members in the members is calculated.
However, existing point-in-time processing systems do not allow for flexible computation or querying of real-time data.
[ summary of the invention ]
Various aspects of the present application provide a method and a system for processing real-time data, and an instant processing system, so as to implement that the instant processing system can flexibly calculate or query real-time data, and is particularly suitable for calculating or querying mass real-time data.
In one aspect of the present application, a method for processing real-time data is provided, where the method is applied to an instant processing system, where the instant processing system includes a merge node and at least one local node, and the method includes:
the merging node obtains real-time data sent by a real-time processing system;
the merging node determines a node as a processing node according to a distribution strategy, and the node is used for writing the real-time data into the instant processing system; wherein,
the processing node comprises the merge node or one of the at least one local node.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where after the merging node determines a node according to the distribution policy, the method further includes:
and the processing node writes the real-time data into the instant processing system.
The above aspect and any possible implementation manner further provide an implementation manner that the writing, by the processing node, the real-time data into the just-in-time processing system includes:
the processing node creates a full-text index file of the real-time data;
and the processing node writes the full-text index file into the instant processing system.
The above aspect and any possible implementation manner further provide an implementation manner, where the writing, by the processing node, the full-text index file into the just-in-time processing system includes:
the processing node monitors the state condition of the real-time data;
and if the state condition meets a first writing condition, the processing node writes the full-text index file into a quick storage device of the instant processing system.
The foregoing aspects and any possible implementations further provide an implementation, where if the status condition satisfies a first writing condition, the writing, by the processing node, the full-text index file into a fast storage device of the instant processing system includes:
and if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, the processing node writes the full-text index file into a fast storage device of the instant processing system.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where the processing node writes the full-text index file into the instant processing system, and further includes:
and if the state condition meets a second writing condition, the processing node writes the full-text index file into the storage device of the instant processing system and writes the full-text index file into the slow storage device of the instant processing system.
The foregoing aspects and any possible implementations further provide an implementation, where if the status condition satisfies a second writing condition, the writing, by the processing node, the full-text index file into a slow storage device of the instant processing system includes:
and if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document quantity, the processing node writes the full-text index file into a slow storage device of the instant processing system.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where after the processing node monitors the status condition of the real-time data, the processing node further includes:
and if the state condition meets at least one of the first writing condition and the second writing condition, starting a new query engine by the processing node for carrying out instant query on the real-time data.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where after the processing node starts a new query engine to perform an instant query on the real-time data, the processing node further includes:
the merging node receives a data query request, wherein the data query request comprises query conditions;
the merging node distributes a data query request to the at least one local node;
each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively execute the calculation operation corresponding to the query condition to obtain a query result, and return the query result to the merge node;
and the merging node merges the query results to obtain a final query result.
In another aspect of the present application, there is provided an instant processing system comprising a merge node and at least one local node, wherein,
the merging node is used for acquiring real-time data sent by the real-time processing system; determining a processing node according to the distribution strategy; wherein the processing node comprises one of the merge node or the at least one local node;
and the processing node is used for writing the real-time data into the instant processing system.
The above-described aspects and any possible implementations further provide an implementation in which the merge node and the at least one local node form a distributed cloud architecture.
The above-described aspects and any possible implementation further provide an implementation of the processing node, which is specifically configured to
Creating a full-text index file of the real-time data;
and writing the full-text index file into the instant processing system.
The above-described aspects and any possible implementation further provide an implementation of the processing node, which is specifically configured to
Monitoring the state condition of the real-time data; and
and if the state condition meets a first writing condition, writing the full-text index file into a quick storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation of the processing node, which is specifically configured to
And if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, the processing node writes the full-text index file into a fast storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation of the processing node, which is specifically configured to
And if the state condition meets a second writing condition, writing the full-text index file into the storage device of the instant processing system and writing the full-text index file into the slow storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation of the processing node, which is specifically configured to
And if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document quantity, the processing node writes the full-text index file into a slow storage device of the instant processing system.
The above-described aspects and any possible implementation further provide an implementation in which the processing node is further configured to
And if the state condition meets at least one of the first writing condition and the second writing condition, starting a new query engine for performing instant query on the real-time data.
The above-mentioned aspects and any possible implementation further provide an implementation that the merge node is further configured to
Receiving a data query request, wherein the data query request comprises query conditions;
distributing the data query request to the at least one local node, so that each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively execute the calculation operation corresponding to the query condition to obtain a query result;
obtaining the query result;
and combining the query results to obtain a final query result.
In another aspect of the present application, a real-time data processing system is provided, which includes a real-time processing system and the real-time processing system provided in the above aspect; wherein,
and the real-time processing system is used for sending the real-time data to the instant processing system.
According to the technical scheme, the real-time data sent by the real-time processing system is obtained through the merging node, and then one node is determined in the at least one local node or the merging node and the at least one local node according to the distribution strategy to serve as the processing node, so that the processing node can write the real-time data into the real-time processing system, and the real-time processing system can flexibly calculate or inquire the real-time data.
In addition, by adopting the technical scheme provided by the application, the real-time processing system can flexibly calculate or query the real-time data, so that the flexibility of real-time data real-time query can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and those skilled in the art can also obtain other drawings according to the drawings without inventive labor.
Fig. 1 is a schematic flowchart of a method for processing real-time data according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a real-time processing system according to another embodiment of the present application;
fig. 3 is a schematic structural diagram of a real-time data processing system according to another embodiment of the present application.
[ detailed description ] embodiments
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
Fig. 1 is a schematic flowchart of a processing method of real-time data according to an embodiment of the present application, and is applied to a real-time processing system, as shown in fig. 2, where the real-time processing system includes a merge node and at least one local node, as shown in fig. 1.
101. And the merging node acquires real-time data sent by the real-time processing system.
Optionally, in a possible implementation manner of this embodiment, the real-time Data received by the merge node may be stream Data (Streaming Data) generated by a real-time Data source directly received by a real-time processing system, for example, stream Data generated by an application system such as a network monitoring system, a financial analysis system, a traffic flow prediction system, and the World Wide Web (Web for short), or may also be a calculation result obtained by the real-time processing system performing calculation, such as summation, according to a fixed rule on the stream Data generated by the received real-time Data source, which is not particularly limited in this embodiment.
Specifically, the real-time processing system may receive, by calling an Application Programming Interface (API) of the real-time processing system, a data write request sent by a request processing component of the real-time processing system, where the data write request includes real-time data. And the request processing component sends the data write request to a merge node, and the merge node is responsible for processing the data write request.
The merge node may be a pre-configured fixed node, or may also be a local node randomly selected by the request processing component, or may also be a local node selected by the request processing component according to an election policy, which is not particularly limited in this embodiment.
102. And the merging node determines a node as a processing node according to a distribution strategy, and is used for writing the real-time data into the instant processing system.
Wherein the processing node comprises one of the merge node or the at least one local node.
Optionally, in a possible implementation manner of this embodiment, if the merge node is a pre-configured fixed node, in 102, the merge node may determine a node from the at least one local node according to a distribution policy, so as to serve as the processing node. In this case, the processing node may be one of the at least one local node.
Optionally, in a possible implementation manner of this embodiment, if the merge node is a local node randomly selected by the request processing component, or a local node selected by the request processing component according to an election policy, in 102, the merge node may determine a node from the merge node and the at least one local node according to a distribution policy, so as to serve as the processing node. In this case, the processing node may be the merge node or one of the at least one local node.
Specifically, the distribution policy may include, but is not limited to, a hash operation policy and a polling policy, which is not particularly limited in this embodiment.
For example, the merge node may perform a hash operation on identification Information (ID) of the real-time data to determine a corresponding node as a processing node; or, for another example, the merge node may further adopt a polling policy to sequentially select one node as a processing node; this embodiment is not particularly limited.
Optionally, in a possible implementation manner of this embodiment, after 102, the processing node writes the real-time data into the immediate processing system.
In particular, the merge node may deliver a data write request to a processing node. It is understood that, if the merge node is a pre-configured fixed node, the merge node may specifically forward the data write request to the processing node; if the merge node is a local node randomly selected by the request processing component or a local node selected by the request processing component according to the election policy, the merge node may specifically not forward any more, and directly perform subsequent operations on the data write request.
After receiving the data write request, the processing node may specifically create a full-text index file of the real-time data. The processing node may then write the full-text index file to the instant processing system. Specifically, the detailed description of the method for creating the full-text index file can refer to the related contents in the prior art, and is not repeated here.
Therefore, the processing node creates the full-text index file of the real-time data and writes the full-text index file into the instant processing system, so that the full-text index file of the real-time data can be used for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
The term "real-time processing" as used herein, also referred to as streaming processing, i.e., process query and calculation, refers to processing performed whenever the application is running. For example, in a double 11 campaign of large selling websites, the current transaction amount is calculated at any time. In Alibaba, this application is called the trade live room. The data streams are generated continuously with time, and the time interval of each data stream (which can be set to be in the order of seconds, minutes and the like according to the application) is calculated once.
The "immediate processing" referred to herein means processing performed in a short time after the application is run. For example, the user makes a request, completes the internal computation of the application in a short time, and then returns the result.
After the processing node creates the full-text index file of the real-time data, the full-text index file can be immediately written into the instant processing system, or the full-text index file can not be immediately written into the instant processing system, but the full-text index file is written into the instant processing system after the real-time data meets certain conditions. Therefore, the system overhead of the instant processing system can be effectively reduced.
For example, the processing node may selectively execute the operation of writing the full-text index file into the real-time processing system by monitoring a status condition of the real-time data.
For example, if the status condition satisfies a first writing condition, for example, the receiving time of the real-time data reaches a first maximum visible time, maxTime1, or, for example, the amount of the real-time data reaches a first maximum document number, maxDoc1, etc., the processing node may write the full-text index file into a fast storage device, for example, a memory, of the instant processing system. The Memory of the instant processing system may be a Memory of a computer, or may also be a running Memory of a mobile phone, i.e., a system Memory, such as a Random Access Memory (RAM), and the like, which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the status condition of the real-time data satisfies the first write condition may also be referred to as a soft commit operation.
For another example, if the state condition satisfies a second writing condition, for example, the receiving time of the real-time data reaches a second maximum visible time, that is, maxTime2, or for example, the number of the real-time data reaches a second maximum document number, that is, maxDoc2, and the like, the processing node may write the full-text index file into a slow storage device of the instant processing system, for example, a hard disk, or may also be a non-operating Memory of a mobile phone, that is, a physical Memory, for example, a Read-Only Memory (ROM), a Memory card, and the like, which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the status condition of the real-time data satisfies the first write condition may also be referred to as a hard commit operation.
Or, for another example, if the status condition satisfies at least one of the first writing condition and the second writing condition, for example, the first writing condition may be that the receiving time of the real-time data reaches a first maximum visible time, maxTime1, or for another example, the number of the real-time data reaches a second maximum number of documents, maxDoc1, or the like; for example, the second writing condition may be that the receiving time of the real-time data reaches a first maximum visible time, maxTime2, or, for example, the amount of the real-time data reaches a second maximum document number, maxDoc2, and the like, and the processing node may further start a new query engine to perform an instant query on the real-time data.
It is to be understood that the first writing condition and the second writing condition have no relation, and this embodiment is not particularly limited thereto.
Therefore, the instant processing system can write the real-time data into the instant processing system in a full-text index file form, so that the instant query of the real-time data is ensured.
In this way, the application system program can receive the sent data query request by the request processing component of the instant processing system by calling the query API of the instant processing system, wherein the data query request includes the query condition. And the request processing component sends the data query request to a merging node, and the merging node is responsible for processing the data query request.
In particular, the merge node may distribute the data query request to the at least one local node. It can be understood that, if the merge node is a pre-configured fixed node, the merge node may specifically distribute the data query request to the at least one local node, and the merge node does not perform subsequent operations on the data query request any more; if the merge node is a local node randomly selected by the request processing component or a local node selected by the request processing component according to the election policy, the merge node may specifically distribute the data query request to the at least one local node, and at the same time, the merge node continues to perform subsequent operations on the data query request.
After receiving the data query request, each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively perform query, that is, execute a calculation operation corresponding to a query condition, to obtain a query result, and return the query result to the merge node, and then merge the query results by the merge node, to obtain a final query result, and return the final query result to the application system.
At this point, the instant processing system finishes one flexible calculation or query aiming at the real-time data.
It should be noted that, in this embodiment, the merge node and the at least one local node form a distributed cloud architecture. Specifically, centralized information configuration and management can be performed through a configuration management center, such as ZooKeeper, and automatic fault tolerance and automatic load balancing of requests can be realized. For example, each node, i.e., the merge node and each of the at least one local node, may include an active node and at least one standby node. If the main node fails and is not available, one standby node can be used as the main node to continue providing service so as to realize automatic fault tolerance. Each standby node can be an active node according to a balancing strategy, and provides service, so that automatic load balancing of requests is realized.
With the development of the information society, more and more information is digitalized, especially with the development of the internet, data is explosively increased, and a large amount of real-time data appears, which can be called as mass real-time data. Due to the adoption of the distributed cloud architecture formed by the merging nodes and the at least one local node and the combination of the technical scheme provided by the invention, the massive real-time data can be well subjected to instant query or calculation.
In this embodiment, the real-time data sent by the real-time processing system is obtained through the merge node, and then a node is determined as a processing node in the at least one local node or the merge node and the at least one local node according to the distribution policy, so that the processing node can write the real-time data into the real-time processing system, thereby implementing flexible calculation or query of the real-time data by the real-time processing system.
In addition, by adopting the technical scheme provided by the application, the real-time processing system can flexibly calculate or query the real-time data, so that the flexibility of real-time data real-time query can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
Fig. 2 is a schematic structural diagram of a real-time processing system according to another embodiment of the present application, as shown in fig. 2. The instant processing system of the present embodiment may include a merge node 21 and at least one local node 221、222……22n. Wherein n is an integer greater than or equal to 1. The merge node 21 and the local node 221、222……22nThe connection is made through a wired or wireless network. A merge node 21 and at least one local node 221、222……22nThe nodes form a cluster, which may be formed by 1 or more servers, and each node occupies a part of the processing resources of the cluster, that is, each node corresponds to one server, and each server may have a plurality of nodes deployed thereon.
Wherein,
the merge node 21 is configured to obtain real-time data sent by a real-time processing system; determining a processing node according to the distribution strategy; wherein the processing nodes comprise the merge node 21 or the at least one local node 221、222……22nA local node 22 ofiAnd i is an integer greater than 0 and less than or equal to n.
And the processing node is used for writing the real-time data into the instant processing system.
Optionally, in a possible implementation manner of this embodiment, the real-time Data received by the merge node 21 may be stream Data (Streaming Data) generated by a real-time Data source directly received by a real-time processing system, for example, stream Data generated by an application system such as a network monitoring system, a financial analysis system, a traffic flow prediction system, a Web, and the like, or may also be a calculation result obtained by the real-time processing system by performing calculation, such as summation and the like, according to a fixed rule on the stream Data generated by the received real-time Data source, which is not particularly limited in this embodiment.
Specifically, the real-time processing system 25 may receive a data write request sent by a request processing component of the real-time processing system by calling an Application Programming Interface (API) 23 of the real-time processing system, where the data write request includes real-time data. The request processing component sends the data write request to the merge node 21, and the merge node 21 is responsible for processing the data write request.
Wherein the merge node 21 may be a pre-configured fixed node, or may also be a local node 22 randomly selected by the request processing componentjJ is an integer greater than 0 and less than or equal to n, or may be a local node 22 selected by the request processing component according to the election policyjJ is an integer greater than 0 and less than or equal to n, which is not particularly limited in this embodiment.
Optionally, in a possible implementation manner of this embodiment, if the merge node 21 is a pre-configured fixed node, the merge node 21 may send the at least one local node 22 according to a distribution policy1、222……22nDetermining a node as the processing node. In this case, the processing node may be the at least one local node 221、222……22nA local node 22 ofiAnd i is an integer greater than 0 and less than or equal to n.
Optionally, in a possible implementation manner of this embodiment, if the merge node 21 is a local node 22 that is randomly selected by the request processing componentjOr a local node 22 selected by the request processing component according to the election policyjThen, the merge node 21 may root the merge node 21 according to a distribution policyFrom the merge node 21 and the at least one local node 22 according to a distribution policy1、222……22nDetermining a node as the processing node. In this case, the processing node may be the merge node 21 or the at least one local node 221、222……22nA local node 22 ofi。
Specifically, the distribution policy may include, but is not limited to, a hash operation policy and a polling policy, which is not particularly limited in this embodiment.
For example, the merge node 21 may perform a hash operation on identification Information (ID) of the real-time data to determine a corresponding node as a processing node; or, for another example, the merge node 21 may further adopt a polling policy to sequentially select one node as a processing node; this embodiment is not particularly limited.
In particular, the merge node 21 may deliver a data write request to a processing node. It is understood that, if the merge node 21 is a pre-configured fixed node, the merge node 21 may specifically forward the data write request to the processing node; if the merge node 21 is a local node 22 randomly selected by the request processing componentjOr a local node 22 selected by the request processing component according to the election policyjSpecifically, the merge node 21 may not forward any more, and directly perform subsequent operations on the data write request.
After receiving the data write request, the processing node may be specifically configured to create a full-text index file of the real-time data; and writing the full-text index file into the instant processing system. Specifically, the detailed description of the method for creating the full-text index file can refer to the related contents in the prior art, and is not repeated here.
Therefore, the processing node creates the full-text index file of the real-time data and writes the full-text index file into the instant processing system, so that the full-text index file of the real-time data can be used for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
The term "real-time processing" as used herein, also referred to as streaming processing, i.e., process query and calculation, refers to processing performed whenever the application is running. For example, in a double 11 campaign of large selling websites, the current transaction amount is calculated at any time. In Alibaba, this application is called the trade live room. The data streams are generated continuously with time, and the time interval of each data stream (which can be set to be in the order of seconds, minutes and the like according to the application) is calculated once.
The "immediate processing" referred to herein means processing performed in a short time after the application is run. For example, the user makes a request, completes the internal computation of the application in a short time, and then returns the result.
After the processing node creates the full-text index file of the real-time data, the full-text index file can be immediately written into the instant processing system, or the full-text index file can not be immediately written into the instant processing system, but the full-text index file is written into the instant processing system after the real-time data meets certain conditions. Therefore, the system overhead of the instant processing system can be effectively reduced.
For example, the processing node may selectively execute the operation of writing the full-text index file into the real-time processing system by monitoring a status condition of the real-time data.
For example, if the status condition satisfies a first writing condition, for example, the receiving time of the real-time data reaches a first maximum visible time, maxTime1, or, for example, the amount of the real-time data reaches a first maximum document number, maxDoc1, etc., the processing node may write the full-text index file into a fast storage device, for example, a memory, of the instant processing system. The Memory of the instant processing system may be a Memory of a computer, or may also be a running Memory of a mobile phone, i.e., a system Memory, such as a Random Access Memory (RAM), and the like, which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the status condition of the real-time data satisfies the first write condition may also be referred to as a soft commit operation.
For another example, if the state condition satisfies a second writing condition, for example, the receiving time of the real-time data reaches a second maximum visible time, that is, maxTime2, or for example, the number of the real-time data reaches a second maximum document number, that is, maxDoc2, and the like, the processing node may write the full-text index file into a slow storage device of the instant processing system, for example, a hard disk, or may also be a non-operating Memory of a mobile phone, that is, a physical Memory, for example, a Read-Only Memory (ROM), a Memory card, and the like, which is not limited in this embodiment. In some cases, the full-text index file write operation performed when the status condition of the real-time data satisfies the first write condition may also be referred to as a hard commit operation.
Or, for another example, if the status condition satisfies at least one of the first writing condition and the second writing condition, for example, the first writing condition may be that the receiving time of the real-time data reaches a first maximum visible time, maxTime1, or for another example, the number of the real-time data reaches a second maximum number of documents, maxDoc1, or the like; for example, the second writing condition may be that the receiving time of the real-time data reaches a first maximum visible time, maxTime2, or, for example, the amount of the real-time data reaches a second maximum document number, maxDoc2, and the like, and the processing node may further start a new query engine to perform an instant query on the real-time data.
It is to be understood that the first writing condition and the second writing condition have no relation, and this embodiment is not particularly limited thereto.
Therefore, the instant processing system can write the real-time data into the instant processing system in a full-text index file form, so that the instant query of the real-time data is ensured.
Thus, the application system program 26 may receive a data query request sent by a request processing component of the instant processing system, the data query request including a query condition, by calling the query API24 of the instant processing system. The request processing component sends the data query request to the merge node 21, and the merge node 21 is responsible for processing the data query request.
In particular, the merge node 21 may distribute the data query request to the at least one local node 221、222……22n. It is understood that, if the merge node 21 is a pre-configured fixed node, the merge node 21 may specifically distribute the data query request to the at least one local node 221、222……22nThe merge node 21 no longer performs subsequent operations on the data query request; if the merge node 21 is a local node 22 randomly selected by the request processing componentjOr a local node 22 selected by the request processing component according to the election policyjThe merge node 21 may specifically distribute the data query request to the at least one local node 221、222……22nMeanwhile, the merge node 21 continues to perform subsequent operations on the data query request.
After receiving the data query request, the at least one local node 221、222……22nOf each local node, or of the at least one local node 221、222……22nEach local node and the merging node 21 in the system respectively perform query, i.e. execute the calculation operation corresponding to the query condition to obtain the query result, and return the query result to the merging node 21, so that the merging node 21 combines the query resultAnd merging the results to obtain a final query result, and returning the final query result to the application system.
At this point, the instant processing system finishes one flexible calculation or query aiming at the real-time data.
It should be noted that, in this embodiment, the merge node and the at least one local node form a distributed cloud architecture. Specifically, centralized information configuration and management can be performed through a configuration management center, such as ZooKeeper, and automatic fault tolerance and automatic load balancing of requests can be realized. For example, each node, i.e. the merge node 21 and the at least one local node 221、222……22nEach local node may include an active node and at least one standby node. If the main node fails and is not available, one standby node can be used as the main node to continue providing service so as to realize automatic fault tolerance. Each standby node can be an active node according to a balancing strategy, and provides service, so that automatic load balancing of requests is realized.
With the development of the information society, more and more information is digitalized, especially with the development of the internet, data is explosively increased, and a large amount of real-time data appears, which can be called as mass real-time data. Due to the adoption of the distributed cloud architecture formed by the merging nodes and at least one local node and the combination of the technical scheme provided by the invention, the massive real-time data can be well processed.
In this embodiment, the real-time data sent by the real-time processing system is obtained through the merge node, and then a node is determined as a processing node in the at least one local node or the merge node and the at least one local node according to the distribution policy, so that the processing node can write the real-time data into the real-time processing system, thereby implementing flexible calculation or query of the real-time data by the real-time processing system.
In addition, by adopting the technical scheme provided by the application, the real-time processing system can flexibly calculate or query the real-time data, so that the flexibility of real-time data real-time query can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
Fig. 3 is a schematic structural diagram of a real-time data processing system according to another embodiment of the present application, as shown in fig. 3. The real-time data processing system of the present embodiment may include a real-time processing system 31 and an instant processing system 32 provided in the embodiment corresponding to fig. 2. Wherein,
the real-time processing system 31 is configured to send the real-time data to the instant processing system 32.
For a detailed description of the instant processing system 32, reference may be made to relevant contents in the embodiment corresponding to fig. 2, which is not described herein again.
In this embodiment, the real-time data sent by the real-time processing system is obtained through the merge node, and then a node is determined as a processing node in the at least one local node or the merge node and the at least one local node according to the distribution policy, so that the processing node can write the real-time data into the real-time processing system, thereby implementing flexible calculation or query of the real-time data by the real-time processing system.
In addition, by adopting the technical scheme provided by the application, the real-time processing system can flexibly calculate or query the real-time data, so that the flexibility of real-time data real-time query can be effectively improved.
In addition, by adopting the technical scheme provided by the application, the full-text index file of the real-time data is created and written into the instant processing system, so that the full-text index file of the real-time data can be utilized for instant query of the real-time data, and query in the whole real-time data is not needed, therefore, the system overhead of the instant processing system can be reduced, and the efficiency of instant query of the real-time data is improved.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or page components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
Claims (16)
1. A method for processing real-time data, which is applied to a real-time processing system, wherein the real-time processing system comprises a merge node and at least one local node, and the method comprises the following steps:
the merging node obtains real-time data sent by a real-time processing system;
the merging node determines a node as a processing node according to a distribution strategy;
the processing node creates a full-text index file of the real-time data and monitors the state condition of the real-time data;
if the state condition meets a second writing condition, the processing node writes the full-text index file into a slow storage device of the instant processing system; wherein,
the processing node comprises the merge node or one of the at least one local node.
2. The method of claim 1, further comprising:
and if the state condition meets a first writing condition, the processing node writes the full-text index file into a quick storage device of the instant processing system.
3. The method of claim 2, wherein if the status condition satisfies a first writing condition, the processing node writing the full-text index file to a fast storage device of the instant processing system, comprising:
and if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, the processing node writes the full-text index file into a fast storage device of the instant processing system.
4. The method of claim 1, wherein if the status condition satisfies a second writing condition, the processing node writing the full-text index file to a slow storage device of the instant processing system, comprising:
and if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document quantity, the processing node writes the full-text index file into a slow storage device of the instant processing system.
5. The method of claim 2, wherein after the processing node monitors the status condition of the real-time data, further comprising:
and if the state condition meets the first writing condition, the processing node starts a new query engine for carrying out instant query on the real-time data.
6. The method of claim 1, wherein after the processing node monitors the status condition of the real-time data, further comprising:
and if the state condition meets the second writing condition, starting a new query engine by the processing node for carrying out instant query on the real-time data.
7. The method of claim 5 or 6, wherein the processing node starts a new query engine for performing the instant query of the real-time data, and further comprising:
the merging node receives a data query request, wherein the data query request comprises query conditions;
the merging node distributes a data query request to the at least one local node;
each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively execute the calculation operation corresponding to the query condition to obtain a query result, and return the query result to the merge node;
and the merging node merges the query results to obtain a final query result.
8. An instant processing system comprising a merge node and at least one local node; wherein,
the merging node is used for acquiring real-time data sent by the real-time processing system; determining a processing node according to the distribution strategy; wherein the processing node comprises one of the merge node or the at least one local node;
the processing node is used for creating a full-text index file of the real-time data and monitoring the state condition of the real-time data; and if the state condition meets a second writing condition, writing the full-text index file into a slow storage device of the instant processing system.
9. The just-in-time processing system of claim 8, wherein the merge node and the at least one local node form a distributed cloud architecture.
10. The just-in-time processing system of claim 8, wherein the processing node is further configured to
And if the state condition meets a first writing condition, writing the full-text index file into a quick storage device of the instant processing system.
11. The just-in-time processing system of claim 10, wherein the processing node is specifically configured to
And if the receiving time of the real-time data reaches a first maximum visible time or the quantity of the real-time data reaches a first maximum document number, the processing node writes the full-text index file into a fast storage device of the instant processing system.
12. The just-in-time processing system of claim 8, wherein the processing node is specifically configured to
And if the receiving time of the real-time data reaches a second maximum visible time or the quantity of the real-time data reaches a second maximum document quantity, the processing node writes the full-text index file into a slow storage device of the instant processing system.
13. The just-in-time processing system of claim 10, wherein the processing node is further configured to
And if the state condition meets the first writing condition, starting a new query engine for carrying out instant query on the real-time data.
14. The just-in-time processing system of claim 8, wherein the processing node is further configured to
And if the state condition meets the second writing condition, starting a new query engine for carrying out instant query on the real-time data.
15. The just-in-time processing system of claim 13 or 14, wherein the merge node is further configured to
Receiving a data query request, wherein the data query request comprises query conditions;
distributing the data query request to the at least one local node, so that each local node in the at least one local node, or each local node in the at least one local node and the merge node, respectively execute the calculation operation corresponding to the query condition to obtain a query result;
obtaining the query result;
and combining the query results to obtain a final query result.
16. A real-time data processing system comprising a real-time processing system and the instant processing system of any one of claims 8 to 14; wherein,
and the real-time processing system is used for sending the real-time data to the instant processing system.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410229319.1A CN105335362B (en) | 2014-05-28 | 2014-05-28 | The processing method and system of real time data, instant disposal system for treating |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410229319.1A CN105335362B (en) | 2014-05-28 | 2014-05-28 | The processing method and system of real time data, instant disposal system for treating |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105335362A CN105335362A (en) | 2016-02-17 |
CN105335362B true CN105335362B (en) | 2019-06-11 |
Family
ID=55285906
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410229319.1A Active CN105335362B (en) | 2014-05-28 | 2014-05-28 | The processing method and system of real time data, instant disposal system for treating |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105335362B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10122788B2 (en) * | 2016-03-29 | 2018-11-06 | Amazon Technologies, Inc. | Managed function execution for processing data streams in real time |
CN107391541B (en) * | 2017-05-16 | 2020-10-20 | 创新先进技术有限公司 | Real-time data merging method and device |
CN111310170A (en) * | 2020-01-16 | 2020-06-19 | 深信服科技股份有限公司 | Anti-leakage method and device for application program and computer readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102118405A (en) * | 2009-12-31 | 2011-07-06 | 比亚迪股份有限公司 | P2P (Peer-to-Peer) network system applied to real-time video data transmission |
CN102880475A (en) * | 2012-10-23 | 2013-01-16 | 上海普元信息技术股份有限公司 | Real-time event handling system and method based on cloud computing in computer software system |
CN103152287A (en) * | 2013-03-27 | 2013-06-12 | 恒生电子股份有限公司 | Method and device for reliably receiving real-time data |
CN103338261A (en) * | 2013-07-04 | 2013-10-02 | 北京泰乐德信息技术有限公司 | Storage and processing method and system of rail transit monitoring data |
CN103560943A (en) * | 2013-10-31 | 2014-02-05 | 北京邮电大学 | Network analytic system and method supporting real-time mass data processing |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7613848B2 (en) * | 2006-06-13 | 2009-11-03 | International Business Machines Corporation | Dynamic stabilization for a stream processing system |
-
2014
- 2014-05-28 CN CN201410229319.1A patent/CN105335362B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102118405A (en) * | 2009-12-31 | 2011-07-06 | 比亚迪股份有限公司 | P2P (Peer-to-Peer) network system applied to real-time video data transmission |
CN102880475A (en) * | 2012-10-23 | 2013-01-16 | 上海普元信息技术股份有限公司 | Real-time event handling system and method based on cloud computing in computer software system |
CN103152287A (en) * | 2013-03-27 | 2013-06-12 | 恒生电子股份有限公司 | Method and device for reliably receiving real-time data |
CN103338261A (en) * | 2013-07-04 | 2013-10-02 | 北京泰乐德信息技术有限公司 | Storage and processing method and system of rail transit monitoring data |
CN103560943A (en) * | 2013-10-31 | 2014-02-05 | 北京邮电大学 | Network analytic system and method supporting real-time mass data processing |
Also Published As
Publication number | Publication date |
---|---|
CN105335362A (en) | 2016-02-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108924250B (en) | Service request processing method and device based on block chain and computer equipment | |
US20200374288A1 (en) | Block chain-based multi-chain management method and system, electronic device, and storage medium | |
CN108712488B (en) | Data processing method and device based on block chain and block chain system | |
CN110401720B (en) | Information processing method, device, system, application server and medium | |
CN111459986B (en) | Data computing system and method | |
CN109040227B (en) | Service request response method and device based on block chain and computer equipment | |
CN110351363B (en) | Data backup method, device and computer readable storage medium | |
CN106453146B (en) | Method, system, device and readable storage medium for allocating private cloud computing resources | |
US9191236B2 (en) | Message broadcasting in a clustered computing environment | |
CN109376172B (en) | Data acquisition method and system based on block chain | |
CN110958218A (en) | Data transmission method based on multi-network communication and related equipment | |
CN103516763B (en) | Method for processing resource and system and device | |
CN111813868B (en) | Data synchronization method and device | |
US20190370293A1 (en) | Method and apparatus for processing information | |
CN105491078A (en) | Data processing method and device in SOA system, and SOA system | |
CN105335362B (en) | The processing method and system of real time data, instant disposal system for treating | |
CN110691042A (en) | Resource allocation method and device | |
CN104899278A (en) | Method and apparatus for generating data operation logs of Hbase database | |
US10033737B2 (en) | System and method for cross-cloud identity matching | |
CN108696559B (en) | Stream processing method and device | |
US10970417B1 (en) | Differential privacy security for benchmarking | |
CN112991064A (en) | Service processing method, device, computer equipment and storage medium | |
CN104852964A (en) | Multifunctional server scheduling method | |
CN104899495A (en) | Biological recognition system | |
CN116703071A (en) | Resource sharing method, device and equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
TR01 | Transfer of patent right |
Effective date of registration: 20211125 Address after: No. 699, Wangshang Road, Binjiang District, Hangzhou, Zhejiang Patentee after: Alibaba (China) Network Technology Co.,Ltd. Address before: Box 847, four, Grand Cayman capital, Cayman Islands, UK Patentee before: ALIBABA GROUP HOLDING Ltd. |