CN109726225A

CN109726225A - A Storm-based method for storing and querying distributed stream data

Info

Publication number: CN109726225A
Application number: CN201910026601.2A
Authority: CN
Inventors: 蔡瑞初; 林峰极; 郝志峰; 王立; 黄泽林; 陈炳丰; 温雯; 王丽娟
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2019-01-11
Filing date: 2019-01-11
Publication date: 2019-05-07
Anticipated expiration: 2039-01-11
Also published as: CN109726225B

Abstract

The present invention provides a Storm-based method for storing and querying distributed stream data. The method is built on the Storm stream computing framework and uses CephFS as the underlying data storage system. By analyzing the characteristics of distributed stream data, it partitions the data and builds indexes in real time, then compresses the partitioned data blocks and writes them to CephFS. At query time, the query is decomposed into subqueries according to the key and time dimensions of the data blocks; a Bloom filter (bloomFilter) ensures that only files that may contain the required data are read; qualifying data is selected by predicate; and the submitted subquery results are merged, aggregated, and returned to the user. Computing resources are thereby fully used to improve the efficiency of data storage and querying. The invention applies to a wide range of scenarios, offers low latency and load balancing, and achieves high-speed storage.

Description

A Storm-based method for storing and querying distributed stream data

Technical field

The present invention relates to the technical field of data processing, and in particular to a Storm-based method for storing and querying distributed stream data.

Background art

With the rapid development of network technology, the real-time stream data produced by social networks, location service platforms, and similar sources is growing quickly, and more and more fields require massive stream data to be processed and answered in real time. High-speed insertion and real-time lookup have therefore become very important data-processing capabilities, allowing users to obtain both historical data and newly arrived data in real time. Platforms that provide location services, such as Baidu Map and Amap, generate massive amounts of position and trajectory data every second. To meet user needs and improve business returns, the platform must support real-time insertion and storage of million-scale stream data together with low-latency queries. For example, a client may need the GPS information of all vehicles within 5 km at the current time, or the driving trajectory of a specified vehicle over the past hour.

Common open-source key-value stores such as HBase use an LSM-Tree to reduce the time overhead of updating leaf nodes, but every insertion requires the new data to be merged with historical data, so query latency over a time range is too high. Common time-series database technology such as the open-source Druid supports only inverted indexes and is inefficient for key-range queries. To solve these problems, a distributed database technique is needed that provides high-speed storage and real-time querying for massive stream data and supports efficient queries over both key ranges and time ranges. This requires newly arrived data to be kept separate from historical data, so that a query avoids traversing data in irrelevant ranges as far as possible, while load is balanced across the nodes of the system to maximize resource utilization.

Summary of the invention

In view of the deficiencies of the prior art, the present invention provides a Storm-based method for storing and querying distributed stream data. The invention starts from two observations: in real environments, stream data arrives in approximately time order and its data distribution is stable; and existing database technologies cannot provide efficient queries over both key ranges and time ranges. On this basis it provides an efficient indexing method and a real-time key-range and time-range query processing method for massive stream data. Incoming stream data is partitioned by range, indexed in parallel on different machine nodes, and stored in a distributed file system. At query time the query is decomposed, the subqueries are executed in parallel, and after filtering, aggregation, and similar operations the merged result is returned.

The technical solution of the present invention is a Storm-based method for storing and querying distributed stream data. As distributed stream data is received, B+Tree indexes over several disjoint ranges are built in real time and, once a threshold is reached, stored in a distributed file system. At query time the query is decomposed and the subqueries for the different ranges are processed in parallel with load balancing; when they complete, the results are merged and returned. This achieves high-throughput real-time storage, insertion, and querying of stream data. The method specifically comprises the following steps:

S1) receiving source data and distributing it to downstream units, which build an index structure;

S2) compressing the index structure into data blocks and writing them to the distributed file storage system CephFS;

S3) decomposing the query into several independent subqueries based on the query conditions and data block information;

S4) having the independent downstream query processing units execute the subqueries distributed to them by accessing the distributed file storage system CephFS;

S5) receiving the returned subquery results, merging them, and returning the merged result to the user.

Further, in step S1), each piece of source data received by the stream data storage system is a data tuple, defined as d = {d_k, d_t, d_r}, where d_k is the primary key of the tuple, d_t is its time attribute, and d_r holds its other attribute values. K and T define a two-dimensional space D = (K, T) of primary key and time domain; the primary key range is fixed, while the time range grows continuously. A primary key interval K is written K(k-, k+) and a time-domain interval T is written T(t-, t+); the two intervals determine a unique rectangle r<K, T> = {(k, t) ∈ R | k ∈ K, t ∈ T}.
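The tuple and rectangle definitions above can be illustrated with a minimal Python sketch. The class names `DataTuple` and `Region`, the integer types, and the choice of closed intervals are assumptions made for illustration; the source does not state whether the interval bounds are inclusive.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class DataTuple:
    key: int        # d_k, the primary key
    timestamp: int  # d_t, the time attribute
    payload: Any    # d_r, all remaining attribute values


@dataclass(frozen=True)
class Region:
    """Rectangle r<K, T>: a key interval combined with a time interval.

    Both intervals are treated as closed here, which is an assumption.
    """
    key_lo: int
    key_hi: int
    t_lo: int
    t_hi: int

    def contains(self, d: DataTuple) -> bool:
        # A tuple belongs to the rectangle when both coordinates fall inside.
        return (self.key_lo <= d.key <= self.key_hi
                and self.t_lo <= d.timestamp <= self.t_hi)


region = Region(key_lo=0, key_hi=99, t_lo=1000, t_hi=1999)
inside = DataTuple(key=42, timestamp=1500, payload={"speed": 60})
outside = DataTuple(key=42, timestamp=2500, payload={"speed": 55})
```

Here `region.contains(inside)` holds, while `outside` falls beyond the time interval and would belong to a later rectangle.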

Further, the data tuples that fall within the rectangle r<K, T> = {(k, t) ∈ R | k ∈ K, t ∈ T} are written into the unique template B+Tree corresponding to that rectangle, with the key as the index. When a template B+Tree in memory reaches the threshold size chunkSize, it is stored in the distributed file system in the form of a chunk. A chunk consists of a key array and a data array; the key array stores the key values in order, each together with an offset pointing into the data array.
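As a rough illustration of the chunk layout described above, the following Python sketch buffers tuples for one rectangle and emits a chunk once a threshold is reached. It is a simplification under stated assumptions: a plain list stands in for the template B+Tree, `chunk_size` counts tuples rather than bytes, and the names `build_chunk` and `RegionWriter` are hypothetical.

```python
def build_chunk(tuples):
    """Turn (key, timestamp, payload) tuples into a key array and a data array."""
    ordered = sorted(tuples, key=lambda t: (t[0], t[1]))
    key_array = []   # (key, offset) pairs in ascending key order
    data_array = []  # (timestamp, payload) records addressed by offset
    for key, timestamp, payload in ordered:
        key_array.append((key, len(data_array)))
        data_array.append((timestamp, payload))
    return {"key_array": key_array, "data_array": data_array}


class RegionWriter:
    """Buffers the tuples of one rectangle and flushes a chunk at chunk_size."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size  # assumed to be a tuple count
        self.buffer = []              # stand-in for the in-memory template B+Tree
        self.chunks = []              # stand-in for files in the distributed file system

    def insert(self, key, timestamp, payload):
        self.buffer.append((key, timestamp, payload))
        if len(self.buffer) >= self.chunk_size:
            self.chunks.append(build_chunk(self.buffer))
            self.buffer = []


writer = RegionWriter(chunk_size=3)
for k, t, p in [(7, 1, "a"), (3, 2, "b"), (5, 3, "c"), (9, 4, "d")]:
    writer.insert(k, t, p)
```

After the four inserts, one chunk has been flushed with its keys in ascending order, and the fourth tuple is still waiting in the buffer.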

Further, based on the two-dimensional space D = (K, T), a query condition of the stream data storage system can be defined as a triple q = {K_q, T_q, f_q}, where K_q and T_q are the selection ranges on the primary key and the time domain, which cut out the query range r<K, T> = {(k, t) ∈ R | k ∈ K_q, t ∈ T_q}, and f_q: t -> {true, false} is a user-defined condition filter function used to decide whether a tuple meets the user's selection.
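The query triple can be sketched in Python as follows. The `Query` class, its field names, and the sample records are illustrative assumptions; the predicate plays the role of f_q and is applied to the non-key attributes of a tuple.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Query:
    key_range: Tuple[int, int]         # K_q, treated as a closed interval
    time_range: Tuple[int, int]        # T_q, treated as a closed interval
    predicate: Callable[[dict], bool]  # f_q, the user-defined filter

    def matches(self, key, timestamp, payload):
        k_lo, k_hi = self.key_range
        t_lo, t_hi = self.time_range
        return (k_lo <= key <= k_hi
                and t_lo <= timestamp <= t_hi
                and self.predicate(payload))


records = [
    (10, 100, {"speed": 80}),
    (10, 300, {"speed": 90}),  # outside the time range
    (55, 120, {"speed": 40}),  # outside the key range
    (20, 150, {"speed": 95}),
]
q = Query(key_range=(0, 50), time_range=(100, 200),
          predicate=lambda p: p["speed"] > 60)
hits = [r for r in records if q.matches(*r)]
```

Only the first and last records satisfy all three parts of the triple.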

Further, because the file blocks stored on different subquery server (Subquery Server) nodes differ, and the template B+Tree leaf nodes they cache differ, a query decomposition scheduling algorithm is implemented: a priority queue of unprocessed subqueries is computed for each Subquery Server and subqueries are assigned from it until the set of unprocessed subqueries is empty, and the most recently queried leaf-node data is written to a cache. This gives query assignment cache locality, data block locality, and load balancing. The algorithm proceeds as follows:

For each subquery q_i, S(q_i) and the array of the remaining subquery servers are each shuffled and then concatenated, with S(q_i) first, into a new array in which a smaller index means a higher priority. The elements of the new array, together with their priorities, are added to the subquery priority queues of the respective Subquery Servers. After all q_i have been processed in this way, the Subquery Servers take turns removing from their priority queues the highest-priority q_i that is still unprocessed and having it assigned to them, until every q_i has been assigned. Here S(q_i) denotes the array of Subquery Servers that hold data in the range of q_i, the second array holds the remaining Subquery Servers, and q_i ∈ q denotes a subquery produced by decomposing one query.
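One possible reading of the scheduling procedure above is sketched below in Python. It is not the patented implementation: the function name `schedule`, the `holders` mapping (subquery to the servers holding its data, that is, S(q_i)), and the round-robin visiting order are assumptions chosen to make the idea concrete.

```python
import heapq
import random


def schedule(subqueries, servers, holders, rng=None):
    """Assign each subquery to one server, preferring servers that hold its data."""
    if subqueries and not servers:
        raise ValueError("at least one server is required")
    rng = rng or random.Random()
    queues = {s: [] for s in servers}  # one priority queue per server
    for order, q in enumerate(subqueries):
        local = [s for s in servers if s in holders.get(q, ())]      # S(q_i)
        rest = [s for s in servers if s not in holders.get(q, ())]   # remaining servers
        rng.shuffle(local)
        rng.shuffle(rest)
        # Position in the concatenated array is the priority (smaller is better).
        for priority, s in enumerate(local + rest):
            heapq.heappush(queues[s], (priority, order, q))

    pending = set(subqueries)
    assignment = {}
    while pending:
        for s in servers:  # servers take turns, which balances the load
            while queues[s]:
                _, _, q = heapq.heappop(queues[s])
                if q in pending:  # skip subqueries another server already took
                    pending.discard(q)
                    assignment[q] = s
                    break
    return assignment


queries = ["q1", "q2", "q3", "q4"]
plan = schedule(queries, ["A", "B"],
                {"q1": {"A"}, "q2": {"A"}, "q3": {"B"}, "q4": {"B"}})
skewed = schedule(queries, ["A", "B"], {q: {"A"} for q in queries})
```

In `plan` every subquery lands on the server that holds its data. In `skewed`, where all data sits on server A, server B still receives half of the subqueries, which shows how the turn-taking trades some locality for load balance.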

Further, in step S2), the index structure is a tree index. Once its size exceeds a specified threshold, the data tuples in the leaf nodes are compressed with the Snappy algorithm and written in the form of a data block to the distributed file storage system CephFS for permanent storage, and the metadata of the data block, such as the primary key range and time-domain range of its tuples, is recorded in the metadata manager (metadata keeper). Because the primary keys of stream data vary within a certain range while the time domain grows continuously, the non-leaf part of the tree index is retained as a template, so that the next index build can use the template directly and avoid the node splits of an ordinary B+ tree build, which incur a large time overhead.
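The template idea can be sketched in Python as below. Several substitutions are made and should be read as assumptions: a single level of separator keys stands in for the non-leaf part of the B+Tree, `zlib` from the standard library stands in for Snappy, JSON stands in for the serialization format, and the class name `TemplateIndex` is hypothetical.

```python
import bisect
import json
import zlib


class TemplateIndex:
    """Fixed separator keys (the template) routing tuples into leaf lists."""

    def __init__(self, separators):
        self.separators = list(separators)  # retained across flushes
        self.leaves = [[] for _ in range(len(self.separators) + 1)]
        self.size = 0

    def insert(self, key, timestamp, payload):
        # Route by key through the template; no node split is ever needed.
        leaf = self.leaves[bisect.bisect_right(self.separators, key)]
        bisect.insort(leaf, (key, timestamp, payload))
        self.size += 1

    def flush(self):
        """Compress the leaves into a block, return it with its metadata."""
        tuples = [t for leaf in self.leaves for t in leaf]
        if not tuples:
            raise ValueError("nothing to flush")
        block = zlib.compress(json.dumps(self.leaves).encode())  # zlib, not Snappy
        meta = {
            "key_range": (min(t[0] for t in tuples), max(t[0] for t in tuples)),
            "time_range": (min(t[1] for t in tuples), max(t[1] for t in tuples)),
        }
        # Only the leaves are reset; the separators stay in place as the template.
        self.leaves = [[] for _ in self.leaves]
        self.size = 0
        return block, meta


index = TemplateIndex(separators=[10, 20])
for k, t, p in [(5, 1, "a"), (15, 2, "b"), (25, 3, "c"), (12, 4, "d")]:
    index.insert(k, t, p)
block, meta = index.flush()
```

The returned `meta` is what a metadata keeper would record, and the index is immediately ready for the next batch with the same separators.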

Further, in step S3), decomposing the query into several independent subqueries based on the query conditions and data block information specifically comprises the following steps:

S301) the query dispatcher compares the primary key and time-domain ranges in the query conditions provided by the user with the data block metadata read from the metadata manager (metadata keeper), and divides the query region into a series of two-dimensional index regions;

S302) based on the equality conditions provided by the user, a Bloom filter (bloomFilter) is used to filter out the subquery regions that definitely do not contain the target data tuples;

S303) only the subqueries that may contain the target data tuples are distributed to the independent downstream subquery servers (Subquery Server).
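Steps S301) to S303) can be approximated by the following Python sketch. The small `BloomFilter` class, the metadata dictionary layout, and the function names are assumptions for illustration only; a production system would more likely use an existing Bloom filter library.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def decompose(key_range, time_range, equals_value, chunk_metadata):
    """Intersect the query rectangle with each chunk and prune via Bloom filter."""
    subqueries = []
    for meta in chunk_metadata:
        k_lo = max(key_range[0], meta["keys"][0])
        k_hi = min(key_range[1], meta["keys"][1])
        t_lo = max(time_range[0], meta["time"][0])
        t_hi = min(time_range[1], meta["time"][1])
        if k_lo > k_hi or t_lo > t_hi:
            continue  # S301: no overlap with this chunk's region
        if equals_value is not None and not meta["bloom"].might_contain(equals_value):
            continue  # S302: the chunk definitely lacks the value
        subqueries.append({"chunk": meta["name"], "keys": (k_lo, k_hi), "time": (t_lo, t_hi)})
    return subqueries  # S303: only these are dispatched


def make_meta(name, keys, time, values):
    bloom = BloomFilter()
    for v in values:
        bloom.add(v)
    return {"name": name, "keys": keys, "time": time, "bloom": bloom}


metadata = [
    make_meta("chunk-a", (0, 99), (0, 9), ["car-1"]),
    make_meta("chunk-b", (0, 99), (10, 19), ["car-2"]),
    make_meta("chunk-c", (100, 199), (20, 29), ["car-1"]),
]
by_range = decompose((50, 150), (5, 12), None, metadata)
by_value = decompose((50, 150), (5, 12), "car-1", metadata)
```

With range pruning alone, `chunk-c` is dropped because its time interval does not overlap the query. Adding the equality condition additionally lets the Bloom filter discard chunks that cannot contain the value, subject to its false-positive rate.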

Further, in step S4), having the independent downstream query processing units execute the subqueries distributed to them by accessing the distributed file storage system CephFS specifically comprises the following steps:

S401) the Subquery Servers read in parallel the data blocks in CephFS that correspond to their subqueries. Each first reads the template part of the index structure in the data block to obtain, for the leaf nodes, their relative offsets among all leaf nodes and their offsets after packing and compression, and from these computes the offsets of the series of leaf nodes that may contain the target key range;

S402) based on the offsets, the leaf-node part of the index structure is read from the data block file; the resulting leaf-node packet bytes are decompressed with the Snappy algorithm, deserialized into leaf nodes, and filtered by time range and equality conditions;

S403) the filtered data tuples are aggregated, and the result is serialized and sent to the query dispatcher.
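A minimal Python sketch of steps S401) to S403) follows. It assumes a toy chunk format in which the template is a list of (min_key, max_key, offset, length) entries and each leaf is compressed on its own; `zlib` and JSON stand in for Snappy and the real serialization, and both function names are hypothetical.

```python
import json
import zlib


def write_chunk(leaves):
    """Build a toy chunk: a template of leaf offsets plus the compressed leaves."""
    template, body = [], b""
    for leaf in leaves:  # each leaf is a key-sorted list of [key, timestamp, value]
        packed = zlib.compress(json.dumps(leaf).encode())
        template.append((leaf[0][0], leaf[-1][0], len(body), len(packed)))
        body += packed
    return {"template": template, "body": body}


def run_subquery(chunk, key_range, time_range, predicate):
    k_lo, k_hi = key_range
    t_lo, t_hi = time_range
    matched = []
    # S401: consult only the template to find leaves overlapping the key range.
    for min_key, max_key, offset, length in chunk["template"]:
        if max_key < k_lo or min_key > k_hi:
            continue
        # S402: read by offset, decompress, deserialize, then filter.
        leaf = json.loads(zlib.decompress(chunk["body"][offset:offset + length]))
        for key, ts, value in leaf:
            if k_lo <= key <= k_hi and t_lo <= ts <= t_hi and predicate(value):
                matched.append(value)
    # S403: aggregate, then serialize the result for the query dispatcher.
    return json.dumps({"count": len(matched), "sum": sum(matched)}).encode()


leaves = [
    [[1, 10, 5], [2, 11, 7]],
    [[10, 12, 9], [12, 13, 4]],
    [[20, 14, 8]],
]
chunk = write_chunk(leaves)
payload = run_subquery(chunk, key_range=(2, 12), time_range=(10, 12),
                       predicate=lambda v: v > 4)
```

The third leaf is never decompressed because the template already shows that its keys lie outside the requested range, which is the point of reading the template first.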

The invention has the following beneficial effects:

1. The invention applies to a wide range of scenarios. It provides real-time transmission and processing of massive data for distributed stream data processing applications such as network traffic monitoring and analysis by communication carriers, real-time tracking of vehicle flow and trajectory changes on connected location platforms, and transaction indicators on e-commerce platforms during holidays.

2. The invention achieves high-speed storage. An efficient data partitioning scheme separates newly arrived data from historical data, and, exploiting the stability of the data range, B+Tree indexes are built from a retained index template, which avoids the large time cost of tree node splits.

3. The invention has low latency. After the query conditions are cut by range, only the files whose metadata may match the query range are accessed; key operations such as filtering and aggregation are processed in parallel; and cache locality and file locality are achieved, which improves query efficiency.

4. The invention balances load. The designed query scheduling algorithm assigns the decomposed subqueries to different nodes so that system resources are fully used.

Brief description of the drawings

Fig. 1 is a flow diagram of the invention;

Fig. 2 is a structure diagram of a data block in the distributed stream data storage of the invention;

Fig. 3 is a diagram of the internal structure of a leaf node in a data block of the distributed stream data storage of the invention;

Fig. 4 is a diagram of query decomposition and scheduling for distributed stream data queries of the invention.

Detailed description of the embodiments

Specific embodiments of the invention are further described below with reference to the accompanying drawings:

As shown in Fig. 1, in a Storm-based method for storing and querying distributed stream data, B+Tree indexes over several disjoint ranges are built in real time as distributed stream data is received and, once a threshold is reached, stored in a distributed file system. At query time the query is decomposed and the subqueries for the different ranges are processed in parallel with load balancing; when they complete, the results are merged and returned. This achieves high-throughput real-time storage, insertion, and querying of stream data. The method specifically comprises the following steps:

S1) receiving source data and distributing it to downstream units, which build an index structure;

Here, each piece of source data received by the stream data storage system is called a data tuple and is defined as d = {d_k, d_t, d_r}, where d_k is the primary key of the tuple, d_t is its time attribute, and d_r holds its other attribute values. K and T define a two-dimensional space D = (K, T) of primary key and time domain; the primary key range is fixed, while the time range grows continuously. A primary key interval K is written K(k-, k+) and a time-domain interval T is written T(t-, t+); the two intervals determine a unique rectangle:

r<K, T> = {(k, t) ∈ R | k ∈ K, t ∈ T};

The data tuples that fall within the rectangle r<K, T> = {(k, t) ∈ R | k ∈ K, t ∈ T} are written into the unique template B+Tree corresponding to that rectangle, with the key as the index. When a template B+Tree in memory reaches the threshold size chunkSize, it is stored in the distributed file system in the form of a data block (data chunk). A chunk consists of a key array and a data array; the key array stores the key values in order, each together with an offset pointing into the data array.

Based on the two-dimensional space D = (K, T), a query condition of the stream data storage system can be defined as a triple q = {K_q, T_q, f_q}, where K_q and T_q are the selection ranges on the primary key and the time domain, which cut out the query range r<K, T> = {(k, t) ∈ R | k ∈ K_q, t ∈ T_q}, and f_q: t -> {true, false} is a user-defined condition filter function used to decide whether a tuple meets the user's selection.

Because the file blocks stored on different subquery server (Subquery Server) nodes differ, and the template B+Tree leaf nodes they cache differ, a query decomposition scheduling algorithm is implemented: a priority queue of unprocessed subqueries is computed for each Subquery Server and subqueries are assigned from it until the set of unprocessed subqueries is empty, and the most recently queried leaf-node data is written to a cache. This gives query assignment cache locality, data block locality, and load balancing. The algorithm proceeds as follows:

For each subquery q_i, S(q_i) and the array of the remaining subquery servers are each shuffled and then concatenated, with S(q_i) first, into a new array in which a smaller index means a higher priority. The elements of the new array, together with their priorities, are added to the subquery priority queues of the respective Subquery Servers. After all q_i have been processed in this way, the Subquery Servers take turns removing from their priority queues the highest-priority q_i that is still unprocessed and having it assigned to them, until every q_i has been assigned. Here S(q_i) denotes the array of Subquery Servers that hold data in the range of q_i, the second array holds the remaining Subquery Servers, and q_i ∈ q denotes a subquery produced by decomposing one query.

S2) compressing the index structure into data blocks and writing them to the distributed file storage system CephFS, wherein:

The index structure is a tree index. Once its size exceeds a specified threshold, the data tuples in the leaf nodes are compressed with the Snappy algorithm and written in the form of a data block to the distributed file storage system CephFS for permanent storage, and the metadata of the data block, such as the primary key range and time-domain range of its tuples, is recorded in the metadata manager (metadata keeper). Because the primary keys of stream data vary within a certain range while the time domain grows continuously, the non-leaf part of the tree index is retained as a template, so that the next index build can use the template directly and avoid the node splits of an ordinary B+ tree build, which incur a large time overhead.

S3) decomposing the query into several independent subqueries based on the query conditions and data block information, which specifically comprises the following steps:

S301) the query dispatcher compares the primary key and time-domain ranges in the query conditions provided by the user with the data block metadata read from the metadata manager (metadata keeper), and divides the query region into a series of two-dimensional index regions;

S302) based on the equality conditions provided by the user, a Bloom filter (bloomFilter) is used to filter out the subquery regions that definitely do not contain the target data tuples;

S303) only the subqueries that may contain the target data tuples are distributed to the independent downstream subquery servers (Subquery Server);

S4) having the independent downstream query processing units execute the subqueries distributed to them by accessing the distributed file storage system CephFS, following sub-steps S401) to S403) set out in the summary above;

S5) receiving the returned subquery results, merging them, and returning the merged result to the user.

Fig. 2 shows the internal structure of the chunk file in which stream data is written to the distributed file system. A chunk has two parts: the B+Tree template part and the leaf-node part. In the figure, "template" denotes the B+Tree template part, "leaf node" denotes the leaf-node part, and "compress chunk" denotes a data block of packed and compressed leaf nodes.

The B+Tree template part comprises the root node and the internal nodes of the B+Tree. Each node records key values, child nodes, and so on. The deepest-level internal nodes also record, for each group of leaf nodes, its relative offset among all leaf nodes and its offset within the chunk after packing and compression.

The leaf nodes comprise a key array part and a data array part, and all nodes are stored contiguously in left-to-right order. When the file is stored, the template part is written to the chunk as a whole, while the leaf nodes are written to the chunk in packed and compressed form in groups of N = 20 leaf nodes, which raises the compression ratio and so reduces the storage space required.
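The grouped packing of leaf nodes might look like the following Python sketch. The group size of 20 comes from the text; everything else is an assumption for illustration, including the use of `zlib` in place of Snappy, JSON as the serialization, and the function names.

```python
import json
import zlib


def pack_leaf_groups(leaves, group_size=20):
    """Compress leaves in groups and record where each group lands in the body."""
    body = b""
    index = []  # (first_leaf_number, offset, length) per compressed group
    for start in range(0, len(leaves), group_size):
        group = leaves[start:start + group_size]
        packed = zlib.compress(json.dumps(group).encode())
        index.append((start, len(body), len(packed)))
        body += packed
    return index, body


def read_leaf(index, body, leaf_number, group_size=20):
    """Decompress only the group that contains the requested leaf."""
    first, offset, length = index[leaf_number // group_size]
    group = json.loads(zlib.decompress(body[offset:offset + length]))
    return group[leaf_number - first]


# 45 single-entry leaves produce three groups: 20 + 20 + 5.
leaves = [[[k, 1000 + k, f"v{k}"]] for k in range(45)]
index, body = pack_leaf_groups(leaves)
leaf_27 = read_leaf(index, body, 27)
```

Reading leaf 27 touches only the second compressed group. Grouping gives the compressor more repeated structure to work with than compressing each leaf separately, at the cost of decompressing up to 20 leaves to reach one.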

Fig. 3 shows the data layout of the leaf-node part of the stream data storage structure chunk. The layout has two parts: a key array and a data array. In the figure, "index array" denotes the key array and "data array" denotes the data array. The key array stores the key values in order, each together with an offset pointing into the data array. A lookup finds the key values and offsets in the qualifying range in the key array and then fetches the corresponding data tuples from the data array.
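The lookup described for Fig. 3 can be sketched with a binary search over the key array. The function name `range_lookup` and the sample arrays are hypothetical, and key bounds are treated as inclusive by assumption.

```python
import bisect


def range_lookup(key_array, data_array, key_lo, key_hi):
    """Find keys in [key_lo, key_hi] and follow their offsets into the data array."""
    keys = [k for k, _ in key_array]  # the key array is already sorted
    start = bisect.bisect_left(keys, key_lo)
    end = bisect.bisect_right(keys, key_hi)
    return [data_array[offset] for _, offset in key_array[start:end]]


key_array = [(3, 0), (5, 1), (7, 2), (9, 3)]  # (key, offset into data_array)
data_array = ["tuple-3", "tuple-5", "tuple-7", "tuple-9"]
found = range_lookup(key_array, data_array, 4, 8)
```

Because the keys are stored in order, the two binary searches locate the qualifying slice without scanning the data array itself.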

Fig. 4 represents the query decomposition scheduling algorithm as a diagram. In the figure, Pending Set denotes all subqueries that have not yet been assigned, S(q_i) denotes the preferred Subquery Servers for each subquery, a further array gives the Subquery Server priority order for each subquery, and the preferred server queues (PreferedServer Arrays) denote each Subquery Server's priority queue over all unprocessed subqueries. While Pending Set is not empty, each Subquery Server ranks the subqueries in the set by priority according to the data ranges held in its local file system data and in its cache. Once ranking is complete, the Subquery Servers are visited in ID order and each is assigned an unprocessed subquery from its preferred server queue, until all subqueries in Pending Set have been handled.

The above embodiments and description merely illustrate the principle and preferred embodiments of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the claimed scope of protection.

Claims

1. A Storm-based method for storing and querying distributed stream data, characterized in that: B+Tree indexes over several disjoint ranges are built in real time as distributed stream data is received and, once a threshold is reached, stored in a distributed file system; at query time the query is decomposed and the subqueries for the different ranges are processed in parallel with load balancing; when they complete, the results are merged and returned, achieving high-throughput real-time storage, insertion, and querying of stream data; the method specifically comprises the following steps:

S1) receiving source data and distributing it to downstream units, which build an index structure;

S2) compressing the index structure into data blocks and writing them to the distributed file storage system CephFS;

S3) decomposing the query into several independent subqueries based on the query conditions and data block information;

S4) having the independent downstream query processing units execute the subqueries distributed to them by accessing the distributed file storage system CephFS;

S5) receiving the returned subquery results, merging them, and returning the merged result to the user.

2. The Storm-based method for storing and querying distributed stream data according to claim 1, characterized in that: in step S1), each piece of source data received by the stream data storage system is a data tuple, defined as d = {d_k, d_t, d_r}, where d_k is the primary key of the tuple, d_t is its time attribute, and d_r holds its other attribute values; K and T define a two-dimensional space D = (K, T) of primary key and time domain; the primary key range is fixed, while the time range grows continuously; a primary key interval K is written K(k-, k+) and a time-domain interval T is written T(t-, t+); and the two intervals determine a unique rectangle r<K, T> = {(k, t) ∈ R | k ∈ K, t ∈ T}.

3. The Storm-based method for storing and querying distributed stream data according to claim 2, characterized in that: the data tuples that fall within the rectangle r<K, T> = {(k, t) ∈ R | k ∈ K, t ∈ T} are written into the unique template B+Tree corresponding to that rectangle, with the key as the index; a template B+Tree in memory that reaches the threshold size chunkSize is stored in the distributed file system in the form of a chunk; a chunk consists of a key array and a data array; and the key array stores the key values in order, each together with an offset pointing into the data array.

4. The Storm-based method for storing and querying distributed stream data according to claim 3, characterized in that: based on the two-dimensional space D = (K, T), a query condition of the stream data storage system can be defined as a triple q = {K_q, T_q, f_q}, where K_q and T_q are the selection ranges on the primary key and the time domain, which cut out the query range r<K, T> = {(k, t) ∈ R | k ∈ K_q, t ∈ T_q}, and f_q: t -> {true, false} is a user-defined condition filter function used to decide whether a tuple meets the user's selection.

5. The Storm-based method for storing and querying distributed stream data according to claim 4, characterized in that: because the file blocks stored on different subquery server (Subquery Server) nodes differ, and the template B+Tree leaf nodes they cache differ, a query decomposition scheduling algorithm is implemented in which a priority queue of unprocessed subqueries is computed for each Subquery Server and subqueries are assigned from it until the set of unprocessed subqueries is empty, and the most recently queried leaf-node data is written to a cache, giving query assignment cache locality, data block locality, and load balancing; the algorithm proceeds as follows:

For each subquery q_i, S(q_i) and the array of the remaining subquery servers are each shuffled and then concatenated, with S(q_i) first, into a new array in which a smaller index means a higher priority; the elements of the new array, together with their priorities, are added to the subquery priority queues of the respective Subquery Servers; after all q_i have been processed in this way, the Subquery Servers take turns removing from their priority queues the highest-priority q_i that is still unprocessed and having it assigned to them, until every q_i has been assigned; wherein S(q_i) denotes the array of Subquery Servers that hold data in the range of q_i, the second array holds the remaining Subquery Servers, and q_i ∈ q denotes a subquery produced by decomposing one query.

6. The Storm-based method for storing and querying distributed stream data according to claim 1, characterized in that: in step S2), the index structure is a tree index; once its size exceeds a specified threshold, the data tuples in the leaf nodes are compressed with the Snappy algorithm and written in the form of a data block to the distributed file storage system CephFS for permanent storage; and the metadata of the data block, such as the primary key range and time-domain range of its tuples, is recorded in the metadata manager (metadata keeper).

7. The Storm-based method for storing and querying distributed stream data according to claim 1, characterized in that: in step S3), decomposing the query into several independent subqueries based on the query conditions and data block information specifically comprises the following steps:

S301) the query dispatcher compares the primary key and time-domain ranges in the query conditions provided by the user with the data block metadata read from the metadata manager (metadata keeper), and divides the query region into a series of two-dimensional index regions;

S302) based on the equality conditions provided by the user, a Bloom filter (bloomFilter) is used to filter out the subquery regions that definitely do not contain the target data tuples;

S303) only the subqueries that may contain the target data tuples are distributed to the independent downstream subquery servers (Subquery Server).

8. The Storm-based method for storing and querying distributed stream data according to claim 1, characterized in that: in step S4), having the independent downstream query processing units execute the subqueries distributed to them by accessing the distributed file storage system CephFS specifically comprises the following steps:

S401) the Subquery Servers read in parallel the data blocks in CephFS that correspond to their subqueries; each first reads the template part of the index structure in the data block to obtain, for the leaf nodes, their relative offsets among all leaf nodes and their offsets after packing and compression, and from these computes the offsets of the series of leaf nodes that may contain the target key range;

S402) based on the offsets, the leaf-node part of the index structure is read from the data block file; the resulting leaf-node packet bytes are decompressed with the Snappy algorithm, deserialized into leaf nodes, and filtered by time range and equality conditions;

S403) the filtered data tuples are aggregated, and the result is serialized and sent to the query dispatcher.