CN111159176A - Method and system for storing and reading mass stream data - Google Patents

Info

Publication number
CN111159176A
Authority
CN
China
Prior art keywords
data
stream data
stream
storage system
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911196972.1A
Other languages
Chinese (zh)
Inventor
么广忠
郭斯杰
熊劲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201911196972.1A priority Critical patent/CN111159176A/en
Publication of CN111159176A publication Critical patent/CN111159176A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for storing mass stream data, comprising the following steps: receiving stream data from a client; storing the stream data in a line (row-oriented) format in a distributed segment storage system to form line stream data; asynchronously storing the stream data in a columnar format in the distributed segment storage system to form columnar stream data; and returning a confirmation message to the client once storage of the line stream data is complete.

Description

Method and system for storing and reading mass stream data
Technical Field
The invention belongs to the field of data processing, and particularly relates to a method and a system for storing and reading mass stream data.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
Streaming data refers to a sequence of data that arrives sequentially, massively, rapidly, and continuously, and can be generally considered as a dynamic data set that grows indefinitely over time. With the wide popularization and rapid development of technologies such as internet, mobile internet, cloud computing and internet of things, streaming data generated in the fields of network monitoring, sensor network, aerospace, weather measurement and control, financial service and the like is increasing explosively. Such streaming data that has a very fast growth rate, an increasing capacity, and no termination is also called mass streaming data. The ever-increasing amount of data requires ever-expanding storage space, and the size of storage capacity tends to be inversely proportional to storage performance. In addition, compared with static data, stream data has the characteristics of real-time arrival, independent arrival sequence, fast rate change, large data size, difficulty in being processed again once processed (unless specially stored), and the like. Therefore, applications related to streaming data have higher requirements on timeliness and throughput of data processing.
Existing data storage systems tend to focus on a single aspect of performance. For example, the Hadoop Distributed File System (HDFS) organizes data into large data blocks, which reduces communication between the client and the master node and gives users high-throughput data access. However, the high bandwidth of HDFS comes at the cost of added latency, and as access requests increase, its processing speed is severely limited by the single-master architecture. As another example, the message storage system Kafka ensures low write latency by transferring data at a smaller granularity, but that transfer granularity is not suitable for storing mass data. The columnar storage formats Apache ORC and Parquet achieve higher compression ratios and query performance through their columnar structure, but their centralized metadata management confines those benefits to bounded data and makes them unsuitable for continuous stream data.
Therefore, in order to meet the requirements of efficient storage and reading and writing of mass stream data and to solve at least one of the above problems, the present invention provides a method and a system for storing and reading mass stream data.
Disclosure of Invention
One aspect of the present invention relates to a method for storing mass stream data, comprising: receiving stream data from a client; storing the stream data in a line format in a distributed segment storage system to form line stream data; asynchronously storing the stream data in a columnar format in the distributed segment storage system to form columnar stream data; and returning a confirmation message to the client after storage of the line stream data is completed.
Optionally, asynchronously storing the stream data in a columnar format in the distributed segment storage system comprises: while the stream data is being stored in the line format in the storage system, asynchronously storing the stream data in a columnar format in the distributed segment storage system to form columnar stream data.
Optionally, asynchronously storing the stream data in a columnar format in the distributed segment storage system comprises: first storing the stream data in the line format in the distributed segment storage system to form line stream data; then reading the line stream data from the distributed segment storage system and asynchronously storing it in a columnar format in the distributed segment storage system to form columnar stream data.
Optionally, asynchronously storing the stream data in a columnar format in the distributed segment storage system comprises: extracting a data schema of the stream data; organizing the stream data in columns according to the data schema; allocating a buffer for each column cluster of the stream data organized in columns; appending the stream data organized in columns, column by column, to the free space at the tail of the corresponding buffer; and, when a buffer is full, writing the stream data in the buffer to the distributed segment storage system, with the stream data of different column clusters stored in different segments.
Optionally, the method further includes: setting an ID for each event in the stream data; recording the position information of each event in the buffer; when the buffer is full, attaching the IDs and corresponding position information of all events in the buffer to the head of the buffer; and writing the IDs and position information of all events in the buffer, together with all stream data in the buffer, into a data unit of the distributed segment storage system, wherein the IDs and position information are located at the head of the data unit.
Optionally, the method further includes: storing metadata of the columnar stream data in a distributed key-value storage system, the metadata comprising a stream level and a segment level, wherein the stream-level metadata includes information on the column clusters constituting the columnar stream data and information on the segments storing those column clusters; and the segment-level metadata includes the position information of the start and stop events within the segment, the number of events, the maximum and minimum values, and other related information.
Optionally, the method further includes: after the stream data has been asynchronously stored in a columnar format in the distributed segment storage system, deleting the line stream data from the distributed segment storage system if no request to read the line stream data is received within a predetermined time threshold.
Another aspect of the present invention relates to a method for reading mass stream data based on the above storage method, including: receiving a request from a client to read data; querying the metadata of the data in the distributed key-value storage system according to the read request; and reading the data from the distributed segment storage system according to the metadata and returning the data to the client.
Optionally, the method further includes: determining the start position of the data according to the metadata of the data; reading the data following the start position in advance and placing it in a data cache; and reading the data from the data cache.
Another aspect of the invention relates to a storage system for mass streaming data, comprising a server and a storage device, capable of being used in any of the above described methods.
Compared with the prior art, the invention has the advantages that:
a storage and read-write method and system are provided that offer high throughput, low latency, and efficient queries.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a layered system architecture according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an Apache BookKeeper distributed segment storage;
FIG. 3 is a diagram illustrating a DistributedLog software stack structure;
FIG. 4 illustrates a method for storing mass stream data according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of the present invention for organizing and writing streaming data into a storage system in columns;
FIG. 6(a) is a diagram illustrating a storage structure of stream-level metadata in a primary K/V table according to an embodiment of the present invention;
FIG. 6(b) is a diagram illustrating the storage structure of metadata at a segment level in a secondary K/V table according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a stream data storage method according to an embodiment of the present invention;
FIG. 8 is a method of storing stream data according to another embodiment of the present invention;
FIG. 9 is a schematic diagram of deleting a row-wise stream after completion of column-wise stream storage in one embodiment of the invention;
FIG. 10 is a schematic diagram of a method for reading a large amount of stream data according to an embodiment of the present invention;
FIG. 11 is a diagram showing a reading method of stream data in one embodiment of the present invention;
FIG. 12(a) is a graph comparing the write performance of a storage system according to one embodiment of the present invention with that of Kafka;
FIG. 12(b) is a graph comparing the read performance of a storage system according to one embodiment of the present invention with that of Kafka;
FIG. 12(c) is a graph comparing the query performance of a storage system according to one embodiment of the present invention with that of Kafka and Parquet.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail by embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The method and the system for storing and reading mass stream data operate under a layered system framework. FIG. 1 illustrates a hierarchical system architecture diagram in one embodiment of the invention. As shown in fig. 1, the framework can be divided into three layers, which are a storage layer, a core service layer and an application layer from bottom to top. Wherein, the storage layer is used for providing basic storage service; the core service layer is connected with the storage layer and the application layer and used for transmitting the streaming data output or received by the application layer to the storage layer and carrying out necessary organization and management on the data; the application layer runs various data analysis applications, which may include, for example, full time domain data analysis, stream calculation, batch processing, or the like.
In one embodiment, the storage layer may be persisted using Apache BookKeeper, a distributed segment storage system. BookKeeper is a scalable, highly fault-tolerant, low-latency storage system optimized for real-time workloads. A BookKeeper cluster may be composed of multiple storage nodes (bookies) and a metadata store. The bookies provide a set of independent storage services (servers), and each storage node is responsible for storing specific data. BookKeeper scales well: when capacity needs to be increased, new storage nodes only need to be added to the BookKeeper cluster, where they are picked up automatically. A newly added storage node is used preferentially because of its large remaining space and receives more new data, without any copying or moving of existing data. In addition, BookKeeper uses a quorum-vote parallel replication algorithm to replicate multiple identical copies of all data to other bookies. If a storage node fails, data can be read concurrently from other storage nodes in the cluster and the data of the failed node is recovered automatically, so front-end services are not affected. The quorum can be set by the client to ensure low-latency replication of data. The metadata store in a BookKeeper cluster typically uses ZooKeeper for service discovery and metadata management. ZooKeeper stores metadata related to segments (called ledgers in BookKeeper), including the currently available storage nodes, the locations over which segments are distributed, and the like.
Before BookKeeper is used to store data, three parameters are usually set: the ensemble size (the number of bookies the user's data is spread across), the write quorum (the number of copies kept of each piece of written data), and the ack quorum (the number of successful writes required before a confirmation is returned).
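The relationship between the three parameters can be illustrated with a minimal sketch. The class and field names below are hypothetical and are not part of the BookKeeper client API; the sketch only checks the usual ordering constraint (ensemble size >= write quorum >= ack quorum) and reports how many replicas may lag without delaying a confirmation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationConfig:
    """Hypothetical holder for the three parameters described above."""
    ensemble_size: int  # number of bookies the data is spread across
    write_quorum: int   # number of copies written for each data unit
    ack_quorum: int     # successful writes required before confirming

    def __post_init__(self) -> None:
        # A write cannot need more acks than copies, nor more copies than bookies.
        if not (self.ensemble_size >= self.write_quorum >= self.ack_quorum >= 1):
            raise ValueError(
                "expected ensemble_size >= write_quorum >= ack_quorum >= 1"
            )

    def tolerated_slow_replicas(self) -> int:
        """Copies that may lag or fail without delaying the confirmation."""
        return self.write_quorum - self.ack_quorum


config = ReplicationConfig(ensemble_size=3, write_quorum=3, ack_quorum=2)
print(config.tolerated_slow_replicas())
```

A smaller ack quorum lowers write latency because the slowest replica is not on the confirmation path, at the price of fewer confirmed copies at the moment the client is answered.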
FIG. 2 shows a schematic diagram of BookKeeper distributed segment storage. As shown in FIG. 2, BookKeeper uses segments (ledgers) as the basic unit of storage; a segment is an independent data set that contains a plurality of data units (entries). The segments are evenly distributed across the BookKeeper cluster to keep data and service load uniform over the storage nodes. Each segment is an incremental (append-only) data structure that is written in single-writer mode and replicated to other storage nodes through the replication mechanism described above.
The BookKeeper provides two services of stream (stream) and table (table) in addition to basic segment storage, wherein the stream service is to manage a group of segments together according to a certain mode to form a continuous stream, and the table is used for accessing an intermediate state of stream calculation. The BookKeeper can conveniently meet most of storage requirements in real-time data processing by providing two services of flow and table.
In summary, BookKeeper can provide high-throughput, low-latency, persistent, scalable, and consistent storage services for the present invention, based on its high write availability (storage is not limited to a single server, and a write succeeds as long as the total capacity of the bookies is sufficient) and high read availability (read traffic can be distributed across the cluster).
The core service layer above the storage layer is used for organizing and managing data, and may include data format management, segment state management, metadata management, and lifecycle management, for example. The method for storing and reading mass stream data is mainly applied to a core service layer.
Access to stream data often involves only some attributes of the data, for example model training on specific attributes or ad-hoc queries on specific attributes. Storing stream data in columnar form allows the records of the relevant attributes to be queried quickly and only the values of those attributes to be returned, which significantly reduces CPU and I/O consumption, shortens query response time, and improves query efficiency. However, columnar storage requires extracting the fields of every event in the stream data and collecting statistics for each field, so with the same resources it is slower than line storage and has higher latency. The invention therefore performs line storage and columnar storage in parallel and returns a confirmation message to the client immediately after the line storage completes, which reduces latency while improving query efficiency.
The line storage, also called line format storage, refers to that data is stored in logical storage units based on line data, and the data in the same line exists in a continuous storage form in a storage medium. Stream data stored in a line format is referred to as line stream data, or simply line stream.
In one embodiment, Apache DistributedLog may be used to write the received stream data in order into BookKeeper to form a line stream. DistributedLog is a high-performance log replication service. It uses the stream service provided by BookKeeper to maintain sequences of log records by category, forming a line stream, also called a log stream. Based on a configured policy, the data stream is then divided into equal-sized segments, rolled over after a configurable time period (e.g. two hours) or at a configurable maximum size (e.g. 128 MB), and evenly distributed to the segment storage nodes (bookies).
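The time-or-size rolling policy can be sketched as follows. This is an illustrative assumption about how such a policy might be evaluated, not DistributedLog's actual configuration or code; `RollPolicy` and its field names are hypothetical, and the defaults simply reuse the two example values from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollPolicy:
    """Hypothetical segment-rolling policy: roll on age or on size."""
    max_age_seconds: float = 2 * 60 * 60   # e.g. two hours
    max_bytes: int = 128 * 1024 * 1024     # e.g. 128 MB

    def should_roll(self, opened_at: float, bytes_written: int, now: float) -> bool:
        """Return True when the current segment should be closed."""
        too_old = (now - opened_at) >= self.max_age_seconds
        too_big = bytes_written >= self.max_bytes
        return too_old or too_big


policy = RollPolicy()
# A segment opened at t=0 that has received 1 KiB after one hour is kept open.
print(policy.should_roll(opened_at=0.0, bytes_written=1024, now=3600.0))
```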
FIG. 3 shows a schematic diagram of the DistributedLog software stack. As shown in FIG. 3, DistributedLog builds a core service layer on top of the BookKeeper storage layer, which is further subdivided into a core layer and a stateless service layer. In the core layer, a process that writes records to DistributedLog is called a writer, and a process that reads and processes records from DistributedLog is called a reader. Each data record is a sequence of bytes. Writers write records in order to the line stream (log stream) they have selected, and each record is assigned a unique sequence number, the DistributedLog Sequence Number (DLSN). In addition to the DLSN, the application may set its own sequence number (Transaction ID) when the data record is built. When readers read records from their selected line stream, they start at a given position, which may be a DLSN or a Transaction ID, and read in the exact order of the line stream. Within the same line stream, different readers can start reading at different positions. The core layer supports single-writer, multi-reader semantics: for a given line stream there is only one active writer at any point in time, but there may be many active readers. The service layer above the core layer includes a write proxy and a read proxy, which manage multiple writers and readers respectively and handle the fan-in and fan-out of mass data from clients. The read proxy may also place data records in a cache, optimizing the read path for cases where multiple readers read the same line stream.
Columnar storage is relative to row-type storage, also called columnar format storage, and refers to storage in a logical storage unit based on column data, and data in the same column exists in a continuous storage form in a storage medium. The stream data stored in columnar format is called columnar stream data, columnar stream for short.
In one embodiment, line format storage may be performed to form line-type stream data and column format storage may be performed to form column-type stream data at the same time after receiving the stream data from the client, and an acknowledgement message may be returned to the client immediately after the line-type stream data storage is completed.
Fig. 4 shows a storage method of mass stream data in an embodiment of the present invention, which specifically includes the following steps:
Step 410: the server receives the stream data from the client.
The storage method of the invention can be applied on the server side. When a client issues a Remote Procedure Call (RPC) request from a remote computer to write stream data, the server receives the request and the stream data from the client (including third-party platforms), for example log files, online shopping data, game player activity, social network site information, financial transaction information or geospatial services, and telemetry data from equipment or instruments connected in a data center. Stream data is typically large in scale and arrives sequentially and continuously, and it usually needs to be processed incrementally and in near real time, record by record or over a sliding time window.
The stream data is composed of different events, each of which contains a plurality of attributes. As time progresses, new events are continually generated and continually imported into the streaming data. The data schema (schema) of the stream data represents the structure of each event and is made up of a number of fields, each field representing an attribute, with an explicit type.
Step 420: the stream data is stored in the storage system in line format to form line stream data.
As described above, line storage may use the DistributedLog system to store the received stream data as a line stream, which is not described again here.
Step 430: at the same time, the data schema of the stream data is extracted, the stream data is organized in columns according to the data schema, and the result is asynchronously written to BookKeeper to form columnar stream data.
In an embodiment, while DistributedLog stores the line stream, the server may organize the received stream data by columns: each field is extracted from the stream data according to its data schema to form different columns, the columns may be divided into a plurality of column clusters, and the data of one column cluster may be persisted continuously to the same segment on the BookKeeper storage nodes.
FIG. 5 illustrates a schematic diagram of organizing and writing streaming data into a storage system in columns in one embodiment. As shown in fig. 5, the server may initialize the received stream data in the memory, extract the data pattern, and organize the data with the same attribute into a column, such as a key column or a time stamp column (write time of the recorded data). The columns may be divided into several column clusters, and the columns with stronger association or common IO characteristics may be divided into one column cluster. Column cluster names may be used as prefixes within the column names to facilitate retrieval of records. Data of the same column cluster can be persisted to the same segment in the BookKeeper in order. When the data volume is large, the server needs to switch the segments periodically in consideration of the storage space and load condition of the whole storage cluster, and at this time, the data of the same column cluster may also be persisted into different segments. Each segment is an incremental data structure, and data of the same column cluster can be sequentially written into each data unit through a single writing mode. When the data storage of a certain column of clusters is finished or all the data units in the segment are fully written, the segment is uniformly dispersed into the BookKeeper cluster so as to ensure the uniformity of data and services on a plurality of storage nodes.
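As a rough illustration of the column organization described above, the following sketch extracts a schema from an event and regroups a batch of events into column clusters, using the cluster name as a prefix of each column name. The function names, the dictionary-based event representation, and the `cluster.field` naming are assumptions made for the example; the patent does not specify them.

```python
from typing import Any


def extract_schema(event: dict[str, Any]) -> dict[str, str]:
    """Map each field of an event to the name of its type."""
    return {name: type(value).__name__ for name, value in event.items()}


def to_column_clusters(
    events: list[dict[str, Any]],
    clusters: dict[str, list[str]],
) -> dict[str, dict[str, list[Any]]]:
    """Regroup row-shaped events into columns, grouped by column cluster.

    `clusters` maps a cluster name to the fields it contains. Column names
    are prefixed with the cluster name, e.g. "meta.key".
    """
    result: dict[str, dict[str, list[Any]]] = {
        cluster: {f"{cluster}.{field}": [] for field in fields}
        for cluster, fields in clusters.items()
    }
    for event in events:
        for cluster, fields in clusters.items():
            for field in fields:
                # Missing attributes are stored as None to keep columns aligned.
                result[cluster][f"{cluster}.{field}"].append(event.get(field))
    return result


events = [
    {"key": "sensor-1", "ts": 1, "temp": 20.5},
    {"key": "sensor-2", "ts": 2, "temp": 21.0},
]
layout = {"meta": ["key", "ts"], "metrics": ["temp"]}
print(extract_schema(events[0]))
print(to_column_clusters(events, layout))
```

Each inner dictionary corresponds to one column cluster, which in the scheme described above would be persisted to its own segment.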
The stream data organized in columns may be written using an aggregate-write strategy. In one embodiment, the server may allocate a buffer of suitable size in memory for each column cluster of the stream data organized in columns. Incoming data is first appended to the free space at the tail of the buffer; when the buffer is full, all the data in the buffer is persisted to the bookies together, and the buffer is then emptied for reuse. As long as the buffer is not full, nothing is written to the storage layer. In one embodiment, the buffer data may be written to the storage device at a granularity suited to the device's I/O transfers, in order to obtain higher write efficiency.
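A minimal sketch of the aggregate-write strategy follows. The class name and the `sink` callable (a stand-in for persisting one batch to a segment) are hypothetical. The sketch also assumes one behavior the text leaves open: when an incoming value does not fit in the remaining space, the buffer is flushed early rather than splitting the value.

```python
from typing import Callable


class ColumnClusterBuffer:
    """Fixed-capacity buffer that persists its content only when full."""

    def __init__(self, capacity: int, sink: Callable[[bytes], None]) -> None:
        self.capacity = capacity
        self.sink = sink            # stand-in for a write to the storage layer
        self._buf = bytearray()

    def append(self, value: bytes) -> None:
        if len(value) > self.capacity:
            raise ValueError("value larger than buffer capacity")
        # Assumption: flush early if the value would overflow the buffer.
        if len(self._buf) + len(value) > self.capacity:
            self.flush()
        self._buf += value          # append at the free tail of the buffer
        if len(self._buf) == self.capacity:
            self.flush()            # buffer is full: persist everything at once

    def flush(self) -> None:
        if self._buf:
            self.sink(bytes(self._buf))
            self._buf.clear()       # empty the buffer for reuse


written: list[bytes] = []
buf = ColumnClusterBuffer(capacity=8, sink=written.append)
buf.append(b"aaaa")   # not full yet, nothing is written
buf.append(b"bbbb")   # buffer reaches capacity and is persisted as one batch
print(written)
```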
In one embodiment, when stream data organized in columns is written using buffer aggregation, each event in the stream data can be assigned a unique ID. When the buffer is full, the IDs of the events in the buffer and their position information are attached to the head of the buffer and written into a data unit on the bookies together with all the data currently in the buffer. The buffer may be sized to match the data unit, so that when the header information and all the buffered data are stored in the data unit, the header information is also located at the head of the data unit. Because the mapping from event position to storage position in the segment storage system is recorded in the header of the data unit, any position in the stream data can be located quickly, the overhead of storing position-mapping information in the key-value system is reduced, and the storage position for an arbitrary read can be obtained by binary search.
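The header-plus-payload layout and the binary search over it can be sketched as follows. The byte layout (a 4-byte count followed by 8-byte ID and 4-byte offset pairs) and the function names are assumptions chosen for the example; the patent does not define an on-disk format. Event IDs are assumed to be appended in ascending order so that the header is sorted.

```python
import struct
from bisect import bisect_left

_COUNT = struct.Struct(">I")    # number of index entries in the header
_ENTRY = struct.Struct(">QI")   # event ID, offset of the event in the payload


def build_data_unit(events: list[tuple[int, bytes]]) -> bytes:
    """Pack (event_id, payload) pairs, in ascending ID order, into one unit."""
    index = bytearray()
    body = bytearray()
    for event_id, payload in events:
        index += _ENTRY.pack(event_id, len(body))  # position before appending
        body += payload
    return _COUNT.pack(len(events)) + bytes(index) + bytes(body)


def read_event(unit: bytes, event_id: int) -> bytes:
    """Locate one event through the header using binary search."""
    (count,) = _COUNT.unpack_from(unit, 0)
    entries = [
        _ENTRY.unpack_from(unit, _COUNT.size + i * _ENTRY.size)
        for i in range(count)
    ]
    ids = [entry[0] for entry in entries]
    i = bisect_left(ids, event_id)
    if i == count or ids[i] != event_id:
        raise KeyError(event_id)
    body_start = _COUNT.size + count * _ENTRY.size
    start = body_start + entries[i][1]
    # The event ends where the next one starts, or at the end of the unit.
    end = body_start + entries[i + 1][1] if i + 1 < count else len(unit)
    return unit[start:end]


unit = build_data_unit([(10, b"ab"), (11, b"cde"), (15, b"f")])
print(read_event(unit, 11))
```

Keeping the index inside the data unit means a point read needs one lookup in the unit itself rather than an extra round trip to the key-value system.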
In one embodiment, stream-level metadata generated in the columnar Storage and segment-level metadata may be stored in a distributed Key-Value Storage System (K/V), where the stream-level metadata is one-level metadata, which refers to intermediate data generated by organizing and storing stream data into segments in columns; the segment-level metadata is secondary metadata for recording statistical information of the segments of the storage column cluster. The K/V storage system stores metadata in the form of a table, and each K-V pair determines a unit in which a unique value of the metadata corresponding to the K-V pair is stored. FIG. 6(a) is a diagram illustrating a storage structure of stream-level metadata in a level-one K/V table in one embodiment. As shown in fig. 6(a), the primary Key (Key) of each row in the table is an ID (stream ID for short) of each data stream, the column may include a column cluster forming the data stream and a segment storing the column cluster, and the value of the unit determined by each row and column in the table is information of each data stream and its corresponding column cluster and segment. FIG. 6(b) is a diagram showing a storage structure of the metadata at the segment level in the secondary K/V table in one embodiment. As shown in fig. 6(b), the primary Key (Key) of each row in the table is the ID of each segment (segment ID for short), and the column may include the ID and position of the start-stop event in the segment, the number of events, the maximum and minimum values, and the like. The metadata of the stream level and the segment level are stored in the distributed key value system, so that the limitation that the traditional column type storage format can only store the metadata at the tail part of a file is overcome, the high efficiency of point query of the key value system can be fully exerted, and a user can conveniently and quickly access the metadata to read the data.
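The two-level metadata layout can be sketched with in-memory dictionaries standing in for the distributed key-value tables. The class, method, and statistic key names below are hypothetical. The sketch shows how stream-level metadata leads to candidate segments and how segment-level minimum and maximum values let a range query skip segments that cannot match.

```python
from typing import Any


class MetadataStore:
    """In-memory stand-in for the primary and secondary K/V tables."""

    def __init__(self) -> None:
        # Primary table: stream ID -> column cluster -> segment IDs.
        self.stream_table: dict[str, dict[str, list[str]]] = {}
        # Secondary table: segment ID -> statistics of that segment.
        self.segment_table: dict[str, dict[str, Any]] = {}

    def register_segment(
        self, stream_id: str, cluster: str, segment_id: str, stats: dict[str, Any]
    ) -> None:
        clusters = self.stream_table.setdefault(stream_id, {})
        clusters.setdefault(cluster, []).append(segment_id)
        self.segment_table[segment_id] = stats

    def segments_overlapping(
        self, stream_id: str, cluster: str, lo: Any, hi: Any
    ) -> list[str]:
        """Segments whose [min, max] range intersects the query range."""
        candidates = self.stream_table.get(stream_id, {}).get(cluster, [])
        return [
            seg for seg in candidates
            if self.segment_table[seg]["min"] <= hi
            and self.segment_table[seg]["max"] >= lo
        ]


store = MetadataStore()
store.register_segment("stream-1", "metrics", "seg-1",
                       {"first_event": 0, "last_event": 99, "count": 100,
                        "min": 0, "max": 9})
store.register_segment("stream-1", "metrics", "seg-2",
                       {"first_event": 100, "last_event": 199, "count": 100,
                        "min": 10, "max": 19})
print(store.segments_overlapping("stream-1", "metrics", 15, 30))
```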
Step 440: a storage-success message is returned to the client immediately after storage of the line stream data is completed.
DistributedLog can handle a large volume of read and write operations from thousands of clients per second with millisecond-level latency. By returning the storage-success message to the client immediately after the line stream has been stored, the latency introduced by columnar storage is kept off the client's write path.
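Steps 410 to 440 can be sketched with asyncio as below. This is a simplified illustration under stated assumptions, not the patented implementation: in-memory lists and dictionaries stand in for DistributedLog and BookKeeper, and the class and method names are hypothetical. The point shown is that the acknowledgement depends only on the line write, while the columnar write proceeds as a background task.

```python
import asyncio
from typing import Any


class DualFormatWriter:
    """Acknowledge after the line write; run the columnar write in background."""

    def __init__(self) -> None:
        self.line_log: list[dict[str, Any]] = []   # stand-in for the line stream
        self.columns: dict[str, list[Any]] = {}    # stand-in for the columnar stream
        self._pending: set[asyncio.Task] = set()

    async def _write_line(self, event: dict[str, Any]) -> None:
        self.line_log.append(event)

    async def _write_columnar(self, event: dict[str, Any]) -> None:
        await asyncio.sleep(0)                     # models the slower columnar path
        for field, value in event.items():
            self.columns.setdefault(field, []).append(value)

    async def handle_write(self, event: dict[str, Any]) -> str:
        # Start the columnar write without waiting for it (step 430).
        task = asyncio.create_task(self._write_columnar(event))
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)
        # Only the line write is on the acknowledgement path (steps 420, 440).
        await self._write_line(event)
        return "ACK"

    async def drain(self) -> None:
        """Wait for outstanding columnar writes, e.g. before shutdown."""
        if self._pending:
            await asyncio.gather(*list(self._pending))


async def main() -> None:
    writer = DualFormatWriter()
    print(await writer.handle_write({"key": "a", "ts": 1}))
    await writer.drain()
    print(writer.columns)


asyncio.run(main())
```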
FIG. 7 is a schematic diagram illustrating a stream data storage method according to an embodiment of the present invention. As shown in FIG. 7, after receiving data from a client, the server writes the data to the storage device using DistributedLog to form line stream data and returns an acknowledgement message to the client immediately after completion. At the same time, the server extracts the data schema of the stream data in memory and organizes the stream data into buffers according to that schema. Once a buffer is full, the stream data is written to the segment storage device by column cluster. The metadata generated during columnar storage can be stored in the distributed key-value storage system, and when a segment switch is needed, the information of the new segment is added to the key-value storage system.
When a large volume of stream data is written, a large amount of data may accumulate in memory, because storing the columnar stream is slow and consumes considerable CPU resources. The resources used for writing columnar stream data therefore need to be limited. In one embodiment, a reader of the line stream can serve as the relay for real-time data and pull that data with a single thread, which prevents data that reaches the server faster than it can be processed from piling up. FIG. 8 shows a method for storing stream data according to another embodiment of the present invention. As shown in FIG. 8, after receiving stream data from a client, the server may first store the stream data in DistributedLog in line format to form line stream data. It then uses a reader process in DistributedLog to read the line stream data from the storage system as the data source for columnar storage, extracts the data pattern of the line stream data, organizes the line stream data by column in memory according to the data pattern, and writes it to a bookie in columnar format. Each column cluster may be allocated its own write thread. Because the columnar stream uses an aggregated-write strategy, in which data is written only after a certain volume has accumulated, acquiring data, organizing it by column, and storing it with a single thread does not greatly affect the throughput of the columnar stream.
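A minimal Python sketch of the aggregated-write strategy follows: a single loop pulls events from a line-stream reader, appends them to one buffer per column cluster, and writes a buffer out only when it is full. The names `ColumnClusterBuffer`, `convert`, and `flushed_units` are hypothetical, any Python iterator stands in for the DistributedLog reader, and a list stands in for the storage node; none of this is taken from the patent itself.

```python
class ColumnClusterBuffer:
    """Buffer for one column cluster; written out only when full."""

    def __init__(self, fields, capacity):
        self.fields = fields      # fields belonging to this column cluster
        self.capacity = capacity  # number of events aggregated per write
        self.rows = []
        self.flushed_units = []   # stands in for data units in segment storage

    def append(self, event):
        # Keep only this cluster's fields and append at the tail of the buffer.
        self.rows.append(tuple(event[f] for f in self.fields))
        if len(self.rows) >= self.capacity:
            self.flush()

    def flush(self):
        if self.rows:
            self.flushed_units.append(list(self.rows))
            self.rows = []


def convert(line_reader, clusters, capacity):
    """Pull events from the line stream in a single loop and buffer them by cluster."""
    buffers = {
        name: ColumnClusterBuffer(fields, capacity)
        for name, fields in clusters.items()
    }
    for event in line_reader:
        for buf in buffers.values():
            buf.append(event)
    return buffers
```

Because writes happen once per `capacity` events rather than once per event, the single pulling loop is unlikely to become the bottleneck in this sketch, which mirrors the reasoning given above.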
Because the storage method lets line-format and columnar storage coexist, the storage system holds the line stream data and the columnar stream data at the same time. Keeping both for a long time occupies a large amount of storage space and wastes storage resources. In one embodiment, after storage of the columnar stream is completed, if the server does not receive a request to read the line stream data within a predetermined time threshold, the line stream data may be deleted from the storage system, which saves storage space while keeping the stream data complete. FIG. 9 is a schematic diagram of deleting the line stream after columnar stream storage is complete in one embodiment. As shown in FIG. 9, the server first stores the received stream data in line format to form a line stream and then organizes and stores the line stream by column to form a columnar stream. After a period of time has elapsed (e.g., at time T_n), the line stream may be deleted so that only the columnar stream data remains in the system.
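The deletion condition can be expressed as a small predicate. The sketch below is an assumption-laden illustration: the function name and parameters are hypothetical, times are plain numbers in seconds, and restarting the idle period after each read of the line stream is one possible reading of "within a predetermined time threshold", not something the text specifies.

```python
def may_delete_line_stream(columnar_done_at, last_line_read_at, now, threshold):
    """Return True if the line stream may be deleted.

    columnar_done_at: time at which columnar storage completed, or None if not done.
    last_line_read_at: time of the most recent read of the line stream, or None.
    """
    if columnar_done_at is None:
        return False  # the columnar copy is not complete yet, keep the line stream
    if last_line_read_at is None:
        idle_since = columnar_done_at
    else:
        idle_since = max(columnar_done_at, last_line_read_at)
    return now - idle_since >= threshold
```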
When the server receives a read request from the client while the line stream and the columnar stream coexist, the server may read data from either stream. Because the line stream is stored faster than the columnar stream, columnar storage may still be in progress after line storage has completed. In this case the server can read from the line stream preferentially so as to respond to the client quickly, which effectively avoids read delays. Once the line stream data has been deleted, the server reads the data from the columnar stream data and returns the result.
FIG. 10 shows a method for reading mass stream data in an embodiment of the present invention. As shown in FIG. 10, after receiving an RPC (remote procedure call) read request sent by a client, the server may first query the metadata of the requested data from the distributed key-value storage system, determine the location of the requested data from the metadata, read the data from the segment storage device, and return it to the client. The read request may include, for example, information about the stream data to which the requested data belongs, attribute information, or information about the event to which the requested data belongs. From this information the server can determine the stream ID, column cluster, and event ID of the requested data, and can then locate the data efficiently using the stream-level and segment-level metadata in the first-level and second-level K/V tables together with the event IDs and position information in the header of each data unit in the segment.
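The last step of the lookup, finding an event inside a data unit from the event IDs and positions in the unit's header, can be sketched in Python as follows. The header layout used here (a sorted list of `(event_id, offset, length)` entries) and the names `build_data_unit` and `read_event` are assumptions for illustration; the text only states that the header holds event IDs and their position information.

```python
import bisect


def build_data_unit(events):
    """Build (header, body) from a list of (event_id, payload) sorted by event_id."""
    header, body, offset = [], b"", 0
    for event_id, payload in events:
        header.append((event_id, offset, len(payload)))
        body += payload
        offset += len(payload)
    return header, body


def read_event(unit, event_id):
    """Locate one event through the header and return its payload bytes."""
    header, body = unit
    ids = [entry[0] for entry in header]
    i = bisect.bisect_left(ids, event_id)  # binary search over sorted event IDs
    if i == len(ids) or ids[i] != event_id:
        raise KeyError(event_id)
    _, offset, length = header[i]
    return body[offset:offset + length]
```

Keeping the index at the head of the unit means a reader can find one event without scanning the whole body.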
In general, a user who reads data by column reads all of the data in a specific range. To improve the efficiency of columnar reads, in one embodiment the server may therefore adopt an active prefetching policy: after locating the start position of the requested data with the reading method above, it reads the next batch of data from the segment storage system in advance and places it in a cache. FIG. 11 is a schematic diagram illustrating a method for reading stream data according to an embodiment of the present invention. As shown in FIG. 11, after receiving a read request from a client, the server may first look in the cache. If the requested data is in the cache, the server reads it directly from the cache and returns it to the client. If the requested data is not in the cache, the server queries the metadata of the requested data from the distributed key-value storage system, reads the data from the segment storage device according to the metadata, and returns it to the client.
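A minimal Python sketch of the cache-first read with active prefetching follows. It is an illustration only: `PrefetchingReader`, `prefetch_count`, and `storage_reads` are hypothetical names, a list of data units stands in for the segment storage system, and the metadata lookup is reduced to indexing into that list.

```python
class PrefetchingReader:
    """Serve reads from a cache when possible and prefetch the following units."""

    def __init__(self, data_units, prefetch_count):
        self.storage = data_units         # stands in for segment storage
        self.prefetch_count = prefetch_count
        self.cache = {}                   # unit index -> prefetched data
        self.storage_reads = 0            # number of reads that reached storage

    def _fetch(self, index):
        self.storage_reads += 1
        return self.storage[index]

    def read(self, index):
        if index in self.cache:
            data = self.cache.pop(index)  # cache hit: no storage access
        else:
            data = self._fetch(index)     # cache miss: read from storage
        # Actively prefetch the next batch of units that are not cached yet.
        end = min(index + 1 + self.prefetch_count, len(self.storage))
        for nxt in range(index + 1, end):
            if nxt not in self.cache:
                self.cache[nxt] = self._fetch(nxt)
        return data
```

For a sequential scan, only the first read waits on storage in this sketch; later reads are served from the cache while the following units are fetched ahead of the reader.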
Active prefetching effectively reduces read latency but requires a large amount of memory. In one embodiment, the server limits the resources used by active prefetching: a resource central control component controls the total number of prefetches allowed in the cluster and tracks that total with an internal counter. When a columnar reader for a stream is created, the server creates several column cluster readers according to the columns the user reads. Each column cluster reader requests from the resource central control component the prefetch count set by the user; the component returns the number of data units that may be prefetched and subtracts that number from its internal counter. When a columnar reader is closed, the server notifies the resource central control component, which adds the corresponding number back to its counter. This prevents active prefetching from consuming so much memory, when many columnar reads are served at the same time, that other functional modules are affected.
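The counter kept by the resource central control component can be sketched as a small quota object. This is a hedged illustration: the class name `PrefetchQuota` and its methods are hypothetical, it runs in a single process guarded by a lock, and granting a partial amount when the request exceeds what is left is an assumption about how "returns the number of data units that can be prefetched" behaves.

```python
import threading


class PrefetchQuota:
    """Cluster-wide budget of data units that may be prefetched."""

    def __init__(self, total):
        self._available = total
        self._lock = threading.Lock()

    def acquire(self, requested):
        """Grant up to `requested` units and subtract them from the counter."""
        with self._lock:
            granted = min(requested, self._available)
            self._available -= granted
            return granted

    def release(self, granted):
        """Add the units of a closed reader back to the counter."""
        with self._lock:
            self._available += granted

    @property
    def available(self):
        with self._lock:
            return self._available
```

A column cluster reader would call `acquire` when it is created and `release` with the granted amount when its columnar reader is closed, so the total prefetched data never exceeds the configured budget.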
The invention supports flexible deployment of the server: the server may be deployed on an independent node, or it may be deployed on the same node as a storage node of the distributed segment storage system. Co-locating the server with a storage node reduces cross-network message transmission between the server and the segment storage system and thus reduces latency to some extent. With the rapid development of high-speed networks, however, the latency of accessing data on a local storage device and the latency of accessing data over the network no longer differ much, so in a high-speed network environment the server may instead be deployed on its own node, which allows the different types of nodes to be managed separately.
In one embodiment of the invention, the invention may be implemented in the form of a computer program. The computer program may be stored in various storage media (e.g., a hard disk, an optical disk, a flash memory, etc.), and when executed by a processor, can be used to implement the storing method and the reading method of the present invention. The computer program implementing the above-described storing and reading method of the present invention is named "CStream".
Experimental tests show that CStream is far superior to similar storage systems in writing, reading, and querying. FIG. 12(a) shows a write performance comparison (3 replicas, each stream containing 6 fields totaling 38 bytes). As shown in FIG. 12(a), for event stream writes, whether single-stream or multi-stream, CStream achieves better throughput and latency than Kafka. FIG. 12(b) shows a read performance comparison (3 replicas, each stream containing 6 fields totaling 38 bytes). As shown in FIG. 12(b), for event stream reads, whether single-stream or multi-stream, CStream again achieves better throughput and latency than Kafka. FIG. 12(c) shows TPC-H query performance (SF 300). As shown in FIG. 12(c), when data has just been loaded into CStream, its query performance is close to that of Kafka because the data is still stored in line format. Over time, CStream converts more and more of the data to columnar storage, and its query performance becomes better than that of Kafka and increasingly close to that of part (part is an HDFS-based column store).
In another embodiment of the invention, the invention may be implemented in the form of an electronic device. The electronic device comprises a processor and a memory in which a computer program is stored which, when being executed by the processor, can be used for carrying out the method of the invention.
References herein to "various embodiments," "some embodiments," "one embodiment," or "an embodiment," etc., indicate that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases "in various embodiments," "in some embodiments," "in one embodiment," or "in an embodiment," or the like, in various places throughout this document are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Thus, a particular feature, structure, or characteristic illustrated or described in connection with one embodiment may be combined, in whole or in part, with a feature, structure, or characteristic of one or more other embodiments without limitation, provided the combination is not illogical or unworkable. Expressions herein such as "according to A," "based on A," "by A," or "using A" are non-exclusive; that is, "according to A" may cover "according to A only" as well as "according to A and B," unless it is specifically stated or clear from context that the meaning is "according to A only." In the present application, for clarity of explanation, some illustrative operational steps are described in a certain order, but one skilled in the art will appreciate that not every one of these steps is essential and that some of them may be omitted or replaced by others. Nor do these operations have to be performed sequentially in the manner shown; some of them may be performed in a different order, or in parallel, as desired, provided that the new implementation is not illogical or unworkable.
Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be within the spirit and scope of the invention. Although the present invention has been described by way of preferred embodiments, it is not limited to the embodiments described herein, and various changes and modifications may be made without departing from the scope of the present invention.

Claims (10)

1. A method for storing mass stream data, comprising:
receiving stream data from a client;
storing the stream data in a line format in a distributed segment storage system to form line stream data;
asynchronously storing the stream data in a columnar format in the distributed segment storage system to form columnar stream data; and
returning a confirmation message to the client after storage of the line stream data is completed.
2. The method of claim 1, wherein asynchronously storing the stream data in a columnar format in the distributed segment storage system comprises:
asynchronously storing the stream data in the columnar format in the distributed segment storage system to form columnar stream data while the stream data is being stored in the line format in the distributed segment storage system.
3. The method of claim 1, wherein asynchronously storing the stream data in a columnar format in the distributed segment storage system comprises:
first storing the stream data in the line format in the distributed segment storage system to form line stream data; and
reading the line stream data from the distributed segment storage system and asynchronously storing it in the columnar format in the distributed segment storage system to form columnar stream data.
4. The method of claim 1, wherein asynchronously storing the stream data in a columnar format in the distributed segment storage system comprises:
extracting a data pattern of the stream data;
organizing the stream data by column according to the data pattern;
allocating a separate buffer for each column cluster in the column-organized stream data;
appending the column-organized stream data of each column cluster to the free space at the tail of the buffer corresponding to that column cluster; and
when a buffer is full, writing the stream data in the buffer into the distributed segment storage system, wherein the stream data of different column clusters is stored in different segments.
5. The method of claim 4, further comprising:
setting an ID for each event in the stream data;
recording the position information of each event in the buffer;
when the buffer is full, attaching the IDs and corresponding position information of all events in the buffer to the head of the buffer; and
writing the IDs and position information of all events in the buffer, together with all stream data in the buffer, into a data unit of the distributed segment storage system, wherein the IDs and position information of all events in the buffer are located at the head of the data unit.
6. The method of claim 1, further comprising:
storing metadata of the columnar stream data in a distributed key-value storage system, the metadata comprising two levels, a stream level and a segment level, wherein
the stream-level metadata includes: information about the column clusters constituting the columnar stream data and information about the segments storing each column cluster; and
the segment-level metadata includes: position information of the start and end events within the segment, the number of events, the maximum and minimum values, and other relevant information.
7. The method of claim 1, further comprising:
after asynchronously storing the stream data in the columnar format in the distributed segment storage system, deleting the line stream data from the distributed segment storage system if no request to read the line stream data is received within a predetermined time threshold.
8. A method for reading mass stream data based on the storage method of claim 6, comprising:
receiving a request from a client to read data;
querying metadata of the data in the distributed key-value storage system according to the request; and
reading the data from the distributed segment storage system according to the metadata and returning the data to the client.
9. The reading method according to claim 8, further comprising:
determining start position information of the data according to the metadata of the data;
reading in advance the data that follows the start position and placing it in a data cache; and
reading the data from the data cache.
10. A storage system for mass streaming data, comprising a server and a storage device, operable to implement the method of any of claims 1-9.
CN201911196972.1A 2019-11-29 2019-11-29 Method and system for storing and reading mass stream data Pending CN111159176A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911196972.1A CN111159176A (en) 2019-11-29 2019-11-29 Method and system for storing and reading mass stream data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911196972.1A CN111159176A (en) 2019-11-29 2019-11-29 Method and system for storing and reading mass stream data

Publications (1)

Publication Number Publication Date
CN111159176A true CN111159176A (en) 2020-05-15

Family

ID=70556253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911196972.1A Pending CN111159176A (en) 2019-11-29 2019-11-29 Method and system for storing and reading mass stream data

Country Status (1)

Country Link
CN (1) CN111159176A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930751A (en) * 2020-08-31 2020-11-13 成都四方伟业软件股份有限公司 Time sequence data storage method and device
CN112257051A (en) * 2020-12-23 2021-01-22 畅捷通信息技术股份有限公司 WeChat-based selective data processing method, device and medium
CN113608674A (en) * 2021-06-25 2021-11-05 济南浪潮数据技术有限公司 Method and device for realizing reading and writing of distributed block storage system
WO2022166071A1 (en) * 2021-02-04 2022-08-11 华为技术有限公司 Stream data access method and apparatus in stream data storage system
CN115563128A (en) * 2022-12-07 2023-01-03 深圳市加推科技有限公司 Social data management method, device, equipment and medium
CN118192923A (en) * 2024-05-17 2024-06-14 中国西安卫星测控中心 Page display method based on time and event driving

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101996250A (en) * 2010-11-15 2011-03-30 中国科学院计算技术研究所 Hadoop-based mass stream data storage and query method and system
CN102298641A (en) * 2011-09-14 2011-12-28 清华大学 Method for uniformly storing files and structured data based on key value bank
CN102332027A (en) * 2011-10-15 2012-01-25 西安交通大学 Mass non-independent small file associated storage method based on Hadoop
CN105095247A (en) * 2014-05-05 2015-11-25 中国电信股份有限公司 Symbolic data analysis method and system
CN109542889A (en) * 2018-10-11 2019-03-29 平安科技(深圳)有限公司 Stream data column storage method, device, equipment and storage medium
CN110362572A (en) * 2019-06-25 2019-10-22 浙江邦盛科技有限公司 A kind of time series database system based on column storage

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
VLADIMIR NIKULIN: "On the method for data streams aggregation to predict shoppers loyalty", 《2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)》, 1 October 2015 (2015-10-01) *
赵永霞: "数据库原理与应用", 华中科技大学出版社, pages: 219 *
陈绍斌: "大规模流式数据存储研究与应用", 《信息科技辑》 *
陈绍斌: "大规模流式数据存储研究与应用", 《信息科技辑》, 15 August 2018 (2018-08-15) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930751A (en) * 2020-08-31 2020-11-13 成都四方伟业软件股份有限公司 Time sequence data storage method and device
CN112257051A (en) * 2020-12-23 2021-01-22 畅捷通信息技术股份有限公司 WeChat-based selective data processing method, device and medium
CN112257051B (en) * 2020-12-23 2021-03-19 畅捷通信息技术股份有限公司 WeChat-based selective data processing method, device and medium
WO2022166071A1 (en) * 2021-02-04 2022-08-11 华为技术有限公司 Stream data access method and apparatus in stream data storage system
CN113608674A (en) * 2021-06-25 2021-11-05 济南浪潮数据技术有限公司 Method and device for realizing reading and writing of distributed block storage system
CN113608674B (en) * 2021-06-25 2024-02-23 济南浪潮数据技术有限公司 Method and device for realizing reading and writing of distributed block storage system
CN115563128A (en) * 2022-12-07 2023-01-03 深圳市加推科技有限公司 Social data management method, device, equipment and medium
CN118192923A (en) * 2024-05-17 2024-06-14 中国西安卫星测控中心 Page display method based on time and event driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200515