WO2020078395A1 - Data storage method and apparatus, and storage medium - Google Patents
- Publication number
- WO2020078395A1 (PCT/CN2019/111510)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- low
- aggregated
- time
- processing unit
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
Definitions
- Embodiments of the present application relate to the technical field of data processing, and in particular, to a data storage method, device, and storage medium.
- data storage can be implemented through a data cube, where the data cube is a type of multi-dimensional matrix, that is, data of multiple dimensions can be stored.
- an implementation manner of storing data through a data cube may include: the storage device obtains data to be stored, and performs aggregate statistical processing on the obtained data to obtain corresponding aggregated data. After that, the obtained aggregated data can be merged with the existing data in the data cube, and the merged data can be stored in the data cube.
- Embodiments of the present application provide a data storage method, device, and storage medium, which can solve the problem that it takes a relatively long time to query data in the related art.
- the technical solution is as follows:
- a data storage method includes:
- the multiple aggregated data is classified and stored by multiple data processing units, wherein the types of aggregated data stored in each data processing unit are the same.
- a data storage device comprising:
- the acquisition module is used to acquire multiple pieces of data from the data source, and each piece of data carries a time stamp;
- a classification processing module configured to classify the multiple pieces of data according to the time stamp of each piece of data to obtain multiple sets of data
- An aggregation statistics module configured to aggregate statistics on each group of the multiple sets of data to obtain multiple aggregated data
- a classification storage module is used to classify and store the plurality of aggregated data by a plurality of data processing units, wherein the types of aggregated data stored in each data processing unit are the same.
- a computer-readable storage medium on which instructions are stored, and when the instructions are executed by a processor, the data storage method according to the first aspect described above is implemented.
- a computer program product containing instructions, which when executed on a computer, causes the computer to execute the data storage method described in the first aspect above.
- a storage device includes a processor and a memory, wherein the memory is used to store a computer program, and the processor is used to execute the program stored on the memory to implement the data storage method described in the first aspect above.
- Fig. 1 is a flowchart of a data storage method according to an exemplary embodiment
- Fig. 2 is a schematic diagram of a data processing unit according to an exemplary embodiment
- Fig. 3 is a schematic structural diagram of a data storage device according to an exemplary embodiment
- Fig. 4 is a schematic structural diagram of a storage device according to an exemplary embodiment.
- Spark Streaming: a computing engine that can batch process data. Its basic principle is to batch process input data at a certain time interval. When the batch processing interval is shortened to the second level, it can be used to process real-time data streams. It can support obtaining data from multiple data sources.
- Data sources can include Kafka data sources, Flume data sources, Twitter data sources, ZeroMQ data sources, Kinesis data sources, and TCP (Transmission Control Protocol) socket data sources.
- Data cube: a kind of multi-dimensional matrix, which can be used for data analysis and indexing, and can support real-time indexing of metadata with any number of keywords.
- the data cube may be composed of memory and disk (distributed database) to implement multi-dimensional data storage based on the memory and disk.
- the related technical field proposes to store data through data cubes.
- the data is generally stored in a distributed database of the data cube, for example, the distributed database is HBase.
- the embodiments of the present application provide a data storage method, which can solve the above-mentioned problems.
- For the specific implementation, please refer to the embodiment shown in FIG. 1 below.
- the data storage method provided by the embodiments of the present application may be executed by a storage device, and the storage device includes multiple data processing units to store data through the multiple data processing units.
- each data processing unit in the plurality of data processing units is composed of a memory and a disk.
- the data processing unit may be the aforementioned data cube.
- the storage device may also include Spark Streaming to obtain data from the data source through the Spark Streaming.
- Fig. 1 is a flowchart of a data storage method according to an exemplary embodiment.
- the data storage method is implemented by using the above storage device as an example for illustration.
- the data storage method may include the following implementation steps:
- Step 101 Obtain multiple pieces of data from a data source, and each piece of data carries a time stamp.
- the storage device may obtain the multiple pieces of data from the data source through Spark Streaming.
- For example, when the data source is a Kafka data source, multiple pieces of data may be read from the Kafka data source through Spark Streaming.
- Each piece of data carries a time stamp. The time stamp of each piece of data can be used to indicate the generation time of that piece of data.
- Step 102 Classify the multiple pieces of data according to the time stamp of each piece of data to obtain multiple sets of data.
- the storage device classifies the multiple pieces of data according to the time stamp of each piece of data.
- the specific implementation may include the following implementation steps:
- The multiple pieces of data may be classified according to the two data types of recent data and old data, that is, the pieces of data that belong to recent data are classified into one category and the pieces that belong to old data are classified into another category, for which a recent time range needs to be determined.
- The latest time, that is, the most recent time indicated by any of the time stamps, is obtained from the time stamps of the multiple pieces of data.
- For example, the pieces of data include first data, second data, third data, and fourth data. The time indicated by the timestamp of the first data is June 25, 2017, the time indicated by the timestamp of the second data is June 29, 2017, the time indicated by the timestamp of the third data is July 2, 2017, and the time indicated by the timestamp of the fourth data is July 5, 2017. The latest time obtained by the storage device is then July 5, 2017.
- The specific implementation of determining a target time interval that includes the latest time and whose interval length is a preset threshold may include the following possible implementations:
- In the first implementation manner, when the latest time is within a pre-stored time interval whose interval length is the preset threshold, that time interval is determined as the target time interval.
- the preset threshold may be set by the user according to actual needs, or may be set by the storage device by default, which is not limited in this embodiment of the present application.
- the preset threshold may be 30 days.
- In this case, the pre-stored time interval is a recent time range relative to the multiple pieces of data acquired in the batch, so the time interval is used as the target time interval, wherein the target time interval is equivalent to the above-mentioned recent time range.
- In the second implementation manner, when the latest time is greater than the right value of the pre-stored time interval whose interval length is the preset threshold, the time difference between the latest time and the right value of the time interval is determined, the time sum of the left value of the time interval and the time difference is determined, the right value of the time interval is updated to the latest time, the left value of the time interval is updated to the time sum, and the updated time interval is determined as the target time interval.
- the pre-stored time interval needs to be updated to re-determine the target time interval.
- This is equivalent to sliding the time interval to the right by a certain length of time, namely the difference between the latest time and the right value of the time interval. For example, if the pre-stored time interval is [July 1, July 15] and the latest time is July 16, the target time interval can be determined as [July 2, July 16].
- the pre-stored time interval may be updated to the target time interval.
- The target time interval may also be determined in other ways. For example, further calculation may be performed on the basis of the time interval determined in the first implementation manner, and the result used as the target time interval, such as adding a fixed value to both the left and right values of the time interval, where the fixed value can be set according to actual needs. As another example, further calculation may be performed on the basis of the updated time interval determined in the second implementation manner, such as adding a fixed value to both the left and right values of the updated time interval to obtain the target time interval. This is not limited in the embodiments of the present application.
- Before determining the target time interval, the storage device may also query whether a pre-stored time interval exists.
- When the time interval exists, the target time interval is determined according to the above two implementation manners.
- When the time interval does not exist, the storage device may generate the target time interval based on the latest time and the interval length. For example, the difference between the latest time and the preset threshold may be determined, and then the latest time is determined as the right value of the target time interval and the determined difference is determined as the left value of the target time interval.
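The determination of the target time interval described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the function name `target_interval`, the use of calendar dates, and the handling of a latest time earlier than the stored interval are all hypothetical.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

# (left value, right value) of a time interval, both inclusive.
Interval = Tuple[date, date]


def target_interval(latest: date,
                    stored: Optional[Interval],
                    length_days: int) -> Interval:
    """Return the target time interval for a batch whose latest time is `latest`.

    `stored` is the pre-stored interval (None if none exists yet) and
    `length_days` plays the role of the preset threshold.
    """
    if stored is None:
        # No pre-stored interval: the latest time becomes the right value,
        # and latest minus the preset threshold becomes the left value.
        return (latest - timedelta(days=length_days), latest)

    left, right = stored
    if latest > right:
        # Second implementation manner: slide the interval to the right
        # by the difference between the latest time and the right value.
        diff = latest - right
        return (left + diff, latest)

    # First implementation manner: the latest time lies within the stored
    # interval. (A latest time before the left value is not discussed in
    # the text; keeping the stored interval is an assumption.)
    return stored
```

With the pre-stored interval [July 1, July 15] and a latest time of July 16, this sketch slides the interval to [July 2, July 16], matching the example above.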
- 1023 Classify the multiple pieces of data according to the time stamp of each piece of data and the target time interval.
- the pieces of data are classified according to the time stamp of each piece of data and the determined target time interval.
- Among the multiple pieces of data, the data whose time indicated by the time stamp is less than the left value of the target time interval is determined as high-level data, and the data whose time indicated by the time stamp is within the target time interval is determined as low-level data.
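- When the time indicated by the time stamp of a piece of data is less than the left value of the target time interval, the piece of data precedes the target time interval and can be considered old data; this type of data is classified as high-level data.
- When the time indicated by the time stamp of a piece of data is within the target time interval, the piece of data may be regarded as recent data, and this type of data is classified as low-level data here. In this way, two groups of data are obtained after data classification processing.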
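The classification rule above can be sketched as follows. The record layout (a dict with a `timestamp` field) and the function name `classify` are assumptions made only for illustration.

```python
from datetime import date
from typing import Dict, List, Tuple


def classify(records: List[Dict], interval: Tuple[date, date]):
    """Split records into (high-level, low-level) groups by time stamp."""
    left, right = interval
    high_level, low_level = [], []
    for record in records:
        ts = record["timestamp"]
        if ts < left:
            # Before the target time interval: old data.
            high_level.append(record)
        elif ts <= right:
            # Inside the target time interval: recent data.
            low_level.append(record)
    return high_level, low_level


# The four example time stamps from the text, with a hypothetical
# target interval of [June 28, July 5].
records = [
    {"name": "first", "timestamp": date(2017, 6, 25)},
    {"name": "second", "timestamp": date(2017, 6, 29)},
    {"name": "third", "timestamp": date(2017, 7, 2)},
    {"name": "fourth", "timestamp": date(2017, 7, 5)},
]
high, low = classify(records, (date(2017, 6, 28), date(2017, 7, 5)))
```

Because the right value of the target time interval is the latest time of the batch, no record of the batch is expected to fall after the interval.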
- Step 103 Aggregate statistics for each set of data in the multiple sets of data to obtain multiple aggregated data.
- the two sets of high-level data and low-level data obtained above need to be aggregated and counted.
- When the target time interval uses day as the time granularity, statistics on the high-level data are aggregated according to different time levels and data attributes, based on the three time granularities of year, month, and day, to obtain multiple first high-aggregated data; and statistics on the low-level data are aggregated according to different time levels and data attributes, based on the six time granularities of year, month, day, hour, minute, and second, to obtain multiple second high-aggregated data and multiple first low-aggregated data.
- different time levels include different dimensions of time granularity.
- Here, a one-dimensional data attribute is taken as an example.
- the storage device aggregates statistics according to different time levels and data attributes based on the three time granularities of year, month, and day.
- the different time levels include a first time level, a second time level, and a third time level.
- the first time level includes a time granularity of year
- the second time level includes a time granularity of year and month.
- the third time level includes three time granularities: year, month, and day.
- The storage device aggregates statistics on each piece of data according to the first time level and data attributes to obtain the first high-aggregated data corresponding to the first time level; aggregates statistics on each piece of data according to the second time level and data attributes to obtain the first high-aggregated data corresponding to the second time level; and aggregates statistics on each piece of data according to the third time level and data attributes to obtain the first high-aggregated data corresponding to the third time level.
- the storage device aggregates statistics according to different time levels and data attributes according to six time granularities of year, month, day, hour, minute, and second.
- the different time levels include not only the first time level, the second time level, and the third time level, but also the fourth time level, the fifth time level, and the sixth time level.
- the fourth time level includes four time granularities: year, month, day, and hour.
- the fifth time level includes five time granularities: year, month, day, hour, and minute.
- the sixth time level includes six time granularities: year, month, day, hour, minute, and second.
- The storage device aggregates statistics on each piece of data according to the first time level and data attributes to obtain second high-aggregated data corresponding to the first time level;
- According to the sixth time level and data attributes, the storage device aggregates statistics on each piece of data to obtain the first low-aggregated data corresponding to the sixth time level.
- The above takes a one-dimensional data attribute as an example for illustration.
- The storage device may also combine two dimensions of data attributes for aggregation statistics, for example, based on the third time level, combining two dimensions of data attributes for aggregation statistics. In this way, 12 first high-aggregated data are obtained.
- Data attributes and data attribute values are aggregated, for example, when the data attribute is age, the data attribute value may be an age value, etc.
- Step 104 When the multiple data processing units include a high-level data processing unit and a low-level data processing unit, obtain the row key in each aggregated data. The row key of each aggregated data is generated during aggregation statistics and is used to indicate the time level and data attributes corresponding to that aggregated data.
- each data processing unit of the plurality of data processing units is composed of a memory and a disk, and the type of aggregated data stored in each data processing unit is the same.
- the multiple data processing units include a high-level data processing unit and a low-level data processing unit, please refer to FIG. 2, which is a schematic diagram of a data processing unit according to an exemplary embodiment.
- The storage device obtains the row keys generated during the aggregation statistics process.
- When the time level and the data attributes on which the aggregation is based are the same, and the time corresponding to the time level is within the same time range, the row keys generated are also the same.
- For example, if the first data is aggregated based on July 2017 and a certain data attribute, and the second data is also aggregated based on July 2017 and the same data attribute, the row keys of the two aggregated data obtained after the aggregation statistics are the same.
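Aggregation by time level, together with the generation of a row key per aggregated result, can be sketched as follows. The text does not specify the row key layout or the statistic being computed; the `L<level>|<time>|<attribute>=<value>` key format, the `LEVEL_FORMATS` table, and the use of a simple count are all assumptions for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List

# Hypothetical time formats for the six time levels
# (year; year+month; ...; year through second).
LEVEL_FORMATS = {
    1: "%Y",
    2: "%Y%m",
    3: "%Y%m%d",
    4: "%Y%m%d%H",
    5: "%Y%m%d%H%M",
    6: "%Y%m%d%H%M%S",
}


def aggregate(records: List[Dict], levels: Iterable[int], attribute: str) -> Dict[str, int]:
    """Count records per (time level, time value, attribute value).

    Returns a mapping from row key to count. Records that share a time
    level, a time range, and an attribute value produce the same row key.
    """
    counts: Counter = Counter()
    for record in records:
        for level in levels:
            time_part = record["timestamp"].strftime(LEVEL_FORMATS[level])
            row_key = f"L{level}|{time_part}|{attribute}={record[attribute]}"
            counts[row_key] += 1
    return dict(counts)
```

Under these assumptions, two records from July 2017 with the same attribute value yield the same row key at the second time level, so their statistics land in the same aggregated entry.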
- Step 105 Based on the row key in each first high-aggregated data and each second high-aggregated data, the plurality of first high-aggregated data and the plurality of second high-aggregated data are stored by the high-level data processing unit.
- The plurality of first high-aggregated data and the plurality of second high-aggregated data are stored in the high-level data processing unit, that is, the high-aggregated data obtained from the high-level data and the part of the high-aggregated data obtained by aggregating statistics on the low-level data are stored in the same data processing unit.
- The specific implementation of storing the multiple first high-aggregated data and the multiple second high-aggregated data through the high-level data processing unit may include: merging the high-aggregated data with the same row key among the multiple first high-aggregated data and the multiple second high-aggregated data to obtain multiple third high-aggregated data, and storing the multiple third high-aggregated data in the high-level data processing unit.
- That is, when storing high-aggregated data in the high-level data processing unit, the data is not directly merged with the data in the high-level data processing unit, but is merged only when certain conditions are met.
- The high-aggregated data with the same row key is merged to obtain multiple third high-aggregated data, so that high-aggregated data with the same row key is already combined when the multiple third high-aggregated data is stored in the high-level data processing unit. In this way, it is convenient for the user to subsequently query multiple pieces of data at the same time level and within the same time range at a time, avoiding the need to merge at the time of query, and improving the efficiency of data query.
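Merging aggregated data that shares a row key can be sketched as follows. The sketch assumes each aggregated data is a `(row_key, value)` pair whose value is an additive statistic such as a count; the text does not fix the statistic, and the function name `merge_by_row_key` is hypothetical.

```python
from typing import Dict, Iterable, Tuple


def merge_by_row_key(*batches: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Merge aggregated data from several batches, summing equal row keys."""
    merged: Dict[str, int] = {}
    for batch in batches:
        for row_key, value in batch:
            # Entries with the same row key collapse into one merged entry.
            merged[row_key] = merged.get(row_key, 0) + value
    return merged


# Hypothetical first and second high-aggregated data for one batch.
first_high = [("L2|201706|age=30", 4), ("L1|2017|age=30", 4)]
second_high = [("L2|201707|age=30", 3), ("L1|2017|age=30", 3)]
third_high = merge_by_row_key(first_high, second_high)
```

Here the two entries keyed `L1|2017|age=30` are combined into a single entry, while the month-level entries stay separate because their row keys differ.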
- A specific implementation of storing the plurality of third high-aggregated data in the high-level data processing unit may include: for each third high-aggregated data in the plurality of third high-aggregated data, querying whether the memory of the high-level data processing unit stores data with the same row key as that third high-aggregated data; and, when the memory of the high-level data processing unit stores such data, merging the queried data with that third high-aggregated data and storing the merged data in the memory of the high-level data processing unit.
- The embodiment of the present application first merges the high-aggregated data in the memory, that is, queries whether the memory of the high-level data processing unit stores data with the same row key as the third high-aggregated data. If it exists, the high-aggregated data with the same row key is merged directly in memory, and the merged high-aggregated data is stored in memory.
- When no such data is stored in the memory, data with the same row key as each third high-aggregated data is acquired, the acquired data is merged with each third high-aggregated data, and the merged data is stored in the memory of the high-level data processing unit.
- Step 106 Based on the row key in each first low-aggregated data, the multiple first low-aggregated data is stored by the low-level data processing unit.
- A plurality of first low-aggregated data obtained through aggregation statistics are stored in a low-level data processing unit. Further, the storage device stores the plurality of first low-aggregated data through the low-level data processing unit based on the row key in each first low-aggregated data.
- The specific implementation process may include: merging the first low-aggregated data with the same row key among the multiple first low-aggregated data to obtain multiple second low-aggregated data, and storing the multiple second low-aggregated data in the low-level data processing unit.
- For the first low-aggregated data, when the time level and the data attributes on which the aggregation is based are the same, and the time corresponding to the time level is within the same time range, the generated row keys are also the same.
- The first low-aggregated data having the same row key are merged to obtain multiple second low-aggregated data, so that low-aggregated data with the same row key is already combined when the multiple second low-aggregated data are stored in the low-level data processing unit. In this way, it is convenient for the user to subsequently query multiple pieces of data at the same time level and within the same time range at a time, avoiding the need to merge at the time of query, and improving the efficiency of data query.
- The above specific implementation of storing the plurality of second low-aggregated data in the low-level data processing unit may include: for each second low-aggregated data in the plurality of second low-aggregated data, querying whether the memory of the low-level data processing unit stores data with the same row key as that second low-aggregated data; and, when the memory of the low-level data processing unit stores such data, merging the queried data with that second low-aggregated data and storing the merged data in the memory of the low-level data processing unit.
- The embodiment of the present application first merges the low-aggregated data in the memory, that is, queries whether the memory of the low-level data processing unit stores data with the same row key as the second low-aggregated data. If it exists, the low-aggregated data with the same row key is merged directly in memory, and the merged data is stored in memory.
- When no such data is stored in the memory, data with the same row key as each second low-aggregated data is acquired, the acquired data is merged with each second low-aggregated data, and the merged data is stored in the memory of the low-level data processing unit.
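- When the number of data stored in the memory of the high-level data processing unit reaches a preset number threshold, the data in the memory of the high-level data processing unit is stored to the disk of the high-level data processing unit.
- Likewise, when the number of data stored in the memory of the low-level data processing unit reaches a preset number threshold, the data in the memory of the low-level data processing unit is stored to the disk of the low-level data processing unit.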
- the preset number threshold may be set by the user according to actual needs, or may be set by the storage device by default, which is not limited in the embodiment of the present application.
- The merged high-aggregated data is first stored in the memory of the high-level data processing unit, and the merged low-aggregated data is first stored in the memory of the low-level data processing unit. Only when the data stored in the memory of the high-level data processing unit reaches a certain value, or when the data stored in the memory of the low-level data processing unit reaches a certain value, is the data in that memory written to the disk, which reduces the number of interactions with the disk.
- When querying high-aggregated data or low-aggregated data, the query is first made from the memory, and only when the data is not found in the memory is the query made from the disk, which avoids frequent reading and writing of the disk and improves system performance.
- This storage method can also reduce the disk usage of the high-level data processing unit and of the low-level data processing unit.
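The memory-first storage behavior described above can be sketched as follows. This is a simplified model under stated assumptions: the class name `DataProcessingUnit` is hypothetical, two in-process dicts stand in for the memory and the disk (for example a distributed database), values are additive counts, and on a memory miss the existing value is assumed to be fetched from the disk before merging.

```python
from typing import Dict, Optional


class DataProcessingUnit:
    """Toy model of a data processing unit composed of a memory and a disk."""

    def __init__(self, flush_threshold: int) -> None:
        self.memory: Dict[str, int] = {}
        self.disk: Dict[str, int] = {}  # stand-in for the on-disk database
        self.flush_threshold = flush_threshold  # preset number threshold

    def store(self, row_key: str, value: int) -> None:
        if row_key in self.memory:
            # Same row key already in memory: merge directly in memory.
            self.memory[row_key] += value
        else:
            # Assumption: on a memory miss, merge with any value on disk.
            self.memory[row_key] = self.disk.get(row_key, 0) + value
        if len(self.memory) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Write all in-memory entries to disk in one pass, then clear memory.
        self.disk.update(self.memory)
        self.memory.clear()

    def query(self, row_key: str) -> Optional[int]:
        # Query memory first; fall back to disk only on a miss.
        if row_key in self.memory:
            return self.memory[row_key]
        return self.disk.get(row_key)
```

Batching writes this way means the disk is touched once per flush rather than once per aggregated entry, which is the reduction in disk interactions that the text describes.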
- steps 104 to 106 are used to realize the operation of classifying and storing the multiple aggregated data by multiple data processing units.
- The storage device can also delete data in the low-level data processing unit that does not belong to the target time interval, so that the storage space of the low-level data processing unit can be saved.
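Deleting low-level data that falls outside the target time interval can be sketched as follows. The storage layout (a dict mapping each row key to a `(day, value)` pair) and the function name `prune_low_level` are assumptions for illustration only.

```python
from datetime import date
from typing import Dict, Tuple


def prune_low_level(store: Dict[str, Tuple[date, int]],
                    interval: Tuple[date, date]) -> int:
    """Delete entries whose day is outside the interval; return how many."""
    left, right = interval
    # Collect keys first so the dict is not modified while iterating.
    expired = [key for key, (day, _) in store.items()
               if not (left <= day <= right)]
    for key in expired:
        del store[key]
    return len(expired)


low_store = {
    "k1": (date(2017, 7, 1), 10),
    "k2": (date(2017, 7, 10), 7),
}
removed = prune_low_level(low_store, (date(2017, 7, 2), date(2017, 7, 16)))
```

After the target time interval slides from [July 1, July 15] to [July 2, July 16], the entry dated July 1 no longer belongs to the interval and is removed.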
- The offset of the acquired data may be recorded. The offset is used to indicate the location in the data source of the currently acquired data.
- the next batch of data can be obtained according to the recorded offset. For example, if the data in the data source is numbered sequentially, and 5 pieces of data are acquired this time, the offset is 5, that is, the next piece of data will be acquired from the sixth piece of data.
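The offset bookkeeping can be sketched as follows, with a plain list standing in for a sequentially numbered data source. The function name `read_batch` and the batch size are hypothetical.

```python
from typing import List, Sequence, Tuple


def read_batch(source: Sequence[str], offset: int,
               batch_size: int) -> Tuple[List[str], int]:
    """Read up to batch_size items starting at offset; return (batch, new offset)."""
    batch = list(source[offset:offset + batch_size])
    # The new offset records how far into the source has been read.
    return batch, offset + len(batch)


source = [f"d{i}" for i in range(1, 9)]  # d1 .. d8
first_batch, offset = read_batch(source, 0, 5)
second_batch, offset = read_batch(source, offset, 5)
```

After the first call the recorded offset is 5, so the second call starts from the sixth piece of data, as in the example above.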
- multiple pieces of data carrying a time stamp are obtained from a data source, and the multiple pieces of data are classified and processed according to the time stamp of each piece of data to obtain multiple sets of data.
- Statistics are aggregated for each of the multiple sets of data, and the multiple aggregated data are then classified and stored through multiple data processing units composed of memory and disks, so that the aggregated data stored in each data processing unit is of the same type. In this way, in the subsequent data query, the query can be performed from the corresponding data processing unit based on the time stamp of the data to be queried, which improves the efficiency of data query.
- Fig. 3 is a schematic structural diagram of a data storage device according to an exemplary embodiment.
- the data storage device may be implemented by software, hardware, or a combination of both.
- the data storage device may include:
- the obtaining module 310 is used to obtain multiple pieces of data from a data source, and each piece of data carries a time stamp;
- the classification processing module 320 is configured to classify the multiple pieces of data according to the time stamp of each piece of data to obtain multiple sets of data;
- the aggregation statistics module 330 is configured to aggregate statistics on each group of the multiple groups of data to obtain multiple aggregated data;
- the classification storage module 340 is used to classify and store the plurality of aggregated data by a plurality of data processing units, wherein each data processing unit of the plurality of data processing units is composed of a memory and a disk, and the type of aggregated data stored in each data processing unit is the same.
- the classification processing module 320 is used to:
- the time interval is determined as the target time interval.
- the classification processing module 320 is used to:
- the updated time interval is determined as the target time interval.
- the classification processing module 320 is used to:
- the aggregation statistics module 330 is used to:
- When the target time interval uses day as the time granularity, aggregate statistics on the high-level data according to different time levels and data attributes, based on the three time granularities of year, month, and day, to obtain multiple first high-aggregated data; and aggregate statistics on the low-level data according to different time levels and data attributes, based on the six time granularities of year, month, day, hour, minute, and second, to obtain multiple second high-aggregated data and multiple first low-aggregated data.
- Different time levels include different dimensions of time granularity.
- the classification storage module 340 is used to:
- The row keys in each aggregated data are obtained. The row keys of each aggregated data are generated during aggregation statistics and are used to indicate the time level and data attributes corresponding to each aggregated data;
- The plurality of first high-aggregated data and the plurality of second high-aggregated data are stored by the high-level data processing unit, and, based on the row key in each first low-aggregated data, the plurality of first low-aggregated data are stored through the low-level data processing unit.
- the classification storage module 340 is used to:
- the acquired data is merged with each of the third high-aggregated data, and the merged data is stored in the memory of the high-level data processing unit.
- the classification storage module 340 is used to:
- For each second low-aggregated data in the plurality of second low-aggregated data, query whether data with the same row key as that second low-aggregated data is stored in the memory of the low-level data processing unit;
- The queried data is merged with each second low-aggregated data, and the merged data is stored in the memory of the low-level data processing unit.
- the classification storage module 340 is used to:
- the data having the same row key as each second low-aggregated data;
- the acquired data is merged with each of the second low-aggregated data, and the merged data is stored in the memory of the low-level data processing unit.
- the classification storage module 340 is used to:
- the data in the memory of the low-level data processing unit is stored to the disk of the low-level data processing unit.
- multiple pieces of data carrying a time stamp are obtained from a data source, and the multiple pieces of data are classified and processed according to the time stamp of each piece of data to obtain multiple sets of data.
- Statistics are aggregated for each of the multiple sets of data, and the multiple aggregated data are then classified and stored through multiple data processing units composed of memory and disks, so that the aggregated data stored in each data processing unit is of the same type. In this way, in the subsequent data query, the query can be performed from the corresponding data processing unit based on the time stamp of the data to be queried, which improves the efficiency of data query.
- the data storage device provided in the above embodiments is only exemplified by the division of the above functional modules.
- The above functions can be allocated to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
- the data storage device and the data storage method embodiment provided in the above embodiments belong to the same concept. For the specific implementation process, refer to the method embodiments, and details are not described here.
- Fig. 4 is a schematic structural diagram of a storage device according to an exemplary embodiment. Specifically:
- the storage device 400 includes a central processing unit (CPU) 401, a system memory 404 including a random access memory (RAM) 402 and a read only memory (ROM) 403, and a system bus 405 connecting the system memory 404 and the central processing unit 401.
- the storage device 400 also includes a basic input / output system (I / O system) 406 that helps transfer information between various devices in the computer, and a mass storage device 407 for storing the operating system 413, application programs 414, and other program modules 415.
- the basic input / output system 406 includes a display 408 for displaying information and an input device 409 for a user to input information, such as a mouse and a keyboard.
- the display 408 and the input device 409 are both connected to the central processing unit 401 through the input and output controller 410 connected to the system bus 405.
- the basic input / output system 406 may also include an input-output controller 410 for receiving and processing input from a number of other devices such as a keyboard, mouse, or electronic stylus.
- the input output controller 410 also provides output to a display screen, printer, or other type of output device.
- The mass storage device 407 is connected to the central processing unit 401 through a mass storage controller (not shown) connected to the system bus 405.
- The mass storage device 407 and its associated computer-readable medium provide non-volatile storage for the storage device 400. That is, the mass storage device 407 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
- Computer-readable media may include computer storage media and communication media.
- Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data.
- Computer storage media include random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other solid-state storage technologies, CD-ROM, DVD, or other optical storage, tape cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices.
- The computer storage medium is not limited to the types listed above.
- The above-mentioned system memory 404 and mass storage device 407 may be collectively referred to as memory.
- The storage device 400 may also be operated by a remote computer connected through a network such as the Internet. That is, the storage device 400 may be connected to the network 412 through the network interface unit 411 connected to the system bus 405, or the network interface unit 411 may be used to connect to other types of networks or remote computer systems (not shown).
- The memory also includes one or more programs.
- The one or more programs are stored in the memory and configured to be executed by the CPU.
- The one or more programs include instructions for performing the data storage method provided by the embodiments of the present application.
- An embodiment of the present application further provides a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal can execute the data storage method provided by the embodiment shown in FIG. 1.
- An embodiment of the present application also provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the data storage method provided by the embodiment shown in FIG. 1 described above.
- The program may be stored in a computer-readable storage medium.
- The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
Claims (26)
- A data storage method, characterized in that the method comprises: obtaining multiple pieces of data from a data source, each piece of data carrying a timestamp; classifying the multiple pieces of data according to the timestamp of each piece of data to obtain multiple groups of data; performing aggregate statistics on each group of data in the multiple groups of data to obtain multiple pieces of aggregated data; and classifying and storing the multiple pieces of aggregated data through multiple data processing units, wherein the aggregated data stored in each data processing unit is of the same type.
- The method according to claim 1, wherein classifying the multiple pieces of data according to the timestamp of each piece of data comprises: obtaining the latest time from the timestamps of the multiple pieces of data; determining a target time interval that contains the latest time and whose length is a preset threshold; and classifying the multiple pieces of data according to the timestamp of each piece of data and the target time interval.
- The method according to claim 2, wherein determining the target time interval that contains the latest time and whose length is the preset threshold comprises: when the latest time falls within a pre-stored time interval whose length is the preset threshold, determining the pre-stored time interval as the target time interval.
- The method according to claim 2, wherein determining the target time interval that contains the latest time and whose length is the preset threshold comprises: when the latest time is greater than the right value of a pre-stored time interval whose length is the preset threshold, determining the time difference between the latest time and the right value of the time interval; determining the time sum of the left value of the time interval and the time difference; updating the right value of the time interval to the latest time, and updating the left value of the time interval to the time sum; and determining the updated time interval as the target time interval.
- The method according to any one of claims 2 to 4, wherein classifying the multiple pieces of data according to the timestamp of each piece of data and the target time interval comprises: determining, as high-level data, the data among the multiple pieces of data whose timestamp indicates a time less than the left value of the target time interval; and determining, as low-level data, the data among the multiple pieces of data whose timestamp indicates a time within the target time interval.
- The method according to claim 5, wherein, when the time indicated by the timestamp of each piece of data includes the time granularities year, month, day, hour, minute, and second, and the target time interval uses day as its time granularity, performing aggregate statistics on each group of data in the multiple groups of data to obtain multiple pieces of aggregated data comprises: performing aggregate statistics on the high-level data based on the three time granularities year, month, and day, according to different time levels and data attributes, to obtain multiple pieces of first high-aggregated data; and performing aggregate statistics on the low-level data based on the six time granularities year, month, day, hour, minute, and second, according to different time levels and data attributes, to obtain multiple pieces of second high-aggregated data and multiple pieces of first low-aggregated data, wherein different time levels include time granularities of different dimensions.
- The method according to claim 6, wherein, when the multiple data processing units include a high-level data processing unit and a low-level data processing unit, classifying and storing the multiple pieces of aggregated data through the multiple data processing units comprises: obtaining the row key in each piece of aggregated data, wherein the row key of each piece of aggregated data is generated during aggregate statistics and indicates the time level and data attribute corresponding to that piece of aggregated data; storing the multiple pieces of first high-aggregated data and the multiple pieces of second high-aggregated data through the high-level data processing unit based on the row key in each piece of first high-aggregated data and each piece of second high-aggregated data; and storing the multiple pieces of first low-aggregated data through the low-level data processing unit based on the row key in each piece of first low-aggregated data.
- The method according to claim 7, wherein storing the multiple pieces of first high-aggregated data and the multiple pieces of second high-aggregated data through the high-level data processing unit based on the row key in each piece of first high-aggregated data and each piece of second high-aggregated data comprises: merging the high-aggregated data that share the same row key among the multiple pieces of first high-aggregated data and the multiple pieces of second high-aggregated data to obtain multiple pieces of third high-aggregated data; for each piece of third high-aggregated data among the multiple pieces of third high-aggregated data, querying whether the memory of the high-level data processing unit stores data having the same row key as that piece of third high-aggregated data; and when the memory of the high-level data processing unit stores data having the same row key as that piece of third high-aggregated data, merging the queried data with that piece of third high-aggregated data, and storing the merged data in the memory of the high-level data processing unit.
- The method according to claim 8, wherein, after querying whether the memory of the high-level data processing unit stores data having the same row key as each piece of third high-aggregated data, the method further comprises: when the memory of the high-level data processing unit does not store data having the same row key as that piece of third high-aggregated data, obtaining data having the same row key as that piece of third high-aggregated data from the disk of the high-level data processing unit; and merging the obtained data with that piece of third high-aggregated data, and storing the merged data in the memory of the high-level data processing unit.
- The method according to claim 7, wherein storing the multiple pieces of first low-aggregated data through the low-level data processing unit based on the row key in each piece of first low-aggregated data comprises: merging the first low-aggregated data that share the same row key among the multiple pieces of first low-aggregated data to obtain multiple pieces of second low-aggregated data; for each piece of second low-aggregated data among the multiple pieces of second low-aggregated data, querying whether the memory of the low-level data processing unit stores data having the same row key as that piece of second low-aggregated data; and when the memory of the low-level data processing unit stores data having the same row key as that piece of second low-aggregated data, merging the queried data with that piece of second low-aggregated data, and storing the merged data in the memory of the low-level data processing unit.
- The method according to claim 10, wherein, after querying whether the memory of the low-level data processing unit stores data having the same row key as each piece of second low-aggregated data, the method further comprises: when the memory of the low-level data processing unit does not store data having the same row key as that piece of second low-aggregated data, obtaining data having the same row key as that piece of second low-aggregated data from the disk of the low-level data processing unit; and merging the obtained data with that piece of second low-aggregated data, and storing the merged data in the memory of the low-level data processing unit.
- The method according to claim 7, wherein the method further comprises: when the amount of data in the memory of the high-level data processing unit reaches a preset quantity threshold, storing the data in the memory of the high-level data processing unit to the disk of the high-level data processing unit; or, when the amount of data in the memory of the low-level data processing unit reaches the preset quantity threshold, storing the data in the memory of the low-level data processing unit to the disk of the low-level data processing unit.
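The storage steps recited in claims 8 to 12 (merge by row key, look in memory first, fall back to disk, flush at a threshold) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: the class name `ProcessingUnit`, the use of plain dictionaries for memory and disk, addition as the merge operation, counting entries as the "amount of data", and clearing memory after a flush are all hypothetical choices that the claims do not specify.

```python
from collections import defaultdict


class ProcessingUnit:
    """Hypothetical stand-in for one data processing unit (memory plus disk)."""

    def __init__(self, flush_threshold):
        self.memory = {}  # row key -> aggregate currently held in memory
        self.disk = {}    # row key -> aggregate persisted on disk (a dict here)
        self.flush_threshold = flush_threshold

    def store(self, aggregates):
        # Merge incoming aggregates that share the same row key.
        # Assumption: aggregates are counts, so merging is addition.
        merged = defaultdict(int)
        for row_key, value in aggregates:
            merged[row_key] += value

        for row_key, value in merged.items():
            if row_key in self.memory:
                existing = self.memory[row_key]
            else:
                # Memory miss: fall back to the copy on disk, if any.
                existing = self.disk.get(row_key, 0)
            # The merged result is always written back to memory.
            self.memory[row_key] = existing + value

        # Flush once the number of in-memory entries reaches the threshold.
        if len(self.memory) >= self.flush_threshold:
            self.disk.update(self.memory)
            self.memory.clear()  # assumption: memory is emptied after a flush
```

Under these assumptions, one instance would play the role of the high-level data processing unit and another the low-level one, each receiving only the aggregated data of its own type.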
- A data storage apparatus, characterized in that the apparatus comprises: an acquisition module, configured to obtain multiple pieces of data from a data source, each piece of data carrying a timestamp; a classification processing module, configured to classify the multiple pieces of data according to the timestamp of each piece of data to obtain multiple groups of data; an aggregation statistics module, configured to perform aggregate statistics on each group of data in the multiple groups of data to obtain multiple pieces of aggregated data; and a classification storage module, configured to classify and store the multiple pieces of aggregated data through multiple data processing units, wherein the aggregated data stored in each data processing unit is of the same type.
- The apparatus according to claim 13, wherein the classification processing module is configured to: obtain the latest time from the timestamps of the multiple pieces of data; determine a target time interval that contains the latest time and whose length is a preset threshold; and classify the multiple pieces of data according to the timestamp of each piece of data and the target time interval.
- The apparatus according to claim 14, wherein the classification processing module is configured to: when the latest time falls within a pre-stored time interval whose length is the preset threshold, determine the pre-stored time interval as the target time interval.
- The apparatus according to claim 14, wherein the classification processing module is configured to: when the latest time is greater than the right value of a pre-stored time interval whose length is the preset threshold, determine the time difference between the latest time and the right value of the time interval; determine the time sum of the left value of the time interval and the time difference; update the right value of the time interval to the latest time, and update the left value of the time interval to the time sum; and determine the updated time interval as the target time interval.
- The apparatus according to any one of claims 14 to 16, wherein the classification processing module is configured to: determine, as high-level data, the data among the multiple pieces of data whose timestamp indicates a time less than the left value of the target time interval; and determine, as low-level data, the data among the multiple pieces of data whose timestamp indicates a time within the target time interval.
- The apparatus according to claim 17, wherein the aggregation statistics module is configured to: when the time indicated by the timestamp of each piece of data includes the time granularities year, month, day, hour, minute, and second, and the target time interval uses day as its time granularity, perform aggregate statistics on the high-level data based on the three time granularities year, month, and day, according to different time levels and data attributes, to obtain multiple pieces of first high-aggregated data; and perform aggregate statistics on the low-level data based on the six time granularities year, month, day, hour, minute, and second, according to different time levels and data attributes, to obtain multiple pieces of second high-aggregated data and multiple pieces of first low-aggregated data, wherein different time levels include time granularities of different dimensions.
- The apparatus according to claim 18, wherein the classification storage module is configured to: when the multiple data processing units include a high-level data processing unit and a low-level data processing unit, obtain the row key in each piece of aggregated data, wherein the row key of each piece of aggregated data is generated during aggregate statistics and indicates the time level and data attribute corresponding to that piece of aggregated data; store the multiple pieces of first high-aggregated data and the multiple pieces of second high-aggregated data through the high-level data processing unit based on the row key in each piece of first high-aggregated data and each piece of second high-aggregated data; and store the multiple pieces of first low-aggregated data through the low-level data processing unit based on the row key in each piece of first low-aggregated data.
- The apparatus according to claim 19, wherein the classification storage module is configured to: merge the high-aggregated data that share the same row key among the multiple pieces of first high-aggregated data and the multiple pieces of second high-aggregated data to obtain multiple pieces of third high-aggregated data; for each piece of third high-aggregated data among the multiple pieces of third high-aggregated data, query whether the memory of the high-level data processing unit stores data having the same row key as that piece of third high-aggregated data; and when the memory of the high-level data processing unit stores data having the same row key as that piece of third high-aggregated data, merge the queried data with that piece of third high-aggregated data, and store the merged data in the memory of the high-level data processing unit.
- The apparatus according to claim 20, wherein the classification storage module is configured to: when the memory of the high-level data processing unit does not store data having the same row key as each piece of third high-aggregated data, obtain data having the same row key as that piece of third high-aggregated data from the disk of the high-level data processing unit; and merge the obtained data with that piece of third high-aggregated data, and store the merged data in the memory of the high-level data processing unit.
- The apparatus according to claim 19, wherein the classification storage module is configured to: merge the first low-aggregated data that share the same row key among the multiple pieces of first low-aggregated data to obtain multiple pieces of second low-aggregated data; for each piece of second low-aggregated data among the multiple pieces of second low-aggregated data, query whether the memory of the low-level data processing unit stores data having the same row key as that piece of second low-aggregated data; and when the memory of the low-level data processing unit stores data having the same row key as that piece of second low-aggregated data, merge the queried data with that piece of second low-aggregated data, and store the merged data in the memory of the low-level data processing unit.
- The apparatus according to claim 22, wherein the classification storage module is configured to: when the memory of the low-level data processing unit does not store data having the same row key as each piece of second low-aggregated data, obtain data having the same row key as that piece of second low-aggregated data from the disk of the low-level data processing unit; and merge the obtained data with that piece of second low-aggregated data, and store the merged data in the memory of the low-level data processing unit.
- The apparatus according to claim 19, wherein the classification storage module is configured to: when the amount of data in the memory of the high-level data processing unit reaches a preset quantity threshold, store the data in the memory of the high-level data processing unit to the disk of the high-level data processing unit; or, when the amount of data in the memory of the low-level data processing unit reaches the preset quantity threshold, store the data in the memory of the low-level data processing unit to the disk of the low-level data processing unit.
- A computer-readable storage medium storing instructions, characterized in that, when the instructions are executed by a processor, the steps of the method according to any one of claims 1 to 12 are implemented.
- A storage device, characterized in that the storage device comprises a processor and a memory, wherein the memory is configured to store a computer program, and the processor is configured to execute the program stored in the memory to implement the steps of the method according to any one of claims 1 to 12.
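The time-interval handling recited in method claims 2 to 5 (and mirrored in apparatus claims 14 to 17) can be sketched as below. This is a minimal sketch under stated assumptions rather than the claimed implementation: timestamps are modelled as plain integers (for example epoch seconds), the interval is treated as closed on both ends, and the names `Interval`, `target_interval`, and `classify` are hypothetical. The claims do not say what happens when the latest time precedes the stored interval, so the sketch raises an error in that case.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    left: int   # left value of the time interval
    right: int  # right value of the time interval


def target_interval(stored, timestamps):
    """Return the target interval containing the latest timestamp."""
    latest = max(timestamps)
    if stored.left <= latest <= stored.right:
        # Latest time already falls inside the pre-stored interval.
        return stored
    if latest > stored.right:
        # Shift both ends by the same difference, so the length is preserved.
        diff = latest - stored.right
        return Interval(stored.left + diff, latest)
    raise ValueError("latest time precedes the stored interval (not covered by the claims)")


def classify(timestamps, target):
    """Split timestamps into high-level (older) and low-level (in-interval) data."""
    high = [t for t in timestamps if t < target.left]
    low = [t for t in timestamps if target.left <= t <= target.right]
    return high, low
```

For example, with a stored interval of 100 to 200 and timestamps 120, 160, and 250, the interval slides to 150 to 250; the record at 120 then falls to the left of the interval and is treated as high-level data, while 160 and 250 remain low-level data.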
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811204394.7 | 2018-10-16 | ||
CN201811204394.7A CN111061758B (en) | 2018-10-16 | 2018-10-16 | Data storage method, device and storage medium |
CN201811236196.9A CN111090705B (en) | 2018-10-23 | 2018-10-23 | Multidimensional data processing method, device and equipment and storage medium |
CN201811236196.9 | 2018-10-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020078395A1 (en) | 2020-04-23 |
Family
ID=70282908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/111510 WO2020078395A1 (en) | Data storage method and apparatus, and storage medium | 2018-10-16 | 2019-10-16 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2020078395A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107193839A (en) * | 2016-03-15 | 2017-09-22 | Alibaba Group Holding Limited | Data aggregation method and device |
CN107924345A (en) * | 2015-06-26 | 2018-04-17 | Amazon Technologies, Inc. | Data storage area for the polymerization measurement result of measurement |
US20180293280A1 (en) * | 2017-04-07 | 2018-10-11 | Salesforce.Com, Inc. | Time series database search system |
-
2019
- 2019-10-16 WO PCT/CN2019/111510 patent/WO2020078395A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107924345A (en) * | 2015-06-26 | 2018-04-17 | Amazon Technologies, Inc. | Data storage area for the polymerization measurement result of measurement |
CN107193839A (en) * | 2016-03-15 | 2017-09-22 | Alibaba Group Holding Limited | Data aggregation method and device |
US20180293280A1 (en) * | 2017-04-07 | 2018-10-11 | Salesforce.Com, Inc. | Time series database search system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110622152B (en) | Scalable database system for querying time series data | |
KR102627690B1 (en) | Dimensional context propagation techniques for optimizing SKB query plans | |
CN111061758B (en) | Data storage method, device and storage medium | |
US8108367B2 (en) | Constraints with hidden rows in a database | |
US8978034B1 (en) | System for dynamic batching at varying granularities using micro-batching to achieve both near real-time and batch processing characteristics | |
US8725730B2 (en) | Responding to a query in a data processing system | |
KR102522274B1 (en) | User grouping method, apparatus thereof, computer, computer-readable recording medium and computer program | |
CN107301214B (en) | Data migration method and device in HIVE and terminal equipment | |
US10127283B2 (en) | Projecting effect of in-flight streamed data on a relational database | |
US8924373B2 (en) | Query plans with parameter markers in place of object identifiers | |
US20240126817A1 (en) | Graph data query | |
US10296497B2 (en) | Storing a key value to a deleted row based on key range density | |
US11086694B2 (en) | Method and system for scalable complex event processing of event streams | |
US20180121448A1 (en) | Altering In-Flight Streamed Data from a Relational Database | |
US10025826B2 (en) | Querying in-flight streamed data from a relational database | |
CN110555038A (en) | Data processing system, method and device | |
US8396858B2 (en) | Adding entries to an index based on use of the index | |
US9380126B2 (en) | Data collection and distribution management | |
Ahsaan et al. | Big data analytics: challenges and technologies | |
US10366081B2 (en) | Declarative partitioning for data collection queries | |
US20210064592A1 (en) | Computer storage and retrieval mechanisms using distributed probabilistic counting | |
WO2020078395A1 (en) | Data storage method and apparatus, and storage medium | |
US8935200B2 (en) | Dynamic database dump | |
CN110737727A (en) | data processing method and system | |
US8392374B2 (en) | Displaying hidden rows in a database after an expiration date |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19872785 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19872785 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 031221) |