WO2020078395A1 - Data storage method and apparatus, and storage medium - Google Patents

Data storage method and apparatus, and storage medium

Info

Publication number
WO2020078395A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
low
aggregated
time
processing unit
Prior art date
Application number
PCT/CN2019/111510
Other languages
English (en)
Chinese (zh)
Inventor
曾锐
陈国栋
徐乾龙
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date
Filing date
Publication date
Priority claimed from CN201811204394.7A (CN111061758B)
Priority claimed from CN201811236196.9A (CN111090705B)
Application filed by 杭州海康威视数字技术股份有限公司
Publication of WO2020078395A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation

Definitions

  • Embodiments of the present application relate to the technical field of data processing, and in particular, to a data storage method, device, and storage medium.
  • data storage can be implemented through a data cube, where the data cube is a type of multi-dimensional matrix, that is, data of multiple dimensions can be stored.
  • an implementation manner of storing data through a data cube may include: the storage device obtains data to be stored, and performs aggregate statistical processing on the obtained data to obtain corresponding aggregated data. After that, the obtained aggregated data can be merged with the existing data in the data cube, and the merged data can be stored in the data cube.
  • Embodiments of the present application provide a data storage method, device, and storage medium, which can solve the problem that it takes a relatively long time to query data in the related art.
  • the technical solution is as follows:
  • a data storage method includes:
  • the multiple aggregated data is classified and stored by multiple data processing units, wherein the types of aggregated data stored in each data processing unit are the same.
  • a data storage device comprising:
  • an acquisition module, used to acquire multiple pieces of data from the data source, where each piece of data carries a time stamp;
  • a classification processing module, configured to classify the multiple pieces of data according to the time stamp of each piece of data to obtain multiple sets of data;
  • an aggregation statistics module, configured to aggregate statistics on each group of the multiple sets of data to obtain multiple aggregated data;
  • a classification storage module, used to classify and store the plurality of aggregated data by a plurality of data processing units, wherein the types of aggregated data stored in each data processing unit are the same.
  • a computer-readable storage medium on which instructions are stored, and when the instructions are executed by a processor, the data storage method according to the first aspect described above is implemented.
  • a computer program product containing instructions, which when executed on a computer, causes the computer to execute the data storage method described in the first aspect above.
  • a storage device includes a processor and a memory, wherein the memory is used to store a computer program, and the processor is used to execute the program stored in the memory to implement the data storage method described in the first aspect above.
  • Fig. 1 is a flowchart of a data storage method according to an exemplary embodiment
  • Fig. 2 is a schematic diagram of a data processing unit according to an exemplary embodiment
  • Fig. 3 is a schematic structural diagram of a data storage device according to an exemplary embodiment
  • Fig. 4 is a schematic structural diagram of a storage device according to an exemplary embodiment.
  • Spark Streaming: a computing engine that processes data in batches. Its basic principle is to batch process input data at a fixed time interval. When the batch interval is shortened to the order of seconds, it can be used to process real-time data streams. It supports obtaining data from multiple data sources.
  • Data sources: these can include Kafka, Flume, Twitter, ZeroMQ, and Kinesis data sources, as well as TCP (Transmission Control Protocol) socket data sources.
  • Data cube: a kind of multi-dimensional matrix that can be used for data analysis and indexing, and that supports real-time indexing of metadata with any number of keywords.
  • the data cube may be composed of memory and disk (distributed database) to implement multi-dimensional data storage based on the memory and disk.
  • the related technical field proposes to store data through data cubes.
  • the data is generally stored in a distributed database of the data cube, for example, the distributed database is HBase.
  • the embodiments of the present application provide a data storage method, which can solve the above-mentioned problems.
  • for the specific implementation, please refer to the embodiment shown in FIG. 1 below.
  • the data storage method provided by the embodiments of the present application may be executed by a storage device, and the storage device includes multiple data processing units to store data through the multiple data processing units.
  • each data processing unit in the plurality of data processing units is composed of a memory and a disk.
  • the data processing unit may be the aforementioned data cube.
  • the storage device may also include Spark Streaming to obtain data from the data source through the Spark Streaming.
  • Fig. 1 is a flowchart of a data storage method according to an exemplary embodiment.
  • the data storage method is implemented by using the above storage device as an example for illustration.
  • the data storage method may include the following implementation steps:
  • Step 101 Obtain multiple pieces of data from a data source, and each piece of data carries a time stamp.
  • the storage device may obtain the multiple pieces of data from the data source through Spark Streaming.
  • for example, when the data source is a Kafka data source, multiple pieces of data may be read from the Kafka data source through Spark Streaming.
  • Each piece of data carries a time stamp, which indicates the generation time of that piece of data.
  • Step 102 Classify the multiple pieces of data according to the time stamp of each piece of data to obtain multiple sets of data.
  • the storage device classifies the multiple pieces of data according to the time stamp of each piece of data.
  • the specific implementation may include the following implementation steps:
  • the multiple pieces of data may be classified into two data types, recent data and old data: pieces that belong to the recent data are placed in one category, and pieces that belong to the old data are placed in another. For this purpose, a recent time range needs to be determined.
  • to that end, the latest time is obtained from the time stamps of the multiple pieces of data.
  • for example, the pieces of data include first data, second data, third data, and fourth data. If the time indicated by the timestamp of the first data is June 25, 2017, the time indicated by the timestamp of the second data is June 29, 2017, the time indicated by the timestamp of the third data is July 2, 2017, and the time indicated by the timestamp of the fourth data is July 5, 2017, then the latest time obtained by the storage device is July 5, 2017.
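  • As an illustration only, selecting the latest time from a batch can be sketched in Python as follows. The `records` list and the `latest_time` helper are hypothetical names that do not appear in the application; the dates mirror the four-item example above.

```python
from datetime import date

# Hypothetical batch of (payload, timestamp) pairs mirroring the example above.
records = [
    ("first", date(2017, 6, 25)),
    ("second", date(2017, 6, 29)),
    ("third", date(2017, 7, 2)),
    ("fourth", date(2017, 7, 5)),
]


def latest_time(batch):
    """Return the most recent timestamp carried by any piece of data in the batch."""
    return max(ts for _, ts in batch)
```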
  • the specific implementation of determining a target time interval that includes the latest time and whose interval length is a preset threshold may include the following possible implementations:
  • First implementation manner: when the latest time falls within a pre-stored time interval whose interval length is the preset threshold, that time interval is determined as the target time interval.
  • the preset threshold may be set by the user according to actual needs, or may be set by the storage device by default, which is not limited in this embodiment of the present application.
  • the preset threshold may be 30 days.
  • the pre-stored time interval is a recent time range relative to the multiple pieces of data acquired in a batch, and the target time interval is equivalent to the above-mentioned recent time range.
  • Second implementation manner: when the latest time is greater than the right value of the pre-stored time interval whose interval length is the preset threshold, determine the time difference between the latest time and the right value of the time interval, and determine the time sum of the left value of the time interval and the time difference. Then update the right value of the time interval to the latest time and the left value of the time interval to the time sum, and determine the updated time interval as the target time interval.
  • the pre-stored time interval needs to be updated to re-determine the target time interval.
  • it is equivalent to sliding the time interval to the right by a certain length of time, namely the difference between the latest time and the right value of the time interval. For example, if the pre-stored time interval is [July 1, July 15] and the latest time is July 16, the target time interval can be determined as [July 2, July 16].
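  • A minimal sketch of this sliding update, assuming the interval bounds are calendar dates, is given below. The function name `slide_interval` is hypothetical, and the year 2017 is assumed only so that the July example can be written as concrete dates.

```python
from datetime import date


def slide_interval(left, right, latest):
    """Slide [left, right] to the right so that it ends at `latest`.

    Both bounds move by the same amount (latest - right), so the interval
    length is preserved. If `latest` is not beyond `right`, the interval
    is returned unchanged.
    """
    if latest <= right:
        return left, right
    shift = latest - right       # the "time difference"
    return left + shift, latest  # the "time sum" and the new right value


# Example from the text, with an assumed year of 2017.
target = slide_interval(date(2017, 7, 1), date(2017, 7, 15), date(2017, 7, 16))
```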
  • the pre-stored time interval may be updated to the target time interval.
  • the target time interval may also be determined in other ways. For example, further calculation may be performed on the basis of the time interval determined in the first implementation manner, and the result used as the target time interval, such as adding a fixed value to the left and right values of the time interval, where the fixed value can be set according to actual needs. As another example, further calculation may be performed on the basis of the updated time interval determined in the second implementation manner to obtain the target time interval, such as adding a fixed value to the left and right values of the updated time interval respectively. The numerical value, the target time interval, and so on are not limited in the embodiment of the present application.
  • the storage device may also query whether the time interval exists.
  • the target time interval is determined according to the above two implementation methods.
  • the storage device may generate the target time interval based on the latest time and the length of the interval. For example, the difference between the latest time and a preset threshold may be determined, and then the latest time is determined as the right value of the target time interval, and the determined difference value is determined as the left value of the target time interval.
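  • The generation of a target time interval from the latest time and the interval length can be sketched as follows. The function name `initial_interval` is hypothetical, and the default of 30 days is taken from the example threshold mentioned earlier.

```python
from datetime import date, timedelta


def initial_interval(latest, threshold_days=30):
    """Build a target interval whose right value is `latest` and whose
    left value is `latest` minus the preset threshold."""
    left = latest - timedelta(days=threshold_days)
    return left, latest
```

  • With the latest time of July 5, 2017 from the earlier example and a 30-day threshold, this yields the interval [June 5, 2017, July 5, 2017].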
  • 1023 Classify the multiple pieces of data according to the time stamp of each piece of data and the target time interval.
  • the pieces of data are classified according to the time stamp of each piece of data and the determined target time interval.
  • among the multiple pieces of data, the data whose time indicated by the time stamp is less than the left value of the target time interval is determined as high-level data, and the data whose time indicated by the time stamp is within the target time interval is determined as low-level data.
  • a piece of data whose time stamp is less than the left value precedes the target time interval, so it can be considered old data; this type of data is classified here as high-level data.
  • a piece of data whose time stamp falls within the target time interval may be regarded as recent data, and this type of data is classified here as low-level data. In this way, two groups of data are obtained after the data classification processing.
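  • The classification into high-level and low-level data can be sketched as below. The `classify` function, the record layout, and the example interval are hypothetical and serve only to illustrate the comparison against the left value of the target time interval.

```python
from datetime import date


def classify(batch, left, right):
    """Split (payload, timestamp) pairs into high-level (old) and low-level (recent) data.

    Timestamps earlier than `left` are high-level; timestamps inside
    [left, right] are low-level. Because `right` is the latest time in the
    batch, no timestamp is expected to exceed it; such items are ignored here.
    """
    high, low = [], []
    for payload, ts in batch:
        if ts < left:
            high.append((payload, ts))
        elif ts <= right:
            low.append((payload, ts))
    return high, low


batch = [
    ("first", date(2017, 6, 25)),
    ("second", date(2017, 6, 29)),
    ("third", date(2017, 7, 2)),
    ("fourth", date(2017, 7, 5)),
]
# Assumed target interval, chosen only for this illustration.
high, low = classify(batch, date(2017, 6, 28), date(2017, 7, 5))
```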
  • Step 103 Aggregate statistics for each set of data in the multiple sets of data to obtain multiple aggregated data.
  • the two sets of high-level data and low-level data obtained above need to be aggregated and counted.
  • when the target time interval uses day as the time granularity: based on the three time granularities of year, month, and day, aggregate statistics on the high-level data according to different time levels and data attributes to obtain multiple first high-aggregated data; and based on the six time granularities of year, month, day, hour, minute, and second, aggregate statistics on the low-level data according to different time levels and data attributes to obtain multiple second high-aggregated data and multiple first low-aggregated data.
  • different time levels include different dimensions of time granularity.
  • here, a one-dimensional data attribute is taken as an example.
  • the storage device aggregates statistics according to different time levels and data attributes based on the three time granularities of year, month, and day.
  • the different time levels include a first time level, a second time level, and a third time level.
  • the first time level includes a time granularity of year
  • the second time level includes a time granularity of year and month.
  • the third time level includes three time granularities: year, month, and day.
  • the storage device aggregates statistics on each piece of data according to the first time level and data attributes to obtain the first high-aggregated data corresponding to the first time level; aggregates statistics on each piece of data according to the second time level and data attributes to obtain the first high-aggregated data corresponding to the second time level; and aggregates statistics on each piece of data according to the third time level and data attributes to obtain the first high-aggregated data corresponding to the third time level.
  • the storage device aggregates statistics according to different time levels and data attributes according to six time granularities of year, month, day, hour, minute, and second.
  • the different time levels include not only the first time level, the second time level, and the third time level, but also the fourth time level, the fifth time level, and the sixth time level.
  • the fourth time level includes the four time granularities of year, month, day, and hour; the fifth time level includes the five time granularities of year, month, day, hour, and minute; and the sixth time level includes the six time granularities of year, month, day, hour, minute, and second.
  • the storage device aggregates statistics on each piece of data according to the first time level and data attributes to obtain second high-aggregated data corresponding to the first time level;
  • and, according to the sixth time level and data attributes, aggregates statistics on each piece of data to obtain the first low-aggregated data corresponding to the sixth time level.
  • the above takes a one-dimensional data attribute as an example for illustration.
  • the storage device combines two dimensions of data attributes for aggregation statistics, and based on the third time level, combines two dimensions of data attributes for aggregation statistics. In this way, 12 first high-aggregated data are obtained.
  • Aggregation is performed over data attributes and data attribute values; for example, when the data attribute is age, the data attribute value may be an age value.
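  • A rough sketch of aggregating at several time levels is shown below. It assumes that the aggregate statistic is a simple count per attribute value, which the application does not specify; the names `time_prefix` and `aggregate` and the record layout are likewise hypothetical.

```python
from collections import Counter
from datetime import datetime


def time_prefix(ts, level):
    """Time level n keeps the first n fields of (year, month, day, hour, minute, second)."""
    fields = (ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second)
    return fields[:level]


def aggregate(records, levels, attribute):
    """Count records per (time level, time prefix, attribute name, attribute value)."""
    counts = Counter()
    for rec in records:
        for level in levels:
            key = (level, time_prefix(rec["ts"], level), attribute, rec[attribute])
            counts[key] += 1
    return counts


# Hypothetical high-level (old) data, aggregated at time levels 1 to 3.
old = [
    {"ts": datetime(2017, 6, 1, 8, 0, 0), "age": 30},
    {"ts": datetime(2017, 6, 20, 9, 30, 0), "age": 30},
]
first_high = aggregate(old, levels=(1, 2, 3), attribute="age")
```

  • Under the same assumptions, low-level data would be aggregated with `levels=(1, 2, 3, 4, 5, 6)`; the results at levels 1 to 3 correspond to the second high-aggregated data and those at levels 4 to 6 to the first low-aggregated data.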
  • Step 104 When the multiple data processing units include a high-level data processing unit and a low-level data processing unit, obtain the row key in each aggregated data. The row key of each aggregated data is generated during aggregation statistics and is used to indicate the time level and data attributes corresponding to each aggregated data.
  • each data processing unit of the plurality of data processing units is composed of a memory and a disk, and the type of aggregated data stored in each data processing unit is the same.
  • the multiple data processing units include a high-level data processing unit and a low-level data processing unit, please refer to FIG. 2, which is a schematic diagram of a data processing unit according to an exemplary embodiment.
  • the storage device obtains the row keys generated during the aggregation statistics process.
  • the generated row keys are also the same.
  • for example, if the first data is aggregated based on July 2017 and a certain data attribute, and the second data is also aggregated based on July 2017 and the same data attribute, the row keys of the two aggregated data obtained after the aggregation statistics are the same.
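  • To illustrate why two aggregations over the same time range and data attribute share a row key, the sketch below builds a key from the time level, the truncated time, and the attribute name. The key layout and the `row_key` function are assumptions made for this example; the application does not define the row key format.

```python
from datetime import datetime


def row_key(ts, level, attribute):
    """Build a hypothetical row key encoding the time level, the time truncated
    to that level, and the data attribute name."""
    fields = (ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second)
    widths = (4, 2, 2, 2, 2, 2)
    time_part = "".join(f"{v:0{w}d}" for v, w in zip(fields[:level], widths))
    return f"L{level}|{time_part}|{attribute}"


# Two records from July 2017, aggregated at the second time level (year and month).
key_a = row_key(datetime(2017, 7, 2), 2, "age")
key_b = row_key(datetime(2017, 7, 5), 2, "age")
```

  • Both keys are identical here, so the corresponding aggregated data can later be merged under one key.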
  • Step 105 Based on the row key in each first high-aggregated data and each second high-aggregated data, the plurality of first high-aggregated data and the plurality of second high-aggregated data are stored by the high-level data processing unit.
  • the plurality of first high-aggregated data and the plurality of second high-aggregated data are stored in the high-level data processing unit, that is, the high-aggregated data and the part of the high-aggregated data obtained by aggregate statistics on the low-level data are stored in the same data processing unit.
  • the specific implementation of storing the multiple first high-aggregated data and the multiple second high-aggregated data may include: merging the high-aggregated data with the same row key among the multiple first high-aggregated data and the multiple second high-aggregated data to obtain multiple third high-aggregated data, and storing the multiple third high-aggregated data in the high-level data processing unit.
  • the data when storing highly aggregated data in the high-level data processing unit, the data is not directly merged with the data in the high-level data processing unit, but merged only when certain conditions are met.
  • the high-aggregated data with the same row key are merged to obtain multiple third high-aggregated data, so that when the multiple third high-aggregated data are stored in the high-level data processing unit, high-aggregated data with the same row key can be merged. In this way, it is convenient for the user to subsequently query multiple pieces of data at the same time level and within the same time range at a time, avoiding the need to merge at the time of query, and improving the efficiency of data query.
  • a specific implementation of storing the plurality of third high-aggregated data in the high-level data processing unit may include: for each third high-aggregated data in the plurality of third high-aggregated data, querying whether the memory of the high-level data processing unit stores data with the same row key as that third high-aggregated data; when it does, merging the queried data with the third high-aggregated data and storing the merged data in the memory of the high-level data processing unit.
  • the embodiment of the present application first merges the high-aggregated data in the memory, that is, it queries whether the memory of the high-level data processing unit stores data with the same row key as the third high-aggregated data. If so, the high-aggregated data with the same row key are merged directly in memory, and the merged high-aggregated data is stored in memory.
  • data with the same row key as each third high-aggregated data is acquired, the acquired data is merged with each third high-aggregated data, and the merged data is stored in the memory of the high-level data processing unit.
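  • The merge-then-store flow can be sketched as follows, under several assumptions that are not stated in the application: aggregated values are counts that merge by addition, memory and disk are modelled as plain dictionaries, and when no entry with the same row key is found in memory the disk is consulted. The names `merge_batch` and `store` are hypothetical.

```python
def merge_batch(aggregates):
    """Merge (row_key, count) pairs that share a row key within one batch."""
    merged = {}
    for key, count in aggregates:
        merged[key] = merged.get(key, 0) + count
    return merged


def store(merged, memory, disk):
    """Merge each batch entry with existing data and keep the result in memory.

    Memory is checked first; the disk lookup on a memory miss is an assumption
    made for this sketch. Entries found nowhere are stored as new in memory.
    """
    for key, count in merged.items():
        if key in memory:
            memory[key] += count
        elif key in disk:
            memory[key] = disk[key] + count
        else:
            memory[key] = count


memory = {"a": 1}
disk = {"b": 10}
store(merge_batch([("a", 2), ("a", 3), ("b", 1), ("c", 4)]), memory, disk)
```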
  • Step 106 Based on the row key in each first low-aggregated data, the multiple first low-aggregated data is stored by the low-level data processing unit.
  • a plurality of first low-aggregated data obtained through aggregation statistics are stored in a low-level data processing unit. Further, the storage device stores the plurality of first low-aggregated data through the low-level data processing unit based on the row key in each first low-aggregated data.
  • the specific implementation process may include: merging the first low-aggregated data with the same row key among the multiple first low-aggregated data to obtain multiple second low-aggregated data, and storing the multiple second low-aggregated data in the low-level data processing unit.
  • for the first low-aggregated data, when the time level and the data attribute on which the aggregation is based are the same, and the time corresponding to the time level is within the same time range, the generated row keys are also the same.
  • the first low-aggregated data having the same row key are merged to obtain multiple second low-aggregated data, so that when the multiple second low-aggregated data are stored in the low-level data processing unit, low-aggregated data with the same row key can be merged. In this way, it is convenient for the user to subsequently query multiple pieces of data at the same time level and within the same time range at a time, avoiding the need to merge at the time of query, and improving the efficiency of data query.
  • the above specific implementation of storing the plurality of second low-aggregated data in the low-level data processing unit may include: for each second low-aggregated data in the plurality of second low-aggregated data, querying whether the memory of the low-level data processing unit stores data with the same row key as that second low-aggregated data; when it does, merging the queried data with the second low-aggregated data and storing the merged data in the memory of the low-level data processing unit.
  • the embodiment of the present application first merges the low-aggregated data in the memory, that is, it queries whether the memory of the low-level data processing unit stores data with the same row key as the second low-aggregated data. If so, the low-aggregated data with the same row key are merged directly in memory, and the merged data is stored in memory.
  • data with the same row key as each second low-aggregated data is acquired; the acquired data is merged with each of the second low-aggregated data, and the merged data is stored in the memory of the low-level data processing unit.
  • the data in the memory of the high-level data processing unit is stored to the disk of the high-level data processing unit.
  • the data in the memory of the low-level data processing unit is stored to the disk of the low-level data processing unit.
  • the preset number threshold may be set by the user according to actual needs, or may be set by the storage device by default, which is not limited in the embodiment of the present application.
  • the merged high-aggregated data is first stored in the memory of the high-level data processing unit, and the merged low-aggregated data is first stored in the memory of the low-level data processing unit. Only when the data stored in the memory of the high-level data processing unit reaches a certain value, or when the data stored in the memory of the low-level data processing unit reaches a certain value, is the data in that memory written to the disk, which reduces the number of interactions with the disk.
  • when querying high-aggregated data or low-aggregated data, the memory is queried first, and the disk is queried only when the data is not found in the memory, which avoids frequent reads and writes to the disk and improves system performance.
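  • A toy model of one data processing unit, with a memory layer that is flushed to disk once a preset number of entries is reached and a memory-first lookup, might look like the sketch below. The `ProcessingUnit` class, its method names, and the use of an entry count as the threshold are assumptions for illustration only.

```python
class ProcessingUnit:
    """Toy data processing unit: a memory dict in front of a disk dict."""

    def __init__(self, flush_threshold):
        self.memory = {}
        self.disk = {}
        self.flush_threshold = flush_threshold  # assumed to be a number of entries

    def put(self, key, value):
        """Store in memory first; write to disk only once the threshold is reached."""
        self.memory[key] = value
        if len(self.memory) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write all in-memory entries to disk in one pass, then clear memory."""
        self.disk.update(self.memory)
        self.memory.clear()

    def get(self, key):
        """Query memory first and fall back to disk only on a miss."""
        if key in self.memory:
            return self.memory[key]
        return self.disk.get(key)


unit = ProcessingUnit(flush_threshold=2)
unit.put("a", 1)
in_memory_before_flush = dict(unit.memory)
unit.put("b", 2)
```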
  • the use of this storage method can also reduce the use of disks by high-level data processing units and disks by low-level data processing units.
  • steps 104 to 106 are used to realize the operation of classifying and storing the multiple aggregated data by multiple data processing units.
  • the storage device can also delete data in the low-level data processing unit that does not belong to the target time interval, so as to save storage space in the low-level data processing unit.
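  • The clean-up of the low-level data processing unit can be sketched as follows, assuming each stored entry records the day it belongs to. The `evict_outside` function and the storage layout (row key mapped to a `(day, value)` pair) are hypothetical.

```python
from datetime import date


def evict_outside(low_level_store, left, right):
    """Delete entries whose day lies outside [left, right]; return the removed keys."""
    expired = [
        key for key, (day, _) in low_level_store.items()
        if not (left <= day <= right)
    ]
    for key in expired:
        del low_level_store[key]
    return expired


low_level_store = {
    "k1": (date(2017, 6, 25), 3),
    "k2": (date(2017, 7, 2), 5),
}
removed = evict_outside(low_level_store, date(2017, 6, 28), date(2017, 7, 5))
```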
  • the offset of the acquired data may be recorded. The offset is used to indicate the location of the currently acquired data in the data source.
  • the next batch of data can be obtained according to the recorded offset. For example, if the data in the data source is numbered sequentially, and 5 pieces of data are acquired this time, the offset is 5, that is, the next piece of data will be acquired from the sixth piece of data.
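  • The offset bookkeeping can be sketched with an in-memory list standing in for the data source. The `fetch_batch` function is hypothetical and does not represent the API of any particular data source.

```python
def fetch_batch(source, offset, batch_size):
    """Read up to `batch_size` items starting at `offset`; return the batch
    together with the offset to record for the next read."""
    batch = source[offset:offset + batch_size]
    return batch, offset + len(batch)


source = list(range(1, 11))  # pieces of data numbered 1 to 10
first_batch, offset = fetch_batch(source, 0, 5)
second_batch, next_offset = fetch_batch(source, offset, 5)
```

  • After the first call the recorded offset is 5, so the second call starts from the sixth piece of data, matching the example above.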
  • multiple pieces of data carrying a time stamp are obtained from a data source, and the multiple pieces of data are classified and processed according to the time stamp of each piece of data to obtain multiple sets of data.
  • Aggregate statistics are computed for each of the multiple sets of data, and then, through multiple data processing units composed of memory and disks, the multiple aggregated data are classified and stored so that the aggregated data stored in each data processing unit are of the same type. In this way, in the subsequent data query, the query can be performed from the corresponding data processing unit based on the time stamp of the data to be queried, which improves the efficiency of data query.
  • Fig. 3 is a schematic structural diagram of a data storage device according to an exemplary embodiment.
  • the data storage device may be implemented by software, hardware, or a combination of both.
  • the data storage device may include:
  • the obtaining module 310 is used to obtain multiple pieces of data from a data source, and each piece of data carries a time stamp;
  • the classification processing module 320 is configured to classify the multiple pieces of data according to the time stamp of each piece of data to obtain multiple sets of data;
  • the aggregation statistics module 330 is configured to aggregate statistics on each group of the multiple groups of data to obtain multiple aggregated data;
  • the classification storage module 340 is used to classify and store the plurality of aggregated data by a plurality of data processing units, wherein each data processing unit of the plurality of data processing units is composed of a memory and a disk, and the type of aggregated data stored in each data processing unit is the same.
  • the classification processing module 320 is used to:
  • the classification processing module 320 is used to:
  • the time interval is determined as the target time interval.
  • the classification processing module 320 is used to:
  • the updated time interval is determined as the target time interval.
  • the classification processing module 320 is used to:
  • the aggregation statistics module 330 is used to:
  • when the target time interval uses day as the time granularity: based on the three time granularities of year, month, and day, aggregate statistics on the high-level data according to different time levels and data attributes to obtain multiple first high-aggregated data; and based on the six time granularities of year, month, day, hour, minute, and second, aggregate statistics on the low-level data according to different time levels and data attributes to obtain multiple second high-aggregated data and multiple first low-aggregated data.
  • Different time levels include different dimensions of time granularity.
  • the classification storage module 340 is used to:
  • the row key in each aggregated data is obtained; the row key of each aggregated data is generated during aggregation statistics and is used to indicate the time level and data attributes corresponding to each aggregated data;
  • the plurality of first high-aggregated data and the plurality of second high-aggregated data are stored by the high-level data processing unit, and, based on the row key in each first low-aggregated data, the plurality of first low-aggregated data are stored by the low-level data processing unit.
  • the classification storage module 340 is used to:
  • the classification storage module 340 is used to:
  • the acquired data is merged with each of the third high-aggregated data, and the merged data is stored in the memory of the high-level data processing unit.
  • the classification storage module 340 is used to:
  • for each second low-aggregated data in the plurality of second low-aggregated data, query whether data with the same row key as that second low-aggregated data is stored in the memory of the low-level data processing unit;
  • the queried data is merged with each second low-aggregated data, and the merged data is stored in the memory of the low-level data processing unit.
  • the classification storage module 340 is used to:
  • data with the same row key as each second low-aggregated data is acquired;
  • the acquired data is merged with each of the second low-aggregated data, and the merged data is stored in the memory of the low-level data processing unit.
  • the classification storage module 340 is used to:
  • the data in the memory of the low-level data processing unit is stored to the disk of the low-level data processing unit.
  • multiple pieces of data carrying a time stamp are obtained from a data source, and the multiple pieces of data are classified and processed according to the time stamp of each piece of data to obtain multiple sets of data.
  • Aggregate statistics are computed for each of the multiple sets of data, and then, through multiple data processing units composed of memory and disks, the multiple aggregated data are classified and stored so that the aggregated data stored in each data processing unit are of the same type. In this way, in the subsequent data query, the query can be performed from the corresponding data processing unit based on the time stamp of the data to be queried, which improves the efficiency of data query.
  • the data storage device provided in the above embodiments is only exemplified by the division of the above functional modules.
  • the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the data storage device and the data storage method embodiment provided in the above embodiments belong to the same concept. For the specific implementation process, refer to the method embodiments, and details are not described here.
  • Fig. 4 is a schematic structural diagram of a storage device according to an exemplary embodiment. Specifically:
  • the storage device 400 includes a central processing unit (CPU) 401, a system memory 404 including a random access memory (RAM) 402 and a read only memory (ROM) 403, and a system bus 405 connecting the system memory 404 and the central processing unit 401.
  • the storage device 400 also includes a basic input/output system (I/O system) 406 that helps transfer information between various devices in the computer, and a mass storage device 407 for storing the operating system 413, application programs 414, and other program modules 415.
  • the basic input / output system 406 includes a display 408 for displaying information and an input device 409 for a user to input information, such as a mouse and a keyboard.
  • the display 408 and the input device 409 are both connected to the central processing unit 401 through the input and output controller 410 connected to the system bus 405.
  • the basic input / output system 406 may also include an input-output controller 410 for receiving and processing input from a number of other devices such as a keyboard, mouse, or electronic stylus.
  • the input output controller 410 also provides output to a display screen, printer, or other type of output device.
  • the mass storage device 407 is connected to the central processing unit 401 through a mass storage controller (not shown) connected to the system bus 405.
  • the mass storage device 407 and its associated computer-readable medium provide non-volatile storage for the storage device 400. That is, the mass storage device 407 may include a computer-readable medium (not shown) such as a hard disk or CD-ROM drive.
  • Computer-readable media may include computer storage media and communication media.
  • Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media include RAM, ROM, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory, or other solid-state storage technologies; CD-ROM, DVD, or other optical storage; and tape cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices.
  • The computer storage medium is not limited to the above.
  • The above-mentioned system memory 404 and mass storage device 407 may be collectively referred to as a memory.
  • The storage device 400 may also operate through a remote computer connected to a network such as the Internet. That is, the storage device 400 may be connected to the network 412 through the network interface unit 411 connected to the system bus 405; the network interface unit 411 may also be used to connect to other types of networks or remote computer systems (not shown).
  • The above memory also includes one or more programs.
  • The one or more programs are stored in the memory and configured to be executed by the CPU.
  • The one or more programs include instructions for performing the data storage method provided by the embodiments of the present application.
  • An embodiment of the present application further provides a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal can execute the data storage method provided by the embodiment shown in FIG. 1.
  • An embodiment of the present application also provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the data storage method provided by the embodiment shown in FIG. 1 described above.
  • A program implementing the above embodiments may be stored in a computer-readable storage medium.
  • The storage medium mentioned may be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a data storage method and apparatus, and a storage medium, belonging to the technical field of data processing. The method comprises: acquiring multiple pieces of data from a data source, each piece of data carrying a time stamp; classifying the multiple pieces of data according to the time stamp of each piece of data to obtain multiple groups of data; performing aggregate statistics on each group of data in the multiple groups to obtain multiple pieces of aggregated data; and classifying and storing the multiple pieces of aggregated data by means of multiple data processing units, each of which consists of memory and a magnetic disk, the aggregated data stored in each data processing unit being of the same type. In a subsequent data query, the query can thus be performed in the corresponding data processing unit according to the time stamp of the data to be queried, which improves data query efficiency.
PCT/CN2019/111510 2018-10-16 2019-10-16 Data storage method and apparatus, and storage medium WO2020078395A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201811204394.7A CN111061758B (zh) 2018-10-16 2018-10-16 Data storage method, device, and storage medium
CN201811204394.7 2018-10-16
CN201811236196.9A CN111090705B (zh) 2018-10-23 2018-10-23 Multi-dimensional data processing method, apparatus and device, and storage medium
CN201811236196.9 2018-10-23

Publications (1)

Publication Number Publication Date
WO2020078395A1 (fr) 2020-04-23

Family

ID=70282908

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/111510 WO2020078395A1 (fr) 2018-10-16 2019-10-16 Data storage method and apparatus, and storage medium

Country Status (1)

Country Link
WO (1) WO2020078395A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193839A (zh) * 2016-03-15 2017-09-22 Alibaba Group Holding Limited Data aggregation method and device
CN107924345A (zh) * 2015-06-26 2018-04-17 Amazon Technologies, Inc. Data store for aggregated measurements for metrics
US20180293280A1 (en) * 2017-04-07 2018-10-11 Salesforce.Com, Inc. Time series database search system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19872785

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19872785

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 031221)