CN111881135A - Data aggregation method, device, equipment and computer readable storage medium - Google Patents

Publication number
CN111881135A
Authority
CN
China
Prior art keywords
data
aggregated
aggregation
storage pool
reading
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010740268.4A
Other languages
Chinese (zh)
Inventor
孙业宽
孟祥瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010740268.4A
Publication of CN111881135A
Current legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management

Abstract

The application discloses a data aggregation method, apparatus, device, and computer-readable storage medium. The method comprises: acquiring initial data and writing the initial data into a source storage pool; reading a plurality of data to be aggregated from the source storage pool; aggregating the data to be aggregated to obtain aggregated data; and writing the aggregated data into a destination storage pool and updating the metadata corresponding to each data to be aggregated. The initial data is not aggregated immediately after it is acquired; it is written first and read back when aggregation is required. If a cache failure fault occurs while the data to be aggregated has been read out and is waiting for aggregation, or during the aggregation itself, the data to be aggregated has already been written into the source storage pool, so no data is lost.

Description

Data aggregation method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of data aggregation technologies, and in particular, to a data aggregation method, a data aggregation apparatus, a data aggregation device, and a computer-readable storage medium.
Background
In the big data era, data volumes keep growing, and scenarios that read and write small files of a few KB to a dozen or so KB are increasingly common. A disk usually reads and writes one large file significantly faster than many small files. To reduce the read/write pressure on the disk, the related art therefore does not write small files (for example, a few KB to a dozen or so KB) directly to disk. Instead, it writes them into a cache, where they wait until they are merged into a large file (for example, of MB-level size) that is then flushed to disk. This improves the speed of subsequent reads and writes and reduces read/write pressure. However, if a cache failure fault occurs, small files in the cache that have not yet been merged into a large file and written to disk are lost. The related art may therefore cause data loss during data aggregation.
Therefore, how to solve the problem that the related art may cause data loss during data aggregation is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above, an object of the present application is to provide a data aggregation method, a data aggregation apparatus, a data aggregation device, and a computer-readable storage medium, which solve the problem that data may be lost during data aggregation in the related art.
In order to solve the above technical problem, the present application provides a data aggregation method, including:
acquiring initial data and writing the initial data into a source storage pool;
reading a plurality of data to be aggregated from the source storage pool;
aggregating the data to be aggregated to obtain aggregated data;
and writing the aggregated data into a destination storage pool, and updating the metadata corresponding to each data to be aggregated.
Optionally, reading a plurality of the data to be aggregated from the source storage pool, including:
acquiring a first aggregation task from a task queue;
screening the initial data to obtain first to-be-aggregated data corresponding to the first aggregation task;
and reading the first data to be aggregated.
Optionally, after reading the data to be aggregated, before writing the aggregated data into the destination storage pool, the method further includes:
judging whether a second aggregation task exists in the task queue;
if the second aggregation task exists, screening the initial data to obtain second data to be aggregated corresponding to the second aggregation task;
and reading the second data to be aggregated.
Optionally, after writing the aggregated data into a destination storage pool and updating the metadata corresponding to each piece of data to be aggregated, the method further includes:
deleting first copy data corresponding to the data to be aggregated;
and generating second copy data corresponding to the aggregated data.
Optionally, the method further comprises:
and if the cache failure fault is detected, re-reading the data to be aggregated.
Optionally, the method further comprises:
acquiring a reading instruction, and determining target aggregated data according to the reading instruction;
analyzing the target aggregated data to obtain target data specified by the reading instruction;
and outputting the target data.
Optionally, the source storage pool is a fast storage pool, and the destination storage pool is a low-speed storage pool.
The present application also provides a data aggregation apparatus, including:
the acquisition module is used for acquiring initial data and writing the initial data into a source storage pool;
a reading module, configured to read a plurality of data to be aggregated from the source storage pool;
the aggregation module is used for carrying out aggregation processing on the data to be aggregated to obtain aggregated data;
and the writing module is used for writing the aggregated data into a destination storage pool and updating the metadata corresponding to each data to be aggregated.
The present application further provides a data aggregation device, comprising a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the data aggregation method.
The present application also provides a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data aggregation method described above.
The data aggregation method provided by the invention comprises: acquiring initial data and writing the initial data into a source storage pool; reading a plurality of data to be aggregated from the source storage pool; aggregating the data to be aggregated to obtain aggregated data; and writing the aggregated data into the destination storage pool and updating the metadata corresponding to each data to be aggregated.
Therefore, in this method, after the initial data is acquired, it is written into the source storage pool rather than held in a cache, and data aggregation is performed by way of data migration. During data migration, a plurality of data to be aggregated is read from the source storage pool and aggregated to obtain aggregated data, and the aggregated data is written into the destination storage pool. Once the metadata has been modified, the aggregation from data to be aggregated into aggregated data is complete. The initial data is not aggregated immediately after it is acquired; it is written first and read back when aggregation is required. If a cache failure fault occurs while the data to be aggregated has been read out and is waiting for aggregation, or during the aggregation itself, the data to be aggregated has already been written into the source storage pool, so no data is lost. This solves the problem that data aggregation in the related art may cause data loss.
In addition, the invention also provides a data aggregation device, data aggregation equipment and a computer readable storage medium, and the beneficial effects are also achieved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a data aggregation method provided in an embodiment of the present application;
Fig. 2 is a flowchart of multi-threaded data aggregation according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a data aggregation apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a data aggregation device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In a possible implementation manner, please refer to fig. 1, where fig. 1 is a flowchart of a data aggregation method provided in an embodiment of the present application. The method comprises the following steps:
S101: initial data is acquired and written to the source storage pool.
It should be noted that all or part of the steps in this embodiment may be performed by the data aggregation device. The data aggregation device may be a server or a computer dedicated to data aggregation in the distributed system, or may be any server or computer having a data migration thread. For example, a plurality of servers exist in a distributed file system, and each server has a data migration thread, so that any one server can be used as a data aggregation device.
The initial data is data with a small volume that is stored in the distributed cluster; its specific content and format are not limited. It should be clear to those skilled in the art that a "small" data volume is a relative concept: data is considered small when its volume is below a threshold, and the user can set the threshold according to the actual situation, for example to 1 Mb or 4 Mb. Since the present application aggregates only small data (i.e., data with a small file volume), the acquired data may be filtered before step S101 to obtain the initial data, while data whose volume exceeds the threshold is written directly to disk. The source storage pool is the storage pool into which the initial data is written for the first time. Specifically, it is formed jointly by all or some of the nodes in the distributed cluster, so the initial data may be written to the data migration device itself or to any other node in the distributed cluster. For example, the data may be written to an HDD (Hard Disk Drive) of the data migration device, or stored on an SSD (Solid State Disk) of another node. The specific form of the source storage pool is not limited; for example, it may be a storage pool composed of HDDs or a storage pool composed of SSDs. Preferably, since newly stored data is more likely to be read within a period of time after it is stored, the initial data may be stored in a fast storage pool composed of SSDs, so that it can be read out quickly if it is needed before it is aggregated.
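The size-based filtering that precedes step S101 can be illustrated with a minimal sketch. The `StoragePool` class, the `ingest` function, and the choice to send oversized data straight to the destination pool are assumptions made for illustration; the application only states that such data is written directly to disk.

```python
from dataclasses import dataclass, field

# Assumed threshold; the description mentions 1 Mb or 4 Mb as examples.
SMALL_FILE_THRESHOLD = 4 * 1024 * 1024


@dataclass
class StoragePool:
    """Hypothetical in-memory stand-in for a storage pool."""
    name: str
    objects: dict = field(default_factory=dict)

    def write(self, key: str, payload: bytes) -> None:
        self.objects[key] = payload


def ingest(key, payload, source_pool, destination_pool,
           threshold=SMALL_FILE_THRESHOLD):
    """Write small data to the source pool; store large data directly.

    Returns the name of the pool that received the data.
    """
    if len(payload) < threshold:
        # Small data becomes "initial data" and is persisted before aggregation.
        source_pool.write(key, payload)
        return source_pool.name
    # Assumption: oversized data bypasses aggregation and goes to the
    # destination pool.
    destination_pool.write(key, payload)
    return destination_pool.name
```

Because the small data is persisted in the source pool at this point, a later loss of cached copies does not lose the data itself.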
S102: a plurality of data to be aggregated is read from a source storage pool.
When data aggregation is needed, a plurality of data to be aggregated is read from the source storage pool. This embodiment does not limit when step S102 is executed. In one possible implementation, it is executed on a preset cycle, that is, the data to be aggregated is read periodically. In another possible implementation, the method detects whether the source storage pool holds enough data to be aggregated that satisfies an aggregation condition, and reads the data out once there is enough. The specific content of the aggregation condition is likewise not limited; it may be, for example, the total volume of the initial data in the source storage pool, the total volume of initial data of the same type, or the total volume of initial data whose file heat is lower than a preset heat. Depending on the aggregation condition, the time at which step S102 is triggered may differ even under otherwise identical circumstances. In yet another possible implementation, step S102 is triggered on demand: for example, data to be aggregated is read from the source storage pool when an aggregation instruction or an aggregation task is detected, and the aggregation instruction or aggregation task may be input or generated by a user according to actual needs.
As the data to be aggregated is read, it is stored in a cache, so that all of it can be aggregated once reading is complete. This embodiment does not limit how the data to be aggregated is read: the data may be read one item after another in sequence, or multiple data to be aggregated may be read in parallel at the same time.
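One way to picture the "enough data" trigger is the following minimal sketch. The function name `aggregation_due`, the dictionary keys `size` and `heat`, and the specific parameters are hypothetical; the application leaves the aggregation condition open.

```python
def aggregation_due(files, min_total_bytes, max_heat=None):
    """Decide whether enough eligible initial data has accumulated.

    files: list of dicts with hypothetical keys "size" (bytes) and "heat".
    max_heat: if given, only files with heat below it count as candidates.
    Returns (due, candidates).
    """
    candidates = [f for f in files if max_heat is None or f["heat"] < max_heat]
    total = sum(f["size"] for f in candidates)
    return total >= min_total_bytes, candidates
```

With a heat limit, the same pool contents can trigger aggregation later than without one, which matches the remark that the trigger time depends on the aggregation condition.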
In a possible implementation manner, data is aggregated when the aggregation task is detected to exist, so that the flexibility degree of file aggregation is improved, and file aggregation is performed according to the needs of users. Specifically, the step S102 may include:
step 11: a first aggregated task is obtained from a task queue.
Step 12: and screening the initial data to obtain first data to be aggregated corresponding to the first aggregation task.
Step 13: and reading the first data to be aggregated.
In this embodiment, a task queue stores the aggregation tasks. If the task queue contains an aggregation task, a first aggregation task is acquired from it. The first aggregation task may be any aggregation task; it may be the task at the head of the queue, i.e., the task that entered the queue first; or, if priorities are set for the aggregation tasks, it may be the task with the highest priority. If several aggregation tasks share the highest priority, the first aggregation task may be any one of them or the one that entered the task queue first. After the first aggregation task is acquired, the initial data is screened according to the requirements of the task to obtain the first data to be aggregated corresponding to the first aggregation task. The screening condition can be set flexibly and may be one or a combination of several conditions, such as a time condition (aggregating data that has been stored in the source storage pool longer than a preset time), a data type condition (aggregating data of the same type, such as audio or documents), a data relevancy condition (aggregating data with related content, such as news of the same category), and a data popularity condition (aggregating data whose popularity is lower than a preset threshold). After the initial data is screened to obtain the first data to be aggregated, the first data to be aggregated is read into a cache to wait for aggregation.
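The task selection and screening described above can be sketched as follows, assuming the priority variant with first-in-first-out tie-breaking. `AggregationTaskQueue`, `screen`, and the task and item dictionary keys are hypothetical names chosen for illustration, not part of the application.

```python
import heapq
import itertools


class AggregationTaskQueue:
    """Highest priority first; equal priorities are served in arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: insertion order

    def put(self, task, priority=0):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._arrival), task))

    def get(self):
        """Return the next task, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


def screen(initial_data, task):
    """Select the items that satisfy every condition present in the task."""
    selected = []
    for item in initial_data:
        if "min_age" in task and item["age"] <= task["min_age"]:
            continue  # time condition: not stored long enough
        if "data_type" in task and item["type"] != task["data_type"]:
            continue  # data type condition
        if "max_heat" in task and item["heat"] >= task["max_heat"]:
            continue  # popularity condition: still too hot
        selected.append(item)
    return selected
```

The arrival counter also guarantees that two heap entries never compare their task payloads, so tasks can be arbitrary objects.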
Further, in another embodiment, in order to increase the speed of data aggregation, multiple threads may be used for processing, that is, after a certain step is completed, when a condition for repeated execution is satisfied, the step may be repeatedly executed so as to be ready for the next step. Therefore, after reading the data to be aggregated, before writing the aggregated data into the destination storage pool, the method further includes:
Step 21: judging whether a second aggregation task exists in the task queue.
Step 22: if the second aggregation task exists, screening the initial data to obtain second data to be aggregated corresponding to the second aggregation task.
Step 23: reading the second data to be aggregated.
It should be clear to those skilled in the art that "after reading the data to be aggregated" means that all data to be aggregated required by the previous aggregation task has been read out, and "before writing the aggregated data into the destination storage pool" means that the previous aggregation task has not yet fully completed. Because multiple threads are used, the step of reading the data to be aggregated can be executed by one thread (e.g., a first thread) and the subsequent steps by other threads (e.g., a second thread, a third thread, and so on), and the threads do not interfere with one another. Therefore, after the data to be aggregated is read, it can be judged whether a second aggregation task exists in the task queue. The specific content of the second aggregation task is not limited; it can be any aggregation task. It should be noted that, if an aggregation task exists in the task queue, the second aggregation task is the first aggregation task determined anew after the task queue is updated. If the second aggregation task exists, other data is waiting to be aggregated, so the initial data can be screened according to the second aggregation task to obtain the second data to be aggregated, which is then read out.
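The overlap between reading the next task's data and aggregating the previous task's data can be sketched with two threads and a hand-off queue. This is an illustrative simplification with one reader and one aggregator; `run_pipeline`, the list-of-keys task format, and plain concatenation as the aggregation step are assumptions.

```python
import queue
import threading


def run_pipeline(tasks, source_pool):
    """Read and aggregate in separate threads.

    tasks: list of tasks, each a list of keys into source_pool (a dict).
    Returns one aggregated bytes object per task, in task order.
    """
    read_done = queue.Queue()
    results = []

    def reader():
        # As soon as one task's data is read, the reader moves on to the
        # next task without waiting for aggregation to finish.
        for keys in tasks:
            read_done.put([source_pool[k] for k in keys])
        read_done.put(None)  # sentinel: no more tasks

    def aggregator():
        while True:
            batch = read_done.get()
            if batch is None:
                break
            results.append(b"".join(batch))

    threads = [threading.Thread(target=reader),
               threading.Thread(target=aggregator)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The queue is the only shared structure between the two threads, which is one simple way to keep them from interfering with each other.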
Further, in an embodiment, a cache failure fault may occur while data is being read. A cache failure fault is a fault that invalidates the cache and causes the data in it to be lost; it may be a power-off fault or any other fault that causes the cache to fail. After a cache failure fault occurs, the data in the cache is lost, but because the data to be aggregated is still stored in the source storage pool, it is not really lost. The data to be aggregated can then be read from the source storage pool again and the subsequent steps executed.
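The recovery path can be sketched as a simple retry loop that rebuilds the cache from the source pool. `CacheFailure`, `aggregate_with_reread`, and the retry limit are hypothetical; how a real system detects a cache failure is not specified in the application.

```python
class CacheFailure(Exception):
    """Hypothetical signal that the cached data was lost (e.g., power loss)."""


def aggregate_with_reread(keys, source_pool, aggregate, max_attempts=3):
    """Aggregate data read from source_pool, re-reading after a cache failure.

    aggregate: callable taking the cache (dict key -> bytes) and returning
    the aggregated result; it may raise CacheFailure.
    """
    for _ in range(max_attempts):
        # (Re-)populate the cache from the durable source pool.
        cache = {k: source_pool[k] for k in keys}
        try:
            return aggregate(cache)
        except CacheFailure:
            continue  # cache contents are gone, but the source pool is intact
    raise RuntimeError("aggregation failed after repeated cache failures")
```

The retry is only possible because the source pool, not the cache, holds the authoritative copy of the data to be aggregated.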
S103: the data to be aggregated is aggregated to obtain aggregated data.
After the data to be aggregated is obtained, it is aggregated. This embodiment does not limit the specific content of the aggregation processing. For example, write control information, such as the storage location and write order of each data to be aggregated, may be generated; the data to be aggregated is then combined into aggregated data, and the aggregated data is written into the destination storage pool according to the write control information, which completes the write. For example, if the write control information includes the storage location, the write order, and the data to be aggregated, the aggregated data is written by writing each data to be aggregated to its storage location in the destination storage pool in the write order. In another embodiment, the data to be aggregated may be aggregated in two stages: a plurality of data to be aggregated is first aggregated into 4 Mb units to obtain a plurality of basic data; write control information, such as the write order of each basic data, is then determined, and the basic data is combined into aggregated data (for example, 128 basic data of 4 Mb each form one aggregated data of 512 Mb), which is written into the destination storage pool as specified by the write control information.
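The two-stage variant can be sketched as greedy packing into fixed-capacity basic blocks followed by concatenation, with an index of (offset, length) pairs standing in for the write control information. The function names, the greedy in-order packing, and the absence of padding are assumptions; the application does not prescribe a packing strategy.

```python
def pack_into_basic_blocks(items, block_size):
    """Stage 1: pack (key, payload) pairs, in order, into basic blocks.

    Returns a list of (block_bytes, layout), where layout is a list of
    (key, offset_in_block, length). An item larger than block_size gets a
    block of its own.
    """
    blocks = []
    current, layout = bytearray(), []
    for key, payload in items:
        if layout and len(current) + len(payload) > block_size:
            blocks.append((bytes(current), layout))
            current, layout = bytearray(), []
        layout.append((key, len(current), len(payload)))
        current.extend(payload)
    if layout:
        blocks.append((bytes(current), layout))
    return blocks


def combine_blocks(blocks):
    """Stage 2: concatenate basic blocks into one aggregated object.

    Returns (aggregate_bytes, index), where index maps key -> (offset, length)
    within the aggregate.
    """
    aggregate, index = bytearray(), {}
    for data, layout in blocks:
        base = len(aggregate)
        for key, offset, length in layout:
            index[key] = (base + offset, length)
        aggregate.extend(data)
    return bytes(aggregate), index
```

With the sizes from the text, 128 basic blocks of 4 Mb each combine into one aggregate of 128 × 4 = 512 Mb.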
S104: the aggregated data is written into the destination storage pool, and the metadata corresponding to each data to be aggregated is updated.
After the aggregated data is obtained, it is written into the destination storage pool. The metadata records attribute information of each data to be aggregated, such as its storage pool, storage location, and aggregation attribute. The data aggregation is complete once the metadata has been modified. It should be noted that this embodiment does not limit the specific method of writing the aggregated data, which may depend on the content of the aggregated data itself.
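Step S104 can be sketched as a write followed by a metadata repoint. The `commit_aggregate` function and the metadata field names (`pool`, `aggregate`, `offset`, `length`, `aggregated`) are hypothetical; the application only says the metadata records attributes such as the storage pool, storage location, and aggregation attribute.

```python
def commit_aggregate(agg_key, aggregate, index, destination_pool, metadata):
    """Write the aggregated data, then update each small object's metadata.

    index: dict key -> (offset, length) of each original object inside the
    aggregate. destination_pool and metadata are plain dicts here.
    """
    # 1. Persist the aggregated data first, so the metadata never points
    #    at data that has not been written yet.
    destination_pool[agg_key] = aggregate
    # 2. Repoint the metadata of every aggregated object.
    for key, (offset, length) in index.items():
        metadata[key] = {
            "pool": "destination",
            "aggregate": agg_key,
            "offset": offset,
            "length": length,
            "aggregated": True,
        }
```

Until step 2 runs, readers following the old metadata still find the original objects in the source pool, which is why the aggregation counts as complete only after the metadata is modified.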
Because SSDs are expensive, cost constraints make the high-speed storage pool composed of SSDs much smaller than the low-speed storage pool composed of HDDs. To increase the speed at which initial data can be read before it is aggregated while reducing storage cost, the source storage pool can be set to the high-speed storage pool and the destination storage pool to the low-speed storage pool.
In one possible implementation, to prevent data loss due to a failure, copy data corresponding to the initial data is generated after the initial data is stored in the source storage pool. After the data to be aggregated has been aggregated and the aggregated data stored in the destination storage pool, the first copy data corresponding to the data to be aggregated can be deleted and second copy data corresponding to the aggregated data generated. This improves storage space utilization and avoids wasting space on excess copy data.
In one possible embodiment, after the aggregated data is written to the destination storage pool, target data in the aggregated data may also be read. Specifically, the method may further include:
Step 31: acquiring a reading instruction, and determining target aggregated data according to the reading instruction.
Step 32: parsing the target aggregated data to obtain target data specified by the reading instruction.
Step 33: outputting the target data.
The reading instruction is used to read specified target data. After the reading instruction is acquired, the corresponding target aggregated data, i.e., the aggregated data that contains the target data, is determined according to the instruction. After the target aggregated data is read, it is parsed, and the specified target data can then be extracted and output. It should be noted that, since the whole target aggregated data is read at the same time, a "pre-reading" effect is achieved: when other data in the target aggregated data later needs to be read, the target aggregated data is already in the cache, so the specified data can be read directly from the cache, which improves read efficiency.
By applying the data aggregation method provided by the embodiment of the present application, after the initial data is acquired, it is written into the source storage pool rather than held in a cache, and data aggregation is performed by way of data migration. During data migration, a plurality of data to be aggregated is read from the source storage pool and aggregated to obtain aggregated data, and the aggregated data is written into the destination storage pool. Once the metadata has been modified, the aggregation from data to be aggregated into aggregated data is complete. The initial data is not aggregated immediately after it is acquired; it is written first and read back when aggregation is required. If a cache failure fault occurs while the data to be aggregated has been read out and is waiting for aggregation, or during the aggregation itself, the data to be aggregated has already been written into the source storage pool, so no data is lost. This solves the problem that data aggregation in the related art may cause data loss.
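The read path and its pre-reading effect can be sketched as follows. The `AggregateReader` class, its unbounded in-memory cache, and the metadata keys `aggregate`, `offset`, and `length` are assumptions for illustration.

```python
class AggregateReader:
    """Serve small-object reads from cached aggregated data."""

    def __init__(self, destination_pool, metadata):
        self.pool = destination_pool  # dict: aggregate key -> bytes
        self.metadata = metadata      # dict: object key -> location info
        self.cache = {}
        self.pool_reads = 0           # counts reads that hit the pool

    def read(self, key):
        meta = self.metadata[key]
        agg_key = meta["aggregate"]   # Step 31: determine the target aggregate
        if agg_key not in self.cache:
            self.cache[agg_key] = self.pool[agg_key]
            self.pool_reads += 1
        data = self.cache[agg_key]
        start = meta["offset"]        # Step 32: parse out the target data
        return data[start:start + meta["length"]]  # Step 33: output it
```

Reading one object pulls the entire aggregate into the cache, so a later read of a neighbouring object in the same aggregate does not touch the storage pool again.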
Based on the above embodiments, this embodiment describes several of the above steps in detail. Referring to fig. 2, fig. 2 is a flowchart of multi-threaded data aggregation according to an embodiment of the present application. The flow involves an MDS (metadata service) thread, a migration thread, a read callback thread, a flush callback thread, a metadata update callback thread, and a delete callback thread. The specific process comprises the following steps:
(1) The MDS issues a migration aggregation task (i.e., an aggregation task, also called a migration task). After receiving it, the Backend (i.e., the migration thread) adds the task to the task queue.
(2) The Backend migration thread traverses the tasks in the task queue and, for each task, traverses the task's files and issues a file read request for each file.
(3) After a read completes, the read callback thread writes the file into the aggregation cache. Once all files have been written into the aggregation cache, a flush request is issued, i.e., the aggregation cache is written to disk.
(4) After the flush completes, the flush callback thread traverses the files of the task and sends a file metadata update request to the MDS.
(5) After receiving the request, the MDS updates the file metadata and responds once the update is complete.
(6) After the response is received, the metadata update callback thread issues a deletion request.
(7) After a deletion completes, the delete callback thread judges whether all files of the task have been deleted, i.e., whether all files have completed the migration aggregation process, and responds to the MDS once all files have been deleted.
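The ordering of steps (2) to (7) can be made concrete with a single-threaded sketch in which each callback stage is reduced to a loop. The `migrate_task` function, the dict-based pools, the aggregate key format, and the event log are all hypothetical; a real implementation would run these stages as asynchronous callbacks across threads.

```python
def migrate_task(task_files, source_pool, destination_pool, metadata, log):
    """Run one migration aggregation task; returns the aggregate's key."""
    aggregate_cache = bytearray()
    layout = {}
    # (2)-(3) read each file and append it to the aggregation cache
    for name in task_files:
        payload = source_pool[name]
        layout[name] = (len(aggregate_cache), len(payload))
        aggregate_cache.extend(payload)
        log.append(("read", name))
    # (3) all files cached: flush the aggregation cache to the destination
    agg_key = "agg:" + "+".join(task_files)  # hypothetical key format
    destination_pool[agg_key] = bytes(aggregate_cache)
    log.append(("flush", agg_key))
    # (4)-(5) update the metadata of every file in the task
    for name in task_files:
        offset, length = layout[name]
        metadata[name] = (agg_key, offset, length)
        log.append(("update_meta", name))
    # (6)-(7) only after the metadata is updated are the originals deleted
    for name in task_files:
        del source_pool[name]
        log.append(("delete", name))
    return agg_key
```

The originals in the source pool are removed last, so at every earlier point a failure leaves a complete copy of each file in the source pool.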
The data aggregation apparatus provided by the embodiment of the present application is introduced below; the data aggregation apparatus described below and the data aggregation method described above may be referred to correspondingly.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a data aggregation apparatus according to an embodiment of the present application, including:
an obtaining module 110, configured to obtain initial data and write the initial data into a source storage pool;
a reading module 120, configured to read a plurality of data to be aggregated from a source storage pool;
the aggregation module 130 is configured to perform aggregation processing on the data to be aggregated to obtain aggregated data;
the writing module 140 is configured to write the aggregated data into the destination storage pool, and update the metadata corresponding to each data to be aggregated.
Optionally, the reading module 120 includes:
the task acquiring unit is used for acquiring a first aggregation task from the task queue;
the first screening unit is used for screening the initial data to obtain first to-be-aggregated data corresponding to the first aggregation task;
the first reading unit is used for reading the first data to be aggregated.
Optionally, the reading module 120 further includes:
the existence judging unit is used for judging whether a second aggregation task exists in the task queue;
the second screening unit is used for screening the initial data if the second aggregation task exists to obtain second data to be aggregated corresponding to the second aggregation task;
and the second reading unit is used for reading the second data to be aggregated.
Optionally, the apparatus further includes:
the deleting module is used for deleting first copy data corresponding to the data to be aggregated;
and the generating module is used for generating second copy data corresponding to the aggregated data.
Optionally, the apparatus further includes:
and the rereading module is used for rereading the data to be aggregated if the cache failure fault is detected.
Optionally, the apparatus further includes:
the instruction acquisition module is used for acquiring a reading instruction and determining target aggregated data according to the reading instruction;
the parsing module is used for parsing the target aggregated data to obtain target data specified by the reading instruction;
and the output module is used for outputting the target data.
The data aggregation device provided by an embodiment of the present application is described below; the data aggregation device described below and the data aggregation method described above may be cross-referenced.
Fig. 4 is a schematic structural diagram of a data aggregation device according to an embodiment of the present application. The data aggregation device 100 may include a processor 101 and a memory 102, and may further include one or more of a multimedia component 103, an input/output (I/O) interface 104, and a communication component 105.
The processor 101 is configured to control the overall operation of the data aggregation device 100 so as to complete all or part of the steps of the data aggregation method. The memory 102 is used to store various types of data to support operation of the data aggregation device 100; such data may include, for example, instructions for any application or method operating on the data aggregation device 100, as well as application-related data. The memory 102 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as one or more of Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The multimedia component 103 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory 102 or transmitted through the communication component 105. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 104 provides an interface between the processor 101 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 105 is used for wired or wireless communication between the data aggregation device 100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them; accordingly, the communication component 105 may include a Wi-Fi module, a Bluetooth module, and an NFC module.
The data aggregation device 100 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, and is configured to perform the data aggregation method of the above embodiments.
A computer-readable storage medium provided by an embodiment of the present application is described below; the computer-readable storage medium described below and the data aggregation method described above may be cross-referenced.
The present application further provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the above-mentioned data aggregation method.
The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the apparatus disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should also be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "include", "comprise", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that includes a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The data aggregation method, data aggregation apparatus, data aggregation device, and computer-readable storage medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present application, and the description of the above embodiments is only intended to help in understanding the method and core idea of the present application. Meanwhile, a person skilled in the art may, according to the idea of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method for data aggregation, comprising:
acquiring initial data and writing the initial data into a source storage pool;
reading a plurality of data to be aggregated from the source storage pool;
performing aggregation processing on the data to be aggregated to obtain aggregated data;
and writing the aggregated data into a destination storage pool, and updating the metadata corresponding to each data to be aggregated.
2. The data aggregation method of claim 1, wherein reading a plurality of data to be aggregated from the source storage pool comprises:
acquiring a first aggregation task from a task queue;
screening the initial data to obtain first data to be aggregated corresponding to the first aggregation task;
and reading the first data to be aggregated.
3. The data aggregation method according to claim 2, further comprising, after reading the data to be aggregated and before writing the aggregated data to a destination storage pool:
judging whether a second aggregation task exists in the task queue;
if the second aggregation task exists, screening the initial data to obtain second data to be aggregated corresponding to the second aggregation task;
and reading the second data to be aggregated.
4. The data aggregation method according to claim 1, further comprising, after writing the aggregated data into a destination storage pool and updating metadata corresponding to each of the data to be aggregated:
deleting first copy data corresponding to the data to be aggregated;
and generating second copy data corresponding to the aggregated data.
5. The data aggregation method of claim 1, further comprising:
re-reading the data to be aggregated if a cache failure is detected.
6. The data aggregation method of claim 1, further comprising:
acquiring a reading instruction, and determining target aggregated data according to the reading instruction;
analyzing the target aggregated data to obtain target data specified by the reading instruction;
and outputting the target data.
7. The data aggregation method of claim 1, wherein the source storage pool is a fast storage pool and the destination storage pool is a low-speed storage pool.
8. A data aggregation apparatus, comprising:
an acquisition module, configured to acquire initial data and write the initial data into a source storage pool;
a reading module, configured to read a plurality of data to be aggregated from the source storage pool;
an aggregation module, configured to perform aggregation processing on the data to be aggregated to obtain aggregated data;
and a writing module, configured to write the aggregated data into a destination storage pool and update the metadata corresponding to the data to be aggregated.
9. A data aggregation device comprising a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is used for executing the computer program to implement the data aggregation method of any one of claims 1 to 7.
10. A computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data aggregation method of any one of claims 1 to 7.
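For illustration only, the flow recited in claims 1 to 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: storage pools are modelled as in-memory dictionaries, aggregation is assumed to be plain byte concatenation, and the metadata is assumed to record an aggregate identifier, offset, and length per item. The names `run_aggregation`, `source_pool`, `destination_pool`, `metadata`, and `tasks` are hypothetical.

```python
from collections import deque


def run_aggregation(source_pool, destination_pool, metadata, tasks):
    """Read the data for every queued task, then write one aggregate per task."""
    pending = []
    while tasks:  # keep checking the queue for a further aggregation task
        task_id, wanted = tasks.popleft()
        # Screen the initial data, then read the items this task selects.
        items = [(name, data) for name, data in source_pool.items() if wanted(name)]
        pending.append((task_id, items))

    for task_id, items in pending:
        if not items:
            continue
        aggregate_id = f"agg-{task_id}"
        # Aggregation is assumed here to be plain concatenation.
        destination_pool[aggregate_id] = b"".join(data for _, data in items)
        offset = 0
        for name, data in items:  # update the metadata of each aggregated item
            metadata[name] = (aggregate_id, offset, len(data))
            offset += len(data)


# Initial data already written into the (hypothetical) source pool.
source_pool = {"a.log": b"aa", "b.log": b"bbb", "c.txt": b"c"}
destination_pool, metadata = {}, {}
tasks = deque([
    (0, lambda name: name.endswith(".log")),
    (1, lambda name: name.endswith(".txt")),
])
run_aggregation(source_pool, destination_pool, metadata, tasks)
print(destination_pool)
print(metadata["b.log"])
```

All reads complete before any write, which mirrors the ordering of claim 3, where the second task's data is read after the first read and before the aggregated data is written. The copy handling of claim 4 and the re-read of claim 5 are left out of the sketch.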
CN202010740268.4A 2020-07-28 2020-07-28 Data aggregation method, device, equipment and computer readable storage medium Withdrawn CN111881135A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010740268.4A CN111881135A (en) 2020-07-28 2020-07-28 Data aggregation method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010740268.4A CN111881135A (en) 2020-07-28 2020-07-28 Data aggregation method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN111881135A true CN111881135A (en) 2020-11-03

Family

ID=73201428

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010740268.4A Withdrawn CN111881135A (en) 2020-07-28 2020-07-28 Data aggregation method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111881135A (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112817544A (en) * 2021-03-05 2021-05-18 北京星网锐捷网络技术有限公司 Data processing method, storage system and storage device
CN113190176A (en) * 2021-05-11 2021-07-30 上海华东汽车信息技术有限公司 Data storage method and device, electronic equipment and storage medium
CN113434278A (en) * 2021-07-08 2021-09-24 上海浦东发展银行股份有限公司 Data aggregation system, method, electronic device, and storage medium
CN113687782A (en) * 2021-07-30 2021-11-23 济南浪潮数据技术有限公司 Storage pool time delay determination method and device, electronic equipment and readable storage medium
CN113821164A (en) * 2021-08-20 2021-12-21 济南浪潮数据技术有限公司 Object aggregation method and device of distributed storage system
CN113849421A (en) * 2021-09-16 2021-12-28 苏州浪潮智能科技有限公司 Hierarchical aggregation method and device for data in full flash memory
CN114327280A (en) * 2021-12-29 2022-04-12 以萨技术股份有限公司 Message storage method and system based on cold-hot separation storage
CN114489510A (en) * 2022-01-28 2022-05-13 维沃移动通信有限公司 Data reading method and device
CN115576505A (en) * 2022-12-13 2023-01-06 浪潮电子信息产业股份有限公司 Data storage method, device and equipment and readable storage medium

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112817544A (en) * 2021-03-05 2021-05-18 北京星网锐捷网络技术有限公司 Data processing method, storage system and storage device
CN113190176A (en) * 2021-05-11 2021-07-30 上海华东汽车信息技术有限公司 Data storage method and device, electronic equipment and storage medium
CN113434278A (en) * 2021-07-08 2021-09-24 上海浦东发展银行股份有限公司 Data aggregation system, method, electronic device, and storage medium
CN113687782A (en) * 2021-07-30 2021-11-23 济南浪潮数据技术有限公司 Storage pool time delay determination method and device, electronic equipment and readable storage medium
CN113687782B (en) * 2021-07-30 2023-12-22 济南浪潮数据技术有限公司 Storage pool time delay determining method and device, electronic equipment and readable storage medium
CN113821164A (en) * 2021-08-20 2021-12-21 济南浪潮数据技术有限公司 Object aggregation method and device of distributed storage system
CN113821164B (en) * 2021-08-20 2024-02-13 济南浪潮数据技术有限公司 Object aggregation method and device of distributed storage system
CN113849421B (en) * 2021-09-16 2023-11-17 苏州浪潮智能科技有限公司 Hierarchical aggregation method and device for data in full flash memory
CN113849421A (en) * 2021-09-16 2021-12-28 苏州浪潮智能科技有限公司 Hierarchical aggregation method and device for data in full flash memory
CN114327280A (en) * 2021-12-29 2022-04-12 以萨技术股份有限公司 Message storage method and system based on cold-hot separation storage
CN114327280B (en) * 2021-12-29 2024-02-09 以萨技术股份有限公司 Message storage method and system based on cold and hot separation storage
CN114489510A (en) * 2022-01-28 2022-05-13 维沃移动通信有限公司 Data reading method and device
CN115576505A (en) * 2022-12-13 2023-01-06 浪潮电子信息产业股份有限公司 Data storage method, device and equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN111881135A (en) Data aggregation method, device, equipment and computer readable storage medium
CN108319654B (en) Computing system, cold and hot data separation method and device, and computer readable storage medium
US10417137B2 (en) Flushing pages from solid-state storage device
JP5827403B2 (en) Technology for moving data between memory types
JP6412244B2 (en) Dynamic integration based on load
US11687276B2 (en) Data streaming for computational storage
US9575827B2 (en) Memory management program, memory management method, and memory management device
US9886449B1 (en) Delayed allocation for data object creation
US20240086332A1 (en) Data processing method and system, device, and medium
WO2019015490A1 (en) Data processing method, apparatus, device, and system
WO2018049883A1 (en) File operation method and device
CN114968839A (en) Hard disk garbage recycling method, device and equipment and computer readable storage medium
CN109947712A (en) Automatically merge method, system, equipment and the medium of file in Computational frame
CN114489475A (en) Distributed storage system and data storage method thereof
JPWO2012124017A1 (en) Command control method and command control program
JP2016515258A (en) File aggregation for optimized file operation
CN112860188A (en) Data migration method, system, device and medium
CN110058938B (en) Memory processing method and device, electronic equipment and readable medium
CN110658993A (en) Snapshot rollback method, device, equipment and storage medium
CN107018163B (en) Resource allocation method and device
JP5187944B2 (en) Apparatus and method for executing computer usable code
CN114297196A (en) Metadata storage method and device, electronic equipment and storage medium
US8977814B1 (en) Information lifecycle management for binding content
WO2015058628A1 (en) File access method and device
WO2018077092A1 (en) Saving method applied to distributed file system, apparatus and distributed file system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201103