WO2024011896A1

WO2024011896A1 - Data processing method and device and storage medium

Info

Publication number: WO2024011896A1
Application number: PCT/CN2023/074616
Authority: WO
Inventors: 刘土明
Original assignee: 中兴通讯股份有限公司
Priority date: 2022-07-15
Filing date: 2023-02-06
Publication date: 2024-01-18
Also published as: CN117439854A

Abstract

The present disclosure relates to the technical field of data management, and provides a data processing method and device and a storage medium. The method comprises: in response to a data acquisition request sent by a terminal device, determining identification information of first data on the basis of the data acquisition request; according to the identification information of the first data, determining each host machine and the POD respectively corresponding to each host machine; acquiring second data from at least one POD according to a preset data acquisition rule, wherein the second data is obtained by each host machine processing, according to a preset data processing rule, the first data in the at least one POD; and sending the second data to the terminal device.

Description

Data processing methods, equipment and storage media

Cross-references to related applications

This disclosure claims the priority of Chinese patent application CN202210833724.9 titled "Data processing method, equipment and storage medium" submitted on July 15, 2022, the entire content of which is incorporated into this disclosure by reference.

Technical field

The present disclosure relates to the field of data management technology, and in particular, to a data processing method, equipment and storage medium.

Background

At present, the industry generally uses distributed file systems such as HDFS (Hadoop Distributed File System) and Ceph to store and/or call data. For example, if a distributed file system is used in a GIS (Geographic Information System) system, then when the GIS system reads and writes data, the distributed file system needs to serialize and deserialize a large amount of data. This leads to technical problems such as complex data processing and low data processing efficiency, which affect the user experience.

Summary of the invention

The present disclosure provides a data processing method, equipment and storage medium, aiming to solve the technical problems of complex data processing procedures and low data processing efficiency.

In a first aspect, the present disclosure provides a data processing method, which includes: in response to a data acquisition request sent by a terminal device, determining identification information of first data based on the data acquisition request; determining, according to the identification information of the first data, each host and the POD corresponding to each host; obtaining second data from at least one POD according to a preset data acquisition rule, wherein the second data is obtained by the host processing the first data in the at least one POD according to a preset data processing rule; and sending the second data to the terminal device.

In a second aspect, the present disclosure also provides a data processing device, including a processor, a memory, a computer program stored on the memory and executable by the processor, and a data bus for realizing connection and communication between the processor and the memory. When the computer program is executed by the processor, the steps of any data processing method provided by this disclosure are implemented.

In a third aspect, the present disclosure also provides a storage medium for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of any data processing method provided in this disclosure.

Description of drawings

In order to explain the technical solutions of the present disclosure more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present disclosure. For those of ordinary skill in the art, other drawings can also be obtained based on these drawings without creative effort.

Figure 1 is a schematic flow chart of a data processing method provided by the present disclosure;

Figure 2 is a schematic diagram of a data processing framework related to an embodiment of the present disclosure;

Figure 3 is a data processing interaction diagram related to an embodiment of the present disclosure;

Figure 4 is a data processing interaction diagram related to an embodiment of the present disclosure;

Figure 5 is a schematic structural block diagram of a data processing device provided by the present disclosure.

Detailed description

The technical solutions in this disclosure will be clearly and completely described below with reference to the accompanying drawings in this disclosure. Obviously, the described embodiments are part of the embodiments of this disclosure, rather than all embodiments. Based on the embodiments in this disclosure, all other embodiments obtained by those of ordinary skill in the art without creative efforts fall within the scope of protection of this disclosure.

The flowcharts shown in the accompanying drawings are only examples and do not necessarily include all contents and operations/steps, nor are they necessarily performed in the order described. For example, some operations/steps can also be decomposed, combined or partially merged, so the actual order of execution may change according to actual conditions.

It should be understood that the terminology used in the description of the disclosure is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms unless the context clearly dictates otherwise.

If a distributed file system is used in a GIS system, then when the GIS system reads and writes data, the distributed file system needs to perform serialization, deserialization, and network transmission of a large amount of data. The data processing flow is complex, which can easily lead to low data processing efficiency and affect the user experience.

The present disclosure provides a data processing method, a device and a storage medium. The data processing method can be applied to terminal equipment, which can be an electronic device such as a mobile phone, tablet computer, notebook computer, or desktop computer. It can also be applied to a server. The server can be a standalone server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.

Some embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings. The following embodiments and features in the embodiments may be combined with each other without conflict.

Please refer to Figure 1, which is a schematic flow chart of a data processing method provided by the present disclosure.

Please refer to FIG. 2 , which is a schematic diagram of a data processing framework according to an embodiment of the present disclosure.

In an exemplary embodiment, the data processing framework is deployed in a Kubernetes environment. For example, as shown in Figure 2, the data processing framework includes a Master, a host, and a POD. The Master can connect to and communicate with multiple hosts, and the host can accommodate multiple PODs.

In some embodiments, the Master is used to manage clusters in the Kubernetes environment, such as hosts and PODs; the hosts are used to communicate with the Master; the POD is the smallest deployment unit in the Kubernetes environment and is used to store data.

As shown in Figure 1, the data processing method includes steps S101 to S104.

Step S101: In response to the data acquisition request sent by the terminal device, determine the identification information of the first data based on the data acquisition request.

In an exemplary embodiment, the data processing method can be applied to any system that needs to store a large amount of data in a distributed manner or to call a large amount of data, such as a GIS system, in order to process the relevant data of that system. In some embodiments, in response to the user's operations on the terminal device, the GIS system needs to present the corresponding geographical image to the user. For example, presenting a geographical image requires obtaining the corresponding data from the GIS system and processing the acquired data to obtain the geographical image.

In an exemplary embodiment, the data of the GIS system is stored in a distributed manner in the PODs of the data processing framework. In some embodiments, when the terminal device requests to present a certain geographical image, this is equivalent to receiving a data acquisition request sent by the terminal device. In an exemplary embodiment, the Master in the data processing framework can analyze the data acquisition request and determine the first data required to present the corresponding geographical image. For example, in response to the data acquisition request sent by the terminal device, the Master determines the identification information of the first data based on the data acquisition request.

Step S102: Determine each host and the POD corresponding to each host according to the identification information of the first data.

In some embodiments, the Master can determine the storage location of the first data based on the identification information of the first data, for example, determine each host where each POD storing the first data is located and the corresponding POD.

In an exemplary embodiment, determining each host and the POD corresponding to each host according to the identification information of the first data includes: according to the identification information of the first data, determining each host and the POD corresponding to each host from a preset association relationship between data identification information, host identification information and POD identification information.

In an exemplary embodiment, multiple databases are provided in the Master. For example, the databases can be lightweight databases, and can be used to store the preset association relationship between the data identification information, host identification information and POD identification information, and/or to store the second data obtained from the PODs. In some embodiments, for example, when HDFS is used for distributed storage of data in a GIS system, HDFS needs to respond to a data acquisition instruction, obtain the first data, and perform serialization, network transmission, deserialization and other processing on the first data. During this processing, all data is stored in the form of objects in the Master in HDFS. For example, if each object occupies about 150 bytes, then when there are one million objects, the Master in HDFS requires approximately 2G of memory to store them, and as the number of objects increases, the memory capacity requirements of the Master in HDFS increase accordingly. Understandably, the memory capacity of the Master in HDFS severely restricts cluster expansion. In an exemplary embodiment, providing multiple databases, such as lightweight databases, in the Master to store the various objects, such as the preset association relationship between the data identification information, host identification information and POD identification information and/or the second data obtained from the PODs, keeps memory usage in the Master stable at a low level that does not grow with the number of objects, which is conducive to cluster expansion.
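As a minimal sketch of how such an association relationship might be kept in a lightweight database rather than in Master memory, the following uses Python's built-in sqlite3 module. The table name, column names and the helper functions record and locate are hypothetical and are not taken from the disclosure; the choice of SQLite is likewise only an assumption.

```python
import sqlite3

# Hypothetical schema: one row per (data, host, POD) association.
# ":memory:" keeps the example self-contained; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE association (data_id TEXT, host_id TEXT, pod_id TEXT, "
    "PRIMARY KEY (data_id, host_id, pod_id))"
)

def record(data_id, host_id, pod_id):
    """Store one association row; recording the same row twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO association VALUES (?, ?, ?)",
        (data_id, host_id, pod_id),
    )

def locate(data_id):
    """Return the (host, POD) pairs that hold the given data identifier."""
    rows = conn.execute(
        "SELECT host_id, pod_id FROM association WHERE data_id = ? "
        "ORDER BY host_id, pod_id",
        (data_id,),
    )
    return rows.fetchall()

record("A.zip", "host1", "POD1")
record("A.zip", "host3", "POD3")
record("B.zip", "host2", "POD2")
print(locate("A.zip"))
```

A lookup by data identifier then corresponds to determining each host and its POD in step S102.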

In an exemplary embodiment, the preset association relationship between the data identification information, the host identification information and the POD identification information may be set in advance or updated in real time.

For example, before responding to the data acquisition request sent by the terminal device and determining the identification information of the first data based on the data acquisition request, an identification information reporting instruction may also be sent to each host to instruct each POD in each host to report its data identification information, host identification information and POD identification information; the association relationship between the data identification information, host identification information and POD identification information is then established based on the reported information. Of course, this is not limiting. For example, the Master can send the identification information reporting instruction to each host periodically, or when it detects a host change, in order to update the association relationship between the data identification information, host identification information and POD identification information. No limitation is imposed here.
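The reporting step can be illustrated with a short sketch that rebuilds the association from POD reports. The report format (a tuple of host identifier, POD identifier and a list of data identifiers) and the function name build_association are assumptions made for illustration only.

```python
from collections import defaultdict

def build_association(reports):
    """Map each data identifier to the set of (host, POD) pairs reporting it."""
    association = defaultdict(set)
    for host_id, pod_id, data_ids in reports:
        for data_id in data_ids:
            association[data_id].add((host_id, pod_id))
    return dict(association)

# Hypothetical reports, one per POD.
reports = [
    ("host1", "POD1", ["A.zip"]),
    ("host1", "POD2", ["C.zip"]),
    ("host3", "POD3", ["A.zip"]),
]
association = build_association(reports)
```

Calling build_association again with fresh reports, for instance on a timer or after a host change, would replace the old mapping with an updated one.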

In some embodiments, according to the identification information of the first data, the Master determines, from the preset association relationship between the data identification information, host identification information and POD identification information, the host identification information and POD identification information associated with the identification information of the first data. For example, each host and its corresponding POD can be determined through the determined host identification information and POD identification information, so as to determine the storage location of the first data.

In an exemplary embodiment, the Master can manage the data stored in each host and the POD corresponding to each host. In an exemplary embodiment, the method includes: detecting whether third data exists; if the third data exists, determining, based on preset data storage rules, each host that stores each category of the third data and the POD corresponding to each host; and instructing each corresponding host to store the third data of the corresponding category in the corresponding POD.

In some embodiments, based on preset storage rules, a corresponding number of copies of the third data of the same category can be created and stored in PODs corresponding to different hosts.

For example, determining, based on preset data storage rules, each host that stores each category of third data and the POD corresponding to each host includes: determining the number of categories corresponding to the third data; determining the number of copies corresponding to each category of third data according to the number of categories of the third data and the number of hosts; determining, according to the number of copies corresponding to each category of third data, the number of hosts that store the corresponding category of third data; and determining, according to the preset association relationship between the data identification information, the host identification information and the POD identification information, each host that stores each category of third data and the POD corresponding to each host.

In an exemplary embodiment, the Master can obtain the identification information of the third data. It can be understood that different categories of third data have different identification information. For example, the Master can determine the number of categories corresponding to the third data based on the number of distinct identification information items of the third data; of course, this is not limiting.

In some embodiments, the number of copies corresponding to each category of third data is determined based on the number of categories of third data and the number of hosts, and each category of third data is then replicated until that number of copies is reached, so that the multiple copies of the same category of third data are stored in different hosts and the PODs corresponding to those hosts.

In an exemplary embodiment, after determining the number of copies corresponding to each category of third data, the number of categories of data currently stored in each host and in each POD of that host is determined, for example, based on the preset association relationship between the data identification information, the host identification information and the POD identification information, in order to determine the hosts that store each category of third data and the POD corresponding to each host.

Please refer to FIG. 3 , which is a data processing interaction diagram according to an embodiment of the present disclosure.

As shown in Figure 3, the third data detected by the Master is, for example, temporarily stored in HDFS, where the third data includes A.zip, B.zip, C.zip and D.zip, and the hosts currently connected to the Master include host 1, host 2 and host 3. In an exemplary embodiment, based on the preset storage rules, and given that the number of third data categories is 4 and the number of hosts is 3, the Master determines that the number of copies corresponding to each of A.zip, B.zip, C.zip and D.zip is 2, so as to control the number of copies of A.zip, B.zip, C.zip and D.zip. At the same time, according to the preset association relationship between the data identification information, host identification information and POD identification information, the Master instructs host 1 to store A.zip, C.zip and D.zip, host 2 to store B.zip and D.zip, and host 3 to store A.zip, B.zip and C.zip, so that A.zip, B.zip, C.zip and D.zip are distributed and stored across host 1, host 2 and host 3 in a balanced manner.

In some embodiments, the number of categories corresponding to the data currently stored in each POD of host 1, host 2 and host 3 can also be determined based on the preset association relationship between the data identification information, the host identification information and the POD identification information, thereby determining the PODs in host 1, host 2 and host 3 in which A.zip, B.zip, C.zip and D.zip are respectively stored, so as to balance the number of categories corresponding to the data stored in each POD of host 1, host 2 and host 3.
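One way to picture balanced placement of copies is the greedy sketch below. The disclosure does not spell out how the number of copies is derived from the number of categories and hosts, so the copy count is passed in as a parameter; the function name place_copies and the least-loaded-first rule are assumptions, and the resulting assignment need not match the one drawn in Figure 3.

```python
def place_copies(categories, hosts, copies):
    """Assign `copies` distinct hosts to each category, least-loaded first."""
    if copies > len(hosts):
        raise ValueError("cannot place more copies than there are hosts")
    load = {host: 0 for host in hosts}
    placement = {}
    for category in categories:
        # Sort by current load; sorted() is stable, so ties keep `hosts` order.
        chosen = sorted(hosts, key=lambda h: load[h])[:copies]
        for host in chosen:
            load[host] += 1
        placement[category] = chosen
    return placement, load

placement, load = place_copies(
    ["A.zip", "B.zip", "C.zip", "D.zip"], ["host1", "host2", "host3"], copies=2
)
```

With four categories, three hosts and two copies each, the eight copies end up spread so that no host holds more than one copy more than any other.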

Step S103: Obtain second data from at least one POD according to preset data acquisition rules, where the second data is obtained by the host processing the first data in at least one POD according to preset data processing rules.

In some embodiments, if the Master determines the storage location of the first data based on the identification information of the first data, acquires the first data and sends it to the terminal device, the terminal device still needs to process the first data accordingly. For example, when HDFS is used for distributed storage of the data of the GIS system, HDFS needs to obtain the first data in response to the data acquisition instruction, and perform serialization, network transmission, deserialization and other processing on the first data in order to send the first data to the GIS system. The GIS system still needs to process the first data to generate a geographical image and then display it. The process is complicated and time-consuming, and the data processing efficiency is relatively low. In an exemplary embodiment, because HDFS needs to respond to the data acquisition instruction, acquire the first data, and perform serialization, network transmission, deserialization and other processing on the first data, the delay of the data processing is relatively large. Correspondingly, the storage device carrier used to store data in HDFS, such as a hard disk, needs to maintain very high real-time performance when reading and writing data, so the requirements on the storage device carrier are relatively high.

In an exemplary embodiment, the second data can be obtained directly from at least one POD according to preset data acquisition rules, wherein the second data is obtained by the host processing the first data in the at least one POD according to preset data processing rules. For example, based on the preset data processing rules, the host processes the first data required for the GIS system to generate a geographical image, generates the geographical image, and feeds the generated geographical image back to the Master as the second data. In this way, the Master can send the geographical image to the terminal device, and the terminal device no longer needs to process the first data, which shortens the data processing flow and improves data processing efficiency. In an exemplary embodiment, because the data processing flow is shortened, the delay of the data processing is relatively small, so the POD can flexibly use different hard disks, such as mechanical hard disks or other hard disks, and places lower requirements on the storage device carrier.

In an exemplary embodiment, based on the preset data storage rules, data of the same category can be stored in PODs corresponding to multiple hosts by creating copies. Data of the same category can then be obtained from at least one POD according to the preset data acquisition rules.

In an exemplary embodiment, obtaining the second data from at least one POD according to preset data acquisition rules includes: when the data acquisition request indicates acquisition of data of the same category, determining, according to the amount of data of that category that needs to be acquired, the amount of data to be fed back by each POD that stores data of that category; and obtaining the second data corresponding to that amount of data from the corresponding POD.

In an exemplary embodiment, referring to Figure 4, when the data acquisition request indicates acquisition of data of the same category, for example A.zip, the amount of data to be fed back by each POD that stores A.zip can be determined according to the amount of A.zip data that needs to be acquired, for example ten copies of A.zip. For example, if POD1 of host 1 stores A.zip and POD3 of host 3 stores A.zip, then the amount of data that POD1 of host 1 needs to feed back is 5 copies of A.zip, and the amount of data that POD3 of host 3 needs to feed back is 5 copies of A.zip. Host 1 and host 3 can therefore process A.zip at the same time to obtain the second data corresponding to their respective data amounts and feed it back to the Master, which improves the concurrency of data processing and thus the data processing efficiency.
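The same-category case can be sketched as an even split of the requested amount across the PODs that hold the category. The function name split_amount and the rule for distributing a remainder are assumptions; the disclosure only gives the ten-copies, two-PODs example.

```python
def split_amount(total, pods):
    """Divide `total` requested copies across `pods` as evenly as possible."""
    if not pods:
        raise ValueError("no POD stores the requested category")
    base, extra = divmod(total, len(pods))
    # The first `extra` PODs each take one additional copy.
    return {pod: base + (1 if i < extra else 0) for i, pod in enumerate(pods)}

# Ten copies of A.zip, held by POD1 of host 1 and POD3 of host 3.
shares = split_amount(10, [("host1", "POD1"), ("host3", "POD3")])
```

Each host can then process its share in parallel before feeding the result back to the Master.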

In an exemplary embodiment, obtaining the second data from at least one POD according to preset data acquisition rules includes: when the data acquisition request indicates acquisition of data of different categories, determining, according to the amount of data of each category that needs to be acquired, the amount of data to be fed back by any POD that stores data of that category; and obtaining the second data corresponding to that amount of data from that POD.

In an exemplary embodiment, referring to Figure 4, when the data acquisition request indicates acquisition of data of different categories, for example A.zip, B.zip and C.zip, the amount of data to be fed back by any POD that stores data of each category can be determined according to the amounts of A.zip, B.zip and C.zip that need to be acquired, for example one copy each of A.zip, B.zip and C.zip. For example, POD1 of host 1 stores A.zip and POD2 of host 1 stores C.zip, POD2 of host 2 stores B.zip, and POD1 of host 3 stores C.zip and POD3 of host 3 stores A.zip. In an exemplary embodiment, it is determined that the amount of data that POD1 of host 1 needs to feed back is one copy of A.zip, the amount that POD2 of host 2 needs to feed back is one copy of B.zip, and the amount that POD1 of host 3 needs to feed back is one copy of C.zip. Host 1, host 2 and host 3 can therefore each process the data to be fed back by their PODs to obtain the second data corresponding to the respective data amounts and feed it back to the Master. This improves the concurrency of data processing and balances the amount of data fed back by each host and the POD corresponding to each host, thereby improving data processing efficiency.
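For the different-category case, a simple greedy selection reproduces the assignment described above. The selection rule used here (handle the category with the fewest candidate PODs first, then prefer the host with the fewest assignments so far) is an assumption, as is the function name assign_sources; the disclosure states only the outcome.

```python
def assign_sources(candidates):
    """Pick one (host, POD) per category, preferring the least-loaded host."""
    load = {}
    chosen = {}
    # Handle categories with the fewest candidate PODs first.
    for category in sorted(candidates, key=lambda c: len(candidates[c])):
        # min() returns the first candidate among equally loaded hosts.
        host, pod = min(candidates[category], key=lambda hp: load.get(hp[0], 0))
        load[host] = load.get(host, 0) + 1
        chosen[category] = (host, pod)
    return chosen

# Storage layout from the example in the text.
candidates = {
    "A.zip": [("host1", "POD1"), ("host3", "POD3")],
    "B.zip": [("host2", "POD2")],
    "C.zip": [("host1", "POD2"), ("host3", "POD1")],
}
sources = assign_sources(candidates)
```

Here each of the three hosts serves exactly one category, so the three copies can be processed concurrently.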

Step S104: Send the second data to the terminal device.

In some embodiments, the terminal device includes, for example, a terminal device of a GIS system. According to the second data, the geographical image can be directly presented to the user, thereby improving the user experience.
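Steps S101 to S104 can be summarized in a short Master-side sketch. Everything named here is hypothetical: the request is modeled as a dictionary with a data_id key, the association as a dictionary from data identifier to a list of (host, POD) pairs, and host-side processing as a callback. Taking the first location is the simplest possible acquisition rule and stands in for whatever preset rule an implementation uses.

```python
def handle_request(request, association, process_in_pod):
    """Master-side flow for steps S101 to S104 (illustrative only)."""
    data_id = request["data_id"]               # S101: identify the first data
    locations = association.get(data_id, [])   # S102: hosts and PODs holding it
    if not locations:
        raise KeyError(f"no POD stores {data_id}")
    host, pod = locations[0]                   # S103: simplest acquisition rule
    second_data = process_in_pod(host, pod, data_id)
    return second_data                         # S104: returned to the terminal

def render_stub(host, pod, data_id):
    # Stand-in for host-side processing such as generating a geographical image.
    return f"image({data_id})@{host}/{pod}"

result = handle_request(
    {"data_id": "A.zip"}, {"A.zip": [("host1", "POD1")]}, render_stub
)
```

The point of the sketch is that processing happens next to the data, so the terminal receives the finished second data rather than the raw first data.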

In an exemplary embodiment, the data processing method further includes: detecting whether there is a host change; and when a host change is detected, adjusting the data in each host and the POD corresponding to each host according to how the data identification information after the host change matches the data identification information before the host change.

In an exemplary embodiment, the host change category includes any one of adding a host, reducing a host, and replacing a host. In some embodiments, changes to the hosts may be tracked. In an exemplary embodiment, the host change category may be determined based on the changes in the hosts within an adjacent first time threshold and second time threshold, where the first time threshold is earlier than the second time threshold. For example, if an increase in hosts is detected within the first time threshold and a decrease in hosts is detected within the second time threshold, the host change category can be determined to be host replacement; if a decrease in hosts is detected within the first time threshold and an increase in hosts is detected within the second time threshold, the host change category can also be determined to be host replacement; if an increase in hosts is detected within the first time threshold and no host change is detected within the second time threshold, the host change category can be determined to be adding a host; and if a decrease in hosts is detected within the first time threshold and no host change is detected within the second time threshold, the host change category can be determined to be host reduction.
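The four cases above can be written as a small decision function. The inputs are modeled as the net change in host count observed in each of the two adjacent windows, which is an assumption about how the changes are measured; the function name classify_change and the returned labels are hypothetical.

```python
def classify_change(first_delta, second_delta):
    """Classify a host change from the host-count deltas of two adjacent windows."""
    if first_delta * second_delta < 0:
        return "replace"   # increase then decrease, or decrease then increase
    if second_delta == 0 and first_delta > 0:
        return "add"
    if second_delta == 0 and first_delta < 0:
        return "reduce"
    return None            # no change, or a case the text does not cover
```

For example, classify_change(1, -1) and classify_change(-1, 1) both return "replace", while classify_change(1, 0) returns "add".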

In some embodiments, when the host change is a host replacement, the data identification information that is missing after the host change can be determined by matching the data identification information after the host change against the data identification information before the host change, which allows the lost data to be determined. In an exemplary embodiment, the lost data can be downloaded to the replacement host, and/or the POD in the replacement host in which each category of lost data is to be stored can be determined according to preset storage rules. For example, the lost data can be stored in the corresponding POD of the replacement host according to the storage time and acquisition frequency of the data. For example, the closer the storage time of data is to the time of the host change, the earlier that data is stored; likewise, the higher the acquisition frequency of data, the earlier that data is stored, so as to avoid data loss.
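Finding the lost data amounts to a set difference between the identifiers known before and after the change, followed by an ordering step. In the sketch below, the way the two criteria are combined (storage time first, acquisition frequency as a tie-breaker) is an assumption, since the disclosure lists them separately; the function and parameter names are hypothetical, and times are plain numbers.

```python
def lost_data_in_restore_order(before, after, stored_at, frequency, change_time):
    """Return identifiers missing after a host change, most urgent first."""
    lost = set(before) - set(after)
    return sorted(
        lost,
        # Closest storage time to the change first, then higher frequency.
        key=lambda d: (abs(change_time - stored_at[d]), -frequency[d]),
    )

order = lost_data_in_restore_order(
    before={"A.zip", "B.zip", "C.zip"},
    after={"A.zip"},
    stored_at={"A.zip": 10, "B.zip": 90, "C.zip": 50},
    frequency={"A.zip": 1, "B.zip": 2, "C.zip": 9},
    change_time=100,
)
```

In this example B.zip is restored before C.zip because it was stored closer to the time of the change.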

In some embodiments, when the host change is the addition of a host, the difference in the number of categories corresponding to the data stored in each host and the POD corresponding to each host is determined by matching the data identification information after the host change against the data identification information before the host change. According to this difference, the number of copies of the corresponding category of data is adjusted and/or the data of the corresponding category is migrated to the corresponding POD of the added host, so as to balance the number of categories corresponding to the data stored in each host and the POD corresponding to each host.

In some embodiments, when the host change is a reduction of hosts, the lost data, and the difference in the number of categories corresponding to the data stored in each host and the POD corresponding to each host, are determined by matching the data identification information after the host change against the data identification information before the host change. In an exemplary embodiment, according to the storage time and acquisition frequency of the data, the lost data can be stored in the corresponding host and the corresponding POD of that host, and/or the number of copies of the corresponding category of data can be adjusted, and/or the data of the corresponding category can be migrated to the POD corresponding to the corresponding host, so as to balance the number of categories corresponding to the data stored in each host and the POD corresponding to each host.

In an exemplary embodiment, after the data in each host and the POD corresponding to each host is adjusted according to how the data identification information after the host change matches the data identification information before the host change, the association relationship between the data identification information, host identification information and POD identification information is updated.

In some embodiments, each POD in each host reports its data identification information, host identification information and POD identification information to the Master, so that the Master updates the association relationship between the data identification information, host identification information and POD identification information.

In an exemplary embodiment, the data processing method can also detect the storage time and acquisition frequency of data. For example, when the storage time of data is greater than a preset storage time threshold and the acquisition frequency of the data is lower than a preset acquisition frequency threshold, the corresponding data can be deleted to reduce the redundant data stored in each host and the POD corresponding to each host.
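The deletion condition combines both thresholds, which a short sketch makes explicit. The record layout (a dictionary from data identifier to a pair of storage timestamp and acquisition frequency) and the function name select_for_deletion are assumptions for illustration.

```python
def select_for_deletion(items, max_age, min_frequency, now):
    """Return ids stored longer than max_age and fetched less often than min_frequency."""
    return [
        data_id
        for data_id, (stored_at, frequency) in items.items()
        if now - stored_at > max_age and frequency < min_frequency
    ]

items = {
    "A.zip": (0, 1),    # old and rarely fetched
    "B.zip": (0, 50),   # old but frequently fetched
    "C.zip": (95, 1),   # rarely fetched but recent
}
stale = select_for_deletion(items, max_age=30, min_frequency=5, now=100)
```

Only data that fails both tests is selected, so old but popular data and new but unpopular data are both kept.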

In the method, in response to the data acquisition request sent by the terminal device, the identification information of the first data is determined based on the data acquisition request; each host and the POD corresponding to each host are determined according to the identification information of the first data; the second data is obtained from at least one POD according to the preset data acquisition rules, wherein the second data is obtained by the host processing the first data in the at least one POD according to the preset data processing rules; and the second data is sent to the terminal device. This effectively solves the problem that a complex data processing flow leads to low data processing efficiency and affects the user experience. It simplifies the data processing flow, improves data processing efficiency, and enhances the user experience.

Please refer to FIG. 5 , which is a schematic structural block diagram of a data processing device provided by the present disclosure.

As shown in Figure 5, the data processing device 300 includes a processor 301 and a memory 302. The processor 301 and the memory 302 are connected through a bus 303, which is, for example, an I2C (Inter-integrated Circuit) bus.

In an exemplary embodiment, the processor 301 is used to provide computing and control capabilities to support the operation of the entire data processing device. The processor 301 can be a central processing unit (CPU). The processor 301 can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.

In an exemplary embodiment, the memory 302 may be a Flash chip, a read-only memory (ROM, Read-Only Memory) disk, an optical disk, a USB disk, a mobile hard disk, or the like.

Those skilled in the art can understand that the structure shown in Figure 5 is only a block diagram of the part of the structure related to the disclosed solution, and does not constitute a limitation on the data processing device to which the disclosed solution is applied. A specific server may include more or fewer parts than shown, combine certain parts, or have a different arrangement of parts.

Wherein, the processor is used to run the computer program stored in the memory, and implement any data processing method provided by the present disclosure when executing the computer program.

In one embodiment, the processor is configured to run a computer program stored in the memory and to implement the following steps when executing the computer program: in response to a data acquisition request sent by the terminal device, determining the identification information of the first data based on the data acquisition request; determining each host and the POD corresponding to each host according to the identification information of the first data; obtaining the second data from at least one POD according to the preset data acquisition rules, wherein the second data is obtained by the host processing the first data in the at least one POD according to the preset data processing rules; and sending the second data to the terminal device.

In one embodiment, when determining each host and the POD corresponding to each host according to the identification information of the first data, the processor is configured to: determine, according to the identification information of the first data, each host and the POD corresponding to each host from the preset association between the data identification information, the host identification information and the POD identification information.

In one embodiment, when obtaining the second data from at least one POD according to the preset data acquisition rules, the processor is configured to: when the data acquisition request indicates the acquisition of data of the same category, determine, according to the data volume of the category data to be acquired, the data volume to be fed back by each POD that stores the corresponding category data; and obtain the second data of the corresponding data volume from each corresponding POD.
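
One way to derive the per-POD data volume for a same-category request is an even split, sketched below in Python. The even split itself, the function name `split_request` and the POD identifiers are assumptions; the passage above only states that the per-POD volume is derived from the requested volume.

```python
def split_request(total_items, pod_ids):
    """Divide total_items across pod_ids as evenly as possible."""
    if not pod_ids:
        raise ValueError("at least one POD is required")
    base, extra = divmod(total_items, len(pod_ids))
    # The first `extra` PODs each return one more item so the shares sum to the total.
    return {pod: base + (1 if i < extra else 0) for i, pod in enumerate(pod_ids)}
```

For instance, under this assumption a request for 10 items spread over three PODs would ask them for 4, 3 and 3 items respectively.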

In one embodiment, when obtaining the second data from at least one POD according to the preset data acquisition rules, the processor is configured to: when the data acquisition request indicates the acquisition of data of different categories, determine, according to the data volume of any category to be acquired, the data volume to be fed back by any POD that stores data of that category; and obtain the second data of the corresponding data volume from that POD.
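
For a request spanning different categories, a rough Python sketch might choose one POD per category and ask it for that category's full volume. Choosing the first listed POD, the function name `plan_by_category` and the dictionary layout are assumptions for illustration; the passage above only requires that some POD storing the category supplies the requested volume.

```python
def plan_by_category(requested, pods_by_category):
    """Map each requested category to (POD ID, amount).

    requested: dict of category -> amount of data wanted.
    pods_by_category: dict of category -> list of POD IDs storing it.
    """
    plan = {}
    for category, amount in requested.items():
        pods = pods_by_category.get(category)
        if not pods:
            raise KeyError(f"no POD stores category {category!r}")
        # ASSUMPTION: any POD holding the category may serve it; take the first.
        plan[category] = (pods[0], amount)
    return plan
```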

In one embodiment, when implementing the data processing method, the processor is configured to: detect whether third data exists; if third data exists, determine, based on preset data storage rules, each host that stores the third data of each category and the POD corresponding to each host; and instruct each corresponding host to store the third data of the corresponding category in the corresponding POD.

In one embodiment, when determining each host that stores third data of each category and the POD corresponding to each host based on preset data storage rules, the processor is configured to: determine the number of categories of the third data; determine, according to the number of categories of the third data and the number of hosts, the number of copies corresponding to each category of third data; determine, according to the number of copies corresponding to each category of third data, the number of hosts that store the third data of the corresponding category; and determine, according to the preset association between the data identification information, the host identification information and the POD identification information, each host that stores the third data of each category and the POD corresponding to each host.
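
The passage above does not give the formula that turns the number of categories and the number of hosts into a number of copies. Purely as an illustration, the Python sketch below assumes `copies = max(1, hosts // categories)` and places the copies of each category on consecutive hosts, wrapping around when the hosts run out. The formula, the placement order and the name `plan_replicas` are all assumptions.

```python
def plan_replicas(categories, host_ids):
    """Assign each category to a list of hosts, one copy per host.

    ASSUMPTION: copies per category = max(1, hosts // categories);
    this formula is illustrative and not taken from the disclosure.
    """
    if not categories or not host_ids:
        raise ValueError("categories and host_ids must be non-empty")
    copies = max(1, len(host_ids) // len(categories))
    placement = {}
    for index, category in enumerate(categories):
        start = index * copies
        # Wrap around the host list when there are more categories than hosts.
        placement[category] = [
            host_ids[(start + offset) % len(host_ids)] for offset in range(copies)
        ]
    return placement
```

Under this assumption, two categories on four hosts would receive two copies each on disjoint hosts, while three categories on two hosts would receive one copy each, with hosts reused.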

In some embodiments, before determining the identification information of the first data based on the data acquisition request in response to the data acquisition request sent by the terminal device, the processor is configured to: send an identification information reporting instruction to each host to instruct each POD in each host to report its data identification information, host identification information and POD identification information; and determine, based on the reported data identification information, host identification information and POD identification information, the association between the data identification information, the host identification information and the POD identification information.

In one embodiment, when implementing the data processing method, the processor is configured to: detect whether a host change has occurred; and when a host change is detected, adjust the data in each host and in the POD corresponding to each host based on the match between the data identification information after the host change and the data identification information before the host change.
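
The matching step can be pictured as a set comparison of the data identifiers reported before and after the change. The Python sketch below is one possible reading; the function name `diff_data_ids` and the interpretation of each group (restore, register, leave alone) are assumptions, since the passage above does not say what adjustment follows from a mismatch.

```python
def diff_data_ids(before, after):
    """Compare data-ID collections reported before and after a host change.

    Returns (missing, added, unchanged) as sorted lists.
    """
    before_ids, after_ids = set(before), set(after)
    missing = sorted(before_ids - after_ids)    # held before, gone now: candidates to restore
    added = sorted(after_ids - before_ids)      # newly reported: candidates to register
    unchanged = sorted(before_ids & after_ids)  # matched: no adjustment needed
    return missing, added, unchanged
```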

It should be noted that those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the data processing device described above can be found in the corresponding process of the foregoing data processing method embodiments and will not be repeated here.

The present disclosure also provides a storage medium for computer-readable storage. The storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of any data processing method provided by the present disclosure.

The storage medium may be an internal storage unit of the data processing device described in the previous embodiment, such as a hard disk or memory of the data processing device. The storage medium may also be an external storage device of the data processing device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the data processing device.

The present disclosure provides a data processing method, device and storage medium. In response to a data acquisition request sent by the terminal device, the present disclosure determines the identification information of the first data based on the data acquisition request; determines each host and the POD corresponding to each host according to the identification information of the first data; obtains the second data from at least one POD according to the preset data acquisition rules, wherein the second data is obtained by the host processing the first data in the at least one POD according to the preset data processing rules; and sends the second data to the terminal device. This addresses the problem of complex data processing flows. The technical solution of the present disclosure aims to simplify the data processing flow, improve data processing efficiency, and enhance the user experience.

Those of ordinary skill in the art can understand that all or some of the steps, systems, and functional modules/units in the devices disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. In hardware embodiments, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Additionally, it is known to those of ordinary skill in the art that communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

It will be understood that the term "and/or" as used in the specification and appended claims of the present disclosure refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, as used herein, the terms "include", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or system that includes a list of elements includes not only those elements but also other elements not expressly listed or that are inherent to the process, method, article or system. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.

The above serial numbers of the embodiments of the present disclosure are only for description and do not represent the advantages and disadvantages of the embodiments. The above are only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any person familiar with the technical field can easily think of various equivalent modifications or substitutions within the technical scope disclosed in the present disclosure, and these modifications or substitutions should be covered by the protection scope of this disclosure. Therefore, the protection scope of the present disclosure should be subject to the protection scope of the claims.

Claims

A data processing method including:

In response to the data acquisition request sent by the terminal device, determine the identification information of the first data based on the data acquisition request;

Determine each host and the POD corresponding to each host according to the identification information of the first data;

Acquire second data from at least one of the PODs according to preset data acquisition rules, wherein the second data is obtained by the host processing the first data in at least one of the PODs according to preset data processing rules;

Send the second data to the terminal device.
The data processing method according to claim 1, wherein determining each host and the corresponding POD of each host according to the identification information of the first data includes:

According to the identification information of the first data, each host and the corresponding POD of each host are determined from the preset association relationship between the data identification information, the host identification information and the POD identification information.
The data processing method according to claim 1, wherein obtaining the second data from at least one of the PODs according to preset data acquisition rules includes:

When the data acquisition request is used to indicate the acquisition of data of the same category, the amount of data to be fed back by each POD that stores the corresponding category data is determined based on the data amount of the category data that needs to be acquired;

The second data corresponding to the data amount is obtained from the corresponding POD respectively.
The data processing method according to claim 1, wherein obtaining the second data from at least one of the PODs according to preset data acquisition rules includes:

When the data acquisition request is used to indicate the acquisition of data of different categories, the amount of data to be fed back by any POD that stores the data of any category is determined based on the amount of data of any category that needs to be acquired;

Obtain the second data corresponding to the amount of data from any POD.
The data processing method according to any one of claims 1 to 4, further comprising:

Detect whether there is third data;

If third data exists, determine, based on preset data storage rules, each host that stores the third data of each category and the POD corresponding to each host;

Respectively instruct the corresponding host to store the third data of the corresponding category in the corresponding POD.
The data processing method according to claim 5, wherein determining each host that stores third data of each category and the POD corresponding to each host based on preset data storage rules includes:

Determine the number of categories corresponding to the third data;

According to the number of categories of third data and the number of hosts, determine the number of copies corresponding to each category of third data;

Determine the number of hosts that store the third data of the corresponding category according to the corresponding number of copies of each category of third data;

According to the preset correlation between the data identification information, the host identification information and the POD identification information, each host that stores the third data of each category and the POD corresponding to each host are determined.
The data processing method according to any one of claims 1 to 4, wherein before determining the identification information of the first data based on the data acquisition request in response to the data acquisition request sent by the terminal device, the method further includes:

Send identification information reporting instructions to each of the hosts to instruct each POD in each of the hosts to respectively report data identification information, host identification information and POD identification information;

According to the data identification information, the host identification information and the POD identification information, the association relationship between the data identification information, the host identification information and the POD identification information is determined.
The data processing method according to any one of claims 1 to 4, further comprising:

Detect whether there are host changes;

When a host change is detected, the data in each host and the corresponding POD of each host is adjusted based on the matching between the data identification information after the host change and the data identification information before the host change.
A data processing device, including a processor, a memory, a computer program stored on the memory and executable by the processor, and a data bus for realizing connection communication between the processor and the memory, wherein when the computer program is executed by the processor, the steps of the data processing method according to any one of claims 1 to 8 are implemented.
A storage medium for computer-readable storage, wherein the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the data processing method according to any one of claims 1 to 8.