CN112804335B - Data processing method, data processing device, computer readable storage medium and processor - Google Patents

Data processing method, data processing device, computer readable storage medium and processor

Info

Publication number
CN112804335B
Authority
CN
China
Prior art keywords
data
cluster
target data
shared
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110064888.5A
Other languages
Chinese (zh)
Other versions
CN112804335A (en)
Inventor
郝冰
陈震宇
刘国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Postal Savings Bank of China Ltd
Original Assignee
Postal Savings Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Postal Savings Bank of China Ltd
Priority to CN202110064888.5A
Publication of CN112804335A
Application granted
Publication of CN112804335B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5682 Policies or rules for updating, deleting or replacing the stored data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L67/63 Routing a service request depending on the request content or context

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method, a data processing device, a computer readable storage medium and a processor. The method comprises the following steps: acquiring a first request of a first cluster, wherein the first request is used for requesting to acquire target data; in response to the first request, judging whether a first shared data space storing the target data exists, wherein the first shared data space belongs to a second cluster, data stored in the first shared data space is shared by a plurality of clusters, and the plurality of clusters comprise the first cluster and the second cluster; if the first shared data space does not exist, sending a second request to an upstream system, wherein the second request is used for requesting the upstream system to issue the target data; and if the first shared data space exists, returning first storage information of the target data to the first cluster, wherein the first storage information is used for indicating that the target data is stored in the first shared data space in the second cluster. The invention solves the technical problem of low data processing efficiency.

Description

Data processing method, data processing device, computer readable storage medium and processor
Technical Field
The present invention relates to the field of data processing, and in particular, to a data processing method, apparatus, computer-readable storage medium, and processor.
Background
Currently, when a cluster needs to acquire data, the cluster needs to import the data from a data source and store the data in the cluster, so that the data can be provided for a user.
When multiple clusters reuse the same data, this approach requires the data to be imported and stored repeatedly, and a data request must be sent to the upstream system each time. The larger number of data requests puts pressure on the upstream system, so the approach suffers from the technical problem of low data processing efficiency.
In view of the technical problem of low efficiency of data processing, no effective solution has been proposed at present.
Disclosure of Invention
Embodiments of the present invention provide a data processing method, an apparatus, a computer-readable storage medium, and a processor, so as to at least solve the technical problem of low efficiency of data processing.
According to an aspect of an embodiment of the present invention, a data processing method is provided. The method can comprise the following steps: acquiring a first request of a first cluster, wherein the first request is used for requesting to acquire target data; responding to the first request, and judging whether a first shared data space storing target data exists or not, wherein the first shared data space belongs to a second cluster, data stored in the first shared data space are shared by a plurality of clusters, and the plurality of clusters comprise a first cluster and a second cluster; if the first shared data space does not exist, sending a second request to the upstream system, wherein the second request is used for requesting the upstream system to issue the target data; and if the first shared data space exists, returning first storage information of the target data to the first cluster, wherein the first storage information is used for indicating that the target data is stored in the first shared data space in the second cluster.
Optionally, after sending the second request to the upstream system, the method further comprises: storing the target data to a second shared data space, wherein the second shared data space belongs to a third cluster, the plurality of clusters comprise the third cluster, the second shared data space is an idle shared data space in the plurality of shared data spaces, and the plurality of shared data spaces are in one-to-one correspondence with the plurality of clusters; and second storage information of the target data is saved, wherein the second storage information is used for indicating that the target data is stored in a second shared data space in the third cluster.
Optionally, the free shared data space is a largest free shared data space of the plurality of shared data spaces.
Optionally, in a case that the second shared data space fails to store the target data, a prompt message is returned, where the prompt message is used to indicate that the request for obtaining the target data fails.
Optionally, if the third cluster is the same as the first cluster, the target data is directly queried by the first cluster; if the third cluster is not the same as the first cluster, the target data is queried by the first cluster from a second shared data space of the third cluster.
Optionally, before returning the first storage information of the target data to the first cluster, the method further comprises: searching the recorded plurality of storage information for the first storage information, wherein each piece of storage information is used to indicate that requested data has been stored in the shared data space of the corresponding cluster.
Optionally, after returning the first storage information of the target data to the first cluster, the method further comprises: recording request information, wherein the request information is used to indicate that the first cluster requests to acquire the target data from the first shared data space in the second cluster.
Optionally, after sending the second request to the upstream system or returning the first storage information of the target data to the first cluster, the method further comprises: acquiring a third request of a fourth cluster, wherein the plurality of clusters comprise the fourth cluster, and the third request is used for requesting to delete the target data; when no cluster among the plurality of clusters requests acquisition of the target data, allowing the target data to be deleted from the shared data space in which it is stored; and when a cluster among the plurality of clusters requests acquisition of the target data, prohibiting deletion of the target data from the shared data space in which it is stored.
Optionally, responding to the first request comprises: and responding to the first request under the condition that the attribute of the target data is a shared attribute, wherein the shared attribute is used for indicating that the target data is allowed to be shared by a plurality of clusters.
Optionally, the method further comprises: in the case that the attribute of the target data is a private attribute, sending the second request to the upstream system, wherein the private attribute is used for indicating that the target data is private to the first cluster; and storing the target data issued by the upstream system in the private data space of the first cluster.
According to another aspect of the embodiments of the present invention, a data processing device is also provided. The device includes: an acquisition unit, configured to acquire a first request of a first cluster, wherein the first request is used for requesting to acquire target data; a judging unit, configured to judge, in response to the first request, whether a first shared data space storing the target data exists, wherein the first shared data space belongs to a second cluster, data stored in the first shared data space is shared by a plurality of clusters, and the plurality of clusters comprise the first cluster and the second cluster; a sending unit, configured to send a second request to the upstream system when it is judged that the first shared data space does not exist, wherein the second request is used for requesting the upstream system to issue the target data; and a returning unit, configured to return first storage information of the target data to the first cluster when it is judged that the first shared data space exists, wherein the first storage information is used for indicating that the target data is stored in the first shared data space in the second cluster.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium. The computer readable storage medium includes a stored program, wherein when the program runs, the apparatus in which the computer readable storage medium is located is controlled to execute the data processing method of the embodiment of the present invention.
According to another aspect of the embodiments of the present invention, there is also provided a processor. The processor is configured to execute a program, wherein the program executes the data processing method according to the embodiment of the present invention when the program is executed by the processor.
In the embodiment of the invention, a first request of a first cluster is acquired, wherein the first request is used for requesting to acquire target data; in response to the first request, it is judged whether a first shared data space storing the target data exists, wherein the first shared data space belongs to a second cluster, data stored in the first shared data space is shared by a plurality of clusters, and the plurality of clusters comprise the first cluster and the second cluster; if the first shared data space does not exist, a second request is sent to the upstream system, wherein the second request is used for requesting the upstream system to issue the target data; and if the first shared data space exists, first storage information of the target data is returned to the first cluster, wherein the first storage information is used for indicating that the target data is stored in the first shared data space in the second cluster. That is to say, the data sharing module of the present application may receive the data application of each cluster and determine whether the required data already exists in a current shared data space. If the required data does not exist, the module applies for the data from the upstream system; if the required data exists, the module returns the storage information of the stored data. In a multi-cluster scenario, the data therefore needs to be applied for only once and stored only once, and can then be shared by the multiple clusters. This avoids repeated applications for the same data, saves the time spent applying for data, saves the space occupied by storing the data repeatedly, and reduces the pressure on the data provider, thereby solving the technical problem of low data processing efficiency and achieving the technical effect of improving data processing efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method of data processing according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a system for sharing data among multiple clusters, according to an embodiment of the present invention;
FIG. 3 is a flow chart of a method by which a cluster applies for data, in accordance with an embodiment of the present invention;
FIG. 4 is a flow diagram of a method for a cluster to request deletion of data according to an embodiment of the invention;
FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
In accordance with an embodiment of the present invention, an embodiment of a data processing method is provided. It should be noted that the steps illustrated in the flowcharts of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from the one given here.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention. As shown in FIG. 1, the method may include the steps of:
Step S102, a first request of the first cluster is obtained, where the first request is used to request to obtain target data.
In the technical solution provided in step S102 of the present invention, a data sharing module is created, and the data sharing module obtains a first request sent by a first cluster. The first cluster may be any one of a plurality of clusters that needs to obtain the target data. The first request is a data application request, used to request the data sharing module to obtain (apply for) the target data needed by the first cluster; the target data may be a data file. The first cluster is therefore the cluster, among the plurality of clusters, that initiates the data application this time, that is, the data application cluster.
In this embodiment, the storage of each cluster is divided into two spaces: a private data space and a shared data space. Data in the private data space is private to the cluster: it is used only by the users of the current cluster and cannot be shared with other clusters. This space may be referred to as the cluster private space. Data stored in the shared data space is shared among the clusters and can be accessed by other clusters. This space may be referred to as the cluster shared space. The shared data space is mainly managed by the data sharing module.
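The two-space layout can be illustrated with a minimal Python sketch. The `Cluster` class, the `read_from` helper and the use of in-memory dictionaries are all assumptions made for illustration; the patent does not specify how either space is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Cluster:
    """A cluster with a private and a shared data space (hypothetical model)."""
    name: str
    private_space: Dict[str, bytes] = field(default_factory=dict)  # this cluster only
    shared_space: Dict[str, bytes] = field(default_factory=dict)   # readable by all clusters


def read_from(owner: Cluster, reader_name: str, data_name: str) -> bytes:
    """Read data held by `owner` on behalf of the cluster named `reader_name`."""
    # Shared data may be read by any cluster.
    if data_name in owner.shared_space:
        return owner.shared_space[data_name]
    # Private data may be read only by the owning cluster itself.
    if reader_name == owner.name and data_name in owner.private_space:
        return owner.private_space[data_name]
    raise LookupError(f"{data_name!r} is not visible to cluster {reader_name!r}")
```

In this sketch another cluster can read an entry of `shared_space`, while a read of a `private_space` entry by any cluster other than the owner raises `LookupError`.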
Step S104, responding to the first request, and judging whether a first shared data space storing the target data exists.
In the technical solution provided in step S104 of the present invention, after the first request of the first cluster is obtained, whether a first shared data space storing target data exists is determined in response to the first request, where the first shared data space belongs to the second cluster, and data stored in the first shared data space is shared by multiple clusters, where the multiple clusters include the first cluster and the second cluster.
In this embodiment, the data sharing module responds to the acquired first request and determines whether a first shared data space storing the target data exists. To do so, it may determine whether this is the first time a cluster has requested the target data from the data sharing module. If it is the first time, no first shared data space storing the target data exists; if it is not the first time, a first shared data space storing the target data may exist.
Optionally, in response to the first request, the data sharing module in this embodiment may obtain its records of data requests. If those records do not include a record of a request to acquire the target data, it may be determined that a cluster is requesting the target data from the data sharing module for the first time; if they do include such a record, it may be determined that this is not the first such request.
In this embodiment, the plurality of clusters may include a second cluster, and the first shared data space is a shared data space in the second cluster.
Step S106, if the first shared data space does not exist, a second request is sent to an upstream system.
In the technical solution provided in step S106 of the present invention, after determining whether the first shared data space storing the target data exists, if it is determined that the first shared data space does not exist, sending a second request to the upstream system, where the second request is used to request the upstream system to issue the target data.
In this embodiment, if it is determined that no first shared data space storing the target data exists, that is, the data sharing module determines that a cluster is requesting the target data from it for the first time, the data sharing module sends a second request, namely a data acquisition request, to an upstream system. The second request is used to request the upstream system to issue the target data. The upstream system is the data source (data provider) of the data that the cluster needs to obtain. In this way the data sharing module obtains the target data, and the first cluster can then obtain it.
Step S108, if the first shared data space is judged to exist, returning first storage information of the target data to the first cluster.
In the technical solution provided by the above step S108 of the present invention, after determining whether the first shared data space storing the target data exists, if it is determined that the first shared data space exists, returning first storage information of the target data to the first cluster, where the first storage information is used to indicate that the target data is stored in the first shared data space in the second cluster.
In this embodiment, if it is determined that a first shared data space storing the target data exists, that is, the data sharing module determines that this is not the first request for the target data, the first storage information recorded in the data sharing module may be looked up. The first storage information is used to indicate that the target data is stored in the first shared data space in the second cluster; in other words, the second cluster is the cluster that currently stores the target data. Because the second cluster can be determined from the first storage information, the first storage information may be understood as cluster information. The data sharing module returns the first storage information to the first cluster, so that the first cluster can determine the second cluster from it and, through a communication connection with the second cluster, acquire the target data from the first shared data space of the second cluster.
Through steps S102 to S108 described above, a first request of a first cluster is obtained, where the first request is used to request to obtain target data; in response to the first request, it is judged whether a first shared data space storing the target data exists, wherein the first shared data space belongs to a second cluster, data stored in the first shared data space is shared by a plurality of clusters, and the plurality of clusters comprise the first cluster and the second cluster; if the first shared data space does not exist, a second request is sent to the upstream system, wherein the second request is used for requesting the upstream system to issue the target data; and if the first shared data space exists, first storage information of the target data is returned to the first cluster, wherein the first storage information is used for indicating that the target data is stored in the first shared data space in the second cluster. That is to say, the data sharing module in this embodiment may receive the data application of each cluster and determine whether the required data already exists in a current shared data space. If the required data does not exist, the module applies for the data from the upstream system; if the required data exists, the module returns the storage information of the stored data. In a multi-cluster scenario, the data therefore needs to be applied for only once and stored only once, and can then be shared by the multiple clusters. This avoids repeated applications for the same data, saves the time spent applying for data, saves the space occupied by storing the data repeatedly, and reduces the pressure on the data provider, thereby solving the technical problem of low data processing efficiency and achieving the technical effect of improving data processing efficiency.
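The decision made in steps S104 to S108 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name, the `storage_info` dictionary (data name mapped to the cluster whose shared data space holds it) and the `request_upstream` callback standing in for the second request are all hypothetical.

```python
from typing import Callable, Dict, Tuple, Union


def handle_first_request(
    data_name: str,
    storage_info: Dict[str, str],
    request_upstream: Callable[[str], bytes],
) -> Tuple[str, Union[str, bytes]]:
    """Serve a data application from the recorded storage information if possible."""
    # Step S104: is the target data already held in some cluster's shared space?
    holder = storage_info.get(data_name)
    if holder is None:
        # Step S106: no such space exists, so ask the upstream system (assumed callback).
        return ("from_upstream", request_upstream(data_name))
    # Step S108: return the storage information, i.e. which cluster holds the data.
    return ("stored_in", holder)
```

In this sketch the upstream callback is invoked only on a miss, so a second application for the same data places no load on the upstream system, provided the storage information was recorded after the first one.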
The above-described method of this embodiment is further described below.
As an optional implementation manner, after sending the second request to the upstream system in step S106, the method further includes: storing the target data to a second shared data space, wherein the second shared data space belongs to a third cluster, the plurality of clusters comprise the third cluster, the second shared data space is the largest free shared data space in the plurality of shared data spaces, and the plurality of shared data spaces are in one-to-one correspondence with the plurality of clusters; and second storage information of the target data is saved, wherein the second storage information is used for indicating that the target data is stored in a second shared data space in the third cluster.
In this embodiment, after the data sharing module sends the second request to the upstream system and acquires the target data delivered by the upstream system, the data sharing module may determine the second shared data space according to the usage of the shared data spaces of the multiple clusters, where the second shared data space may be an idle shared data space in the multiple shared data spaces of the multiple clusters, and may be a maximum idle shared data space in the multiple shared data spaces, and then store the target data delivered by the upstream system in the second shared data space of the third cluster.
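Choosing the largest free shared data space and saving the second storage information might look like the following sketch. The bookkeeping dictionaries `free_bytes` (cluster name mapped to free capacity of its shared space) and `storage_info` are assumptions; the patent does not say how usage is tracked.

```python
from typing import Dict


def store_in_largest_free_space(
    data_name: str,
    size: int,
    free_bytes: Dict[str, int],
    storage_info: Dict[str, str],
) -> str:
    """Place the data in the cluster with the most free shared space and record where it went."""
    if not free_bytes:
        raise ValueError("no shared data spaces are registered")
    # The "third cluster" is the one whose shared space currently has the most free capacity.
    target = max(free_bytes, key=lambda name: free_bytes[name])
    free_bytes[target] -= size            # account for the newly stored data
    storage_info[data_name] = target      # the "second storage information"
    return target
```

Picking the largest free space is one simple placement policy; it tends to spread shared data across clusters rather than filling one shared space first. This sketch does not check whether the data actually fits.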
As an optional implementation, the method further comprises: returning prompt information in the case that storing the target data in the second shared data space fails, wherein the prompt information is used for indicating that the request to acquire the target data has failed.
In this embodiment, before the target data is stored in the second shared data space, the size of the target data and the capacity of the second shared data space may be obtained first. If the capacity of the second shared data space is not enough to store the target data, it may be determined that storing the target data in the second shared data space fails, and prompt information may be returned to indicate that the request to acquire the target data has failed. In this case, the data sharing module may refrain from recording the information that the first cluster requested the target data this time.
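A minimal sketch of this capacity check follows. The function name, the wording of the prompt message and the `request_log` list are hypothetical; the sketch only shows that a failed store returns a prompt and leaves no request record.

```python
from typing import Dict, List, Optional, Tuple


def store_or_prompt(
    data_name: str,
    data_size: int,
    target_cluster: str,
    requesting_cluster: str,
    free_bytes: Dict[str, int],
    request_log: List[Tuple[str, str]],
) -> Optional[str]:
    """Return None on success, or a prompt message if the data does not fit."""
    if data_size > free_bytes[target_cluster]:
        # Storing fails: report it and do not record this request.
        return f"request for {data_name} failed: not enough shared space in {target_cluster}"
    free_bytes[target_cluster] -= data_size
    request_log.append((requesting_cluster, data_name))  # recorded only on success
    return None
```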
As an alternative embodiment, if the third cluster is the same as the first cluster, the target data is directly queried by the first cluster; if the third cluster is not the same as the first cluster, the target data is queried by the first cluster from a second shared data space of the third cluster.
In this embodiment, if the third cluster is the same as the first cluster, that is, the data sharing module stores the target data in the first cluster, so that the cluster initiating the data request and the cluster storing the requested data are the same cluster, the target data may be directly queried in the first cluster; and if the third cluster is different from the first cluster, the first cluster can be in communication connection with the third cluster, the target data is inquired from a second shared data space of the third cluster, and a result is returned.
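The local-versus-remote distinction can be sketched as below. The nested dictionary `shared_spaces` (cluster name mapped to its shared data space) is an assumption, and the "remote" branch merely labels the access path; a real system would open a connection to the third cluster at that point.

```python
from typing import Dict, Tuple


def query_target_data(
    first_cluster: str,
    holding_cluster: str,
    data_name: str,
    shared_spaces: Dict[str, Dict[str, bytes]],
) -> Tuple[str, bytes]:
    """Return the access path ("local" or "remote") together with the data."""
    data = shared_spaces[holding_cluster][data_name]
    # Same cluster: query directly. Different cluster: query over a connection (not modelled).
    path = "local" if holding_cluster == first_cluster else "remote"
    return path, data
```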
As an optional implementation manner, before returning the first storage information of the target data to the first cluster in step S108, the method further includes: searching the recorded plurality of storage information for the first storage information, wherein each piece of storage information is used for indicating that requested data is stored in the shared data space of the corresponding cluster.
In this embodiment, the data sharing module records a plurality of storage information, where each storage information is used to indicate that the requested data has been stored in the shared data space in the corresponding cluster, and may be used to determine the cluster that has stored the requested data. Before returning the first storage information of the target data to the first cluster in the data sharing module, the embodiment may search the first storage information in the recorded plurality of storage information, and further determine the second cluster in which the target data is stored according to the first storage information.
As an optional implementation manner, after returning the first storage information of the target data to the first cluster in step S108, the method further includes: recording request information, wherein the request information is used for indicating that the first cluster requests to acquire the target data from the first shared data space in the second cluster.
In this embodiment, after the data sharing module returns the first storage information of the target data to the first cluster, the request information for the current request may be recorded. The request information indicates that the target data is stored in the first shared data space in the second cluster and that the first cluster requests to acquire the target data from that space.
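One possible shape for the recorded request information is sketched below. The `RequestRecord` fields and the choice to skip duplicate records are assumptions; the patent only states what the request information indicates.

```python
from typing import List, NamedTuple


class RequestRecord(NamedTuple):
    """Hypothetical record: which cluster asked for which data, and where it is held."""
    requesting_cluster: str
    data_name: str
    holding_cluster: str


def record_request(
    log: List[RequestRecord],
    requesting_cluster: str,
    data_name: str,
    holding_cluster: str,
) -> RequestRecord:
    """Append a request record unless an identical one is already present."""
    record = RequestRecord(requesting_cluster, data_name, holding_cluster)
    if record not in log:  # assumption: repeated identical requests are recorded once
        log.append(record)
    return record
```

Records of this kind are what a later deletion request would need to consult in order to tell whether any cluster still depends on the data.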
As an optional implementation manner, in step S106, after sending the second request to the upstream system, or in step S108, returning the first storage information of the target data to the first cluster, the method further includes: acquiring a third request of a fourth cluster, wherein the plurality of clusters comprise the fourth cluster, and the third request is used for requesting to delete target data; allowing the target data to be deleted in a shared data space in which the target data is stored in the case where there is no cluster requesting acquisition of the target data among the plurality of clusters; when there is a cluster that requests acquisition of target data among the plurality of clusters, deletion of the target data in a shared data space in which the target data is stored is prohibited.
In this embodiment, the data sharing module may receive a third request of the fourth cluster, which may be for requesting the data sharing module to delete the target data. After the data sharing module receives the third request, it may be determined whether there is a cluster requesting to acquire the target data among the plurality of clusters, or whether there is an application record of the cluster requesting to acquire the target data, and if not, the target data may be deleted in the shared data space where the target data is stored; if so, the data sharing module records the third request for requesting the data sharing module to delete the target data, but the target data cannot be directly deleted in the shared data space in which the target data is stored.
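The deletion rule can be sketched as a check against the application records. This sketch makes one interpretive assumption that the text does not state explicitly: a deletion request withdraws the requesting cluster's own application record, and the data is removed only when no other cluster's record remains. All names are hypothetical.

```python
from typing import Dict, Set


def handle_delete_request(
    data_name: str,
    requesting_cluster: str,
    application_records: Dict[str, Set[str]],
    storage_info: Dict[str, str],
) -> bool:
    """Return True if the shared copy was deleted, False if deletion was withheld."""
    applicants = application_records.get(data_name, set())
    # Assumption: the cluster asking for deletion no longer counts as an applicant.
    applicants.discard(requesting_cluster)
    if applicants:
        # Some other cluster still has an application record: deletion is prohibited.
        return False
    # No cluster needs the data any more: drop the records and the stored copy's entry.
    application_records.pop(data_name, None)
    storage_info.pop(data_name, None)
    return True
```

This behaves like reference counting: the single shared copy survives until the last cluster that applied for it has asked for its deletion.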
As an optional implementation manner, step S104, in response to the first request, includes: and responding to the first request under the condition that the attribute of the target data is a shared attribute, wherein the shared attribute is used for indicating that the target data is allowed to be shared by a plurality of clusters.
In this embodiment, after obtaining the first request, it may be determined whether the attribute of the target data requested by the first request is a shared attribute, that is, whether the target data is data of a shared data space, where the data of the shared data space may be sharable by a plurality of clusters. If the attribute of the target data is judged to be the shared attribute, the first request can be directly responded, and whether a first shared data space storing the target data exists or not is judged.
As an optional implementation, the method further comprises: under the condition that the attribute of the target data is a private attribute, sending a second request to an upstream system, wherein the private attribute is used for indicating that the target data is private to the first cluster; and storing the target data issued by the upstream system into the private data space of the first cluster.
In this embodiment, after the first request is obtained, it may be determined that the attribute of the target data requested by the first request is not a shared attribute, that is, it is a private attribute. The target data is then private to the first cluster and belongs to the private data space. In this case, the data sharing module may directly send to the upstream system a second request asking it to issue the target data, and the issued target data is stored in the private data space of the first cluster. The data sharing module does not record the application record or the storage condition of the target data.
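As a non-limiting illustration, the attribute check described above can be sketched as a small routing function. The names `Attribute`, `DataRequest`, `route_request` and the two returned path labels are hypothetical and do not appear in the embodiment; the sketch only shows that a shared attribute leads to the shared-data-space check, while a private attribute leads directly to the upstream system.

```python
from dataclasses import dataclass
from enum import Enum


class Attribute(Enum):
    """Attribute of the target data (hypothetical representation)."""
    SHARED = "shared"    # data may be shared by a plurality of clusters
    PRIVATE = "private"  # data is private to the requesting cluster


@dataclass(frozen=True)
class DataRequest:
    cluster: str          # the first cluster, which sends the first request
    data_id: str          # identifies the target data
    attribute: Attribute  # shared or private


def route_request(req: DataRequest) -> str:
    """Return a label for the path the data sharing module would take."""
    if req.attribute is Attribute.SHARED:
        # Shared data: respond to the request by checking whether a
        # shared data space already stores the target data.
        return "check_shared_space"
    # Private data: send the second request to the upstream system directly;
    # the data goes to the requester's private data space and the module
    # keeps no application or storage record for it.
    return "fetch_upstream_to_private_space"
```

For example, `route_request(DataRequest("cluster1", "d1", Attribute.PRIVATE))` selects the direct upstream path.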
In this embodiment, in a multi-cluster environment, cluster requests such as data applications and data deletions must all be sent to the data sharing module, which receives and processes them. The data sharing module receives the data application of each cluster and judges whether the required data already exists in the current shared data space: if it does not, the module applies for the data from the upstream system; if it does, the module returns the storage information of the saved data. Deletion is handled in the same centralized way: after receiving a deletion request, the module does not delete the data while an application record for that data exists, and deletes it only once no application record remains. As a result, only one copy of the data is stored however many times the same data is applied for, and the data is shared among the clusters. In a multi-cluster environment this reduces the storage occupied by identical data, speeds up applying for and loading the same data, reduces data transmission, uses cluster resources more efficiently, and relieves the upstream system, which needs to provide repeated data only once. The technical problem of low data processing efficiency is thereby solved, and the technical effect of improving data processing efficiency is achieved.
Example 3
The technical solutions of the embodiments of the present invention are described below by way of examples with reference to preferred embodiments.
In the related art, when data needs to be acquired, a cluster needs to import the data from a data source and store the data in the cluster, so that the data can be provided for a user; when more clusters reuse data, the method needs to repeatedly import and store the data, and the upstream system is required to provide the data every time the data is applied, and the upstream system is stressed by more data applications.
In view of the above problems, the embodiment is mainly used for solving the problem of multi-cluster data sharing, and data only needs to be applied once in a multi-cluster scene, and can be shared and used by multiple clusters, and the data also only needs to be stored once, so that the time for data repeated application is saved, the space occupied by data repeated storage is saved, and the pressure of a data provider for providing data is reduced.
FIG. 2 is a schematic diagram of a system for sharing data among multiple clusters, according to an embodiment of the invention. As shown in fig. 2, the system includes: cluster 1, cluster 2, ..., cluster n, data sharing module 21 and upstream system 22.
In this embodiment, a data sharing module 21 is constructed. Requests such as data applications and data deletions from cluster 1, cluster 2, ..., cluster n must all be sent to the data sharing module 21, which receives and processes the corresponding requests.
The data sharing module 21 in this embodiment receives the data application of each cluster, and determines whether the current shared data space already has the required data, and if the current shared data space does not have the required data, the data is required to be applied from the upstream system 22; if the shared data space already exists, the storage information of the data saved in the shared data space can be returned.
Deletion is handled in the same way: the data sharing module 21 receives the deletion requests of all clusters. After receiving a deletion request, it does not delete the data if an application record for the data still exists; the data is deleted only when no application record remains.
In this embodiment, two data storage spaces are divided in each cluster: a private data space and a shared data space. The data stored in the private data space is private to the cluster and cannot be shared with other clusters, that is, it is used only by users of the current cluster. The data stored in the shared data space is shared among the clusters and can be accessed by other clusters. The shared data space is mainly managed by the data sharing module.
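As a non-limiting illustration, the division of each cluster into two data spaces can be modelled as follows. The class `Cluster`, its fields, and the `read` method are hypothetical names chosen for this sketch, and in-memory dictionaries stand in for real storage; the sketch only shows the visibility rule, namely that shared data is readable by any cluster while private data is readable only by the owning cluster.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Cluster:
    name: str
    # data_id -> data; dictionaries are a stand-in for real storage
    private_space: Dict[str, str] = field(default_factory=dict)
    shared_space: Dict[str, str] = field(default_factory=dict)

    def read(self, data_id: str, requester: "Cluster") -> Optional[str]:
        """Return the data visible to `requester`, or None if not visible."""
        if data_id in self.shared_space:
            # Shared data space: accessible by every cluster.
            return self.shared_space[data_id]
        if requester is self and data_id in self.private_space:
            # Private data space: accessible only by the owning cluster.
            return self.private_space[data_id]
        return None
```

In this model, a second cluster that calls `read` on the first cluster obtains entries of the shared data space but never entries of the private data space.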
The following describes a specific implementation of this embodiment.
Fig. 3 is a flowchart of a method for clustered data application according to an embodiment of the present invention. As shown in fig. 3, the method may include the steps of:
step S301, the cluster sends a request for applying for data to the data sharing module.
When a cluster applies for data, it sends a request for applying for data to the data sharing module. If the request is for data of the private data space, the data sharing module can directly initiate a data application to the upstream system and does not record the application record or storage condition of the data. If the request is for data of the shared data space, step S302 is executed.
In step S302, the data sharing module determines, according to the elements in the request for applying for data, whether the data is being applied for the first time.
If the data sharing module determines that the data is applied for the first time, step S303 is executed, otherwise, step S305 is executed.
Step S303, the data sharing module initiates a request for applying for data to the upstream system.
If the data is applied for the first time, the data sharing module can initiate a request for applying for the data to an upstream system.
Step S304, according to the use condition of the shared data space of each cluster, preferentially storing the data issued by the upstream system into the cluster with larger idle shared data space.
This embodiment may also record which cluster holds the largest free shared data space. If the data applied for is too large to fit in the recorded largest free shared data space, it is determined that the data application fails.
In step S305, cluster information in which the requested data has already been stored is returned.
If the data is not being applied for the first time, the data sharing module searches its recorded information, finds the information of the cluster in which the requested data is stored, and returns that information to the cluster that initiated the current application. Using the cluster information, the applying cluster connects to the cluster that stores the data and obtains the required data from that cluster's shared data space. The data sharing module also records the current application information.
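As a non-limiting illustration, steps S302 to S305 above can be sketched as a single method of a data sharing module. The class `DataSharingModule`, its fields, and the callable `fetch_upstream` (a stand-in for the upstream system) are hypothetical. The sketch further assumes that at least one cluster is registered, that data size is the byte length of the data, and that the first application is recorded and answered with the storing cluster in the same way as later ones; the embodiment does not state these details.

```python
from typing import Callable, Dict, List, Optional


class DataSharingModule:
    def __init__(self, free_space: Dict[str, int],
                 fetch_upstream: Callable[[str], bytes]) -> None:
        self.free_space = dict(free_space)       # cluster -> free shared space
        self.fetch_upstream = fetch_upstream     # stand-in for upstream system
        self.location: Dict[str, str] = {}       # data_id -> storing cluster
        self.applications: Dict[str, List[str]] = {}  # data_id -> applicants
        # cluster -> contents of its shared data space
        self.shared: Dict[str, Dict[str, bytes]] = {c: {} for c in free_space}

    def apply(self, requester: str, data_id: str) -> Optional[str]:
        """Return the cluster storing `data_id`, or None if storing fails."""
        if data_id not in self.location:             # S302: first application
            data = self.fetch_upstream(data_id)      # S303: ask upstream system
            # S304: prefer the cluster with the largest free shared space.
            target = max(self.free_space, key=self.free_space.get)
            if len(data) > self.free_space[target]:
                return None                          # application fails
            self.shared[target][data_id] = data
            self.free_space[target] -= len(data)
            self.location[data_id] = target
        # Record this application, then return the cluster information (S305).
        self.applications.setdefault(data_id, []).append(requester)
        return self.location[data_id]
```

With this structure, a second application for the same `data_id` is answered from the recorded `location` and does not reach the upstream system again.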
In this embodiment, when a cluster queries for data, the results may be returned directly if the required data happens to be stored in the present cluster. If the required data is stored in other clusters, the data can be obtained from the shared data space of the cluster by connecting to the cluster storing the data, and the result is returned.
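As a non-limiting illustration, the query behaviour described above can be sketched as follows. The function `query`, the type alias `SharedSpace`, and the callable `connect` (which stands in for connecting to another cluster and reaching its shared data space) are hypothetical; a real connection mechanism is not specified by the embodiment.

```python
from typing import Callable, Dict

SharedSpace = Dict[str, str]  # data_id -> data, a stand-in for real storage


def query(requester: str, holder: str, data_id: str,
          local_shared: SharedSpace,
          connect: Callable[[str], SharedSpace]) -> str:
    """Read `data_id` for `requester`, given that cluster `holder` stores it."""
    if holder == requester:
        # The required data is stored in the present cluster:
        # return the result directly.
        return local_shared[data_id]
    # Otherwise connect to the cluster storing the data and read the data
    # from that cluster's shared data space.
    remote_shared = connect(holder)
    return remote_shared[data_id]
```

Here `holder` would be the cluster information returned by the data sharing module for the requested data.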
Fig. 4 is a flowchart of a method for requesting deletion of data by a cluster according to an embodiment of the present invention. As shown in fig. 4, the method may include the steps of:
step S401, the cluster sends a request for deleting data to the data sharing module.
When data is deleted, a request to delete the data is sent to the data sharing module.
Step S402, the data sharing module checks whether data application records of other clusters exist.
If the data application records of other clusters exist, executing step S403; otherwise, step S405 is executed.
Step S403, the data sharing module records the data deletion request.
And if other cluster data application records exist, the data sharing module records the data deletion request.
In step S404, the data is not deleted directly.
In step S405, the data stored in the other cluster is deleted.
If there is no record of other cluster data applications, the data is deleted.
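As a non-limiting illustration, steps S401 to S405 above can be sketched with a per-data set of applying clusters. The class `DeletionTracker` and its fields are hypothetical. The sketch also assumes that recording a deletion request removes the requesting cluster's own application record, so that the last remaining applicant can eventually delete the data; the embodiment does not spell out this detail.

```python
from typing import Dict, List, Set, Tuple


class DeletionTracker:
    def __init__(self) -> None:
        self.applicants: Dict[str, Set[str]] = {}   # data_id -> applying clusters
        self.stored: Set[str] = set()               # data currently stored
        self.delete_requests: List[Tuple[str, str]] = []  # recorded requests

    def record_application(self, cluster: str, data_id: str) -> None:
        """Record that `cluster` applied for `data_id`, which is stored."""
        self.stored.add(data_id)
        self.applicants.setdefault(data_id, set()).add(cluster)

    def request_delete(self, cluster: str, data_id: str) -> bool:
        """Handle a deletion request (S401); return True if data was deleted."""
        users = self.applicants.get(data_id, set())
        users.discard(cluster)   # assumption: requester gives up its own record
        if users:                # S402: other clusters still have records
            self.delete_requests.append((cluster, data_id))  # S403
            return False         # S404: data is not deleted directly
        self.stored.discard(data_id)                         # S405
        self.applicants.pop(data_id, None)
        return True
```

In this model the data survives until the last cluster holding an application record requests deletion.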
With the above method, in a multi-cluster environment the data sharing module manages operations such as data application and deletion. On this basis, when a piece of data already exists, repeated applications for the same data result in only one stored copy, which reduces the number of imports into the clusters and saves import time. Data is shared among the clusters, the storage occupied by identical data is reduced, applying for and loading the same data becomes faster, and data transmission is reduced. Cluster resources are used more efficiently, and the pressure on the upstream system is relieved because repeated data needs to be provided only once. Repeated loading (importing) and repeated storage of data are thereby reduced, the technical problem of low data processing efficiency is solved, and the technical effect of improving data processing efficiency is achieved.
Example 3
The embodiment of the invention also provides a data processing device. It should be noted that the data processing apparatus of this embodiment is configured to execute the data processing method of the embodiment of the present invention.
Fig. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention. As shown in fig. 5, the data processing apparatus 50 may include: an acquisition unit 51, a determining unit 52, a sending unit 53, and a returning unit 54.
The acquisition unit 51 is configured to obtain a first request of the first cluster, where the first request is used to request to obtain target data.
The determining unit 52 is configured to determine, in response to the first request, whether a first shared data space storing the target data exists, where the first shared data space belongs to a second cluster, and data stored in the first shared data space is shared by multiple clusters, where the multiple clusters include the first cluster and the second cluster.
The sending unit 53 is configured to send a second request to the upstream system when it is determined that the first shared data space does not exist, where the second request is used to request the upstream system to issue the target data.
The returning unit 54 is configured to, when it is determined that the first shared data space exists, return first storage information of the target data to the first cluster, where the first storage information is used to indicate that the target data is stored in the first shared data space in the second cluster.
In the data processing apparatus of this embodiment, the data sharing module may obtain the data application of each cluster and determine whether the required data already exists in the current shared data space. If the required data does not exist, the module applies for the data from an upstream system; if it exists, the module returns the storage information of the saved data. In a multi-cluster scenario, the data therefore needs to be applied for only once and stored only once, and can be shared by the multiple clusters. Repeated applications for the same data are avoided, data application time and the space occupied by repeated storage are saved, and the pressure on the data provider is reduced. The technical problem of low data processing efficiency is solved, and the technical effect of improving data processing efficiency is achieved.
Example 4
According to an embodiment of the present invention, there is also provided a computer-readable storage medium. The computer readable storage medium includes a stored program, wherein when the program runs, the apparatus in which the computer readable storage medium is located is controlled to execute the data processing method of the embodiment of the present invention.
Example 5
According to an embodiment of the present invention, there is further provided a processor, configured to execute the program, where the program executes the data processing method according to the embodiment of the present invention when the program is executed by the processor.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (11)

1. A data processing method, comprising:
acquiring a first request of a first cluster, wherein the first request is used for requesting to acquire target data;
responding to the first request, and judging whether a first shared data space storing the target data exists or not, wherein the first shared data space belongs to a second cluster, and data stored in the first shared data space is shared by a plurality of clusters, and the plurality of clusters comprise the first cluster and the second cluster;
if the first shared data space does not exist, sending a second request to an upstream system, wherein the second request is used for requesting the upstream system to issue the target data;
if the first shared data space exists, returning first storage information of the target data to the first cluster, wherein the first storage information is used for indicating that the target data is stored in the first shared data space in the second cluster;
wherein responding to the first request and judging whether a first shared data space storing the target data exists comprises: judging whether the first cluster requests to acquire the target data for the first time; in response to the first cluster requesting to acquire the target data for the first time, determining that the first shared data space does not exist; and in response to the target data not being requested for the first time, determining that the first shared data space exists;
wherein after sending the second request to the upstream system, the method further comprises: storing the target data to a second shared data space, wherein the second shared data space belongs to a third cluster, the plurality of clusters include the third cluster, and the second shared data space is a free shared data space in a plurality of shared data spaces, wherein the free shared data space is a largest free shared data space in the plurality of shared data spaces, and the plurality of shared data spaces are in one-to-one correspondence with the plurality of clusters; and second storage information of the target data is saved, wherein the second storage information is used for indicating that the target data is stored in the second shared data space in the third cluster.
2. The method of claim 1, further comprising:
and returning prompt information under the condition that the target data is failed to be stored in the second shared data space, wherein the prompt information is used for indicating that the target data is failed to be acquired.
3. The method of claim 1, wherein if the third cluster is the same as the first cluster, the target data is queried directly by the first cluster; the target data is queried by the first cluster from the second shared data space of the third cluster if the third cluster is not the same as the first cluster.
4. The method of claim 1, wherein prior to returning the first stored information of the target data to the first cluster, the method further comprises:
and searching the first storage information in a plurality of recorded storage information, wherein each storage information is used for indicating that the requested data is stored in the shared data space in the corresponding cluster.
5. The method of claim 1, wherein after returning the first stored information of the target data to the first cluster, the method further comprises:
recording request information, wherein the request information is used for indicating that the first cluster requests to acquire the target data from the first shared data space in the second cluster.
6. The method of claim 1, wherein after sending a second request to an upstream system or returning first stored information of the target data to the first cluster, the method further comprises:
obtaining a third request of a fourth cluster, wherein the plurality of clusters comprise the fourth cluster, and the third request is used for requesting to delete the target data;
allowing the target data to be deleted in a shared data space in which the target data is stored if there is no cluster requesting to acquire the target data among the plurality of clusters;
and when the cluster requesting to acquire the target data exists in the plurality of clusters, forbidding to delete the target data in the shared data space stored by the target data.
7. The method of any of claims 1 to 6, wherein responding to the first request comprises:
and responding to the first request under the condition that the attribute of the target data is a shared attribute, wherein the shared attribute is used for indicating that the target data is allowed to be shared by the plurality of clusters.
8. The method of claim 7, further comprising:
sending the second request to the upstream system when the attribute of the target data is a private attribute, wherein the private attribute is used for indicating that the target data is private by the first cluster;
and storing the target data issued by the upstream system into a private data space of the first cluster.
9. A data processing apparatus, comprising:
an acquisition unit, configured to acquire a first request of a first cluster, wherein the first request is used for requesting to acquire target data;
a determining unit, configured to determine, in response to the first request, whether a first shared data space in which the target data is stored exists, where the first shared data space belongs to a second cluster, and data stored in the first shared data space is shared by multiple clusters, where the multiple clusters include the first cluster and the second cluster;
a sending unit, configured to send a second request to an upstream system when it is determined that the first shared data space does not exist, where the second request is used to request the upstream system to issue the target data;
a returning unit, configured to return first storage information of the target data to the first cluster when it is determined that the first shared data space exists, where the first storage information is used to indicate that the target data is stored in the first shared data space in the second cluster;
wherein the determining unit is configured to determine whether a first shared data space storing the target data exists in response to the first request by: judging whether the first cluster requests to acquire the target data for the first time; in response to the first cluster requesting to acquire the target data for the first time, determining that the first shared data space does not exist; and in response to the target data not being requested for the first time, determining that the first shared data space exists;
wherein, the device includes: a storage unit, configured to store the target data to a second shared data space after sending a second request to an upstream system, where the second shared data space belongs to a third cluster, and the plurality of clusters include the third cluster, and the second shared data space is a free shared data space in a plurality of shared data spaces, and the plurality of shared data spaces are in one-to-one correspondence with the plurality of clusters, where the free shared data space is a largest free shared data space in the plurality of shared data spaces; a saving unit, configured to save second storage information of the target data, where the second storage information is used to indicate that the target data has been stored in the second shared data space in the third cluster.
10. A computer-readable storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the storage medium is located to perform the method of any of claims 1-8.
11. A processor, characterized in that the processor is configured to run a program, wherein the program when run by the processor performs the method of any of claims 1 to 8.
CN202110064888.5A 2021-01-18 2021-01-18 Data processing method, data processing device, computer readable storage medium and processor Active CN112804335B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110064888.5A CN112804335B (en) 2021-01-18 2021-01-18 Data processing method, data processing device, computer readable storage medium and processor


Publications (2)

Publication Number Publication Date
CN112804335A CN112804335A (en) 2021-05-14
CN112804335B true CN112804335B (en) 2022-11-22

Family

ID=75810194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110064888.5A Active CN112804335B (en) 2021-01-18 2021-01-18 Data processing method, data processing device, computer readable storage medium and processor

Country Status (1)

Country Link
CN (1) CN112804335B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant