CN107526530B - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN107526530B
Authority
CN
China
Prior art keywords
data
remote system
remote
request
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610453751.8A
Other languages
Chinese (zh)
Other versions
CN107526530A (en)
Inventor
黄刚
曹逾
高雯雯
袁丹
崔妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EMC Corp
Original Assignee
EMC IP Holding Co LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by EMC IP Holding Co LLC
Priority to CN201610453751.8A
Priority to US15/628,624 (US20170364293A1)
Publication of CN107526530A
Application granted
Publication of CN107526530B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062Securing storage systems
    • G06F3/0622Securing storage systems in relation to access
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/248Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/258Data format conversion from or to a database
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062Securing storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0661Format or protocol conversion arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0689Disk arrays, e.g. RAID, JBOD

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure relate to a data processing method and apparatus. For example, a method is proposed, comprising: acquiring an intermediate identifier of data to be processed in an intermediate system; converting the intermediate identity to a first identity in a remote system based on an identity mapping between the intermediate system and the remote system; and processing the data in association with the remote system based at least in part on the first identification. Corresponding apparatus and computer program products are also disclosed.

Description

Data processing method and device
Technical Field
Embodiments of the present invention relate generally to data processing, and more particularly, to a data processing method and apparatus.
Background
Currently, there is an increasing demand for data storage. Widely used storage systems include, for example, file systems, block stores, and object stores. In contrast to other storage systems, such as a file system, which manages data as a file hierarchy, and block storage, which manages data as blocks, object storage is a storage architecture that manages data as objects.
Taking object storage as an example, this approach is suited to storing unstructured data and allows for relatively inexpensive, scalable, and self-healing retention of large amounts of data. Some solutions have been proposed for public cloud object storage services. Further, there are solutions that aim to provide private cloud object storage services. These known solutions have some features in common, e.g., they are based on the HTTP/HTTPS protocol, expose simple REST-style read/write Application Programming Interfaces (APIs), and each rely on a specific API. When users use existing object storage services, they typically encounter low efficiency and security issues, which directly degrade the user experience.
Disclosure of Invention
Embodiments of the present disclosure provide a data processing method, apparatus and corresponding computer program product.
According to a first aspect of the present disclosure, a data processing method is provided. The method comprises the following steps: acquiring an intermediate identifier of data to be processed in an intermediate system; converting the intermediate identity to a first identity in a remote system based on an identity mapping between the intermediate system and the remote system; and processing the data in association with the remote system based at least in part on the first identification.
In some embodiments, obtaining the intermediate identity comprises: receiving a user request from a client to manipulate the data at the remote system; and extracting the intermediate identity of the data from the user request.
In some embodiments, processing the data in association with the remote system includes: generating a first request for performing the operation at the remote system based on the user request, the first request including the first identification; and sending the first request to the remote system.
In some embodiments, the operation in the user request comprises a read of the data, and processing the data in association with the remote system further comprises: receiving the data from the remote system; and sending the data to the client.
In some embodiments, the remote system is a first remote system and the operation in the user request comprises an update to the data at the first remote system, wherein processing the data in association with the remote system further comprises: converting the intermediate identity of the data to a second identity of the data at a second remote system based on an identity mapping between the intermediate system and the second remote system; generating a second request for the update to the data at the second remote system, the second request including the second identification; and sending the second request to the second remote system.
In some embodiments, the update includes at least one of: creation, deletion and modification.
In some embodiments, generating the first request comprises: the first request is generated using a different syntax than the user request.
In some embodiments, at least one of the user request and the first request includes a key associated with the data.
In some embodiments, the remote system is a first remote system, and processing the data in association with the remote system comprises: converting the intermediate identifier to a third identification of the data in a third remote system based on an identification mapping between the intermediate system and the third remote system, the third remote system being different from the first remote system; obtaining the data from the third remote system using the third identification; and storing data to the first remote system using the first identification.
In some embodiments, processing the data in association with the remote system further comprises: deleting the data from the third remote system in response to at least one of: it is determined that the data has been stored intact at the first remote system and that processing of pending requests for the data is complete.
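By way of illustration only, the migration embodiment above can be sketched as follows. The function name `migrate`, the `"source"`/`"target"` keys, and the modelling of each remote system as a plain dictionary are assumptions made for this sketch and are not taken from the disclosure. The sketch also makes the conservative choice of deleting the source copy only when both conditions hold, whereas the embodiment permits deletion in response to at least one of them.

```python
from typing import Dict


def migrate(intermediate_id: str,
            id_map: Dict[str, Dict[str, str]],
            source: Dict[str, bytes],
            target: Dict[str, bytes],
            pending_requests: int = 0) -> bool:
    """Copy data from the source (third) remote system to the target (first) one.

    Returns True when the source copy was deleted. Both remote systems are
    modelled as plain dicts; the "source"/"target" keys are hypothetical.
    """
    third_id = id_map[intermediate_id]["source"]   # third identification
    first_id = id_map[intermediate_id]["target"]   # first identification
    data = source[third_id]                        # obtain data from the third remote system
    target[first_id] = data                        # store data to the first remote system
    stored_intact = target.get(first_id) == data
    if stored_intact and pending_requests == 0:
        del source[third_id]                       # delete only once it is safe to do so
        return True
    return False
```

Calling `migrate` while requests are still pending leaves both copies in place; a later call with no pending requests removes the source copy.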
According to a second aspect of the present disclosure, an electronic device is provided. The electronic device includes: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing machine-executable instructions that, when executed by the at least one processing unit, cause the at least one processing unit to be configured to: acquiring an intermediate identifier of data to be processed in an intermediate system; converting the intermediate identity to a first identity in a remote system based on an identity mapping between the intermediate system and the remote system; and processing the data in association with the remote system based at least in part on the first identification.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the disclosure, nor is it intended to be used to limit the scope of the disclosure.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following more particular descriptions of exemplary embodiments of the disclosure as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the disclosure.
FIG. 1 shows a schematic diagram of a storage system in a conventional solution;
FIG. 2 shows a schematic diagram of a storage system according to an embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of an identity mapping in an intermediate system according to an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of an intermediate system reading data in association with a remote system, in accordance with an embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of an intermediate system updating data in association with a remote system, according to an embodiment of the disclosure;
FIG. 6 illustrates a schematic diagram of an intermediate system migrating data in association with a remote system, according to an embodiment of the disclosure;
FIG. 7 shows a flow diagram of a data processing procedure or method according to an embodiment of the present disclosure;
FIG. 8 shows a schematic block diagram of an apparatus for data processing according to an embodiment of the present disclosure; and
FIG. 9 shows a schematic block diagram of a device suitable for use to implement embodiments of the present disclosure.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The term "include" and variations thereof as used herein is meant to be inclusive in an open-ended manner, i.e., "including but not limited to". Unless specifically stated otherwise, the term "or" means "and/or". The term "based on" means "based at least in part on". The terms "one example embodiment" and "one embodiment" mean "at least one example embodiment". The term "another embodiment" means "at least one additional embodiment". The terms "first," "second," and the like may refer to different or the same object. Other explicit and implicit definitions are also possible below.
Fig. 1 shows a schematic diagram of a storage system 100 in a conventional solution. The storage system 100 may include a client 110 and remote systems 130-1, …,130-N,130-(N+1) (collectively "remote systems 130"), where N is a natural number. The remote systems 130-1, …,130-N,130-(N+1) may store mass data to provide storage services for the client 110. The remote systems 130-1, …,130-N,130-(N+1) may provide object storage services, file system storage services, block storage services, etc., and the type of remote system 130-1, …,130-N,130-(N+1) does not constitute a limitation of the embodiments of the present disclosure, as long as it is capable of providing storage services. The client 110 may be a desktop computer, a notebook computer, a tablet computer, a smart phone, a personal digital assistant, a reader, an audio player, a camera, etc., and the type of client 110 does not constitute a limitation of the embodiments of the present disclosure.
As shown in FIG. 1, the client 110 connects to the remote systems 130-1, …,130-N and stores its data to the remote systems 130-1, …, 130-N. However, when the client 110 wishes to migrate data to another remote system, such as the remote system 130-(N+1), it is difficult for the client 110 to migrate the data it stores on the remote systems 130-1, …,130-N, and the client 110 therefore runs the risk of being locked to the remote systems 130-1, …, 130-N.
Furthermore, in some cases, the client 110 may wish that its data stored on the remote systems 130-1, …,130-N flow between different remote systems as needed. However, because different remote systems are configured differently, moving data between them often requires the user to configure each system separately, which makes such data flow difficult to achieve.
Furthermore, the user needs a method of measuring different remote systems in multiple aspects such as cost, performance, and SLA (service level agreement) before deciding whether to store data. For example, enterprise users have high requirements on performance and SLA of remote systems and low requirements on cost control, and by this measure, enterprise users can select the best choice for a particular storage requirement, e.g., for enterprise applications. However, in the existing storage system 100, the user is unable to measure aspects of the remote systems 130-1, …,130-N and therefore is unable to make an optimal selection.
Therefore, when a user uses an existing storage system, the user is exposed to the risk of being locked to a specific object storage service, data flow between different remote systems is difficult to achieve, and the remote systems are difficult to measure to make an optimal selection, so that the efficiency and the safety of the existing storage system are not guaranteed, and the user experience is directly reduced.
To address the above and other potential problems and deficiencies, embodiments of the present disclosure provide a data processing scheme. FIG. 2 shows a schematic diagram of a storage system 200 according to an embodiment of the present disclosure. The differences between the storage system 200 according to an embodiment of the present disclosure and the storage system 100 in the existing solution will be described in detail below with reference to fig. 1.
In particular, in the following discussion, a description will be given mainly with an example in which a data object is an object to be operated. It should be understood that this is by way of example only and is not intended to limit the scope of the present disclosure in any way. In other embodiments, the data may be stored in any suitable manner, whether presently known or developed in the future.
Similar to the storage system 100 shown in FIG. 1, the storage system 200 shown in FIG. 2 may include a client 110 and remote systems 130-1, …, 130-N. Unlike the storage system 100 shown in FIG. 1, the storage system 200 shown in FIG. 2 may also include an intermediate system 220. As shown in FIG. 2, the client 110 is connected to the remote systems 130-1, …,130-N through an intermediate system 220.
The intermediate system 220 may enable the client 110 to operate transparently on data on the remote systems 130-1, …, 130-N. In particular, the intermediate system 220 may provide a set of interfaces that are compatible with the remote systems 130-1, …,130-N such that the intermediate system 220 appears identical to the remote systems 130-1, …,130-N to the client 110. Further, the intermediate system 220 can generate a universal intermediate identifier for the data (e.g., each data object), such intermediate identifier being independent of the remote system 130. Accordingly, the intermediate system 220 may maintain a mapping relationship, referred to as an "identity mapping," that records the correspondence between the intermediate identifications of the data and their remote identifications in the remote systems 130-1, …, 130-N. In some embodiments, the intermediate system 220 may also store metadata for the data, which may be used to identify and describe the data and may include information such as the intermediate identification and the remote identification. Moreover, the intermediate system 220 may measure aspects of the remote systems 130-1, …,130-N, such as cost, performance, and SLA (service level agreement), to assist the user in making the best choice.
Several example operations/functions of the intermediate system 220 will be described below in conjunction with fig. 4-6. Referring first to fig. 3, a schematic diagram of an identity mapping 300 in an intermediate system is shown, according to an embodiment of the present disclosure. As described above, the intermediate system 220 stores an identification mapping between the intermediate identifier of the data in the intermediate system 220 and its remote identifier in the remote system 130. In the example of fig. 3, such an identification mapping is implemented by means of a mapping table 300. Of course, this is merely exemplary and the identification mapping may be stored using any suitable data structure and/or format.
As shown in FIG. 3, column 310 in mapping table 300 represents the intermediate identification of the data, and the other columns represent the remote identifications corresponding to the intermediate identification in the respective remote systems 130-1, …, 130-N. For example, the value in the record in element (2,2) represents the remote identification #1-130-1 of the data with the intermediate identification #1 on the remote system 130-1. The intermediate system 220 may translate between the intermediate identity and the remote identity by querying the mapping table 300.
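By way of illustration only, the mapping table 300 can be sketched as a dictionary keyed by intermediate identification. The class name `IdentityMap` and its method names are hypothetical; the disclosure does not prescribe a data structure, and this sketch is only one of many possible forms.

```python
from typing import Dict, Optional


class IdentityMap:
    """Hypothetical in-memory stand-in for mapping table 300."""

    def __init__(self) -> None:
        # intermediate identification -> {remote system name -> remote identification}
        self._rows: Dict[str, Dict[str, str]] = {}

    def add(self, intermediate_id: str, remote_system: str, remote_id: str) -> None:
        """Record the remote identification of one item of data on one remote system."""
        self._rows.setdefault(intermediate_id, {})[remote_system] = remote_id

    def to_remote(self, intermediate_id: str, remote_system: str) -> str:
        """Translate an intermediate identification into a remote identification."""
        return self._rows[intermediate_id][remote_system]

    def to_intermediate(self, remote_system: str, remote_id: str) -> Optional[str]:
        """Reverse lookup; returns None when the remote identification is unknown."""
        for intermediate_id, remotes in self._rows.items():
            if remotes.get(remote_system) == remote_id:
                return intermediate_id
        return None


table = IdentityMap()
table.add("#1", "130-1", "#1-130-1")
table.add("#1", "130-2", "#1-130-2")
```

Here `table.to_remote("#1", "130-1")` corresponds to reading the cell for intermediate identification #1 in the column of remote system 130-1.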
FIG. 4 shows a schematic diagram of a process 400 for the intermediate system 220 to read data in association with the remote system 130-1, in accordance with an embodiment of the present disclosure. As shown in FIG. 4, the intermediate system 220 may receive (410) a user request from the client 110 to read data at the remote system 130-1. Assume that the target data for which the user request is intended will be read from the first remote system 130-1.
In response to the request, the intermediate system 220 converts (420) the intermediate identification that the data has in the intermediate system 220 to an identification of the data in the target remote system (in this example, the first remote system 130-1), referred to as the "first identification". In some embodiments, the intermediate identification is included in the user request. In such an embodiment, the intermediate system 220 may extract the intermediate identification from the user request and convert the intermediate identification to the first identification based on an identification mapping (e.g., mapping table 300 shown in fig. 3).
In some embodiments, the user request may also include a key required to access the first remote system 130-1 in order to read data stored in the first remote system 130-1. In such embodiments, the intermediate system 220 may also extract the key from the user request.
Next, the intermediate system 220 sends (430) a request for reading data, referred to as a "first request," to the targeted first remote system 130-1. The first request includes at least the first identification. Further, as described above, in some embodiments the intermediate system 220 may extract the key needed to access the data from the user request. In such embodiments, the extracted key may also be included in the first request. Alternatively, in some embodiments, the key required to access the data may be stored in the intermediate system 220, and the intermediate system 220 may automatically include the key in the first request.
In particular, in some embodiments, the intermediate system 220 may perform a conversion of the request format/syntax to accommodate the characteristics and/or requirements of the destination remote system. For example, the intermediate system 220 may generate the first request based on the requirements of the first remote system 130-1. The first request may have a different syntax and/or format than the original user request, but leaves the semantics of the user request unchanged. In this manner, differences between remote systems 130 are handled by the intermediate system 220, making these differences transparent to the client 110. This is advantageous in simplifying the operation of the client.
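By way of illustration only, the syntax conversion described above can be sketched as a per-remote-system formatter. The two request shapes below (a REST-style method and path, and a command-style record) are invented for this sketch and do not correspond to the interface of any particular remote system; the names `UserRequest`, `FORMATTERS`, and `build_first_request` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class UserRequest:
    operation: str        # e.g. "read" or "delete"
    intermediate_id: str  # identification known to the client


# Hypothetical formatters; real remote systems define their own syntax.
def rest_style(operation: str, remote_id: str) -> dict:
    method = {"read": "GET", "delete": "DELETE"}[operation]
    return {"method": method, "path": f"/objects/{remote_id}"}


def command_style(operation: str, remote_id: str) -> dict:
    return {"cmd": operation.upper(), "key": remote_id}


FORMATTERS: Dict[str, Callable[[str, str], dict]] = {
    "130-1": rest_style,
    "130-2": command_style,
}


def build_first_request(req: UserRequest, remote_system: str, remote_id: str) -> dict:
    """Keep the semantics (operation and target data); choose the syntax per remote system."""
    return FORMATTERS[remote_system](req.operation, remote_id)
```

The same user request thus yields differently shaped remote requests, while the client only ever sees the intermediate system's interface.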
After receiving the first request, the first remote system 130-1 returns the data to be read to the intermediate system 220. Accordingly, the intermediate system 220 receives (440) the data from the first remote system 130-1. In some embodiments, the received data may include its first identification in the first remote system 130-1 for error checking, log processing, and the like. This is of course not essential: the received data may omit the first identification, or may contain other information that can be used for similar purposes. The intermediate system 220 then provides (460) the data to the client 110.
As shown, in those embodiments where the received data contains the first identification, the intermediate system 220 may convert (450) the first identification of the data back to the intermediate identification, e.g., based on the identification mapping (e.g., mapping table 300 shown in fig. 3). In such an embodiment, the intermediate system 220 may include the intermediate identification in the data when it is sent (460) to the client 110. In this way, the client 110 can confirm that the obtained data is indeed the requested data. Of course, this is not required. In other embodiments, act 450 may be omitted.
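By way of illustration only, acts 410 through 460 of process 400 can be sketched end to end. The dictionaries `ID_MAP` and `REMOTE_130_1`, the simulated `remote_read` function, and the request fields are assumptions made for this sketch; a real deployment would issue network requests to the remote system.

```python
from typing import Dict, Tuple

# Hypothetical stand-ins: the identification mapping and one remote system's contents.
ID_MAP: Dict[Tuple[str, str], str] = {("#1", "130-1"): "#1-130-1"}
REMOTE_130_1: Dict[str, bytes] = {"#1-130-1": b"payload"}


def remote_read(first_request: dict) -> dict:
    """Simulated remote system: returns the data together with its first identification."""
    remote_id = first_request["id"]
    return {"id": remote_id, "data": REMOTE_130_1[remote_id]}


def handle_read(user_request: dict) -> dict:
    intermediate_id = user_request["id"]                  # 410: receive, extract identification
    first_id = ID_MAP[(intermediate_id, "130-1")]         # 420: convert to first identification
    first_request = {"op": "read", "id": first_id}
    if "key" in user_request:                             # optional: pass the access key through
        first_request["key"] = user_request["key"]
    response = remote_read(first_request)                 # 430/440: send request, receive data
    reverse = {v: k[0] for k, v in ID_MAP.items() if k[1] == "130-1"}
    returned_id = reverse[response["id"]]                 # 450: convert back (optional act)
    return {"id": returned_id, "data": response["data"]}  # 460: provide data to the client
```

The client passes and receives only the intermediate identification `"#1"`; the first identification never leaves the intermediate system.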
The above describes a process 400 for the client 110 to read data from the first remote system 130-1 through the intermediate system 220. In some embodiments, when multiple copies of data are stored in multiple remote systems, the intermediate system 220 may select any one of the multiple remote systems, because every copy has the same value. In some embodiments, the intermediate system 220 may also measure aspects of multiple remote systems to select the best remote system. In one example, when the intermediate system 220 detects that one of the plurality of remote systems is unavailable, the intermediate system 220 may send a user request from the client 110 to the other available remote systems to increase the availability of the storage system. In another example, the intermediate system 220 may measure network delays for multiple remote systems and send user requests from the client 110 to the remote system with the lowest delay to improve performance of the storage system.
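By way of illustration only, the two selection examples above (skipping unavailable remote systems and preferring the lowest delay) can be combined in one small function. The function name `choose_remote` and the convention of using `None` to mark an unavailable remote system are assumptions made for this sketch.

```python
from typing import Dict, Optional


def choose_remote(latency_ms: Dict[str, Optional[float]]) -> str:
    """Pick the available remote system with the lowest measured delay.

    `latency_ms` maps a remote system name to its measured delay, or to None
    when the system was found to be unavailable (hypothetical convention).
    """
    available = {name: ms for name, ms in latency_ms.items() if ms is not None}
    if not available:
        raise RuntimeError("no remote system holding the data is available")
    return min(available, key=available.get)
```

Other criteria mentioned in the disclosure, such as cost or SLA, could be folded into the same comparison by replacing the delay with a combined score.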
FIG. 5 shows a schematic diagram of a process 500 for updating data by the intermediate system 220 in association with the remote systems 130-1 and 130-2, according to an embodiment of the disclosure. As shown in FIG. 5, the intermediate system 220 may receive (510) a user request from the client 110 to update data at the remote systems 130-1 and 130-2. Assume that the target data for which the user request is intended is to be updated at the first remote system 130-1 and the second remote system 130-2.
In response to the request, the intermediate system 220 converts (520) the intermediate identification that the data has in the intermediate system 220 to an identification of the data in the target remote system (in this example, the first remote system 130-1), referred to as the "first identification". In some embodiments, the intermediate identification is included in the user request. In such an embodiment, the intermediate system 220 may extract the intermediate identification from the user request and convert the intermediate identification to the first identification based on an identification mapping (e.g., mapping table 300 shown in fig. 3).
In some embodiments, the user request may also include a key required to access the first remote system 130-1 in order to update the data stored in the first remote system 130-1. In such embodiments, the intermediate system 220 may also extract the key from the user request.
Next, the intermediate system 220 sends (530) a request for updating data, referred to as a "first request," to the targeted first remote system 130-1. The first request includes at least the first identification. Further, as described above, in some embodiments the intermediate system 220 may extract the key needed to access the data from the user request. In such embodiments, the extracted key may also be included in the first request. Alternatively, in some embodiments, the key required to access the data may be stored in the intermediate system 220, and the intermediate system 220 may automatically include the key in the first request.
Further, in response to the request, the intermediate system 220 converts (540) the intermediate identification that the data has in the intermediate system 220 to an identification of the data in the target remote system (in this case, the second remote system 130-2), referred to as the "second identification". In some embodiments, the intermediate identification is included in the user request. In such an embodiment, the intermediate system 220 may extract the intermediate identification from the user request and convert the intermediate identification to the second identification based on an identification mapping (e.g., mapping table 300 shown in fig. 3).
In some embodiments, the user request may also include a key required to access the second remote system 130-2 in order to update the data stored in the second remote system 130-2. In such embodiments, the intermediate system 220 may also extract the key from the user request.
Next, the intermediate system 220 sends (550) a request for updating data, referred to as a "second request," to the targeted second remote system 130-2. The second request contains at least the second identification. Further, as described above, in some embodiments the intermediate system 220 may extract the key needed to access the data from the user request. In such embodiments, the extracted key may also be included in the second request. Alternatively, in some embodiments, the key required to access the data may be stored in the intermediate system 220, and the intermediate system 220 may automatically include the key in the second request.
In particular, in some embodiments, the intermediate system 220 may perform a conversion of the request format/syntax to accommodate the characteristics and/or requirements of the destination remote system. For example, the intermediate system 220 may generate a first request based on the requirements of a first remote system 130-1 and a second request based on the requirements of a second remote system 130-2. The first request and the second request may have a different syntax and/or format than the original user request, but leave the semantics of the user request unchanged. In this manner, differences between remote systems 130 are handled by the intermediate system 220, making it transparent to the client 110. This is advantageous in simplifying the operation of the client.
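The format conversion described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of the disclosure: the two request shapes (a REST-style dictionary for the first remote system and a command string for the second), the class and function names, and the example identifications are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRequest:
    """Request as the client sends it to the intermediate system."""
    operation: str        # "create", "update" or "delete"
    intermediate_id: str  # identification of the data in the intermediate system
    payload: bytes


def to_first_system_request(req: UserRequest, first_id: str) -> dict:
    # Hypothetical REST-like format assumed for the first remote system.
    method = {"create": "POST", "update": "PUT", "delete": "DELETE"}[req.operation]
    return {"method": method, "path": f"/objects/{first_id}", "body": req.payload}


def to_second_system_request(req: UserRequest, second_id: str) -> str:
    # Hypothetical command-like format assumed for the second remote system.
    return f"{req.operation.upper()} {second_id} {len(req.payload)}"
```

Both functions carry the same operation and the same data, so the semantics of the user request are preserved while the syntax differs per remote system.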
The above describes a process 500 by which the client 110 updates data in the first remote system 130-1 and the second remote system 130-2 through the intermediate system 220. The update may include at least one of creation, deletion, and modification. In one example, when the client 110 creates data through the intermediate system 220, the client 110 defines the data in the intermediate system 220 and configures a plurality of remote systems for the data. If the intermediate system 220 receives a user request from the client 110 to create data and recognizes that the data is defined as having multiple copies in multiple remote systems, it sends a request to each of those remote systems to create the data there. In another example, when the client 110 modifies or deletes data through the intermediate system 220, if the intermediate system 220 receives a user request to modify or delete data from the client 110 and recognizes that the data is stored in a plurality of remote systems, it sends a request to each of those remote systems to modify or delete the data there. Managing multiple copies of data in multiple remote systems through the intermediate system 220 has the advantage that the user does not have to configure each remote system separately, which reduces the user's workload and improves efficiency. It may also improve the security of the storage system: for example, if the copy of the data in one remote system is corrupted, the copy in another remote system is still available.
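The fan-out of a single update to every remote system that holds a copy can be sketched as below. This is a sketch under stated assumptions: the contents of the mapping table, the system names, and the `Sender` callable signature are hypothetical and stand in for whatever transport each remote system actually requires.

```python
from typing import Callable, Dict

# Hypothetical mapping table:
# intermediate identification -> {remote system: identification in that system}.
MAPPING_TABLE: Dict[str, Dict[str, str]] = {
    "doc-42": {"remote-1": "bucket-a/obj-001", "remote-2": "vol7/file-9f3"},
}

# A sender takes (remote identification, operation, payload) and returns True on success.
Sender = Callable[[str, str, bytes], bool]


def fan_out_update(intermediate_id: str, operation: str, payload: bytes,
                   senders: Dict[str, Sender]) -> Dict[str, bool]:
    """Send one update request per remote system that holds a copy of the data."""
    results = {}
    for system, remote_id in MAPPING_TABLE[intermediate_id].items():
        # Each remote system receives its own identification, never the intermediate one.
        results[system] = senders[system](remote_id, operation, payload)
    return results
```

Returning a per-system result lets the caller decide how to handle a partial failure; the source text does not specify that policy.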
FIG. 6 shows a schematic diagram of a process 600 in which the intermediate system 220 migrates data in association with the remote systems 130-1 and 130-3, according to an embodiment of the disclosure. As shown in FIG. 6, the intermediate system 220 may receive (610) a user request from the client 110 to migrate data. Assume that the data targeted by the user request is to be migrated from the third remote system 130-3 to the first remote system 130-1.
In response to the user request, the intermediate system 220 converts (620) the intermediate identification of the data, contained in the user request, to an identification of the data in the target remote system (in this example, the first remote system 130-1), referred to as the "first identification." In addition, the intermediate system 220 converts the intermediate identification to an identification of the data in the source remote system (in this example, the third remote system 130-3), referred to as the "third identification." In some embodiments, the intermediate identification is included in the user request. In such embodiments, the intermediate system 220 may extract the intermediate identification from the user request and convert it into the first identification and the third identification based on an identification mapping (e.g., mapping table 300 shown in FIG. 3).
In some embodiments, to migrate data from the third remote system 130-3 to the first remote system 130-1, the user request may also include a key required to access the first remote system 130-1 and the third remote system 130-3. In such embodiments, the intermediate system 220 may also extract the key from the user request. Further, in some embodiments, keys required to access the data may be stored in the intermediate system 220.
Next, the intermediate system 220 retrieves (630) data from the third remote system 130-3 using the third identification and stores (640) data returned from the third remote system 130-3 to the first remote system 130-1 using the first identification.
The process 600 of the intermediate system 220 receiving a user request to migrate data from a client 110 is described above. During the migration of data, the intermediate system 220 is a normal client from the perspective of the remote system 130, and the intermediate system 220 can still obtain data from the remote system. Further, the intermediate system 220 may also send a new request for the data to the first remote system 130-1 after determining that the migrated data is stored completely in the first remote system 130-1. Still further, the intermediate system 220 may also send a request to the third remote system 130-3 to delete data from the third remote system 130-3 after determining that existing request processing for the migrated data in the third remote system 130-3 is complete.
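The retrieve, store, and delete sequence of process 600 can be sketched as follows. The in-memory `InMemoryRemote` class, the digest comparison used as the completeness check, and the `pending_on_source` counter are assumptions made for illustration. The sketch also takes the conservative choice of requiring both conditions before deleting the source copy, which is one possible policy rather than the only one the text allows.

```python
import hashlib
from typing import Dict


class InMemoryRemote:
    """Stand-in for a remote storage system (hypothetical interface)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def get(self, remote_id: str) -> bytes:
        return self.objects[remote_id]

    def put(self, remote_id: str, data: bytes) -> None:
        self.objects[remote_id] = data

    def delete(self, remote_id: str) -> None:
        del self.objects[remote_id]


def migrate(source: InMemoryRemote, source_id: str,
            target: InMemoryRemote, target_id: str,
            pending_on_source: int = 0) -> bool:
    """Copy the data to the target; delete the source copy only when it is safe."""
    data = source.get(source_id)   # step 630: retrieve using the third identification
    target.put(target_id, data)    # step 640: store using the first identification
    # Assumed completeness check: compare digests of the two copies.
    stored_intact = (hashlib.sha256(target.get(target_id)).digest()
                     == hashlib.sha256(data).digest())
    if stored_intact and pending_on_source == 0:
        source.delete(source_id)
        return True
    return False
```

A return value of `False` means the data now exists in both systems and the source copy is still serving its outstanding requests.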
By using the intermediate system 220, the client 110 can migrate data between different remote systems as needed. This improves the availability and performance of data access, increases the security of data protection, and eliminates the risk of the client 110 being locked in to a particular remote system, so that users retain complete control over their data.
In addition to the implementation described in FIG. 6, in some embodiments, the intermediate system 220 may automatically determine to migrate data from one remote system to another based on results of measuring aspects of multiple remote systems 130. The migration operation performed by the intermediate system 220 is transparent to the user, so that the optimal selection can be automatically made for the user without increasing the workload of the user. That is, in such embodiments, migration of data between remote systems 130 is triggered by the intermediate system 220 rather than upon a request from the client 110.
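One way such an automatic decision could look is sketched below. The text does not say which aspects are measured or how they are compared, so the use of latency as the only measured aspect, the `min_gain` threshold, and the function name are all assumptions.

```python
from typing import Dict, Optional


def pick_migration_target(current: str, latency_ms: Dict[str, float],
                          min_gain: float = 0.2) -> Optional[str]:
    """Return a better remote system to migrate to, or None to leave the data in place."""
    best = min(latency_ms, key=latency_ms.get)
    if latency_ms[best] >= latency_ms[current]:
        return None  # the current system is already at least as good
    # Migrate only if the relative improvement reaches the assumed threshold,
    # to avoid moving data back and forth on small fluctuations.
    gain = (latency_ms[current] - latency_ms[best]) / latency_ms[current]
    return best if gain >= min_gain else None
```

For example, with measured latencies of 40 ms for one system and 100 ms for the system currently holding the data, the function proposes the faster system; with 95 ms against 100 ms it proposes nothing.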
In other implementations, the intermediate system 220 may also receive system requests from one or more remote systems 130 to migrate data between the remote systems. For example, the intermediate system 220 may receive a system request from the remote system 130-3 to migrate data from the remote system 130-3 to the remote system 130-1. The intermediate system 220 may extract the third identification of the data in the remote system 130-3 from the system request. The intermediate system 220 may convert the third identification to an intermediate identification in the intermediate system 220 and further convert the intermediate identification to the first identification in the remote system 130-1 based on the identification mapping. The intermediate system 220 obtains data from the remote system 130-3 using the third identification and stores the data to the remote system 130-1 using the first identification. By using the intermediate system 220, not only the clients benefit, but also the interoperation between remote systems is simplified, and the flexibility of the whole storage system is improved.
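The two-step conversion used for a system-initiated migration (third identification to intermediate identification, then to first identification) can be sketched with a reverse index over the mapping table. The table contents, system names, and function name are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical mapping table:
# intermediate identification -> {remote system: identification in that system}.
MAPPING: Dict[str, Dict[str, str]] = {
    "doc-42": {"remote-1": "bucket-a/obj-001", "remote-3": "archive/0007"},
}

# Reverse index, built once, so that a remote identification can be resolved
# back to the intermediate identification.
REVERSE: Dict[Tuple[str, str], str] = {
    (system, remote_id): intermediate_id
    for intermediate_id, per_system in MAPPING.items()
    for system, remote_id in per_system.items()
}


def translate(source_system: str, source_id: str, target_system: str) -> str:
    """Third identification -> intermediate identification -> first identification."""
    intermediate_id = REVERSE[(source_system, source_id)]
    return MAPPING[intermediate_id][target_system]
```

Here `translate("remote-3", "archive/0007", "remote-1")` yields the identification of the same data in the first remote system, so neither remote system needs to know the other's naming scheme.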
FIG. 7 shows a flow diagram of a method 700 according to an embodiment of the present disclosure. In some embodiments, the method 700 may be implemented at the intermediate system 220. In step 710, an intermediate identification of the data to be processed in the intermediate system is obtained. In some embodiments, at step 710, a user request for performing an operation on the data at the remote system may be received from a client, and the intermediate identification of the data may be extracted from the user request.
Next, in step 720, the intermediate identity is converted to a first identity in the remote system based on an identity mapping between the intermediate system and the remote system. The identification mapping may be implemented, for example, by means of the mapping table 300 shown in fig. 3 and/or any other suitable structure.
In step 730, the data is processed in association with the remote system based at least in part on the first identification. In some embodiments, processing the data in association with the remote system includes: generating, based on the user request, a first request for performing the operation at the remote system, the first request including the first identification; and sending the first request to the remote system. In some embodiments, generating the first request comprises generating the first request using a syntax different from that of the user request. In some embodiments, at least one of the user request and the first request includes a key associated with the data.
In some embodiments, the operation in the user request comprises a read of the data. In such embodiments, at step 730, the data may be received from the remote system and sent to the client.
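Steps 710 to 730 for a read can be sketched end to end as follows. The dictionary-based request shape, the `ID_MAP` contents, and the `fetch` callable that stands in for the network call to the remote system are assumptions for illustration only.

```python
from typing import Callable, Dict

# Hypothetical mapping for one remote system: intermediate -> remote identification.
ID_MAP: Dict[str, str] = {"doc-42": "bucket-a/obj-001"}


def handle_read(user_request: Dict[str, str],
                fetch: Callable[[Dict[str, str]], bytes]) -> bytes:
    intermediate_id = user_request["id"]             # step 710: obtain the intermediate identification
    first_id = ID_MAP[intermediate_id]               # step 720: convert via the identification mapping
    first_request = {"op": "read", "id": first_id}   # step 730: generate the first request
    if "key" in user_request:                        # optional access key, passed through unchanged
        first_request["key"] = user_request["key"]
    return fetch(first_request)                      # send the request; the result goes back to the client
```

The client only ever supplies the intermediate identification; the remote identification appears solely in the request that the intermediate system generates.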
For example, in some embodiments, the remote system is a first remote system and the operation in the user request includes an update to the data at the first remote system. In some embodiments, the update may include, for example, at least one of: creation, deletion, and modification. In such embodiments, at step 730, the intermediate identification of the data may be converted to a second identification of the data at a second remote system based on an identification mapping between the intermediate system and the second remote system; a second request for the update to the data at the second remote system may be generated, the second request including the second identification; and the second request may be sent to the second remote system.
In some embodiments, the remote system is a first remote system. In such embodiments, at step 730, the intermediate identification may be converted to a third identification of the data in a third remote system based on an identification mapping between the intermediate system and the third remote system, the third remote system being different from the first remote system; the data may be obtained from the third remote system using the third identification; and the data may be stored to the first remote system using the first identification.
In certain embodiments, at step 730, the data may also be deleted from the third remote system in response to at least one of: determining that the data has been stored intact at the first remote system, and determining that processing of pending requests for the data is complete.
FIG. 8 shows a schematic block diagram of an apparatus 800 according to an embodiment of the present disclosure. The apparatus 800 may be implemented, for example, at the intermediate system 220 or may directly act as the intermediate system 220 itself. As shown, the apparatus 800 includes an identification acquisition unit 810, an identity mapping unit 820, and a data processing unit 830.
The identification acquisition unit 810 is configured to extract an intermediate identification of the data to be processed from the user request received from the client.
The identity mapping unit 820 is configured to convert the intermediate identity into a first identity in the remote system based on an identity mapping between the intermediate system and the remote system. The data processing unit 830 is configured to process data in association with the remote system based at least in part on the first identification.
In some embodiments, the identification acquisition unit 810 is configured to receive, from a client, a user request for performing an operation on the data at the remote system, and to extract the intermediate identification of the data from the user request.
In some embodiments, the data processing unit 830 is configured to generate, based on the user request, a first request for performing the operation at the remote system, the first request comprising the first identification, and to send the first request to the remote system. For example, in some embodiments, the data processing unit 830 is configured to generate the first request using a syntax different from that of the user request. Alternatively or additionally, in some embodiments, at least one of the user request and the first request comprises a key associated with the data.
In some embodiments, the operation in the user request comprises a read of the data. In such embodiments, the data processing unit 830 is configured to receive the data from the remote system and to send the data to the client.
In some embodiments, the remote system is a first remote system and the operation in the user request includes an update to the data at the first remote system. In such embodiments, the data processing unit 830 is configured to: convert the intermediate identification of the data to a second identification of the data at a second remote system based on an identification mapping between the intermediate system and the second remote system; generate a second request for the update to the data at the second remote system, the second request including the second identification; and send the second request to the second remote system.
In some embodiments, the remote system is a first remote system. In such embodiments, the data processing unit 830 is configured to: convert the intermediate identification to a third identification of the data in a third remote system based on an identification mapping between the intermediate system and the third remote system, the third remote system being different from the first remote system; obtain the data from the third remote system using the third identification; and store the data to the first remote system using the first identification.
In certain embodiments, the data processing unit 830 is further configured to delete the data from the third remote system in response to at least one of: determining that the data has been stored intact at the first remote system, and determining that processing of pending requests for the data is complete.
The units included in apparatus 800 may be implemented in a variety of ways, including software, hardware, firmware, or any combination thereof. In one embodiment, one or more of the units may be implemented using software and/or firmware, such as machine-executable instructions stored on a storage medium. In addition to, or as an alternative to, machine-executable instructions, some or all of the units in apparatus 800 may be implemented at least in part by one or more hardware logic components. By way of example, and not limitation, exemplary types of hardware logic components that may be used include Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so forth.
FIG. 9 illustrates a schematic block diagram of an electronic device 900 suitable for use in implementing embodiments of the present disclosure. As shown, device 900 includes a Central Processing Unit (CPU) 910 that may perform various appropriate actions and processes in accordance with computer program instructions stored in a Read Only Memory (ROM) 920 or loaded from a storage unit 980 into a Random Access Memory (RAM) 930. In the RAM 930, various programs and data required for the operation of the device 900 may also be stored. The CPU 910, ROM 920, and RAM 930 are connected to each other via a bus 940. An input/output (I/O) interface 950 is also connected to bus 940.
Various components in device 900 are connected to I/O interface 950, including: an input unit 960 such as a keyboard, a mouse, etc.; an output unit 970 such as various types of displays, speakers, and the like; a storage unit 980 such as a magnetic disk, optical disk, or the like; and a communication unit 990 such as a network card, a modem, a wireless communication transceiver, or the like. The communication unit 990 allows the device 900 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The various processes described above, such as processes/methods 400, 500, 600, and 700, may be performed by the CPU 910. For example, in some embodiments, methods 400, 500, 600, and 700 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 980. In some embodiments, some or all of the computer program may be loaded and/or installed onto device 900 via ROM 920 and/or communication unit 990. When the computer program is loaded into RAM 930 and executed by the CPU 910, one or more of the steps of methods 400, 500, 600, and 700 described above may be performed. Alternatively, in other embodiments, the CPU 910 may also be configured in any other suitable manner to implement the processes/methods described above.
Many modifications and other embodiments of the disclosure set forth herein will come to mind to one skilled in the art to which this disclosure pertains having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the embodiments of the disclosure are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the disclosure. Moreover, while the above description and the related figures describe example embodiments in the context of certain example combinations of components and/or functions, it should be appreciated that different combinations of components and/or functions may be provided by alternative embodiments without departing from the scope of the present disclosure. In this regard, for example, other combinations of components and/or functions than those explicitly described above are also contemplated as within the scope of the present disclosure. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (17)

1. A method of data processing, comprising:
acquiring an intermediate identifier of data to be processed in an intermediate system, wherein the acquiring of the intermediate identifier comprises:
receiving, from a client, a user request to read the data from a plurality of remote systems, wherein the data is replicated on the plurality of remote systems; and
extracting the intermediate identification of the data from the user request;
measuring, via the intermediate system, a plurality of performance aspects associated with the plurality of remote systems;
selecting, via the intermediary system, a first remote system from the plurality of remote systems for reading the data based at least in part on the measured plurality of performance aspects associated with the plurality of remote systems;
converting the intermediate identity to a first identity in the first remote system based on an identity mapping between the intermediate system and the first remote system of the plurality of remote systems defined by a mapping table, the mapping table being stored within the intermediate system; and
processing the data in association with the remote system based at least in part on the first identification, wherein processing the data in association with the remote system comprises:
generating a first request for reading the data from the first remote system of the plurality of remote systems based on the user request, wherein the first request includes the first identification; and
transmitting the first request to the remote system.
2. The method of claim 1, wherein processing the data in association with the remote system further comprises:
receiving the data from the remote system; and
sending the data to the client.
3. The method of claim 1, wherein the remote system is a first remote system and the user request further comprises an update to the data at the first remote system, wherein processing the data in association with the remote system further comprises:
converting the intermediate identity of the data to a second identity of the data at a second remote system based on an identity mapping between the intermediate system and the second remote system;
generating a second request for the update to the data at the second remote system, the second request including the second identification; and
sending the second request to the second remote system.
4. The method of claim 3, wherein the update comprises at least one of: creation, deletion and modification.
5. The method of claim 1, wherein generating the first request comprises:
generating the first request using a different syntax than the user request.
6. The method of claim 1, wherein at least one of the user request and the first request comprises a key associated with the data.
7. The method of claim 1, wherein the remote system is a first remote system, and processing the data in association with the remote system comprises:
converting the intermediate identifier to a third identification of the data in a third remote system based on an identification mapping between the intermediate system and the third remote system, the third remote system being different from the first remote system;
obtaining the data from the third remote system using the third identification; and
storing data to the first remote system using the first identification.
8. The method of claim 7, further comprising:
deleting the data from the third remote system in response to at least one of:
determining that the data has been stored intact at the first remote system, and
determining that processing of pending requests for the data is complete.
9. An electronic device, comprising:
at least one processing unit; and
at least one memory coupled to the at least one processing unit and storing machine-executable instructions that, when executed by the at least one processing unit, cause the at least one processing unit to be configured to:
acquiring an intermediate identifier of data to be processed in an intermediate system, wherein the acquiring of the intermediate identifier comprises:
receiving, from a client, a user request to read the data from a plurality of remote systems, wherein the data is replicated on the plurality of remote systems; and
extracting the intermediate identification of the data from the user request;
measuring, via the intermediate system, a plurality of performance aspects associated with the plurality of remote systems;
selecting, via the intermediary system, a first remote system from the plurality of remote systems for reading the data based at least in part on the measured plurality of performance aspects associated with the plurality of remote systems;
converting the intermediate identity to a first identity in the first remote system based on an identity mapping between the intermediate system and the first remote system of the plurality of remote systems defined by a mapping table, the mapping table being stored within the intermediate system; and
processing the data in association with the remote system based at least in part on the first identification, wherein processing the data in association with the remote system comprises:
generating a first request for reading the data from the first remote system of the plurality of remote systems based on the user request, wherein the first request includes the first identification; and
transmitting the first request to the remote system.
10. The device of claim 9, wherein the at least one processing unit is configured to:
receiving the data from the remote system; and
sending the data to the client.
11. The device of claim 9, wherein the remote system is a first remote system and the user request further comprises an update to the data at the first remote system, wherein the at least one processing unit is configured to:
converting the intermediate identity of the data to a second identity of the data at a second remote system based on an identity mapping between the intermediate system and the second remote system;
generating a second request for the update to the data at the second remote system, the second request including the second identification; and
sending the second request to the second remote system.
12. The device of claim 11, wherein the update comprises at least one of: creation, deletion and modification.
13. The device of claim 9, wherein the at least one processing unit is configured to:
generating the first request using a different syntax than the user request.
14. The device of claim 9, wherein at least one of the user request and the first request comprises a key associated with the data.
15. The device of claim 9, wherein the remote system is a first remote system, and the at least one processing unit is configured to:
converting the intermediate identifier to a third identification of the data in a third remote system based on an identification mapping between the intermediate system and the third remote system, the third remote system being different from the first remote system;
obtaining the data from the third remote system using the third identification; and
storing data to the first remote system using the first identification.
16. The device of claim 15, wherein the at least one processing unit is further configured to:
deleting the data from the third remote system in response to at least one of:
determining that the data has been stored intact at the first remote system, and
determining that processing of pending requests for the data is complete.
17. A computer readable storage medium having stored thereon program code configured to, when executed, cause an apparatus to perform the steps of a method according to any of claims 1 to 8.
CN201610453751.8A 2016-06-21 2016-06-21 Data processing method and device Active CN107526530B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610453751.8A CN107526530B (en) 2016-06-21 2016-06-21 Data processing method and device
US15/628,624 US20170364293A1 (en) 2016-06-21 2017-06-20 Method and apparatus for data processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610453751.8A CN107526530B (en) 2016-06-21 2016-06-21 Data processing method and device

Publications (2)

Publication Number Publication Date
CN107526530A CN107526530A (en) 2017-12-29
CN107526530B true CN107526530B (en) 2021-02-19

Family

ID=60660186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610453751.8A Active CN107526530B (en) 2016-06-21 2016-06-21 Data processing method and device

Country Status (2)

Country Link
US (1) US20170364293A1 (en)
CN (1) CN107526530B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210209098A1 (en) * 2018-06-15 2021-07-08 Micro Focus Llc Converting database language statements between dialects
JP2022180956A (en) * 2021-05-25 2022-12-07 富士通株式会社 Information processing apparatus, program, and information processing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103180852A (en) * 2012-08-09 2013-06-26 华为技术有限公司 Distributed data processing method and apparatus
CN104239122A (en) * 2014-09-04 2014-12-24 华为技术有限公司 VM (virtual machine) migration method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8612439B2 (en) * 2009-06-30 2013-12-17 Commvault Systems, Inc. Performing data storage operations in a cloud storage environment, including searching, encryption and indexing
US9348840B2 (en) * 2012-12-14 2016-05-24 Intel Corporation Adaptive data striping and replication across multiple storage clouds for high availability and performance
US9280678B2 (en) * 2013-12-02 2016-03-08 Fortinet, Inc. Secure cloud storage distribution and aggregation

Also Published As

Publication number Publication date
US20170364293A1 (en) 2017-12-21
CN107526530A (en) 2017-12-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200413

Address after: Massachusetts, USA

Applicant after: EMC IP Holding Company LLC

Address before: Ma Sazhusaizhou

Applicant before: EMC Corp.

GR01 Patent grant