CN115934006B - IO access point and data processing task management method, device, equipment and medium - Google Patents

Publication number
CN115934006B
Authority
CN
China
Prior art keywords
access point
request
storage service
data processing
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310238936.7A
Other languages
Chinese (zh)
Other versions
CN115934006A (en
Inventor
何育华
徐文豪
王弘毅
张凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SmartX Inc
Original Assignee
SmartX Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SmartX Inc filed Critical SmartX Inc
Priority to CN202310238936.7A priority Critical patent/CN115934006B/en
Publication of CN115934006A publication Critical patent/CN115934006A/en
Application granted granted Critical
Publication of CN115934006B publication Critical patent/CN115934006B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides an IO access point and data processing task management method, device, equipment and medium. A first storage service sub-node receives a first IO request or a data processing task management request for a data object and sends a query request to a metadata service. The metadata service queries and operates on the IO access point record of the data object and sends the IO access point record to the first storage service sub-node, which either processes the first IO request itself or forwards it to the second storage service sub-node corresponding to the IO access point record, thereby controlling the IO link. Alternatively, the metadata service sends the data processing task management request to the second storage service sub-node for processing. This ensures that a unique storage service sub-node processes the IO requests of the data object, and that the same storage service sub-node processes both the IO requests and the data processing task management requests, which makes it possible for the data processing task to track the IO.

Description

IO access point and data processing task management method, device, equipment and medium
Technical Field
The invention relates to the technical field of distributed storage, and in particular to an IO access point and data processing task management method, device, equipment and medium.
Background
A conventional centralized storage system stores all data on a single storage server. The single-node processing performance and single-point capacity limit of that server easily become bottlenecks for system performance, and when a single-point fault occurs, the safety and stability of user data generally cannot be guaranteed. Distributed systems emerged to address these problems.
A distributed storage system stores and processes data using multiple servers. It can share the input/output (IO) processing load across multiple storage servers, can expand cluster storage capacity by adding nodes, and can provide users with multi-copy, migration and recovery capabilities through the metadata management service of the distributed system. Distributed systems therefore have advantages in data availability, security and storage service scalability. In existing distributed systems, user access to data under the distributed storage system is generally distributed across multiple data processing servers, which has the advantages of using the processor performance of all servers in the cluster, tolerating single-point faults and improving the overall concurrency performance of the storage system.
However, in a distributed storage system, a user's IO to a given data object (i.e., the input/output actions on the data object) should generally be handled by a single storage service sub-node. Otherwise, it is difficult to handle the IO ordering problems that arise when multiple storage service sub-nodes perform IO on the same data object at the same time, and the data correctness problems that result. In addition, distributed storage systems have data processing tasks that need to track, sense or perform secondary processing on real-time user IO; when a storage service sub-node needs to process such a data processing task, it is difficult for it to track and process IO on the data object that is scattered across the storage service sub-nodes.
Disclosure of Invention
The invention provides an IO access point and data processing task management method, device, equipment and medium, to solve the problems in the prior art that a distributed storage system has difficulty handling the IO disorder caused by multiple storage service sub-nodes performing IO on the same data object at the same time, and the data correctness problems that result, and that a distributed storage system has difficulty tracking and processing data object IO scattered across the storage service sub-nodes.
The invention provides an IO access point and data processing task management method, applied to a metadata server, comprising the following steps:
S3, receiving a query request for a data object, or receiving a query request for the data object and a data processing task management request; the query request and the data processing task management request are both sent by a first storage service sub-node;
S4, querying the IO access point records of the data object according to the query request, and operating on the IO access point records according to the number of IO access point records; each IO access point record comprises access point information, and each piece of access point information corresponds to one storage service sub-node;
S5, sending the IO access point record to the first storage service sub-node, so that the first storage service sub-node processes a first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to the access point information, and the second storage service sub-node, after completing the first IO request, returns the first IO request to the client along the original path;
or,
rejecting the data processing task management request according to the IO access point record, or sending the data processing task management request to the second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request;
S8, after receiving an execution result of the data processing task management request sent by the second storage service sub-node, returning the execution result to the client.
According to the IO access point and data processing task management method provided by the invention, operating on the IO access point records according to the number of IO access point records specifically comprises:
when no IO access point record exists for the data object, allocating a storage service sub-node to the data object, creating an IO access point record, and recording the access point information of the storage service sub-node in the IO access point record.
According to the IO access point and data processing task management method provided by the invention, allocating a storage service sub-node to the data object specifically comprises one of the following modes:
randomly allocating a storage service sub-node to the data object;
allocating to the data object the storage service sub-node with the lowest latency, according to latency data between the client and each storage service sub-node;
allocating to the data object the storage service sub-node closest to the client, according to the storage location information of the data object.
According to the IO access point and data processing task management method provided by the invention, the data processing task management request specifically comprises one of the following requests:
creation of a data processing task;
suspension or resumption of an existing data processing task;
cancellation or deletion of an existing data processing task.
According to the IO access point and data processing task management method provided by the invention,
when the data processing task management request is the cancellation or deletion of an existing data processing task, after receiving the execution result of the data processing task management request sent by the second storage service sub-node and before returning the execution result to the client, the method further comprises:
deleting the IO access point record of the data object when that IO access point record was generated by the existing data processing task.
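As an illustrative sketch only, the clean-up rule above could be expressed in Python as follows. The `created_by_task` flag, the `TaskRequest` enum and the function name are hypothetical; the text does not specify how a record is marked as having been generated by a data processing task.

```python
from dataclasses import dataclass
from enum import Enum


class TaskRequest(Enum):
    """Request kinds named in the text; the enum itself is hypothetical."""
    CREATE = "create"
    SUSPEND = "suspend"
    RESUME = "resume"
    CANCEL = "cancel"
    DELETE = "delete"


@dataclass(frozen=True)
class AccessPointRecord:
    node_id: str           # access point information (assumed to be a node ID)
    created_by_task: bool  # hypothetical marker: record was generated by the task


def records_after_task_result(request, records):
    """Return the records the metadata service keeps once the result arrives."""
    if request in (TaskRequest.CANCEL, TaskRequest.DELETE):
        # Drop only the records that the cancelled or deleted task generated.
        return [r for r in records if not r.created_by_task]
    # Other request kinds leave the records untouched.
    return list(records)
```

In this sketch a record created by ordinary IO survives the cancellation, so ongoing user IO keeps its access point.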
According to the IO access point and data processing task management method provided by the invention, after the data processing task management request is sent to the second storage service sub-node corresponding to the access point information, the method further comprises:
when the existing data processing task has not yet been executed and the client of a third storage service sub-node receives a second IO request for the data object, receiving a registration request sent by the third storage service sub-node, and creating, for the data object, the IO access point record corresponding to the third storage service sub-node;
receiving a query request sent by the third storage service sub-node and querying the IO access point records of the data object; when the data object has multiple IO access point records, sending a cancel-execution notification to the storage service sub-node executing the existing data processing task; when the data object has only one IO access point record, comparing the access point information in the IO access point record before and after the IO access point record corresponding to the third storage service sub-node is created, and sending a cancel-execution notification to the storage service sub-node executing the existing data processing task if the access point information has changed.
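The cancellation decision described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: access point information is reduced to a node ID string, and the function name and the handling of the zero-record case are hypothetical rather than taken from the text.

```python
def should_notify_cancel(ids_before: list[str], ids_after: list[str]) -> bool:
    """Decide whether the node running the existing task is told to cancel.

    ids_before / ids_after hold the node IDs found in the data object's
    IO access point records before and after the third sub-node registers.
    """
    if len(ids_after) > 1:
        # Multiple access points now exist for the data object.
        return True
    if len(ids_after) == 1:
        # A single record: cancel only if the access point information changed.
        return ids_before != ids_after
    # Zero records is not described in the text; assumed here to need no notice.
    return False
```

For example, if the task runs on `node-2` and the second IO request also arrives at `node-2`, the record is unchanged and no notification is sent.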
The invention also provides another IO access point and data processing task management method, applied to a storage server, comprising the following steps:
S1, a client of a first storage service sub-node receives a first IO request for a data object or a data processing task management request for the data object;
S2, the first storage service sub-node sends a query request to a metadata service, or sends a query request and the data processing task management request to the metadata service, so that the metadata service receives the query request, or receives the query request and the data processing task management request, queries the IO access point records of the data object according to the query request, and operates on the IO access point records according to the number of IO access point records;
S6, the first storage service sub-node receives the IO access point record sent by the metadata service, and processes the first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to the access point information in the IO access point record;
or, the second storage service sub-node corresponding to the access point information in the IO access point record receives the data processing task management request sent by the metadata service;
S7, after receiving the first IO request, the second storage service sub-node completes the first IO request and returns the first IO request to the client along the original path;
or, after receiving the data processing task management request, the second storage service sub-node executes the data processing task management request and, after execution is completed, sends an execution result to the metadata service.
According to the IO access point and data processing task management method provided by the invention, processing the first IO request according to the IO access point record, or sending the first IO request to the second storage service sub-node corresponding to the access point information in the IO access point record, specifically comprises:
when the IO access point record is unique and the second storage service sub-node corresponding to the access point information in the IO access point record is the first storage service sub-node, the first storage service sub-node completes the first IO request;
when the IO access point record is unique and the second storage service sub-node corresponding to the access point information in the IO access point record is not the first storage service sub-node, the first storage service sub-node sends the first IO request to the second storage service sub-node;
when there is more than one IO access point record, the first storage service sub-node rejects the first IO request.
According to the IO access point and data processing task management method provided by the invention, after the data processing task management request is executed, the method further comprises:
when the client of a third storage service sub-node receives a second IO request for the data object, the third storage service sub-node sends the query request and a registration request to the metadata service, so that the metadata service receives the registration request and creates, for the data object, the IO access point record corresponding to the third storage service sub-node, and receives the query request and queries the IO access point records of the data object according to the query request.
The invention also provides an IO access point and data processing task management device, comprising:
a first receiving module, configured to receive a query request for a data object, or to receive a query request for the data object and a data processing task management request; the query request and the data processing task management request are both sent by a first storage service sub-node;
a first processing module, configured to query the IO access point records of the data object according to the query request and to operate on the IO access point records according to the number of IO access point records; each IO access point record comprises access point information, and each piece of access point information corresponds to one storage service sub-node;
a first sending module, configured to send the IO access point record to the first storage service sub-node, so that the first storage service sub-node processes a first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to the access point information, and the second storage service sub-node, after completing the first IO request, returns the first IO request to the client along the original path;
the first sending module is further configured to send the data processing task management request to the second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request, after which the execution result sent by the second storage service sub-node is received;
the first processing module is further configured to reject the data processing task management request according to the IO access point record;
the first receiving module is further configured to receive the execution result sent by the second storage service sub-node.
The invention also provides another IO access point and data processing task management device, comprising:
a client module, configured to receive a first IO request for a data object or a data processing task management request for the data object;
a second sending module, configured to send a query request to a metadata service, or to send the query request and the data processing task management request to the metadata service, so that the metadata service receives the query request, or receives the query request and the data processing task management request, queries the IO access point records of the data object according to the query request, and operates on the IO access point records according to the number of IO access point records;
a second receiving module, configured to receive the IO access point record sent by the metadata service, or to receive the data processing task management request sent by the metadata service;
the second sending module is further configured to send the first IO request to the IO access point and data processing task management device corresponding to the access point information in the IO access point record;
the second receiving module is further configured to receive the first IO request sent by another IO access point and data processing task management device;
a second processing module, configured to process the first IO request according to the IO access point record, or to execute the data processing task management request; the second processing module is further configured to complete the first IO request;
the second sending module is further configured to return the first IO request to the client module along the original path, or to send an execution result of executing the data processing task management request to the metadata service.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the IO access point and data processing task management method when executing the program.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the IO access point and data processing task management method as described in any one of the above.
The invention provides an IO access point and data processing task management method, device, equipment and medium. A client of a first storage service sub-node receives a first IO request or a data processing task management request for a data object, and the first storage service sub-node sends the metadata service a request to query the IO access point record of the data object. The metadata service queries and operates on the IO access point record of the data object and sends the IO access point record to the first storage service sub-node, so that the first storage service sub-node either processes the first IO request itself or forwards it to the second storage service sub-node corresponding to the access point information in the IO access point record for completion, which makes the IO link controllable. Alternatively, the metadata service sends the data processing task management request to the second storage service sub-node corresponding to the access point information in the IO access point record for processing. This strategy ensures that the storage service sub-node processing the IO requests of the data object is unique, which solves the IO processing ordering problem caused by multiple access points for the same data object, and also ensures that the storage service sub-node processing the IO requests is the same as the storage service sub-node processing the data processing task management requests, which makes it possible for the data processing task to track the IO. After the storage service sub-node corresponding to the access point information completes the first IO request, it returns the first IO request to the client along the original path; or, after the data processing task management request has been executed, it sends an execution result to the metadata service.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic general flow diagram of the IO access point and data processing task management method provided by the invention;
FIG. 2 is a schematic flow chart of IO access point management in the IO access point and data processing task management method provided by the invention;
FIG. 3 is a schematic flow chart of data processing task management in the IO access point and data processing task management method provided by the invention;
FIG. 4 is a flow chart of IO access point management for iSCSI data links under distributed storage in an embodiment of the invention;
FIG. 5 is a system block diagram of IO access point management for iSCSI data links under distributed storage in an embodiment of the invention;
FIG. 6 is a flow chart of IO access point management for a vHost data link under distributed storage in an embodiment of the invention;
FIG. 7 is a system block diagram of IO access point management for a vHost data link under distributed storage in an embodiment of the invention;
FIG. 8 is a schematic flow chart of the creation of a data processing task in the IO access point and data processing task management method provided by the invention;
FIG. 9 is a schematic flow chart of the suspension or resumption of a data processing task in the IO access point and data processing task management method provided by the invention;
FIG. 10 is a schematic flow chart of the cancellation and deletion of a data processing task in the IO access point and data processing task management method provided by the invention;
FIG. 11 is a schematic diagram of the processing flow when IO is performed on a data object that has a data processing task, in the IO access point and data processing task management method provided by the invention;
FIG. 12 is a schematic structural diagram of an IO access point and data processing task management device provided by the invention;
FIG. 13 is a schematic structural diagram of an electronic device provided by the invention;
FIG. 14 is a flow chart of the IO access point and data processing task management method provided by the invention as applied to the metadata server side;
FIG. 15 is a flow chart of the IO access point and data processing task management method provided by the invention as applied to the storage server side;
FIG. 16 is a schematic structural diagram of another IO access point and data processing task management device provided by the invention.
Reference numerals:
121: first receiving module; 122: first processing module; 123: first sending module; 161: second receiving module; 162: second processing module; 163: second sending module; 164: client module.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, in the embodiments of the invention, the metadata service is the metadata management service of the distributed system. The metadata service may be provided by a dedicated metadata server in the distributed storage system, or may run on any server node in the distributed storage system, for example on a storage server. A metadata server as described herein is therefore any server node that runs a metadata service. There may be one metadata service or several. If several metadata services run on dedicated metadata servers, they generally run on multiple metadata servers; if they run on storage servers, they generally run on multiple storage servers. When several metadata services run in a distributed system, they usually form a metadata service cluster that provides services externally as a whole. The storage service sub-nodes in the embodiments of the invention are storage servers in the distributed storage system. The first IO request and the second IO request described in this embodiment are both IO requests.
FIG. 14 is a schematic flow chart of the IO access point and data processing task management method applied to a metadata server according to an embodiment of the invention. As shown in FIG. 14, the method comprises:
S3, receiving a query request for the data object, or receiving a query request for the data object and a data processing task management request. The query request is a request to query the IO access point record of a data object. In this step, the query request and the data processing task management request are both sent by the first storage service sub-node; the data processing task management request is sent by the client of the first storage service sub-node. Each storage service sub-node has a client, and the client is part of the storage service sub-node. It should be understood that the first storage service sub-node described in this embodiment does not refer to a fixed storage server, but to whichever storage server in the distributed storage system has a client that receives a first IO request for a data object or a data processing task management request for the data object. For example, when the client of storage service sub-node 1 receives the first IO request for the data object, storage service sub-node 1 is the first storage service sub-node for that IO request; and when the client of storage service sub-node 2 receives the data processing task management request for the data object, storage service sub-node 2 is the first storage service sub-node for that data processing task management request.
The data processing task management request for the data object in the embodiment of the present invention may specifically include one of the following requests:
creation of a data processing task;
suspension or resumption of an existing data processing task;
cancellation or deletion of an existing data processing task.
It should be noted that, when the data processing task management request is the creation of a data processing task, executing the request means that a storage service sub-node creates and executes the data processing task on that storage service sub-node.
S4, querying the IO access point records of the data object according to the query request, and operating on the IO access point records according to the number of IO access point records. When a storage service sub-node is performing IO on a data object, or is performing a data processing task on the data object, the data object has an IO access point record corresponding to that storage service sub-node, and the IO access point record stores the access point information corresponding to that storage service sub-node. Each IO access point record stores access point information, and each piece of access point information corresponds to one storage service sub-node. For example, the access point information may include the ID of the storage service sub-node. The access point information identifies the storage service sub-node that is currently performing IO or a data processing task on the data object, and allows interaction with that storage service sub-node. When an IO access point record exists for the data object, this indicates that all current IO and data processing task management requests for the data object are processed by the storage service sub-node corresponding to the access point information in the IO access point record, that is, by the access point. A data object may have any number of IO access point records.
In this step, the metadata service queries the IO access point records of the data object and operates on them according to the number of IO access point records, which specifically includes:
when no IO access point record exists for the data object, the metadata service allocates a storage service sub-node to the data object, creates an IO access point record, and records the access point information of the storage service sub-node in the IO access point record.
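A minimal Python sketch of this query-then-create behaviour is given below for illustration. The class and method names, the in-memory dictionary and the pluggable `allocate` callback are all assumptions; the text does not describe how the metadata service stores records.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AccessPointRecord:
    node_id: str  # access point information; assumed here to be the node ID


class MetadataService:
    """Hypothetical in-memory stand-in for the metadata service in step S4."""

    def __init__(self, allocate: Callable[[str], str]) -> None:
        self._records: dict[str, list[AccessPointRecord]] = {}
        self._allocate = allocate  # picks a storage service sub-node for an object

    def query(self, object_id: str) -> list[AccessPointRecord]:
        records = self._records.setdefault(object_id, [])
        if not records:
            # No record yet: allocate a sub-node and record its access point info.
            records.append(AccessPointRecord(node_id=self._allocate(object_id)))
        return list(records)
```

A second query for the same data object returns the record created by the first, so the allocation callback runs only once per object in this sketch.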
It should be noted that the metadata service allocates a storage service sub-node to the data object in one of the following ways:
(1) The metadata service randomly allocates a storage service sub-node to the data object;
(2) The metadata service allocates to the data object the storage service sub-node with the lowest latency, as the optimal storage service sub-node, according to latency data between the client and each storage service sub-node;
(3) The metadata service allocates to the data object the storage service sub-node closest to the client, as the optimal storage service sub-node, according to the storage location information of the data object. It should be understood that in a distributed storage system a data object may have copies on multiple storage servers, that is, copies of the data object exist at different storage locations. The metadata service can therefore query the distance between the client and each storage server that stores a copy, and select the storage server storing the copy closest to the client as the optimal access point. Here, distance means distance in terms of network topology. For example, the closest is the local storage server on which the client is located; next is a storage server in the same rack under the same switch; then a storage server in the same rack under a different switch; and farthest is a storage server in a different rack under a different switch.
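The three allocation modes can be sketched in Python as follows. This is an illustration under assumptions: the function names, the `Distance` tiers and the input dictionaries are hypothetical, and the ordering of the topology tiers reflects one reading of the example in the text.

```python
import random
from enum import IntEnum


class Distance(IntEnum):
    """Assumed network-topology tiers, nearest first."""
    LOCAL = 0                    # storage server on which the client runs
    SAME_RACK_SAME_SWITCH = 1
    SAME_RACK_OTHER_SWITCH = 2
    OTHER_RACK_OTHER_SWITCH = 3


def allocate_random(nodes: list[str], rng: random.Random) -> str:
    # Mode (1): any storage service sub-node may become the access point.
    return rng.choice(nodes)


def allocate_lowest_latency(latency_ms: dict[str, float]) -> str:
    # Mode (2): latency_ms maps node ID to client-to-node latency.
    return min(latency_ms, key=latency_ms.get)


def allocate_nearest_replica(replica_distance: dict[str, Distance]) -> str:
    # Mode (3): only nodes holding a copy of the data object are candidates.
    return min(replica_distance, key=replica_distance.get)
```

Any of these functions could serve as the allocation step when no IO access point record exists for a data object.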
S5, sending the IO access point record to the first storage service sub-node, or rejecting the data processing task management request according to the IO access point record, or sending the data processing task management request to the second storage service sub-node corresponding to the access point information in the IO access point record.
In this step, the metadata service needs to perform a corresponding operation according to the type of request received by the client of the first storage service child node. As shown in fig. 2, when the first storage service sub-node receives the first IO request of the data object, the IO access point record is sent to the first storage service sub-node, so that the first storage service sub-node processes the first IO request according to the access point information in the IO access point record, or forwards the first IO request to the second storage service sub-node corresponding to the access point information, and the second storage service sub-node completes the first IO request. Specifically, when there is only one IO access point record and the second storage service sub-node corresponding to the access point information is the first storage service sub-node, the first storage service sub-node does not send the first IO request; when only one IO access point record exists and the second storage service sub-node corresponding to the access point information is different from the first storage service sub-node, the first storage service sub-node sends a first IO request to the second storage service sub-node; when there is more than one IO access point record for the data object, the first storage service child node denies the first IO request.
As shown in fig. 3, when the first storage service child node receives a data processing task management request for a data object, the metadata service needs to perform a corresponding operation according to the number of IO access point records of the data object. When the data object has only one IO access point record, the metadata service sends the data processing task management request to the storage service sub-node corresponding to the access point information in that IO access point record; when the data object has more than one IO access point record, the metadata service rejects the data processing task management request.
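The decision the metadata service makes in steps S4 and S5 can be sketched as below. The function name `handle_query`, the string request types, and the returned tuples are hypothetical; the records are modelled as a plain dictionary from a data object identifier to a list of access point node names.

```python
def handle_query(records, obj, request, allocate):
    """Sketch of the metadata service's handling of one query.

    records:  dict mapping object id -> list of access point node names
    request:  "io" for a first IO request, "task" for a task management request
    allocate: callable choosing a storage service sub-node for an object
    """
    points = records.setdefault(obj, [])
    if not points:
        # No record yet: allocate a sub-node and create the record.
        points.append(allocate(obj))
    if request == "io":
        # The record(s) go back to the first sub-node, which then processes,
        # forwards, or rejects the IO itself.
        return ("send_records", list(points))
    if len(points) > 1:
        # More than one access point: task management requests are rejected.
        return ("reject", None)
    # Exactly one access point: forward the task request to that sub-node.
    return ("forward_task_request", points[0])
```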
S8, after receiving the execution result sent by the second storage service child node, the metadata service returns the execution result to the client.
In this step, the metadata service returns the execution result to the client after receiving it. At this point, the lifecycle of a data processing task management request for the target data object ends. Through the management of the IO access point and the management of the data processing task, it is ensured that the storage service sub-node that processes IO requests is also the storage service sub-node that handles data processing task management requests, which makes it possible for the data processing task to track the IO requests.
Fig. 15 is a schematic flow chart of an IO access point and data processing task management method applied to a storage server according to an embodiment of the present invention, as shown in fig. 15, the method includes:
S1, a client of a first storage service child node receives a first IO request for a data object or a data processing task management request for the data object.
It should be understood that, the first storage service child node described in this embodiment does not refer to a fixed storage server, but refers to a storage server in a distributed storage system, where a client receives a first IO request for a data object or a data processing task management request for the data object. For example, when the client of the storage service sub-node 1 receives the first IO request for the data object, the storage service sub-node 1 is the first storage service sub-node in the task of this IO request; and when the client of the storage service sub-node 2 receives the data processing task management request for the data object, the storage service sub-node 2 is the first storage service sub-node in the data processing task management request task.
The data processing task management request for the data object in the embodiment of the present invention may specifically include one of the following requests:
creation of a data processing task;
suspension or resumption of an existing data processing task;
cancellation or deletion of an existing data processing task.
It should be noted that, when the data processing task management request is creation of a data processing task, a certain storage service sub-node executes the data processing task management request, that is, creates and executes the data processing task in the storage service sub-node.
S2, the first storage service sub-node sends a query request to the metadata service or sends the query request and the data processing task management request to the metadata service.
Specifically, as shown in fig. 2 and fig. 3, when the client of the first storage service sub-node receives a first IO request for a data object, the first storage service sub-node sends a query request to the metadata service; when the client of the first storage service sub-node receives a data processing task management request for a data object, the first storage service sub-node sends a query request to the metadata service, and the client of the first storage service sub-node sends the data processing task management request to the metadata service. The metadata service thus receives either a request to query the IO access point record of the data object, or that query request together with the data processing task management request. It then queries the IO access point records of the data object and acts on the first IO request or the IO access point record according to the number of IO access point records, specifically as follows:
if the data object has only one IO access point record, the IO access point record is left unchanged;
if the data object has two or more IO access point records, the first IO request is rejected;
if the data object does not have the IO access point record, the metadata service allocates a storage service sub-node to the data object, creates the IO access point record, and records the access point information of the storage service sub-node in the IO access point record.
When a data object has an IO access point record, this indicates that all IO requests and data processing task management requests for the data object are processed by the storage service sub-node corresponding to the access point information in that record, namely the access point. Each data object may have an indefinite number of IO access point records, each of which contains one piece of access point information. One piece of access point information corresponds to one storage service child node.
If the data object has an IO access point record, a storage service child node that performs IO on the data object or executes data processing task management requests for it already exists. If the data object has no IO access point record, no storage service sub-node has yet been assigned to perform IO or execute data processing task management requests for it, so the metadata service needs to allocate a storage service sub-node to the data object, create an IO access point record, and record the access point information of the allocated storage service sub-node in the IO access point record.
It should be noted that, the metadata service allocates a storage service child node to the data object, which specifically includes one of the following ways:
(1) The metadata service randomly allocates a storage service child node for the data object;
(2) The metadata service allocates the storage service sub-node with the lowest latency to the data object as the optimal storage service sub-node, according to latency data between the client and each storage service sub-node;
(3) The metadata service allocates the storage service sub-node closest to the client to the data object as the optimal storage service sub-node, according to the storage location information of the data object. It should be understood that in a distributed storage system a data object may have copies on multiple storage servers, that is, the data object has multiple storage locations. The metadata service may therefore query the distance between the client and each storage server that stores a copy, and select the storage server closest to the client as the optimal access point. Here, distance refers to distance in the network topology. For example, the closest is the local storage server where the client is located; next is a storage server in the same rack under the same switch; then a storage server in the same rack under a different switch; and the farthest is a storage server in a different rack under a different switch.
S6, the first storage service sub-node receives the IO access point record sent by the metadata service, processes the first IO request according to the IO access point record, or sends the first IO request to the second storage service sub-node corresponding to the access point information in the IO access point record; or the second storage service sub-node corresponding to the access point information in the IO access point record receives the data processing task management request sent by the metadata service.
The metadata service needs to perform a corresponding operation according to the request type received by the client of the first storage service child node. As shown in fig. 2, when the client receives a first IO request for a data object, in this step the first storage service sub-node receives the IO access point record sent by the metadata service, and either processes the first IO request according to the IO access point record or sends the first IO request to a second storage service sub-node, where the second storage service sub-node is the storage service sub-node corresponding to the access point information in the IO access point record. Specifically, when there is only one IO access point record and the second storage service sub-node corresponding to the access point information is the first storage service sub-node itself, the first storage service sub-node does not forward the first IO request; when there is only one IO access point record and the second storage service sub-node corresponding to the access point information is different from the first storage service sub-node, the first storage service sub-node sends the first IO request to the second storage service sub-node; when the data object has more than one IO access point record, the first storage service sub-node rejects the first IO request, in order to avoid the IO disorder that would result from multiple storage service sub-nodes performing IO on the data object at the same time.
As shown in fig. 3, when the client receives a management request for a data processing task of a data object, in this step, the second storage service sub-node receives the management request for the data processing task sent by the metadata service, and the second storage service sub-node is a storage service sub-node corresponding to the access point information in the IO access point record.
S7, after the second storage service child node receives the first IO request, it completes the first IO request, and the response is returned to the client along the original path; or, when the second storage service sub-node receives the data processing task management request, it executes the data processing task management request and, after execution is finished, sends the execution result to the metadata service.
Specifically, as shown in fig. 2, when the client of the first storage service sub-node receives a first IO request for a data object, in this step: when there is only one IO access point record and the second storage service sub-node corresponding to the access point information is the first storage service sub-node itself, the first storage service sub-node completes the first IO request and returns the response to the client along the original path; when there is only one IO access point record and the second storage service sub-node corresponding to the access point information is different from the first storage service sub-node, the second storage service sub-node completes the first IO request after receiving it from the first storage service sub-node, and the response is returned along the original path to the client of the first storage service sub-node. For example, client A of storage service sub-node 1 receives a first IO request for the data object; storage service sub-node 1 forwards the IO to storage service sub-node 2 for processing according to the IO access point record sent by the metadata service; after storage service sub-node 2 completes the first IO request, it returns the response to storage service sub-node 1, which in turn returns it to client A. After the metadata service detects that client A has disconnected, it deletes the IO access point record of the data object that was generated to complete the first IO request.
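The routing rule applied by the first storage service sub-node, including the return along the original path, can be sketched as a function that lists the hops of one IO. The function name `io_path` and the literal `"client"` hop label are illustrative assumptions.

```python
def io_path(first_node, access_points):
    """Return the hop sequence of a first IO request, or None if it is rejected.

    first_node:    name of the sub-node whose client received the IO
    access_points: node names taken from the IO access point records
    """
    if not access_points:
        # The metadata service always allocates a record first, so this
        # situation is treated as a caller error in this sketch.
        raise ValueError("no IO access point record was supplied")
    if len(access_points) > 1:
        # Several access points: reject to avoid out-of-order IO.
        return None
    target = access_points[0]
    if target == first_node:
        # The first sub-node is itself the access point: no forwarding.
        return ["client", first_node, "client"]
    # Forward to the access point, then return along the same path.
    return ["client", first_node, target, first_node, "client"]
```

With the example from the text, `io_path("node1", ["node2"])` yields the path client, node1, node2, node1, client.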
As shown in fig. 3, when the client sends a data processing task management request for a data object to the first storage service sub-node, in this step the second storage service sub-node executes the data processing task management request upon receiving it and, after execution is completed, sends the execution result to the metadata service. After receiving the execution result, the metadata service returns it to the client. Through the management of the IO access point and the management of the data processing task, the IO link of the data object is kept controllable, and the storage service sub-node that finally processes IO requests for the data object is guaranteed to be unique. This avoids the IO disorder, and the resulting data correctness problems, that arise when multiple storage service sub-nodes perform IO on the same data object at the same time. It also ensures that the storage service sub-node that processes IO requests is the same storage service sub-node that handles data processing task management requests, which makes it possible for the data processing task to track the IO.
The two IO access point and data processing task management methods above are applied together in a distributed storage system to form the complete processing flow, shown in fig. 1, for an IO request or a data processing task management request for a data object. The flow comprises steps S1-S8 in sequence, wherein S1-S2 and S6-S7 are executed by a storage server, and S3-S5 and S8 are executed by a metadata server.
For the above IO access point and data processing task management method, since the connection through which a user initiates an IO request may use several types of link, the following describes, for two common types of user access link, how a storage service child node forwards requests after receiving the access point information of the data object:
Type one: IO access point management for iSCSI data links under distributed storage
The Internet Small Computer System Interface (iSCSI) is an open standard for transferring data blocks over internet protocols, in particular over Ethernet, and is designed in a client-server mode. A distributed storage system may provide IO capability for data objects by exposing an iSCSI service interface. The iSCSI protocol carries control instructions and data over the Transmission Control Protocol (TCP). The iSCSI client (initiator) establishes an iSCSI session with a storage resource (iSCSI Target) of the iSCSI server (Target Server), maps an iSCSI logical unit number (LUN) to a specific data object, and then performs IO.
Under the iSCSI access link, before initiating IO to a data object the user must first complete a client authentication handshake with the server; after authentication is completed, the initiator can initiate IO to the data object address returned by the iSCSI Target server.
As shown in fig. 4 and fig. 5, the IO access point management flow of the distributed storage system in the iSCSI access mode is as follows:
1. The user's iSCSI initiator attempts an authentication handshake with the iSCSI access service of a storage service child node.
2. Using the data object information in the authentication request, the iSCSI access service of the storage service child node queries the distributed system metadata service for the IO access point information of the data object. Specifically:
if the data object has IO access point information, the connection information of the corresponding access point is returned directly;
if the data object has no IO access point information, the metadata service preferentially takes the storage service sub-node that is currently processing the iSCSI initiator request as the IO access point, and if that storage service sub-node is unhealthy, one IO access point is selected from the remaining healthy storage service sub-nodes by hashing;
3. After the iSCSI access service obtains the IO access point information of the data object, it returns the access point address to the iSCSI initiator.
4. The user's iSCSI initiator connects to the iSCSI storage service child node corresponding to the IO access point, and then performs IO.
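The selection rule in step 2 can be sketched as follows. The text only says that a hash is used to pick among the remaining healthy nodes; the use of SHA-256 over the data object identifier, the sorting of candidates, and all names in this sketch are assumptions made for illustration.

```python
import hashlib


def select_iscsi_access_point(existing, current_node, healthy_nodes, object_id):
    """Pick the IO access point returned to the iSCSI initiator.

    existing:      access point already recorded for the object, or None
    current_node:  sub-node currently handling the initiator's login
    healthy_nodes: set of sub-node names considered healthy
    object_id:     identifier of the data object (hypothetical hash key)
    """
    if existing is not None:
        # A record exists: return the recorded access point unchanged.
        return existing
    if current_node in healthy_nodes:
        # Prefer the node already talking to the initiator.
        return current_node
    candidates = sorted(n for n in healthy_nodes if n != current_node)
    if not candidates:
        raise RuntimeError("no healthy storage service sub-node available")
    # Deterministic hash so repeated lookups give the same node.
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return candidates[int.from_bytes(digest[:8], "big") % len(candidates)]
```

Sorting the candidates before indexing keeps the choice stable regardless of set iteration order.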
Type two: IO access point management for vHost data links in distributed systems
The vHost data link is an inter-process IO link optimized for virtualized systems under the Linux operating system and follows a client/server mode. Because the vHost client connects to and communicates with the vHost server through an inter-process communication socket (Unix domain socket), there is no network layer overhead and IO can bypass the system kernel, which makes it a higher-performance IO link scheme. The vHost server is typically integrated as a module in the storage service of the child node. The implementation of the vHost data link requires that the vHost client and the vHost server be on the same physical child node.
In the case of the vHost access link, the connection characteristics require that the IO access point must be guaranteed to be the current node, and the IO access point management flow is simplified to be an exclusive and preemptive management method, as shown in fig. 6 and fig. 7, the detailed flow is as follows:
1. When a user's vHost client process is created, it needs to register for IO access permission to the data object with the distributed storage metadata service, using the information of the storage node where it is currently located and the ID information of the vHost client.
2. The metadata service updates the IO access permission and notifies all storage service sub-nodes of it. Only data object IO requests initiated by the current vHost client on the permitted storage service sub-node are processed; IO requests to the data object that a storage service sub-node receives from other storage service sub-nodes or other vHost clients are rejected. Wherein:
If access point information for the data object already exists and differs from the information in the application, the current access application information overwrites the old access information, the updated information is synchronized to all storage service child nodes, and the permission is returned as granted.
Otherwise, the metadata service records the IO access permission information, synchronizes it to the storage services of all storage service sub-nodes, and returns the permission as granted after synchronization is completed.
3. After the user vHost client completes the local IO access registration, the user vHost client can directly communicate with a local storage system.
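The exclusive, preemptive behaviour of this flow can be sketched with a small in-memory table. The class and method names are hypothetical, and synchronization to the sub-nodes is modelled as copying into per-node dictionaries rather than as real network calls.

```python
class VhostPermissionTable:
    """Sketch of the metadata service's IO access permits for vHost links."""

    def __init__(self, storage_nodes):
        # Authoritative permit per data object: (node, vhost_client_id).
        self.permit = {}
        # Copy of the permits as synchronized to each storage sub-node.
        self.synced = {node: {} for node in storage_nodes}

    def register(self, obj, node, client_id):
        # A new registration overwrites any older permit (preemption).
        self.permit[obj] = (node, client_id)
        for copy in self.synced.values():
            copy[obj] = (node, client_id)
        # Permission is reported as granted only after synchronization.
        return True

    def node_accepts_io(self, receiving_node, obj, client_id):
        # A sub-node serves IO only for the permitted node and client pair.
        return self.synced[receiving_node].get(obj) == (receiving_node, client_id)
```

For example, after `register("vol1", "n2", "c2")` replaces an earlier permit held by client `c1` on `n1`, IO from `c1` on `n1` is refused while IO from `c2` on `n2` is served.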
Next, the present embodiment specifically describes a data processing task management section in the IO access point and the data processing task management method of the present invention in three aspects of creation of a data processing task, suspension or resumption of a data processing task, cancellation and deletion of a data processing task:
1. Creation of data processing tasks.
As shown in fig. 8, the creation of the data processing task requires the following steps:
(1) The client of the first storage service child node receives a data processing task creation request for the data object, and then the client sends the data processing task creation request and an IO access point record query request for the data object to the metadata service.
(2) The metadata service receives the data processing task creation request.
(3) The metadata service retrieves a target data object of the data processing task and queries whether the data object has an IO access point record, i.e., determines whether the data object has an access point.
If the data object only has one IO access point record, namely only has one access point, the creation condition is met, and the step (4) is entered;
if the data object does not have an IO access point record, the metadata service allocates an access point to the data object, creates the IO access point record of the data object, records the access point information of the access point in the IO access point record, and then proceeds to step (4);
if the data object has more than one IO access point record, the creation request is rejected.
(4) The metadata service sends the data processing task creation request to the second storage service sub-node corresponding to the access point information in the IO access point record.
(5) The second storage service child node corresponding to the access point information in the IO access point record executes the data processing task and notifies the metadata service that the task was created successfully.
2. A suspension or resumption of the data processing task.
The suspension or resumption of the data processing task is performed by suspending or resuming the data processing task existing in the storage service child node.
As shown in fig. 9, the suspension or resumption of the data processing task requires the following steps to be performed:
(1) The client of the first storage service child node receives a pause or resume request for a data processing task, and the client sends the pause or resume request for the data processing task to the metadata service.
(2) The metadata service receives a pause or resume request for the data processing task.
(3) The metadata service retrieves a target data object for the data processing task and queries whether the data object has an IO access point record.
If the data object has only one IO access point record, the condition is met, and the flow proceeds to step (4);
if the data object does not have an IO access point record, the metadata service allocates an access point to the data object, creates the IO access point record of the data object, records the access point information of the access point in the IO access point record, and then proceeds to step (4);
if the data object has more than one IO access point record, the pause or resume request for the data processing task is rejected.
(4) The metadata service sends the pause or resume request for the data processing task to the second storage service sub-node corresponding to the access point information in the IO access point record.
(5) The second storage service child node corresponding to the access point information in the IO access point record pauses or resumes the existing data processing task and sends the processing result to the metadata service.
(6) The metadata service returns the execution result of the pause or resume request of the data processing task to the client.
3. Cancellation and deletion of data processing tasks.
Cancellation and deletion of a data processing task means canceling and deleting a data processing task that exists in a storage service child node. When the data processing task management request is the cancellation or deletion of an existing data processing task, step S8 of the foregoing IO access point and data processing task management method further includes, after receiving the execution result of the data processing task management request sent by the second storage service child node and before returning the execution result to the client: when the IO access point record of the data object was generated by the existing data processing task, the metadata service deletes the IO access point record.
Specifically, as shown in fig. 10, the cancellation and deletion of the data processing task is required to be performed through the following steps:
(1) The client of the first storage service sub-node receives a cancellation and deletion request of the data processing task, and the client sends the cancellation and deletion request of the data processing task to the metadata service.
(2) The metadata service receives cancellation and deletion requests for the data processing task.
(3) The metadata service retrieves a target data object for the data processing task and queries whether the data object has an IO access point record.
If the data object has only one IO access point record, the condition is met, and the flow proceeds to step (4);
if the data object does not have an IO access point record, the metadata service allocates an access point to the data object, creates the IO access point record of the data object, records the access point information of the access point in the IO access point record, and then proceeds to step (4);
if the data object has more than one IO access point record, the cancellation and deletion request for the data processing task is rejected.
(4) The metadata service sends the cancellation and deletion request for the data processing task to the second storage service sub-node corresponding to the access point information in the IO access point record.
(5) The second storage service child node corresponding to the access point information in the IO access point record cancels and deletes the existing data processing task and sends the processing result to the metadata service.
(6) When the existing data processing task has been cancelled and deleted, the metadata service needs to check whether the access point information in the IO access point record of the data object is the access point information generated by the existing data processing task, and if yes, the metadata service deletes the access point information.
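The three task management flows share the same record check and differ mainly in the final cleanup. A minimal sketch under stated assumptions: the `AccessPointRecord` type, its `created_by` field, the operation names, and the `execute` callback (standing in for the second storage service sub-node) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessPointRecord:
    node: str
    created_by: str  # assumed tag: "io" or "task:<task id>"


def manage_task(records, obj, task_id, op, allocate, execute):
    """Sketch of create / pause / resume / cancel / delete handling.

    records:  dict mapping object id -> list of AccessPointRecord
    allocate: callable choosing a sub-node when no record exists
    execute:  callable (node, op, task_id) -> result, run on the access point
    """
    points = records.setdefault(obj, [])
    if len(points) > 1:
        # More than one access point: the request is rejected.
        return "rejected"
    if not points:
        # No access point yet: allocate one and remember the task created it.
        points.append(AccessPointRecord(allocate(obj), f"task:{task_id}"))
    result = execute(points[0].node, op, task_id)
    if op in ("cancel", "delete") and points[0].created_by == f"task:{task_id}":
        # Step (6): drop a record that exists only because of this task.
        points.clear()
    return result
```

A record tagged `"io"` survives cancellation, since it was created for a client's IO rather than for the task.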
In a distributed system, a data processing task is generally a long-running task. While an existing data processing task is executing, a third storage service sub-node (that is, any storage service sub-node) may attempt to make a second IO request to the data object, where the second IO request refers to an IO request made to the data object while the data object has a data processing task. In this case, step S5 of the above IO access point and data processing task management method further includes, after sending the data processing task management request to the second storage service sub-node corresponding to the access point information:
when the existing data processing task has not finished executing and the client of the third storage service sub-node receives a second IO request for the data object, receiving a registration request sent by the third storage service sub-node, and creating an IO access point record corresponding to the third storage service sub-node for the data object;
Receiving a query request sent by a third storage service sub-node, querying IO access point records of a data object, and sending a notification of cancellation of execution to the storage service sub-node executing the existing data processing task when a plurality of IO access point records exist in the data object; when only one IO access point record exists in the data object, comparing changes of access point information in the IO access point record before and after the IO access point record corresponding to the third storage service sub-node is created, and sending a notification of canceling execution to the storage service sub-node executing the existing data processing task under the condition that the access point information changes.
Specifically, as shown in fig. 11, when a new storage service child node attempts to make an IO request for a data object during execution of an existing data processing task, the following steps are performed:
(1) If the existing data processing task for a certain data object has not finished executing, and the client of the third storage service sub-node receives IO for the data object, the third storage service sub-node sends a query request and a registration request to the metadata service;
(2) The metadata service receives a registration request sent by a third storage service sub-node, and creates an IO access point record corresponding to the storage service sub-node for the data object;
(3) The metadata service queries the IO access point record of the data object according to the received query request, and if the data object has a plurality of IO access point records (i.e. changes from one to a plurality of IO access point records), the metadata service sends a notification to cancel the existing data processing task of the data object to the storage service child node that is executing the existing data processing task. The data object has a plurality of IO access point records, which indicate that the storage service sub-node which is executing the IO of the data object and the storage service sub-node which is executing the data processing task of the data object are not the same, so that the data processing task of the data object is required to be canceled currently, and the problem of IO processing time sequence caused by the same data object and a plurality of access points is avoided.
(4) If the number of the IO access point records is only one, the metadata service compares whether access point information in the IO access point records of the data object changes before and after the IO access point records corresponding to the storage service child node are created, and if so, the metadata service sends a notification for canceling the existing data processing task of the data object to the storage service child node which is executing the existing data processing task. In this step, the number of the IO access point records of the data object is only one, which indicates that the storage service child node that is executing the IO of the data object is the same as the storage service child node that is executing the data processing task of the data object, and because the priority of the IO access to the data object is higher than that of the data processing task of the data object, the current data processing task to the data object needs to be canceled, and the IO of the data object is preferentially executed.
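Steps (2) to (4) can be sketched as one function. The text does not say how a registration from the node that already is the access point is stored; this sketch assumes such a registration does not add a duplicate record, which is what allows the "only one record, unchanged" case to occur. All names are hypothetical.

```python
def register_io_during_task(access_points, third_node, task_node, notify_cancel):
    """Register an IO access point while a task runs; cancel the task if needed.

    access_points: list of node names recorded for the data object (mutated)
    third_node:    sub-node whose client received the second IO request
    task_node:     sub-node currently executing the data processing task
    notify_cancel: callable invoked with task_node to cancel the task
    Returns True if a cancel notification was sent.
    """
    before = list(access_points)
    if third_node not in access_points:
        # Assumption: re-registering the same node adds no new record.
        access_points.append(third_node)
    if len(access_points) > 1 or access_points != before:
        # Several access points, or the single access point changed:
        # IO takes priority, so the running task is told to cancel.
        notify_cancel(task_node)
        return True
    return False
```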
Based on the same inventive concept, a further embodiment of the present invention provides an IO access point and data processing task management device. The device is described below; the description of the device and the description of the IO access point and data processing task management method above may be read with reference to each other.
As shown in fig. 12, the IO access point and data processing task management device provided by the present invention includes a first receiving module 121, a first processing module 122, and a first sending module 123.
A first receiving module 121, configured to receive a query request for a data object, or receive a query request for a data object and a data processing task management request; both the query request and the data processing task management request are sent by the storage service child node.
The first processing module 122 is configured to query the IO access point records of the data object according to the query request, and operate the IO access point records according to the number of the IO access point records; the IO access point record contains access point information, and one access point information corresponds to one storage service child node.
The first sending module 123 is configured to send the IO access point record to the first storage service sub-node, so that the first storage service sub-node processes the first IO request according to the IO access point record, or sends the first IO request to the second storage service sub-node corresponding to the access point information, and the second storage service sub-node, after completing the first IO request, returns the first IO request to the client along the original path. The first sending module 123 is further configured to send a data processing task management request to the second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request, after which the execution result sent by the second storage service sub-node is received.
The first processing module 122 is further configured to reject the data processing task management request according to the IO access point record.
The first receiving module 121 is further configured to receive an execution result sent by the second storage service child node.
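As a non-limiting illustration, the way the first processing module may operate on the IO access point records according to their number can be sketched as follows. The function name, the return values and the list-based record store are hypothetical; the zero-record branch follows the allocation described for a data object that has no IO access point record, using random allocation as one of the listed options.

```python
import random


def handle_io_query(records, obj_id, requester, nodes, rng=random):
    """Return (action, access_point) for an IO query sent by `requester`.

    records: dict mapping data object id -> list of access point node ids
    nodes:   list of candidate storage service sub-node ids
    """
    access_points = records.setdefault(obj_id, [])
    if not access_points:
        # No record yet: allocate a sub-node and create the record.
        access_points.append(rng.choice(nodes))
    if len(access_points) > 1:
        # More than one IO access point record: the IO request is rejected.
        return ("reject", None)
    target = access_points[0]
    if target == requester:
        # The requester itself is the access point and processes the IO.
        return ("process_locally", target)
    # Another sub-node is the access point; the IO request is forwarded to it.
    return ("forward", target)
```

The returned action tells the first storage service sub-node whether to process the first IO request itself, forward it to the second storage service sub-node, or reject it.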
Based on the same inventive concept, a further embodiment of the present invention provides another IO access point and data processing task management device. The device is described below; the description of the device and the description of the IO access point and data processing task management method above may be read with reference to each other.
As shown in fig. 16, the IO access point and data processing task management device provided by the present invention includes a second receiving module 161, a second processing module 162, a second sending module 163, and a client module 164.
The client module 164 is configured to receive a first IO request for a data object or a data processing task management request for the data object.
The second sending module 163 is configured to send a query request to the metadata service, or send a query request and a data processing task management request to the metadata service, so that the metadata service receives the query request for the data object, or receives the query request for the data object and the data processing task management request, queries the IO access point record of the data object according to the query request, and operates on the IO access point record according to the number of the IO access point records.
The second receiving module 161 is further configured to receive an IO access point record sent by the metadata service, or receive a data processing task management request sent by the metadata service.
The second sending module 163 is further configured to send the first IO request to an IO access point and a data processing task management device corresponding to the access point information in the IO access point record.
The second receiving module 161 is further configured to receive a first IO request sent by another IO access point and data processing task management device.
The second processing module 162 is configured to process the first IO request according to the IO access point record, or is configured to execute a data processing task management request, and is further configured to complete the first IO request.
The second sending module 163 is further configured to return the first IO request to the client module 164 along the original path, or send an execution result to the metadata service.
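As a non-limiting illustration, the cooperation of the client module, the second sending module, the second receiving module and the second processing module on a storage service sub-node can be sketched as follows. The class, its method names and the `lookup` callable standing in for the query to the metadata service are hypothetical; network transport is replaced by direct method calls, so "returning along the original path" appears here simply as the return value propagating back through the forwarding node.

```python
class StorageNode:
    def __init__(self, name, lookup, peers):
        self.name = name
        self.lookup = lookup  # callable: data object id -> list of access point node names
        self.peers = peers    # shared dict: node name -> StorageNode

    def complete_io(self, obj_id, payload):
        # Stand-in for the real IO; returns a description of what was done.
        return f"{self.name} completed {payload} on {obj_id}"

    def handle_client_io(self, obj_id, payload):
        # Query the metadata service for the IO access point records.
        access_points = self.lookup(obj_id)
        if len(access_points) != 1:
            # More than one record: reject. A missing record is assumed to
            # have been resolved by the metadata service before it replies.
            raise RuntimeError("IO rejected: no unique IO access point")
        target = access_points[0]
        if target == self.name:
            # This node is the access point: complete the IO locally.
            return self.complete_io(obj_id, payload)
        # Forward to the access point; the result comes back via this node.
        return self.peers[target].complete_io(obj_id, payload)
```

For example, if the only access point of a data object is "n2", a request arriving at "n1" is completed by "n2" and its result is handed back to the client through "n1".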
Based on the same inventive concept, a further embodiment of the present invention provides an electronic device. Fig. 13 illustrates a physical structure diagram of an electronic device, as shown in fig. 13, which may include: processor 131, communication interface (Communications Interface) 132, memory 133 and communication bus 134, wherein processor 131, communication interface 132, memory 133 accomplish the communication between each other through communication bus 134. Processor 131 may invoke logic instructions in memory 133 to perform the IO access point and data processing task management method comprising:
S3, receiving a query request for the data object or receiving a query request for the data object and a data processing task management request; the query request and the data processing task management request are both sent by a first storage service child node;
s4, inquiring IO access point records of the data object according to the inquiring request, and operating the IO access point records according to the number of the IO access point records; the IO access point records comprise access point information, and one access point information corresponds to one storage service child node;
s5, sending the IO access point record to the first storage service sub-node, so that the first storage service sub-node processes a first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to the access point information, and the second storage service sub-node returns the first IO request to the client in an original way after completing the first IO request;
or,
rejecting the data processing task management request according to the IO access point record, or sending the data processing task management request to a second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request;
S8, after receiving an execution result of the data processing task management request sent by the second storage service sub-node, returning the execution result to the client;
or,
processor 131 may invoke logic instructions in memory 133 to perform the IO access point and data processing task management method comprising:
s1, a client of a first storage service child node receives a first IO request for a data object or a data processing task management request for the data object;
s2, the first storage service sub-node sends a query request to a metadata service, or sends a query request and the data processing task management request to the metadata service, so that the metadata service receives the query request, or receives the query request and the data processing task management request, queries IO access point records of the data object according to the query request, and operates the IO access point records according to the number of the IO access point records;
s6, the first storage service sub-node receives the IO access point record sent by the metadata service, processes the first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to access point information in the IO access point record;
Or, the second storage service sub-node corresponding to the access point information in the IO access point record receives the data processing task management request sent by the metadata service;
s7, after receiving the first IO request, the second storage service sub-node completes the first IO request and returns the first IO request to the client along the original path;
or after receiving the data processing task management request, the second storage service sub-node executes the data processing task management request, and after the execution is completed, the second storage service sub-node sends an execution result to the metadata service.
Further, the logic instructions in the memory 133 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
In another aspect, the present invention also provides a computer program product, the computer program product including a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of executing an IO access point and a data processing task management method as described above, the method comprising:
s3, receiving a query request for the data object or receiving a query request for the data object and a data processing task management request; the query request and the data processing task management request are both sent by a first storage service child node;
s4, inquiring IO access point records of the data object according to the inquiring request, and operating the IO access point records according to the number of the IO access point records; the IO access point records comprise access point information, and one access point information corresponds to one storage service child node;
s5, sending the IO access point record to the first storage service sub-node, so that the first storage service sub-node processes a first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to the access point information, and the second storage service sub-node returns the first IO request to the client in an original way after completing the first IO request;
or,
rejecting the data processing task management request according to the IO access point record, or sending the data processing task management request to a second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request;
s8, after receiving an execution result of the data processing task management request sent by the second storage service sub-node, returning the execution result to the client;
or,
when the computer program is executed by the processor, the computer can execute the above-mentioned another method for managing the IO access point and the data processing task, and the method comprises the following steps:
s1, a client of a first storage service child node receives a first IO request for a data object or a data processing task management request for the data object;
s2, the first storage service sub-node sends a query request to a metadata service, or sends a query request and the data processing task management request to the metadata service, so that the metadata service receives the query request, or receives the query request and the data processing task management request, queries IO access point records of the data object according to the query request, and operates the IO access point records according to the number of the IO access point records;
S6, the first storage service sub-node receives the IO access point record sent by the metadata service, processes the first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to access point information in the IO access point record;
or, the second storage service sub-node corresponding to the access point information in the IO access point record receives the data processing task management request sent by the metadata service;
s7, after receiving the first IO request, the second storage service sub-node completes the first IO request and returns the first IO request to the client along the original path;
or after receiving the data processing task management request, the second storage service sub-node executes the data processing task management request, and after the execution is completed, the second storage service sub-node sends an execution result to the metadata service.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform an IO access point and data processing task management method provided above, the method comprising:
S3, receiving a query request for the data object or receiving a query request for the data object and a data processing task management request; the query request and the data processing task management request are both sent by a first storage service child node;
s4, inquiring IO access point records of the data object according to the inquiring request, and operating the IO access point records according to the number of the IO access point records; the IO access point records comprise access point information, and one access point information corresponds to one storage service child node;
s5, sending the IO access point record to the first storage service sub-node, so that the first storage service sub-node processes a first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to the access point information, and the second storage service sub-node returns the first IO request to the client in an original way after completing the first IO request;
or,
rejecting the data processing task management request according to the IO access point record, or sending the data processing task management request to a second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request;
S8, after receiving an execution result of the data processing task management request sent by the second storage service sub-node, returning the execution result to the client;
or,
the computer program, when executed by a processor, is implemented to perform another IO access point and data processing task management method provided above, the method comprising:
s1, a client of a first storage service child node receives a first IO request for a data object or a data processing task management request for the data object;
s2, the first storage service sub-node sends a query request to a metadata service, or sends a query request and the data processing task management request to the metadata service, so that the metadata service receives the query request, or receives the query request and the data processing task management request, queries IO access point records of the data object according to the query request, and operates the IO access point records according to the number of the IO access point records;
s6, the first storage service sub-node receives the IO access point record sent by the metadata service, processes the first IO request according to the IO access point record, or sends the first IO request to a second storage service sub-node corresponding to access point information in the IO access point record;
Or, the second storage service sub-node corresponding to the access point information in the IO access point record receives the data processing task management request sent by the metadata service;
s7, after receiving the first IO request, the second storage service sub-node completes the first IO request and returns the first IO request to the client along the original path;
or after receiving the data processing task management request, the second storage service sub-node executes the data processing task management request, and after the execution is completed, the second storage service sub-node sends an execution result to the metadata service.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence, or the part of it that contributes to the prior art, may be embodied in the form of a software product, which may be stored in a computer readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. An IO access point and data processing task management method, characterized in that the method is applied to a metadata server and comprises the following steps:
s3, receiving a query request for the data object or receiving a query request for the data object and a data processing task management request; the query request and the data processing task management request are both sent by a first storage service child node;
s4, inquiring IO access point records of the data object according to the inquiring request, and operating the IO access point records according to the number of the IO access point records; the IO access point records comprise access point information, and one access point information corresponds to one storage service child node;
s5, when the IO access point record is only one, and the second storage service sub-node corresponding to the access point information is a first storage service sub-node, the IO access point record is sent to the first storage service sub-node, so that the first storage service sub-node processes a first IO request according to the IO access point record, and when the IO access point record is only one, and the second storage service sub-node corresponding to the access point information is different from the first storage service sub-node, the first IO request is sent to the second storage service sub-node corresponding to the access point information, and after the first IO request is completed by the second storage service sub-node, the first IO request is returned to a client of the first storage service sub-node in an original way; rejecting the first IO request when more than one IO access point record exists in the data object;
or,
rejecting the data processing task management request according to the IO access point record, or sending the data processing task management request to a second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request;
and S8, after receiving an execution result of the data processing task management request sent by the second storage service sub-node, returning the execution result to the client.
2. The IO access point and data processing task management method according to claim 1, wherein the operation on the IO access point record according to the number of the IO access point records specifically includes:
and when the IO access point record does not exist in the data object, a storage service sub-node is allocated for the data object, the IO access point record is created, and the access point information of the storage service sub-node is recorded in the IO access point record.
3. The IO access point and data processing task management method according to claim 2, wherein a storage service child node is allocated to the data object, specifically comprising one of the following ways:
Randomly allocating a storage service child node for the data object;
distributing a storage service child node with the lowest delay for the data object according to the delay data of the client and each storage service child node;
and distributing a storage service child node closest to the client to the data object according to the storage position information of the data object.
4. The IO access point and the data processing task management method according to claim 1, wherein the data processing task management request specifically includes one of the following requests:
creating a data processing task;
suspension or resumption of existing data processing tasks;
the cancellation or deletion of existing data processing tasks.
5. The IO access point and the data processing task management method according to claim 4, wherein when the data processing task management request is a cancellation or deletion of the existing data processing task, after receiving an execution result of the data processing task management request sent by the second storage service child node and before returning the execution result to the client, further comprising:
And deleting the IO access point record of the data object when the IO access point record of the data object is generated by the existing data processing task.
6. The IO access point and data processing task management method according to claim 4, further comprising, after sending the data processing task management request to the second storage service child node corresponding to the access point information:
when the existing data processing task is not executed yet, and the client of the third storage service sub-node receives a second IO request for the data object, receiving a registration request sent by the third storage service sub-node, and creating the IO access point record corresponding to the third storage service sub-node for the data object;
receiving a query request sent by the third storage service sub-node, querying the IO access point record of the data object, and sending a notification of cancellation of execution to the storage service sub-node executing the existing data processing task when a plurality of IO access point records exist in the data object; when the data object only has one IO access point record, comparing the changes of the access point information in the IO access point record before and after the IO access point record corresponding to the third storage service sub-node is created, and sending a notification of canceling execution to the storage service sub-node executing the existing data processing task under the condition that the access point information changes.
7. An IO access point and data processing task management method, characterized in that the method is applied to a storage server and comprises the following steps:
s1, a client of a first storage service child node receives a first IO request for a data object or a data processing task management request for the data object;
s2, the first storage service sub-node sends a query request to a metadata service, or sends a query request and the data processing task management request to the metadata service, so that the metadata service receives the query request, or receives the query request and the data processing task management request, queries IO access point records of the data object according to the query request, and operates the IO access point records according to the number of the IO access point records;
s6, the first storage service sub-node receives the IO access point record sent by the metadata service; when the IO access point record is unique and the second storage service sub-node corresponding to the access point information in the IO access point record is the first storage service sub-node, the first storage service sub-node completes the first IO request; when the IO access point record is unique and the second storage service sub-node corresponding to the access point information in the IO access point record is not the first storage service sub-node, the first storage service sub-node sends the first IO request to the second storage service sub-node; when there is more than one IO access point record, the first storage service sub-node rejects the first IO request;
Or, the second storage service sub-node corresponding to the access point information in the IO access point record receives the data processing task management request sent by the metadata service;
s7, after receiving the first IO request, the second storage service sub-node completes the first IO request and returns the first IO request to the client along the original path;
or after receiving the data processing task management request, the second storage service sub-node executes the data processing task management request, and after the execution is completed, the second storage service sub-node sends an execution result to the metadata service.
8. The IO access point and data processing task management method of claim 7, further comprising, after executing the data processing task management request:
when the client of the third storage service sub-node receives the second IO request for the data object, the third storage service sub-node sends the query request and the registration request to the metadata service, so that the metadata service receives the registration request, creates the IO access point record corresponding to the third storage service sub-node for the data object, and enables the metadata service to receive the query request, and query the IO access point record of the data object according to the query request.
9. An IO access point and data processing task management device, comprising:
the first receiving module is used for receiving a query request of the data object or receiving a query request of the data object and a data processing task management request; the query request and the data processing task management request are both sent by a first storage service child node;
the first processing module is used for inquiring the IO access point records of the data object according to the inquiring request and operating the IO access point records according to the number of the IO access point records; the IO access point records comprise access point information, and one access point information corresponds to one storage service child node;
the first sending module is configured to send the IO access point record to a first storage service sub-node when there is only one IO access point record and the second storage service sub-node corresponding to the access point information is the first storage service sub-node, so that the first storage service sub-node processes a first IO request according to the IO access point record; and to send the first IO request to the second storage service sub-node corresponding to the access point information when there is only one IO access point record and the second storage service sub-node corresponding to the access point information is different from the first storage service sub-node, so that the second storage service sub-node, after completing the first IO request, returns the first IO request to a client of the first storage service sub-node along the original path;
The first sending module is further configured to send the data processing task management request to the second storage service sub-node corresponding to the access point information, so that the second storage service sub-node receives and executes the data processing task management request, and then receives an execution result sent by the second storage service sub-node;
the first processing module is further configured to reject the first IO request when there is more than one IO access point record for the data object;
the first receiving module is further configured to receive the execution result sent by the second storage service child node.
10. An IO access point and data processing task management device, comprising:
the client module is used for receiving a first IO request of the data object or a data processing task management request of the data object;
the second sending module is used for sending a query request to a metadata service, or sending the query request and the data processing task management request to the metadata service, so that the metadata service receives the query request, or receives the query request and the data processing task management request, queries IO access point records of the data object according to the query request, and operates the IO access point records according to the number of the IO access point records;
the second receiving module is used for receiving the IO access point record sent by the metadata service or receiving the data processing task management request sent by the metadata service;
the second sending module is further configured to send the first IO request to the IO access point and data processing task management device corresponding to the access point information when the IO access point record is unique and the IO access point and data processing task management device corresponding to the access point information in the IO access point record is not the current IO access point and data processing task management device;
the second receiving module is further configured to receive the first IO request sent by another IO access point and data processing task management device;
the second processing module is used for completing the first IO request when the IO access point record is unique and the IO access point and data processing task management device corresponding to the access point information in the IO access point record is the current IO access point and data processing task management device; for rejecting the first IO request when there is more than one IO access point record; or for executing the data processing task management request;
the second sending module is further configured to return the first IO request to the client module along the original path, or to send an execution result of executing the data processing task management request to the metadata service.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the IO access point and data processing task management method according to any one of claims 1-6 or the IO access point and data processing task management method according to any one of claims 7-8 when executing the program.
12. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the IO access point and data processing task management method of any of claims 1-6 or implements the IO access point and data processing task management method of any of claims 7-8.
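The handling of a first IO request recited in claim 10 can be illustrated with a minimal sketch. It is not the patented implementation. The names `AccessPointRecord`, `Decision` and `decide` are hypothetical and do not appear in the claims, and the access point information is assumed to be a plain string identifier. The case in which no IO access point record exists is not described in this part of the claims, so the sketch only marks it as unspecified.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Decision(Enum):
    COMPLETE_LOCALLY = "complete_locally"  # current device is the access point
    FORWARD = "forward"                    # another device is the access point
    REJECT = "reject"                      # more than one record exists
    UNSPECIFIED = "unspecified"            # no record: not covered by this sketch


@dataclass(frozen=True)
class AccessPointRecord:
    data_object: str
    access_point: str  # hypothetical identifier of the device named in the record


def decide(records: List[AccessPointRecord], current_device: str) -> Decision:
    """Choose how the current device treats a first IO request.

    `records` stands for the IO access point records that the metadata
    service returned for the data object.
    """
    if len(records) > 1:
        # More than one IO access point record: reject the request.
        return Decision.REJECT
    if len(records) == 1:
        if records[0].access_point == current_device:
            # Unique record naming the current device: complete the request here.
            return Decision.COMPLETE_LOCALLY
        # Unique record naming another device: send the request to that device.
        return Decision.FORWARD
    # Assumption: the zero-record case is handled elsewhere.
    return Decision.UNSPECIFIED
```

In this sketch a device identified as "node-a" that receives a single record naming "node-b" would obtain `Decision.FORWARD` and send the request to "node-b", which would in turn obtain `Decision.COMPLETE_LOCALLY` for the same record.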
CN202310238936.7A 2023-03-14 2023-03-14 IO access point and data processing task management method, device, equipment and medium Active CN115934006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310238936.7A CN115934006B (en) 2023-03-14 2023-03-14 IO access point and data processing task management method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN115934006A (en) 2023-04-07
CN115934006B (en) 2023-05-12

Family

ID=85825495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310238936.7A Active CN115934006B (en) 2023-03-14 2023-03-14 IO access point and data processing task management method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115934006B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101544480B1 (en) * 2010-12-24 2015-08-13 주식회사 케이티 (KT Corporation) Distribution storage system having plural proxy servers, distributive management method thereof, and computer-readable recording medium
CN104050250B (en) * 2011-12-31 2018-06-05 北京奇虎科技有限公司 A kind of distributed key-value querying method and query engine system
CN104050249B (en) * 2011-12-31 2018-03-30 北京奇虎科技有限公司 Distributed query engine system and method and meta data server
US10162834B2 (en) * 2016-01-29 2018-12-25 Vmware, Inc. Fine-grained metadata management in a distributed file system
CN112486074B (en) * 2020-12-03 2021-11-16 上海哔哩哔哩科技有限公司 Data processing system, method and device
CN115599300A (en) * 2022-10-21 2023-01-13 济南浪潮数据技术有限公司 (Jinan Inspur Data Technology Co., Ltd.) Task allocation method, device, equipment and medium

Also Published As

Publication number Publication date
CN115934006A (en) 2023-04-07

Similar Documents

Publication Publication Date Title
US11687555B2 (en) Conditional master election in distributed databases
CN113169952B (en) Container cloud management system based on block chain technology
US7185096B2 (en) System and method for cluster-sensitive sticky load balancing
US8954786B2 (en) Failover data replication to a preferred list of instances
US7089318B2 (en) Multi-protocol communication subsystem controller
US8195742B2 (en) Distributed client services based on execution of service attributes and data attributes by multiple nodes in resource groups
US9344494B2 (en) Failover data replication with colocation of session state data
JP5841177B2 (en) Method and system for synchronization mechanism in multi-server reservation system
US9652469B2 (en) Clustered file service
US6081826A (en) System using environment manager with resource table in each computer for managing distributed computing resources managed for each application
CN100359508C (en) Merge protocol for schooling computer system
WO2018133721A1 (en) Authentication system and method, and server
US20100138540A1 (en) Method of managing organization of a computer system, computer system, and program for managing organization
JPH10187519A (en) Method for preventing contention of distribution system
JP2010044552A (en) Request processing method and computer system
CN108063813B (en) Method and system for parallelizing password service network in cluster environment
WO2007073429A2 (en) Distributed and replicated sessions on computing grids
US20100318654A1 (en) Routing of pooled messages via an intermediary
KR20180090181A (en) Method for processing acquire lock request and server
US20230367749A1 (en) Data migration method and apparatus, device, medium, and computer product
US20220318071A1 (en) Load balancing method and related device
CN112291298A (en) Data transmission method and device for heterogeneous system, computer equipment and storage medium
KR20140047230A (en) Method for optimizing distributed transaction in distributed system and distributed system with optimized distributed transaction
WO2017185992A1 (en) Method and apparatus for transmitting request message
CN111158949A (en) Configuration method, switching method and device of disaster recovery architecture, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant