WO2021164163A1 - Request processing method and apparatus, device and storage medium - Google Patents

Request processing method and apparatus, device and storage medium

Info

Publication number
WO2021164163A1
WO2021164163A1 PCT/CN2020/097969 CN2020097969W WO2021164163A1 WO 2021164163 A1 WO2021164163 A1 WO 2021164163A1 CN 2020097969 W CN2020097969 W CN 2020097969W WO 2021164163 A1 WO2021164163 A1 WO 2021164163A1
Authority
WO
WIPO (PCT)
Prior art keywords
request
scsi
requests
scheduling
processing method
Application number
PCT/CN2020/097969
Other languages
French (fr)
Chinese (zh)
Inventor
李宏伟
张东
Original Assignee
苏州浪潮智能科技有限公司
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2021164163A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L67/63 Routing a service request depending on the request content or context
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4482 Procedural
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Definitions

  • A distributed storage system is a cluster composed of a large number of standard servers.
  • Distributed storage systems can be subdivided into distributed file systems, distributed object storage systems and distributed block systems, and also include distributed databases.
  • Distributed storage has the advantages of low cost, strong scalability, ease of use and management, and no single point of failure.
  • The distributed storage system architecture is the core of HCI (Hyper Converged Infrastructure).
  • HCI: Hyper Converged Infrastructure
  • A commonly used solution is to run the distributed storage system as a software service that provides block devices to upper-layer hypervisors through the RBD or iSCSI protocol, for storing virtual machine data and other massive data.
  • The device may be a PC (Personal Computer) or a terminal device such as a smart phone, a tablet computer, a palmtop computer or a portable computer.
  • PC: Personal Computer

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A request processing method and apparatus, a device and a storage medium based on an IO scheduling layer. In the solution, if the size of the data to be processed for a SCSI request sent by an application layer does not exceed a preset threshold, the request is not sent directly to a software stack but first to the IO scheduling layer; the IO scheduling layer merges the received requests and then sends them to the software stack, which forwards them to the corresponding server nodes through a message routing system. Processing SCSI requests in the IO scheduling layer before sending them to the software stack reduces, by merging requests, the number of interactions that pass through the software stack, thereby improving the throughput of the storage system.

Description

Request processing method, apparatus, device and storage medium
This application claims priority to Chinese patent application No. 202010108904.1, filed with the Chinese Patent Office on February 21, 2020 and entitled "Request processing method, apparatus, device and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of distributed storage, and more specifically to a request processing method, apparatus, device and computer-readable storage medium based on a distributed storage system.
Background
A distributed storage system is a cluster of a large number of standard servers. It presents storage to the outside as a single whole, while the data is actually distributed across the individual servers; in essence it is software-defined storage. In a typical implementation of the iSCSI block storage function of such a system, the iSCSI target software stack unpacks each SCSI command directly after receiving it and hands it through the message routing system to the server node that holds the data; the data is then returned to the upper-layer application through the routing system. A distributed system therefore consists of multiple layers of software stacks, and each SCSI request has to cross several software stacks and possibly several hosts, so a single SCSI request is processed more slowly than in traditional centralized storage.
Summary of the invention
The purpose of the present invention is to provide a request processing method, apparatus, device and computer-readable storage medium based on a distributed storage system, so as to improve the processing speed of SCSI requests.
To achieve the above purpose, the present invention provides a request processing method based on a distributed storage system. The method is based on an IO scheduling layer and includes:
receiving a SCSI request sent by the application layer;
if the size of the data to be processed corresponding to the SCSI request does not exceed a predetermined threshold, parsing the received SCSI requests using a scheduling policy, performing request merging, and encapsulating each of the resulting requests;
sending the encapsulated requests to the software stack, so that the software stack sends them through the message routing system to the corresponding server nodes.
Receiving the SCSI request sent by the application layer includes:
receiving the SCSI request sent by the application layer through a scheduling queue of the IO scheduling layer.
The IO scheduling layer includes an IO scheduling interface, through which different scheduling policies are attached in the form of plug-ins.
Parsing the received SCSI requests using the scheduling policy includes:
parsing, using the scheduling policy, the SCSI requests received within the current time period.
If the size of the data to be processed corresponding to the SCSI request exceeds the predetermined threshold, the request processing method further includes:
determining whether the data to be processed corresponding to the SCSI request is multi-copy data;
if so, splitting the SCSI request into multiple SCSI sub-requests according to the number of copies of the data to be processed, encapsulating each SCSI sub-request and sending it to the software stack, so that the software stack sends the encapsulated requests through the message routing system to the server node corresponding to each SCSI sub-request.
Parsing the received SCSI requests using the scheduling policy and performing request merging includes:
if the addresses of at least two SCSI requests are contiguous (they adjoin end to end), merging those SCSI requests into one request;
if at least two SCSI requests correspond to the same storage object, merging the SCSI requests for that storage object into one request;
if at least two SCSI requests correspond to the same server node, merging the SCSI requests for that server node into one request.
To achieve the above purpose, the present invention further provides a request processing apparatus based on a distributed storage system. The apparatus is based on an IO scheduling layer and includes:
a receiving module, configured to receive a SCSI request sent by the application layer;
a parsing module, configured to parse the received SCSI requests using a scheduling policy and perform request merging when the size of the data to be processed corresponding to the SCSI request does not exceed a predetermined threshold;
an encapsulation module, configured to encapsulate each of the resulting requests;
a sending module, configured to send the encapsulated requests to the software stack, so that the software stack sends them through the message routing system to the corresponding server nodes.
The apparatus further includes:
a judging module, configured to determine, when the size of the data to be processed corresponding to the SCSI request exceeds the predetermined threshold, whether that data is multi-copy data;
a splitting module, configured to split the SCSI request into multiple SCSI sub-requests according to the number of copies of the data to be processed when that data is multi-copy data;
the encapsulation module being further configured to encapsulate each SCSI sub-request;
the sending module being further configured to send each encapsulated SCSI sub-request to the software stack, so that the software stack sends the encapsulated requests through the message routing system to the server node corresponding to each SCSI sub-request.
To achieve the above purpose, the present invention further provides a request processing device based on a distributed storage system, including:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above request processing method when executing the computer program.
To achieve the above purpose, the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above request processing method are implemented.
As the above solution shows, the request processing method based on a distributed storage system provided by the embodiments of the present invention is based on an IO scheduling layer and includes: receiving a SCSI request sent by the application layer; determining whether the size of the data to be processed corresponding to the SCSI request exceeds a predetermined threshold; if not, parsing the received SCSI requests using a scheduling policy, performing request merging, and encapsulating each of the resulting requests; and sending the encapsulated requests to the software stack, so that the software stack sends them through the message routing system to the corresponding server nodes.
In this application, therefore, a SCSI request sent by the application layer is not sent directly to the software stack. It is first sent to the IO scheduling layer, which merges the received requests and then sends them to the software stack to be forwarded through the message routing system to the corresponding server nodes. Processing SCSI requests in the IO scheduling layer before sending them to the software stack reduces, by merging requests, the number of interactions that pass through the software stack and improves the throughput of the storage system.
The present invention also discloses a request processing apparatus, a device and a computer-readable storage medium based on a distributed storage system, which achieve the same technical effects.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a request processing method based on a distributed storage system disclosed in an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a request processing apparatus based on a distributed storage system disclosed in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a request processing device based on a distributed storage system disclosed in an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
A distributed storage system is a cluster of a large number of standard servers. By implementation, distributed storage systems can be subdivided into distributed file systems, distributed object storage systems and distributed block systems, and also include distributed databases. Compared with traditional centralized storage, distributed storage has the advantages of low cost, strong scalability, ease of use and management, and no single point of failure. The distributed storage system is the core of HCI (Hyper Converged Infrastructure). A common approach is to run the distributed storage system as a software service that provides block devices to the upper-layer hypervisor through the RBD or iSCSI protocol, for storing virtual machine data and other massive data. In a typical implementation of the iSCSI block storage function of such a system, the iSCSI target software stack unpacks each SCSI command directly after receiving it and hands it through the message routing system to the server node that holds the data; the data is then returned to the upper-layer application through the routing system.
In traditional centralized storage, SCSI commands are processed by an embedded processor integrated with the hard disk. The logic is simple but fast, so it is well suited to processing individual SCSI commands directly. A distributed system, by contrast, consists of multiple layers of software stacks, and each SCSI request has to cross several software stacks and possibly several hosts, so a single SCSI request is processed more slowly than in traditional centralized storage. Specifically, following the basic software design principle of "high cohesion, low coupling", the software stacks are generally almost independent, each with its own workflow, and they are usually asynchronous. When a request is transmitted between and processed by different software stacks: 1. it has to be packed and parsed in an agreed format; 2. the message has to be transmitted through a socket, RPC or a similar mechanism; 3. on arrival at the peer it is usually only placed in a processing queue, where it waits to be handled. All of these factors add processing overhead and latency.
The software-defined nature of a distributed storage system, however, brings flexibility. To address the above problem, this solution discloses a request processing method, apparatus, device and computer-readable storage medium based on a distributed storage system. An added IO scheduling layer schedules the SCSI requests coming from upper-layer applications, for example by merging messages before they are processed. This reduces the number of interactions between software stacks, improves the processing speed of SCSI requests, reduces the overhead and latency introduced by request calls that cross software stacks, and maximizes the throughput of the storage system.
Referring to FIG. 1, an embodiment of the present invention provides a request processing method based on a distributed storage system. The method includes:
S101. Receive a SCSI request sent by the application layer.
Receiving the SCSI request sent by the application layer includes receiving it through a scheduling queue of the IO scheduling layer. The IO scheduling layer includes an IO scheduling interface, through which different scheduling policies are attached in the form of plug-ins.
It should be noted that this application adds an IO scheduling layer to the iSCSI target software stack of the distributed storage system. In existing solutions, a SCSI request coming from the application layer is placed in a waiting queue, and an IO processing thread, acting as a consumer, takes the original SCSI message from the waiting queue and parses and processes it. To improve request processing speed, the present invention adds an IO scheduling layer. The IO scheduling layer first receives the SCSI request sent by the application layer and places it in the scheduling queue. The queued SCSI requests serve as input: they are scheduled and merged according to the preset scheduling policy and then encapsulated, after which they are placed in the waiting queue of the software stack, handed to the IO processing thread, and passed through the message routing system to the storage node for processing.
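The data path described above (scheduling queue, policy, waiting queue) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `run_scheduling_pass` and `passthrough_policy` are hypothetical, requests are represented as plain strings, and the placeholder policy forwards requests unchanged instead of merging them.

```python
from collections import deque


def run_scheduling_pass(sched_queue, wait_queue, policy):
    """Drain the scheduling queue, apply the policy, enqueue the results.

    Hypothetical helper: the source only states that requests go from the
    scheduling queue, through the scheduling policy, to the waiting queue.
    """
    batch = list(sched_queue)      # requests accumulated so far
    sched_queue.clear()
    for packed in policy(batch):   # policy returns the requests to forward
        wait_queue.append(packed)
    return len(batch)


def passthrough_policy(batch):
    """Placeholder policy (assumption): forwards every request unchanged."""
    return list(batch)


sched_queue = deque(["read A", "read B", "write C"])
wait_queue = deque()
handled = run_scheduling_pass(sched_queue, wait_queue, passthrough_policy)
```

After the pass, the IO processing thread would consume `wait_queue` exactly as it consumed the original waiting queue, which is why the scheduling layer can be inserted without changing the rest of the stack.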
S102. If the size of the data to be processed corresponding to the SCSI request does not exceed a predetermined threshold, parse the received SCSI requests using a scheduling policy, perform request merging, and encapsulate each of the resulting requests.
Parsing the received SCSI requests using the scheduling policy includes parsing the SCSI requests received within the current time period.
In this application, auxiliary functions such as a timer and a monitoring interface can be added to the IO scheduling layer. The timer controls the merge window, which can be understood as the merge period: the timer tracks the merge period, and the current time period is the current merge period. SCSI requests are accumulated during the current time period, and the requests received within it are parsed. For example, if the merge period is 1 hour, 1:00 to 2:00 is one merge period and 2:00 to 3:00 is another; if the current time is 1:30, the current time period is the merge period 1:00 to 2:00, so the SCSI requests received between 1:00 and 2:00 are the ones parsed and merged. The time period can be customized as required, so that, as long as the IO latency requirement is met, more requests can be merged by lengthening the period. If the timer expires, the requests merged so far are forcibly flushed down to the software stack. The monitoring interface can run as an independent thread that continuously monitors the IO pressure of each node. When a multi-copy request is split, the resulting sub-requests can be sent preferentially to the server nodes with lower IO pressure, which balances the load across server nodes. By adding these auxiliary functions to the IO scheduling layer, this application avoids starvation and ensures that every SCSI request returns its result within the time specified by the user.
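The timer-controlled merge window can be illustrated with a small sketch. The class name `MergeWindow`, the millisecond clock passed in explicitly, and the choice to flush the old batch when the first request of a new window arrives are all assumptions made for the example; the source only states that requests are accumulated per time period and flushed when the timer expires.

```python
class MergeWindow:
    """Accumulates requests for one merge period, then hands back the batch."""

    def __init__(self, period_ms, now_ms):
        self.period_ms = period_ms
        self.deadline_ms = now_ms + period_ms
        self.pending = []

    def flush(self, now_ms):
        """Return the accumulated batch and start a new window at now_ms."""
        batch, self.pending = self.pending, []
        self.deadline_ms = now_ms + self.period_ms
        return batch

    def add(self, request, now_ms):
        """Queue a request; return the previous batch if its window expired."""
        flushed = self.flush(now_ms) if now_ms >= self.deadline_ms else None
        self.pending.append(request)
        return flushed


window = MergeWindow(period_ms=1000, now_ms=0)
first = window.add("req-a", now_ms=200)    # inside the window, nothing flushed
second = window.add("req-b", now_ms=500)   # inside the window, nothing flushed
third = window.add("req-c", now_ms=1100)   # window expired, old batch flushed
```

A longer `period_ms` lets more requests accumulate per batch at the cost of added latency, which is the trade-off the paragraph above describes.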
It should be noted that, to make the IO scheduling layer more flexible and let the distributed storage system use a suitable scheduling policy for each application scenario, this solution follows the principle of separating mechanism from policy: it adds a set of abstract interfaces to the IO scheduling framework, so that different scheduling policy implementations can be attached as plug-ins and tuned for specific workloads. Specifically, the abstract interface design for IO scheduling policies in this solution consists of the following steps:
1. An io_sched_type structure is designed, which mainly contains the interfaces related to a specific scheduling policy;
2. A specific scheduling policy algorithm is implemented, that is, specific scheduling functions are implemented against the above interfaces, for example for the IO request modes of different applications or for different disk types.
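The two steps above amount to a plug-in interface plus concrete implementations. The sketch below illustrates the idea in Python; the source names only the `io_sched_type` structure and does not list its members, so the class `IoSchedPolicy`, its single `merge` method, the registry functions and `NoopPolicy` are all hypothetical.

```python
from abc import ABC, abstractmethod


class IoSchedPolicy(ABC):
    """Hypothetical stand-in for the io_sched_type interface set."""

    name = "abstract"

    @abstractmethod
    def merge(self, batch):
        """Return the list of requests to forward for the given batch."""


_POLICIES = {}  # registry of plugged-in policies, keyed by name


def register_policy(policy):
    """Attach a policy implementation to the scheduling framework."""
    _POLICIES[policy.name] = policy


def get_policy(name):
    """Look up a previously registered policy by name."""
    return _POLICIES[name]


class NoopPolicy(IoSchedPolicy):
    """Example plug-in (assumption): forwards requests without merging."""

    name = "noop"

    def merge(self, batch):
        return list(batch)


register_policy(NoopPolicy())
result = get_policy("noop").merge(["r1", "r2"])
```

Keeping the framework (queues, timer, registry) separate from the policy classes is what allows a different merging algorithm to be swapped in per workload without touching the mechanism.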
Specifically, if the disk is a solid-state drive (SSD), which is fast in itself, small requests need not be merged; if the disk is a traditional disk, which is slower, requests can be merged using this solution. Further, if the IO request mode prioritizes speed, the time period can be shortened; if it prioritizes throughput, the time period can be lengthened. The IO modes mainly include large-block and small-block requests, direct and non-direct IO requests, and buffer IO. For small-block requests, the IO scheduling layer merges as many requests as possible while meeting the latency requirement, which increases storage throughput. For large-block requests, if the requested data is multi-copy data, the request can be split across the different copy nodes so that they respond in parallel, improving read performance. In direct IO mode, requests are answered as quickly as possible; in non-direct IO mode, they are processed normally; in buffer IO mode, requests are merged as much as possible while meeting the latency requirement, to improve throughput. When parsing the received SCSI requests using the scheduling policy, this application can therefore also be configured according to the IO mode.
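The tuning rules in this paragraph can be written down as a small decision function. The sketch below is illustrative only: the function name `tune`, the string labels for disk types and IO modes, the base window of 10 ms, the zero window for direct IO and the doubling for buffer IO are all assumptions; the source gives the direction of each adjustment but no concrete values.

```python
def tune(disk_type, io_mode, base_window_ms=10):
    """Pick merge settings from disk type and IO mode (hypothetical values)."""
    # SSDs are fast on their own, so small requests may skip merging.
    merge_small = disk_type != "ssd"

    if io_mode == "direct":
        window_ms = 0                   # respond as quickly as possible
    elif io_mode == "buffer":
        window_ms = base_window_ms * 2  # favour throughput: merge more
    else:
        window_ms = base_window_ms      # non-direct IO: processed normally

    return {"merge_small": merge_small, "window_ms": window_ms}


ssd_direct = tune("ssd", "direct")
hdd_buffer = tune("hdd", "buffer")
```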
S103. Send the encapsulated requests to the software stack, so that the software stack sends them through the message routing system to the corresponding server nodes.
It should be noted that the main purpose of the IO scheduling layer in this application is to merge messages and optimize how messages are distributed. As long as the latency requirement is met, message processing is merged as much as possible to reduce the number of interactions between software stacks. For example, suppose a request A takes time t1 to pass through all the software stacks and time t2 to be processed. Without merging, processing N requests takes N*(t1+t2); if the messages can be merged, the processing time is approximately t1+N*t2. In distributed storage, for small-block IO, the overhead t1 introduced by software stack interaction is much larger than the actual IO processing time t2, so merging small-block IO yields a very significant gain.
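As a worked illustration of the two cost expressions, the snippet below evaluates N*(t1+t2) and t1+N*t2 for one set of numbers. The values N = 100, t1 = 10 and t2 = 1 (arbitrary time units) are made up for the example and do not come from the source.

```python
def unmerged_cost(n, t1, t2):
    """Total time when each of n requests crosses the stack on its own."""
    return n * (t1 + t2)


def merged_cost(n, t1, t2):
    """Approximate total time when n requests cross the stack once, merged."""
    return t1 + n * t2


n, t1, t2 = 100, 10, 1                 # illustrative values only
separate = unmerged_cost(n, t1, t2)    # 100 * (10 + 1) = 1100
together = merged_cost(n, t1, t2)      # 10 + 100 * 1 = 110
```

With t1 ten times larger than t2, merging cuts the total from 1100 to 110 units in this example; the gain shrinks as t2 grows relative to t1.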
可以看出,本申请通过设计并实现IO调度层,通过IO调度层作为消费者将应用层的SCSI请求拿过来放入调度队列,然后根据指定的调度策略对调度队列中的SCSI请求做解析,根据分布式存储的节点分布和对象存储的特点做合并处理,然后再将封装后的消息放入等待队列,交给IO处理线程进而通过消息路由交给相应的节点处理。通过这种方式,可以大大提高分布式存储系统的性能和吞吐量,并可针对不同应用场景调整分布式存储系统的IO模式,提高产品性能与竞争力。It can be seen that this application designs and implements the IO scheduling layer, and uses the IO scheduling layer as a consumer to take the SCSI requests of the application layer into the scheduling queue, and then analyze the SCSI requests in the scheduling queue according to the specified scheduling strategy. According to the node distribution of distributed storage and the characteristics of object storage, the merging process is performed, and then the encapsulated message is put into the waiting queue, handed over to the IO processing thread, and then sent to the corresponding node for processing through message routing. In this way, the performance and throughput of the distributed storage system can be greatly improved, and the IO mode of the distributed storage system can be adjusted for different application scenarios to improve product performance and competitiveness.
Based on the foregoing embodiment, in this embodiment, if the size of the data to be processed corresponding to the SCSI request exceeds the predetermined threshold, the request processing method further includes:
determining whether the data to be processed corresponding to the SCSI request is multi-replica data;
if so, splitting the SCSI request into multiple SCSI sub-requests according to the number of replicas of the data to be processed, encapsulating each SCSI sub-request, and sending it to the software stack, so that the software stack sends the encapsulated requests through the message routing system to the server node corresponding to each SCSI sub-request.
It should be noted that, because data in distributed storage is always stored as multiple replicas, the IO scheduling layer in this application can not only merge SCSI requests but also split them. That is, when responding to an IO request, if the size of the data to be processed corresponding to the SCSI request exceeds the predetermined threshold and multiple replicas of that data exist in the system, the SCSI request can be split according to the number of replicas. Each resulting request is responsible for one segment of the data, and the split requests are sent simultaneously to the different servers storing the replicas, so that multiple servers respond in parallel and the read rate is improved.
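The splitting step might be sketched as follows. The `SubRequest` type, the `split_by_replicas` function, and the even division of the byte range are assumptions made for illustration; the application only states that the request is split according to the number of replicas and that each sub-request covers one segment of the data. The sketch also assumes `replica_nodes` is non-empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubRequest:
    node: str     # server node holding a replica (hypothetical identifier)
    offset: int   # start of the segment this sub-request reads
    length: int   # size of the segment in bytes


def split_by_replicas(offset, length, replica_nodes, threshold):
    """Split one large read into one segment per replica node.

    Requests at or below the threshold, or with a single replica, are
    returned unsplit and addressed to the first replica node.
    """
    if length <= threshold or len(replica_nodes) < 2:
        return [SubRequest(replica_nodes[0], offset, length)]

    base, extra = divmod(length, len(replica_nodes))
    subs, cursor = [], offset
    for i, node in enumerate(replica_nodes):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first segments
        if size == 0:
            continue
        subs.append(SubRequest(node, cursor, size))
        cursor += size
    return subs


for sub in split_by_replicas(0, 3000, ["node-a", "node-b", "node-c"], threshold=1024):
    print(sub)
```

Each sub-request can then be encapsulated and dispatched independently, so the three nodes in this example each serve 1000 bytes in parallel.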
Based on the foregoing embodiment, in this embodiment, the process of parsing the received SCSI requests using the scheduling policy and performing request merge processing may specifically include: if at least two SCSI requests have addresses that are contiguous end to end, merging those SCSI requests into one request; if at least two SCSI requests correspond to the same storage object, merging the SCSI requests for that storage object into one request; and if at least two SCSI requests correspond to the same server node, merging the SCSI requests for that server node into one request.
In this application, merge processing mainly covers the following cases:
1. If the addresses of two requests are contiguous end to end, the two requests can be merged into one continuous request;
2. If two requests correspond to the same underlying storage object, the two requests can be merged into a single request, and the lower layer fills two non-adjacent data vectors at the same time;
3. If two requests are destined for the same node, the two requests can be packaged and sent to the target storage node in one transmission.
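The three merge cases listed above can be sketched roughly as follows. The `Request` fields and the three helper functions are hypothetical; in particular, treating "contiguous" as byte ranges within the same storage object on the same node is an assumption made to keep the example self-contained.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    node: str     # target storage node (hypothetical identifier)
    obj: str      # underlying storage object (hypothetical identifier)
    offset: int
    length: int


def merge_contiguous(requests):
    """Case 1: fuse requests whose ranges touch end to end into one request."""
    merged = []
    for req in sorted(requests, key=lambda r: (r.node, r.obj, r.offset)):
        last = merged[-1] if merged else None
        if (last is not None and last.node == req.node and last.obj == req.obj
                and last.offset + last.length == req.offset):
            merged[-1] = Request(last.node, last.obj, last.offset, last.length + req.length)
        else:
            merged.append(req)
    return merged


def group_by_object(requests):
    """Case 2: one message per storage object carrying several (offset, length) vectors."""
    groups = defaultdict(list)
    for req in requests:
        groups[(req.node, req.obj)].append((req.offset, req.length))
    return dict(groups)


def group_by_node(requests):
    """Case 3: package every request bound for the same node into one transmission."""
    groups = defaultdict(list)
    for req in requests:
        groups[req.node].append(req)
    return dict(groups)


pending = [
    Request("n1", "o1", 4096, 4096),
    Request("n1", "o1", 0, 4096),
    Request("n1", "o1", 16384, 4096),
    Request("n2", "o3", 0, 512),
]
merged = merge_contiguous(pending)
print(merged)
print(group_by_object(merged))
print(group_by_node(merged))
```

In this sketch the cases are applied in order of decreasing specificity: contiguous ranges are fused first, the remaining ranges are grouped per object, and the results are batched per node.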
It should be noted that, because this application merges and splits SCSI requests in the IO scheduling layer, the IO processing unit consumed by the IO threads and the protocol for communicating with distributed nodes can be redesigned. For example, in the original scheme, the IO unit handled by an IO thread is a direct encapsulation of a SCSI request, that is, the SCSI request is the unit of processing; in this application, the IO processing unit is a merged or split SCSI request. For instance, if SCSI requests A and B have addresses that are contiguous end to end, the two requests are merged into one message and passed to the lower layer for processing, which reduces the number of interactions. Furthermore, in the original scheme, a node that receives a request must determine whether the requested data replica resides on that node and, if not, forward the request to the appropriate node. In this application, the node holding the data replica is determined before the request is dispatched, and requests destined for different nodes are split out accordingly; a node therefore needs no such check on receipt and can respond to the request directly, which both simplifies the processing logic and improves data processing efficiency. It can be seen that the new IO processing unit provided by this application is designed around the characteristics of distributed storage and can simplify the lower-layer processing and interaction logic.
In summary, the present invention proposes a design and implementation of an IO scheduling framework for distributed storage systems. An IO scheduling layer is added to the iSCSI target software stack of the distributed storage system; it takes SCSI requests as input, encapsulates the requests after scheduling, and passes them through the message routing system to the storage nodes for processing. A set of IO scheduling interfaces is designed and abstracted for the IO scheduling layer, allowing different IO scheduling policy implementations to be plugged in to suit different storage application scenarios. The scheme can greatly improve the performance and throughput of the distributed storage system and improve product competitiveness.
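One way such a pluggable scheduling interface could look is sketched below. The application does not specify the interface, so `SchedulingPolicy`, `register_policy`, `load_policy`, and both example policies are hypothetical names introduced only for illustration; requests are modelled as simple `(node, payload)` tuples.

```python
from abc import ABC, abstractmethod

_POLICIES = {}  # registry mapping a policy name to its class


def register_policy(name):
    """Class decorator that makes a policy selectable by name."""
    def decorator(cls):
        _POLICIES[name] = cls
        return cls
    return decorator


class SchedulingPolicy(ABC):
    """Abstract IO scheduling interface: turn queued requests into batches."""

    @abstractmethod
    def schedule(self, requests):
        """Return a list of batches, each batch being a list of requests."""


@register_policy("fifo")
class FifoPolicy(SchedulingPolicy):
    def schedule(self, requests):
        # No merging: every request becomes its own batch, in arrival order.
        return [[req] for req in requests]


@register_policy("batch-by-node")
class BatchByNodePolicy(SchedulingPolicy):
    def schedule(self, requests):
        # Requests are (node, payload) tuples; group them per target node.
        batches = {}
        for node, payload in requests:
            batches.setdefault(node, []).append((node, payload))
        return list(batches.values())


def load_policy(name):
    """Instantiate the policy registered under the given name."""
    return _POLICIES[name]()


queued = [("n1", "read-a"), ("n2", "read-b"), ("n1", "read-c")]
print(load_policy("fifo").schedule(queued))
print(load_policy("batch-by-node").schedule(queued))
```

Selecting the policy by name keeps the scheduling layer itself unchanged when a deployment switches, for example, from a latency-oriented policy to a throughput-oriented one.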
The request processing apparatus provided by an embodiment of the present invention is described below. The request processing apparatus described below and the request processing method described above may be read with reference to each other.
Referring to FIG. 2, an embodiment of the present invention provides a request processing apparatus based on a distributed storage system. The apparatus is based on an IO scheduling layer and includes:
a receiving module 100, configured to receive SCSI requests sent by the application layer;
a parsing module 200, configured to, when the size of the data to be processed corresponding to the SCSI request does not exceed a predetermined threshold, parse the received SCSI requests using a scheduling policy and perform request merge processing;
an encapsulation module 300, configured to encapsulate each processed request;
a sending module 400, configured to send the encapsulated requests to the software stack, so that the software stack sends the encapsulated requests through the message routing system to the corresponding server nodes.
The receiving module is specifically configured to receive the SCSI requests sent by the application layer through the scheduling queue of the IO scheduling layer. The IO scheduling layer includes an IO scheduling interface, and the IO scheduling interface accepts different scheduling policies in the form of plug-ins.
The parsing module is specifically configured to parse, using the scheduling policy, the SCSI requests received within the current time period.
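Collecting the requests received within the current time period might be sketched as below, using Python's standard `queue.Queue` as a stand-in for the scheduling queue. The function name `drain_window`, the `max_batch` cap, and the window length are assumptions for illustration and do not come from the application.

```python
import queue
import time


def drain_window(sched_queue, window_seconds, max_batch=64):
    """Collect the requests that arrive on the queue within one time window.

    Returns as soon as the window elapses, the queue stays empty for the rest
    of the window, or max_batch requests have been collected.
    """
    deadline = time.monotonic() + window_seconds
    batch = []
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(sched_queue.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


scheduling_queue = queue.Queue()
for request_id in ("req-1", "req-2", "req-3"):
    scheduling_queue.put(request_id)

# The batch returned here is what a scheduling policy would then parse and merge.
print(drain_window(scheduling_queue, window_seconds=0.05))
```

A shorter window bounds the added latency, while a longer one gives the policy more requests to merge; the right trade-off depends on the workload.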
The apparatus further includes:
a determining module, configured to, when the size of the data to be processed corresponding to the SCSI request exceeds the predetermined threshold, determine whether the data to be processed corresponding to the SCSI request is multi-replica data;
a splitting module, configured to, when the data to be processed corresponding to the SCSI request is multi-replica data, split the SCSI request into multiple SCSI sub-requests according to the number of replicas of the data to be processed;
the encapsulation module, further configured to encapsulate each SCSI sub-request;
the sending module, further configured to send each encapsulated SCSI sub-request to the software stack, so that the software stack sends the encapsulated requests through the message routing system to the server node corresponding to each SCSI sub-request.
The parsing module is specifically configured to: when at least two SCSI requests have addresses that are contiguous end to end, merge those SCSI requests into one request; when at least two SCSI requests correspond to the same storage object, merge the SCSI requests for that storage object into one request; and when at least two SCSI requests correspond to the same server node, merge the SCSI requests for that server node into one request.
Referring to FIG. 3, which is a schematic structural diagram, an embodiment of the present invention further discloses a request processing device based on a distributed storage system. The device specifically includes:
a memory 11, configured to store a computer program;
a processor 12, configured to implement the steps of the request processing method described in any of the foregoing method embodiments when executing the computer program.
In this embodiment, the device may be a PC (Personal Computer) or a terminal device such as a smartphone, tablet computer, palmtop computer, or portable computer.
The device may include a memory 11, a processor 12, and a bus 13.
The memory 11 includes at least one type of readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, or an optical disc. In some embodiments, the memory 11 may be an internal storage unit of the device, such as the device's hard disk. In other embodiments, the memory 11 may be an external storage device of the device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the device. Further, the memory 11 may include both an internal storage unit of the device and an external storage device. The memory 11 can be used not only to store application software installed on the device and various kinds of data, such as the code for executing the request processing method, but also to temporarily store data that has been output or is to be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run program code stored in the memory 11 or to process data, for example to execute the code of the request processing method.
The bus 13 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented by a single thick line in FIG. 3, but this does not mean that there is only one bus or one type of bus.
Further, the device may also include a network interface 14. The network interface 14 may optionally include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface) and is typically used to establish a communication connection between the device and other electronic devices.
Optionally, the device may also include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used to display information processed in the device and to present a visual user interface.
FIG. 3 shows only a device with components 11 to 14. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the device, which may include fewer or more components than shown, combine certain components, or use a different arrangement of components.
An embodiment of the present invention further discloses a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of the request processing method described in any of the foregoing method embodiments.
The storage medium may include any medium capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

  1. A request processing method based on a distributed storage system, characterized in that the method is based on an IO scheduling layer and comprises:
    receiving a SCSI request sent by an application layer;
    if the size of the data to be processed corresponding to the SCSI request does not exceed a predetermined threshold, parsing the received SCSI requests using a scheduling policy, performing request merge processing, and encapsulating each processed request;
    sending the encapsulated requests to a software stack, so that the software stack sends the encapsulated requests through a message routing system to the corresponding server nodes.
  2. The request processing method according to claim 1, characterized in that receiving the SCSI request sent by the application layer comprises:
    receiving the SCSI request sent by the application layer through a scheduling queue of the IO scheduling layer.
  3. The request processing method according to claim 2, characterized in that the IO scheduling layer comprises an IO scheduling interface, and the IO scheduling interface accepts different scheduling policies in the form of plug-ins.
  4. The request processing method according to claim 3, characterized in that parsing the received SCSI requests using the scheduling policy comprises:
    parsing, using the scheduling policy, the SCSI requests received within the current time period.
  5. The request processing method according to claim 1, characterized in that, if the size of the data to be processed corresponding to the SCSI request exceeds the predetermined threshold, the request processing method further comprises:
    determining whether the data to be processed corresponding to the SCSI request is multi-replica data;
    if so, splitting the SCSI request into multiple SCSI sub-requests according to the number of replicas of the data to be processed, encapsulating each SCSI sub-request, and sending it to the software stack, so that the software stack sends the encapsulated requests through the message routing system to the server node corresponding to each SCSI sub-request.
  6. The request processing method according to any one of claims 1 to 5, characterized in that parsing the received SCSI requests using the scheduling policy and performing request merge processing comprises:
    if at least two SCSI requests have addresses that are contiguous end to end, merging those SCSI requests into one request;
    if at least two SCSI requests correspond to the same storage object, merging the SCSI requests for that storage object into one request;
    if at least two SCSI requests correspond to the same server node, merging the SCSI requests for that server node into one request.
  7. A request processing apparatus based on a distributed storage system, characterized in that the apparatus is based on an IO scheduling layer and comprises:
    a receiving module, configured to receive a SCSI request sent by an application layer;
    a parsing module, configured to, when the size of the data to be processed corresponding to the SCSI request does not exceed a predetermined threshold, parse the received SCSI requests using a scheduling policy and perform request merge processing;
    an encapsulation module, configured to encapsulate each processed request;
    a sending module, configured to send the encapsulated requests to a software stack, so that the software stack sends the encapsulated requests through a message routing system to the corresponding server nodes.
  8. The request processing apparatus according to claim 7, characterized by further comprising:
    a determining module, configured to, when the size of the data to be processed corresponding to the SCSI request exceeds the predetermined threshold, determine whether the data to be processed corresponding to the SCSI request is multi-replica data;
    a splitting module, configured to, when the data to be processed corresponding to the SCSI request is multi-replica data, split the SCSI request into multiple SCSI sub-requests according to the number of replicas of the data to be processed;
    the encapsulation module being further configured to encapsulate each SCSI sub-request;
    the sending module being further configured to send each encapsulated SCSI sub-request to the software stack, so that the software stack sends the encapsulated requests through the message routing system to the server node corresponding to each SCSI sub-request.
  9. A request processing device based on a distributed storage system, characterized by comprising:
    a memory, configured to store a computer program;
    a processor, configured to implement the steps of the request processing method according to any one of claims 1 to 6 when executing the computer program.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when executed by a processor the computer program implements the steps of the request processing method according to any one of claims 1 to 6.
PCT/CN2020/097969 2020-02-21 2020-06-24 Request processing method and apparatus, device and storage medium WO2021164163A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010108904.1 2020-02-21
CN202010108904.1A CN111371848A (en) 2020-02-21 2020-02-21 Request processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021164163A1 true WO2021164163A1 (en) 2021-08-26

Family

ID=71210042

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/097969 WO2021164163A1 (en) 2020-02-21 2020-06-24 Request processing method and apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN111371848A (en)
WO (1) WO2021164163A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112612415B (en) * 2020-12-22 2022-08-30 新华三大数据技术有限公司 Data processing method and device, electronic equipment and storage medium
CN115857792A (en) * 2021-09-23 2023-03-28 华为技术有限公司 Data processing method and related equipment
CN116737398B (en) * 2023-08-16 2023-11-17 北京卡普拉科技有限公司 Asynchronous IO request scheduling and processing method, device, equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101572716A (en) * 2009-05-27 2009-11-04 杭州华三通信技术有限公司 Method for transmitting small computer system interface (SCSI) packet and device thereof
CN101639763A (en) * 2009-08-27 2010-02-03 中兴通讯股份有限公司 IO dispatching method and device
CN106572135A (en) * 2015-10-09 2017-04-19 北京国双科技有限公司 Network request processing method and device
US20180349037A1 (en) * 2017-06-02 2018-12-06 EMC IP Holding Company LLC Method and device for data read and write
CN110535692A (en) * 2019-08-12 2019-12-03 华为技术有限公司 Fault handling method, device, computer equipment, storage medium and storage system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106469198B (en) * 2016-08-31 2019-10-15 华为技术有限公司 Key assignments storage method, apparatus and system
CN107229424B (en) * 2017-05-31 2020-09-22 苏州浪潮智能科技有限公司 Data writing method for distributed storage system and distributed storage system
CN109327539A (en) * 2018-11-15 2019-02-12 上海天玑数据技术有限公司 A kind of distributed block storage system and its data routing method

Also Published As

Publication number Publication date
CN111371848A (en) 2020-07-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20920002

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20920002

Country of ref document: EP

Kind code of ref document: A1
