WO2019153702A1 - Interrupt processing method, apparatus, and server - Google Patents

Interrupt processing method, apparatus, and server

Info

Publication number
WO2019153702A1
WO2019153702A1 (PCT/CN2018/100622)
Authority
WO
WIPO (PCT)
Prior art keywords
interrupt
processing core
service
core
data
Prior art date
Application number
PCT/CN2018/100622
Other languages
English (en)
French (fr)
Inventor
郑卫炎
雷舒莹
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2019153702A1
Priority to US16/987,014 (published as US20200364080A1)


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H04L49/9063 Intermediate storage in different physical parts of a node or terminal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/4401 Bootstrapping
    • G06F9/4418 Suspend and resume; Hibernate and awake
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4812 Task transfer initiation or dispatching by interrupt, e.g. masked
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/544 Buffers; Shared memory; Pipes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H04L49/9063 Intermediate storage in different physical parts of a node or terminal
    • H04L49/9068 Intermediate storage in different physical parts of a node or terminal in the network interface card
    • H04L49/9073 Early interruption upon arrival of a fraction of a packet
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161 Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/483 Multiproc
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Definitions

  • The present application relates to the field of data storage technologies, and in particular, to an interrupt processing method, apparatus, and server.
  • In a general-purpose computer architecture, the cache exists to bridge the speed difference between the central processing unit (CPU) and the memory. It comprises three levels: the level 1 (L1) cache, the level 2 (L2) cache, and the level 3 (L3) cache. The access priority and access rate of the three levels are ordered L1 > L2 > L3, and using the different caches appropriately increases the data access rate.
  • In a server architecture, each server may include one or more CPUs, each CPU includes multiple cores, and different CPU cores may share cache resources. For example, an ARM server may consist of 2 CPUs of 32 cores each; within one CPU, every four cores form a cluster, and every 16 cores form a logical unit (die). Each core in the CPU has its own L1 cache, the four cores in one cluster share one L2 cache, and the 16 cores in one logical unit share one L3 cache.
  • During service processing, a processor core handles input/output (I/O) operation requests in interrupt mode. The specific process is as follows: when the server receives a Transmission Control Protocol (TCP) data packet carrying an I/O operation request, the packet is stored in an interrupt queue associated with it. Each interrupt queue is configured with one processor core (called the interrupt processing core), which fetches the TCP data packets in first-in-first-out order and notifies the processor core that processes the service corresponding to the packet (that is, the core running the service process, called the service processing core). The service processing core then has to read the data from the cache or memory of the interrupt processing core to complete the data read or write.
  • When the server includes multiple CPUs each with multiple cores, the interrupt processing core and the service processing core may not be in the same cluster or the same logical unit, in which case they cannot share cache resources and have to access the caches across CPUs or across logical units through the internal bus, making read or write operations slow.
  • FIG. 1 is a schematic diagram of a distributed data storage system. As shown in FIG. 1, the VBS process communicates with each OSD process, and the OSD processes of different servers communicate with each other, through TCP connections; OSD1 to OSDn represent the OSD processes on different servers.
  • When data is read or written, the VBS process first sends the data to be read or written, as the payload of a TCP packet, to the OSD process where the primary backup data is located, and that OSD process then synchronizes the data to the OSD processes where the other secondary backup data is located.
  • An OSD process can receive TCP data packets from the VBS process as well as from OSD processes on other servers, so a single OSD process can receive multiple TCP data packets.
  • Correspondingly, when a server receives multiple TCP data packets, they are likely to be stored in multiple different interrupt queues; the interrupt processing core of each queue fetches and processes the TCP data packets from its own queue and stores the data of the corresponding packets in its own cache and memory. Because the interrupt processing core of an interrupt queue is configured randomly, the interrupt processing cores of the multiple queues are likely to be scattered across different logical units and different CPUs. The service processing core then has to read data from different caches and memories; its access delay to memory and to the L3 cache is greater than its access delay to the L2 cache, and accessing caches and memory across CPUs or across logical units through the internal bus adds further delay. The service processing core therefore suffers a large data access delay, which reduces the user data processing rate and affects system performance.
  • The present application provides an interrupt processing method, apparatus, and server, which solve the problems of large data access delay and low user data processing rate in the prior art.
  • According to a first aspect, an interrupt processing method is provided, applied to a server whose central processing unit (CPU) includes multiple cores, the multiple cores including an interrupt processing core for processing interrupts and a service processing core running a service process. The method includes: when the server receives multiple TCP data packets of the service process, since the destination port of each of the multiple TCP data packets corresponds to the same interrupt queue, the multiple TCP data packets are stored in that interrupt queue and an interrupt processing request is triggered. The interrupt processing core receives the interrupt processing request, which requests processing of at least one of the multiple TCP data packets stored in the interrupt queue; that is, the request may ask to process one TCP data packet or several. The interrupt processing core obtains the at least one TCP data packet from the interrupt queue. According to the TCP connection information of the at least one TCP data packet, the interrupt processing core can determine the service process to which the packet belongs, and since that process is run by the service processing core, it thereby determines the service processing core; the interrupt processing core and the service processing core have a shared cache space. The interrupt processing core can then send a wake-up instruction to the service processing core to wake it, so that the service processing core processes the at least one TCP data packet; for example, the service processing core updates the user data stored in the server according to the user data in the at least one TCP data packet, or sends the data to other servers for data synchronization.
  • In the above technical solution, the multiple TCP connections of one service process in the server are configured to correspond to one interrupt queue, so that the multiple TCP data packets received by the service process through the multiple TCP connections can be stored in one interrupt queue; and the interrupt processing core configured for the interrupt queue shares cache space with the service processing core running the service process, so the service processing core can use the shared cache to access data. This reduces data access delay, improves data processing efficiency, and thereby improves system performance.
  • In a possible implementation, the interrupt processing core and the service processing core are the same core of a CPU; in this case the service processing core can obtain the user data of the at least one TCP data packet from the L1 cache, with the smallest data access delay and the highest processing rate.
  • Alternatively, the service processing core and the interrupt processing core belong to the same cluster; in this case the service processing core can obtain the user data of the at least one TCP data packet from the L2 cache, with a small data access delay and a high processing rate.
  • Alternatively, the service processing core and the interrupt processing core belong to the same logical unit (die); in this case the service processing core can obtain the user data of the at least one TCP data packet from the L3 cache, and the data access delay and processing rate are still relatively favorable compared with accessing memory.
  • In another possible implementation, the server includes multiple interrupt queues, and the destination ports that the service process can use include multiple destination ports. Before the interrupt processing core obtains the interrupt processing request, the method further includes: the service processing core determines the correspondence between the multiple interrupt queues and the multiple destination ports, where each interrupt queue corresponds to one destination port set and one destination port set includes multiple destination ports; the service processing core then establishes the multiple TCP connections of the service process through one destination port set, the multiple TCP connections being used to transmit the TCP data packets of the service process.
  • In this possible implementation, because one destination port set is used to establish the multiple TCP connections of the service process, the multiple TCP data packets destined for the service process can be stored in one interrupt queue, which prevents them from being stored in multiple different interrupt queues.
  • In another possible implementation, the service processing core determining the correspondence between the multiple interrupt queues and the multiple destination ports includes: obtaining, according to each of the multiple destination ports and the specified hash value, the interrupt queue corresponding to each destination port, to obtain the correspondence between the multiple interrupt queues and the multiple destination ports. In this possible implementation, the service processing core can determine the correspondence simply and effectively according to the specified hash value.
  • In another possible implementation, when servers include different types of network cards, the specified hash values differ. In this possible implementation, setting a different specified hash value for a different network card type still allows the multiple TCP data packets of the service process to be stored in one interrupt queue.
  • According to a second aspect, an interrupt processing apparatus is provided, including: a receiving unit, configured to receive an interrupt processing request, where the interrupt processing request requests processing of at least one of multiple TCP data packets of a service process stored in an interrupt queue, and the destination port of each of the multiple TCP data packets corresponds to the same interrupt queue; an obtaining unit, configured to obtain the at least one TCP data packet from the interrupt queue; and a first processing unit, configured to determine the service processing core (a second processing unit) according to the at least one TCP data packet, where the first processing unit and the second processing unit have a shared cache space; the first processing unit is further configured to wake the second processing unit, so that the second processing unit processes the at least one TCP data packet.
  • In a possible implementation, the first processing unit and the second processing unit are the same processing unit; or the first processing unit and the second processing unit belong to the same cluster; or the first processing unit and the second processing unit belong to the same logical unit (die).
  • In another possible implementation, the apparatus includes multiple interrupt queues, and the destination ports that the service process can use include multiple destination ports. The second processing unit is further configured to: determine the correspondence between the multiple interrupt queues and the multiple destination ports, where each interrupt queue corresponds to one destination port set and one destination port set includes multiple destination ports; and establish the multiple TCP connections of the service process through one destination port set, the multiple TCP connections being used to transmit the TCP data packets of the service process.
  • In another possible implementation, the second processing unit is further configured to obtain, according to each of the multiple destination ports and the specified hash value, the interrupt queue corresponding to each destination port, to obtain the correspondence between the multiple interrupt queues and the multiple destination ports.
  • In another possible implementation, when interrupt processing apparatuses include different types of network cards, the specified hash values differ.
  • According to a third aspect, a processor is provided, configured to perform the interrupt processing method provided by the first aspect or any possible implementation of the first aspect.
  • According to a fourth aspect, a server is provided, including a memory, a processor, a bus, and a communication interface, where the memory stores code and data, the processor, the memory, and the communication interface are connected through the bus, and the processor runs the code in the memory to cause the server to perform the interrupt processing method provided by the first aspect or any possible implementation of the first aspect.
  • According to a fifth aspect, a computer-readable storage medium is provided, storing computer-executable instructions; when at least one processor of a device executes the computer-executable instructions, the device performs the interrupt processing method provided by the first aspect or any possible implementation of the first aspect.
  • According to a sixth aspect, a computer program product is provided, including computer-executable instructions stored in a computer-readable storage medium; at least one processor of a device can read the computer-executable instructions from the computer-readable storage medium and execute them, causing the device to implement the interrupt processing method provided by the first aspect or any possible implementation of the first aspect.
  • It can be understood that any of the apparatus, processor, server, computer storage medium, or computer program product of the interrupt processing method provided above is used to perform the corresponding method provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, and details are not described again here.
  • FIG. 1 is a schematic diagram of TCP connections in a distributed data storage system;
  • FIG. 2 is a schematic structural diagram of a server provided by the present application;
  • FIG. 3 is a schematic structural diagram of a processor provided by the present application;
  • FIG. 4 is a schematic diagram of data storage in a distributed data storage system provided by the present application;
  • FIG. 5 is a schematic flowchart of an interrupt processing method provided by the present application;
  • FIG. 6 is a schematic flowchart of another interrupt processing method provided by the present application;
  • FIG. 7 is a schematic diagram of the relationship between a service process and an interrupt queue according to the present application;
  • FIG. 8 is a schematic structural diagram of an interrupt processing apparatus provided by the present application;
  • FIG. 9 is a schematic structural diagram of another processor provided by the present application.
  • FIG. 2 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • the server may include a memory 201, a processor 202, a communication interface 203, and a bus 204.
  • the memory 201, the processor 202, and the communication interface 203 are connected to one another via a bus 204.
  • The memory 201 can be used to store data, software programs, and modules, and mainly includes a program storage area and a data storage area; the program storage area can store an operating system, an application required by at least one function, and the like, and the data storage area can store data created during use of the device, and the like.
  • The processor 202 is configured to control and manage the actions of the server, for example, by running or executing the software programs and/or modules stored in the memory 201 and calling the data stored in the memory 201 to perform the various functions of the server and process data.
  • The communication interface 203 is used to support the server in communicating.
  • the processor 202 can include a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It is possible to implement or carry out the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor 202 can also be a combination of computing functions, for example, including one or more microprocessor combinations, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 204 can be a peripheral component interconnect (PCI) bus, or an extended industry standard architecture (EISA) bus or the like.
  • the bus 204 can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in Figure 2, but it does not mean that there is only one bus or one type of bus.
  • the number of processors 202 included in the same server may be one or more, and each processor 202 may include multiple cores.
  • For ease of description, the server in the embodiments of the present invention is referred to as the first server.
  • For example, the processor 202 may be an ARM processor. The ARM processor may include multiple central processing units (CPUs), and each CPU may include multiple cores (for example, 32 cores); every four cores may be called a cluster, and every four clusters may be called a logical unit.
  • FIG. 3 takes the case in which the processor 202 includes two CPUs as an example. The two CPUs include 64 cores in total (for example, core 0 to core 63), each CPU includes two logical units, and the processor 202 therefore includes four logical units.
  • In addition, the structure of an x86 processor may also be extended to the structure of the processor 202 provided in FIG. 3, which is not specifically limited herein.
  • The CPU cache can be divided into a level 1 cache (L1 cache), a level 2 cache (L2 cache), and a level 3 cache (L3 cache), where all of the data stored at each cache level is part of the next level of cache.
  • The L1 cache is located closest to the CPU core and is the cache most tightly integrated with it; it can be used to temporarily store and deliver various arithmetic instructions and operands to the CPU core, and has the fastest access rate.
  • The L2 cache is located between the L1 cache and the L3 cache; the L2 cache and the L3 cache are used only to store data that the CPU core needs to process. The L2 cache has a higher access priority and access rate than the L3 cache, and the cache capacity from large to small is L3, L2, L1.
  • The three-level cache works as follows: when a CPU core needs to read data, it first searches the L1 cache; if the data is not in the L1 cache, it searches the L2 cache; if the data is not in the L2 cache, it searches the L3 cache; and if the data is not in the L3 cache either, it reads the data from memory. The data stored in the cache is a small part of the memory contents, but it is the part that the CPU core is about to access in the short term; using the different caches when the CPU core reads and writes data improves data access efficiency.
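  • As an illustration of this lookup order (not part of the patent), the following minimal C sketch models the L1 to L2 to L3 to memory cascade with tiny direct-mapped tables; the types and sizes are assumptions chosen only to make the example self-contained.

```c
/*
 * Minimal sketch of the lookup order described above: search L1, then L2,
 * then L3, and fall back to memory on a miss at every level, installing
 * the block in each level on the way back. The tiny direct-mapped tables
 * are illustrative assumptions, not a real CPU cache model.
 */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOTS 4  /* entries per modeled level, kept tiny on purpose */

struct cache_level {
    const char *name;
    uint64_t tags[SLOTS];
    bool valid[SLOTS];
};

static bool lookup(const struct cache_level *c, uint64_t addr)
{
    size_t i = addr % SLOTS;
    return c->valid[i] && c->tags[i] == addr;
}

static void fill(struct cache_level *c, uint64_t addr)
{
    size_t i = addr % SLOTS;
    c->valid[i] = true;
    c->tags[i] = addr;
}

static const char *read_data(struct cache_level *levels, int n, uint64_t addr)
{
    for (int i = 0; i < n; i++)        /* L1 -> L2 -> L3 */
        if (lookup(&levels[i], addr))
            return levels[i].name;
    for (int i = 0; i < n; i++)        /* miss everywhere: load from memory */
        fill(&levels[i], addr);
    return "memory";
}

int main(void)
{
    struct cache_level levels[3] = {
        { .name = "L1" }, { .name = "L2" }, { .name = "L3" },
    };
    printf("first access served from: %s\n", read_data(levels, 3, 0x1000));
    printf("second access served from: %s\n", read_data(levels, 3, 0x1000));
    return 0;
}
```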
  • During service processing, a processor core can handle input/output (I/O) operations through interrupts. The specific process is as follows: when the device receives a TCP data packet, the packet is stored in an interrupt queue, and each interrupt queue is configured with one core (called the interrupt processing core). The interrupt processing core obtains the TCP data packet from the interrupt queue, parses it, and stores the data of the packet in its cache and memory. The core of the service process corresponding to the TCP data packet (that is, the core running the service process, called the service processing core) then reads the data from the cache or memory of the interrupt processing core to perform the data read and write operations.
  • When the interrupt processing core and the service processing core are two cores in the same cluster, the accessed data can be transferred through the L2 cache; that is, the first core caches the accessed data in the L2 cache, and the second core directly accesses the shared L2 cache.
  • When the two cores belong to the same logical unit but different clusters, the accessed data can be transferred through the L3 cache; that is, the first core caches the accessed data in the L3 cache, and the second core directly accesses the shared L3 cache.
  • When the two cores are located in different logical units or different CPUs, the accessed data can only be transferred through memory; that is, the first core stores the data to be accessed in its memory, and the second core reads the data from the memory of the first core. When the cores are in different CPUs, this transfer has to cross CPUs through the internal bus. Since the access delay of the L3 cache is greater than that of the L2 cache, and the access delay of memory is greater than that of the L3 cache, two cores performing cross-logical-unit or cross-CPU access suffer from a large access delay.
  • The interrupt processing method provided by the embodiments of the present invention can be applied to any server that transmits data packets through TCP connections. For example, the server may be a server in a distributed data storage system; for convenience, the following description takes a distributed data storage system as an example.
  • A distributed data storage system may include multiple servers. In the system, user data may be stored in the form of multiple data copies, and the copies of the same data may be stored on different servers. When I/O operations are performed on the data stored in the servers, the consistency of the multiple copies of the same data must be ensured; the multiple copies may consist of one primary backup and multiple secondary backups.
  • A user can access the copy data in the servers where object storage device (OSD) processes are deployed through a server deployed with a virtual block system (VBS) process. Multiple OSD processes can be deployed in one server, and each OSD process corresponds to one disk in the server, which can store multiple different data copies.
  • The VBS process is the I/O entry process of the service and is used to provide the access point service (that is, user data is presented in the form of virtual blocks, and accessing a virtual block accesses the real data); the VBS process can also be used to manage the metadata of volumes.
  • User data may be stored in the form of volumes, and the metadata of a volume refers to information describing the distribution of the user data in the storage servers, such as the address of the data, the modification time of the data, the access rights of the data, and so on.
  • The OSD process is also a service I/O process; it manages the user data stored on the corresponding disk and can also be used to perform the specific I/O operations, that is, the specific data read and write operations.
  • Assume that the distributed data storage system includes three servers for storing user data and that the user data stored by the system follows a three-copy model; the storage of the user data in the servers may then be as shown in FIG. 4. The three-copy model means that each data block is stored in the storage system in three copies, of which one may be the primary backup (Master) and two may be secondary backups (Slave).
  • The VBS process can partition the user data to be stored in the servers. Assume that the partitioning yields n data blocks, Part1 to Partn, and that each data block is stored in three copies; the storage structure of the three copies of the n data blocks Part1 to Partn can then be as shown in FIG. 4. The three backups of each data block are scattered among the disks of different servers, where M represents the Master of each data block, S1 represents the Slave1 copy of each data block, and S2 represents the Slave2 copy of each data block; each server includes n disks, namely Disk1 to Diskn.
  • The volume metadata in FIG. 4 is the volume metadata of Part1 to Partn managed by the VBS process; the volume metadata may include identification information of the server storing each data block and the specific location of the data block in that server.
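  • A hedged sketch of what one such metadata entry might look like follows; every field name is an illustrative assumption based on the description above, not the patent's actual layout.

```c
/*
 * Hedged sketch of a volume-metadata entry as described above: for each
 * data block Part1..Partn it records, per copy, which server holds the
 * copy and where it lives on that server. Every field name here is an
 * illustrative assumption, not the patent's actual layout.
 */
#include <stdint.h>

#define NUM_COPIES 3  /* three-copy model: one Master plus two Slaves */

struct copy_location {
    uint32_t server_id;  /* identification of the server storing the copy */
    uint32_t disk_id;    /* which disk (Disk1..Diskn) within that server  */
    uint64_t offset;     /* specific location of the data block on disk   */
};

struct volume_metadata_entry {
    uint32_t part_id;                 /* data block index: Part1..Partn */
    struct copy_location master;      /* M: primary backup              */
    struct copy_location slaves[NUM_COPIES - 1]; /* S1 and S2           */
};
```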
  • In order to transmit data, the VBS process needs to establish a Transmission Control Protocol (TCP) connection with each OSD process deployed in the servers, and TCP connections are also required between the OSD processes of different servers; data can then be transmitted through the established TCP connections. The TCP connections may be as shown in FIG. 1, which takes the OSD processes OSD1 to OSDn on different servers as an example.
  • When a user performs an I/O operation through the VBS process, the VBS process can query the volume metadata to determine the servers where the three copies of the data block targeted by the I/O operation are located, as well as the specific locations within those servers. Taking a write operation as an example, the VBS process sends a TCP data packet to the OSD process in the server where the Master of the data block is located, and that OSD process stores the data in the TCP data packet. The OSD process then sends the received data through TCP connections to the OSD processes in the servers corresponding to the two Slaves, so that the data remains consistent among the multiple copies. After the OSD process in the server corresponding to the Master receives the response information sent by the OSD processes in the servers corresponding to the two Slaves, it returns a response message to the VBS process, thereby completing the I/O operation.
  • For one OSD process, the OSD process can receive TCP data packets from the VBS process as well as from OSD processes on other servers, so the OSD process can receive multiple TCP data packets. Correspondingly, when a server receives multiple TCP data packets, they are likely to be stored in multiple different interrupt queues; the multiple interrupt queues correspond to multiple interrupt processing cores, and the interrupt processing core of each queue obtains the corresponding TCP data packets from its own queue, parses them, and stores the data of the corresponding packets in its own cache and memory.
  • Because the interrupt processing core of each interrupt queue is configured randomly, the multiple interrupt processing cores corresponding to the multiple interrupt queues are likely to be scattered across different logical units and different CPUs, so the service processing core has to read data from multiple different caches and memories. The access delay to memory and to the L3 cache is greater than the access delay to the L2 cache, so the service processing core suffers from a large data access delay, which in turn reduces the processing rate of user data and affects system performance.
  • FIG. 5 is a flowchart of an interrupt processing method according to an embodiment of the present invention.
  • The method is applied to a server whose CPU includes multiple cores, the multiple cores including an interrupt processing core and a service processing core.
  • The service processing core is a core running a service process, and it can be used to process the data read and write operations related to that service process. For example, the service process can be an OSD process; the core running the OSD process is called the service processing core and can be used to process the read and write operations of the backup data managed by the OSD process.
  • The interrupt processing core is the core used to process interrupts, and the server can configure one interrupt processing core for each interrupt queue. Accordingly, the method includes the following steps.
  • Step 501 The first server receives multiple TCP data packets, and the destination ports of the multiple TCP data packets all correspond to one interrupt queue.
  • In this embodiment, the server is referred to as the first server. The first server may include multiple service processes, and each service process may be used to manage the backup data of multiple data blocks; the backup data may include Master data and may also include Slave data, where the Master and Slave copies belong to different data blocks. The following takes one service process of the first server as an example.
  • The service process can establish TCP connections with multiple processes on different servers, and the TCP connections are used to transmit TCP data packets. For example, the service process can be an OSD process, and one OSD process can establish a TCP connection with the VBS process as well as TCP connections with multiple OSD processes on other servers.
  • When the user performs a write operation, if the Master data of the data block corresponding to the write operation is among the user data managed by an OSD process of the first server, the VBS process can send TCP data packets through the TCP connection between it and that OSD process of the first server; if backup data of the data block is managed by the OSD process of the first server, an OSD process on another server may send TCP data packets through the TCP connection between the corresponding OSD processes. That is, the first server may receive multiple TCP data packets through its communication interface, and the multiple TCP data packets may include TCP data packets from the VBS process as well as TCP data packets from OSD processes on other servers.
  • Each of the multiple TCP data packets includes port information, and the port information indicates the destination port of the TCP data packet. For example, a TCP data packet may include four-tuple information, that is, the source IP address, the source port, the destination IP address, and the destination port; the destination port indicated by the port information of a TCP data packet may be the destination port in the four-tuple of the packet. It should be noted that the destination port in this application refers to a communication protocol port for a connection service, which may also be called a TCP port; it is an abstract software construct and does not refer to a hardware port.
  • Step 502: The first server stores the multiple TCP data packets in the interrupt queue corresponding to the destination ports of the multiple TCP data packets.
  • Specifically, for each of the multiple TCP data packets, the network card (NIC) driver of the first server can obtain the four-tuple information in the TCP data packet, which includes the port information. When performing the hash operation on the four-tuple information with the specified hash value, the NIC driver can mask the other information in the four-tuple (for example, set all the bits corresponding to the fields other than the destination port to 0 in the hash operation) and keep only the destination port. The NIC driver can then use the value of a specified length (for example, 8 bits) of the operation result to look up the Ethernet queue array (indirection table); each value in the array can be an Ethernet queue index representing an Ethernet queue, and the Ethernet queue indicated by the index that is found is the interrupt queue in which the TCP data packet is stored.
  • The specified hash value can be set in advance. Because different first servers have different NIC drivers, the corresponding specified length and Ethernet queue array may also differ; therefore, when the NIC types in first servers differ, the corresponding specified hash values also differ, which is not specifically limited in the embodiments of the present invention.
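  • To make this queue-selection logic concrete, the C sketch below is a simplified stand-in, not the patent's implementation: real NICs typically compute a Toeplitz hash over the four-tuple with a driver-programmed RSS key, whereas a toy hash is used here so the sketch stays self-contained. Masking every field except the destination port to zero makes the chosen queue depend on the destination port alone.

```c
/*
 * Simplified sketch of the queue-selection logic described above.
 * Real NICs typically compute a Toeplitz hash over the 4-tuple with a
 * driver-programmed RSS key; a toy hash is used here so the sketch stays
 * self-contained. Masking every field except the destination port to 0
 * makes the chosen queue depend on the destination port alone.
 */
#include <stdio.h>
#include <stdint.h>

#define TABLE_SIZE 256   /* "specified length" of 8 bits -> 256 entries */
#define NUM_QUEUES 9     /* q1..q9, as in the example in the text */

struct four_tuple {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Toy stand-in for the NIC hash, keyed by the "specified hash value". */
static uint32_t hash32(uint32_t x, uint32_t key)
{
    x ^= key;
    x *= 2654435761u;        /* Knuth multiplicative constant */
    return x ^ (x >> 16);
}

static uint32_t hash_tuple(const struct four_tuple *t, uint32_t key)
{
    /* src_ip, dst_ip and src_port are treated as all-zero bits, so only
     * the destination port contributes to the result. */
    return hash32((uint32_t)t->dst_port, key);
}

int main(void)
{
    /* Ethernet queue array (indirection table): entry -> queue index. */
    uint8_t indirection[TABLE_SIZE];
    for (int i = 0; i < TABLE_SIZE; i++)
        indirection[i] = (uint8_t)(i % NUM_QUEUES);

    uint32_t key = 0x9e3779b9;  /* the preset "specified hash value" */

    /* Two packets with different sources but the same destination port. */
    struct four_tuple a = { 0x0a000001, 0x0a000002, 40001, 6000 };
    struct four_tuple b = { 0x0a0000ff, 0x0a000002, 52345, 6000 };

    int qa = indirection[hash_tuple(&a, key) & 0xff] + 1;
    int qb = indirection[hash_tuple(&b, key) & 0xff] + 1;
    printf("packet a -> interrupt queue q%d\n", qa);  /* same queue ... */
    printf("packet b -> interrupt queue q%d\n", qb);  /* ... for both   */
    return 0;
}
```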
  • Because the destination ports of the multiple TCP data packets all correspond to one interrupt queue, after the processing described above the multiple TCP data packets are stored in one interrupt queue. The reason the destination ports of the multiple TCP data packets all correspond to one interrupt queue is that the TCP ports to be used are filtered when the multiple TCP connections of the service process are established, as follows.
  • The first server may include multiple interrupt queues, which may also be referred to as Ethernet queues, and the destination ports that the service process can use may include multiple destination ports. Before step 501, the first server establishes the multiple TCP connections of the service process through step 500a and step 500b.
  • Step 500a: The first server determines the correspondence between the multiple interrupt queues and the multiple destination ports, where each interrupt queue corresponds to one destination port set and one destination port set may include multiple destination ports.
  • Specifically, the service processing core of the first server may determine the correspondence between the multiple interrupt queues and the multiple destination ports by determining, according to each of the multiple destination ports and the specified hash value, the interrupt queue corresponding to each destination port. The multiple destination ports corresponding to one interrupt queue are associated with that interrupt queue as one destination port set, so that the correspondence between the multiple interrupt queues and the multiple destination ports can be obtained; this correspondence may also be referred to as the correspondence between interrupt queues and port sets.
  • For example, assume the first server includes nine interrupt queues whose indexes are q1 to q9. The interrupt queue corresponding to a destination port may be determined as follows: perform the hash operation on the destination port with the specified hash value, and take the value of the specified length (assume 8 bits) from the result. If the 8-bit value corresponding to the destination port is 12, querying the Ethernet queue array shown in Table 1 with the value 12 determines that the corresponding interrupt queue index is q4.
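  • A sketch of this scan, using the same toy hash as above (the real mapping depends on the NIC's actual hash and indirection table, and the port range is an assumption): walk a range of candidate ports, compute the queue each port maps to, and collect the ports that land in the desired queue into that queue's destination port set.

```c
/*
 * Sketch of step 500a with the same toy hash as above: walk a range of
 * candidate listening ports, compute the interrupt queue each port maps
 * to, and collect the ports that land in the desired queue into that
 * queue's destination port set. The port range and queue count are
 * illustrative assumptions.
 */
#include <stdio.h>
#include <stdint.h>

#define TABLE_SIZE 256
#define NUM_QUEUES 9
#define SET_MAX    16

static uint32_t hash32(uint32_t x, uint32_t key)
{
    x ^= key;
    x *= 2654435761u;
    return x ^ (x >> 16);
}

int main(void)
{
    uint8_t indirection[TABLE_SIZE];
    for (int i = 0; i < TABLE_SIZE; i++)
        indirection[i] = (uint8_t)(i % NUM_QUEUES);

    uint32_t key = 0x9e3779b9;  /* the "specified hash value" */
    int wanted_queue = 3;       /* 0-based index of q4 from the example */

    uint16_t port_set[SET_MAX];
    int n = 0;

    /* Scan candidate destination ports the service process could use. */
    for (uint32_t port = 6000; port < 7000 && n < SET_MAX; port++) {
        uint8_t entry = (uint8_t)(hash32(port, key) & 0xff);
        if (indirection[entry] == wanted_queue)
            port_set[n++] = (uint16_t)port;
    }

    printf("destination port set for queue q%d:", wanted_queue + 1);
    for (int i = 0; i < n; i++)
        printf(" %d", port_set[i]);
    printf("\n");
    return 0;
}
```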
  • Step 500b: The first server establishes the multiple TCP connections of the service process by using the multiple destination ports included in one destination port set, and the multiple TCP connections may be used to transmit the TCP data packets of the service process.
  • Specifically, the service processing core of the first server may establish the multiple TCP connections of the service process. Because the multiple TCP connections of the service process all use ports from the port set corresponding to one interrupt queue, the destination ports of the multiple TCP data packets received by the first server correspond to that interrupt queue, and the multiple TCP data packets can be mapped into one interrupt queue.
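  • As a POSIX-sockets illustration of step 500b (an assumption about deployment, not the patent's code): the service process opens its listening sockets only on ports of the chosen destination port set, so every TCP connection that peers establish to it hashes to the same interrupt queue.

```c
/*
 * POSIX sketch of step 500b: the service process listens only on ports
 * from one destination port set, so every TCP connection peers establish
 * to it is hashed to the same interrupt queue. The port values are the
 * illustrative kind of set computed above; error handling is minimal.
 */
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    const uint16_t port_set[] = { 6003, 6012, 6021 }; /* assumed set */
    const int n = (int)(sizeof(port_set) / sizeof(port_set[0]));
    int listeners[3];

    for (int i = 0; i < n; i++) {
        listeners[i] = socket(AF_INET, SOCK_STREAM, 0);
        if (listeners[i] < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port_set[i]);

        if (bind(listeners[i], (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(listeners[i], 128) < 0) {
            perror("bind/listen");
            return 1;
        }
        printf("service process listening on port %d\n", port_set[i]);
    }

    /* accept() loops over the listeners would follow here. */
    for (int i = 0; i < n; i++)
        close(listeners[i]);
    return 0;
}
```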
  • Step 503: The first server obtains an interrupt processing request, where the interrupt processing request is used to request processing of at least one of the multiple TCP data packets stored in the interrupt queue, and the destination port of each of the multiple TCP data packets corresponds to the interrupt queue.
  • Specifically, the first server may configure an interrupt processing core for each interrupt queue. A peripheral of the server (for example, the network card module of the server) sends an interrupt processing request to the interrupt processing core corresponding to the interrupt queue; the interrupt processing request may be used to request processing of one TCP data packet stored in the interrupt queue, or of multiple TCP data packets stored in the interrupt queue. That is, the interrupt processing request can be used to request processing of at least one TCP data packet.
  • Step 504: The first server obtains the at least one TCP data packet from the interrupt queue, and determines the service processing core according to the at least one TCP data packet.
  • Specifically, this step may be executed by the interrupt processing core. The interrupt processing core may obtain the at least one TCP data packet from the interrupt queue, parse it, store the data of the at least one TCP data packet in the cache and memory, and determine the service process according to the TCP connection information of the at least one TCP data packet, thereby determining the service processing core.
  • Step 505: The first server wakes the service processing core, so that the service processing core processes the at least one TCP data packet, where the interrupt processing core and the service processing core have a shared cache space.
  • Specifically, the interrupt processing core can wake the service processing core; for example, the interrupt processing core can send a wake-up instruction to the service processing core, and when the service processing core receives the wake-up instruction, it is woken.
  • Because the interrupt processing core and the service processing core have a shared cache space, the service processing core can read the data of the at least one TCP data packet from the cache of the interrupt processing core to operate on the data of the at least one TCP data packet; for example, it updates the original data stored in the server according to the data in the TCP data packet, or sends the user data in the data packet to other servers, so that the other servers update their stored original data.
  • The existence of the shared cache space between the interrupt processing core and the service processing core may include: the interrupt processing core and the service processing core are the same core; or the interrupt processing core and the service processing core satisfy one of the following conditions: they are located in the same cluster, or in the same logical unit (die).
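  • The patent does not fix a particular wake-up mechanism. The following user-space C sketch merely models the hand-off with a pthread condition variable (an assumption for illustration): the "interrupt processing core" thread publishes a packet and signals, and the "service processing core" thread sleeps until woken, then processes the packet.

```c
/*
 * Illustrative model of the wake-up in step 505, using a pthread
 * condition variable. In the patent the wake-up is between CPU cores;
 * here two threads stand in for the two cores, and all names are
 * illustrative assumptions.
 */
#include <stdio.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wake = PTHREAD_COND_INITIALIZER;
static const char *pending;    /* the "at least one TCP data packet" */

static void *service_core(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (pending == NULL)                 /* sleep until woken */
        pthread_cond_wait(&wake, &lock);
    printf("service core processing: %s\n", pending);
    pending = NULL;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t svc;
    pthread_create(&svc, NULL, service_core, NULL);

    /* Interrupt processing core: hand over the packet, then wake. */
    pthread_mutex_lock(&lock);
    pending = "TCP payload for OSD1";
    pthread_cond_signal(&wake);             /* the "wake-up instruction" */
    pthread_mutex_unlock(&lock);

    pthread_join(svc, NULL);
    return 0;
}
```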
  • When the interrupt processing core and the service processing core are the same core, the accessed data may be transferred through the L1 cache; the transfer process may be: the interrupt processing core temporarily stores the data of the at least one TCP data packet in the L1 cache, and the service processing core directly accesses the L1 cache.
  • When the interrupt processing core and the service processing core are located in the same cluster, the accessed data can be transferred through the L2 cache; the transfer process may be: the interrupt processing core temporarily stores the data of the at least one TCP data packet in the L2 cache, and the service processing core directly accesses the L2 cache.
  • When the interrupt processing core and the service processing core are located in the same logical unit, the accessed data may be transferred through the L3 cache; the transfer process may be: the interrupt processing core temporarily stores the data of the at least one TCP data packet in the L3 cache, and the service processing core directly accesses the L3 cache.
  • In addition, the interrupt processing core and the service processing core may be configured in different clusters of the same CPU rather than in different CPUs, which can also reduce part of the data access delay and increase the data processing rate. Since the access rates satisfy L1 > L2 > L3 > cross-die memory access > cross-CPU memory access, the interrupt processing core and the service processing core should, as far as possible, be configured as the same core, or be located in the same cluster or the same logical unit (die), to reduce the data access latency and increase the data processing rate.
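  • On Linux, this placement could be arranged with standard affinity interfaces; the sketch below pins the service thread to an assumed core in the same cluster as the interrupt processing core, while the interrupt queue itself would be steered with the standard /proc/irq/&lt;N&gt;/smp_affinity interface (for example, `echo 2 > /proc/irq/42/smp_affinity` from a shell). The core and IRQ numbers are illustrative assumptions.

```c
/*
 * Linux-specific sketch: pin the service thread next to the interrupt
 * processing core. Core 1 is an illustrative choice assumed to share a
 * cluster (and thus an L2 cache) with the core handling the interrupt
 * queue; nothing here comes from the patent itself.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <pthread.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(1, &set);   /* assumed: same cluster as the interrupt core */

    int err = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (err != 0) {
        fprintf(stderr, "pthread_setaffinity_np failed: %d\n", err);
        return 1;
    }
    printf("service thread pinned to core %d\n", sched_getcpu());
    return 0;
}
```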
  • In the prior art, the service processing core running a service process and the interrupt processing core of each interrupt queue are configured independently, so the service processing core and the multiple interrupt processing cores are likely to be scattered across different CPUs or different clusters, which results in a large access delay for the service processing core. In the embodiments of the present invention, the different destination ports of one service process in the first server correspond to one interrupt queue, and the service processing core running the service process and the interrupt processing core of that interrupt queue are located in the same cluster or the same logical unit.
  • The relationship between the service processing core and the interrupt processing core can be as shown in FIG. 7, where corex represents the service processing core, OSD1 represents the service process running on corex, port0 to portn represent the multiple destination ports, ethq0 represents the interrupt queue corresponding to the multiple destination ports, and corey represents the interrupt processing core of that interrupt queue. The corex and corey in FIG. 7 can be in the same cluster or in the same logical unit, and they can also be the same core.
  • In the interrupt processing method provided by the embodiments of the present invention, the multiple TCP connections of one service process in the server are configured to correspond to one interrupt queue, so that the multiple TCP data packets received by the service process through the multiple TCP connections can be stored in one interrupt queue; and because the interrupt processing core configured for the interrupt queue shares cache space with the service processing core running the service process, the service processing core can use the shared cache to access data, thereby reducing data access latency, improving data processing efficiency, and in turn improving system performance.
  • the server includes corresponding hardware structures and/or software modules for performing various functions.
  • the embodiments of the present invention can be implemented in a combination of hardware or hardware and computer software in combination with the apparatus and algorithm steps of the various examples described in the embodiments disclosed herein. Whether a function is implemented in hardware or computer software to drive hardware depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present application.
  • The embodiments of the present application may divide the server into function modules according to the foregoing method examples. For example, each function module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be implemented in the form of hardware or in the form of a software function module. It should be noted that the division of modules in the embodiments of the present application is schematic and is only a logical function division; there may be other division manners in actual implementation.
  • FIG. 8 is a schematic diagram of a possible structure of the interrupt processing apparatus involved in the foregoing embodiments. The interrupt processing apparatus includes a receiving unit 801, an obtaining unit 802, a first processing unit 803, and a second processing unit 804. The receiving unit 801 is configured to perform steps 501 and 503 in FIG. 5 or FIG. 6; the obtaining unit 802 and the first processing unit 803 are configured to perform step 504 in FIG. 5 or FIG. 6; and the first processing unit 803 and the second processing unit 804 are configured to perform step 505 in FIG. 5 or FIG. 6, as well as the other technical processes described herein.
  • The foregoing interrupt processing apparatus may also be a server. For all related content of the steps involved in the method embodiments, reference may be made to the function descriptions of the corresponding function modules, and details are not described again here. For example, the receiving unit 801 and the obtaining unit 802 may be communication interfaces, and the first processing unit 803 and the second processing unit 804 may be processors. The interrupt processing apparatus shown in FIG. 8 can also implement the interrupt processing method shown in FIG. 5 or FIG. 6 by software; in that case, the interrupt processing apparatus and each of its modules can also be software modules.
  • FIG. 2 is a schematic diagram of a possible logical structure of the server involved in the foregoing embodiments according to an embodiment of the present invention. The processor 202 in the server may include multiple cores, which may be multiple cores of one CPU or multiple cores of multiple CPUs, and the multiple cores may include an interrupt processing core and a service processing core. The interrupt processing core is used to perform the operations described in steps 501 to 505 of FIG. 5 or FIG. 6, and the service processing core is used to perform the operations described in steps 500a and 500b of FIG. 6.
  • FIG. 9 is a schematic structural diagram of another processor. The processor may include multiple cores, including an interrupt processing core 901 and a service processing core 902, and the processor can be used to perform the interrupt processing method provided in FIG. 5 or FIG. 6. The interrupt processing core 901 and the service processing core 902 may be the same core; or the interrupt processing core 901 and the service processing core 902 may belong to the same cluster; or the interrupt processing core 901 and the service processing core 902 may belong to the same logical unit. FIG. 9 takes the case in which the interrupt processing core 901 and the service processing core 902 are two different cores as an example.
  • All or part of the above embodiments may be implemented by software, hardware, firmware, or any combination thereof.
  • the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • When the computer program instructions are loaded or executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • The computer instructions can be stored in a computer-readable storage medium or transferred from one computer-readable storage medium to another; for example, the computer instructions can be transferred from one website, computer, server, or data center to another website, computer, server, or data center by wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) means.
  • the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that contains one or more sets of available media.
  • the usable medium can be a magnetic medium (eg, a floppy disk, a hard disk, a magnetic tape), an optical medium (eg, a DVD), or a semiconductor medium.
  • the semiconductor medium can be a solid state drive (SSD).
  • In another embodiment, a chip system is further provided. The chip system includes a processor, a memory, a communication interface, and a bus; the processor, the memory, and the communication interface are connected through the bus, and code and data are stored in the memory. When the processor runs the code in the memory, the chip system performs the interrupt processing method provided in FIG. 5 or FIG. 6.
  • In the embodiments of the present invention, the multiple TCP connections of a service process in the server are configured to correspond to one interrupt queue, so that the multiple TCP data packets received by the service process through the multiple TCP connections can be stored in one interrupt queue; and the interrupt processing core configured for the interrupt queue shares cache space with the service processing core running the service process, so the service processing core can use the shared cache to access data, thereby reducing data access delay, improving data processing efficiency, and in turn improving system performance.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Multi Processors (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present application provides an interrupt processing method, apparatus, and server, relating to the field of data storage technologies and intended to reduce data access latency. The method is applied to a server including multiple cores, the multiple cores including an interrupt processing core and a service processing core running a service process, and includes: the interrupt processing core receives an interrupt processing request, where the interrupt processing request is used to request processing of at least one of multiple TCP data packets of the service process stored in an interrupt queue, and the destination port of each of the multiple TCP data packets corresponds to the same interrupt queue; the interrupt processing core obtains the at least one TCP data packet from the interrupt queue; the interrupt processing core determines the service processing core according to the at least one TCP data packet, where the interrupt processing core and the service processing core have a shared cache space; and the interrupt processing core wakes the service processing core, so that the service processing core processes the at least one TCP data packet.

Description

Interrupt processing method, apparatus, and server

Technical Field

The present application relates to the field of data storage technologies, and in particular, to an interrupt processing method, apparatus, and server.

Background

In a general-purpose computer architecture, the cache exists to bridge the speed difference between the central processing unit (CPU) and the memory. It comprises three levels: the level 1 (L1) cache, the level 2 (L2) cache, and the level 3 (L3) cache. The access priority and access rate of the three levels are ordered L1 > L2 > L3, and using the different caches increases the data access rate. When the CPU needs to read data, it first checks whether the data to be read is in the cache and, if so, sends it to the CPU immediately. If not, the data is read from memory at a relatively slow rate and sent to the CPU, and at the same time the data block containing the data is loaded into the cache, so that subsequent reads of the whole block can be served from the cache without calling memory again.

At present, in a server architecture, each server may include one or more CPUs, each CPU includes multiple cores, and different CPU cores may share cache resources. For example, an ARM server includes 2 CPUs, each with 32 cores; within one CPU, every four cores form a cluster, and every 16 cores form a logical unit (die). Each core in the CPU has its own L1 cache, the four cores in one cluster share one L2 cache, and the 16 cores in one logical unit share one L3 cache. During service processing, a processor core handles input/output (I/O) operation requests in interrupt mode. The specific process is as follows: when the server receives a Transmission Control Protocol (TCP) data packet carrying an I/O operation request, the packet is stored in an interrupt queue associated with it; each interrupt queue is configured with one processor core (called the interrupt processing core), which fetches the TCP data packets in first-in-first-out order and notifies the processor core that processes the service process corresponding to the packet (that is, the core running the service process, called the service processing core). The service processing core then has to read the data from the cache or memory of the interrupt processing core to complete the data read or write. When the server includes multiple CPUs each with multiple cores, the interrupt processing core and the service processing core may not be in the same cluster or the same logical unit and cannot share cache resources; they then have to access the caches across CPUs or across logical units through the internal bus, which makes read or write operations slow.

When the above interrupt processing method is applied to a distributed data storage system, multiple copies of the same data can be stored on different servers. A server deployed with a virtual block system (VBS) process accesses the copy data in the servers deployed with object storage device (OSD) processes; multiple OSD processes can be deployed in one server, each OSD process corresponds to one disk in the server, and each process is handled by one processor core. FIG. 1 is a schematic diagram of a distributed data storage system. As shown in FIG. 1, the VBS process communicates with each OSD process, and the OSD processes of different servers communicate with each other, through TCP connections; OSD1 to OSDn in FIG. 1 represent the OSD processes on different servers. When data is read or written, the VBS process first sends the data to be read or written, as the payload of a TCP packet, to the OSD process where the primary backup data is located, and that OSD process then synchronizes the data to the OSD processes where the other secondary backup data is located. An OSD process can receive TCP data packets from the VBS process as well as from OSD processes on other servers, so it can receive multiple TCP data packets. Correspondingly, when a server receives multiple TCP data packets, they are likely to be stored in multiple different interrupt queues; the interrupt processing core of each queue fetches and processes the TCP data packets from its own queue and stores the data of the corresponding packets in its own cache and memory. Because the interrupt processing core of an interrupt queue is configured randomly, the interrupt processing cores of the multiple queues are likely to be scattered across different logical units and different CPUs. The service processing core then has to read data from different caches and memories; its access delay to memory and to the L3 cache is greater than its access delay to the L2 cache, and accessing caches and memory across CPUs or across logical units through the internal bus further increases the access delay. As a result, the service processing core suffers a large data access delay, which reduces the user data processing rate and affects system performance.
Summary

The present application provides an interrupt processing method, apparatus, and server, which solve the prior-art problems of high data access latency and a low user-data processing rate.

To achieve the above objective, the present application adopts the following technical solutions:

According to a first aspect, an interrupt processing method is provided, applied to a server whose central processing unit (CPU) includes multiple cores, the multi-core CPU including an interrupt processing core for handling interrupts and a service processing core on which a service process runs. The method includes: when the server receives multiple TCP data messages of the service process, because the destination port of each of the multiple TCP data messages corresponds to the same interrupt queue, the multiple TCP data messages are stored in that interrupt queue and an interrupt processing request is triggered; the interrupt processing core receives the interrupt processing request, which is used to request processing of at least one of the multiple TCP data messages stored in the interrupt queue, that is, the request may ask for one TCP data message or for multiple TCP data messages to be processed; the interrupt processing core obtains the at least one TCP data message from the interrupt queue; from the TCP connection information of the at least one TCP data message, the interrupt processing core can determine the service process to which it belongs, and since that service process runs on the service processing core, thereby determine the service processing core, where the interrupt processing core and the service processing core share cache space; and the interrupt processing core can send a wake-up instruction to the service processing core to wake it up so that it processes the at least one TCP data message, for example by updating user data stored in the server according to the user data in the at least one TCP data message, or by sending it to other servers for data synchronization.

In the above technical solution, the multiple TCP connections of one service process in the server are configured to correspond to one interrupt queue, so that the multiple TCP data messages received by the service process over the multiple TCP connections can be stored in a single interrupt queue, and the interrupt processing core configured for that queue is configured to share cache space with the service processing core running the service process, so that the service processing core can access data through the shared cache, thereby reducing data access latency, raising data processing efficiency, and in turn improving system performance.

In one possible implementation, the interrupt processing core and the service processing core are the same core of one CPU; in this case the service processing core can obtain the user data in the at least one TCP data message from the L1 cache, giving the smallest data access latency and the highest processing rate. Alternatively, the service processing core and the interrupt processing core belong to the same cluster; in this case the service processing core can obtain the user data from the L2 cache, with a small data access latency and a high processing rate. Alternatively, the service processing core and the interrupt processing core belong to the same logical unit (die); in this case the service processing core can obtain the user data from the L3 cache, and the data access latency and processing rate remain comparatively favorable relative to accessing memory.

In another possible implementation, the server includes multiple interrupt queues, and the destination ports usable by the service process include multiple destination ports. Before the interrupt processing core obtains the interrupt processing request, the method further includes: the service processing core determines a correspondence between the multiple interrupt queues and the multiple destination ports, each interrupt queue corresponding to one destination-port set, one destination-port set including multiple destination ports; and the service processing core establishes multiple TCP connections of the service process through one destination-port set, the multiple TCP connections being used to transmit the TCP data messages of the service process. In this possible implementation, establishing the multiple TCP connections of the service process from one destination-port set causes the multiple TCP data messages destined to the service process to be stored in a single interrupt queue, preventing them from being scattered across multiple different interrupt queues.

In another possible implementation, determining, by the service processing core, the correspondence between the multiple interrupt queues and the multiple destination ports includes: obtaining, according to each of the multiple destination ports and a specified hash value, the interrupt queue corresponding to each destination port, so as to obtain the correspondence between the multiple interrupt queues and the multiple destination ports. In this possible implementation, the service processing core can determine the correspondence simply and effectively from the specified hash value.

In another possible implementation, the specified hash value differs when the server includes different types of network interface cards. In this possible implementation, for different servers with different network card types, setting different specified hash values still allows the multiple TCP data messages of the service process to be stored in a single interrupt queue.
According to a second aspect, an interrupt processing apparatus is provided, including: a receiving unit, configured to receive an interrupt processing request used to request processing of at least one of multiple TCP data messages of a service process stored in an interrupt queue, the destination port of each of the multiple TCP data messages corresponding to the same interrupt queue; an obtaining unit, configured to obtain the at least one TCP data message from the interrupt queue; and a first processing unit, configured to determine a second processing unit according to the at least one TCP data message, the first processing unit and the second processing unit sharing cache space, the first processing unit being further configured to wake up the second processing unit so that the second processing unit processes the at least one TCP data message.

In one possible implementation, the first processing unit and the second processing unit are the same processing unit; or the first processing unit and the second processing unit belong to the same cluster; or the first processing unit and the second processing unit belong to the same logical unit (die).

In another possible implementation, the apparatus includes multiple interrupt queues, the destination ports usable by the service process include multiple destination ports, and the second processing unit is further configured to: determine a correspondence between the multiple interrupt queues and the multiple destination ports, each interrupt queue corresponding to one destination-port set, one destination-port set including multiple destination ports; and establish multiple TCP connections of the service process through one destination-port set, the multiple TCP connections being used to transmit the TCP data messages of the service process.

In another possible implementation, the second processing unit is further configured to obtain, according to each of the multiple destination ports and a specified hash value, the interrupt queue corresponding to each destination port, so as to obtain the correspondence between the multiple interrupt queues and the multiple destination ports.

In another possible implementation, the specified hash value differs when the interrupt processing apparatus includes different types of network interface cards.

According to a third aspect, a processor is provided, configured to perform the interrupt processing method provided in the first aspect or any possible implementation of the first aspect.

According to a fourth aspect, a server is provided, including a memory, a processor, a bus, and a communication interface, where the memory stores code and data, the processor, the memory, and the communication interface are connected by the bus, and the processor runs the code in the memory to cause the server to perform the interrupt processing method provided in the first aspect or any possible implementation of the first aspect.

According to a fifth aspect, a computer-readable storage medium is provided, storing computer-executable instructions that, when executed by at least one processor of a device, cause the device to perform the interrupt processing method provided in the first aspect or any possible implementation of the first aspect.

According to a sixth aspect, a computer program product is provided, including computer-executable instructions stored in a computer-readable storage medium; at least one processor of a device can read the computer-executable instructions from the computer-readable storage medium, and execution of those instructions by the at least one processor causes the device to implement the interrupt processing method provided in the first aspect or any possible implementation of the first aspect.

It can be understood that the apparatus, processor, server, computer storage medium, and computer program product of any interrupt processing method provided above are all used to perform the corresponding method provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding method, and details are not repeated here.
Brief Description of the Drawings

FIG. 1 is a schematic diagram of TCP connections in a distributed data storage system;

FIG. 2 is a schematic structural diagram of a server according to the present application;

FIG. 3 is a schematic structural diagram of a processor according to the present application;

FIG. 4 is a schematic diagram of data storage in a distributed data storage system according to the present application;

FIG. 5 is a schematic flowchart of an interrupt processing method according to the present application;

FIG. 6 is a schematic flowchart of another interrupt processing method according to the present application;

FIG. 7 is a schematic diagram of the relationship between a service process and an interrupt queue according to the present application;

FIG. 8 is a schematic structural diagram of an interrupt processing apparatus according to the present application;

FIG. 9 is a schematic structural diagram of another processor according to the present application.
Detailed Description

FIG. 2 is a schematic structural diagram of a server according to an embodiment of the present invention. Referring to FIG. 2, the server may include a memory 201, a processor 202, a communication interface 203, and a bus 204, where the memory 201, the processor 202, and the communication interface 203 are interconnected by the bus 204. The memory 201 may be used to store data, software programs, and modules, and mainly includes a program storage area, which may store an operating system and the application programs required by at least one function, and a data storage area, which may store data created during use of the device. The processor 202 is used to control and manage the actions of the server, for example by running or executing the software programs and/or modules stored in the memory 201 and invoking the data stored in the memory 201 to perform the various functions of the server and process data. The communication interface 203 is used to support communication of the server.

The processor 202 may include a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or execute the various example logical blocks, modules, and circuits described in connection with the disclosure of the present application. The processor 202 may also be a combination implementing computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The bus 204 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 2, but this does not mean there is only one bus or one type of bus.

In this embodiment of the present invention, a server may include one or more processors 202, and each processor 202 may include multiple cores. For ease of the following description, the server in this embodiment of the present invention is called the first server.

FIG. 3 is a schematic diagram of the internal structure of a processor 202 in the first server. The processor 202 may be an ARM processor, which may include multiple central processing units (CPUs); each CPU may include multiple cores (for example, 32 cores), every four cores may be called a cluster, and every four clusters may be called a logical unit (die). FIG. 3 takes a processor 202 including two CPUs as an example: the two CPUs include 64 cores (for example, core 0 to core 63), each CPU includes two dies, and the processor 202 includes four dies in total. Optionally, the structure of an x86 processor may also be extended to the structure of the processor 202 provided in FIG. 3, which is not specifically limited in the present application.

According to the data read order and the closeness of coupling with the CPU, CPU caches can be divided into the level 1 cache (L1 cache), the level 2 cache (L2 cache), and the level 3 cache (L3 cache), and all the data stored in each level of cache is a subset of the next level. The L1 cache sits closest to the CPU and is the most tightly coupled CPU cache; it can temporarily store and deliver to the CPU cores various operation instructions and the data needed by operations, and it has the fastest access rate. The L2 cache sits between the L1 cache and the L3 cache; the L2 and L3 caches are used only to store data needed by the CPU cores during processing. The access priority and access rate of the L2 cache are higher than those of the L3 cache, and the capacities of the three levels, from largest to smallest, are L3, L2, L1.

The three-level cache works as follows: when a CPU core needs to read a datum, it first looks in the L1 cache; if the datum is not in the L1 cache, it looks in the L2 cache; if it is not in the L2 cache either, it looks in the L3 cache; and if it is not in the L3 cache either, it must be read from memory. The data held in the caches is a small fraction of what is in memory, but it is the fraction the CPU cores are about to access in the near term, so when the CPU cores read and write data, using the different cache levels raises data access efficiency.
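The lookup order described above can be modeled with a short sketch. This is illustrative only and is not part of the original disclosure; the dict-based caches and the fill-on-miss policy are assumptions made for clarity:

```python
# Illustrative model of the L1 -> L2 -> L3 -> memory lookup order
# described above. Cache contents are plain dicts; this is a sketch,
# not a real cache simulator.

def read(addr, l1, l2, l3, memory):
    """Return (value, level_hit) following the search order in the text."""
    for level, cache in (("L1", l1), ("L2", l2), ("L3", l3)):
        if addr in cache:
            return cache[addr], level
    # Miss in all caches: fetch from memory and fill the caches so that
    # subsequent reads of the same block hit in L1.
    value = memory[addr]
    l3[addr] = l2[addr] = l1[addr] = value
    return value, "memory"

l1, l2, l3 = {}, {}, {}
memory = {0x1000: 42}
print(read(0x1000, l1, l2, l3, memory))  # -> (42, 'memory')
print(read(0x1000, l1, l2, l3, memory))  # -> (42, 'L1')
```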
A processor core can handle input/output (I/O) operations via interrupts, as follows: when the device receives a TCP data message, the message is stored in an interrupt queue; each interrupt queue is configured with one core (called the interrupt processing core), which fetches the TCP data message from the queue, parses it, and stores the data of the message in cache and memory. Afterwards, the core of the service process corresponding to the TCP data message (that is, the core running the service process, called the service processing core) reads the data from the interrupt processing core's cache or from memory to perform the data read or write operation.

In this embodiment of the present invention, when one core needs to access another core's data: if the two cores are in the same cluster, then since the cores of a cluster share one L2 cache, the accessed data can be transferred through the L2 cache, that is, the first core caches the data in the L2 cache and the second core accesses the shared L2 cache directly. Likewise, if the two cores are in different clusters of the same die, then since the cores of a die share one L3 cache, the data can be transferred through the L3 cache, that is, the first CPU core caches the data in the L3 cache and the second CPU core accesses the shared L3 cache directly (which may be called cross-cluster access). If the two cores are not in the same CPU, the accessed data can only be transferred through memory, that is, the first core places the data in its memory and the second core reads it from the first core's memory (which may be called cross-CPU access); in this case the transfer has to cross multiple CPUs over the internal bus to complete. Because the access latency of the L3 cache exceeds that of the L2 cache, and the access latency of memory exceeds that of the L3 cache, two cores in the cross-die or cross-CPU access situation suffer high access latency.
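Given the topology described earlier (four cores per cluster, 16 cores per die, 32 cores per CPU), the level through which two cores can exchange data follows mechanically from their core numbers. The following sketch assumes a contiguous core numbering, which is an illustration rather than anything mandated by the original text:

```python
# Sketch: decide through which level two cores can exchange data,
# under the assumed topology of 4 cores/cluster, 16 cores/die,
# 32 cores/CPU and contiguous core numbering.

CORES_PER_CLUSTER = 4
CORES_PER_DIE = 16
CORES_PER_CPU = 32

def sharing_level(core_a, core_b):
    if core_a == core_b:
        return "L1"                      # same core: private L1 suffices
    if core_a // CORES_PER_CLUSTER == core_b // CORES_PER_CLUSTER:
        return "L2"                      # same cluster: shared L2
    if core_a // CORES_PER_DIE == core_b // CORES_PER_DIE:
        return "L3"                      # same die, different clusters
    if core_a // CORES_PER_CPU == core_b // CORES_PER_CPU:
        return "memory (cross-die)"      # same CPU, different dies
    return "memory (cross-CPU)"          # different CPUs, internal bus

print(sharing_level(0, 3))    # L2
print(sharing_level(0, 12))   # L3
print(sharing_level(0, 20))   # memory (cross-die)
print(sharing_level(0, 40))   # memory (cross-CPU)
```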
The interrupt processing method provided in the embodiments of the present invention is applicable to any server that transmits data messages over TCP connections. For example, the server may be a server in a distributed data storage system; for ease of the following description, a distributed data storage system is taken as the example below.

A distributed data storage system may include multiple servers. In such a system, user data can be stored in the form of multiple replicas, and the replicas of the same data can be stored on different servers. When a user performs an I/O operation on data stored in a server, the consistency of the multiple replicas of that data must be guaranteed; the replicas may be one master backup and multiple slave backups.

A user can access replica data on servers on which object storage device (OSD) processes are deployed through a server on which a virtual block system (VBS) process is deployed. Multiple OSD processes can be deployed on one server, each OSD process corresponds to one disk of the server, and that disk can store multiple different replicas. The VBS process is the I/O process of the service and provides the access-point service (that is, user data is presented as virtual blocks, and access to a virtual block realizes access to the real data); the VBS process is also used to manage volume metadata. User data can be stored in the form of volumes, and volume metadata refers to information describing how the user data is distributed in the storage servers, for example the address of the data, its modification time, its access permissions, and so on. The OSD process is also an I/O process of the service; it manages the user data stored on its corresponding disk and performs the concrete I/O operations, that is, the concrete data read and write operations.

For ease of understanding, the description here takes as an example a distributed data storage system that includes three servers for storing user data and stores user data under a three-replica model; the storage of the user data in the servers may be as shown in FIG. 4. The three-replica model means each data block is stored three times in the storage system, one copy as the master backup (Master) and two as slave backups (Slave). The VBS process can partition the user data stored in the servers; assuming the partitioning yields n data blocks, Part1 to Partn, each stored three times, the storage structure of the three replicas of the n data blocks Part1 to Partn can be as shown in FIG. 4. The three backups of each data block are scattered across the disks of different servers; in FIG. 4, M denotes the Master of each data block, S1 denotes its Slave1 part, and S2 denotes its Slave2 part. Assume each server includes n disks, Disk1 to Diskn. The volume metadata in FIG. 4 is the volume metadata of Part1 to Partn managed by the VBS process and can include the identification of the server storing each data block and the block's specific location within that server.

In addition, as shown in FIG. 1, for data transmission between the VBS process and OSD processes and between OSD processes of different servers, the VBS process must establish a Transmission Control Protocol (TCP) connection with every OSD process deployed on a server, and the OSD processes of different servers must also establish TCP connections with one another; TCP data messages can be transmitted over the established TCP connections. In FIG. 1, OSD1 to OSDn denote OSD processes on different servers.

Because the different backups (Master and Slave) of one data block are stored on different servers, when an input/output (I/O) operation is performed on one of the copies, the consistency of the other backups must be guaranteed. Specifically, when the VBS process performs an I/O operation on user data stored in the servers, the VBS process can query the volume metadata to determine the servers holding the three replicas of the data block being operated on and their specific locations within those servers. The VBS process sends a TCP data message to the OSD process in the server holding the block's Master, and that OSD process stores the data of the TCP data message. That OSD process then sends the received data over TCP connections to the OSD processes in the servers corresponding to the two Slaves so that the data stays consistent across the replicas. Afterwards, having received the acknowledgements sent by the OSD processes of the two Slave servers, the OSD process of the Master's server returns an acknowledgement to the VBS process, completing the I/O operation.
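The write flow just described (VBS to Master, Master to both Slaves, acknowledgement only after both Slaves acknowledge) can be captured in a minimal in-process model. The class and method names below are assumptions for illustration; real OSD processes communicate over TCP, not method calls:

```python
# Minimal model of the replica write flow described above: the VBS
# sends data to the master OSD, which forwards it to the two slave OSDs
# and acknowledges the VBS only after both slaves have acknowledged.
# The dict-based "disks" stand in for real disks.

class OSD:
    def __init__(self, name):
        self.name, self.disk = name, {}

    def write(self, block_id, data, slaves=()):
        self.disk[block_id] = data                        # persist locally
        acks = [s.write(block_id, data) for s in slaves]  # sync replicas
        return all(acks)                                  # ack when replicas acked

master, slave1, slave2 = OSD("master"), OSD("slave1"), OSD("slave2")

def vbs_write(block_id, data):
    return master.write(block_id, data, slaves=(slave1, slave2))

assert vbs_write("Part1", b"user data")
assert slave1.disk["Part1"] == slave2.disk["Part1"] == b"user data"
```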
For a given OSD process, the OSD process may receive TCP data messages from the VBS process and also from OSD processes on other servers, and may therefore receive multiple TCP data messages. Correspondingly, combining the above principle by which a processor core handles a single TCP data message, when one server receives multiple TCP data messages they are very likely to be stored in multiple different interrupt queues corresponding to multiple interrupt processing cores; the interrupt processing core of each queue fetches the corresponding TCP data messages from its own queue, parses them, and stores the data of the corresponding messages in its own cache and memory.

Because the interrupt processing core of each interrupt queue is assigned at random, the interrupt processing cores of multiple queues are very likely to be scattered across different dies and different CPUs, so that when reading the data of the multiple TCP data messages, the service processing core must read from different caches and memories. Since the access latencies of memory and of the L3 cache both exceed that of the L2 cache, the service processing core suffers high data access latency, which lowers the user-data processing rate and affects system performance.

FIG. 5 is a flowchart of an interrupt processing method according to an embodiment of the present invention. The method is applied to a server whose CPU includes multiple cores, the multi-core CPU including an interrupt processing core and a service processing core. The service processing core is a core on which a service process runs and can handle the data read and write operations related to that service process; for example, the service process may be an OSD process, the core running the OSD process is called the service processing core, and it can handle the read and write operations of the backup data managed by the OSD process. The interrupt processing core is a core used to handle interrupts, and the server can configure one interrupt processing core for each interrupt queue. Correspondingly, the method includes the following steps.

Step 501: The first server receives multiple TCP data messages, the destination ports of which all correspond to one interrupt queue.

Here the server is taken to be the first server. The first server may include multiple service processes, each of which can manage the backup data of multiple data blocks; the backup data may include Master data and Slave data, the Master and Slaves being backups of different data blocks. This embodiment of the present invention takes one service process of the first server as an example. The service process can establish TCP connections with multiple processes of other, different servers, and those TCP connections are used to transmit TCP data messages. For example, in a distributed data storage system, the service process may be an OSD process, and one OSD process can establish a TCP connection with a VBS process as well as with multiple OSD processes of other servers.

In a distributed data storage system, when a user performs a write operation, if the Master data of the data block corresponding to the write operation is among the user data managed by an OSD process of the first server, the user can send TCP data messages over the TCP connection between the VBS process and that OSD process of the first server. Alternatively, when another server needs to synchronize replica data, if the corresponding Slave data is among the user data managed by the OSD process of the first server, the other server can send TCP data messages over the TCP connection between its own OSD process and that OSD process. The first server can therefore receive multiple TCP data messages, specifically through its communication interface; the multiple TCP data messages may include TCP data messages from the VBS process as well as TCP data messages from OSD processes of other servers.

Each of the multiple TCP data messages includes port information indicating its destination port. For example, the TCP data message may include four-tuple information, namely the source IP address, source port, destination IP address, and destination port, and the destination port indicated by the port information of a TCP data message may be the destination port in the four-tuple.

It should be noted that the destination port in the present application refers to a communication protocol port of a connection-oriented service, also called a TCP port; it is an abstract software construct and does not refer to a hardware port.

Step 502: The first server stores the multiple TCP data messages in the interrupt queue corresponding to their destination ports.

Specifically, when the first server receives multiple TCP data messages, for each of them the network card driver of the first server can obtain the four-tuple information in the TCP data message, which can include the port information. When performing the hash computation from the four-tuple information and the specified hash value, the driver can mask the other fields of the four-tuple (for example, by setting all bits corresponding to the fields of the four-tuple other than the destination port to 0 during the hash computation), keeping only the destination port. The hash computation yields a result of a certain length (for example, 32 bits), and from the value of a specified length (for example, 8 bits) of that result the driver can look up the Ethernet queue array (indirection table), each value in which can be an Ethernet queue index denoting one Ethernet queue. The Ethernet queue indicated by the looked-up index is the interrupt queue in which the TCP data message will be stored.
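The masking and table-lookup step can be sketched as follows. The hash function and table contents here are stand-ins (real NICs typically use a Toeplitz hash keyed by a configured secret, which the original text does not specify), so this shows only the control flow, not the actual driver arithmetic:

```python
# Sketch of the queue-selection step described above: hash the four-tuple
# with every field other than the destination port masked to zero, then
# use a fixed-length slice of the hash to index an indirection table.

import hashlib
import struct

NUM_QUEUES = 9
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(256)]  # 256-entry table

def select_queue(src_ip, src_port, dst_ip, dst_port):
    # Mask everything except the destination port, as described above.
    masked = struct.pack(">IHIH", 0, 0, 0, dst_port)
    digest = hashlib.sha256(masked).digest()   # stand-in for the NIC hash
    index = digest[0]                          # "specified length": 8 bits
    return INDIRECTION_TABLE[index]

# Packets on different connections but with the same destination port
# always land in the same interrupt queue:
q1 = select_queue(0x0A000001, 5001, 0x0A000002, 6800)
q2 = select_queue(0x0A000003, 7777, 0x0A000002, 6800)
assert q1 == q2
```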
It should be noted that the specified hash value can be set in advance. Because different network card drivers in the first server may have different specified lengths and different Ethernet queue arrays, the specified hash value differs when the network card types of the first server differ; this is not specifically limited in the embodiments of the present invention.

Further, because the destination ports of the multiple TCP data messages all correspond to one interrupt queue, after processing according to the above method the multiple TCP data messages will be stored in that one interrupt queue. The destination ports all correspond to one interrupt queue because the TCP ports used when establishing the multiple TCP connections of the service process are pre-screened, as follows:

The first server can include multiple interrupt queues, which may also be called Ethernet queues, and the destination ports usable by the service process can include multiple destination ports. Correspondingly, referring to FIG. 6, establishing the multiple TCP connections of the service process by the first server includes step 500a and step 500b.

Step 500a: The first server determines the correspondence between the multiple interrupt queues and the multiple destination ports, where each interrupt queue corresponds to one destination-port set and one destination-port set can include multiple destination ports.

Specifically, the correspondence between the multiple interrupt queues and the multiple destination ports can be determined by the service processing core of the first server, and the determination can include: determining, according to each of the multiple destination ports and the specified hash value, the interrupt queue corresponding to each destination port; and taking the multiple destination ports corresponding to one interrupt queue as one destination-port set corresponding to that queue, thereby obtaining the correspondence between the multiple interrupt queues and the multiple destination ports.

Optionally, the correspondence between the multiple interrupt queues and the multiple destination ports may also be called the correspondence between interrupt queues and port sets.

For ease of understanding, the description here takes as an example a first server that includes nine interrupt queues indexed q1 to q9. For each of the multiple destination ports usable by the service process, the method for determining the interrupt queue corresponding to the destination port can be: perform the hash computation from the destination port and the specified hash value to determine the value of the specified length; assuming the specified length is 8 bits and the 8-bit value for this destination port is 12, looking up the Ethernet queue array shown in Table 1 below with the value 12 identifies the corresponding interrupt queue index as q4.

Table 1

  Specified-length value      Interrupt queue index
  0, 9, 18, 27, ...           q1
  1, 10, 19, 28, ...          q2
  2, 11, 20, 29, ...          q3
  3, 12, 21, 30, ...          q4
  ...                         ...

It should be noted that the Ethernet queue array shown in Table 1 above, and the above manner of determining the correspondence between the multiple destination ports and the multiple interrupt queues, are merely examples and do not limit the present application.
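Step 500a amounts to grouping candidate destination ports by the queue they hash to. A self-contained sketch, using the same stand-in hash as the earlier queue-selection sketch (the port range and queue count are assumptions for illustration):

```python
# Sketch of step 500a: group candidate destination ports by the interrupt
# queue they map to, yielding the queue -> port-set correspondence.

import hashlib
import struct
from collections import defaultdict

NUM_QUEUES = 9
TABLE = [i % NUM_QUEUES for i in range(256)]

def queue_for_port(dst_port):
    masked = struct.pack(">IHIH", 0, 0, 0, dst_port)  # only dst_port kept
    return TABLE[hashlib.sha256(masked).digest()[0]]

port_sets = defaultdict(list)
for port in range(6800, 6900):           # candidate ports for the process
    port_sets[queue_for_port(port)].append(port)

# port_sets[q] is now the destination-port set corresponding to queue q.
print({q: ports[:3] for q, ports in sorted(port_sets.items())})
```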
Step 500b: The first server establishes the multiple TCP connections of the service process through the multiple destination ports included in one destination-port set; the multiple TCP connections can be used to transmit the TCP data messages of the service process.

Specifically, the multiple TCP connections of the service process can be established by the service processing core of the first server. Because the ports used when establishing the multiple TCP connections of the service process are the multiple ports of the port set corresponding to a single interrupt queue, the destination ports of the multiple TCP data messages received by the first server correspond to one interrupt queue, and the multiple TCP data messages can thus be mapped into one interrupt queue.
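One way to realize step 500b is for the service process to accept connections only on ports drawn from a single queue's port set, so that every inbound TCP data message carries a destination port mapping to that queue. A minimal sketch using only the standard library; `port_sets` is assumed to have been built as in the previous sketch:

```python
# Sketch of step 500b: listen only on ports from one queue's port set,
# so all connections to the service process map to a single queue.

import socket

def listen_on_port_set(ports, backlog=128):
    servers = []
    for port in ports:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("0.0.0.0", port))     # requires the port to be free
        s.listen(backlog)
        servers.append(s)
    return servers

# e.g., for queue 4 (assuming port_sets from the previous sketch):
# servers = listen_on_port_set(port_sets[4][:8])
```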
Step 503: The first server obtains an interrupt processing request used to request processing of at least one of the multiple TCP data messages stored in the interrupt queue, where the destination port of each of the multiple TCP data messages corresponds to that interrupt queue.

The first server can configure one interrupt processing core for each interrupt queue. After the multiple TCP data messages are stored in the interrupt queue, a peripheral of the server (for example, the server's network card module) can send an interrupt processing request to the interrupt processing core corresponding to that queue; the request can ask for the processing of one TCP data message stored in the queue or of multiple TCP data messages stored in the queue, that is, it can ask for the processing of at least one TCP data message.

Step 504: The first server obtains the at least one TCP data message from the interrupt queue and determines the service processing core according to the at least one TCP data message.

Specifically, this can be performed by the interrupt processing core. When the interrupt processing core receives the interrupt processing request, it can fetch the at least one TCP data message from the interrupt queue, parse it, and save the data of the at least one TCP data message in cache and memory, while determining the service process, and thereby the service processing core, from the TCP connection information of the at least one TCP data message.

Step 505: The first server wakes up the service processing core so that the service processing core processes the at least one TCP data message, the interrupt processing core and the service processing core sharing cache space.

After the interrupt processing core determines the service processing core, it can wake it up; for example, the interrupt processing core can send a wake-up instruction to the service processing core, which is woken when it receives the instruction. Because the interrupt processing core and the service processing core share cache space, the service processing core can read the data of the at least one TCP data message from the interrupt processing core's cache and perform the data operations on it, for example updating the original data stored in the server according to the data in the TCP data message, or sending the user data of the data message to other servers so that they update their stored original data.
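The control flow of steps 503 to 505 (parse, place data where the service core can see it, wake the service core) can be modeled with two threads. This is purely a sketch: real cores are woken with an inter-processor interrupt rather than a `Condition`, and the shared dict only stands in for the shared cache:

```python
# Thread-based model of steps 503-505: an "interrupt core" thread parses
# a packet, leaves the payload in a shared buffer, and wakes the
# "business core" thread that processes it.

import threading

shared_cache = {}
ready = threading.Condition()

def interrupt_core(packet):
    payload = packet["payload"]           # parse the TCP data message
    with ready:
        shared_cache["data"] = payload    # place data in the shared cache
        ready.notify()                    # wake the service processing core

def business_core():
    with ready:
        while "data" not in shared_cache:
            ready.wait()                  # sleep until woken
        print("processing:", shared_cache.pop("data"))

t = threading.Thread(target=business_core)
t.start()
interrupt_core({"payload": b"user data"})
t.join()
```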
The interrupt processing core and the service processing core sharing cache space can include: the interrupt processing core and the service processing core being the same core; or the interrupt processing core and the service processing core satisfying one of the following conditions: being located in the same cluster, or being located in the same logical unit (die).

Specifically, with reference to the processor structure shown in FIG. 3, when the interrupt processing core and the service processing core are the same core, the accessed data can be transferred through the L1 cache, as follows: the interrupt processing core temporarily stores the data of the at least one TCP data message in the L1 cache, and the service processing core accesses the L1 cache directly.

When the interrupt processing core and the service processing core are in the same cluster, then since the cores of a cluster share one L2 cache, the accessed data can be transferred through the L2 cache, as follows: the interrupt processing core temporarily stores the data of the at least one TCP data message in the L2 cache, and the service processing core accesses the L2 cache directly.

When the interrupt processing core and the service processing core are in different clusters of the same die, then since the cores of a die share one L3 cache, the accessed data can be transferred through the L3 cache, as follows: the interrupt processing core temporarily stores the data of the at least one TCP data message in the L3 cache, and the service processing core accesses the L3 cache directly.

Optionally, when the first server includes two or more CPUs, the interrupt CPU core and the service CPU core can also be configured to lie in different clusters of the same CPU, which still reduces some data access latency and raises the data processing rate compared with the two CPU cores lying in different CPUs. Since the cache access rates rank L1 > L2 > L3 > cross-die memory access > cross-CPU memory access, the interrupt processing core and the service processing core should as far as possible be configured as the same core, or to lie in the same cluster, or to lie in the same logical unit (die), so as to reduce data access latency and raise the data processing rate.
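On Linux, this kind of co-location can be approximated from user space with CPU affinity. The sketch below pins the calling process to cores assumed to form one cluster; the core numbers are assumptions, and pinning the interrupt side is normally done separately via `/proc/irq/<n>/smp_affinity` (root required), which is outside the scope of this sketch:

```python
# Sketch of the affinity configuration suggested above: pin the process
# to cores of one cluster so it shares an L2 with the interrupt handling
# on those cores. Linux-only; guarded for other platforms.

import os

SAME_CLUSTER = {0, 1, 2, 3}     # assumed: cores 0-3 form one cluster

def pin_current_process_to_cluster(cores=SAME_CLUSTER):
    os.sched_setaffinity(0, cores)        # 0 = the calling process
    return os.sched_getaffinity(0)

if hasattr(os, "sched_setaffinity"):      # not available on all platforms
    print("pinned to:", pin_current_process_to_cluster())
```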
By way of example, in a distributed data storage system, when the multiple TCP connections of one OSD process in the first server correspond to different interrupt queues, and the service processing core running the service process and the interrupt processing core of each interrupt queue lie in different clusters or CPUs, the service processing core and the multiple interrupt processing cores are very likely to be scattered across different CPUs or different clusters, which makes the data processing latency of the processing cores high.

In this embodiment of the present invention, by contrast, the different destination ports of one service process in the first server correspond to one interrupt queue, and when the service processing core running the service process and the interrupt processing core of that queue lie in the same cluster or the same die, the relationship between the service processing core and the interrupt processing core can be as shown in FIG. 7. In FIG. 7, corex denotes the service processing core, OSD1 denotes the service process running on corex, port0 to portn denote the multiple destination ports, ethq0 denotes the interrupt queue corresponding to the multiple destination ports, and core0 denotes the interrupt processing core of that queue. The corex and corey in FIG. 7 can lie in the same cluster or the same die, or can be the same core.

In the interrupt processing method provided in this embodiment of the present invention, the multiple TCP connections of one service process in the server are configured to correspond to one interrupt queue, so that the multiple TCP data messages received by the service process over the multiple TCP connections can be stored in a single interrupt queue, and the interrupt processing core configured for that queue is configured to share cache space with the service processing core running the service process, so that the service processing core can access data through the shared cache, thereby reducing data access latency, raising data processing efficiency, and in turn improving system performance.

The solutions provided in the embodiments of the present invention have been introduced above mainly from the perspective of the server. It can be understood that, to realize the above functions, the server includes corresponding hardware structures and/or software modules for performing each function. Those skilled in the art should readily appreciate that, in combination with the devices and algorithm steps of the examples described in the embodiments disclosed herein, the embodiments of the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present application.

The embodiments of the present application may divide the server into functional modules according to the above method examples; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of the present application is schematic and is merely a logical functional division; there may be other division manners in actual implementation.

When each functional module is divided corresponding to each function, FIG. 8 shows a possible schematic structural diagram of the interrupt processing apparatus involved in the above embodiments. The interrupt processing apparatus includes a receiving unit 801, an obtaining unit 802, a first processing unit 803, and a second processing unit 804. The receiving unit 801 is used to perform step 501 in FIG. 5 or FIG. 6 and also step 503 in FIG. 5 or FIG. 6; the obtaining unit 802 and the first processing unit 803 are used to perform step 504 in FIG. 5 or FIG. 6; and the first processing unit 803 and the second processing unit 804 are used to perform step 505 in FIG. 5 or FIG. 6, as well as the other technical processes described herein. The above interrupt processing apparatus may also be a server. All related content of each step involved in the method embodiments can be cited to the functional description of the corresponding functional module, and details are not repeated here.

In hardware implementation, the receiving unit 801 and the obtaining unit 802 may be a communication interface, and the first processing unit 803 and the second processing unit 804 may be a processor.

When the interrupt processing apparatus shown in FIG. 8 implements the interrupt processing method shown in FIG. 5 or FIG. 6 through software, the interrupt processing apparatus and its modules may also be software modules.

FIG. 2 shows a possible schematic logical structure of the server involved in the above embodiments of the present invention. The processor 202 in the server can include multiple cores, which may be the cores of one CPU or the cores of multiple CPUs, and the multiple cores can include an interrupt processing core and a service processing core; the interrupt processing core is used to perform the operations of steps 501 to 505 in FIG. 5 or FIG. 6, and the service processing core is used to perform the operations of steps 500a and 500b in FIG. 6.

In another embodiment of the present application, as shown in FIG. 9, a processor is further provided. The processor can include multiple cores, the multiple cores including an interrupt processing core 901 and a service processing core 902, and the processor can be used to perform the interrupt processing method provided in FIG. 5 or FIG. 6. The interrupt processing core 901 and the service processing core 902 may be the same core, or may belong to the same cluster, or may belong to the same logical unit. FIG. 9 takes the interrupt processing core 901 and the service processing core 902 being two different cores as an example.

The above embodiments can be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above embodiments can be implemented in whole or in part in the form of a computer program product, which includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the procedures or functions according to the embodiments of the present invention are produced in whole or in part. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium can be any usable medium accessible to a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium can be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium. The semiconductor medium can be a solid state drive (SSD).

In another embodiment of the present application, a chip system is further provided. The chip system includes a processor, a memory, a communication interface, and a bus; the processor, the memory, and the communication interface are connected by the bus, and the memory stores code and data. When the processor runs the code in the memory, the chip system is caused to perform the interrupt processing method provided in FIG. 5 or FIG. 6.

In the present application, the multiple TCP connections of one service process in the server are configured to correspond to one interrupt queue, so that the multiple TCP data messages received by the service process over the multiple TCP connections can be stored in a single interrupt queue, and the interrupt processing core configured for that queue is configured to share cache space with the service processing core running the service process, so that the service processing core can access data through the shared cache, thereby reducing data access latency, raising data processing efficiency, and in turn improving system performance.

The above is merely the specific implementation of the present application, but the protection scope of the present application is not limited thereto; any variation or replacement within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

  1. An interrupt processing method, applied to a server whose central processing unit (CPU) includes multiple cores, the multi-core CPU including an interrupt processing core and a service processing core on which a service process runs, the method comprising:
    receiving, by the interrupt processing core, an interrupt processing request, the interrupt processing request being used to request processing of at least one of multiple TCP data messages of the service process stored in an interrupt queue, the destination port of each of the multiple TCP data messages corresponding to the same interrupt queue;
    obtaining, by the interrupt processing core, the at least one TCP data message from the interrupt queue;
    determining, by the interrupt processing core, the service processing core according to the at least one TCP data message, the interrupt processing core and the service processing core sharing cache space; and
    waking up, by the interrupt processing core, the service processing core, so that the service processing core processes the at least one TCP data message.
  2. The method according to claim 1, wherein the interrupt processing core and the service processing core are the same core of one CPU; or
    the service processing core and the interrupt processing core belong to the same cluster; or
    the service processing core and the interrupt processing core belong to the same logical unit (die).
  3. The method according to claim 1 or 2, wherein the server includes multiple interrupt queues, the destination ports usable by the service process include multiple destination ports, and before the interrupt processing core obtains the interrupt processing request, the method further comprises:
    determining, by the service processing core, a correspondence between the multiple interrupt queues and the multiple destination ports, each interrupt queue corresponding to one destination-port set, one destination-port set including multiple destination ports; and
    establishing, by the service processing core, multiple TCP connections of the service process through one destination-port set, the multiple TCP connections being used to transmit the TCP data messages of the service process.
  4. The method according to claim 3, wherein the determining, by the service processing core, the correspondence between the multiple interrupt queues and the multiple destination ports comprises:
    obtaining, according to each of the multiple destination ports and a specified hash value, the interrupt queue corresponding to each destination port, so as to obtain the correspondence between the multiple interrupt queues and the multiple destination ports.
  5. The method according to claim 4, wherein the specified hash value differs when the server includes different types of network interface cards.
  6. An interrupt processing apparatus, comprising:
    a receiving unit, configured to receive an interrupt processing request, the interrupt processing request being used to request processing of at least one of multiple TCP data messages of a service process stored in an interrupt queue, the destination port of each of the multiple TCP data messages corresponding to the same interrupt queue;
    an obtaining unit, configured to obtain the at least one TCP data message from the interrupt queue; and
    a first processing unit, configured to determine a second processing unit according to the at least one TCP data message, the first processing unit and the second processing unit sharing cache space, and to wake up the second processing unit so that the second processing unit processes the at least one TCP data message.
  7. The apparatus according to claim 6, wherein the first processing unit and the second processing unit are the same processing unit; or
    the first processing unit and the second processing unit belong to the same cluster; or
    the first processing unit and the second processing unit belong to the same logical unit (die).
  8. The apparatus according to claim 6 or 7, wherein the apparatus includes multiple interrupt queues, the destination ports usable by the service process include multiple destination ports, and the second processing unit is further configured to:
    determine a correspondence between the multiple interrupt queues and the multiple destination ports, each interrupt queue corresponding to one destination-port set, one destination-port set including multiple destination ports; and
    establish multiple TCP connections of the service process through one destination-port set, the multiple TCP connections being used to transmit the TCP data messages of the service process.
  9. The apparatus according to claim 8, wherein the second processing unit is further configured to:
    obtain, according to each of the multiple destination ports and a specified hash value, the interrupt queue corresponding to each destination port, so as to obtain the correspondence between the multiple interrupt queues and the multiple destination ports.
  10. The apparatus according to claim 9, wherein the specified hash value differs when the apparatus includes different types of network interface cards.
  11. A processor, comprising multiple cores, the multiple cores including an interrupt processing core and a service processing core, the processor being configured to perform the interrupt processing method according to any one of claims 1 to 5.
  12. A server, comprising a memory, a processor, a bus, and a communication interface, the memory storing code and data, the processor, the memory, and the communication interface being connected by the bus, the processor running the code in the memory to cause the server to perform the interrupt processing method according to any one of claims 1 to 5.
PCT/CN2018/100622 2018-02-07 2018-08-15 Interrupt processing method, apparatus, and server WO2019153702A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/987,014 US20200364080A1 (en) 2018-02-07 2020-08-06 Interrupt processing method and apparatus and server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810124945.2A CN110119304B (zh) 2018-02-07 2018-02-07 一种中断处理方法、装置及服务器
CN201810124945.2 2018-02-07

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/987,014 Continuation US20200364080A1 (en) 2018-02-07 2020-08-06 Interrupt processing method and apparatus and server

Publications (1)

Publication Number Publication Date
WO2019153702A1 true WO2019153702A1 (zh) 2019-08-15

Family

ID=67519647

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/100622 WO2019153702A1 (zh) 2018-02-07 2018-08-15 一种中断处理方法、装置及服务器

Country Status (3)

Country Link
US (1) US20200364080A1 (zh)
CN (1) CN110119304B (zh)
WO (1) WO2019153702A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111447155A (zh) * 2020-03-24 2020-07-24 广州市百果园信息技术有限公司 Data transmission method, apparatus, device, and storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112306693B (zh) * 2020-11-18 2024-04-16 支付宝(杭州)信息技术有限公司 Data packet processing method and device
CN113037649B (zh) * 2021-05-24 2021-09-07 北京金山云网络技术有限公司 Method and apparatus for sending and receiving network interrupt data packets, electronic device, and storage medium
CN114741214B (zh) * 2022-04-01 2024-02-27 新华三技术有限公司 Data transmission method, apparatus, and device
CN115225430A (zh) * 2022-07-18 2022-10-21 中安云科科技发展(山东)有限公司 High-performance IPsec VPN CPU load balancing method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101013383A (zh) * 2007-02-13 2007-08-08 杭州华为三康技术有限公司 System and method for implementing joint packet processing by a multi-core CPU
US20110087815A1 (en) * 2009-10-13 2011-04-14 Ezekiel John Joseph Kruglick Interrupt Masking for Multi-Core Processors
CN102077181A (zh) * 2008-04-28 2011-05-25 惠普开发有限公司 Method and system for generating and delivering inter-processor interrupts in a multi-core processor and in certain shared-memory multiprocessor systems

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6957281B2 (en) * 2002-01-15 2005-10-18 Intel Corporation Ingress processing optimization via traffic classification and grouping
US7076545B2 (en) * 2002-07-31 2006-07-11 Sun Microsystems, Inc. Load balancing the servicing of received packets
US20070168525A1 (en) * 2006-01-18 2007-07-19 Deleon Baltazar Iii Method for improved virtual adapter performance using multiple virtual interrupts
US20070271401A1 (en) * 2006-05-16 2007-11-22 Eliel Louzoun Techniques to moderate interrupt transfer
US8316368B2 (en) * 2009-02-05 2012-11-20 Honeywell International Inc. Safe partition scheduling on multi-core processors
US8655974B2 (en) * 2010-04-30 2014-02-18 International Business Machines Corporation Zero copy data transmission in a software based RDMA network stack
CN102929819B (zh) * 2012-10-19 2016-02-10 北京忆恒创源科技有限公司 Method for processing interrupt requests of storage devices in a computer system
US9756138B2 (en) * 2013-04-08 2017-09-05 Here Global B.V. Desktop application synchronization to process data captured on a mobile device
US20150242344A1 (en) * 2014-02-27 2015-08-27 International Business Machines Corporation Delaying floating interruption while in tx mode
CN104023250B (zh) * 2014-06-13 2015-10-21 腾讯科技(深圳)有限公司 Streaming-media-based real-time interaction method and system
US9667321B2 (en) * 2014-10-31 2017-05-30 Pearson Education, Inc. Predictive recommendation engine
CN106557358B (zh) * 2015-09-29 2020-08-11 北京东土军悦科技有限公司 Data storage method and apparatus based on a dual-core processor
CN105511964B (zh) * 2015-11-30 2019-03-19 华为技术有限公司 I/O request processing method and apparatus
CN106357808B (zh) * 2016-10-25 2019-09-24 Oppo广东移动通信有限公司 Data synchronization method and apparatus
US10776385B2 (en) * 2016-12-02 2020-09-15 Vmware, Inc. Methods and apparatus for transparent database switching using master-replica high availability setup in relational databases
US10397096B2 (en) * 2017-04-28 2019-08-27 International Business Machines Corporation Path resolution in InfiniBand and ROCE networks

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101013383A (zh) * 2007-02-13 2007-08-08 杭州华为三康技术有限公司 System and method for implementing joint packet processing by a multi-core CPU
CN102077181A (zh) * 2008-04-28 2011-05-25 惠普开发有限公司 Method and system for generating and delivering inter-processor interrupts in a multi-core processor and in certain shared-memory multiprocessor systems
US20110087815A1 (en) * 2009-10-13 2011-04-14 Ezekiel John Joseph Kruglick Interrupt Masking for Multi-Core Processors

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111447155A (zh) * 2020-03-24 2020-07-24 广州市百果园信息技术有限公司 Data transmission method, apparatus, device, and storage medium
CN111447155B (zh) * 2020-03-24 2023-09-19 广州市百果园信息技术有限公司 Data transmission method, apparatus, device, and storage medium

Also Published As

Publication number Publication date
CN110119304A (zh) 2019-08-13
CN110119304B (zh) 2021-08-31
US20200364080A1 (en) 2020-11-19

Similar Documents

Publication Publication Date Title
US11841814B2 (en) System with cache-coherent memory and server-linking switch
WO2019153702A1 (zh) Interrupt processing method, apparatus, and server
EP3916566A1 (en) System and method for managing memory resources
EP3057272B1 (en) Technologies for concurrency of cuckoo hashing flow lookup
EP2406723B1 (en) Scalable interface for connecting multiple computer systems which performs parallel mpi header matching
KR100925572B1 (ko) 상이한 길이의 캐시 위치 내의 캐시 코히어런시를 위한시스템, 방법, 프로세스 및 장치
CN109582223B (zh) 一种内存数据迁移的方法及装置
US11922537B2 (en) Resiliency schemes for distributed storage systems
WO2019233322A1 (zh) Resource pool management method and apparatus, resource pool control unit, and communication device
Cassell et al. Nessie: A decoupled, client-driven key-value store using RDMA
WO2023093418A1 (zh) Data migration method and apparatus, and electronic device
CN115344551A (zh) Data migration method and data node
WO2016049807A1 (zh) Cache directory processing method and directory controller for a multi-core processor system
US9288163B2 (en) Low-latency packet receive method for networking devices
WO2023104194A1 (zh) Service processing method and apparatus
WO2023124304A1 (zh) Chip cache system, data processing method, device, storage medium, and chip
US20190050274A1 (en) Technologies for synchronizing triggered operations
WO2023231572A1 (zh) Container creation method and apparatus, and storage medium
CN108762666B (zh) Storage system access method, system, medium, and device
CN116501456A (zh) Systems, methods, and devices for queue management with a coherent interface

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18905153

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18905153

Country of ref document: EP

Kind code of ref document: A1