WO2018040750A1 - A configuration method, apparatus, and data processing server (一种配置方法、装置和数据处理服务器) - Google Patents

A configuration method, apparatus, and data processing server (一种配置方法、装置和数据处理服务器)

Info

Publication number
WO2018040750A1
Authority
WO
WIPO (PCT)
Prior art keywords
processor cores
data processing
operating system
threads
thread
Prior art date
Application number
PCT/CN2017/092517
Other languages
English (en)
French (fr)
Inventor
单卫华
李嘉
熊林
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2018040750A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/4881 — Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/455 — Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45558 — Hypervisor-specific management and integration aspects
    • G06F 9/5038 — Allocation of resources to service a request, the resource being a machine, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F 2009/4557 — Distribution of virtual machine instances; migration and load balancing
    • G06F 2209/5018 — Thread allocation
    • G06F 2209/5021 — Priority

Definitions

  • the present invention relates to the field of data processing, and in particular, to a configuration method, apparatus, and data processing server.
  • Big data processing is generally characterized by four attributes: scale, diversity, velocity, and value.
  • Data volume and generation speed are greater than ever, placing higher demands on big data processing and giving rise to fast data processing.
  • Fast data processing has stringent requirements on latency and concurrency. For example, in the financial field, even small latency jitter may cause anti-fraud detection to time out, allowing fraudulent transactions to pass and resulting in economic losses.
  • the embodiment of the invention discloses a configuration method, a device and a data processing server, which can reduce the delay of data processing.
  • the present application provides a configuration method.
  • the data processing server is configured to perform real-time processing on data.
  • the data processing server has K processor cores, and K is an integer greater than 1.
  • The number of processor cores the operating system configures for exclusive use by the management thread is N, where N is less than K and N is an integer greater than 0.
  • The number of processor cores the operating system configures for exclusive use by the worker threads is L, where L is less than K and L is an integer greater than 0.
  • The operating system selects N unbound processor cores from the K processor cores and binds the management thread of the data processing process to them; the management thread runs exclusively on the N processor cores, and no other thread is allowed to run on them. The binding may work as follows: after the management thread is created, the operating system allocates a thread id for it; each of the K processor cores is pre-assigned a serial number (for example, numbered from 0); the operating system obtains the thread id of the created management thread and the serial numbers of the selected N processor cores, and binds the thread id to those serial numbers.
  • The operating system creates the L worker threads of the data processing process, selects L unbound processor cores from the K processor cores, and binds the L worker threads to the L processor cores, one worker thread per core; each worker thread runs exclusively on its bound processor core.
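On Linux, the exclusive-binding step above corresponds to the `sched_setaffinity` system call, exposed in Python as `os.sched_setaffinity` (pid 0 targets the calling thread). The sketch below is illustrative, not the patent's implementation; the function name `pin_current_thread` and the core choice are ours:

```python
import os
import threading

def pin_current_thread(core_id):
    """Bind the calling thread to a single processor core.

    On Linux, passing pid 0 to sched_setaffinity targets the
    calling thread, so each worker can pin itself after creation.
    """
    os.sched_setaffinity(0, {core_id})
    return os.sched_getaffinity(0)

def worker(core_id, results):
    # Each worker thread pins itself to its bound core; its
    # affinity mask then contains only that core.
    results[core_id] = pin_current_thread(core_id)

if __name__ == "__main__":
    # Pick a core that is actually available to this process.
    core = min(os.sched_getaffinity(0))
    results = {}
    t = threading.Thread(target=worker, args=(core, results))
    t.start()
    t.join()
    print(results[core])
```

Note that pinning alone gives each worker a fixed core but does not by itself keep other threads off that core; full exclusivity additionally requires reserving the cores, as the patent's operating-system binding step describes.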
  • The management thread performs management and scheduling functions and exchanges data with external devices; for example, it receives or sends business data or execution code, and schedules worker threads to execute code or process business data.
  • The worker threads perform the data processing tasks, processing business data according to the execution code; the L worker threads can execute in parallel.
  • By configuring processor cores for the management thread and the worker threads of the data processing process, the management thread is bound to a preset number of processor cores after it is created, so that it runs exclusively on the specified cores; after the worker threads are created, they are bound to an equal number of processor cores, each worker thread monopolizing one core. In this way, the task waiting caused by multiple worker threads competing for CPU time slices is avoided: a single processor core is occupied by a single worker thread and its CPU time slices are not shared with other worker threads, which effectively reduces the delay of data processing.
  • In a possible implementation, before the number of processor cores configured for the management thread is set to N, the method further includes: the operating system configures the number of processor cores for itself as M, where M is less than K and M is an integer greater than 0; M unbound processor cores are then selected from the K processor cores so that the operating system runs on the selected M cores.
  • The operating system can be bound to the M processor cores in either of two ways.
  • 1. After configuring the number of processor cores M for itself, the operating system stores a configuration file containing the quantity M in non-volatile memory (for example, a mechanical disk or a solid-state disk) and then restarts; on restart, the BIOS reads the configuration file, selects M of the K processor cores according to the quantity M, and runs the operating system on those M cores.
  • 2. The operating system currently runs on one or more processor cores; after configuring the number M, it selects M processor cores from the K cores and migrates from the cores it currently runs on to the selected M cores.
  • Besides the operating system itself, all other processes running in the process space, except the data processing process, also need to be bound to the M processor cores.
  • The operating system runs exclusively on the selected M processor cores, which prevents it from preempting the CPU time slices of the worker threads and making data processing tasks wait, further reducing the delay of data processing.
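On mainstream Linux systems, a common mechanism for the restart-based variant of this core reservation is the `isolcpus` kernel boot parameter, which removes the listed cores from the general scheduler so that only explicitly pinned threads run on them. The core range below is an illustrative example, not from the patent:

```
# Example /etc/default/grub entry (illustrative core range):
GRUB_CMDLINE_LINUX="isolcpus=2-7"
# Apply the new kernel command line (e.g. update-grub) and reboot.
```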
  • In a possible implementation, a VMM (Virtual Machine Monitor) runs on the data processing server.
  • The VMM selects K processor cores, assigns them to a virtual machine, and generates K virtual processor cores in the virtual machine.
  • The VMM maps the K physical processor cores to the K virtual processor cores one to one.
  • The virtual machine containing the K virtual processor cores runs the operating system.
  • Performing data processing in a virtual machine deployed on the data processing server improves the utilization of the server's physical resources.
  • In a possible implementation, the operating system acquires the interrupt number of an interrupt request and the serial numbers of the selected N processor cores, associates the serial numbers of the N cores with the interrupt number, and generates interrupt mapping information.
  • Interrupt handling then proceeds as follows: the operating system receives an interrupt request, obtains its interrupt number, queries the interrupt mapping information for the N processor cores associated with that interrupt, and notifies the N processor cores to process the interrupt request.
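On Linux, this kind of interrupt-to-core mapping is commonly expressed as a hexadecimal CPU bitmask written to `/proc/irq/<irq>/smp_affinity`. A small helper for building such a mask from the serial numbers of the N cores (the helper name is ours, not the patent's):

```python
def cores_to_smp_affinity(cores):
    """Build the hex bitmask used by /proc/irq/<irq>/smp_affinity.

    Bit i of the mask corresponds to processor core i, so an
    interrupt bound to cores {0, 1} gets the mask "3".
    """
    mask = 0
    for core in cores:
        mask |= 1 << core
    return format(mask, "x")

if __name__ == "__main__":
    print(cores_to_smp_affinity({0, 1}))  # -> 3
    print(cores_to_smp_affinity({4}))     # -> 10
```

Writing the mask requires root, and some systems run an `irqbalance` daemon that may rewrite it; both are operational details outside the patent text.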
  • In a possible implementation, the operating system configures the scheduling type of the L worker threads as a real-time scheduling type according to preset scheduling-type configuration information.
  • By setting the scheduling type of the worker threads to a real-time type, each worker thread can keep running on its processor core, avoiding interruption of data processing tasks and reducing processing delay.
  • In a possible implementation, the operating system sets the priority of the L worker threads to the highest priority according to preset priority configuration information. This prevents higher-priority threads from preempting a worker thread's processor core, keeps the worker thread running on its core, and reduces its processing delay.
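On Linux, the real-time scheduling class and highest priority described here correspond to `SCHED_FIFO` at `sched_get_priority_max`. A sketch under that assumption (raising the scheduling class needs root or `CAP_SYS_NICE`, hence the fallback):

```python
import os

def make_realtime(pid=0):
    """Try to move a thread/process (pid 0 = the caller) to
    SCHED_FIFO at the highest priority; report success.
    """
    top = os.sched_get_priority_max(os.SCHED_FIFO)
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(top))
        return True
    except PermissionError:
        # Unprivileged processes may not raise their scheduling class.
        return False

if __name__ == "__main__":
    print("realtime:", make_realtime())
```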
  • the method further includes:
  • The management thread receives new execution code or business instructions sent by a client and puts them into a lock-free queue, from which the L worker threads take execution code or business data without lock contention.
  • Because the management thread and the worker threads communicate through the lock-free queue, resource competition among worker threads is avoided and the time worker threads wait before performing data processing tasks is reduced.
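The lock-free hand-off can be illustrated with a single-producer/single-consumer ring buffer, in which the producer advances only the tail and the consumer advances only the head, so neither side needs a lock. This is a simplified sketch; a queue serving L competing worker threads, as the patent implies, would additionally need an atomic compare-and-swap on the head index:

```python
class SpscRing:
    """Single-producer/single-consumer ring buffer sketch.

    The producer writes only `tail`, the consumer writes only
    `head`; with one thread on each side no lock is required.
    """

    def __init__(self, capacity):
        self._buf = [None] * (capacity + 1)  # one slot kept empty
        self._head = 0  # next slot to read (consumer-owned)
        self._tail = 0  # next slot to write (producer-owned)

    def put(self, item):
        nxt = (self._tail + 1) % len(self._buf)
        if nxt == self._head:
            return False          # queue full
        self._buf[self._tail] = item
        self._tail = nxt          # publish only after the write
        return True

    def get(self):
        if self._head == self._tail:
            return None           # queue empty
        item = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)
        return item

if __name__ == "__main__":
    q = SpscRing(2)
    q.put("code-v2")
    q.put("txn-1")
    print(q.get(), q.get(), q.get())
```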
  • In a possible implementation, the operating system allocates a private memory space for each of the L worker threads according to preset memory mapping information; the private memory space stores that worker thread's private data.
  • This shared-nothing structure prevents worker threads from contending for each other's working space and reduces processing time.
  • In a possible implementation, the method further includes: the operating system receives a memory access request sent by a worker thread (any one of the L worker threads), where the request carries a memory address; the operating system determines whether the memory address falls within the private memory space associated with that worker thread; if so, it sends the memory access request to the associated private memory space for processing; if not, it sends the request to the shared memory space for processing.
  • In a possible implementation, before the data processing process is created, the method further includes: the operating system configures the memory mapping information, which indicates that every worker thread of the data processing process is allocated its own private memory space.
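The routing decision above reduces to a range test over per-thread private regions. A minimal sketch; the region layout (base address, region size) is invented for illustration:

```python
def build_private_ranges(base, size, num_threads):
    """Carve `num_threads` private regions of `size` bytes out of
    the address space starting at `base` (illustrative layout)."""
    return {t: (base + t * size, base + (t + 1) * size)
            for t in range(num_threads)}

def route_access(addr, thread_id, private_ranges):
    """Decide which space should serve a memory access:
    the thread's own private space, or the shared space."""
    lo, hi = private_ranges[thread_id]
    return "private" if lo <= addr < hi else "shared"

if __name__ == "__main__":
    ranges = build_private_ranges(base=0x1000, size=0x100, num_threads=4)
    # 0x1180 lies inside thread 1's region but not thread 0's.
    print(route_access(0x1180, 1, ranges))  # private
    print(route_access(0x1180, 0, ranges))  # shared
```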
  • the second aspect of the present application provides a configuration apparatus, including: a configuration module and a processing module;
  • The configuration module configures the number of processor cores for the management thread as N and the number for the worker threads as L, where N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0. The processing module creates a management thread for the data processing process, selects N processor cores from the K processor cores, and binds the management thread of the data processing process to the N cores; it also creates L worker threads for the data processing process, selects L unbound processor cores from the K cores, and binds the L worker threads of the data processing process to the L processor cores one to one.
  • By configuring processor cores for the management thread and the worker threads of the data processing process, the management thread is bound to a preset number of processor cores after it is created, so that it runs exclusively on the specified cores; after the worker threads are created, they are bound to an equal number of processor cores, each worker thread monopolizing one core. In this way, the task waiting caused by multiple worker threads competing for CPU time slices is avoided: a single processor core is occupied by a single worker thread and its CPU time slices are not shared with other worker threads, which effectively reduces the delay of data processing.
  • the configuration module is further configured to configure the number of processor cores for the operating system to be M; wherein, M is less than K and M is an integer greater than 0;
  • the processing module is also used to select M processor cores from K processor cores and run an operating system on the M processor cores.
  • The operating system runs exclusively on the selected M processor cores, which prevents it from preempting the CPU time slices of the worker threads and making data processing tasks wait, further reducing the delay of data processing.
  • In a possible implementation, the operating system runs in a virtual machine; the virtual machine includes K virtual processor cores, and the K virtual processor cores have a one-to-one mapping relationship with the K physical processor cores.
  • In the above embodiment, performing data processing through a deployed virtual machine improves the utilization of physical resources.
  • the configuration module is further configured to associate the N processor cores with the interrupt request to generate The mapping information is interrupted; the processing module is further configured to receive the interrupt request; query the N processor cores associated with the interrupt request according to the interrupt mapping information; and notify the N processor cores to process the interrupt request.
  • The processing module is further configured to set the scheduling type of the L worker threads to a real-time scheduling type according to preset scheduling-type configuration information.
  • This keeps each worker thread running on its corresponding processor core, avoiding interruption of data processing tasks and reducing processing delay.
  • The processing module is further configured to set the priority of the L worker threads to the highest priority according to preset priority configuration information. This prevents higher-priority threads from preempting a worker thread's processor core, keeps the worker thread running on its core, and reduces its processing delay.
  • A third aspect of the present application provides a data processing server, including K processor cores and a memory. The K processor cores call code in the memory to perform the following operations: configure the number of processor cores for the operating system as M, where M is less than K and M is an integer greater than 0; select M unbound processor cores from the K processor cores. The M processor cores call code in the memory to perform the following operation: run the operating system.
  • the M processor cores are further configured to perform:
  • Configure the number of processor cores for the management thread as N and the number for the worker threads as L, where N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0.
  • After the operating system runs exclusively on the M processor cores, the M processor cores are further used to execute:
  • the N processor cores call the code in the memory to perform the following operations: running the management thread;
  • The L processor cores call the code in the memory to perform the following operation: each runs the one worker thread bound to it.
  • In a possible implementation, a virtual machine is deployed in the data processing server and the operating system is installed in the virtual machine; the virtual machine includes K virtual processor cores, and the K virtual processor cores have a one-to-one mapping relationship with the K physical processor cores.
  • In a possible implementation, the M processor cores are further configured to: associate the N processor cores with interrupt requests and generate interrupt mapping information;
  • N processor cores are also used to execute:
  • the interrupt request to be processed is processed according to the interrupt mapping information.
  • the M processor cores are further configured to perform:
  • the L worker thread scheduling types are set to the real-time scheduling type according to the preset scheduling type configuration information.
  • the M processor cores are further configured to perform:
  • the priority of the L worker threads is set to the highest priority according to the preset priority configuration information.
  • FIG. 1 is a schematic structural diagram of a data processing system according to an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a data processing server according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of still another data processing server according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of still another data processing server according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a configuration method according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of still another configuration method according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a configuration apparatus according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of still another data processing server according to an embodiment of the present invention.
  • FIG. 1 is a network structure diagram of a data processing system according to an embodiment of the present invention.
  • A data processing system includes a client cluster 10 and a data processing server cluster 11.
  • The client cluster 10 includes client 101, client 102, ..., client 10n.
  • The data processing server cluster 11 includes a plurality of data processing servers: data processing server 111, data processing server 112, data processing server 113, ..., data processing server 11n.
  • A client in the client cluster 10 may evenly distribute data processing tasks across the data processing servers in the data processing server cluster 11 according to a load-balancing algorithm.
  • For example, one of the n clients generates data to be processed; the client hashes the data with a hash algorithm to obtain a hash value, then takes the hash value modulo m, where m is the number of data processing servers in the data processing server cluster 11.
  • Each data processing server in the cluster is preset with a serial number; the remainder obtained above is used as the serial number of the target data processing server, and the client sends the data to be processed to that server.
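The hash-and-modulo routing can be sketched with a deterministic checksum such as CRC32 (Python's built-in `hash` is salted per process, so a stable function is used instead; the payloads and server count are illustrative):

```python
import zlib

def pick_server(payload, num_servers):
    """Route data to a server: hash the payload, then take the
    remainder modulo the number of servers in the cluster."""
    h = zlib.crc32(payload)
    return h % num_servers

if __name__ == "__main__":
    m = 5  # servers numbered 0..4
    for payload in (b"txn-1001", b"txn-1002", b"txn-1001"):
        print(payload, "->", pick_server(payload, m))
```

Identical payloads always map to the same server, which is the property the load-balancing scheme relies on.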
  • the data processing server runs a data processing process, an operating system, and other processes other than the data processing process.
  • the data processing process includes a management thread and a worker thread.
  • the data processing server includes a processor core resource pool, and the processor core resource pool includes a plurality of processor cores.
  • The data processing server performs the relevant configuration and binding before performing data processing.
  • The configuration and binding process includes: configuring the number of processor cores for the operating system, the number for the management thread of the data processing process, and the number for the worker threads of the data processing process; selecting unbound processor cores from the processor core resource pool according to the preset number and binding them to the operating system, which then runs exclusively on the selected cores.
  • The operating system creates the data processing process and its management thread, selects unbound processor cores from the processor core resource pool according to the preset number, and binds the selected cores to the management thread so that it runs exclusively on them.
  • The operating system creates worker threads according to the preset number, selects the same number of unbound processor cores from the processor core resource pool, and binds the selected cores to the created worker threads, each worker thread bound to one processor core.
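The configure-and-bind sequence above can be modeled with a small resource-pool sketch (class and method names are ours, not the patent's): cores taken by the operating system, the management thread, and each worker thread never overlap.

```python
class CorePool:
    """Tracks which processor cores are still unbound."""

    def __init__(self, total_cores):
        self._free = set(range(total_cores))
        self.bindings = {}

    def bind(self, owner, count):
        """Select `count` unbound cores and bind them to `owner`."""
        if count > len(self._free):
            raise RuntimeError("not enough unbound cores")
        chosen = {self._free.pop() for _ in range(count)}
        self.bindings[owner] = chosen
        return chosen

if __name__ == "__main__":
    K, M, N, L = 16, 2, 1, 8          # illustrative counts
    pool = CorePool(K)
    pool.bind("operating-system", M)
    pool.bind("management-thread", N)
    for i in range(L):                 # one core per worker thread
        pool.bind(f"worker-{i}", 1)
    allcores = [c for s in pool.bindings.values() for c in s]
    print(len(allcores) == len(set(allcores)))  # all bindings disjoint
```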
  • the data processing server includes three architectures, a bare metal architecture, a virtual machine architecture, and a container architecture.
  • FIG. 2 is a data processing server of a bare metal architecture
  • The data processing server 2 includes a processor core resource pool 22 containing K processor cores.
  • The data processing server 2 configures M processor cores for the operating system, N processor cores for the management thread, and L processor cores for the worker threads.
  • One configuration method is for the operating system to execute the above configuration process after startup.
  • The operating system can be bound to processor cores in either of two ways: the operating system selects M unbound processor cores from the K cores and migrates itself from the one or more cores it currently runs on to the selected M cores; or, after the configuration is completed, the operating system restarts.
  • In the latter case, the BIOS reads the configuration file stored in non-volatile memory, which includes the number of processor cores for the operating system.
  • The BIOS selects M processor cores from the K processor cores and runs the operating system on the selected M cores.
  • The operating system creates a data processing process 20 and creates management threads 200 according to the number N; the number of created management threads 200 may be one or more, and one is taken as an example in FIG. 2.
  • the operating system selects the unbound N processor cores from the K processor cores, binds the N processor cores to the created management thread 200, and the management thread 200 exclusively runs on the N processor cores.
  • The operating system creates L worker threads based on the number L, selects L unbound processor cores from the K processor cores, and binds the L processor cores to the L worker threads; each worker thread runs exclusively on one processor core.
  • The data processing server 2 further includes a memory resource pool 23 containing multiple memories; the operating system 21 selects a memory space from the memory resource pool and binds the selected memory space to itself.
  • The operating system allocates a separate memory region for the management thread and for each worker thread, so the memory spaces of the operating system, the management thread, and each worker thread are independent of one another.
  • FIG. 3 is a data processing server of a virtual machine architecture.
  • the data processing server 3 includes a processor core resource pool 33, and the processor core resource pool 33 includes a plurality of processor cores.
  • One or more virtual machines are deployed in the data processing server 3.
  • Three virtual machines are taken as an example.
  • A VMM (Virtual Machine Monitor) runs on the data processing server.
  • the hardware resources include CPU resources, memory resources, and IO resources.
  • The VMM provides the virtual machines with the ability to access the hardware resources of the data processing server.
  • When allocating CPU resources to a virtual machine, the VMM must ensure that each virtual processor core of the virtual machine maps to one physical processor core.
  • Each virtual machine runs a data processing process, an operating system, and other processes.
  • the data processing process includes a management thread and a worker thread.
  • The following describes the configuration method of the present application using virtual machine 30 as an example: the VMM selects K processor cores from the processor core resource pool and allocates them to virtual machine 30; the VMM virtualizes the allocated K processor cores so that each virtual processor core of virtual machine 30 maps to one physical processor core.
  • Unless otherwise noted, the following description does not distinguish between physical and virtual processor cores and refers to both simply as processor cores.
  • Operating system 302 configures the number of processor cores for the operating system as M, the number for the management thread as N, and the number for the worker threads as L. Operating system 302 selects M unbound processor cores from the K processor cores and migrates itself from the one or more cores it currently runs on to the selected M cores.
  • Operating system 302 then creates management threads according to the number N (the number of management threads equals N), selects N unbound processor cores from the K cores, and binds the created management threads to them; it also creates L worker threads, selects L unbound processor cores from the K cores, and binds the L worker threads to the L cores, each worker thread bound to one processor core.
  • The data processing server 3 further includes a memory resource pool, which contains one or more memories; for each virtual machine, the operating system, the management thread, and every worker thread included in that virtual machine are each allocated a separate memory space.
  • FIG. 4 is a data processing server of a container architecture.
  • the data processing server 4 includes a processor core resource pool, and the processor core resource pool includes K processor cores.
  • One or more containers are deployed in the data processing server 4, and each container runs a data processing process, so that, through the isolation provided by containers, multiple data processing processes can run on one data processing server.
  • the operating system and other processes run on the data processing server.
  • The data processing process includes management threads and worker threads. In each container, the management thread runs exclusively on a preset number of processor cores, and the worker threads run on a preset number of processor cores, each worker thread running exclusively on one processor core.
  • The operating system 42 configures the number of processor cores for the operating system as M, the number for the management thread as N, and the number for the worker threads as L.
  • The operating system selects M unbound processor cores from the K processor cores and migrates itself from the one or more cores it currently runs on to the M selected cores; the operating system then creates a data processing process in container 40.
  • The operating system creates management threads according to the number N.
  • The operating system selects N unbound processor cores from the K processor cores, binds the N cores to the management thread, and the management thread runs exclusively on the N cores.
  • The operating system creates L worker threads according to the number L.
  • The operating system selects L unbound processor cores from the K processor cores, and the data processing process binds the L cores to the L worker threads, each worker thread running exclusively on one processor core.
  • the data processing server 4 further includes a memory resource pool 44.
  • the memory resource pool 44 includes one or more memories. For each container, the management thread and each worker thread are allocated an independent memory space; the operating system and other processes are also allocated an independent memory space.
  • by configuring the numbers of processor cores for the management threads and worker threads of the data processing process, the management thread is bound at creation to a preset number of processor cores so that it runs exclusively on those cores, and the worker threads are bound at creation to an equal number of processor cores so that each worker thread monopolizes one processor core. In this way, the task waiting caused by multiple worker threads competing for CPU time slices is avoided: a single processor core is occupied by a single worker thread and does not share CPU time slices with other worker threads, which effectively reduces the delay of data processing.
  • FIG. 5 is a schematic flowchart of a configuration method according to an embodiment of the present invention, where the method includes but is not limited to the following steps.
  • the number of processor cores configured to manage threads is N.
  • the management thread is used to manage the worker thread, schedule the worker thread, and perform data interaction with the external device; the worker thread is used to execute the data processing task.
  • the data processing process includes a management thread and a worker thread, and both the operating system and the data processing process run on the data processing server.
  • the data processing server may include one processor core resource pool, the number of processor cores in which is K. The operating system may maintain a state information table that stores an entry recording the state of each processor core; the state of a processor core is either bound or unbound. The bound state indicates that a thread or process is exclusively bound to the core and no other process or thread can use it.
  • the unbound state indicates that the processor core can be occupied by any process or thread. When the state of any processor core in the resource pool changes, the operating system updates that core's entry in the state record table.
  • after startup, the operating system may configure the number of processor cores for management threads as N, where N is less than K and both N and K are positive integers.
  • the number of processor cores configured for the worker threads is L.
  • the number of processor cores that the operating system configures for the worker thread to be exclusive is L, L is less than K, and L is a positive integer.
  • the operating system creates one or more management threads for the data processing process according to the preset number N. When multiple management threads need to be created, the operating system may create the first management thread, which then creates the remaining management threads.
  • preferably, the number of management threads created may be equal to N, so that each management thread occupies one processor core, improving the utilization of the processor cores.
  • the operating system selects the unbound N processor cores from the K processor cores, and exclusively binds all the management threads created in S503 to the N processor cores.
  • the binding method may be: the operating system obtains the thread ids of the created management threads and the sequence numbers of the selected N processor cores and binds them together; after binding, the management threads run exclusively on the N processor cores.
  • after the management threads are running on the N processor cores, the operating system creates L worker threads according to the number L set in S502.
  • the operating system selects L unbound processor cores from the K processor cores, and binds the created L working threads to the L processor cores.
  • the binding method may be: the operating system obtains the thread ids of the created L worker threads and the sequence numbers of the selected L processor cores, and binds the thread ids of the L worker threads to the sequence numbers of the L processor cores one-to-one, so that each worker thread runs exclusively on one processor core.
  • in the method described above, the management threads of the data processing process run exclusively on their bound processor cores and each worker thread runs independently on its bound processor core; the worker threads do not need to compete with other threads for CPU time slices, reducing the delay of data processing.
  • FIG. 6 is a schematic flowchart diagram of still another configuration method according to an embodiment of the present invention, where the method includes but is not limited to the following steps.
  • the number of processor cores configured for the operating system is M.
  • the data processing server includes K processor cores, each of which is a physical processor core. The operating system may maintain a state record table that stores an entry for the state of each processor core; the state of a processor core is either bound or unbound.
  • when the state of any of the K processor cores changes, the operating system updates that core's entry in the state record table.
  • the data processing process includes a management thread and a worker thread, the management thread is used to manage and schedule the worker thread, and interact with external devices for data or instructions; the worker thread is used to perform data processing tasks.
  • the number of processor cores configured to manage threads is N.
  • the data processing server configures the number of processor cores for management threads as N; for example, the operating system configures the number of processor cores for management threads as 4.
  • the number of processor cores configured for the worker thread is L.
  • the number of processor cores configured by the operating system for the worker thread is L, for example, the number of processor cores configured by the operating system for the worker thread is 10.
  • after startup, the operating system runs on one or more processor cores; when the operating system needs to bind processor cores, it selects M processor cores from the K processor cores and migrates from the one or more processor cores it currently runs on to the selected M processor cores.
  • in another possible implementation, after configuring the number of processor cores for the operating system, the operating system generates a configuration file that is stored in non-volatile memory (e.g., a mechanical hard disk or a solid-state drive). After the configuration is complete,
  • the operating system performs a restart operation.
  • during the startup process, the BIOS reads the pre-stored configuration file and selects M processor cores from the K processor cores according to the number M in the configuration file.
  • during startup the states of the K processor cores are all unbound; the selected M processor cores are bound to the operating system, and the operating system runs on the selected M processor cores.
  • while running, the operating system can also be regarded as a process.
  • the operating system can use the system call int sched_setaffinity(pid_t pid, unsigned int cpusetsize, cpu_set_t *mask) to bind a process or thread to one or more specific processor cores.
  • the first parameter, pid, indicates the thread id or process id whose binding information is to be set or retrieved.
  • when the first parameter is 0, it refers to the calling thread; the second parameter, cpusetsize, is generally set to sizeof(cpu_set_t) and indicates the size of the memory object pointed to by the third parameter; the third parameter, mask, is a pointer to a cpu_set_t object used to set or get the list of processor cores bound to the specified thread or process.
  • the taskset command can be used to bind a process to one or more specific CPU cores.
  • the command format is "taskset -pc 3 21184", where "21184" is the process id and "3" is the sequence number of the processor core; the command makes the process with id 21184 run exclusively on the 4th processor core (the first processor core has sequence number 0).
  • "cpuset” means the bound processor core, such as 0-3 or separated by commas such as 0, 3, 4 (0 is the sequence number of the first processor core); process id represents the id of the process.
  • the init process is the ancestor of all processes
  • the operating system creates a management thread of the data processing process, selects unbound N processor cores from the K processor cores, and binds the selected N processor cores to the management thread.
  • for the binding of a thread to a processor core, refer to the example in S604; details are not described herein again.
  • the operating system creates L worker threads for the data processing process according to the pre-configured number L.
  • the operating system selects L unbound processor cores from the K processor cores and binds the L worker threads of the data processing process to the L processor cores, each worker thread uniquely bound to one processor core. For the binding of a worker thread to a processor core, refer to the description of S604; details are not described herein again.
  • the operating system associates the N processor cores with the interrupt request to generate interrupt mapping information.
  • the operating system obtains the interrupt numbers on the system (different interrupt requests correspond to different interrupt numbers) and binds the obtained interrupt numbers to the N processor cores to generate interrupt mapping information. After the interrupt requests are bound to the N processor cores, all interrupt requests subsequently received by the operating system are processed by the N processor cores, and the worker threads do not process any interrupt requests, which prevents the worker threads from being interrupted while performing data processing tasks and reduces delay.
  • the operating system can execute the cat /proc/interrupts command to view the interrupt numbers of interrupt requests.
  • the operating system sets the binding relationship between the interrupt request and the processor core by modifying the /proc/irq/ ⁇ irq_number ⁇ /smp_affinity configuration file.
  • for example, echo 8 > /proc/irq/20/smp_affinity assigns the interrupt request with interrupt number 20 to the processor core with sequence number 3 (smp_affinity takes a hexadecimal CPU bitmask, in which bit 3 corresponds to 0x8).
  • the operating system queries the N processor cores associated with the interrupt request according to the preset interrupt mapping information and notifies the N processor cores to process the interrupt request. By handling interrupt requests on specific processor cores, the processor cores bound to the worker threads do not need to perform any interrupt processing, which avoids interruption of data processing tasks and reduces processing delay.
  • the pthread_attr_setschedpolicy system call is used to set the scheduling type of a thread.
  • the operating system can set the L worker threads to the SCHED_FIFO (first-come, first-served) real-time scheduling type: once a worker thread occupies a processor core, it runs its data processing task until a higher-priority task arrives or it yields.
  • the operating system sets the priority of the L worker threads to the highest priority according to the preset priority configuration information, preventing tasks with a higher priority than the data processing task from preempting the worker threads and reducing the delay of the data processing task.
  • the operating system can use the pthread_attr_setschedparam system call to set the priority of the worker threads.
  • after receiving new execution code, the management thread writes the new execution code to a lock-free queue and notifies the L worker threads in a lock-free manner. Before and after each unit of processing, a worker thread checks the notification state of the lock-free queue; on finding that new execution code needs to be loaded, the worker thread loads the new execution code and runs the data processing task with it.
  • the operating system implements the lock-free queue mainly through atomic operations such as CAS (compare-and-swap) or FAA (fetch-and-add) together with a retry loop, realizing message transfer between the management thread and the worker threads.
  • the management thread receives new execution code or service data;
  • the management thread places the new execution code or service data into the lock-free queue;
  • an idle worker thread takes an item of execution code or service data from the lock-free queue for processing.
  • it also includes:
  • the operating system allocates a private memory space to each of the L worker threads according to the preset memory mapping information.
  • the operating system can use TLS (Thread Local Storage) to associate a memory space with a specified executing thread.
  • the K processor cores are allocated to a virtual machine, and the virtual machine includes K virtual processor cores; the K processor cores and the K virtual processor cores are in a one-to-one mapping relationship, and the operating system runs in the virtual machine.
  • it also includes:
  • the operating system receives a memory access request sent by a worker thread, where the memory access request carries a memory address and the worker thread is any one of the L worker threads;
  • the operating system determines whether the memory address is located in the private memory space associated with the worker thread;
  • if it is, the operating system sends the memory access request to the associated private memory space for processing;
  • if it is not, the operating system sends the memory access request to the shared memory space for processing.
  • before the data processing process is created, the method further includes:
  • the operating system configures the memory mapping information; wherein the memory mapping information indicates that all worker threads created by the data processing process each allocate a private memory space.
  • each worker thread has a private memory space, which prevents multiple worker threads from contending for memory space when storing private data and reduces the delay of worker threads performing data processing tasks.
  • through processor core binding, memory binding, interrupt binding, and the setting of worker thread priority and scheduling type, the worker threads can use dedicated hardware resources to perform data processing tasks, reducing processing delay.
  • FIG. 7 is a schematic structural diagram of a configuration apparatus according to an embodiment of the present invention.
  • the configuration apparatus may include a configuration module 701 and a processing module 702.
  • the configuration apparatus 7 may be implemented by an application-specific integrated circuit (ASIC) or by a programmable logic device (PLD).
  • the PLD may be a complex programmable logic device (CPLD), an FPGA, generic array logic (GAL), or any combination thereof.
  • the configuration device 7 is used to implement the configuration method shown in FIG.
  • the configuration device 7 and its respective modules may also be software modules. A detailed description of each module is as follows.
  • the configuration module 701 is configured to configure the number of processor cores for management threads as N and the number of processor cores for worker threads as L, where N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0.
  • the processing module 702 is configured to create a management thread for the data processing process, select N processor cores from the K processor cores, and bind the management thread of the data processing process to the N processor cores; and to create L worker threads for the data processing process, select L unbound processor cores from the K processor cores, and bind the L worker threads of the data processing process one-to-one to the L processor cores.
  • the configuration module 701 is further configured to configure the number of processor cores for the operating system as M, where M is less than K and M is an integer greater than 0;
  • the processing module 702 is further configured to select M processor cores from the K processor cores, and run the operating system on the M processor cores.
  • the configuration module 701 is further configured to associate the N processor cores with interrupt requests to generate interrupt mapping information;
  • the processing module 702 is further configured to receive an interrupt request, query the N processor cores associated with the interrupt request according to the interrupt mapping information, and notify the N processor cores to process the interrupt request.
  • the operating system runs in a virtual machine, where the virtual machine includes K virtual processor cores, and the K virtual processor cores are in a one-to-one mapping relationship with the K processor cores.
  • processing module 702 is further configured to set the L work thread scheduling types to a real-time scheduling type according to preset scheduling type configuration information.
  • processing module 702 is further configured to set a priority of the L working threads to a highest priority according to preset priority configuration information.
  • through processor core binding, memory binding, interrupt binding, and the setting of worker thread priority and scheduling type, the configuration module enables the worker threads to perform data processing tasks using dedicated hardware resources, thereby reducing processing delay.
  • for details of each module, reference may also be made to the corresponding descriptions of the method embodiments shown in FIG. 5 and FIG. 6.
  • FIG. 8 is a schematic structural diagram of a data processing server according to an embodiment of the present invention.
  • the data processing server 8 includes a processor 801, a memory 802, and a communication interface 803.
  • the communication interface 803 is used to perform data or instruction interaction with an external device.
  • the number of processors 801 in the data processing server 8 may be one or more, and the processor 801 includes K processor cores.
  • the processor 801, the memory 802, and the communication interface 803 may be connected by a bus system or in another manner; they may be connected by wire, or may communicate by wireless transmission or other means.
  • the data processing server 8 can be used to perform the method shown in FIG. For the meanings and examples of the terms involved in this embodiment, reference may be made to the embodiment corresponding to FIG. 5; details are not repeated here.
  • the program code is stored in the memory 802.
  • the K processor cores in the processor 801 are used to call program code stored in the memory 802 for performing the following operations:
  • the number of processor cores configured for the operating system is M; wherein M is less than K and M is an integer greater than 0;
  • the M processor cores call the code in the memory to perform the following operations:
  • the M processor cores are further configured to execute:
  • the number of processor cores configured for management threads is N, and the number of processor cores configured for worker threads is L, where N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0.
  • the M processor cores are also used to execute:
  • the N processor cores call code in the memory to perform the following operations:
  • the L processor cores call code in the memory to perform the following operations:
  • the M processor cores are further configured to execute:
  • the N processor cores are also used to execute:
  • the interrupt request to be processed is processed according to the interrupt mapping information.
  • the operating system runs in a virtual machine, where the virtual machine includes K virtual processor cores, and the K virtual processor cores are in a one-to-one mapping relationship with the K processor cores.
  • the M processor cores are further configured to execute:
  • the M processor cores are further configured to execute:
  • the priority of the L worker threads is set to the highest priority according to preset priority configuration information.
  • the management threads of the data processing process run exclusively on their bound processor cores, and each worker thread runs independently on its bound processor core; the worker threads do not need to compete with other threads for CPU time slices, reducing the latency of data processing.
  • the foregoing storage medium includes various media that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Abstract

A configuration method, a configuration apparatus, and a data processing server, applied to a data processing server including K processor cores. The method includes: configuring the number of processor cores for management threads as N (S501); configuring the number of processor cores for worker threads as L (S502), where L is an integer greater than 0 and less than K; creating a management thread for a data processing process (S503); selecting N processor cores from the K processor cores and binding the management thread of the data processing process to the N processor cores (S504); creating L worker threads for the data processing process (S505); and selecting L unbound processor cores from the K processor cores and binding the L worker threads of the data processing process one-to-one to the L processor cores (S506). The method, the configuration apparatus, and the data processing server can reduce the latency of data processing.

Description

A configuration method, apparatus, and data processing server

TECHNICAL FIELD

The present invention relates to the field of data processing, and in particular to a configuration method, an apparatus, and a data processing server.

BACKGROUND

With the development of Internet services and technologies, data has grown explosively, giving rise to the big-data processing industry. Big-data processing is generally characterized by four properties: volume, variety, velocity, and value. In the wave of the mobile Internet and the Internet of Things, the scale and generation speed of data have accelerated as never before, placing higher demands on big-data processing and giving rise to fast-data processing. Fast-data processing has stringent requirements on latency and concurrency. For example, in the financial field, even a small jitter in latency may cause anti-fraud detection to time out, allowing a fraudulent transaction to pass and causing economic loss.

Some existing fast-data real-time processing platforms address the latency problem by deploying multi-core processors and large memory, but the effective utilization of the CPU (Central Processing Unit) is insufficient, and growth in concurrency still significantly affects latency. Such platforms neither fully exploit the multi-core and large-memory advantages of servers nor adequately solve the problem of concurrency degrading the latency metric.
SUMMARY

Embodiments of the present invention disclose a configuration method, an apparatus, and a data processing server, which can reduce the latency of data processing.

In a first aspect, the present application provides a configuration method. A data processing server is used for real-time data processing and has K processor cores, where K is an integer greater than 1. The operating system configures the number of processor cores exclusively for management threads as N, where N is an integer greater than 0 and less than K, and configures the number of processor cores exclusively for worker threads as L, where L is an integer greater than 0 and less than K. After startup, the operating system may create a data processing process, which is the entity of the program that executes data processing tasks. After the data processing process is created, the operating system may create one or more management threads. The operating system selects N unbound processor cores from the K processor cores and binds the management threads of the data processing process to the selected N processor cores; after binding, the management threads run exclusively on the N processor cores, and no other thread is allowed to run on them. The binding method may be: after a management thread is created, the operating system assigns it a thread id; each of the K processor cores is pre-assigned a sequence number, for example numbered from 0; the operating system obtains the thread id of the created management thread and the sequence numbers of the selected N processor cores, and binds the thread id to those sequence numbers. The operating system creates L worker threads for the data processing process, selects L unbound processor cores from the K processor cores, and binds the L worker threads to the L processor cores, one worker thread per processor core, so that each worker thread runs exclusively on the one processor core bound to it. The management threads perform management and scheduling functions and exchange data with external devices, for example receiving and sending service data or execution code, and scheduling worker threads to load execution code or process service data. The worker threads execute data processing tasks, processing service data according to the execution code; the L worker threads may execute in parallel.

In the above embodiment, by configuring the numbers of processor cores for the management threads and worker threads of the data processing process, the management threads are bound to a preset number of processor cores after creation, so that they run exclusively on the designated cores; the worker threads are bound to an equal number of processor cores after creation, each worker thread monopolizing one core. This avoids task waiting caused by multiple worker threads competing for CPU time slices: a single processor core is occupied by a single worker thread and its CPU time slices are not shared with other worker threads, effectively reducing the latency of data processing.

With reference to the first aspect, in a first possible implementation, before configuring the number of processor cores for management threads as N, the method further includes: the operating system configures the number of processor cores for the operating system as M, where M is an integer greater than 0 and less than K, then selects M unbound processor cores from the K processor cores and runs the operating system on the selected M cores. There are two ways to bind the operating system to the M processor cores. First, after configuring the number M, the operating system stores a configuration file containing M in non-volatile memory, for example a mechanical disk or a solid-state drive, and then performs a restart; after the restart, the BIOS reads the configuration file, selects M processor cores from the K cores according to M, and runs the operating system on them. Second, the operating system is currently running on one or more processor cores; after configuring the number M, it selects M processor cores from the K cores and migrates from its current cores to the selected M cores. Note that besides the operating system, all processes in the process space other than the data processing process also need to be bound to the M processor cores. In this implementation, the operating system runs exclusively on the selected M cores, preventing it from preempting the worker threads' CPU time slices and making data processing tasks wait, further reducing the latency of data processing.

With reference to the first aspect or its first possible implementation, in a second possible implementation, a VMM (Virtual Machine Monitor) selects K processor cores from the physical resource pool of the data processing server and allocates them to a virtual machine; K virtual processor cores are generated in the virtual machine, and the VMM maps the K physical processor cores one-to-one to the K virtual processor cores. The virtual machine containing the K virtual processor cores runs the operating system.

In the above embodiment, performing data processing by deploying virtual machines on the data processing server improves the utilization of the server's physical resources.

With reference to the first aspect or its first possible implementation, in a third possible implementation, the operating system obtains the interrupt numbers of interrupt requests and the sequence numbers of the selected N processor cores, associates them, and generates interrupt mapping information. After this interrupt binding, an interrupt request is handled as follows: the operating system receives the interrupt request, obtains its interrupt number, queries the N processor cores associated with it according to the interrupt mapping information, and notifies the N processor cores to process it. All interrupt requests from external devices are thus handled by the N processor cores; the worker threads do not handle any interrupt request, preventing interrupts from preempting the worker threads' CPU time slices and further reducing the latency of data processing.
With reference to the second possible implementation of the first aspect, in a fourth possible implementation, the method further includes: the operating system sets the scheduling type of the L worker threads to a real-time scheduling type according to preset scheduling-type configuration information. Setting the worker threads' scheduling type to real-time keeps each worker thread running on its corresponding processor core, avoiding interruption of data processing tasks and reducing processing latency.

With reference to any one of the first aspect through its third possible implementation, in a fifth possible implementation, the operating system sets the priority of the L worker threads to the highest priority according to preset priority configuration information, preventing higher-priority threads from preempting the worker threads' processor cores, keeping the worker threads running on their corresponding cores, and reducing the worker threads' processing latency.

With reference to the first aspect, in a seventh possible implementation, the method further includes:

the management thread receives new execution code or service instructions sent by a client and places them into a lock-free queue; the L worker threads take an item of execution code or service data from the lock-free queue for processing in a lock-free contention manner. Communicating between the management thread and the worker threads through a lock-free queue avoids resource contention among the worker threads and reduces the time worker threads wait to execute data processing tasks.

With reference to the first aspect, in an eighth possible implementation, the method further includes: the operating system allocates a private memory space to each of the L worker threads according to preset memory mapping information; the private memory space stores the worker thread's private data, and the shared-nothing structure prevents worker threads from contending for memory space, reducing processing time.

With reference to the eighth possible implementation of the first aspect, in a ninth possible implementation, the method further includes: the operating system receives a memory access request sent by a worker thread, where the request carries a memory address and the worker thread is any one of the L worker threads; the operating system determines whether the memory address lies in the private memory space associated with that worker thread; if yes, it sends the request to the associated private memory space for processing; if no, it sends the request to the shared memory space for processing.

With reference to the eighth or ninth possible implementation of the first aspect, in a tenth possible implementation, before the data processing process is created, the method further includes: the operating system configures the memory mapping information, which indicates that each worker thread created by the data processing process is allocated its own private memory space.
In a second aspect, the present application provides a configuration apparatus, including a configuration module and a processing module.

The configuration module is configured to configure the number of processor cores for management threads as N and the number of processor cores for worker threads as L, where N is an integer greater than 0 and less than K, and L is an integer greater than 0 and less than K. The processing module is configured to create a management thread for a data processing process, select N processor cores from the K processor cores, and bind the management thread of the data processing process to the N processor cores; and to create L worker threads for the data processing process, select L unbound processor cores from the K processor cores, and bind the L worker threads one-to-one to the L processor cores.

In the above embodiment, by configuring the numbers of processor cores for the management threads and worker threads of the data processing process, the management threads are bound to a preset number of processor cores after creation, so that they run exclusively on the designated cores; the worker threads are bound to an equal number of processor cores after creation, each worker thread monopolizing one core. This avoids task waiting caused by multiple worker threads competing for CPU time slices: a single processor core is occupied by a single worker thread and its CPU time slices are not shared with other worker threads, effectively reducing the latency of data processing.

With reference to the second aspect, in a first possible implementation, the configuration module is further configured to configure the number of processor cores for the operating system as M, where M is an integer greater than 0 and less than K;

the processing module is further configured to select M processor cores from the K processor cores and run the operating system on them. In this embodiment, the operating system runs exclusively on the selected M cores, preventing it from preempting the worker threads' CPU time slices and causing data processing tasks to wait, further reducing latency.

With reference to the second aspect, in a first possible implementation, the operating system runs in a virtual machine; the virtual machine includes K virtual processor cores, which are in a one-to-one mapping relationship with the K processor cores. In this embodiment, performing data processing by deploying virtual machines improves the utilization of physical resources.

With reference to any one of the second aspect through its second possible implementation, in a third possible implementation, the configuration module is further configured to associate the N processor cores with interrupt requests to generate interrupt mapping information; the processing module is further configured to receive an interrupt request, query the N processor cores associated with it according to the interrupt mapping information, and notify the N processor cores to process it. Setting the worker threads' scheduling type to a real-time scheduling type keeps the worker threads running on their corresponding processor cores, avoiding interruption of data processing tasks and reducing processing latency.

With reference to the second aspect, in a fourth possible implementation, the processing module is further configured to set the scheduling type of the L worker threads to a real-time scheduling type according to preset scheduling-type configuration information, which keeps the worker threads running on their corresponding processor cores, avoids interruption of data processing tasks, and reduces processing latency.

With reference to any one of the second aspect through its fourth possible implementation, in a fifth possible implementation, the processing module is further configured to set the priority of the L worker threads to the highest priority according to preset priority configuration information, preventing higher-priority threads from preempting the worker threads' processor cores, keeping the worker threads running on their cores, and reducing the worker threads' processing latency.
In a third aspect, the present application provides a data processing server, including K processor cores and a memory. The K processor cores call code in the memory to perform the following operations: configuring the number of processor cores for the operating system as M, where M is an integer greater than 0 and less than K; and selecting M unbound processor cores from the K processor cores. The M processor cores call code in the memory to perform the following operation: running the operating system.

With reference to the third aspect, in a first possible implementation, the M processor cores are further configured to perform:

configuring the number of processor cores for management threads as N and the number of processor cores for worker threads as L, where N is an integer greater than 0 and less than K, and L is an integer greater than 0 and less than K;

after the operating system runs exclusively on the M processor cores, the M processor cores are further configured to perform:

creating management threads for the data processing process, selecting N unbound processor cores from the K processor cores, and binding the management threads to the N processor cores;

creating L worker threads, selecting L unbound processor cores from the K processor cores, and binding the L worker threads one-to-one to the L processor cores;

after the management threads run on the N processor cores, the N processor cores call code in the memory to perform the following operation: running the management threads;

after the L worker threads run on the L processor cores, the L processor cores call code in the memory to perform the following operation: each running the one worker thread bound to it.
With reference to the third aspect or its first possible implementation, in a second possible implementation, a virtual machine is deployed in the data processing server and the operating system is installed in the virtual machine; the virtual machine includes K virtual processor cores, which are in a one-to-one mapping relationship with the K processor cores.

With reference to any one of the third aspect through its second possible implementation, in a third possible implementation, the M processor cores are further configured to perform: associating the N processor cores with interrupt requests to generate interrupt mapping information;

the N processor cores are further configured to perform:

receiving interrupt requests to be processed;

processing the interrupt requests to be processed according to the interrupt mapping information.

With reference to the second possible implementation of the third aspect, in a third possible implementation, the M processor cores are further configured to perform:

setting the scheduling type of the L worker threads to a real-time scheduling type according to preset scheduling-type configuration information.

With reference to any one of the third aspect through its third possible implementation, in a fourth possible implementation, the M processor cores are further configured to perform:

setting the priority of the L worker threads to the highest priority according to preset priority configuration information.
BRIEF DESCRIPTION OF THE DRAWINGS

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a schematic structural diagram of a data processing system according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a data processing server according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of still another data processing server according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of still another data processing server according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart of a configuration method according to an embodiment of the present invention;

FIG. 6 is a schematic flowchart of still another configuration method according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a configuration apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of still another data processing server according to an embodiment of the present invention.
DETAILED DESCRIPTION

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

It should be noted that the terms used in the embodiments of the present invention are for the purpose of describing particular embodiments only and are not intended to limit the present invention. The singular forms "a", "said", and "the" used in the embodiments of the present invention and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used herein refers to and covers any or all possible combinations of one or more of the associated listed items. In addition, the terms "first", "second", "third", "fourth", and the like in the specification, claims, and drawings are used to distinguish different objects rather than to describe a particular order. Moreover, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion: a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units not listed, or optionally further includes other steps or units inherent to the process, method, product, or device.
Referring to FIG. 1, FIG. 1 is a network structure diagram of a data processing system according to an embodiment of the present invention. The data processing system includes a client cluster 10 and a data processing server cluster 11. The client cluster 10 includes multiple clients: client 101, client 102, ..., client 10n; the data processing server cluster 11 includes multiple data processing servers: data processing server 111, data processing server 112, data processing server 113, ..., data processing server 11n. When a client in the client cluster 10 needs to execute a data processing task, the client cluster may distribute data processing tasks evenly among the servers of the data processing server cluster 11 according to a load-balancing algorithm. For example, one of the n clients generates data to be processed, hashes the data with a hash algorithm to obtain a hash value, and takes the remainder of the hash value divided by m, where m is the number of data processing servers in the cluster; each data processing server is pre-assigned a sequence number, the remainder is taken as the sequence number of the target server, and the client sends the data to the target server for the corresponding processing. A data processing process, an operating system, and other processes run on each data processing server; the data processing process includes management threads and worker threads. The data processing server includes a processor core resource pool containing multiple processor cores. Before performing data processing, the server carries out configuration and binding as follows: configuring the number of processor cores for the operating system, the number for the management threads of the data processing process, and the number for the worker threads of the data processing process; selecting unbound processor cores from the resource pool according to the preset number and binding them to the operating system, which then runs exclusively on the selected cores; the operating system creates the data processing process and its management threads, selects unbound cores from the pool according to the preset number, and binds them to the management threads so that the management threads run exclusively on the selected cores; and the operating system creates worker threads according to the preset number, selects unbound cores from the pool according to that number, and binds the selected cores to the created worker threads, one processor core per worker thread.

In the present application, the data processing server has three architectures: a bare-metal architecture, a virtual-machine architecture, and a container architecture.
Referring to FIG. 2, FIG. 2 shows a data processing server of a bare-metal architecture. The data processing server 2 includes a processor core resource pool 22 containing K processor cores. The server 2 configures the number of processor cores for the operating system as M, for management threads as N, and for worker threads as L; the configuration may be performed by the operating system after startup. After configuration, the operating system may be bound to processor cores in either of two ways: the operating system selects M unbound cores from the K cores and migrates from the one or more cores it currently runs on to the selected M cores; or the operating system restarts after configuration, and during the restart the BIOS reads a configuration file stored in non-volatile memory, which contains the number of cores for the operating system, selects M cores from the K cores, and runs the operating system on the selected M cores. Once the operating system runs exclusively on the M cores, it creates the data processing process 20 and creates the management thread 200 according to the number N; one or more management threads may be created (FIG. 2 shows one). The operating system selects N unbound cores from the K cores and binds them to the created management thread 200, which runs exclusively on the N cores. The operating system creates L worker threads according to the number L, selects L unbound cores from the K cores, and binds the L cores to the L worker threads, each worker thread running exclusively on one processor core.

Optionally, the data processing server 2 further includes a memory resource pool 23 containing multiple memories. The operating system 21 selects a memory space from the pool and binds it to itself; likewise, the operating system allocates an independent memory region to the management thread and to each worker thread, so the memory spaces of the operating system, the management thread, and the worker threads are mutually independent.
Referring to FIG. 3, FIG. 3 shows a data processing server of a virtual-machine architecture. The data processing server 3 includes a processor core resource pool 33 containing multiple processor cores. One or more virtual machines are deployed in the data processing server 3 (FIG. 3 shows three). When creating a virtual machine, the VMM (Virtual Machine Monitor) allocates certain hardware resources to it, including CPU, memory, and IO resources, and provides the virtual machine with the ability to access the server's hardware resources. When allocating CPU resources, the VMM must map each virtual processor core of a virtual machine to one physical processor core. Each virtual machine runs a data processing process, an operating system, and other processes; the data processing process includes management threads and worker threads. The configuration method of the present application is described below using virtual machine 30: the VMM selects K processor cores from the resource pool, allocates them to virtual machine 30, and virtualizes them so that each virtual core of virtual machine 30 maps to one physical core; from the viewpoint of virtual machine 30 its cores are physical, so physical and virtual cores are not distinguished below and are collectively called processor cores. The operating system 302 configures the number of cores for the operating system as M, for management threads as N, and for worker threads as L. The operating system 302 selects M unbound cores from the K cores and migrates from the one or more cores it currently runs on to the selected M cores. It creates management threads according to the number N, for example a number of management threads equal to N, selects N unbound cores from the K cores, and binds the created management threads to the N cores. The operating system 302 then selects L unbound cores from the K cores and binds the L cores to the L worker threads, each worker thread bound to one processor core. Optionally, the data processing server 3 further includes a memory resource pool 44 containing one or more memories; for each virtual machine, the operating system, the management thread, and each worker thread in the virtual machine are each allocated an independent memory space.
Referring to FIG. 4, FIG. 4 shows a data processing server of a container architecture. The data processing server 4 includes a processor core resource pool containing K processor cores. One or more containers are deployed in the server 4, each running one data processing process, so that container isolation allows multiple data processing processes to run on one data processing server. The operating system and other processes run on the data processing server. The data processing process includes management threads and worker threads; in each container, the management threads run exclusively on a preset number of processor cores, and each worker thread runs exclusively on one processor core. The configuration process is described below using container 40: the operating system 42 configures the number of cores for the operating system as M, the number for management threads as N, and the number for worker threads as L. The operating system selects M unbound cores from the K cores and migrates from the one or more cores it currently runs on to the M cores; it creates one data processing process inside container 40 and creates management threads according to the number N; it selects N unbound cores from the K cores and binds them to the management threads, which run exclusively on the N cores; it creates L worker threads according to the number L and selects L unbound cores from the K cores, and the data processing process binds the L cores to the L worker threads, each worker thread running exclusively on one processor core. Optionally, the data processing server 4 further includes a memory resource pool 44 containing one or more memories; for each container, the management thread and each worker thread are each allocated an independent memory space, and the operating system and other processes are also allocated an independent memory space.

In the above embodiments, by configuring the numbers of processor cores for the management threads and worker threads of the data processing process, the management threads are bound at creation to a preset number of cores so that they run exclusively on those cores, and the worker threads are bound at creation to an equal number of cores so that each worker thread monopolizes one core. This avoids task waiting caused by multiple worker threads competing for CPU time slices: a single processor core is occupied by a single worker thread and its CPU time slices are not shared with other worker threads, effectively reducing the latency of data processing.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of a configuration method provided by an embodiment of the present invention. The method includes but is not limited to the following steps.
S501: Configure the number of processor cores for management threads as N.
Specifically, the management threads manage and schedule the worker threads and exchange data with external devices; the worker threads execute data processing tasks. The data processing process includes the management threads and the worker threads, and both the operating system and the data processing process run on the data processing server. The data processing server may include a processor core resource pool containing K processor cores. The operating system may maintain a state table with one entry recording the state of each processor core. A processor core's state is either bound or unbound: the bound state means the core is exclusively bound to a thread or process and cannot be used by other processes or threads, while the unbound state means the core can be occupied by any process or thread. Whenever the state of any processor core in the pool changes, the operating system updates that core's entry in the state table. After startup, the operating system may configure the number of processor cores for management threads as N, where N is less than K and both N and K are positive integers.
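The per-core state table just described can be sketched as follows. This is a minimal illustrative sketch under assumptions, not the patent's implementation: the names (core_state_t, core_table, select_unbound_cores) and the pool size K_CORES are invented for the example.

```c
/* Hypothetical sketch of the per-core state table the operating system is
 * described as maintaining: one entry per processor core, each entry either
 * BOUND or UNBOUND. All names here are illustrative. */
#define K_CORES 8  /* assumed pool size for the example */

typedef enum { CORE_UNBOUND = 0, CORE_BOUND = 1 } core_state_t;

static core_state_t core_table[K_CORES]; /* static storage: all start UNBOUND */

/* Update one entry when a core's state changes. */
static void set_core_state(int core, core_state_t s) {
    if (core >= 0 && core < K_CORES)
        core_table[core] = s;
}

/* Pick up to n unbound cores, mark them bound, write their indexes to out,
 * and return how many were actually found. */
static int select_unbound_cores(int n, int *out) {
    int found = 0;
    for (int c = 0; c < K_CORES && found < n; c++) {
        if (core_table[c] == CORE_UNBOUND) {
            core_table[c] = CORE_BOUND;
            out[found++] = c;
        }
    }
    return found;
}
```

Selecting N cores for the management threads and L cores for the worker threads would then be two calls to select_unbound_cores, each followed by the binding step of S504/S506.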
S502: Configure the number of processor cores for worker threads as L.
Specifically, the operating system configures the number of processor cores to be exclusively occupied by worker threads as L, where L is less than K and L is a positive integer.
S503: Create management threads for the data processing process.
Specifically, the operating system creates one or more management threads for the data processing process according to the preset quantity N; when multiple management threads are needed, the operating system may create the first management thread and let the first management thread create the remaining ones. Preferably, the number of management threads created equals N, so that each management thread occupies one processor core, improving processor core utilization.
S504: Select N processor cores from the K processor cores and bind the management threads of the data processing process to the N processor cores.
Specifically, the operating system selects N unbound processor cores from the K processor cores and exclusively binds all management threads created in S503 to the N processor cores. The binding may work as follows: the operating system obtains the thread IDs of the created management threads and the indexes of the selected N processor cores, and binds the thread IDs to the core indexes; after binding, the management threads run exclusively on the N processor cores.
S505: Create L worker threads for the data processing process.
Specifically, after the management threads are running on the N processor cores, the operating system creates L worker threads according to the quantity L set in S502.
S506: Select L unbound processor cores from the K processor cores and bind the L worker threads of the data processing process to the L processor cores one-to-one.
Specifically, the operating system selects L unbound processor cores from the K processor cores and binds the created L worker threads to the L processor cores. The binding may work as follows: the operating system obtains the thread IDs of the L worker threads and the indexes of the selected L processor cores, and binds the thread IDs to the core indexes one-to-one, so that each worker thread runs exclusively on one processor core.
In the method depicted in FIG. 5, the management threads of the data processing process run exclusively on their bound processor cores and each worker thread runs exclusively on its bound processor core, so worker threads need not compete with other threads for CPU time slices, reducing data processing latency.
Referring to FIG. 6, FIG. 6 is a schematic flowchart of another configuration method provided by an embodiment of the present invention. The method includes but is not limited to the following steps.
S601: Configure the number of processor cores for the operating system as M.
Specifically, the data processing server includes K processor cores, each of which is a physical processor core. The operating system may maintain a state table with one entry per processor core; a core's state is either bound or unbound, and when the state of any of the K processor cores changes, the operating system updates that core's entry in the state table. The data processing process includes management threads and worker threads: the management threads manage and schedule the worker threads and exchange data or instructions with external devices, while the worker threads execute data processing tasks. The operating system configures the number of processor cores for the operating system as M, where M is less than K and M is a positive integer. For example, with K = 100, the data processing server may configure 4 processor cores for the operating system.
S602: Configure the number of processor cores for management threads as N.
Specifically, the data processing server configures the number of processor cores for management threads as N; for example, the operating system may configure 4 processor cores for management threads.
S603: Configure the number of processor cores for worker threads as L.
Specifically, the operating system configures the number of processor cores for worker threads as L; for example, the operating system may configure 10 processor cores for worker threads.
S604: Select M processor cores from the K processor cores and run the operating system on the M processor cores.
In one possible implementation, after startup the operating system is running on one or more processor cores; when the operating system needs to be bound to processor cores, it selects M processor cores from the K processor cores and migrates itself from the one or more processor cores it currently runs on to the selected M processor cores.
In another possible implementation, after configuring the number M of processor cores for the operating system, the operating system generates a configuration file stored in non-volatile memory (for example, a hard disk drive or a solid-state drive) and then reboots. During the reboot, the BIOS reads the pre-stored configuration file and, according to the quantity M in it, selects M processor cores from the K processor cores (during startup all K processor cores are in the unbound state), binds the selected M processor cores to the operating system, and runs the operating system on them. In its running state, the operating system can also be regarded as a process.
It should be noted that, apart from the operating system, all processes in the process space other than the data processing process also need to be bound to the selected N processor cores, to prevent other processes from preempting the CPU time slices of the worker threads.
As an example, in the Linux operating system, the system call int sched_setaffinity(pid_t pid, unsigned int cpusetsize, cpu_set_t *mask) can bind a process or thread to one or more specific processor cores. The first parameter, pid, is the thread ID or process ID whose binding is to be set or queried; if the first parameter is 0, the call applies to the calling thread. The second parameter, cpusetsize, is generally set to sizeof(cpu_set_t) and gives the size of the memory object pointed to by the third parameter. The third parameter, mask, is a pointer to a cpu_set_t object used to set or query the list of processor cores bound to the specified thread or process.
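A minimal Linux sketch of the sched_setaffinity usage described above; the wrapper name bind_current_to_core is illustrative, not part of any API:

```c
#define _GNU_SOURCE
#include <sched.h>

/* Bind the calling thread to a single processor core using the
 * sched_setaffinity call discussed above.
 * Returns 0 on success, -1 on error (Linux only). */
int bind_current_to_core(int core) {
    cpu_set_t mask;
    CPU_ZERO(&mask);       /* start with an empty core set */
    CPU_SET(core, &mask);  /* add the one target core      */
    /* pid 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof(cpu_set_t), &mask);
}
```

A worker thread created in S506 would call such a helper (or the operating system would call it on the worker's thread ID) so that the thread runs exclusively on its assigned core.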
As an example, in the Linux operating system, the taskset command can bind a process to one or more specific CPU cores. The command format is, for example, "taskset -pc 3 21184", where "21184" is the process ID and "3" is the processor core index; this command makes the process with ID 21184 run exclusively on the fourth processor core (the first processor core has index 0).
As an example, in a data processing server with a Docker container architecture, the --cpuset-cpus option can bind the processes in a container to one or more specific processor cores. Its value is the set of processor cores to bind, such as 0-3, or a comma-separated list such as 0,3,4 (0 is the index of the first processor core).
As an example, in the Linux operating system, the init process is the ancestor of all processes, so setting the affinity of the init process effectively sets the affinity of all processes; designated processes can then be rebound to their target processor cores. For example, adding "/bin/bind 1 1" to /etc/rc.sysinit can bind the init process to a specified processor core.
S605: Select N processor cores from the K processor cores and bind the management threads of the data processing process to the N processor cores.
Specifically, the operating system creates the management threads of the data processing process, selects N unbound processor cores from the K processor cores, and binds the selected N processor cores to the management threads. For binding threads to processor cores, refer to the examples in S604; details are not repeated here.
S606: Create L worker threads for the data processing process.
Specifically, the operating system creates L worker threads for the data processing process according to the preconfigured quantity L.
S607: Select L unbound processor cores from the K processor cores and bind the L worker threads of the data processing process to the L processor cores.
Specifically, the operating system selects L unbound processor cores from the K processor cores and binds the L worker threads of the data processing process to the L processor cores, each worker thread being uniquely bound to one processor core. For binding worker threads to processor cores, refer to the description in S604; details are not repeated here.
S608: The operating system associates the N processor cores with interrupt requests and generates interrupt mapping information.
Specifically, the operating system obtains the interrupt numbers on the system, where different interrupt requests correspond to different interrupt numbers, binds the obtained interrupt numbers to the N processor cores, and generates the interrupt mapping information. After the interrupt requests are bound to the N processor cores, all interrupt requests subsequently received by the operating system are handled by the N processor cores, and the worker threads handle no interrupt requests at all; this prevents the worker threads from being interrupted while executing data processing tasks and reduces processing latency.
As an example, in the Linux operating system, the operating system can run the command cat /proc/interrupts to view the interrupt numbers of interrupt requests, and can set the binding between an interrupt request and processor cores by modifying the /proc/irq/{irq_number}/smp_affinity configuration file.
For example, echo 3 > /proc/irq/20/smp_affinity assigns the processor cores selected by the mask 3 to the interrupt request with interrupt number 20; note that smp_affinity takes a hexadecimal CPU bitmask rather than a core index, so the value 3 selects cores 0 and 1.
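Because smp_affinity takes a bitmask rather than a core index, the mask for a given set of cores can be computed as sketched below; irq_affinity_mask is an illustrative helper, not part of any kernel API.

```c
/* Build the CPU bitmask that /proc/irq/<n>/smp_affinity expects from a list
 * of core indexes: bit c set means the IRQ may be delivered to core c.
 * The resulting value would be written in hexadecimal, e.g.
 *   echo <mask> > /proc/irq/20/smp_affinity
 * (illustrative helper; assumes fewer cores than bits in unsigned long). */
unsigned long irq_affinity_mask(const int *cores, int n) {
    unsigned long mask = 0;
    for (int i = 0; i < n; i++)
        mask |= 1UL << cores[i];
    return mask;
}
```

For instance, binding an IRQ to the management cores 0 and 1 yields mask 3, while binding to core 3 alone yields mask 8.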
S609: Receive an interrupt request.
S610: Query, according to the interrupt mapping information, the N processor cores associated with the interrupt request.
S611: Notify the N processor cores to handle the interrupt request.
Specifically, after receiving an interrupt request, the operating system queries the N processor cores associated with the interrupt request according to the preset interrupt mapping information and notifies the N processor cores to handle it. Because interrupt requests are handled by dedicated processor cores, the cores bound to the worker threads need not perform any interrupt handling, avoiding interruption of data processing tasks and reducing processing latency.
S612: Set the scheduling type of the L worker threads to the real-time scheduling type according to preset scheduling type configuration information.
As an example, in the Linux operating system, the pthread_attr_setschedpolicy call sets a thread's scheduling type; the operating system may set all L worker threads to the SCHED_FIFO (first-come, first-served) real-time scheduling type. Once a worker thread occupies a processor core, it keeps running its data processing task until a higher-priority task arrives or it yields voluntarily.
S613: Set the priority of the L worker threads to the highest priority according to preset priority configuration information.
Specifically, the operating system sets the priority of the L worker threads to the highest priority according to the preset priority configuration information, which prevents the worker threads' cores from running tasks with higher priority than the data processing tasks and reduces the latency of data processing tasks.
As an example, in a Linux system, the operating system uses the pthread_attr_setschedparam call to set the priority of a worker thread.
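The policy setting of S612 and the priority setting of S613 can be combined into one attribute setup, sketched below for Linux/POSIX. The function name make_rt_worker_attr is illustrative; note that actually creating a thread with these attributes usually requires privileges (CAP_SYS_NICE), while preparing the attribute object does not.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Prepare thread attributes so a worker thread would be created with the
 * SCHED_FIFO real-time policy at the highest available priority.
 * Returns 0 on success, -1 on error. */
int make_rt_worker_attr(pthread_attr_t *attr) {
    struct sched_param sp;
    if (pthread_attr_init(attr) != 0)
        return -1;
    /* do not inherit the creator's policy; use the explicit one set below */
    pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (pthread_attr_setschedpolicy(attr, SCHED_FIFO) != 0)
        return -1;
    memset(&sp, 0, sizeof sp);
    sp.sched_priority = sched_get_priority_max(SCHED_FIFO); /* highest prio */
    return pthread_attr_setschedparam(attr, &sp) == 0 ? 0 : -1;
}
```

Each of the L worker threads would then be created with pthread_create using such an attribute object.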
Optionally, after receiving new execution code, a management thread writes the new execution code into a lock-free queue and notifies the L worker threads in a lock-free manner. Before and after executing each operation, a worker thread checks the notification state of the lock-free queue; when it finds that new execution code needs to be loaded, the worker thread loads the new execution code and runs the data processing task with it.
The operating system mainly implements the lock-free queue with atomic operations such as CAS (compare-and-swap) or FAA (fetch-and-add) together with retry loops, or uses shared variables to pass messages between the management threads and the worker threads.
The management thread receives new execution code or service data;
the management thread places the new execution code or service data into the lock-free queue;
when any of the L worker threads is idle, the idle worker thread takes one piece of execution code or service data from the lock-free queue for data processing.
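The management-to-worker handoff just described can be sketched as a lock-free ring buffer built on C11 atomics. The patent mentions CAS/FAA-based queues with retry loops; this simplified single-producer/single-consumer variant (all names illustrative) shows only the lock-free message-passing idea, not the patent's actual queue.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define Q_CAP 16u  /* assumed capacity; kept a power of two */

/* One management thread pushes, one worker thread pops; no locks are taken,
 * only acquire/release atomics on the head and tail indexes. */
typedef struct {
    _Atomic unsigned head;   /* next slot the consumer reads  */
    _Atomic unsigned tail;   /* next slot the producer writes */
    void *slots[Q_CAP];
} spsc_queue_t;

/* Producer (management thread): returns false when the queue is full. */
bool q_push(spsc_queue_t *q, void *item) {
    unsigned t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned h = atomic_load_explicit(&q->head, memory_order_acquire);
    if (t - h == Q_CAP)
        return false;                       /* full */
    q->slots[t % Q_CAP] = item;
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return true;
}

/* Consumer (worker thread): returns NULL when the queue is empty. */
void *q_pop(spsc_queue_t *q) {
    unsigned h = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned t = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (h == t)
        return (void *)0;                   /* empty */
    void *item = q->slots[h % Q_CAP];
    atomic_store_explicit(&q->head, h + 1, memory_order_release);
    return item;
}
```

An idle worker would poll q_pop between operations, matching the "check the notification state before and after each operation" behavior above; supporting multiple workers would require the CAS-based multi-consumer variant the patent alludes to.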
Optionally, the method further includes:
the operating system allocates a private memory space to each of the L worker threads according to preset memory mapping information.
For example, in the Linux operating system, the operating system uses TLS (Thread Local Storage) to associate a memory space with a specified executing thread.
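The TLS association mentioned above can be sketched with POSIX thread-specific keys; the function names are illustrative, and a real implementation might instead use compiler-level _Thread_local storage.

```c
#include <pthread.h>

/* Per-worker private storage via a POSIX TLS key: each thread that calls
 * these helpers sees only its own pointer. */
static pthread_key_t g_worker_key;
static pthread_once_t g_worker_once = PTHREAD_ONCE_INIT;

static void make_worker_key(void) {
    pthread_key_create(&g_worker_key, 0); /* no destructor in this sketch */
}

/* Associate this thread's private memory pointer. */
void set_worker_private(void *p) {
    pthread_once(&g_worker_once, make_worker_key);
    pthread_setspecific(g_worker_key, p);
}

/* Fetch this thread's private memory pointer (NULL if never set). */
void *get_worker_private(void) {
    pthread_once(&g_worker_once, make_worker_key);
    return pthread_getspecific(g_worker_key);
}
```

Each worker thread would call set_worker_private once with the base of its allocated private memory region and thereafter reach its private data without synchronization.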
Optionally, when a virtual machine is deployed on the data processing server, the K processor cores are allocated to a virtual machine; the virtual machine includes K virtual processor cores in a one-to-one mapping with the K physical processor cores, and the operating system is installed in the virtual machine. This virtual machine deployment scheme improves resource utilization.
Optionally, the method further includes:
the operating system receives a memory access request sent by a worker thread, where the memory access request carries a memory address and the worker thread is any one of the L worker threads;
the operating system determines whether the memory address lies within the private memory space associated with the worker thread;
if so, the operating system directs the memory access request to the associated private memory space for processing;
if not, the operating system directs the memory access request to the shared memory space for processing.
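The private-versus-shared routing in the optional steps above reduces to an address range check. A minimal sketch with illustrative names; the region bounds are example parameters, not patent specifics.

```c
#include <stdbool.h>
#include <stdint.h>

/* A worker's private memory region: [base, base + size). */
typedef struct {
    uintptr_t base;
    uintptr_t size;
} mem_region_t;

/* Decide whether a worker's memory access falls inside that worker's
 * private region (route to private space) or outside it (route to the
 * shared memory space). */
bool in_private_region(const mem_region_t *priv, uintptr_t addr) {
    return addr >= priv->base && addr < priv->base + priv->size;
}
```

The operating system would keep one such region descriptor per worker thread and consult it for each carried memory address.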
Optionally, before the data processing process is created, the method further includes:
the operating system configures the memory mapping information, where the memory mapping information indicates that every worker thread created by the data processing process is allocated its own private memory space.
In the foregoing embodiment, a shared-nothing architecture is used: each worker thread has a private memory space, which prevents multiple worker threads from contending for memory space when storing private data and reduces the latency of worker threads executing data processing tasks.
In the method depicted in FIG. 6, through processor core binding, memory binding, interrupt binding, and the setting of worker thread priority and scheduling type, the worker threads can execute data processing tasks on dedicated hardware resources, reducing processing latency.
The methods of the embodiments of the present invention have been described in detail above; to facilitate better implementation of the foregoing solutions, the corresponding apparatuses of the embodiments of the present invention are provided below.
Referring to FIG. 7, FIG. 7 is a schematic structural diagram of a configuration device provided by an embodiment of the present invention. The configuration device may include a configuration module 701 and a processing module 702. The configuration device 7 may be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), an FPGA, generic array logic (GAL), or any combination thereof. The configuration device 7 is configured to implement the configuration method shown in FIG. 5. When the configuration method shown in FIG. 5 is implemented in software, the configuration device 7 and its modules may also be software modules. The modules are described in detail below.
The configuration module 701 is configured to configure the number of processor cores for management threads as N and the number of processor cores for worker threads as L, where N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0.
The processing module 702 is configured to create management threads for a data processing process, select N processor cores from the K processor cores, and bind the management threads of the data processing process to the N processor cores; and to create L worker threads for the data processing process, select L unbound processor cores from the K processor cores, and bind the L worker threads of the data processing process to the L processor cores one-to-one.
Optionally, the configuration module 701 is further configured to configure the number of processor cores for the operating system as M, where M is less than K and M is an integer greater than 0;
the processing module 702 is further configured to select M processor cores from the K processor cores and run the operating system on the M processor cores.
Optionally, the configuration module 701 is further configured to associate the N processor cores with interrupt requests and generate interrupt mapping information;
the processing module 702 is further configured to receive an interrupt request, query the N processor cores associated with the interrupt request according to the interrupt mapping information, and notify the N processor cores to handle the interrupt request.
Optionally, the operating system runs in a virtual machine, the virtual machine includes K virtual processor cores, and the K virtual processor cores are in a one-to-one mapping with the K processor cores.
Optionally, the processing module 702 is further configured to set the scheduling type of the L worker threads to the real-time scheduling type according to preset scheduling type configuration information.
Optionally, the processing module 702 is further configured to set the priority of the L worker threads to the highest priority according to preset priority configuration information.
In the foregoing embodiment, through processor core binding, memory binding, interrupt binding, and the setting of worker thread priority and scheduling type, the configuration module enables the worker threads to execute data processing tasks on dedicated hardware resources, reducing processing latency.
It should be noted that, in the embodiments of the present invention, for the specific implementation of each module, reference may also be made to the corresponding descriptions of the method embodiments shown in FIG. 5 and FIG. 6.
Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a data processing server provided by an embodiment of the present invention. In this embodiment, the data processing server 8 includes a processor 801, a memory 802, and a communication interface 803. The communication interface 803 is used to exchange data or instructions with external devices. There may be one or more processors 801 in the data processing server 8, and the processor 801 includes K processor cores. In some embodiments of the present invention, the processor 801, the memory 802, and the communication interface 803 may be connected through a bus system or in other ways; they may be connected by wire or may communicate by other means such as wireless transmission. The data processing server 8 may be used to perform the method shown in FIG. 5. For the meanings and examples of the terms involved in this embodiment, refer to the embodiment corresponding to FIG. 5; details are not repeated here.
The memory 802 stores program code. The K processor cores in the processor 801 invoke the program code stored in the memory 802 to perform the following operations:
configuring the number of processor cores for the operating system as M, where M is less than K and M is an integer greater than 0;
selecting M unbound processor cores from the K processor cores;
the M processor cores invoke the code in the memory to perform the following operation:
running the operating system.
Optionally, the M processor cores are further configured to perform:
configuring the number of processor cores for management threads as N and the number of processor cores for worker threads as L, where N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0;
the M processor cores are further configured to perform:
creating management threads for a data processing process, selecting N unbound processor cores from the K processor cores, and binding the management threads to the N processor cores;
creating L worker threads, selecting L unbound processor cores from the K processor cores, and binding the L worker threads to the L processor cores one-to-one;
the N processor cores invoke the code in the memory to perform the following operation:
running the management threads;
the L processor cores invoke the code in the memory to perform the following operation:
each running the one worker thread bound to it.
Optionally, the M processor cores are further configured to perform:
associating the N processor cores with interrupt requests and generating interrupt mapping information;
the N processor cores are further configured to perform:
receiving a pending interrupt request;
handling the pending interrupt request according to the interrupt mapping information.
Optionally, the operating system runs in a virtual machine, the virtual machine includes K virtual processor cores, and the K virtual processor cores are in a one-to-one mapping with the K processor cores.
Optionally, the M processor cores are further configured to perform:
setting the scheduling type of the L worker threads to the real-time scheduling type according to preset scheduling type configuration information.
Optionally, the M processor cores are further configured to perform:
setting the priority of the L worker threads to the highest priority according to preset priority configuration information.
In summary, by implementing the embodiments of the present invention, the management threads of the data processing process run exclusively on their bound processor cores and each worker thread runs exclusively on its bound processor core, so worker threads need not compete with other threads for CPU time slices, reducing data processing latency.
A person of ordinary skill in the art can understand that all or part of the procedures of the foregoing method embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the procedures of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
The foregoing embodiments disclose only preferred embodiments of the present invention and are not intended to limit its scope of rights; equivalent changes made according to the claims of the present invention still fall within the scope of the invention.

Claims (18)

  1. A configuration method, applied to a data processing server comprising K processor cores, the method comprising:
    configuring the number of processor cores for management threads as N, wherein N is less than K and N is an integer greater than 0;
    configuring the number of processor cores for worker threads as L, wherein L is less than K and L is an integer greater than 0;
    creating management threads for a data processing process;
    selecting N processor cores from the K processor cores and binding the management threads of the data processing process to the N processor cores;
    creating L worker threads for the data processing process;
    selecting L unbound processor cores from the K processor cores and binding the L worker threads of the data processing process to the L processor cores one-to-one.
  2. The method according to claim 1, wherein before the configuring the number of processor cores for management threads as N, the method further comprises:
    configuring the number of processor cores for an operating system as M, wherein M is less than K and M is an integer greater than 0;
    selecting M processor cores from the K processor cores and running the operating system on the M processor cores.
  3. The method according to claim 1 or 2, wherein the operating system runs in a virtual machine, the virtual machine comprises K virtual processor cores, and the K virtual processor cores are in a one-to-one mapping with the K processor cores.
  4. The method according to any one of claims 1-3, further comprising:
    associating the N processor cores with interrupt requests and generating interrupt mapping information;
    receiving an interrupt request;
    querying, according to the interrupt mapping information, the N processor cores associated with the interrupt request;
    notifying the N processor cores to handle the interrupt request.
  5. The method according to claim 1, further comprising:
    setting the scheduling type of the L worker threads to a real-time scheduling type according to preset scheduling type configuration information.
  6. The method according to any one of claims 1-5, further comprising:
    setting the priority of the L worker threads to the highest priority according to preset priority configuration information.
  7. A configuration device, comprising a configuration module and a processing module;
    the configuration module being configured to configure the number of processor cores for management threads as N and the number of processor cores for worker threads as L, wherein N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0;
    the processing module being configured to create management threads for a data processing process, select N processor cores from the K processor cores, and bind the management threads of the data processing process to the N processor cores; and to create L worker threads for the data processing process, select L unbound processor cores from the K processor cores, and bind the L worker threads of the data processing process to the L processor cores one-to-one.
  8. The configuration device according to claim 7, wherein the configuration module is further configured to configure the number of processor cores for an operating system as M, wherein M is less than K and M is an integer greater than 0;
    the processing module is further configured to select M processor cores from the K processor cores and run the operating system on the M processor cores.
  9. The device according to claim 7 or 8, wherein the operating system runs in a virtual machine, the virtual machine comprises K virtual processor cores, and the K virtual processor cores are in a one-to-one mapping with the K processor cores.
  10. The device according to any one of claims 7-9, wherein the configuration module is further configured to associate the N processor cores with interrupt requests and generate interrupt mapping information;
    the processing module is further configured to receive an interrupt request, query the N processor cores associated with the interrupt request according to the interrupt mapping information, and notify the N processor cores to handle the interrupt request.
  11. The device according to claim 7, wherein the processing module is further configured to set the scheduling type of the L worker threads to a real-time scheduling type according to preset scheduling type configuration information.
  12. The device according to any one of claims 7-11, wherein the processing module is further configured to set the priority of the L worker threads to the highest priority according to preset priority configuration information.
  13. A data processing server, comprising K processor cores and a memory, wherein the K processor cores invoke code in the memory to perform the following operations:
    configuring the number of processor cores for an operating system as M, wherein M is less than K and M is an integer greater than 0;
    selecting M unbound processor cores from the K processor cores;
    the M processor cores invoking the code in the memory to perform the following operation:
    running the operating system.
  14. The data processing server according to claim 13, wherein the M processor cores are further configured to perform:
    configuring the number of processor cores for management threads as N and the number of processor cores for worker threads as L, wherein N is less than K and N is an integer greater than 0, and L is less than K and L is an integer greater than 0;
    the M processor cores are further configured to perform:
    creating management threads for a data processing process, selecting N unbound processor cores from the K processor cores, and binding the management threads to the N processor cores;
    creating L worker threads, selecting L unbound processor cores from the K processor cores, and binding the L worker threads to the L processor cores one-to-one;
    the N processor cores invoke the code in the memory to perform the following operation:
    running the management threads;
    the L processor cores invoke the code in the memory to perform the following operation:
    each running the one worker thread bound to it.
  15. The data processing server according to claim 13 or 14, wherein the operating system runs in a virtual machine, the virtual machine comprises K virtual processor cores, and the K virtual processor cores are in a one-to-one mapping with the K processor cores.
  16. The data processing server according to any one of claims 13-15, wherein the M processor cores are further configured to perform:
    associating the N processor cores with interrupt requests and generating interrupt mapping information;
    the N processor cores are further configured to perform:
    receiving a pending interrupt request;
    handling the pending interrupt request according to the interrupt mapping information.
  17. The data processing server according to claim 13, wherein the M processor cores are further configured to perform:
    setting the scheduling type of the L worker threads to a real-time scheduling type according to preset scheduling type configuration information.
  18. The data processing server according to any one of claims 13-17, wherein the M processor cores are further configured to perform:
    setting the priority of the L worker threads to the highest priority according to preset priority configuration information.
PCT/CN2017/092517 2016-08-31 2017-07-11 Configuration method and device, and data processing server WO2018040750A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610797408.5A CN106371894B (zh) 2016-08-31 2016-08-31 一种配置方法、装置和数据处理服务器
CN201610797408.5 2016-08-31

Publications (1)

Publication Number Publication Date
WO2018040750A1 true WO2018040750A1 (zh) 2018-03-08

Family

ID=57899211

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/092517 WO2018040750A1 (zh) 2016-08-31 2017-07-11 一种配置方法、装置和数据处理服务器

Country Status (2)

Country Link
CN (2) CN106371894B (zh)
WO (1) WO2018040750A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825528A (zh) * 2019-11-11 2020-02-21 聚好看科技股份有限公司 资源管理方法、装置及设备
CN114296865A (zh) * 2021-12-15 2022-04-08 中汽创智科技有限公司 一种虚拟机线程的调度方法、装置、电子设备及存储介质
CN115695334A (zh) * 2022-10-11 2023-02-03 广州市玄武无线科技股份有限公司 一种多服务节点的线程分配控制方法
CN111831390B (zh) * 2020-01-08 2024-04-16 北京嘀嘀无限科技发展有限公司 服务器的资源管理方法、装置及服务器

Families Citing this family (11)

Publication number Priority date Publication date Assignee Title
CN106371894B (zh) * 2016-08-31 2020-02-14 华为技术有限公司 一种配置方法、装置和数据处理服务器
CN109144681B (zh) * 2017-06-27 2021-01-22 大唐移动通信设备有限公司 一种控制方法及装置
CN107479976A (zh) * 2017-08-14 2017-12-15 郑州云海信息技术有限公司 一种多程序实例同时运行下cpu资源分配方法及装置
CN107832151B (zh) * 2017-11-10 2020-09-25 东软集团股份有限公司 一种cpu资源分配方法、装置及设备
CN109871275A (zh) * 2017-12-01 2019-06-11 晨星半导体股份有限公司 多处理器系统及其处理器管理方法
CN108804211A (zh) * 2018-04-27 2018-11-13 西安华为技术有限公司 线程调度方法、装置、电子设备及存储介质
CN110362402B (zh) * 2019-06-25 2021-08-10 苏州浪潮智能科技有限公司 一种负载均衡方法、装置、设备及可读存储介质
CN110442423B (zh) * 2019-07-09 2022-04-26 苏州浪潮智能科技有限公司 一种利用控制组实现虚拟机预留cpu的方法和设备
CN113301087B (zh) * 2020-07-21 2024-04-02 阿里巴巴集团控股有限公司 资源调度方法、装置、计算设备和介质
CN112039963B (zh) * 2020-08-21 2023-04-07 广州虎牙科技有限公司 一种处理器的绑定方法、装置、计算机设备和存储介质
CN116431365A (zh) * 2023-06-07 2023-07-14 北京集度科技有限公司 基于车载服务导向架构的监控系统、方法、车辆

Citations (6)

Publication number Priority date Publication date Assignee Title
US20120117075A1 (en) * 2010-11-04 2012-05-10 Electron Database Corporation Systems and methods for grouped request execution
CN102831011A (zh) * 2012-08-10 2012-12-19 上海交通大学 一种基于众核系统的任务调度方法及装置
CN103365718A (zh) * 2013-06-28 2013-10-23 贵阳朗玛信息技术股份有限公司 一种线程调度方法、线程调度装置及多核处理器系统
CN104050036A (zh) * 2014-05-29 2014-09-17 汉柏科技有限公司 多核处理器网络设备的控制系统及方法
CN105700949A (zh) * 2014-11-24 2016-06-22 中兴通讯股份有限公司 一种多核处理器下的数据处理方法及装置
CN106371894A (zh) * 2016-08-31 2017-02-01 华为技术有限公司 一种配置方法、装置和数据处理服务器

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN101634953A (zh) * 2008-07-22 2010-01-27 国际商业机器公司 搜索空间计算方法和装置及自适应线程调度方法和系统
CN103513932B (zh) * 2012-06-28 2016-04-13 深圳市腾讯计算机系统有限公司 一种数据处理方法及装置
US9086925B2 (en) * 2013-01-18 2015-07-21 Nec Laboratories America, Inc. Methods of processing core selection for applications on manycore processors
CN103617071B (zh) * 2013-12-02 2017-01-25 北京华胜天成科技股份有限公司 一种资源独占及排它的提升虚拟机计算能力的方法及装置
CN104750543B (zh) * 2013-12-26 2018-06-15 杭州华为数字技术有限公司 线程创建方法、业务请求处理方法及相关设备
CN103838552B (zh) * 2014-03-18 2016-06-22 北京邮电大学 4g宽带通信系统多核并行流水线信号的处理系统和方法

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
US20120117075A1 (en) * 2010-11-04 2012-05-10 Electron Database Corporation Systems and methods for grouped request execution
CN102831011A (zh) * 2012-08-10 2012-12-19 上海交通大学 一种基于众核系统的任务调度方法及装置
CN103365718A (zh) * 2013-06-28 2013-10-23 贵阳朗玛信息技术股份有限公司 一种线程调度方法、线程调度装置及多核处理器系统
CN104050036A (zh) * 2014-05-29 2014-09-17 汉柏科技有限公司 多核处理器网络设备的控制系统及方法
CN105700949A (zh) * 2014-11-24 2016-06-22 中兴通讯股份有限公司 一种多核处理器下的数据处理方法及装置
CN106371894A (zh) * 2016-08-31 2017-02-01 华为技术有限公司 一种配置方法、装置和数据处理服务器

Cited By (6)

Publication number Priority date Publication date Assignee Title
CN110825528A (zh) * 2019-11-11 2020-02-21 聚好看科技股份有限公司 资源管理方法、装置及设备
CN110825528B (zh) * 2019-11-11 2022-02-01 聚好看科技股份有限公司 资源管理方法、装置及设备
CN111831390B (zh) * 2020-01-08 2024-04-16 北京嘀嘀无限科技发展有限公司 服务器的资源管理方法、装置及服务器
CN114296865A (zh) * 2021-12-15 2022-04-08 中汽创智科技有限公司 一种虚拟机线程的调度方法、装置、电子设备及存储介质
CN114296865B (zh) * 2021-12-15 2024-03-26 中汽创智科技有限公司 一种虚拟机线程的调度方法、装置、电子设备及存储介质
CN115695334A (zh) * 2022-10-11 2023-02-03 广州市玄武无线科技股份有限公司 一种多服务节点的线程分配控制方法

Also Published As

Publication number Publication date
CN106371894A (zh) 2017-02-01
CN106371894B (zh) 2020-02-14
CN111274015A (zh) 2020-06-12

Similar Documents

Publication Publication Date Title
WO2018040750A1 (zh) 一种配置方法、装置和数据处理服务器
US10467725B2 (en) Managing access to a resource pool of graphics processing units under fine grain control
US20190377604A1 (en) Scalable function as a service platform
AU2014311463B2 (en) Virtual machine monitor configured to support latency sensitive virtual machines
CN113037538B (zh) 分布式资源管理中低时延节点本地调度的系统和方法
US8635615B2 (en) Apparatus and method for managing hypercalls in a hypervisor and the hypervisor thereof
US10109030B1 (en) Queue-based GPU virtualization and management system
WO2017070900A1 (zh) 多核数字信号处理系统中处理任务的方法和装置
US10459773B2 (en) PLD management method and PLD management system
JP2008535099A (ja) 拡張割込み制御装置および合成割込みソースに関するシステムおよび方法
JP2013546098A (ja) グラフィックス計算プロセススケジューリング
CN103262002A (zh) 优化系统调用请求通信
JP2013546097A (ja) グラフィックス処理計算リソースのアクセシビリティ
US9378047B1 (en) Efficient communication of interrupts from kernel space to user space using event queues
US20210200704A1 (en) Input/output command rebalancing in a virtualized computer system
CN103582877A (zh) 计算机系统中断处理
US10572412B1 (en) Interruptible computing instance prioritization
US10372470B2 (en) Copy of memory information from a guest transmit descriptor from a free pool and assigned an intermediate state to a tracking data structure
US20200334070A1 (en) Management of dynamic sharing of central processing units
Sajjapongse et al. A flexible scheduling framework for heterogeneous CPU-GPU clusters
KR102576443B1 (ko) 연산 장치 및 그 잡 스케줄링 방법
US9547522B2 (en) Method and system for reconfigurable virtual single processor programming model
US11954534B2 (en) Scheduling in a container orchestration system utilizing hardware topology hints
CN117472570A (zh) 用于调度加速器资源的方法、装置、电子设备和介质
Anedda et al. Flexible Clusters for High-Performance Computing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17845053

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17845053

Country of ref document: EP

Kind code of ref document: A1