CN108089920A - Method, apparatus, and system for data processing - Google Patents

Method, apparatus, and system for data processing

Info

Publication number
CN108089920A
Authority
CN
China
Prior art keywords
processor
complexity
cpu
data
pending data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611042690.2A
Other languages
Chinese (zh)
Inventor
柴守刚
梁文亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201611042690.2A
Publication of CN108089920A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5044 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Sources (AREA)

Abstract

An embodiment of this application discloses a method, an apparatus, and a system for data processing. The method is performed in a system that includes a first processor and a second processor, where the processing capability of the first processor is lower than that of the second processor. The method includes: determining a complexity threshold according to the processing capability of the first processor; and determining, from the first processor and the second processor and according to the complexity of pending data and the complexity threshold, a target processor for processing the pending data. Because the complexity threshold is determined from the processing capability of a processor, and the target processor is determined from that threshold and the complexity of the pending data, task complexity is matched to processor capability, which improves the computing resource utilization and the energy efficiency ratio of the system.

Description

Method, apparatus, and system for data processing
Technical field
This application relates to the computer field, and more particularly to a method, an apparatus, and a system for data processing.
Background
A server system usually consists of multiple CPUs (Central Processing Units). Each CPU has its own memory, hard disk, I/O (input/output) units, and other devices, and every two CPUs are directly connected to each other. A CPU manages its memory modules through a MAC (Memory Access Controller) and connects to external devices through a PCI (Peripheral Component Interconnect) interface; if there are multiple external devices, a PCI switch must also be added. Two CPUs communicate directly through a PPI (Processor to Processor Interface), so that peripheral devices can be shared.
Currently, the CPUs of a server system are powerful and can complete complex computing tasks. In practice, however, there are often simple but long-running tasks (such as Ethernet data transceiving and operating system task scheduling). Although these tasks are simple, each one still needs exclusive use of a CPU core (either a physical core or a logical core provided by hyper-threading). When a current server system handles these tasks, powerful cores are occupied by simple tasks for long periods and cannot reach their full potential, which wastes computing resources. Handling these tasks on a current server system also consumes a lot of energy, so the energy efficiency ratio of the server system is low.
Summary of the invention
In view of this, embodiments of this application provide a method, an apparatus, and a system for data processing. Simple tasks are handled by a processor with weaker processing capability, and complex tasks by a processor with stronger processing capability. This keeps simple tasks from occupying the stronger processor and matches task complexity to processing capability, which improves the computing resource utilization and the energy efficiency ratio of the server system.
According to one aspect, a data processing method is provided. The method is performed in a system that includes a first processor and a second processor, where the processing capability of the first processor is lower than that of the second processor. The method includes: determining a complexity threshold according to the processing capability of the first processor; and determining, from the first processor and the second processor and according to the complexity of pending data and the complexity threshold, a target processor for processing the pending data.
In the data processing method provided by the embodiments of this application, the complexity threshold is determined according to the processing capability of a processor, and the target processor is determined according to the complexity threshold and the complexity of the pending data. Task complexity is thereby matched to the processing capability of the processor, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, determining the target processor from the first processor and the second processor according to the complexity of the pending data and the complexity threshold includes: if the complexity of the pending data is less than or equal to the complexity threshold, determining the first processor as the target processor; and if the complexity of the pending data is greater than the complexity threshold, determining the second processor as the target processor.
In the data processing method provided by the embodiments of this application, simple tasks are handled by the processor with weaker processing capability and complex tasks by the processor with stronger processing capability. This keeps simple tasks from occupying the stronger processor and matches task complexity to processing capability, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, before the target processor is determined, the data processing method further includes: determining the complexity of the pending data according to the type of the task to which the pending data belongs.
In the data processing method provided by the embodiments of this application, the complexity of the pending data is determined according to the type of the task to which the pending data belongs, and the target processor is determined according to the complexity of the pending data and the complexity threshold. Task complexity is thereby matched to the processing capability of the processor, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, the target processor is connected to memory through a MAC.
In the data processing method provided by the embodiments of this application, the target processor obtains the pending data by reading it from memory. The method can therefore be applied flexibly to systems of various architectures: whether the first CPU and the second CPU use the same instruction set architecture or different ones, the method enables quick and easy interconnection between the first CPU and the second CPU.
Optionally, the first processor and the second processor are connected through an LLC (Last Level Cache).
In the data processing method provided by the embodiments of this application, the target processor obtains the pending data by reading it from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
In the data processing method provided by the embodiments of this application, the target processor reads the pending data stored in the LLC of another processor through its own LLC, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
According to another aspect, a data processing apparatus is provided. The apparatus is configured in a system that includes a first processor and a second processor, where the processing capability of the first processor is lower than that of the second processor. The apparatus includes: a determination unit, configured to determine a complexity threshold according to the processing capability of the first processor; and a processing unit, configured to determine, from the first processor and the second processor and according to the complexity of pending data and the complexity threshold determined by the determination unit, a target processor for processing the pending data.
In the data processing apparatus provided by the embodiments of this application, the complexity threshold is determined according to the processing capability of a processor, and the target processor is determined according to the complexity threshold and the complexity of the pending data. Task complexity is thereby matched to the processing capability of the processor, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, the processing unit is specifically configured to: if the complexity of the pending data is less than or equal to the complexity threshold, determine the first processor as the target processor; and if the complexity of the pending data is greater than the complexity threshold, determine the second processor as the target processor.
In the data processing apparatus provided by the embodiments of this application, simple tasks are handled by the processor with weaker processing capability and complex tasks by the processor with stronger processing capability. This keeps simple tasks from occupying the stronger processor and matches task complexity to processing capability, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, the processing unit is further configured to: determine the complexity of the pending data according to the type of the task to which the pending data belongs.
In the data processing apparatus provided by the embodiments of this application, the complexity of the pending data is determined according to the type of the pending data, and the target processor is determined according to the complexity of the pending data and the complexity threshold. Task complexity is thereby matched to the processing capability of the processor, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, the target processor is connected to memory through a MAC.
In the data processing apparatus provided by the embodiments of this application, the target processor obtains the pending data by reading it from memory. The apparatus can therefore be applied flexibly to systems of various architectures: whether the first CPU and the second CPU use the same instruction set architecture or different ones, the data processing method provided by the embodiments of this application enables quick and easy interconnection between the first CPU and the second CPU.
Optionally, the first processor and the second processor are connected through an LLC.
In the data processing apparatus provided by the embodiments of this application, the target processor obtains the pending data by reading it from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
In the data processing apparatus provided by the embodiments of this application, the target processor reads the pending data stored in the LLC of another processor through its own LLC, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
According to yet another aspect, an embodiment of this application provides a data processing system, which includes the first processor and the second processor described in the foregoing aspects.
According to yet another aspect, an embodiment of this application provides a computer storage medium for storing the computer software instructions used by the foregoing data processing system, including a program designed to perform the foregoing aspects.
Brief description of the drawings
Fig. 1 is a schematic architectural diagram of a system to which the embodiments of this application apply;
Fig. 2 is a schematic flowchart of a data processing method provided by an embodiment of this application;
Fig. 3 is a schematic flowchart of a method for obtaining pending data provided by an embodiment of this application;
Fig. 4 is a schematic flowchart of another method for obtaining pending data provided by an embodiment of this application;
Fig. 5 is a schematic flowchart of yet another method for obtaining pending data provided by an embodiment of this application;
Fig. 6 is a schematic flowchart of still another method for obtaining pending data provided by an embodiment of this application;
Fig. 7 is a schematic diagram of a data processing apparatus provided by an embodiment of this application.
Detailed description of embodiments
The solutions in the embodiments of this application are described below with reference to the accompanying drawings.
Fig. 1 shows a schematic architectural diagram of a system to which the embodiments of this application apply. As shown in Fig. 1, the system 100 includes CPU 1, CPU 2, and an Offload CPU, where the processing capabilities of CPU 1 and CPU 2 are both higher than that of the Offload CPU. It should be understood that the Offload CPU handles the simpler tasks that the system needs to process. Its name should not be understood as a limitation on the embodiments of this application: any other CPU with weaker processing capability that is used to handle simple tasks falls within the scope of protection of the embodiments of this application.
As an optional example, the Offload CPU can be directly connected to, and communicate with, multiple CPUs. As shown in Fig. 1, the Offload CPU communicates with CPU 1 and CPU 2 through path 1 and path 2, respectively. Optionally, if the Offload CPU can be directly connected to only one CPU (for example, CPU 1), the Offload CPU can communicate with CPU 2 indirectly through CPU 1.
Each CPU includes at least one core. Optionally, CPU 1 and CPU 2 shown in Fig. 1 each include four cores and the Offload CPU includes two cores, but the number of cores per CPU in a system to which the embodiments of this application apply is not limited to this. In addition, a system to which the embodiments of this application apply includes at least one CPU with stronger processing capability and at least one CPU with weaker processing capability.
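The topology in Fig. 1 can be written down as a small data structure. The following is only an illustrative sketch: the `Cpu` class, the `strong` flag, and the `is_valid_system` helper are hypothetical names that do not appear in the application, and the core counts are the optional ones given above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cpu:
    name: str
    cores: int
    strong: bool  # hypothetical flag: True for a CPU with stronger processing capability


# System 100 with the optional core counts from Fig. 1.
SYSTEM_100 = [
    Cpu("CPU 1", cores=4, strong=True),
    Cpu("CPU 2", cores=4, strong=True),
    Cpu("Offload CPU", cores=2, strong=False),
]


def is_valid_system(cpus: List[Cpu]) -> bool:
    """Check the two stated constraints: every CPU has at least one core, and
    the system has at least one stronger CPU and at least one weaker CPU."""
    return (
        all(cpu.cores >= 1 for cpu in cpus)
        and any(cpu.strong for cpu in cpus)
        and any(not cpu.strong for cpu in cpus)
    )


print(is_valid_system(SYSTEM_100))
```

A system made up only of strong CPUs fails the check, which reflects the requirement that at least one weaker CPU be present.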
Optionally, the system 100 further includes one or more of memory, a switch, a hard disk, an I/O unit, and an accelerator. The main function of the accelerator is to perform certain special computing tasks, such as modulation and demodulation calculations in communication signal processing and video encoding and decoding calculations in multimedia applications.
The communication modes between the devices (which may also be called "units" or "modules") shown in Fig. 1 are examples only, and the embodiments of this application are not limited to them. For example, CPU 1 can also communicate with the switch through a PCIe (Peripheral Component Interconnect Express) unit.
Fig. 2 is a schematic flowchart of a data processing method provided by an embodiment of this application. As shown in Fig. 2, the method 200 includes:
S210: Determine a complexity threshold according to the processing capability of the first processor.
S220: According to the complexity of pending data and the complexity threshold, determine, from the first processor and the second processor, a target processor for processing the pending data.
The method 200 can be performed by a main processor (that is, the processor that runs the operating system). The main processor may be the first processor, the second processor, or another processor in the system other than the first processor and the second processor. It should be understood that "system" here refers to a device that includes the first processor and the second processor and implements certain functions, and is distinct from an "operating system"; the term has the same meaning below and is not explained again. For brevity, the first processor is hereinafter called the first CPU and the second processor the second CPU.
In general, for processors with the same architecture and manufacturing process, processing capability is proportional to energy consumption. In the embodiments of this application, the processing capability of the first CPU is lower than that of the second CPU, so when processing the same data the first CPU consumes less energy than the second CPU. The tasks that a system must handle differ between application scenarios, and so does the complexity of the data that the processors must process.
For example, in a high-performance computing scenario, the main function of the operating system is to schedule computing tasks onto the available processors of the system. Task scheduling is usually much simpler than executing the tasks themselves and needs only simple logical calculations and fixed-point calculations. On the other hand, task scheduling runs continuously, because tasks keep completing and new tasks keep arriving. In this case the operating system is itself a continuous simple task. If the operating system runs on a powerful server processor, computing resources and energy are wasted. With the method 200 provided by the embodiments of this application, the operating system can be loaded onto the first CPU at startup, which frees the second CPU to perform complex task computation exclusively.
As another example, in baseband processing for C-RAN (Centralized Radio Access Network), the server must continuously send and receive user data and control information to and from the core network, and must also continuously send and receive physical layer data to and from the RRU (Remote Radio Unit). These tasks are very simple and require neither complex logical calculations nor a large number of floating-point operations, but they are always present. Executing them on the server's powerful processors wastes a great deal of computing resources and also consumes a lot of power. With the method 200 provided by the embodiments of this application, thread binding can be performed in the application program so that these continuous simple tasks are bound in advance to the first CPU for execution. This matches task complexity to processing capability and improves both computing resource utilization and energy efficiency.
In S210, different first CPUs have different processing capabilities. To prevent the first CPU from being unable to complete the tasks assigned to it, the main processor determines the complexity threshold according to the processing capability of the first CPU. The complexity threshold indicates the maximum complexity of the data that the first CPU can process.
In S220, the main processor determines the target processor from the first CPU and the second CPU according to the complexity of the pending data and the complexity threshold. The target processor processes the pending data, so that the complexity of the pending data matches the processing capability of the target processor.
It should be understood that the complexity threshold can change with the application scenario. For example, if the system currently has many pending tasks, the complexity threshold can be raised so that some tasks of higher complexity are assigned to the first CPU; if the system currently has few pending tasks, the complexity threshold can be lowered so that some tasks of higher complexity are assigned to the second CPU. This helps balance the load of the whole system.
The complexity threshold can also be a preset fixed value. If the system currently has many pending tasks and the first CPU has few, tasks whose complexity is greater than the complexity threshold can be assigned to the first CPU; if the system currently has few pending tasks and the first CPU has many, tasks whose complexity is less than the complexity threshold can be assigned to the second CPU. This also helps balance the load of the whole system.
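The two load-balancing policies just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function names, the `busy_at`/`idle_at` cutoffs, the `step` size, and the boolean load flags are all hypothetical, since the application does not say how "many" or "few" pending tasks are measured.

```python
def adjust_threshold(base: float, pending_tasks: int,
                     busy_at: int = 100, idle_at: int = 10,
                     step: float = 1.0) -> float:
    """Adaptive policy: raise the threshold when the system has many pending
    tasks, lower it when it has few. Cutoffs and step size are assumptions."""
    if pending_tasks >= busy_at:
        return base + step            # more tasks become eligible for the first CPU
    if pending_tasks <= idle_at:
        return max(0.0, base - step)  # more tasks go to the second CPU
    return base


def pick_with_fixed_threshold(complexity: float, threshold: float,
                              system_busy: bool, first_cpu_busy: bool) -> str:
    """Fixed-threshold policy with the two load-based exceptions."""
    if system_busy and not first_cpu_busy:
        return "first"   # the first CPU may also take tasks above the threshold
    if not system_busy and first_cpu_busy:
        return "second"  # the second CPU may also take tasks below the threshold
    return "first" if complexity <= threshold else "second"


print(adjust_threshold(5.0, pending_tasks=200))
print(pick_with_fixed_threshold(8.0, 5.0, system_busy=True, first_cpu_busy=False))
```

In this sketch the exceptions apply to every task regardless of how far its complexity is from the threshold. A practical scheduler would probably bound that distance so that the first CPU is never handed work it cannot complete.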
In the embodiments of this application, the first CPU performs simple tasks. It therefore does not need very strong processing capability, but it must have low power consumption. The first CPU may have one processor core or several. For example, the first CPU can be any one of the following three kinds of CPU:
1) a low-power x86 processor;
2) a low-power CPU implemented with an FPGA (Field-Programmable Gate Array) (because an FPGA is flexibly programmable, it can implement any desired CPU function);
3) a processor with the ARM (Advanced RISC Machines, where RISC stands for "Reduced Instruction Set Computer") architecture or the MIPS (Microprocessor without Interlocked Pipelined Stages) architecture.
In summary, in the data processing method provided by the embodiments of this application, the complexity threshold is determined according to the processing capability of a processor, and the target processor is determined according to the complexity threshold and the complexity of the pending data. Task complexity is thereby matched to the processing capability of the processor, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, determining the target processor from the first processor and the second processor according to the complexity of the pending data and the complexity threshold includes:
S221: If the complexity of the pending data is less than or equal to the complexity threshold, determine the first processor as the target processor.
S222: If the complexity of the pending data is greater than the complexity threshold, determine the second processor as the target processor.
Therefore, in the data processing method provided by the embodiments of this application, simple tasks are handled by the processor with weaker processing capability and complex tasks by the processor with stronger processing capability. This keeps simple tasks from occupying the stronger processor and matches task complexity to processing capability, which improves the computing resource utilization and the energy efficiency ratio of the system.
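Steps S210, S221, and S222 can be sketched in a few lines. The sketch assumes that processing capability and data complexity are expressed on one shared numeric scale, which the application does not specify; the `Processor` class, the `capability` score, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    capability: float  # hypothetical abstract score; higher means more capable


def determine_threshold(first: Processor) -> float:
    """S210: derive the complexity threshold from the weaker processor.

    Assumption: the threshold equals the largest complexity the first
    processor can handle, expressed on the same scale as `capability`.
    """
    return first.capability


def select_target(complexity: float, threshold: float,
                  first: Processor, second: Processor) -> Processor:
    """S221/S222: data at or below the threshold goes to the first processor,
    data above the threshold goes to the second processor."""
    return first if complexity <= threshold else second


offload_cpu = Processor("Offload CPU", capability=3.0)
cpu_1 = Processor("CPU 1", capability=10.0)
threshold = determine_threshold(offload_cpu)

print(select_target(2.0, threshold, offload_cpu, cpu_1).name)
print(select_target(7.5, threshold, offload_cpu, cpu_1).name)
```

Note that the comparison is inclusive: data whose complexity exactly equals the threshold is still assigned to the first processor, as S221 states.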
Optionally, before the target processor is determined, the data processing method 200 further includes:
S201: Determine the complexity of the pending data according to the type of the task to which the pending data belongs.
The complexity of pending data is usually related to the type of the task to which the data belongs. For example, a video editing task usually requires a large number of encoding and decoding calculations on the data and needs a processor with strong processing capability, so the video data of a video editing task can be determined to be pending data of higher complexity. As another example, a data transceiving task usually only forwards data and requires no complex logical calculations or floating-point operations, so the data of a data transceiving task can be determined to be pending data of lower complexity.
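Step S201 amounts to a lookup from task type to complexity. The following sketch uses a simple table; the `TaskType` members follow the examples in the text, but the numeric scores are invented for illustration and carry no meaning beyond their ordering.

```python
from enum import Enum


class TaskType(Enum):
    DATA_TRANSCEIVING = "data_transceiving"
    TASK_SCHEDULING = "task_scheduling"
    VIDEO_EDITING = "video_editing"


# Hypothetical complexity scores; only their relative order reflects the text.
COMPLEXITY_BY_TASK_TYPE = {
    TaskType.DATA_TRANSCEIVING: 1.0,  # forwarding only, no heavy computation
    TaskType.TASK_SCHEDULING: 2.0,    # simple logical and fixed-point calculations
    TaskType.VIDEO_EDITING: 9.0,      # many encoding and decoding calculations
}


def complexity_of(task_type: TaskType) -> float:
    """S201: determine the complexity of pending data from its task type."""
    return COMPLEXITY_BY_TASK_TYPE[task_type]


print(complexity_of(TaskType.VIDEO_EDITING) > complexity_of(TaskType.DATA_TRANSCEIVING))
```

A table keeps the classification explicit and easy to extend; the value it returns is what gets compared with the complexity threshold.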
In the embodiments of this application, scheduling logic can be added to the application program to bind tasks of lower complexity to the cores of the first CPU. Alternatively, processing logic can be added to the task scheduling layer of the operating system so that the operating system automatically schedules tasks of lower complexity to the first CPU.
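Application-level binding of this kind is commonly done with CPU affinity. The sketch below uses Python's `os.sched_setaffinity`, which is available on Linux only. It assumes that the cores of the first CPU are visible to the operating system under known core numbers; `OFFLOAD_CORES` and `bind_to_cores` are hypothetical names, and the core numbers are placeholders.

```python
import os

# Hypothetical: core numbers under which the first CPU's cores are exposed.
OFFLOAD_CORES = {0}


def bind_to_cores(cores: set) -> set:
    """Pin the caller to `cores` where the platform supports it.

    Returns the effective affinity set, or an empty set when the platform
    has no affinity API (for example macOS or Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return set()
    allowed = os.sched_getaffinity(0)
    # Fall back to the current set if none of the requested cores is usable.
    target = (set(cores) & allowed) or allowed
    os.sched_setaffinity(0, target)
    return set(os.sched_getaffinity(0))


if __name__ == "__main__":
    print(bind_to_cores(OFFLOAD_CORES))
```

A low-complexity task, such as a data transceiving loop, would call `bind_to_cores(OFFLOAD_CORES)` once at startup and then run normally.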
The foregoing embodiments are examples only, and the embodiments of this application are not limited to them. In the data processing method provided by the embodiments of this application, the complexity of the pending data is determined according to the type of the task to which the pending data belongs, and the target processor is determined according to the complexity of the pending data and the complexity threshold. Task complexity is thereby matched to the processing capability of the processor, which improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, the target processor is connected to memory through a MAC.
After the target processor is determined, it must obtain the pending data and process it. The target processor can obtain the pending data directly from another device (such as a switch) through a PCI interface, or from another processor in the same system through a PPI.
Fig. 3 shows a schematic flowchart of a method for obtaining pending data provided by an embodiment of this application. As shown in Fig. 3, the first CPU and the second CPU communicate through their respective PPIs, and each can also perform read and write operations on memory through its own MAC. Assuming that the first CPU has been determined as the target processor, the first CPU can obtain the pending data from the second CPU through the following steps.
S310: The second CPU writes the pending data to memory 302 through MAC 301.
S320: MAC 301 sends the pending data to memory 305.
For example, MAC 301 can use a DMA (Direct Memory Access) mechanism to send the pending data to memory 305 through PPI 303 and PPI 304.
S330: The first CPU reads the pending data from memory 305 through MAC 306.
The foregoing embodiment is an example only, and the embodiments of this application are not limited to it. The first CPU can also use other methods in the prior art to obtain the pending data that the second CPU writes to memory; for brevity, details are not described here. In addition, the first CPU can also read pending data that other devices write to memory 350.
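The data path of steps S310 to S330 can be modeled as a toy simulation. This is only a software analogy for a hardware mechanism: the `Memory` class and `dma_copy` function are hypothetical, and the variable names simply reuse the reference numerals 302 and 305 from Fig. 3.

```python
class Memory:
    """A toy memory module addressed by byte offset."""

    def __init__(self, size: int) -> None:
        self.cells = bytearray(size)

    def write(self, addr: int, data: bytes) -> None:
        self.cells[addr:addr + len(data)] = data

    def read(self, addr: int, length: int) -> bytes:
        return bytes(self.cells[addr:addr + length])


def dma_copy(src: Memory, dst: Memory, addr: int, length: int) -> None:
    """Stand-in for the DMA transfer over the two PPIs: no CPU touches the data."""
    dst.write(addr, src.read(addr, length))


memory_302 = Memory(64)  # memory attached to the second CPU
memory_305 = Memory(64)  # memory attached to the first CPU
payload = b"pending data"

memory_302.write(0, payload)                        # S310: second CPU writes
dma_copy(memory_302, memory_305, 0, len(payload))   # S320: transfer to memory 305
received = memory_305.read(0, len(payload))         # S330: first CPU reads

print(received)
```

Because the two CPUs exchange plain bytes through memory, nothing in this path depends on their instruction sets.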
In the data processing method provided by the embodiments of this application, the target processor obtains the pending data by reading it from memory. The method can therefore be applied flexibly to systems of various architectures: whether the first CPU and the second CPU use the same instruction set architecture or different ones, the method enables quick and easy interconnection between the first CPU and the second CPU.
Optionally, the first processor and the second processor are connected through an LLC.
Almost all current CPUs have multi-level caches. A cache has a smaller capacity than memory but a faster access rate. Other processors can therefore write the pending data to a cache in advance, and the target processor obtains the pending data from the cache, which increases the rate at which pending data is obtained.
Fig. 4 shows the schematic flow chart of another method for obtaining pending data provided by the embodiments of the present application. As shown in figure 4, the LLC and the LLC of the 2nd CPU of the first CPU are connected by PPI, in this way, can pass through LLC between two CPU Direct communication, since LLC has faster access rate, also have between the first CPU and the 2nd CPU and communicate faster Rate.Assuming that currently definite first CPU is target processor, then the first CPU can be by following step from other processor (examples Such as the 2nd CPU) obtain pending data.
S410: The second CPU writes the pending data to LLC401.
S420: LLC405 reads the pending data from LLC401 through PPI402 and PPI403 according to an instruction from the first CPU.
S430: LLC404 sends the pending data to the core of the first CPU.
It should be noted that a cache coherency problem may exist between the LLCs of the two CPUs: the LLCs of both CPUs may hold copies of the same data, and because the two CPUs read and write their own copies independently, the two copies of the same data may become inconsistent. The cache coherency problem can be solved by a CC (Cache Coherent) protocol, and specifically by a CA (Cache Agent) unit.
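The coherency problem described above, and one common way to avoid it, can be sketched as follows. This is a simplified illustration rather than the protocol used by any real CPU: the `Cache` and `CacheAgent` classes, the write-invalidate policy, and the write-through backing store are all assumptions made for the example.

```python
class Cache:
    """A toy LLC holding address -> value copies."""

    def __init__(self, name):
        self.name = name
        self.lines = {}


class CacheAgent:
    """Hypothetical stand-in for a cache agent using a write-invalidate policy."""

    def __init__(self, caches, backing):
        self.caches = caches
        self.backing = backing  # shared store that always holds the latest value

    def write(self, writer, addr, value):
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)  # invalidate stale copies in peers
        writer.lines[addr] = value
        self.backing[addr] = value           # write-through, for simplicity

    def read(self, reader, addr):
        if addr not in reader.lines:         # miss: refill from the backing store
            reader.lines[addr] = self.backing[addr]
        return reader.lines[addr]


backing = {0x40: 1}
llc_first, llc_second = Cache("first CPU"), Cache("second CPU")
agent = CacheAgent([llc_first, llc_second], backing)

# Without coordination: both LLCs hold a copy, one CPU writes locally,
# and the two copies of the same data diverge.
llc_first.lines[0x40] = llc_second.lines[0x40] = 1
llc_second.lines[0x40] = 2
diverged = llc_first.lines[0x40] != llc_second.lines[0x40]

# With coordination: the write goes through the agent, which invalidates
# the other copy, so the next read by the first CPU sees the new value.
agent.write(llc_second, 0x40, 3)
value_seen_by_first = agent.read(llc_first, 0x40)
```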
The above embodiment is merely an example, and the embodiments of the present application are not limited thereto. The first CPU can also use other prior-art methods to obtain the pending data that the second CPU has written to the cache; for brevity, details are not described here. In addition, the first CPU can also read pending data that other devices have written to LLC405.
According to the data processing method provided by the embodiments of the present application, the target processor obtains the pending data by reading it from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
In the embodiments of the present application, the target processor and the other processors may or may not belong to the same instruction set architecture.
When the target processor (for example, the first CPU) and another processor (for example, the second CPU) belong to the same instruction set architecture, the cache coherency protocols and PPIs of the first CPU and the second CPU match exactly, and there is no compatibility problem.
For example, if the first CPU and the second CPU are both Intel CPUs, the PPI is Intel's QPI (QuickPath Interconnect) bus, and cache coherency is ensured by Intel's MESIF (Modified, Exclusive, Shared, Invalid, and Forward) protocol. If the first CPU and the second CPU are both AMD (Advanced Micro Devices) CPUs, the PPI is AMD's HyperTransport bus, and cache coherency is ensured by AMD's MOESI (Modified, Owned, Exclusive, Shared, and Invalid) protocol.
When the target processor (for example, the first CPU) and another processor (for example, the second CPU) do not belong to the same instruction set architecture, the cache coherency protocols and PPIs of the first CPU and the second CPU have compatibility problems. Depending on how the first CPU is implemented, this can be handled in either of the following two ways.
Mode 1
If the first CPU is implemented as an IP core (intellectual property core) on an FPGA, the programmability of the FPGA can be used to implement a PPI interface conversion module and a cache coherency protocol conversion module in the FPGA, thereby connecting two CPUs of different instruction set architectures.
As shown in Fig. 5, the first CPU includes a protocol conversion module cvt2 and an interface conversion module cvt1. cvt1 converts between PPIs, and cvt2 converts between cache coherency protocols. For example, if the second CPU uses the Intel x86 architecture and the first CPU uses the ARM architecture, cvt1 converts between the QPI protocol and the AXI (Advanced eXtensible Interface) protocol, and cvt2 converts between the MESIF protocol and the ACE (AMBA Coherency Extensions, where AMBA stands for "Advanced Microcontroller Bus Architecture") protocol.
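One small part of what a module like cvt2 has to do is translate cache line states between the two protocols. The sketch below shows one plausible state mapping from MESIF to the five ACE line states. The mapping table and the function name `cvt2_translate` are assumptions made for illustration and are not taken from the application; a real converter must also translate transactions and snoop messages, which are not modeled here.

```python
from enum import Enum


class Mesif(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"
    FORWARD = "F"


class Ace(Enum):
    UNIQUE_DIRTY = "UniqueDirty"
    UNIQUE_CLEAN = "UniqueClean"
    SHARED_DIRTY = "SharedDirty"
    SHARED_CLEAN = "SharedClean"
    INVALID = "Invalid"


# Assumed mapping: "unique" means no other cache holds the line,
# "dirty" means the line differs from memory.
MESIF_TO_ACE = {
    Mesif.MODIFIED: Ace.UNIQUE_DIRTY,
    Mesif.EXCLUSIVE: Ace.UNIQUE_CLEAN,
    Mesif.SHARED: Ace.SHARED_CLEAN,
    Mesif.FORWARD: Ace.SHARED_CLEAN,  # the designated-responder role is lost
    Mesif.INVALID: Ace.INVALID,
}


def cvt2_translate(state: Mesif) -> Ace:
    """Translate a MESIF line state to an ACE line state (illustrative only)."""
    return MESIF_TO_ACE[state]
```

The reverse direction is not a simple table lookup under this mapping, because `SharedDirty` has no direct MESIF counterpart; a converter would need extra steps, such as a write-back, before reporting the line as shared.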
Because the FPGA is flexibly programmable, the PPI interface in the first CPU can also be modified to be the same as the PPI interface used by the second CPU, so that the first CPU can communicate with the second CPU without the interface conversion module cvt1.
Mode 2
If the first CPU is implemented with a non-programmable device, the first CPU and the second CPU can communicate through a conversion module located outside the two CPUs.
As shown in Fig. 6, the first CPU and the second CPU communicate through the conversion module PPI&CC cvt located between the two CPUs. PPI&CC cvt implements both the PPI conversion and the cache coherency protocol conversion between the first CPU and the second CPU. PPI&CC cvt can be implemented with prior-art methods; for brevity, details are not described here.
According to the data processing method provided by the embodiments of the present application, the target processor reads the pending data stored in the LLC of another processor through the LLC of the target processor, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
The data processing method provided by the embodiments of the present application has been described in detail above with reference to Fig. 2 to Fig. 6. The data processing device provided by the embodiments of the present application is described in detail below with reference to Fig. 7.
As shown in Fig. 7, device 700 includes:
a determination unit 710, configured to determine a complexity threshold according to the processing capacity of the first processor; and
a processing unit 720, configured to determine, from the first processor and the second processor, a target processor for processing the pending data according to the complexity of the pending data and the complexity threshold determined by the determination unit 710.
In the embodiments of the present application, device 700 is configured in a system that includes the first processor and the second processor, where the processing capacity of the first processor is lower than the processing capacity of the second processor.
It should be understood that device 700 describes the data processing device provided by the embodiments of the present application from a functional perspective. Device 700 may correspond to the executing entity of the data processing method according to the embodiments of the present application, and the above and other operations and/or functions of the units in device 700 respectively implement the corresponding flows of the methods in Fig. 2 to Fig. 6; for brevity, details are not described here.
The data processing device provided by the embodiments of the present application determines a complexity threshold according to the processing capacity of a processor, and determines the target processor according to the complexity threshold and the complexity of the pending data, so that task complexity matches processor processing capacity and the computing resource utilization and energy efficiency ratio of the system are improved.
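This passage does not fix a formula for deriving the complexity threshold from processing capacity. One possible approach, shown purely as an assumption, is to take the largest amount of work the first processor can finish within a latency budget. The function name `complexity_threshold`, the unit "operations per millisecond", and the example numbers are all hypothetical.

```python
def complexity_threshold(ops_per_ms: int, latency_budget_ms: int) -> int:
    """Largest task, in operations, that the weaker processor can finish in time.

    ops_per_ms: assumed measure of the first processor's processing capacity.
    latency_budget_ms: assumed acceptable completion time for one task.
    """
    if ops_per_ms <= 0 or latency_budget_ms <= 0:
        raise ValueError("capacity and latency budget must be positive")
    return ops_per_ms * latency_budget_ms


# A first processor that sustains 2,000,000 operations per millisecond,
# with a 5 ms budget per task.
threshold = complexity_threshold(2_000_000, 5)
```

With this choice, a faster first processor or a looser latency budget raises the threshold, so more tasks stay on the first processor.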
Optionally, the processing unit 720 is specifically configured to:
if the complexity of the pending data is less than or equal to the complexity threshold, determine the first processor as the target processor; and
if the complexity of the pending data is greater than the complexity threshold, determine the second processor as the target processor.
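The selection rule stated in the two conditions above can be written directly as a comparison. In this minimal sketch the names `Processor` and `select_target` are hypothetical, and complexity is assumed to be a plain number on the same scale as the threshold.

```python
from enum import Enum


class Processor(Enum):
    FIRST = "first processor"    # lower processing capacity
    SECOND = "second processor"  # higher processing capacity


def select_target(complexity: float, threshold: float) -> Processor:
    """Pick the target processor for pending data of the given complexity."""
    if complexity <= threshold:
        return Processor.FIRST
    return Processor.SECOND
```

Note that data whose complexity equals the threshold is assigned to the first processor, as the first condition specifies.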
The data processing device provided by the embodiments of the present application has the processor with weaker processing capacity handle simple tasks and the processor with stronger processing capacity handle complex tasks. This prevents simple tasks from occupying the stronger processor and matches task complexity to processing capacity, thereby improving the computing resource utilization and energy efficiency ratio of the system.
Optionally, the processing unit 720 is further configured to:
determine the complexity of the pending data according to the type of the task to which the pending data belongs.
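Determining complexity from the task type can be as simple as a lookup table. The sketch below assumes such a table; the task type names, the complexity scores, and the choice to treat unknown types as maximally complex are all illustrative assumptions and are not specified in this passage.

```python
# Hypothetical task types and complexity scores on an arbitrary 1-10 scale.
TASK_COMPLEXITY = {
    "packet_forwarding": 1,
    "checksum": 2,
    "encryption": 6,
    "video_transcoding": 9,
}

# Assumed policy: an unknown task type is treated as the most complex,
# so it is never routed to the weaker processor by accident.
UNKNOWN_TASK_COMPLEXITY = 10


def complexity_of(task_type: str) -> int:
    """Return the complexity of pending data from the type of its task."""
    return TASK_COMPLEXITY.get(task_type, UNKNOWN_TASK_COMPLEXITY)
```

For example, with a complexity threshold of 5 on this scale, data belonging to a `checksum` task would be at or below the threshold, while data belonging to a `video_transcoding` task would exceed it.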
The data processing device provided by the embodiments of the present application determines the complexity of the pending data according to the type of the task to which the pending data belongs, and determines the target processor according to the complexity of the pending data and the complexity threshold, so that task complexity matches processor processing capacity and the computing resource utilization and energy efficiency ratio of the system are improved.
Optionally, the target processor is connected to memory through a MAC.
According to the data processing device provided by the embodiments of the present application, the target processor obtains the pending data by reading it from memory, so the device can be applied flexibly in systems of various architectures. Regardless of whether the first CPU and the second CPU use the same instruction set architecture or different instruction set architectures, the data processing method provided by the embodiments of the present application can quickly and easily interconnect the first CPU and the second CPU.
Optionally, the first processor and the second processor are connected through an LLC.
According to the data processing device provided by the embodiments of the present application, the target processor obtains the pending data by reading it from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
According to the data processing device provided by the embodiments of the present application, the target processor reads the pending data stored in the LLC of another processor through the LLC of the target processor, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
Further, the data processing device provided by the embodiments of the present application may also consist of a processor and a memory, where the processor is configured to execute the data processing method provided by the embodiments of the present application, and the memory is configured to store the instructions executed by the processor.
Those skilled in the art can clearly understand that, for convenience and brevity of description, reference may be made to the corresponding processes in the foregoing method embodiments for the specific working processes of the device and units described above; details are not described here again.
In the embodiments of the present application, the sequence numbers of the processes do not imply an execution order. The execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
In addition, the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B can mean: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it.
Those of ordinary skill in the art will appreciate that the units and steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled professionals can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present application.
The systems, devices, and methods disclosed in the embodiments provided herein can be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation. For instance, multiple units or components can be combined or integrated into another system, or some features can be ignored or not performed. In addition, the mutual coupling or communication connection between components or units can be an indirect coupling or communication connection through some interfaces, devices, or units, and can be electrical, mechanical, or of another form.
The above is only a specific implementation of the present application, but the protection scope of the present application is not limited thereto.

Claims (12)

1. A data processing method, characterized in that it is performed in a system including a first processor and a second processor, wherein the processing capacity of the first processor is lower than the processing capacity of the second processor, and the data processing method comprises:
determining a complexity threshold according to the processing capacity of the first processor; and
determining, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold.
2. The data processing method according to claim 1, characterized in that determining, from the first processor and the second processor, the target processor for processing the pending data according to the complexity of the pending data and the complexity threshold comprises:
if the complexity of the pending data is less than or equal to the complexity threshold, determining the first processor as the target processor; and
if the complexity of the pending data is greater than the complexity threshold, determining the second processor as the target processor.
3. The data processing method according to claim 1 or 2, characterized in that, before the target processor is determined, the data processing method further comprises:
determining the complexity of the pending data according to the type of the task to which the pending data belongs.
4. The data processing method according to any one of claims 1 to 3, characterized in that the target processor is connected to memory through a memory controller (MAC).
5. The data processing method according to any one of claims 1 to 3, characterized in that the first processor and the second processor are connected through a last-level cache (LLC).
6. The data processing method according to any one of claims 1 to 3, characterized in that the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
7. A data processing device, characterized in that it is configured in a system including a first processor and a second processor, wherein the processing capacity of the first processor is lower than the processing capacity of the second processor, and the data processing device comprises:
a determination unit, configured to determine a complexity threshold according to the processing capacity of the first processor; and
a processing unit, configured to determine, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold determined by the determination unit.
8. The data processing device according to claim 7, characterized in that the processing unit is specifically configured to:
if the complexity of the pending data is less than or equal to the complexity threshold, determine the first processor as the target processor; and
if the complexity of the pending data is greater than the complexity threshold, determine the second processor as the target processor.
9. The data processing device according to claim 7 or 8, characterized in that the processing unit is further configured to:
determine the complexity of the pending data according to the type of the task to which the pending data belongs.
10. The data processing device according to any one of claims 7 to 9, characterized in that the target processor is connected to memory through a memory controller (MAC).
11. The data processing device according to any one of claims 7 to 9, characterized in that the first processor and the second processor are connected through a last-level cache (LLC).
12. The data processing device according to any one of claims 7 to 9, characterized in that the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
CN201611042690.2A 2016-11-23 2016-11-23 A kind of methods, devices and systems of data processing Pending CN108089920A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611042690.2A CN108089920A (en) 2016-11-23 2016-11-23 A kind of methods, devices and systems of data processing

Publications (1)

Publication Number Publication Date
CN108089920A true CN108089920A (en) 2018-05-29

Family

ID=62170935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611042690.2A Pending CN108089920A (en) 2016-11-23 2016-11-23 A kind of methods, devices and systems of data processing

Country Status (1)

Country Link
CN (1) CN108089920A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104142729A (en) * 2014-08-11 2014-11-12 联想(北京)有限公司 Method and device for controlling processor and electronic equipment
US20160335734A1 (en) * 2015-05-11 2016-11-17 Vixs Systems, Inc. Memory subsystem synchronization primitives

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113661485A (en) * 2019-04-10 2021-11-16 赛灵思公司 Domain assisted processor peering for coherency acceleration
CN113661485B (en) * 2019-04-10 2024-05-07 赛灵思公司 Domain assisted processor peering for coherency acceleration
CN113051064A (en) * 2019-12-26 2021-06-29 中移(上海)信息通信科技有限公司 Task scheduling method, device, equipment and storage medium
CN113051064B (en) * 2019-12-26 2024-05-24 中移(上海)信息通信科技有限公司 Task scheduling method, device, equipment and storage medium
TWI811620B (en) * 2020-03-24 2023-08-11 威盛電子股份有限公司 Computing apparatus and data processing method
US11941433B2 (en) 2020-03-24 2024-03-26 Via Technologies Inc. Computing apparatus and data processing method for offloading data processing of data processing task from at least one general purpose processor
WO2022143019A1 (en) * 2020-12-31 2022-07-07 华为云计算技术有限公司 Heterogeneous computing system and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180529