CN108089920A - Method, apparatus, and system for data processing - Google Patents
- Publication number
- CN108089920A CN201611042690.2A
- Authority
- CN
- China
- Prior art keywords
- processor
- complexity
- cpu
- data
- pending data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5044—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering hardware capabilities
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Power Sources (AREA)
Abstract
The embodiments of the present application disclose a method, apparatus, and system for data processing. The method is performed in a system that includes a first processor and a second processor, where the processing capacity of the first processor is lower than that of the second processor. The method includes: determining a complexity threshold according to the processing capacity of the first processor; and determining, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold. Because the complexity threshold is determined according to the processing capacity of the processor, and the target processor is determined according to that threshold and the complexity of the pending data, task complexity is matched to the processing capacity of the processor, which improves the computing resource utilization and energy efficiency ratio of the system.
Description
Technical field
The present invention relates to the field of computers, and in particular to a method, apparatus, and system for data processing.
Background
A server system usually consists of multiple CPUs (Central Processing Units). Each CPU has its own memory, hard disk, I/O (input/output) unit, and other devices, and the CPUs are directly connected to one another. A CPU manages its memory modules through a MAC (Memory Access Controller) and connects to external devices through a PCI (Peripheral Component Interconnect) interface; if there are multiple external devices, a PCI switch must also be added. Two CPUs communicate directly through a PPI (Processor to Processor Interface), which allows peripheral devices to be shared.
Currently, the CPUs of a server system are powerful and can complete complex computing tasks. In practice, however, there are often simple but long-running tasks (such as Ethernet data transmission and reception, or operating-system scheduling tasks). Although these tasks are simple, each still needs to monopolize one CPU core (a physical core, or a logical core provided by hyper-threading). When a current server system handles these tasks, powerful cores are occupied by simple tasks for long periods and cannot reach their full potential, which wastes computing resources. Handling these tasks on a current server system also consumes a great deal of energy, so the energy efficiency ratio of the server system is low.
Summary of the invention
In view of this, the embodiments of the present application provide a method, apparatus, and system for data processing in which simple tasks are handled by a processor with weaker processing capacity and complex tasks are handled by a processor with stronger processing capacity. This prevents simple tasks from occupying the stronger processor and matches task complexity to processing capacity, thereby improving the computing resource utilization and energy efficiency ratio of the server system.
In one aspect, a data processing method is provided. The method is performed in a system that includes a first processor and a second processor, where the processing capacity of the first processor is lower than that of the second processor. The method includes: determining a complexity threshold according to the processing capacity of the first processor; and determining, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold.
In the data processing method provided by the embodiments of the present application, the complexity threshold is determined according to the processing capacity of the processor, and the target processor is determined according to the complexity threshold and the complexity of the pending data. Task complexity is thereby matched to the processing capacity of the processor, which improves the computing resource utilization and energy efficiency ratio of the system.
Optionally, determining the target processor for processing the pending data from the first processor and the second processor according to the complexity of the pending data and the complexity threshold includes: if the complexity of the pending data is less than or equal to the complexity threshold, determining the first processor as the target processor; and if the complexity of the pending data is greater than the complexity threshold, determining the second processor as the target processor.
In the data processing method provided by the embodiments of the present application, simple tasks are handled by the processor with weaker processing capacity and complex tasks are handled by the processor with stronger processing capacity. This prevents simple tasks from occupying the stronger processor and matches task complexity to processing capacity, thereby improving the computing resource utilization and energy efficiency ratio of the system.
Optionally, before the target processor is determined, the data processing method further includes: determining the complexity of the pending data according to the type of the task to which the pending data belongs.
In the data processing method provided by the embodiments of the present application, the complexity of the pending data is determined according to the type of the task to which the pending data belongs, and the target processor is determined according to the complexity of the pending data and the complexity threshold. Task complexity is thereby matched to the processing capacity of the processor, which improves the computing resource utilization and energy efficiency ratio of the system.
Optionally, the target processor is connected to memory through a MAC.
According to the data processing method provided by the embodiments of the present application, the target processor obtains the pending data by reading it from memory. This approach can be applied flexibly in systems of various architectures: whether the first CPU and the second CPU use the same instruction set architecture or different ones, the method allows the first CPU and the second CPU to be interconnected quickly and simply.
Optionally, the first processor and the second processor are connected through an LLC (Last Level Cache).
According to the data processing method provided by the embodiments of the present application, the target processor obtains the pending data by reading it from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
According to the data processing method provided by the embodiments of the present application, the target processor reads the pending data stored in the LLC of another processor through its own LLC, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
In another aspect, a data processing apparatus is provided. The apparatus is configured in a system that includes a first processor and a second processor, where the processing capacity of the first processor is lower than that of the second processor. The apparatus includes: a determination unit, configured to determine a complexity threshold according to the processing capacity of the first processor; and a processing unit, configured to determine, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold determined by the determination unit.
In the data processing apparatus provided by the embodiments of the present application, the complexity threshold is determined according to the processing capacity of the processor, and the target processor is determined according to the complexity threshold and the complexity of the pending data. Task complexity is thereby matched to the processing capacity of the processor, which improves the computing resource utilization and energy efficiency ratio of the system.
Optionally, the processing unit is specifically configured to: if the complexity of the pending data is less than or equal to the complexity threshold, determine the first processor as the target processor; and if the complexity of the pending data is greater than the complexity threshold, determine the second processor as the target processor.
In the data processing apparatus provided by the embodiments of the present application, simple tasks are handled by the processor with weaker processing capacity and complex tasks are handled by the processor with stronger processing capacity. This prevents simple tasks from occupying the stronger processor and matches task complexity to processing capacity, thereby improving the computing resource utilization and energy efficiency ratio of the system.
Optionally, the processing unit is further configured to: determine the complexity of the pending data according to the type of the task to which the pending data belongs.
In the data processing apparatus provided by the embodiments of the present application, the complexity of the pending data is determined according to the type of the task to which the pending data belongs, and the target processor is determined according to the complexity of the pending data and the complexity threshold. Task complexity is thereby matched to the processing capacity of the processor, which improves the computing resource utilization and energy efficiency ratio of the system.
Optionally, the target processor is connected to memory through a MAC.
According to the data processing apparatus provided by the embodiments of the present application, the target processor obtains the pending data by reading it from memory. This approach can be applied flexibly in systems of various architectures: whether the first CPU and the second CPU use the same instruction set architecture or different ones, the first CPU and the second CPU can be interconnected quickly and simply.
Optionally, the first processor and the second processor are connected through an LLC.
According to the data processing apparatus provided by the embodiments of the present application, the target processor obtains the pending data by reading it from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor and the second processor are connected through an LLC, an interface conversion module, and a protocol conversion module.
According to the data processing apparatus provided by the embodiments of the present application, the target processor reads the pending data stored in the LLC of another processor through its own LLC, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
In yet another aspect, the embodiments of the present application provide a data processing system that includes the first processor and the second processor described in the above aspects.
In yet another aspect, the embodiments of the present application provide a computer storage medium for storing the computer software instructions used by the above data processing system, including a program designed to perform the above aspects.
Brief description of the drawings
Fig. 1 is a schematic architectural diagram of a system to which the embodiments of the present application are applicable;
Fig. 2 is a schematic flow chart of a data processing method provided by the embodiments of the present application;
Fig. 3 is a schematic flow chart of a method for obtaining pending data provided by the embodiments of the present application;
Fig. 4 is a schematic flow chart of another method for obtaining pending data provided by the embodiments of the present application;
Fig. 5 is a schematic flow chart of yet another method for obtaining pending data provided by the embodiments of the present application;
Fig. 6 is a schematic flow chart of still another method for obtaining pending data provided by the embodiments of the present application;
Fig. 7 is a schematic diagram of a data processing apparatus provided by the embodiments of the present application.
Detailed description
The solutions in the embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 shows a schematic architectural diagram of a system to which the embodiments of the present application are applicable. As shown in Fig. 1, the system 100 includes CPU 1, CPU 2, and an Offload CPU, where the processing capacity of CPU 1 and the processing capacity of CPU 2 are both higher than that of the Offload CPU. It should be understood that the Offload CPU is used to handle the simpler tasks of the system. Its name should not be understood as a limitation on the embodiments of the present application; any other CPU that has weaker processing capacity and is used to handle simple tasks also falls within the scope of protection of the embodiments of the present application.
As an optional example, the Offload CPU can be directly connected to and communicate with multiple CPUs. As shown in Fig. 1, the Offload CPU communicates with CPU 1 and CPU 2 through path 1 and path 2, respectively. Optionally, if the Offload CPU can be directly connected to only one CPU (for example, CPU 1), the Offload CPU can communicate with CPU 2 indirectly through CPU 1.
Each CPU includes at least one core. Optionally, CPU 1 and CPU 2 shown in Fig. 1 each include four cores and the Offload CPU includes two cores, but the number of cores in the CPUs of a system to which the embodiments of the present application apply is not limited to this. In addition, a system to which the embodiments of the present application apply includes at least one CPU with stronger processing capacity and at least one CPU with weaker processing capacity.
Optionally, the system 100 further includes one or more of a memory, a switch, a hard disk, an I/O unit, and an accelerator. The main function of the accelerator is to complete special computing tasks, such as modulation and demodulation calculations in communication signal processing or video encoding and decoding calculations in multimedia applications.
The communication mode between the devices (also called "units" or "modules") shown in Fig. 1 is only an example, and the embodiments of the present application are not limited to it. For example, CPU 1 can also communicate with the switch through a PCIe (Peripheral Component Interconnect Express) unit.
Fig. 2 is a schematic flow chart of a data processing method provided by the embodiments of the present application. As shown in Fig. 2, the method 200 includes:
S210, determining a complexity threshold according to the processing capacity of the first processor.
S220, determining, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold.
The method 200 can be performed by a main processor (that is, the processor that runs the operating system). The main processor can be the first processor, the second processor, or another processor in the system other than the first processor and the second processor. It should be understood that "system" here refers to a device that includes the first processor and the second processor and implements certain functions, which is different from an "operating system"; "system" has the same meaning below, and this is not repeated. For brevity, the first processor is referred to below as the first CPU and the second processor as the second CPU.
Under normal conditions, for processors with the same architecture and manufacturing process, processing capacity is proportional to energy consumption. In the embodiments of the present application, the processing capacity of the first CPU is lower than that of the second CPU; therefore, when processing the same data, the first CPU consumes less energy than the second CPU. The tasks a system must handle differ across application scenarios, and so does the complexity of the data the processors must process.
For example, in a high-performance computing scenario, the main function of the operating system is to schedule computing tasks onto the available processors of the system for execution. Task scheduling is normally much simpler than executing the tasks themselves and requires only simple logical calculations and fixed-point calculations. On the other hand, task scheduling is continuous, because tasks are constantly completing and new tasks are constantly arriving. In this case the operating system itself is therefore a continuous simple task. If the operating system runs on a powerful server processor, computing resources and energy are wasted. Based on the method 200 provided by the embodiments of the present application, the operating system can be loaded onto the first CPU when it starts, freeing the second CPU to perform complex computing tasks exclusively.
As another example, in baseband processing for C-RAN (Centralized Radio Access Network), the server must continuously send and receive user data and control information to and from the core network, and must also continuously send and receive physical-layer data to and from the RRU (Remote Radio Unit). These tasks are very simple, requiring neither complex logical calculations nor a large number of floating-point operations, but they run continuously. If they are executed by the server's powerful processor, computing resources are greatly wasted and power consumption is high. Based on the method 200 provided by the embodiments of the present application, thread binding can be performed in the application in advance so that these continuous simple tasks are bound to the first CPU for execution. This matches task complexity to processing capacity and improves both computing resource utilization and energy efficiency.
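The binding described above might look like the following minimal sketch. It assumes, beyond what the text states, that the cores of the first CPU are visible to the operating system as ordinary schedulable core IDs and that the platform offers Python's `os.sched_setaffinity` (available on Linux and some other Unix systems). The helper name `bind_to_core` is hypothetical.

```python
import os


def bind_to_core(core_id: int) -> bool:
    """Pin the calling process to one core; return False if that is not possible."""
    # os.sched_setaffinity exists only on some Unix platforms (for example Linux).
    if not hasattr(os, "sched_setaffinity"):
        return False
    # Refuse cores that this process is not currently allowed to run on.
    if core_id not in os.sched_getaffinity(0):
        return False
    # pid 0 means "the calling process".
    os.sched_setaffinity(0, {core_id})
    return True
```

An application would call `bind_to_core` once, before entering its long-running send/receive loop, with the ID of a core that belongs to the first CPU; which IDs those are depends on the platform and is not specified by the text.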
In S210, because different first CPUs have different processing capacities, and to prevent the first CPU from being unable to complete the tasks assigned to it, the main processor needs to determine the complexity threshold according to the processing capacity of the first CPU. The complexity threshold indicates the maximum complexity of data that the first CPU can process.
In S220, the main processor determines the target processor from the first CPU and the second CPU according to the complexity of the pending data and the complexity threshold. The target processor is used to process the pending data, so that the complexity of the pending data matches the processing capacity of the target processor.
It should be understood that the complexity threshold can vary with the application scenario. For example, if the system currently has many pending tasks, the complexity threshold can be raised so that some higher-complexity tasks are assigned to the first CPU; if the system currently has few pending tasks, the complexity threshold can be lowered so that some higher-complexity tasks are assigned to the second CPU. This helps balance the load of the whole system.
The complexity threshold can also be a preset fixed value. If the system currently has many pending tasks and the first CPU has few, tasks whose complexity exceeds the complexity threshold can be assigned to the first CPU; if the system currently has few pending tasks and the first CPU has many, tasks whose complexity is below the complexity threshold can be assigned to the second CPU. This also helps balance the load of the whole system.
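The load-dependent threshold can be illustrated with a small sketch. The text does not say how "many" or "few" pending tasks are measured or by how much the threshold moves, so the watermarks, step size, and bounds below are all hypothetical parameters chosen for illustration.

```python
def adjust_threshold(threshold: float, pending_tasks: int,
                     low_water: int, high_water: int,
                     step: float, minimum: float, maximum: float) -> float:
    """Raise the threshold under heavy load, lower it under light load."""
    if pending_tasks > high_water:
        # Many pending tasks: let the first CPU accept more complex work.
        threshold += step
    elif pending_tasks < low_water:
        # Few pending tasks: send more work to the second CPU.
        threshold -= step
    # Keep the threshold within bounds (assumed to reflect what the first CPU can handle).
    return max(minimum, min(maximum, threshold))
```

Clamping to `maximum` reflects the statement that the threshold indicates the most complex data the first CPU can process; a real scheduler would derive that bound from the first CPU's measured capacity.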
In the embodiments of the present application, the first CPU is used to perform simple tasks. It therefore does not need strong processing capacity, but it must have low power consumption. The first CPU can have one processor core or several. For example, the first CPU can be any one of the following three kinds of CPU:
1) a low-power x86 processor;
2) a low-power CPU implemented on an FPGA (Field-Programmable Gate Array) (because an FPGA is flexibly programmable, it can implement any desired CPU function);
3) a processor based on the ARM (Advanced RISC Machines, where RISC stands for "Reduced Instruction Set Computer") architecture or the MIPS (Microprocessor without Interlocked Pipelined Stages) architecture.
In summary, in the data processing method provided by the embodiments of the present application, the complexity threshold is determined according to the processing capacity of the processor, and the target processor is determined according to the complexity threshold and the complexity of the pending data. Task complexity is thereby matched to the processing capacity of the processor, which improves the computing resource utilization and energy efficiency ratio of the system.
Optionally, determining the target processor for processing the pending data from the first processor and the second processor according to the complexity of the pending data and the complexity threshold includes:
S221, if the complexity of the pending data is less than or equal to the complexity threshold, determining the first processor as the target processor.
S222, if the complexity of the pending data is greater than the complexity threshold, determining the second processor as the target processor.
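Steps S210, S221, and S222 can be sketched as follows. The text does not define how processing capacity or complexity is quantified, so the sketch assumes both are expressed on the same numeric scale and that the threshold simply equals the first processor's capacity score; the `Processor` class and its `capacity` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    capacity: float  # hypothetical relative capacity score


def complexity_threshold(first: Processor) -> float:
    # S210 (assumption): the threshold equals the weaker processor's capacity score.
    return first.capacity


def select_target(complexity: float, threshold: float,
                  first: Processor, second: Processor) -> Processor:
    # S221: complexity <= threshold selects the first (weaker) processor.
    # S222: complexity > threshold selects the second (stronger) processor.
    return first if complexity <= threshold else second
```

Note that the boundary case (complexity equal to the threshold) goes to the first processor, matching the "less than or equal to" wording of S221.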
Therefore, the method for data processing provided by the embodiments of the present application handles letter by the weaker processor of processing capacity
Single task handles complex task, so as to which simple task is avoided to occupy processing capacity by the stronger processor of processing capacity
Stronger processor realizes task complexity and matches with processing capacity, so as to improve the computing resource utilization rate of system
And Energy Efficiency Ratio.
Optionally, before the target processor is determined, the data processing method 200 further includes:
S201, determining the complexity of the pending data according to the type of the task to which the pending data belongs.
The complexity of pending data is usually related to the type of the task to which it belongs. For example, a video editing task usually requires a large amount of encoding and decoding computation and needs a processor with strong processing capacity, so the video data included in a video editing task can be determined to be higher-complexity pending data. As another example, a data transmission and reception task usually only needs to forward data and does not require complex logical calculations or floating-point operations, so the data included in such a task can be determined to be lower-complexity pending data.
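One simple way to realize S201 is a lookup table from task type to a complexity score, sketched below. The task types are taken from the examples in the text, but the numeric scores are purely illustrative assumptions; the text gives no concrete values.

```python
from enum import Enum


class TaskType(Enum):
    DATA_FORWARDING = "data_forwarding"   # data transmission and reception
    TASK_SCHEDULING = "task_scheduling"   # operating-system scheduling
    VIDEO_EDITING = "video_editing"       # heavy encoding/decoding work


# Illustrative scores only; a real system would calibrate these.
COMPLEXITY_BY_TYPE = {
    TaskType.DATA_FORWARDING: 1.0,
    TaskType.TASK_SCHEDULING: 2.0,
    TaskType.VIDEO_EDITING: 9.0,
}


def complexity_of(task_type: TaskType) -> float:
    """Return the complexity assigned to data belonging to this task type."""
    return COMPLEXITY_BY_TYPE[task_type]
```

The resulting score is what gets compared against the complexity threshold when the target processor is chosen.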
In the embodiments of the present application, scheduling logic can be added to the application to bind lower-complexity tasks to the cores of the first CPU. Alternatively, processing logic can be added to the task scheduling layer of the operating system so that the operating system automatically schedules lower-complexity tasks to the first CPU.
The above embodiments are only examples, and the embodiments of the present application are not limited to them. In the data processing method provided by the embodiments of the present application, the complexity of the pending data is determined according to the type of the task to which the pending data belongs, and the target processor is determined according to the complexity of the pending data and the complexity threshold. Task complexity is thereby matched to the processing capacity of the processor, which improves the computing resource utilization and energy efficiency ratio of the system.
Optionally, the target processor is connected to memory through a MAC.
After the target processor is determined, it needs to obtain the pending data and process it. The target processor can obtain the pending data directly from another device (such as a switch) through a PCI interface, or from another processor in the same system through a PPI.
Fig. 3 shows a schematic flow chart of a method for obtaining pending data provided by the embodiments of the present application. As shown in Fig. 3, the first CPU and the second CPU communicate through their respective PPIs, and each can also perform read/write operations on memory through its own MAC. Assuming that the first CPU is currently determined to be the target processor, the first CPU can obtain the pending data from the second CPU through the following steps.
S310, the second CPU writes the pending data to memory 302 through MAC301.
S320, MAC301 sends the pending data to memory 305.
For example, MAC301 can use a DMA (Direct Memory Access) mechanism to send the pending data to memory 305 through PPI303 and PPI304.
S330, the first CPU reads the pending data from memory 305 through MAC306.
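The data flow of S310 to S330 can be modeled in software as three copies between buffers. This is only a toy model: the two `bytearray` objects stand in for memory 302 and memory 305, and the function names are hypothetical; real MAC, PPI, and DMA hardware is not represented.

```python
def cpu2_write(memory_302: bytearray, offset: int, data: bytes) -> None:
    """S310: the second CPU writes pending data into its own memory."""
    if offset < 0 or offset + len(data) > len(memory_302):
        raise ValueError("write outside memory 302")
    memory_302[offset:offset + len(data)] = data


def dma_transfer(src: bytearray, dst: bytearray, offset: int, length: int) -> None:
    """S320: copy a region from memory 302 to memory 305 (stands in for DMA over the PPIs)."""
    if offset < 0 or offset + length > min(len(src), len(dst)):
        raise ValueError("transfer outside memory bounds")
    dst[offset:offset + length] = src[offset:offset + length]


def cpu1_read(memory_305: bytearray, offset: int, length: int) -> bytes:
    """S330: the first CPU reads the pending data from its own memory."""
    return bytes(memory_305[offset:offset + length])
```

The point of the model is that the first CPU only ever reads its own memory, which is why this path does not depend on the two CPUs sharing an instruction set architecture.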
The above embodiment is only an example, and the embodiments of the present application are not limited to it. The first CPU can also use other existing methods to obtain the pending data that the second CPU writes to memory; for brevity, details are not described here. In addition, the first CPU can also read pending data that other devices write to memory 305.
According to the data processing method provided by the embodiments of the present application, the target processor obtains the pending data by reading it from memory. This approach can be applied flexibly in systems of various architectures: whether the first CPU and the second CPU use the same instruction set architecture or different ones, the method allows the first CPU and the second CPU to be interconnected quickly and simply.
Optionally, the first processor and the second processor are connected through an LLC.
Almost all current CPUs have multiple levels of cache. A cache has a smaller capacity than memory but a faster access rate. Another processor can therefore write the pending data to a cache in advance, and the target processor obtains the pending data from the cache, which increases the rate at which pending data is obtained.
Fig. 4 shows a schematic flow chart of another method for obtaining pending data provided by the embodiments of the present application. As shown in Fig. 4, the LLC of the first CPU and the LLC of the second CPU are connected through PPIs, so the two CPUs can communicate directly through their LLCs. Because an LLC has a faster access rate, communication between the first CPU and the second CPU is also faster. Assuming that the first CPU is currently determined to be the target processor, the first CPU can obtain pending data from another processor (for example, the second CPU) through the following steps.
S410, the second CPU writes the pending data to LLC401.
S420, LLC405 reads the pending data from LLC401 through PPI402 and PPI403 according to an instruction from the first CPU.
S430, LLC404 sends the pending data to the core of the first CPU.
It should be noted that a cache coherence problem may exist between the LLCs of the two CPUs: the LLCs of both CPUs may hold copies of the same data, and because each CPU reads and writes its own copy independently, the two copies may become inconsistent. The cache coherence problem can be solved by a cache coherence (CC) protocol, specifically by a cache agent (CA).
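The inconsistency described above, and one generic way to avoid it, can be sketched in Python. The sketch assumes a simple write-through, write-invalidate scheme; it is not MESIF, MOESI, or the CC protocol of the application, and the class `CoherentCaches` is a hypothetical name.

```python
class CoherentCaches:
    """Two private caches over one backing store, kept consistent by write-invalidate.

    Simplified illustration only: real coherence protocols track per-line states.
    """

    def __init__(self):
        self.memory = {}          # backing store shared by both CPUs
        self.caches = [{}, {}]    # caches[0] for CPU 0, caches[1] for CPU 1

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                 # miss: load from the backing store
            cache[addr] = self.memory.get(addr)
        return cache[addr]

    def write(self, cpu, addr, value):
        self.caches[cpu][addr] = value
        self.memory[addr] = value             # write-through, for simplicity
        self.caches[1 - cpu].pop(addr, None)  # invalidate the other CPU's copy


caches = CoherentCaches()
caches.write(0, 0x40, 1)
first_read = caches.read(1, 0x40)   # CPU 1 now holds a copy
caches.write(1, 0x40, 2)            # CPU 0's copy is invalidated
second_read = caches.read(0, 0x40)  # CPU 0 misses and reloads the new value
```

Without the invalidation line in `write`, CPU 0 would keep returning the stale value 1 from its own copy.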
The above embodiment is merely an example, and the embodiments of the present application are not limited to it. The first CPU can also use other methods in the prior art to obtain the pending data that the second CPU writes into the cache; for brevity, the details are not described here. In addition, the first CPU can also read pending data that other devices write into LLC405.
According to the data processing method provided by the embodiments of the present application, the target processor reads the pending data from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor is connected to the second processor through an LLC, an interface conversion module, and a protocol conversion module.
In the embodiments of the present application, the target processor and the other processors may or may not belong to the same instruction set architecture.
When the target processor (for example, the first CPU) and the other processors (for example, the second CPU) belong to the same instruction set architecture, the cache coherence protocols and the PPIs of the first CPU and the second CPU match exactly, and there is no compatibility problem.
For example, if the first CPU and the second CPU are both Intel CPUs, the PPI is Intel's QPI (QuickPath Interconnect) bus, and cache coherence is ensured by Intel's MESIF (Modified, Exclusive, Shared, Invalid, and Forward) protocol. If the first CPU and the second CPU are both AMD (Advanced Micro Devices) CPUs, the PPI is AMD's HyperTransport bus, and cache coherence is ensured by AMD's MOESI (Modified, Owned, Exclusive, Shared, and Invalid) protocol.
When the target processor (for example, the first CPU) and the other processors (for example, the second CPU) do not belong to the same instruction set architecture, the cache coherence protocols and the PPIs of the first CPU and the second CPU have compatibility problems. Depending on how the first CPU is implemented, these problems can be handled in one of the following two modes.
Mode 1
If the first CPU is implemented as an FPGA IP core (intellectual property core), the flexible programmability of the FPGA can be used to implement, inside the FPGA, an interface conversion module for the PPI and a protocol conversion module for the cache coherence protocol, thereby connecting two CPUs of different instruction set architectures.
As shown in Fig. 5, the first CPU includes a protocol conversion module cvt2 and an interface conversion module cvt1. cvt1 implements the conversion between PPIs, and cvt2 implements the conversion between cache coherence protocols. For example, if the second CPU uses the Intel x86 architecture and the first CPU uses the ARM architecture, cvt1 converts between the QPI protocol and the AXI (Advanced eXtensible Interface) protocol, and cvt2 converts between the MESIF protocol and the ACE (AMBA Coherency Extensions, where AMBA stands for Advanced Microcontroller Bus Architecture) protocol.
Because the FPGA is flexibly programmable, the PPI of the first CPU can instead be modified to be the same as the PPI used by the second CPU, so that the first CPU and the second CPU can communicate without the interface conversion module cvt1.
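One small part of what a protocol conversion module such as cvt2 would have to do is translate cache-line states between the two protocols. The Python sketch below illustrates this for the MESIF-to-ACE example. The ACE state names follow the five-state naming commonly used for AMBA ACE; the mapping itself, the enum names, and the function `to_ace` are assumptions made for illustration and are not taken from the application. A real converter would also have to translate transactions and snoop messages, which this sketch does not attempt.

```python
from enum import Enum


class Mesif(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"
    FORWARD = "F"


class Ace(Enum):
    UNIQUE_DIRTY = "UD"
    UNIQUE_CLEAN = "UC"
    SHARED_DIRTY = "SD"
    SHARED_CLEAN = "SC"
    INVALID = "I"


# Hypothetical state mapping, chosen for illustration only.
MESIF_TO_ACE = {
    Mesif.MODIFIED: Ace.UNIQUE_DIRTY,   # sole copy, differs from memory
    Mesif.EXCLUSIVE: Ace.UNIQUE_CLEAN,  # sole copy, matches memory
    Mesif.SHARED: Ace.SHARED_CLEAN,     # one of several clean copies
    Mesif.INVALID: Ace.INVALID,
    Mesif.FORWARD: Ace.SHARED_CLEAN,    # the forwarder role is dropped in this sketch
}


def to_ace(state: Mesif) -> Ace:
    """Translate one MESIF line state into an ACE line state."""
    return MESIF_TO_ACE[state]
```

Mapping both Shared and Forward onto the same ACE state loses information, which is one reason the reverse direction needs its own, separately designed table.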
Mode 2
If the first CPU is implemented using a non-programmable device, the first CPU and the second CPU can communicate through a conversion module located outside the two CPUs.
As shown in Fig. 6, the first CPU and the second CPU communicate through the conversion module PPI&CC cvt located between the two CPUs. PPI&CC cvt implements both the PPI conversion and the cache coherence protocol conversion between the first CPU and the second CPU. PPI&CC cvt can be implemented using prior-art methods; for brevity, the details are not described here.
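The choice between a direct connection, Mode 1, and Mode 2 can be summarized as a small decision function. The Python sketch below assumes two boolean inputs; the names `Link` and `choose_link` are hypothetical and only restate the cases described above.

```python
from enum import Enum


class Link(Enum):
    DIRECT = "matching PPI and coherence protocol, no conversion"
    INTERNAL_CVT = "cvt1/cvt2 implemented inside the FPGA-based first CPU"
    EXTERNAL_CVT = "PPI&CC cvt module located between the two CPUs"


def choose_link(same_isa: bool, first_cpu_is_fpga: bool) -> Link:
    """Pick how the two CPUs are connected, following the cases in the text."""
    if same_isa:
        return Link.DIRECT        # PPI and coherence protocol already match
    if first_cpu_is_fpga:
        return Link.INTERNAL_CVT  # Mode 1
    return Link.EXTERNAL_CVT      # Mode 2
```

For example, `choose_link(same_isa=False, first_cpu_is_fpga=True)` selects the Mode 1 arrangement of Fig. 5.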
According to the data processing method provided by the embodiments of the present application, the target processor reads the pending data stored in the LLC of another processor through its own LLC, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
The data processing method provided by the embodiments of the present application has been described in detail above with reference to Fig. 2 to Fig. 6. The data processing device provided by the embodiments of the present application is described in detail below with reference to Fig. 7.
As shown in Fig. 7, the device 700 includes:
a determination unit 710, configured to determine a complexity threshold according to the processing capability of a first processor; and
a processing unit 720, configured to determine, from the first processor and a second processor, a target processor for processing the pending data according to the complexity of the pending data and the complexity threshold determined by the determination unit 710.
In the embodiments of the present application, the device 700 is configured in a system that includes the first processor and the second processor, where the processing capability of the first processor is lower than the processing capability of the second processor.
It should be understood that the device 700 describes the data processing device provided by the embodiments of the present application from a functional perspective. The device 700 may correspond to the entity that executes the data processing method according to the embodiments of the present application, and the above and other operations and/or functions of the units in the device 700 are respectively intended to implement the corresponding flows of the methods in Fig. 2 to Fig. 6; for brevity, the details are not described here.
The data processing device provided by the embodiments of the present application determines a complexity threshold according to the processing capability of a processor, and determines the target processor according to the complexity threshold and the complexity of the pending data. This matches task complexity to the processing capability of the processor and improves the computing resource utilization and the energy efficiency ratio of the system.
Optionally, the processing unit 720 is specifically configured to:
determine the first processor as the target processor if the complexity of the pending data is less than or equal to the complexity threshold; and
determine the second processor as the target processor if the complexity of the pending data is greater than the complexity threshold.
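The selection rule above can be sketched in a few lines of Python. The application does not specify how the threshold is derived from processing capability, so the `Processor` class, its unit-less `capability` score, the function names, and the linear `headroom` factor are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    name: str
    capability: float  # hypothetical unit-less processing-capability score


def complexity_threshold(first: Processor, headroom: float = 1.0) -> float:
    """Derive the threshold from the weaker processor's capability.

    Assumption: the threshold scales linearly with capability.
    """
    return first.capability * headroom


def select_target(complexity: float, first: Processor, second: Processor) -> Processor:
    """Return the processor that should handle data of the given complexity."""
    if first.capability >= second.capability:
        raise ValueError("the first processor must be the weaker one")
    threshold = complexity_threshold(first)
    # At or below the threshold: weaker processor; above it: stronger processor.
    return first if complexity <= threshold else second


little = Processor("first", capability=10.0)
big = Processor("second", capability=100.0)
simple_target = select_target(10.0, little, big)    # equal to the threshold
complex_target = select_target(10.5, little, big)   # above the threshold
```

Note that data whose complexity is exactly equal to the threshold goes to the first processor, matching the "less than or equal to" condition.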
In the data processing device provided by the embodiments of the present application, the processor with the weaker processing capability handles simple tasks and the processor with the stronger processing capability handles complex tasks. This prevents simple tasks from occupying the stronger processor and matches task complexity to processing capability, thereby improving the computing resource utilization and the energy efficiency ratio of the system.
Optionally, the processing unit 720 is further configured to:
determine the complexity of the pending data according to the type of the task to which the pending data belongs.
The data processing device provided by the embodiments of the present application determines the complexity of the pending data according to the type of the task to which the pending data belongs, and determines the target processor according to the complexity of the pending data and the complexity threshold. This matches task complexity to the processing capability of the processor and improves the computing resource utilization and the energy efficiency ratio of the system.
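Determining complexity from the task type can be as simple as a lookup table. In the Python sketch below, the task type names, the complexity scores, and the fallback for unknown types are invented for illustration; the application does not list concrete task types or values.

```python
# Hypothetical task types and complexity scores, for illustration only.
TASK_COMPLEXITY = {
    "heartbeat": 1,
    "log_rotation": 2,
    "packet_forwarding": 4,
    "video_transcode": 9,
}

# Assumption: unknown task types are treated as complex, so that they are
# routed to the stronger processor rather than overloading the weaker one.
UNKNOWN_TASK_COMPLEXITY = 10


def complexity_of(task_type: str) -> int:
    """Look up the complexity of pending data from the type of its task."""
    return TASK_COMPLEXITY.get(task_type, UNKNOWN_TASK_COMPLEXITY)
```

The returned score would then be compared against the complexity threshold to choose the target processor.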
Optionally, the target processor is connected to the memory through a memory controller (MAC).
According to the data processing device provided by the embodiments of the present application, the target processor obtains the pending data by reading it from memory. This approach can be applied flexibly in systems of various architectures: whether the first CPU and the second CPU use the same instruction set architecture or different ones, it provides a fast and simple way to interconnect the first CPU and the second CPU.
Optionally, the first processor is connected to the second processor through an LLC.
According to the data processing device provided by the embodiments of the present application, the target processor reads the pending data from a cache, which increases the rate at which the target processor obtains pending data from other processors.
Optionally, the first processor is connected to the second processor through an LLC, an interface conversion module, and a protocol conversion module.
According to the data processing device provided by the embodiments of the present application, the target processor reads the pending data stored in the LLC of another processor through its own LLC, the interface conversion module, and the protocol conversion module. This enables communication between heterogeneous processors and increases the rate at which the target processor obtains pending data from other processors.
Further, the data processing device provided by the embodiments of the present application may also consist of a processor and a memory, where the processor is configured to execute the data processing method provided by the embodiments of the present application and the memory is configured to store the instructions executed by the processor.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the device and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not described here again.
In the embodiments of the present application, the magnitude of each process's sequence number does not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present application in any way.
In addition, the term "and/or" in this document merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the character "/" in this document generally indicates an "or" relationship between the objects before and after it.
Those of ordinary skill in the art will appreciate that the units and steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
The systems, devices, and methods disclosed in the embodiments provided herein can be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling or communication connection between different components or units may be an indirect coupling or communication connection through some interfaces, devices, or units, and the coupling or connection may be electrical, mechanical, or of another form.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto.
Claims (12)
1. A data processing method, characterized in that it is performed in a system including a first processor and a second processor, wherein the processing capability of the first processor is lower than the processing capability of the second processor, and the data processing method comprises:
determining a complexity threshold according to the processing capability of the first processor; and
determining, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold.
2. The data processing method according to claim 1, characterized in that determining, from the first processor and the second processor, the target processor for processing the pending data according to the complexity of the pending data and the complexity threshold comprises:
determining the first processor as the target processor if the complexity of the pending data is less than or equal to the complexity threshold; and
determining the second processor as the target processor if the complexity of the pending data is greater than the complexity threshold.
3. The data processing method according to claim 1 or 2, characterized in that, before the target processor is determined, the data processing method further comprises:
determining the complexity of the pending data according to the type of the task to which the pending data belongs.
4. The data processing method according to any one of claims 1 to 3, characterized in that the target processor is connected to a memory through a memory controller (MAC).
5. The data processing method according to any one of claims 1 to 3, characterized in that the first processor is connected to the second processor through a last-level cache (LLC).
6. The data processing method according to any one of claims 1 to 3, characterized in that the first processor is connected to the second processor through an LLC, an interface conversion module, and a protocol conversion module.
7. A data processing device, characterized in that it is configured in a system including a first processor and a second processor, wherein the processing capability of the first processor is lower than the processing capability of the second processor, and the data processing device comprises:
a determination unit, configured to determine a complexity threshold according to the processing capability of the first processor; and
a processing unit, configured to determine, from the first processor and the second processor, a target processor for processing pending data according to the complexity of the pending data and the complexity threshold determined by the determination unit.
8. The data processing device according to claim 7, characterized in that the processing unit is specifically configured to:
determine the first processor as the target processor if the complexity of the pending data is less than or equal to the complexity threshold; and
determine the second processor as the target processor if the complexity of the pending data is greater than the complexity threshold.
9. The data processing device according to claim 7 or 8, characterized in that the processing unit is further configured to:
determine the complexity of the pending data according to the type of the task to which the pending data belongs.
10. The data processing device according to any one of claims 7 to 9, characterized in that the target processor is connected to a memory through a memory controller (MAC).
11. The data processing device according to any one of claims 7 to 9, characterized in that the first processor is connected to the second processor through a last-level cache (LLC).
12. The data processing device according to any one of claims 7 to 9, characterized in that the first processor is connected to the second processor through an LLC, an interface conversion module, and a protocol conversion module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611042690.2A CN108089920A (en) | 2016-11-23 | 2016-11-23 | A kind of methods, devices and systems of data processing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108089920A true CN108089920A (en) | 2018-05-29 |
Family
ID=62170935
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611042690.2A Pending CN108089920A (en) | 2016-11-23 | 2016-11-23 | A kind of methods, devices and systems of data processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108089920A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113051064A (en) * | 2019-12-26 | 2021-06-29 | 中移(上海)信息通信科技有限公司 | Task scheduling method, device, equipment and storage medium |
CN113661485A (en) * | 2019-04-10 | 2021-11-16 | 赛灵思公司 | Domain assisted processor peering for coherency acceleration |
WO2022143019A1 (en) * | 2020-12-31 | 2022-07-07 | 华为云计算技术有限公司 | Heterogeneous computing system and related device |
TWI811620B (en) * | 2020-03-24 | 2023-08-11 | 威盛電子股份有限公司 | Computing apparatus and data processing method |
US11941433B2 (en) | 2020-03-24 | 2024-03-26 | Via Technologies Inc. | Computing apparatus and data processing method for offloading data processing of data processing task from at least one general purpose processor |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104142729A (en) * | 2014-08-11 | 2014-11-12 | 联想(北京)有限公司 | Method and device for controlling processor and electronic equipment |
US20160335734A1 (en) * | 2015-05-11 | 2016-11-17 | Vixs Systems, Inc. | Memory subsystem synchronization primitives |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113661485A (en) * | 2019-04-10 | 2021-11-16 | 赛灵思公司 | Domain assisted processor peering for coherency acceleration |
CN113661485B (en) * | 2019-04-10 | 2024-05-07 | 赛灵思公司 | Domain assisted processor peering for coherency acceleration |
CN113051064A (en) * | 2019-12-26 | 2021-06-29 | 中移(上海)信息通信科技有限公司 | Task scheduling method, device, equipment and storage medium |
CN113051064B (en) * | 2019-12-26 | 2024-05-24 | 中移(上海)信息通信科技有限公司 | Task scheduling method, device, equipment and storage medium |
TWI811620B (en) * | 2020-03-24 | 2023-08-11 | 威盛電子股份有限公司 | Computing apparatus and data processing method |
US11941433B2 (en) | 2020-03-24 | 2024-03-26 | Via Technologies Inc. | Computing apparatus and data processing method for offloading data processing of data processing task from at least one general purpose processor |
WO2022143019A1 (en) * | 2020-12-31 | 2022-07-07 | 华为云计算技术有限公司 | Heterogeneous computing system and related device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108089920A (en) | A kind of methods, devices and systems of data processing | |
US20210064117A1 (en) | Optimizing power usage by factoring processor architectural events to pmu | |
CN112465129B (en) | On-chip heterogeneous artificial intelligent processor | |
CN105279133B (en) | VPX Parallel DSP Signal transacting board analysis based on SoC on-line reorganizations | |
CN104331528B (en) | The General Porcess Unit of (DSP) pattern is handled with low power digital signals | |
CN110650347B (en) | Multimedia data processing method and device | |
EP1963963A2 (en) | Methods and apparatus for multi-core processing with dedicated thread management | |
CN107111553A (en) | System and method for providing dynamic caching extension in many cluster heterogeneous processor frameworks | |
US10558574B2 (en) | Reducing cache line collisions | |
DE112013005287T5 (en) | Heterogeneous processor device and method | |
CN102736595A (en) | Unified platform of intelligent power distribution terminal based on 32 bit microprocessor and real time operating system (RTOS) | |
US20210373799A1 (en) | Method for storing data and method for reading data | |
CN205038556U (en) | VPX multinuclear intelligence computation hardware platform based on two FPGA of two DSP | |
CN106385329A (en) | Processing method and device of resource pool and equipment | |
CN108229687A (en) | Data processing method, data processing equipment and electronic equipment | |
CN109858621A (en) | A kind of debugging apparatus, method and the storage medium of convolutional neural networks accelerator | |
CN114564435A (en) | Inter-core communication method, device and medium for heterogeneous multi-core chip | |
CN104750660A (en) | Embedded reconfigurable processor with multiple operating modes | |
CN111008042B (en) | Efficient general processor execution method and system based on heterogeneous pipeline | |
CN206331335U (en) | Computer motherboard and computer | |
CN112347035B (en) | Remote FPGA equipment-oriented dynamic part reconfigurable configuration device and method | |
CN101989191B (en) | Realizing method of multi-Ready input CPU (central processing unit) | |
CN103150952A (en) | Reconfigurable electronic design automation (EDA) experimental platform | |
CN204009892U (en) | Embedded overall treatment platform based on X86 and the general-purpose operating system | |
CN112905528A (en) | Intelligent household chip based on Internet of things |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180529 |