CN102902655A - Information processing heterogeneous system

Publication number
CN102902655A
CN102902655A CN2012103392335A CN201210339233A CN102902655A CN 102902655 A CN102902655 A CN 102902655A CN 2012103392335 A CN2012103392335 A CN 2012103392335A CN 201210339233 A CN201210339233 A CN 201210339233A CN 102902655 A CN102902655 A CN 102902655A
Authority
CN
China
Prior art keywords
cpu
mic
information processing
chip
performance
Legal status
Pending
Application number
CN2012103392335A
Other languages
Chinese (zh)
Inventor
张清
张广勇
Current Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd

Abstract

The invention relates to an information processing heterogeneous system. The system comprises a central processing unit (CPU) platform, at least one Many Integrated Core (MIC) chip, and a connector, wherein the CPU platform comprises a CPU chip and the connector connects the MIC chip to the CPU platform. The system can effectively improve system performance and satisfy the requirements of high-performance applications.

Description

Information processing heterogeneous system
Technical field
The present invention relates to the field of high-performance computing, and specifically to an information processing heterogeneous system.
Background art
High-performance computing is a leading-edge technology in the information field. It directly supports national security, progress in defense science and technology, and the development of advanced weapons, and it is one of the important indicators of a nation's overall strength. With the rapid development of the information society, the demand for information processing capability keeps rising. High-performance computing is needed not only in petroleum exploration, weather forecasting, aerospace and defense, and scientific research; demand is also growing rapidly in a much wider range of fields such as finance, e-government, education, enterprise, and online gaming.
Computing speed is particularly important for high-performance computing. High-performance computing is developing toward multi-core and many-core processors and uses heterogeneous parallelism to raise computing speed. CPU+GPU is currently a very mature heterogeneous collaborative computing model, but GPUs pose great challenges in programming efficiency, fine-grained parallel algorithms, and large-scale parallel performance.
Summary of the invention
The technical problem to be solved by the present invention is to provide an information processing heterogeneous system that addresses the low system performance encountered in high-performance computing applications.
To solve the above technical problem, the invention provides an information processing heterogeneous system, comprising:
A central processing unit (CPU) platform, the platform comprising a CPU chip;
At least one Many Integrated Core (MIC) chip;
A connector for connecting the MIC chip to the CPU platform.
The present invention also provides another information processing heterogeneous system, comprising:
A first execution unit, whose processor is implemented by 2 CPU chips, for performing information processing;
A second and a third execution unit, each connected to the first execution unit and each with a processor implemented by 1 MIC chip, for performing information processing in parallel with the first execution unit.
The information processing heterogeneous system of the present invention is composed of CPU chips and MIC chips, preferably the currently popular dual-socket CPU configuration and at least two MIC chips. It can effectively improve system performance and satisfy the requirements of high-performance applications.
Description of drawings
Fig. 1 is a schematic diagram of the modular structure of embodiment 1 of the information processing heterogeneous system of the present invention;
Fig. 2 is a schematic diagram of the modular structure of embodiment 2 of the information processing heterogeneous system of the present invention;
Fig. 3 shows the result of running PSTM serially;
Fig. 4 shows the result of running PSTM on the system of the present invention.
Detailed description of the embodiments
Embodiment 1
The present invention is an information processing heterogeneous system based on CPUs and Intel MIC (Intel Many Integrated Core). As shown in Fig. 1, the system comprises:
A central processing unit (CPU) platform, the platform comprising a CPU chip;
At least one Many Integrated Core (MIC) chip;
A connector for connecting the MIC chip to the CPU platform.
Specifically, the connector is a PCIE slot.
MIC is a many-core chip developed by Intel for high-performance parallel computing. It evolved from the existing Xeon processor line and is a new architecture designed specifically for very-high-performance computing. The commercial product based on the MIC architecture is Xeon Phi. It is not intended to replace the CPU in the computer architecture but serves as a coprocessor. A MIC chip typically has more than 50 simplified x86 cores, each supporting 4 hardware threads, so more than 200 tasks can execute in parallel. This provides highly parallel computing capability, with double-precision peak performance reaching 1 TFlops. MIC technology is expected to accelerate the development of high-performance computing and to help resolve the performance bottlenecks of high-performance computing applications.
This system targets high-performance computing applications. It adopts a CPU/MIC heterogeneous architecture that combines the multi-core computing capability of the CPU platform with the many-core computing capability of the MIC chips. Both kinds of chip take part in the computation, so the computing capability of the system is greatly enhanced and the performance bottleneck of high-performance computing applications is resolved; the system is therefore a high-performance system. It is also a low-energy system: its performance per watt is far higher than that of a homogeneous CPU platform, and the whole system saves energy while achieving high performance. Overall, the system is a high-efficiency system.
The system is configured with 96 GB or more of memory and supports a peak power of 1300 W or more.
The operating system, compiler, and driver of the CPU platform all support MIC.
The operating system is Linux, and the compilers are Intel's icc, icpc, and ifort.
Preferably, the system comprises 2 CPU chips and 2 MIC chips; each CPU chip comprises 6 cores, and each MIC chip comprises 50 or more cores.
To make the purpose, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the drawings and embodiments.
The information processing heterogeneous system of the present invention achieves high performance and low power consumption based on the CPU/MIC heterogeneous architecture. It is described below in two parts, the hardware and the system environment configuration:
Hardware:
The CPU platform is dual-socket and supports 2 CPUs working simultaneously. In this implementation the system uses 2 Intel Xeon 5675 6-core CPUs with a clock frequency of 3.07 GHz;
The system has two or more PCIE slots and can hold 2 MIC chips. This system uses 2 MIC chips, each card having 50 or more cores;
The system requires a large memory configuration, at least 2 times that of the original CPU platform. This system is configured with 96 GB or more of memory;
The system power supply supports 1300 W or more to ensure that the whole system runs normally. The peak power supported by this system is 1300 W.
System environment configuration:
The operating system must support MIC, which requires installing a Linux operating system. This implementation uses Red Hat Enterprise Linux 6.0 GA 64-bit, kernel 2.6.32-71;
The compiler must support MIC; Intel's icc, icpc, and ifort compilers can be used. This embodiment uses the Intel compiler l ccompxe 2013beta.0.047;
A driver that supports MIC is required; KNC-AlphaUpdate1-2.1.2430-9 is used.
Embodiment 2
For the system to be efficient, hardware and software must be co-designed so that application software runs on the system with the highest efficiency.
In view of this, the information processing heterogeneous system provided by the invention can also be described from the following angle. As shown in Fig. 2, the system comprises:
A first execution unit, whose processor is implemented by 2 CPU chips, for performing information processing;
A second and a third execution unit, each connected to the first execution unit and each with a processor implemented by 1 MIC chip, for performing information processing in parallel with the first execution unit;
Specifically, the first, second, and third execution units perform information processing in a multithreaded manner and on the principle of load balancing.
The first execution unit starts 12 threads to perform information processing, and the second and third execution units each start 200 or more threads.
Preferably, each CPU chip comprises at least 6 cores, each running one thread, and each MIC chip comprises at least 50 cores, each able to run 4 threads.
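The thread counts quoted above follow directly from the stated core counts. The following is a minimal sketch of that arithmetic, using the minimum figures from the text (50 cores per MIC chip, 6 cores per CPU chip); the `Device` class and its field names are illustrative assumptions and are not part of the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """One compute device as described in the text (names are illustrative)."""
    name: str
    chips: int
    cores_per_chip: int
    threads_per_core: int

    def threads(self) -> int:
        # Threads started on this device = chips x cores x threads per core.
        return self.chips * self.cores_per_chip * self.threads_per_core


# Minimum figures stated in the text: 50 cores x 4 threads per MIC chip,
# 6 cores x 1 thread per CPU chip, with the 2 CPU chips treated as one device.
DEVICES = [
    Device("MIC 0", chips=1, cores_per_chip=50, threads_per_core=4),
    Device("MIC 1", chips=1, cores_per_chip=50, threads_per_core=4),
    Device("CPU x2", chips=2, cores_per_chip=6, threads_per_core=1),
]


def total_threads(devices) -> int:
    """Sum the thread counts of all devices."""
    return sum(d.threads() for d in devices)


if __name__ == "__main__":
    for d in DEVICES:
        print(f"{d.name}: {d.threads()} threads")
    print(f"total: {total_threads(DEVICES)} threads")
```

With these minimum figures each MIC chip accounts for 200 threads and the two CPUs for 12, giving 412 in total; a MIC chip with more than 50 cores would raise the total accordingly.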
Current mainstream servers are dual-socket, that is, they hold 2 CPUs. With one MIC per CPU, PCIE efficiency is highest and data transfer performance between CPU and MIC is best.
To test the performance of the system, a high-performance computing application can be chosen whose algorithm consists of highly parallel tasks with no data dependences between them, good parallelism, and high demands on system performance. Seismic prestack time migration (PreStack Time Migration, PSTM) has exactly these characteristics. Taking this application as an example, the process of improving an existing program that runs single-threaded on a CPU platform is described below:
The original PSTM program runs single-threaded on the CPU platform. First, the CPU multi-core platform is exploited: the program is made multithreaded using the OpenMP programming model, with all computing tasks run in parallel on 12 threads so that the computing capability of all cores of the 2 CPUs is used;
Then, on the basis of the multithreaded CPU PSTM program, the threading is extended to the MIC chip: all computing tasks are run in parallel on 200 or more threads on the MIC, bringing the many-core computing capability of the MIC into play;
The computing capability of the whole system is divided into 3 devices: the first MIC chip, as device 0, starts 200 or more threads; the second MIC chip, as device 1, starts 200 or more threads; and the 2 CPUs, as device 2, start 12 threads, as shown in Fig. 2;
The computing tasks of the whole PSTM are divided according to the computing capability of these three devices so that the three devices compute in parallel at the same time. That is, 412 or more threads take part in the computation together, so that the CPUs and MICs compute simultaneously, load balance is ensured, and the whole system achieves high performance.
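The text does not say how the work is divided among the three devices beyond "according to computing power". One possible reading is a static, proportional split of independent work items, sketched below. The function names and the weights `[4, 4, 1]` are hypothetical and chosen only for illustration; the patent gives no capability ratio between a MIC chip and the CPU pair.

```python
def split_tasks(n_tasks, weights):
    """Split n_tasks among devices in proportion to integer capability weights.

    Uses the largest-remainder rule so the counts always sum to n_tasks.
    The weights are assumptions supplied by the caller, not measured values.
    """
    if n_tasks < 0:
        raise ValueError("n_tasks must be non-negative")
    if not weights or any(not isinstance(w, int) or w <= 0 for w in weights):
        raise ValueError("weights must be positive integers")
    total = sum(weights)
    # Integer share for each device, plus the remainder left over by rounding down.
    counts = [n_tasks * w // total for w in weights]
    remainders = [n_tasks * w % total for w in weights]
    leftover = n_tasks - sum(counts)
    # Hand the leftover tasks to the devices with the largest remainders.
    order = sorted(range(len(weights)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def task_ranges(counts):
    """Turn per-device counts into contiguous half-open (start, stop) index ranges."""
    ranges, start = [], 0
    for c in counts:
        ranges.append((start, start + c))
        start += c
    return ranges


if __name__ == "__main__":
    # Hypothetical weights: device 0 (MIC), device 1 (MIC), device 2 (2 CPUs).
    counts = split_tasks(91, [4, 4, 1])
    for dev, (lo, hi) in enumerate(task_ranges(counts)):
        print(f"device {dev}: work items {lo}..{hi - 1}")
```

In practice the weights would have to come from measurement on the target hardware, since a poor estimate leaves one device idle while the others finish; a dynamic work queue is an alternative that avoids fixing the ratio in advance.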
Specifically, the test performs migration imaging on 91 survey lines, with 963 CMP (common midpoint) points on each line and 110000 input traces. On the original homogeneous CPU system, PSTM took 76053 s in single-threaded serial mode, while on this system the run time is 1075 s, roughly a 70-fold speedup. The image produced by the serial CPU version of PSTM is shown in Fig. 3 and the image produced by this system in Fig. 4, where the horizontal axis is the common midpoint along a survey line and the vertical axis is time. The two images are essentially identical, indicating that the result is correct.
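The two run times reported above can be turned into a speedup figure directly. The short sketch below only restates the numbers given in the text; the constant and function names are illustrative.

```python
# Figures reported in the text for the PSTM test case.
SERIAL_SECONDS = 76053   # single-threaded run on the original CPU system
HETERO_SECONDS = 1075    # run on the CPU/MIC heterogeneous system
SURVEY_LINES = 91
CMP_PER_LINE = 963


def speedup(before_s: float, after_s: float) -> float:
    """Ratio of the old run time to the new run time."""
    if after_s <= 0:
        raise ValueError("after_s must be positive")
    return before_s / after_s


if __name__ == "__main__":
    print(f"speedup: {speedup(SERIAL_SECONDS, HETERO_SECONDS):.1f}x")
    print(f"CMP positions imaged: {SURVEY_LINES * CMP_PER_LINE}")
```

Note that this compares a single-threaded baseline with the full heterogeneous system, so the ratio reflects the combined effect of CPU multithreading and the two MIC chips rather than the contribution of the MIC chips alone.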
The system of the present invention is characterized by high performance and low power consumption. It addresses the performance bottleneck and power consumption problems of high-performance applications, meets the needs of production and scientific research, and reduces machine-room construction cost as well as management, operation, and maintenance costs. In the present invention the CPU takes part not only in control logic but also in the compute-intensive core computation, whereas the MIC takes part only in the compute-intensive core computation; the CPU and MIC compute together to maximize performance.
The seismic prestack time migration embodiment shows that the whole system achieves high performance and low power consumption and largely meets the scientific research and industrial production needs of high-performance applications; the system also reduces machine-room construction cost and management, operation, and maintenance costs.

Claims (10)

1. An information processing heterogeneous system, characterized in that the system comprises:
A central processing unit (CPU) platform, said platform comprising a CPU chip;
At least one Many Integrated Core (MIC) chip;
A connector for connecting said MIC chip to said CPU platform.
2. The system as claimed in claim 1, characterized in that: said connector is a PCIE slot.
3. The system as claimed in claim 1, characterized in that: said system is configured with 96 GB or more of memory and supports a peak power of 1300 W or more.
4. The system as claimed in claim 1, characterized in that: the operating system, compiler, and driver of said CPU platform all support MIC.
5. The system as claimed in claim 1, characterized in that: said operating system is Linux, and said compilers are Intel's icc, icpc, and ifort.
6. The system as claimed in claim 1, characterized in that: said system comprises 2 CPU chips and 2 MIC chips, each CPU chip comprising 6 cores and each MIC chip comprising at least 50 cores.
7. An information processing heterogeneous system, characterized in that the system comprises:
A first execution unit, whose processor is implemented by 2 CPU chips, for performing information processing;
A second and a third execution unit, each connected to said first execution unit and each with a processor implemented by 1 MIC chip, for performing information processing in parallel with said first execution unit.
8. The system as claimed in claim 7, characterized in that: said first, second, and third execution units perform information processing in a multithreaded manner.
9. The system as claimed in claim 7, characterized in that: said first, second, and third execution units perform information processing on the principle of load balancing.
10. The system as claimed in claim 7, characterized in that: said first execution unit starts 12 threads to perform information processing, and said second and third execution units each start at least 200 threads to perform information processing.
CN2012103392335A 2012-09-13 2012-09-13 Information processing heterogeneous system Pending CN102902655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012103392335A CN102902655A (en) 2012-09-13 2012-09-13 Information processing heterogeneous system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012103392335A CN102902655A (en) 2012-09-13 2012-09-13 Information processing heterogeneous system

Publications (1)

Publication Number Publication Date
CN102902655A (en) 2013-01-30

Family

ID=47574895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012103392335A Pending CN102902655A (en) 2012-09-13 2012-09-13 Information processing heterogeneous system

Country Status (1)

Country Link
CN (1) CN102902655A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049329A (en) * 2012-11-22 2013-04-17 浪潮电子信息产业股份有限公司 High-efficiency system based on central processing unit (CPU)/many integrated core (MIC) heterogeneous system structure
CN103744898A (en) * 2013-12-25 2014-04-23 浪潮电子信息产业股份有限公司 Dependent library automatic processing method of MIC (Many Integrated Core) native mode program
CN105893151A (en) * 2016-04-01 2016-08-24 浪潮电子信息产业股份有限公司 High-dimensional data flow processing method based on CPU-MIC heterogeneous platform
CN110287139A (en) * 2019-06-13 2019-09-27 深圳大学 CPU and MIC cooperated computing method, apparatus and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101526934A (en) * 2009-04-21 2009-09-09 浪潮电子信息产业股份有限公司 Construction method of GPU and CPU combined processor
US20120192198A1 (en) * 2011-01-24 2012-07-26 Nec Laboratories America, Inc. Method and System for Memory Aware Runtime to Support Multitenancy in Heterogeneous Clusters

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ALEJANDRO DURAN et al.: "The Intel® Many Integrated Core Architecture", 《INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND SIMULATION》, 6 July 2012 (2012-07-06), pages 365 - 366 *
ERIK SAULE et al.: "An Early Evaluation of the Scalability of Graph Algorithms on the Intel MIC Architecture", 《INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS&PHD FORUM》, 25 May 2012 (2012-05-25), pages 1629 - 1630 *
张清: "浪潮异构HPC三大方案", 《科技浪潮》, 15 October 2011 (2011-10-15) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049329A (en) * 2012-11-22 2013-04-17 浪潮电子信息产业股份有限公司 High-efficiency system based on central processing unit (CPU)/many integrated core (MIC) heterogeneous system structure
CN103744898A (en) * 2013-12-25 2014-04-23 浪潮电子信息产业股份有限公司 Dependent library automatic processing method of MIC (Many Integrated Core) native mode program
CN105893151A (en) * 2016-04-01 2016-08-24 浪潮电子信息产业股份有限公司 High-dimensional data flow processing method based on CPU-MIC heterogeneous platform
CN105893151B (en) * 2016-04-01 2019-03-08 浪潮电子信息产业股份有限公司 A kind of processing method of the High Dimensional Data Streams based on CPU+MIC heterogeneous platform
CN110287139A (en) * 2019-06-13 2019-09-27 深圳大学 CPU and MIC cooperated computing method, apparatus and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130130